Friday, January 3, 2014

SQL Connection Cache Size for OpenText eDOCS DM Server

The SQL Connection Cache Size value is one of the DM Server configuration values. You can set it in DM Server Manager on Libraries / Library Properties page.

The default value of this parameter is 5. However, I found quite a bit of speculation about increasing it to 150 and higher values. One of the sources is "OpenText eDOCS DM Implementation Best Practice Recommendations.pdf". I don't remember where I found this document, but I believe it's from OpenText eDOCS DM Champion Toolkit. My other source of information was a local consultant firm, which specializes on Hummingbird / OpenText solutions - they also recommended increasing the value to 50-100 at least.

After a few months of supporting OpenText DM one of the lessons I've learned is: Don't take as granted what 'experienced' guys say; check it out.

In the nutshell, SQL Connection Cache Size is the size of the database connection pool. It's not a secret that establishing the connection with a database (or any connection in general) is a time expensive operation. That's why pools of connections are used: once established, a connection is kept in the pool and acquired when it's needed to query the database. Once a query is processed, the connection is returned back to the pool. Connections from the pool are shared among concurrent users of DM Server.

Here is a passage from eDOCS DM 5.3.1 Administration Guide (and unlike my opinion about other sources of information given above, I tend to trust this one):

"The SQL-Connection Cache Size parameter specifies the minimum number of connection objects within the DM server. This value does not represent the actual number of SQL connections. If the number of SQL connections required increases above the specified value (to a maximum of 50), then the DM server instantiates more connection objects. However, these additional connections are dropped when processing completes, and the number of connections return to the value specified in the SQL-Connection Cache Size parameter.
[...]
To achieve the optimum value for the SQL connection cache, you have to strike a balance between minimizing the resources used by idle SQL connections and maximizing the frequency with which users are able to use an open connection."

So, what is the optimal value for SQL Connection Cache Size? It should be tested for your current environment. Here is the approach I use:
  1. Collect SQL logs on your DM server for one business day.
  2. Check how many connections were used.
#1 should be obvious. If not, here is a hint: enable logging in DM Server Manager, the Preferences tab.
For #2, I know two different approaches:

a) you can manually search through the log for phrases like 'acquired existing connection from pool #1' and 'acquired new connection for existing Pool #3' and figure out what the maximum connection number in the pool is and how often connections are recreated.

b) you can convert a daily log into XML and then filter / group / sort it in a tool of your preference. For example:
Microsoft Excel is a simplest solution. It works perfectly when the XML log size is less than 10 Mb. All you need to do is to load the XML log file produced by EDocsLogParser into Excel and then invoke the Filter popup for the connection column:

In this example, the maximum number you see is 6 - the server didn't use more than six connections simultaneously. If this is an average day, set Cache Size on this server to 10 or 12 - it should be just right.

If a daily XML log file is too large for Excel, apply the following transformation to it and load it into your Web Browser:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- Unique_Connections.xsl -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output encoding="UTF-8" indent="yes" method="xml" version="1.0"> 
  <xsl:key match="BaseEvent" name="distinctConnection" use="@connection">

  <xsl:template match="ArrayOfBaseEvent">
    <connections>
    <xsl:for-each select="BaseEvent[generate-id() = generate-id(key('distinctConnection', @connection)[1])]">
      <xsl:sort select="@connection">
      <unique>
        <xsl:value-of select="@connection">
      </xsl:value-of></unique>
    </xsl:sort></xsl:for-each>
    </connections>
  </xsl:template>
</xsl:key></xsl:output></xsl:stylesheet>

Like this (please pay attention to the second line):

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="unique_connections.xsl"?>
<ArrayOfBaseEvent xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <BaseEvent xsi:type="SqlEvent" key="0693D438" connection="1" isNew="False">
...


And here is the result you are expected to see in your Web browser:



If your XML log file is too big for the Web browser (IE11 has failed to display a 40 Mb XML file on my machine), you can use the PowerShell script to apply Unique_Connections.xsl to your XML log.

Once you find the maximum connection # value in the log, increase it by 80-100% and set the result to SQL Connection Cache Size. Check logs again after a day with the highest load on the DM server from the business.

No comments:

Post a Comment