Wednesday, December 25, 2013

eDOCS DM Server Log Parser - Part 1: Understanding the Task

A good thing about eDOCS DM is that DM Server can create log files, which give you an idea of what is going on behind the scenes.
In them you can find information about ActiveX object calls and their parameters, SQL queries and the time they took, database connections being taken from and released to the connection pool, etc.

The bad thing about server logs is that their format is poorly suited to computer processing. For example, if you want to find the slowest SQL queries, understand how the load is distributed across working hours, or perform any statistical analysis, you cannot use the logs as-is, despite the fact that they already contain the necessary information. If the logs were in XML format, you could search, filter and group the data via XPath. Unfortunately, eDOCS DM logs are poorly structured text files that aren't ready for machine processing.
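To make the XPath idea concrete, here is a minimal sketch in Python of what such a query could look like once the log data is in XML. The XML shape is purely hypothetical: the `log` and `sql` elements and the `connection` and `seconds` attributes are invented for illustration and are not produced by DM Server.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML shape: element and attribute names are invented
# for illustration and do not come from DM Server.
SAMPLE = """
<log>
  <sql connection="1" seconds="0.015">SELECT A</sql>
  <sql connection="2" seconds="2.350">SELECT B</sql>
  <sql connection="1" seconds="0.480">SELECT C</sql>
</log>
"""

root = ET.fromstring(SAMPLE)

# XPath-style filter: every statement that ran on connection 1.
on_conn_1 = [e.text for e in root.findall(".//sql[@connection='1']")]

# Slowest statement overall: select all <sql> nodes, then compare timings.
slowest = max(root.findall(".//sql"), key=lambda e: float(e.get("seconds")))

print(on_conn_1)
print(slowest.text, slowest.get("seconds"))
```

The standard library's `xml.etree.ElementTree` supports only a subset of XPath, which is why the maximum is computed in Python rather than inside the path expression.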

I contacted OpenText Support and brought this issue to their attention. Perhaps they will rethink the log format in the future. In the meantime, I have a performance optimization task on my plate, so I need a solution right away. I've decided to parse DM Server logs and convert them into XML for further processing. As a first step I'll focus on SQL logs, while keeping in mind that the same file can contain a mixture of other entry types (e.g. Calls).

Here is a very simple example of what one may see in a log from DM Server (my comments are in yellow):
[Image: Hummingbird DM / OpenText eDOCS DM Server Log - SQL block, simple]

A SQL block begins with an entry about a new database connection being either created or acquired from the connection pool. The block ends when the connection is released back to the pool.

The block may contain one or more SQL statements. For some statements you can get timing information. SQL blocks may be nested, with another database connection starting and ending within an outer block. You can also find overlapping SQL blocks, where connection #1 starts, then connection #2 starts, then connection #1 ends, and finally connection #2 ends. Thinking about how to parse all this effectively is real fun :-)
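One way to think about the nesting and overlapping is to track open blocks by connection rather than on a stack. The Python sketch below assumes the log has already been tokenized into a hypothetical event stream of `("open", conn_id)`, `("sql", conn_id, text)` and `("close", conn_id)` tuples, and that each SQL entry can be attributed to a connection. Neither assumption comes from the real DM Server log format; the names `SqlBlock` and `group_blocks` are illustrative only.

```python
from dataclasses import dataclass, field


@dataclass
class SqlBlock:
    """One connection's lifetime: acquired, statements run, released."""
    conn_id: str
    statements: list = field(default_factory=list)


def group_blocks(events):
    """Group a flat event stream into per-connection SQL blocks.

    `events` is a hypothetical, already-tokenized stream of tuples:
    ("open", conn_id), ("sql", conn_id, text), ("close", conn_id).
    This is NOT the real DM Server log format.
    """
    open_blocks = {}  # conn_id -> SqlBlock still in progress
    finished = []     # blocks in the order their connections were released
    for event in events:
        kind, conn_id = event[0], event[1]
        if kind == "open":
            open_blocks[conn_id] = SqlBlock(conn_id)
        elif kind == "sql":
            open_blocks[conn_id].statements.append(event[2])
        elif kind == "close":
            finished.append(open_blocks.pop(conn_id))
    return finished


# Overlapping case from the text: #1 starts, #2 starts, #1 ends, #2 ends.
events = [
    ("open", "1"),
    ("sql", "1", "SELECT A"),
    ("open", "2"),
    ("sql", "2", "SELECT B"),
    ("close", "1"),
    ("sql", "2", "SELECT C"),
    ("close", "2"),
]
blocks = group_blocks(events)
for block in blocks:
    print(block.conn_id, block.statements)
```

A plain stack would handle nested blocks but would break on the overlapping case, because connection #1 closes while #2 is still on top. Keying the open blocks by connection handles both.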

See Also:
eDOCS DM Server Log Parser - Part 2: Implementation
eDOCS DM Server Log Parser - Part 3: Usage
