Thursday, January 2, 2014

How to Find Slow SQL Queries in eDOCS DM Server Logs

As a matter of performance optimization of an Hummingbird / OpenText eDOCS DM environment, I tried to detect performance issues and I started my research by checking the backend, i.e. the database. If I were a DBA, I would probably use monitoring and performance profiling tools on the database side. Unfortunately, I don't have that privileged level of access to the database as a DBA has. What I have is the access to DM servers and the DOCSADM login to the database. This should be enough to find the slowest SQL queries and to try to fine-tune performance.



eDOCS DM Server can create log files of SQL queries that it sends to the database. It also measures how much time it was required for the database server to process a given query. Hence, my first step was to enable the SQL & Calls logging on DM Server.

While server logs were collected, I've created a parser that converts raw logs to XML. I copied the server logs to a local folder and ran EDocsLogParser against it.

Now I have a set of XML files with the required data: queries and timings. The next step is to extract "slow" queries. I decided to write a PowerShell script that uses XPath filters and saves the results into a CSV file. Here is some code from the script (click here to download the Find_Slow.ps1 script):

#Find_Slow.ps1
$found = (Select-Xml -path $path\*.xml -xpath 
    "//SqlQueryEvent[@readItemDuration > $threshold]")

$results =@()

foreach($f in $found) {
  $dur = (Select-Xml -xml $f.Node -xpath "@readItemDuration")
  $cmd = (Select-Xml -xml $f.Node -xpath "Command")

  $qry = new-object PSObject
  $qry | add-member -membertype NoteProperty -name "ReadItemDuration" -Value $dur
  $qry | add-member -membertype NoteProperty -name "Command" -Value $cmd
  $results += $qry
}

$results | Export-csv $path\Results.csv -NoTypeInfo

The script takes a path to the folder with XML files and an optional "-threshold" parameter that defines how many seconds a query must take to be considered "slow":

The script creates a Results.csv file. Let's open it in Excel and sort by Command and then by ReadItemDuration column:

And this is exactly what I wanted to find: One slow query that is called over and over again! This means that the performance issue exists and it may be worthy to spend more time on fixing it.

No comments:

Post a Comment