How to read large SQL Server ERRORLOG (the 5GB test)

I often come across large ERRORLOGs mainly because they are not regularly recycled. Sometimes, the ERRORLOG can grow to an unmanageable size despite regular recycling. This happened to me during a significant transaction rollback over Synchronous Availability Group. The ERRORLOG log grew to 5GB in size within a few hours due to “Sending page request for Undo” messages.

The primary node failure triggered the rollback during an insert of 800 GB into a partitioned table. The failure caused the primary node to reboot, triggering failover to the secondary replica, and, of course, after the secondary node came back up, the rollback started. At that point, I needed to see what errors and failures were registered in the ERRORLOG during the service recovery, which proved difficult.

Using SSMS / Log File Viewer

Unless you use a log management solution such as Azure Log Analytics, the first course of action for most DBAs is to open SQL Server Log Viewer within the SSMS. In my case, however, it gave up quickly:

SSMS ERRORLOG 5GB Test

The out-of-memory exception is caused by the amount of data returned to the client. We have to be more clever. Let’s stop the loading right after opening the viewer using the “Stop” button on the top bar, apply the filter and then hit “Refresh”. This should return fewer rows and will work…, right?

SSMS ERRORLOG 5GB Test

Not much luck, either. Let’s look into alternative ways.

Using T-SQL

Behind the scenes, SSMS uses the dbo.xp_readerrorlog and runs the following command to read the ERRORLOG:

EXEC sp_executesql 
	N'EXEC master.dbo.xp_readerrorlog 
		@ArchiveID
		, 1
		, @FilterText
		, NULL
		, @FirstEntry
		, @LastEntry
		, N''desc''
		, @InstanceName '
    
	,N'@InstanceName nvarchar(4000)
		, @ArchiveID int
		, @FilterText nvarchar(4000)
		, @FirstEntry nvarchar(4000)
		, @LastEntry nvarchar(4000)'

    ,@InstanceName = NULL
    ,@ArchiveID = 3
    ,@FilterText = NULL
    ,@FirstEntry = NULL
    ,@LastEntry = NULL

The procedure dbo.xp_readerrorlog Accepts several parameters, including a filter. With this in mind, we can execute it in SSMS as a T-SQL code:

exec sp_executesql 
    N'EXEC master.dbo.xp_readerrorlog 
		@ArchiveID
		, 1
		, @FilterText
		, NULL
		, @FirstEntry
		, @LastEntry
		, N''desc''
		, @InstanceName '
		
	,N'@InstanceName nvarchar(4000)
		, @ArchiveID int
		, @FilterText nvarchar(4000)
		, @FirstEntry nvarchar(4000)
		, @LastEntry nvarchar(4000)'

    ,@InstanceName=NULL
    ,@ArchiveID=3
    ,@FilterText='ERROR' --our filter
    ,@FirstEntry=NULL
    ,@LastEntry=NULL

This has taken just under 30 seconds to scan through the 5GB log file and returned 387 rows that contain the word “ERROR”. Great, it gives me something, but on the other hand, the output isn’t particularly useful without the corresponding rows:

xp_readerrorlog

So, what better alternatives do we have to get any sensible information out of a large log file?

Using SQL Server Management Object (SMO)

We can use PowerShell to interrogate ERRORLOG via SMO:

Import-Module SQLPS
$SqlServer = new-object Microsoft.SqlServer.Management.Smo.Server  "localhost"
Clear-Host
$measure = Measure-Command {$SqlServer.ReadErrorLog(3) | ?{$_.Text -like "*ERROR*"} | Out-GridView}

Using the Measure-Command function, we can measure the execution time of a particular command or function. In this case, the above takes TotalMilliseconds : 1023529.8419 (over 17 minutes!) to run and eventually returns 387 rows into a GridView which we can filter further:

ERRORLOG SMO GridView

The slow performance is caused by the fact that it has to iterate through all the rows in the 5GB file to find those matching our ERROR filter. This is bound to be slow, and the result is the same as in the T-SQL example above.

Using text reader

The ERRORLOG is a text file on a local disk in the Log folder. We could open it in notepad, but that’s not a very convenient way to search through many text lines. We can also use PowerShell to read a text file using a highly efficient Select-String function.

# Here we are using SMO to only get path to the ERRORLOG:
Import-Module SQLPS
$SqlServer = new-object Microsoft.SqlServer.Management.Smo.Server  "localhost"
$ErrorLog = $SqlServer.ErrorLogPath
$LogNo = 3 #which ERROLOG do we want
Clear-Host
$measure = Measure-Command {Select-String -Path "$ErrorLogERRORLOG.$LogNo" -Pattern "ERROR" | Out-GridView}

In this case, the GridView opens instantly and loads filtered data in the background. In all earlier examples, we had to wait for the data set to be filtered and returned to us. In this example, we have access to the first filtered rows immediately!

ERRORLOG TEXT GridView

Eventually, the data load completes in the background after TotalMilliseconds : 21949.6703 (22 seconds!)

However, the Select-String function has a few other essential features. For example, it has a concept of exclusion and “context” line and can return lines before and after our match:

# Here we are using SMO to only get path to the ERRORLOG:
Import-Module SQLPS
$SqlServer = new-object Microsoft.SqlServer.Management.Smo.Server  "localhost"
$ErrorLog = $SqlServer.ErrorLogPath
$LogNo = 3 #which ERROLOG do we want

Clear-Host
Select-String -Path "$ErrorLogERRORLOG.$LogNo" -Pattern "Error: 35286" -Context 1, 1

Returns:

> 4115:2018-02-18 16:23:33.38 spid75s     Error: 35286, Severity: 16, State: 1.
  4116:2018-02-18 16:23:33.38 spid75s     Using the recovery LSN (1536:50088:1) stored in the metadata for the database with ID 14. This is an informational message only. No user action is required.
> 4127:2018-02-18 16:23:33.82 spid212s    Error: 35286, Severity: 16, State: 1.
  4128:2018-02-18 16:23:33.82 spid212s    Using the recovery LSN (2702:305880:1) stored in the metadata for the database with ID 35. This is an informational message only. No user action is required.x
> 4860:2018-02-18 16:23:37.96 spid201s    Error: 35286, Severity: 16, State: 1.
  4861:2018-02-18 16:23:37.96 spid201s    Using the recovery LSN (519:448856:1) stored in the metadata for the database with ID 18. This is an informational message only. No user action is required.
> 4945:2018-02-18 16:23:38.07 spid37s     Error: 35286, Severity: 16, State: 1.
  4946:2018-02-18 16:23:38.07 spid37s     Using the recovery LSN (1553:351536:1) stored in the metadata for the database with ID 20. This is an informational message only. No user action is required.
> 4966:2018-02-18 16:23:39.07 spid236s    Error: 35286, Severity: 16, State: 1.
  4967:2018-02-18 16:23:39.07 spid236s    Using the recovery LSN (31538:188792:1) stored in the metadata for the database with ID 21. This is an informational message only. No user action is required.

Conclusion

The winner in the performance and functionality category is the Select-String function takes over 20 seconds to find matching rows in a 5GB file. The native T-SQL takes under 30 seconds to do the same, and the slowest is the SMO method taking over 17 minutes! Reading ERRORLOG as plain text may not be the easiest way and has its limitations. For example, it will only work on localhost unless we get access to the remote ERRORLOG via UNC path. However, it is the most efficient and flexible way, especially if you have big files that you’d like to run some regular expression.

Help us grow, please share this content!

Author

Posted by
Marcin Gminski
October 19, 2018

Let us help you. Get in touch.

Help us grow, please share this content!