Hello,
We are maintaining an internal ASP.NET v2.0 website which is quite big and already in production. The underlying SQL Server 2005 database contains 350+ tables.
Recently, we have been asked to implement a new feature which seems functionally quite simple. We have to track every single data modification: inserts, updates and deletes. This information should be presented to power users as readable strings right in an admin section of our website.
Our team of architects is working on a way to make this possible without slowing SQL Server to a crawl. One thing is for sure: SQL Server 2005 already does the job through its transaction log. It seems like a good idea to use it directly instead of managing our own log based on triggers. Why put more pressure on the server to write data that is already logged by the database engine? We have heard that Microsoft's SQL Server team does not support this approach and are wondering why...
It's quite easy to find queries on the web that output very useful information such as the date of transactions and what they have done. However, the data involved in those transactions seems to be stored in a binary field which can be retrieved using this query: SELECT "log record" FROM ::fn_dblog(null,null)
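For example, we found queries along these lines (just a sketch; the column names are those fn_dblog exposes in SQL 2005, and the output may differ between builds):

SELECT [Current LSN],
       Operation,          -- e.g. LOP_INSERT_ROWS, LOP_MODIFY_ROW, LOP_DELETE_ROWS
       [Transaction ID],
       [Begin Time],       -- only populated on some record types
       AllocUnitName,      -- resolved table/index name, when available
       [Log Record]        -- the undocumented binary payload
FROM ::fn_dblog(NULL, NULL);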
Third parties such as Apex SQL are already doing a great job at decoding it for us. This is very useful but not efficient for our purposes, since those tools do a very generic job; we would like to optimize it for our needs. All we need to know is who made the modifications, when, in which tables and what are the new values.
We believe that we would have to decode the "log record" field returned by ::fn_dblog(null, null). Is there any way to get basic documentation about how to do it?
Thanks!
Marc Lacoursiere
RooSoft Computing
The format of the transaction log is undocumented, as it will change from release to release. Thanks,|||
Hi,
I'm working with Marc on this and would like to know whether Microsoft expects to expose an interface that would allow us to read the transaction log. What are the plans for future releases?
Thanks
|||The transaction log contains physical information that is often just blocks of bytes which are meaningless in terms of the DDL or DML that instigated them. The log was never intended for audit purposes and really should not be used that way.
|||Thanks for your post, Peter,
I understand that the transaction log's purpose is recovery only, but you must admit that it could be a great way to audit changes without adding any overhead to the server.
What we would like to do is take a transaction log backup on a regular basis and read those files as they accumulate in the directory. This would allow us to log any DML changes into a log table located on another server.
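Roughly, the backup side would just be a scheduled job along these lines (the database name and path below are made up for illustration):

BACKUP LOG [OurDatabase]
TO DISK = N'\\logserver\trn\OurDatabase_20071001_1200.trn'
WITH INIT;   -- one file per backup; the files would then be parsed on the remote machine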
I also understand that the online transaction log format can change from one version to another, but correct me if I'm wrong: transaction log backups are compatible between SQL 2000 and SQL 2005, which suggests that the format of a .trn file should remain compatible as SQL Server evolves.
Thanks for your feedback.
|||Adding the information needed to use the log for auditing would add significant overhead, which is one of the reasons it has not been done.
You might look at the Change Data Capture functionality in the upcoming SQL Server 2008 release. http://connect.microsoft.com/sqlserver/ has a CTP preview release available.
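For example, enabling it for one table in the CTP looks roughly like this (the database and table names below are only placeholders):

USE OurDatabase;
EXEC sys.sp_cdc_enable_db;                 -- enable CDC at the database level
EXEC sys.sp_cdc_enable_table
     @source_schema = N'dbo',
     @source_name   = N'Orders',           -- placeholder table name
     @role_name     = NULL;                -- no gating role in this sketch
-- Changes are then read from the generated cdc.* change tables
-- (e.g. cdc.fn_cdc_get_all_changes_dbo_Orders) instead of from the log itself.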
|||Our guess as of now is that timely analysis of transaction log backups on a remote machine could help keep load on our database server at an acceptable level, since it already handles loads of data.
We doubt that using triggers on each table would be much more efficient. That would certainly slow down every transaction, and as our database grows we must keep DML operations fast to keep the website responsive.
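For reference, the alternative we are trying to avoid would mean something like this on every one of our 350+ tables (the table and audit-log names below are only examples):

CREATE TRIGGER trg_Orders_Audit ON dbo.Orders     -- placeholder table
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;
    -- one audit row per statement, storing the new rows as XML
    INSERT INTO dbo.AuditLog (TableName, ChangedDt, ChangedBy, NewValues)
    SELECT N'dbo.Orders', GETDATE(), SUSER_SNAME(),
           (SELECT * FROM inserted FOR XML RAW);
END;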
|||>> SELECT "log record" FROM ::fn_dblog(null,null)
>> All we need to know is who made the modifications, when, in which tables and what are the new values.
"Who" is not available on a log record by log record basis. It is not part of that binary data in most cases. There are some few records that contain a "who" such as BEGIN TRAN log records. This is included in other output columns of fn_dblog().
"When" is not available on a log-record-by-log-record basis either. A few record types contain a "when", such as BEGIN TRAN and END TRAN log records, and these are included as other columns in fn_dblog().
"Which table" is not actually stored in the log record in SQL 2005 and later. This is due to partitioning: the partition has to be linked through the catalog metadata back to its base table and index. fn_dblog() attempts this itself and the result shows up as other columns in the output, but DDL can make the lookup fail.
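For what it's worth, the kind of catalog lookup involved looks roughly like this for in-row data (a sketch only; LOB allocation units join differently, and the lookup breaks once the metadata no longer matches the log record):

SELECT au.allocation_unit_id,
       o.name AS table_name,
       i.name AS index_name
FROM sys.allocation_units AS au
JOIN sys.partitions AS p ON p.hobt_id = au.container_id   -- in-row / row-overflow data
JOIN sys.objects    AS o ON o.object_id = p.object_id
JOIN sys.indexes    AS i ON i.object_id = p.object_id AND i.index_id = p.index_id;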
"New values" are often just a byte by byte binary diff of the old row from the new row, not the full values themselves. The log record code does not actually know how to crack the row binary data, it just passes it along to other components in the system. i.e., if you update a INT column from a value of 1 to a value of 257, we may only log a single byte 0x01 as the difference.
The SQL 2005 output for fn_dblog() has been supplemented to try to make some of this easier, but the reality is that some of what you want is just not in the log in many cases.
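To make that concrete, the transaction-level "who"/"when" that does exist shows up on the BEGIN TRAN records, something like this (again, only a sketch against the SQL 2005 output):

SELECT [Transaction ID],
       [Transaction Name],
       [Begin Time],
       SUSER_SNAME([Transaction SID]) AS login_name   -- SID of the login that started the transaction
FROM ::fn_dblog(NULL, NULL)
WHERE Operation = N'LOP_BEGIN_XACT';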
|||Thanks, Peter, for this interesting additional information.
We actually take care of the "when" and the "who" because each table in our database contains an updatedBy and an updatedDt column. These columns are systematically updated on each change by the application layer. We only need to find the new values of the modification.
As for the byte-by-byte binary diff, I guess you are talking about the online transaction log, because a closer look at a transaction log backup file (.trn) suggests that new values are stored in the file. I opened one with a hex editor and was able to see the newly inserted values.
|||You could look at AuditDB from Lumigent; it can be customized to the level of detail you need.
With best regards.
Jan H. Kanon