How to Improve System Performance with Database Replication

by Chris Becker, BeckerDatabaseConsulting.com

Database replication is seldom thought of as a tool to improve database performance. This article focuses on using replication to improve performance, although replication has many other useful benefits: disaster recovery, remote data backup, and near-real-time refreshes of QA or development environments, to name a few.

In an OLTP (OnLine Transaction Processing) database, or simply an order-entry database, the majority of user activity is reading data. The ratio of reads to inserts/updates/deletes is typically about 60:40; that is, 60% of all database activity is simply displaying data to users. This often causes contention on your main order tables, and users may complain that the system is slow. The first place admins usually look is system resources: is CPU or memory taxed? Next comes blocking, and more often than not, complaints of slowness turn out to be users blocking each other. This is contention at the table level. Maybe accounting is running the daily P&L report, or operations is running reports spanning days, weeks, or longer.

How can this be resolved or improved upon? The idea is to get the consumers of long-running reports out of the order-entry system. Can they run on a backup copy? Can we implement transaction log shipping? What about a database cluster? What about data replication? These are the common questions raised when searching for a solution. Getting those users out of the order-entry system is a step in the right direction; picking the right plan is often a matter of trial and error. Using a restored backup as a report server may be acceptable if users don't mind day-old or week-old data, depending on the frequency of your backups. What about transaction log shipping?
There are two problems with log shipping. First, the Microsoft-supported log shipping method is only available in the (very expensive) Enterprise Edition of SQL Server. A home-grown method can be built on Standard Edition, but users on the report server are disconnected each time a transaction log is replayed. Depending on how frequently the report server is refreshed from production, this can be very painful for your report user community.

Database clustering is mainly for high availability, not a way to offload work. A common configuration has one server sitting idle; the idle node cannot be used or updated, it is simply waiting for the primary node to fail. A second instance of SQL Server can be installed on the idle node, but it does nothing to help the primary node, and when the primary fails, the primary's instance runs on the secondary node alongside it, with serious performance issues from running two instances of SQL Server on one node.

Finally we get to database replication. The replication type best suited to feeding a report server is transactional replication. It is relatively easy to implement, and in most systems it can be set up entirely with the wizards provided in Enterprise Manager. No table changes are required, and the latency of data arriving at your report server, or subscribers, can be as little as 10 seconds. I often use this as a way to provide "read scalability." Even on larger OLTP systems, around 100 GB with 500+ tables, I often see excellent latency of 10 to 20 seconds to multiple subscribers.

One implementation tip for a large enterprise environment: do not rely on the wizards. Clicking through 500 tables in the wizard is excruciatingly painful when it comes time to rebuild. Use the wizard once if you like, but script the solution and use those scripts to manage it.
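As a rough sketch of what such a script looks like, the documented replication system procedures can replace the wizard. The database, publication, server, and table names below are illustrative, and the script assumes the distributor is already configured and the database is already enabled for publishing (via sp_replicationdboption):

```sql
USE OrderEntry;
GO

-- Create the transactional publication once.
EXEC sp_addpublication
     @publication = N'OrderEntry_Reports',
     @sync_method = N'concurrent',
     @repl_freq   = N'continuous',
     @status      = N'active';
GO

-- Add each table as an article; for 500+ tables, generate one
-- of these calls per table from the system catalog.
EXEC sp_addarticle
     @publication   = N'OrderEntry_Reports',
     @article       = N'Orders',
     @source_object = N'Orders';
GO

-- Push the publication to the report server.
EXEC sp_addsubscription
     @publication       = N'OrderEntry_Reports',
     @subscriber        = N'REPORTSRV',
     @destination_db    = N'OrderEntry_Reports',
     @subscription_type = N'Push';
```

Keeping these scripts in source control means a rebuild is a matter of re-running them rather than re-clicking through hundreds of tables.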
Use a dedicated distributor. Much of the load falls squarely on the "shipping" of data to each subscriber, and you definitely do not want to place that burden on the publisher; the costs will dramatically outweigh the benefits. Remove foreign keys from your subscribers, since replication can load child tables before their parents. Disable triggers on the subscribers; you do not want them firing twice. Finally, add the NOT FOR REPLICATION option to each column with the IDENTITY property, so a new identity value is not generated when the record reaches the subscriber.

In summary, the key phrase of this entire article is "read scalability." Giving your users the ability to run their reports, or any analysis involving large amounts of data, on a separate server is key. Much of that 60% "read" load can easily be offloaded to a report server. Developers and data architects often neglect to archive data, leaving historical data in the order-entry system, which makes separating analysis from order processing all the more important. It is the key to making your database administrator, IT management, and your users very happy.
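The subscriber and identity preparations described above can be sketched in T-SQL as follows; every table, constraint, and trigger name here is hypothetical:

```sql
-- 1. On the subscriber: drop foreign keys so child rows can arrive
--    before their parent rows without failing the distribution agent.
ALTER TABLE dbo.OrderLines DROP CONSTRAINT FK_OrderLines_Orders;

-- 2. On the subscriber: disable triggers so business logic that already
--    fired on the publisher does not fire a second time.
ALTER TABLE dbo.Orders DISABLE TRIGGER ALL;

-- 3. On the publisher: mark identity columns NOT FOR REPLICATION so the
--    replication agent's inserts keep the replicated identity value
--    instead of generating a new one at the subscriber.
CREATE TABLE dbo.Orders (
    OrderID    int IDENTITY(1,1) NOT FOR REPLICATION PRIMARY KEY,
    CustomerID int NOT NULL,
    OrderDate  datetime NOT NULL
);
```

Dropping constraints and triggers is safe on the subscriber precisely because the data was already validated on the publisher before it was replicated.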