PSB: the best of all possible worlds
Written By: Philip Howard
Published: 24th June, 2004
Content Copyright © 2004 Bloor. All Rights Reserved.
Traditionally, there have been two ways to implement replication: a log-based approach or a trigger-based one. You could be forgiven for thinking that any argument between the two methods was settled a long time ago. After all, all the leading database vendors implement a log-based approach, though it is true that IBM uses triggers to replicate data to non-IBM environments. However, the fact that it replicates from log files when dealing with its own databases gives the clear impression that IBM regards a trigger-based approach as second best.
However, the issue is by no means as clear cut as you might think. I was talking to PSB recently, and it disagrees with the database vendors. First though: who is PSB? It is a Dutch company that specialises in data replication and synchronisation with a product called High Volume Replicator (HVR). Although the company is relatively small and has no offices outside the Netherlands, the product is resold by Ericsson within the telecommunications market and by Lufthansa to the airline market, which says something about the quality of the product. The company has installations throughout Europe, in the United States and elsewhere.
HVR started life as a specialist facility for Ingres databases but since then it has expanded to support Oracle and SQL Server environments. It does not currently support either DB2 or Sybase but, as development is user-driven, whether it does so in future really depends on demand.
As far as logs and triggers are concerned, the traditional argument in favour of a log-based approach is that it performs much better than a trigger-based one. The reason for this is that when you are doing replication (or synchronisation, which is really just low latency replication) you need to know when each change was made, because it is those times that allow you to handle collisions (that is, cases where different updates to the same data contain contrary information). If you are capturing data as it is written to the database log then the log itself obviously provides that timing information. However, if you are using triggers then, conventionally at least, you have to write a timestamp against each piece of replicated data, and it is this that slows down a trigger-based approach.
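To make the overhead concrete, here is a minimal sketch of conventional trigger-based capture, using Python's built-in sqlite3 module purely for illustration. It is not how HVR works internally; the table, column and trigger names (account, account_changes, account_capture) are hypothetical. The point is simply that the trigger performs an extra stamped write for every change it captures.

```python
import sqlite3

# In-memory database used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER);

-- Hypothetical capture table: one row per change, stamped at capture time.
CREATE TABLE account_changes (
    seq         INTEGER PRIMARY KEY AUTOINCREMENT,
    id          INTEGER,
    balance     INTEGER,
    captured_at TEXT
);

-- The trigger does an extra insert, including a timestamp, for every update.
CREATE TRIGGER account_capture AFTER UPDATE ON account
BEGIN
    INSERT INTO account_changes (id, balance, captured_at)
    VALUES (NEW.id, NEW.balance, strftime('%Y-%m-%d %H:%M:%f', 'now'));
END;
""")

conn.execute("INSERT INTO account VALUES (1, 100)")
conn.execute("UPDATE account SET balance = 150 WHERE id = 1")

# Each captured change carries the time at which the trigger fired.
changes = conn.execute(
    "SELECT id, balance, captured_at FROM account_changes ORDER BY seq"
).fetchall()
print(changes)
```

A replication process would then read account_changes in sequence order and apply the rows to the target, using captured_at if two sites have changed the same row.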
However, as PSB points out, collisions are actually pretty rare, so it is reasonable to turn off time stamping and simply treat any collisions that do occur as exceptions. Further, many application tables include their own time stamps, in which case HVR gives you the opportunity to use these instead. Or, as a third alternative, you can use time stamping as normal.
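The three options can be sketched in a few lines of Python. This is an illustration under stated assumptions, not PSB's implementation: the Change record, the Collision exception and the resolve function are hypothetical names, and the "latest timestamp wins" rule is an assumed policy, since nothing here says how HVR actually resolves a collision once it has the times.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Change:
    key: int
    value: str
    # None models time stamping being turned off. Otherwise the stamp may
    # come from the capture trigger or from the application's own column.
    stamp: Optional[float] = None


class Collision(Exception):
    """Raised when two contrary updates cannot be ordered."""


def resolve(local: Change, incoming: Change) -> Change:
    # Identical values are not a collision at all.
    if local.value == incoming.value:
        return local
    # Without stamps, treat the (rare) collision as an exception.
    if local.stamp is None or incoming.stamp is None:
        raise Collision(f"contrary updates for key {local.key}")
    # Assumed policy: the later change wins.
    return incoming if incoming.stamp > local.stamp else local


winner = resolve(Change(1, "red", 10.0), Change(1, "blue", 12.5))

try:
    resolve(Change(2, "red"), Change(2, "blue"))
    unstamped_raised = False
except Collision:
    unstamped_raised = True
```

With stamps present the later change is kept; with stamping turned off the same pair of updates surfaces as an exception for someone, or something, to deal with separately.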
What I haven't discussed is why you might want to use triggers as opposed to logs. Well, the big reason is that you can build business rules into database triggers. For example, you could define a rule that replicates to different targets automatically, depending upon the value of the data; or you could make decisions about whether or not to replicate at all, and so on. In other words, triggers offer a much more flexible approach.
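As an illustration of that flexibility, the following sketch (again using Python's sqlite3 module, with hypothetical table and trigger names rather than anything taken from HVR) routes new rows to different outbound queues according to a column value, and does not replicate rows that match no rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount INTEGER);

-- Hypothetical outbound queues, one per replication target.
CREATE TABLE queue_eu (id INTEGER, amount INTEGER);
CREATE TABLE queue_us (id INTEGER, amount INTEGER);

-- Business rule: the value of the data decides the target.
CREATE TRIGGER route_eu AFTER INSERT ON orders WHEN NEW.region = 'EU'
BEGIN
    INSERT INTO queue_eu VALUES (NEW.id, NEW.amount);
END;

CREATE TRIGGER route_us AFTER INSERT ON orders WHEN NEW.region = 'US'
BEGIN
    INSERT INTO queue_us VALUES (NEW.id, NEW.amount);
END;
""")

conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "EU", 10), (2, "US", 20), (3, "APAC", 30)],
)

eu = conn.execute("SELECT id, amount FROM queue_eu").fetchall()
us = conn.execute("SELECT id, amount FROM queue_us").fetchall()
print(eu, us)
```

The APAC order matches neither rule, so it is never queued: the decision not to replicate is expressed in the same place as the routing.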
So that is the choice: performance versus flexibility. What you would really like is the option to do either and, in fact, that is exactly what PSB intends to provide in the future by adding a log-based option alongside its triggers, so it will be offering the best of all possible worlds.