This white paper examines what to do about the growing size of data warehouses, given that much of the data that contributes to that size is rarely required for analysis or other business intelligence purposes. If such data is never needed for these purposes, it can simply be stored off-line. Disproportionate costs arise when the data is required, but only occasionally.
This white paper considers potential solutions to this problem but finds that most approaches are either expensive (because large amounts of disk are required, which remains costly even when low-cost disks are used) or inconvenient (because data is stored off-line and cannot, therefore, be queried without first being reconstituted into the data warehouse).
As an alternative, this report suggests that users in this situation would be well advised to consider the merits of the SAND Searchable Archive. By working on a columnar basis, this product is able to achieve impressive compression ratios, which means that more data can be stored in less space (typically, the archive occupies around one-tenth of the storage that the raw data would require). Moreover, you can perform (relatively simple) queries against this data while it is compressed. There are also facilities for rapidly reconstituting selected datasets into the data warehouse when detailed analysis is required.
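To illustrate why columnar storage compresses well and can still answer simple queries without decompression, the following is a minimal sketch in Python. It is not the SAND Searchable Archive's actual method, which this paper does not describe at this level. It assumes two generic techniques, dictionary encoding and run-length encoding, and the function names and sample data are hypothetical.

```python
def dictionary_encode(column):
    """Replace each value with a small integer code; return (dictionary, codes)."""
    dictionary = {}
    codes = []
    for value in column:
        # A value seen for the first time receives the next unused code.
        code = dictionary.setdefault(value, len(dictionary))
        codes.append(code)
    return dictionary, codes


def run_length_encode(codes):
    """Collapse consecutive repeats of a code into [code, run_length] pairs."""
    runs = []
    for code in codes:
        if runs and runs[-1][0] == code:
            runs[-1][1] += 1
        else:
            runs.append([code, 1])
    return runs


def count_equal(dictionary, runs, value):
    """Count rows equal to `value` using only the compressed form."""
    code = dictionary.get(value)
    if code is None:
        return 0  # the value never occurs in this column
    return sum(length for c, length in runs if c == code)


# Hypothetical sample column: values in a single column tend to repeat,
# which is what makes column-wise compression effective.
region = ["EU"] * 5 + ["US"] * 3 + ["EU"] * 2
dictionary, codes = dictionary_encode(region)
runs = run_length_encode(codes)
eu_rows = count_equal(dictionary, runs, "EU")
```

Here ten stored values collapse into three runs, and the count of matching rows is computed from those runs alone. More complex analysis would still call for reconstituting the data, which is consistent with the division of labour described above.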
In practice, the adoption of the SAND Searchable Archive suggests a new architecture for data warehousing: one that incorporates the Searchable Archive as a component of that architecture to provide near-line storage. It is interesting to note that SAP has bought into this vision and that the Searchable Archive has been integrated into the SAP BI environment.