Content Copyright © 2014 Bloor. All Rights Reserved.
Also posted on: The IM Blog
This is a question that may not seem to have much relevance to IT. However, there are lots of geo-spatial applications where it is fundamental. One approach is simply to use longitude and latitude, but the problem then is the time it takes to construct a map based on those co-ordinates. Or you can store an image and then overlay data on top of the image. But what if the map is not static? For example, suppose you want to superimpose weather conditions in real-time, layer other data on top as well (for example, where tweets are coming from), and drill down from a broader picture to more localised detail. How do you do all that with both scale and performance?
Successfully doing this with both accuracy and performance, as well as scale, is the claim behind a new database offering called SpaceCurve. So, how does SpaceCurve work? There are four things you need to know. The first is that its storage and execution engine bypasses the operating system so that you get improved performance. That’s the easy part. The second thing is that SpaceCurve stores data using polymorphic space-filling curves. Now, a space-filling curve is a mathematical concept (technically, a mapping from n-dimensional space to m-dimensional space) in which the curve goes through every point in the space and, it turns out, such curves are well suited to parallelisation. However, what you do not want to do is to limit yourself to a particular mapping or space-filling curve. That’s a bit like sharding your data: good for what you originally thought of but not so useful for other applications that have different requirements. This is where the polymorphic nature of the curves comes in. “Polymorphic” is used here in a computer science sense (as in object orientation) rather than mathematically, where it has no meaning: SpaceCurve is not tied to any particular curve but can dynamically construct and remap to whatever is most appropriate (determined by built-in algorithms) for the data in question.
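To make the idea of a space-filling curve more concrete, here is a minimal sketch in Python of one well-known example, the Z-order (Morton) curve, which turns a two-dimensional position into a single sortable key. This is purely an illustration: the curves SpaceCurve actually uses are not described here, and the function names, the 16-bit grid resolution and the longitude/latitude scaling are all assumptions of mine.

```python
from typing import Tuple


def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)      # bits of x go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)  # bits of y go to odd positions
    return key


def deinterleave_bits(key: int, bits: int = 16) -> Tuple[int, int]:
    """Recover the (x, y) grid cell from a Z-order key."""
    x = y = 0
    for i in range(bits):
        x |= ((key >> (2 * i)) & 1) << i
        y |= ((key >> (2 * i + 1)) & 1) << i
    return x, y


def lonlat_to_key(lon: float, lat: float, bits: int = 16) -> int:
    """Map a longitude/latitude pair onto a grid cell, then onto the curve.

    Assumption: a plain linear scaling of degrees onto a 2^bits by 2^bits grid.
    """
    cells = (1 << bits) - 1
    x = round((lon + 180.0) / 360.0 * cells)
    y = round((lat + 90.0) / 180.0 * cells)
    return interleave_bits(x, y, bits)
```

Points that are close together on the map tend to receive keys that are close together numerically, so a range of keys can be handed to one node and another range to a different node. Fixing on a single curve like this one is exactly the limitation described above, which is what the polymorphic approach is meant to avoid.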
Thirdly, SpaceCurve makes use of hyper-dimensional spatial sieves. These are based on work done by the founder of SpaceCurve (J Andrew Rogers) when he worked on databases for Google Earth, and they concern how you store multi-dimensional polygons within a tree (indexing) structure.
Finally, SpaceCurve uses a version of Allen’s Interval Algebra (a form of Boolean algebra useful for spatiotemporal data) that has been extended to support multi-dimensional environments, which the company uses to parallelise SQL statements (with SQL being automatically translated into the Interval Algebra for you).
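For readers who have not met Allen’s Interval Algebra, it classifies how two intervals relate to one another using thirteen basic relations (before, meets, overlaps and so on). The following Python sketch covers only the classic one-dimensional case and is not SpaceCurve’s multi-dimensional extension, about which I have no implementation detail. The function name and the representation of an interval as a (start, end) pair with start less than end are my own assumptions.

```python
from typing import Tuple

Interval = Tuple[float, float]  # assumed to satisfy start < end


def allen_relation(a: Interval, b: Interval) -> str:
    """Return which of Allen's thirteen basic relations holds between a and b."""
    a_start, a_end = a
    b_start, b_end = b
    if a_end < b_start:
        return "before"
    if a_end == b_start:
        return "meets"
    if b_end < a_start:
        return "after"
    if b_end == a_start:
        return "met-by"
    if a_start == b_start and a_end == b_end:
        return "equals"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started-by"
    if a_end == b_end:
        return "finishes" if a_start > b_start else "finished-by"
    if b_start < a_start and a_end < b_end:
        return "during"
    if a_start < b_start and b_end < a_end:
        return "contains"
    # Only the two partial-overlap cases remain at this point.
    return "overlaps" if a_start < b_start else "overlapped-by"
```

A query predicate such as “this storm window overlaps that delivery window” reduces to a check against one or more of these relations, which gives a flavour of why an algebra of this kind is a natural target for translating SQL conditions on ranges.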
I hope you got your head around all of that. The bottom line is that SpaceCurve can store a lot of data, and it doesn’t just have to be spatial data: the product has been designed to scale to thousands of nodes, though that scale has not been implemented yet. Moreover, it is extremely fast (the company describes its capabilities as “simultaneous real-time capture, fusion and analysis”) and, having seen a demonstration of the product, I have to say that it is extremely impressive. As an aside, bear in mind that the Internet of Things is going to involve a lot of location-based information, and traditional database technology has simply not been designed to handle the sort of scale and performance that this is going to require. It looks to me like SpaceCurve has a lot going for it.