
Fig 01 - Stardog functional architecture
The functional architecture of Stardog is illustrated in Figure 1. The storage layer is founded on RocksDB (a key-value store written in C++) and deployment is based on containers and Kubernetes.
Figure 1 shows the support for SPARQL and Gremlin that we have already mentioned, and the support for the GraphQL API is also noteworthy. However, the most important elements to explain are the virtual graph and natural language processing pipeline (BITES) capabilities, along with support for declarative models, which require no coding and enable the creation of your knowledge graph(s). BITES is an extensible document storage system that provides configurable storage and processing for unifying unstructured data with Stardog graphs.
Turning to virtual graphs, Stardog believes (and we agree) that support for data virtualisation is fundamental to supporting knowledge graphs. In this context it is important to appreciate that graphs can be used to represent any sort of data, not just business data. When federating data to create a knowledge graph, this means building a graph that represents metadata about the source environments. With structured and semi-structured data sources, you create a graph that represents all the data sources you can address, and any particular query is defined by the sub-graph of nodes (data sources) that it accesses. The details of each source that you might want to access are defined in what Stardog calls virtual graphs. To create these virtual graphs, you declaratively (no code) map tabular data into the graph model used by a Stardog database, typically using R2RML (the W3C's RDB to RDF Mapping Language). At run-time, Stardog makes use of caching to optimise federated query performance.
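To make the idea of a declarative mapping more concrete, the following is a minimal Python sketch of how a mapping description can turn tabular rows into graph triples. It is an illustration only and is not Stardog's API or R2RML syntax: the `MAPPING` dictionary, the `rows_to_triples` function, the `example.com` identifiers and the sample rows are all assumptions invented for this example. The point it tries to show is that the mapping is data rather than code, so a new source is described, not programmed.

```python
from typing import Any, Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical mapping, loosely inspired by the ideas behind R2RML:
# a subject IRI template, a class, and a column-to-predicate table.
MAPPING: Dict[str, Any] = {
    "subject_template": "http://example.com/customer/{id}",
    "class": "http://example.com/ns#Customer",
    "predicates": {
        "name": "http://example.com/ns#name",
        "city": "http://example.com/ns#city",
    },
}


def rows_to_triples(rows: Iterable[Dict[str, Any]],
                    mapping: Dict[str, Any]) -> List[Triple]:
    """Apply a declarative mapping to tabular rows, yielding triples."""
    triples: List[Triple] = []
    for row in rows:
        # Build the subject IRI from the row's key column(s).
        subject = mapping["subject_template"].format(**row)
        triples.append((subject, RDF_TYPE, mapping["class"]))
        for column, predicate in mapping["predicates"].items():
            value = row.get(column)
            if value is not None:  # skip NULLs rather than emit empty literals
                triples.append((subject, predicate, str(value)))
    return triples


# Sample rows standing in for a relational table (invented data).
ROWS = [
    {"id": 1, "name": "Ada", "city": "London"},
    {"id": 2, "name": "Grace", "city": None},
]

if __name__ == "__main__":
    for triple in rows_to_triples(ROWS, MAPPING):
        print(triple)
```

In a virtualised setting the rows would stay in the source system and be mapped at query time rather than copied up front; this sketch materialises them only to keep the example short.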

Fig 02 - Stardog supporting data virtualisation
Stardog provides out-of-the-box virtual connectors for more than 30 popular sources, some of which are illustrated in Figure 2. Examples of other supported sources include Google BigQuery, Elasticsearch and Exasol. An SDK is available for users to develop their own connectors.
Other notable features of Stardog include extended SQL support, so that you can use tools such as Power BI and Tableau to query data; support for Active Directory; and built-in machine learning capabilities, with support for supervised classification, supervised linear regression and unsupervised similarity detection.