Overview

Recall that:

  • Document Model: good for 1-m relationships or no relationships between records.
  • Relational model: can handle simple m-m relationships between records.

As the connections within the data grow more complex, it becomes natural to model the data as a graph.

Examples of Graph Data Models

Examples of data that can be modelled as graphs include:

  • Social Graphs: vertices are people, and edges indicate which people know each other.
  • Web Graph: vertices are web pages, and edges indicate which HTML links to other pages.
  • Road Networks: vertices are junctions, and edges represent the roads between them.

In these examples, all vertices in a graph are of the same kind. However, graphs can also provide a consistent way of storing different types of objects in a single data store. For example, Facebook maintains a single graph with many different types of vertices and edges.

Property Graphs

In the property graph model, each vertex consists of:

  • A unique identifier.
  • A set of outgoing edges.
  • A set of incoming edges.
  • A collection of properties (key-value pairs).

Each edges consists of:

  • A unique identifier.
  • The vertex at which the edge starts (tail vertex).
  • The vertex at which the edge ends (head vertex).
  • A label to describe the kind of relationship between the two vertices.
  • A collection of properties (key-value pairs).

Some important aspects of this model are:

  • Any vertex can have an edge connecting it with any other vertex.
  • Given any vertex, we can efficiently find both its incoming and outgoing edges, and thus traverse the graph.
  • By using different labels for different kinds of relationships, we can store several different kinds of information in a single graph.

Query languages for property graphs include Cypher.

Triple-Stores

In the triple-store model, all information is stored in the form of three-parth statements: (subject, predicate, object). This is conceptually mostly equivalent to the property graph model.

Query languages for triple stores include SPARQL and Datalog.