As RDFS and OWL become increasingly popular with wider audiences for creating and describing ontologies and models, a question that comes up more and more often is how to import such an ontology into Neo4j.

To demonstrate how an RDFS/OWL ontology can be imported into Neo4j, we are going to import the W3C Organization Ontology using the neosemantics plugin. There are other ways to achieve this, of course, but this is a good starting point to explore.

What is neosemantics?

Before we dive into our worked example, what exactly is neosemantics? Neosemantics (https://github.com/jbarrasa/neosemantics) is a Neo4j plugin written by Jesus Barrasa for previewing and ingesting RDF/semantic data efficiently, as well as for publishing Labeled Property Graph (LPG) data as RDF.

Accompanying this plugin, Jesus has written a series of blog posts describing the different ways one can import not only data but RDFS/OWL models too. An excellent starting point for more in-depth information is the following post: https://jbarrasa.com/2016/04/06/building-a-semantic-graph-in-neo4j/.

Getting started – W3C Organization Ontology

For reference, the ontology we are going to import is shown below (available from: https://www.w3.org/TR/vocab-org/).

pic0

We are going to use the Turtle serialization of this model (https://www.w3.org/ns/org.ttl) and ingest it with neosemantics. To do this, we’ll run the following Cypher query in the Neo4j Browser:

CALL semantics.importRDF("https://www.w3.org/ns/org.ttl","Turtle",{ languageFilter: 'en' })

A limitation to be aware of is how neosemantics handles multivalued properties during the import process: only the last value imported for a multivalued property will be available in the resulting model. Multivalued properties are most commonly used to provide the same text in different languages, so by using the languageFilter parameter as above, we ensure that we import the language we want into our model. Occasionally there will be multivalued properties that are not language-based. An LPG will happily accommodate multivalued properties via arrays, but in practice multivalued properties outside of languages are very rarely used.
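
The effect of this limitation can be sketched outside Neo4j. The following Python snippet is a simplified model and not the plugin's code: the sample labels and the three function names (last_value_wins, filter_language, collect_as_array) are made up for illustration. It contrasts overwriting a single slot, filtering by language tag first, and collecting every value into an array.

```python
# Simplified model of multivalued literal handling; NOT neosemantics' code.
# Each literal is a (value, language_tag) pair for one property of one resource.
# The sample labels are illustrative assumptions.
labels = [
    ("Organization", "en"),
    ("Organisation", "fr"),
    ("Organizzazione", "it"),
]


def last_value_wins(literals):
    """Keep overwriting a single slot, so only the last value survives."""
    stored = None
    for value, _lang in literals:
        stored = value
    return stored


def filter_language(literals, lang):
    """Drop literals whose language tag differs from `lang`, then store."""
    return last_value_wins([lit for lit in literals if lit[1] == lang])


def collect_as_array(literals):
    """Store every value in a list, as an LPG array property could."""
    return [value for value, _lang in literals]


print(last_value_wins(labels))        # the Italian label, because it came last
print(filter_language(labels, "en"))  # the English label
print(collect_as_array(labels))       # all three values
```

Without a filter, which label survives depends purely on the order in which the literals arrive, which is why fixing the language up front gives a predictable result.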

Once we’ve imported the Organization Ontology, we can have a look at it by running the following Cypher query:

MATCH (n) RETURN *

This returns the following (nodes already coloured):

pic2

As we can see, there are a lot of interconnected relationships. A brief exploration of the model shows that many elements link to ‘org’. We can list these relationships, along with the nodes they start from, by running the following Cypher query:

MATCH (n:Resource {uri:'http://www.w3.org/ns/org'})-[r]-()
WITH type(r) AS rel, startNode(r) AS sn
RETURN rel, sn.uri ORDER BY sn.uri

pic5

These ‘isDefinedBy’ relationships effectively link every single resource to this one node, so we can safely ignore them. Statements of this type are sometimes used to indicate the RDF vocabulary in which a resource is described (https://www.w3.org/TR/rdf-schema/#ch_isdefinedby). There are a couple of ways we can go about this. We could delete the node and its relationships. However, the approach we’re going to take instead is to remove the Resource label from this node: we still keep all of the information, but we can now easily bring back the rest of the model by matching only on Resource nodes. We’ll use the following two Cypher queries to do this:

//Remove the Resource label
MATCH (n:Resource {uri:'http://www.w3.org/ns/org'})
REMOVE n:Resource

//Return only the nodes with Resource labels and associated relationships
MATCH (n:Resource) RETURN *

With a bit of node dragging to move the classes into roughly the same positions as in the diagram above, we get the following:

pic1

And there is the imported RDFS/OWL ontology, with no loss of data. We can take this graph and re-serialize it to an RDF format.

A few comments about the imported Organization Ontology are in order. As becomes clear quite quickly, what we’d consider to be relationships and properties in Neo4j (edges, data type properties) are all represented as nodes, along with relationships between these nodes to indicate range, domain and so forth. This is because there are certain relationship types and definitions in RDF that Neo4j doesn’t provide for out of the box. This is not a problem; it just requires the user to understand these patterns within Neo4j. For example, take the RDF concept of showing that two relationship types are inverses of each other:

pic4

The left image illustrates a user-friendly view of how this looks diagrammatically in an RDF ontology. The right image shows the same concept replicated in Neo4j as an RDF formalisation of this view, preserving the integrity of the information.
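
To make the pattern concrete, here is a minimal Python sketch of how triples about properties could map onto nodes and relationships. It models the general idea under stated assumptions rather than the plugin's implementation: the to_graph function is a hypothetical helper, and the three triples are written in the style of the Organization Ontology purely for illustration.

```python
# Illustrative only: a toy mapping from RDF triples to nodes and relationships.
# The triples and the to_graph helper are assumptions made for this sketch.
triples = [
    ("org:memberOf", "owl:inverseOf", "org:hasMember"),
    ("org:memberOf", "rdfs:domain", "foaf:Agent"),
    ("org:memberOf", "rdfs:range", "org:Organization"),
]


def to_graph(triples):
    """Turn every subject and object into a node, every predicate into an edge."""
    nodes = set()
    edges = []
    for subject, predicate, obj in triples:
        nodes.update([subject, obj])
        edges.append((subject, predicate, obj))
    return nodes, edges


nodes, edges = to_graph(triples)

# The properties memberOf and hasMember end up as nodes, not relationship types.
print(sorted(nodes))
for source, rel_type, target in edges:
    print(f"({source})-[:{rel_type}]->({target})")
```

The point to notice is that memberOf and hasMember appear in the node set, and the fact that they are inverses is itself just another relationship between those two nodes.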

Tailoring views

Of course, we may well want the results a query brings back to look more like what we have modelled in our graphical interface, and worry less about the relationship consistency constraints. This can be achieved in a number of ways, one of which is to use APOC’s virtual relationship functions (https://neo4j-contrib.github.io/neo4j-apoc-procedures/#_virtual). For example, a starter for 10 may look like this:

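//n is the range class and m is the domain class of property r
MATCH (n:ns0__Class)-[:ns1__range]-(r)-[:ns1__domain]-(m:ns0__Class)
WITH n, r.ns1__label AS lab, m, count(*) AS count
//Draw the virtual relationship from the domain class to the range class
CALL apoc.create.vRelationship(m, lab, {count:count}, n) YIELD rel
RETURN n, m, rel

pic6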

This is a virtual view, and does not in any way change or affect the data underneath.
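
The idea of a derived, read-only view can also be sketched in plain Python. This is a toy model under stated assumptions and not how APOC works internally: the virtual_relationships function is a hypothetical helper, and the sample edges are written in the style of the Organization Ontology for illustration only. It pairs each property's domain with its range and leaves the stored edges untouched.

```python
# Illustrative only: deriving a read-only "virtual" view from stored edges.
# The sample edges and the helper name are assumptions made for this sketch.
stored_edges = [
    ("org:memberOf", "rdfs:domain", "foaf:Agent"),
    ("org:memberOf", "rdfs:range", "org:Organization"),
    ("org:hasMember", "rdfs:domain", "org:Organization"),
    ("org:hasMember", "rdfs:range", "foaf:Agent"),
]


def virtual_relationships(edges):
    """Pair each property's domain with its range; the input is never modified."""
    domains, ranges = {}, {}
    for prop, predicate, cls in edges:
        if predicate == "rdfs:domain":
            domains.setdefault(prop, []).append(cls)
        elif predicate == "rdfs:range":
            ranges.setdefault(prop, []).append(cls)
    view = []
    for prop in domains:
        for domain_cls in domains[prop]:
            for range_cls in ranges.get(prop, []):
                view.append((domain_cls, prop, range_cls))
    return view


snapshot = list(stored_edges)  # copy taken to show the stored data is unchanged
view = virtual_relationships(stored_edges)
for source, rel_type, target in view:
    print(f"({source})-[:{rel_type}]->({target})")
```

Each tuple in the view reads as a class-to-class relationship named after the property, which is the shape most people expect when they picture the ontology.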

Summary

We have demonstrated one approach to importing your RDFS/OWL-based ontology into Neo4j while preserving the detail, and shown how to create user-friendly views for exploration. For further information on working with RDFS and OWL ontologies and models in Neo4j, please do check out the excellent series of posts written by Jesus Barrasa (jbarrasa.com).