(Draft Paper to be submitted to 4th International Conference on Interactive Digital Media (ICIDM) 2015)
We create a graph backing store for OpenCog[2, 3, 26, 27, 28] using the Neo4j graph database. The GraphBackingStore API extends the current BackingStore C++ API. It accepts special queries that map naturally onto Cypher queries and simple manipulations of Neo4j graph traversals.
The Neo4j node-relationship structures and custom indices are optimized for AtomSpace usage and performance, with considerations for integration with Linked Data. This makes it easier to integrate projects that share a common Linked Data foundation, such as Lumen Robot Friend. Data access uses a Protocol Buffers-based protocol over the ZeroMQ messaging transport.
The initial work for Neo4j Graph Backing Store for OpenCog was implemented during Google Summer of Code 2015. All source code is open source and available at https://github.com/ceefour/opencog-neo4j, https://github.com/opencog/atomspace, and https://github.com/opencog/opencog.
Keywords: OpenCog, graph database, Neo4j, artificial intelligence, ZeroMQ, Linked Data, YAGO, semantic web, protocol buffers, knowledge representation
OpenCog “[aims] to create an open source framework for Artificial General Intelligence, intended to one day express general intelligence at the human level and beyond.” OpenCog comprises several integrated open source projects, with the AtomSpace hypergraph knowledge representation system and Cog Server as the most essential components.
For persisting knowledge, AtomSpace uses a Backing Store implementation. The most mature Backing Store implementation for OpenCog is the SQL Backing Store written by Linas Vepstas. As AtomSpace represents knowledge in a hypergraph structure, persisting AtomSpace knowledge in a graph database is expected to be more intuitive and more portable. The Google Summer of Code 2015 project produced the initial implementation of the Neo4j Graph Backing Store for OpenCog. Neo4j was chosen due to its popularity, flexible structure, good performance, portability, open source license, and availability of commercial support if required.
AtomSpace is embedded within Cog Server, which is written in a combination of the C++, Python, and Scheme programming languages. The AtomSpace accesses the Neo4j Backing Store via a ZeroMQ Backing Store implementation, which implements the AtomSpace C++ BackingStore API using message patterns serialized with Protocol Buffers and transported via ZeroMQ.
The Neo4j Backing Store Server itself is implemented in Java and contains the Neo4j graph database service. The Neo4j Backing Store Server listens for and communicates with clients using the OpenCog ZeroMQ protocol, which uses Protocol Buffers serialization and ZeroMQ message transport. The primary client of the Neo4j Backing Store Server is the ZeroMQ Backing Store implementation in AtomSpace.
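A truth value gives each Atom a valuation or an interpretation; thus all Atoms in a given, fixed AtomSpace always carry a default valuation/interpretation along with them. All truth values expose at least two parameters describing truth:
strength. A floating-point value ranging from 0 to 1, with 0 denoting false, and 1.0 denoting true.
confidence. A floating-point value ranging from 0 to 1, expressing the certainty of the strength.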
Currently, OpenCog supports 7 types of truth values:
SimpleTruthValue. Holds two floating-point values, commonly called “strength” and “confidence”.
CountTruthValue. Holds three floating point values. One is typically a raw count (integer) of having seen some event. Another is typically the logarithm of the normalized frequency (i.e. observed probability) of the event.
IndefiniteTruthValue. Holds four floating point numbers.
FuzzyTruthValue. Fuzzy truth value.
GenericTruthValue. Generic truth value.
NullTruthValue. A special type of truth value.
ProbabilisticTruthValue. A truth value that stores a mean, a confidence and the number of observations.
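As an illustration of the simplest case, the following Python sketch models a SimpleTruthValue as an immutable pair of floats. The class and the `to_scheme` helper are hypothetical and written for this example only; they are not OpenCog's own C++ or Python bindings. The range check assumes both values lie in [0, 1].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimpleTruthValue:
    """Illustrative stand-in: two floats, 'strength' and 'confidence'."""

    strength: float
    confidence: float

    def __post_init__(self) -> None:
        # Assumption for this sketch: both parameters are restricted to [0, 1].
        for field_name in ("strength", "confidence"):
            value = getattr(self, field_name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{field_name} must be in [0, 1], got {value}")

    def to_scheme(self) -> str:
        # Render in the (stv strength confidence) form used in Scheme notation.
        return f"(stv {self.strength} {self.confidence})"
```

For example, `SimpleTruthValue(0.9, 0.2).to_scheme()` yields the string `(stv 0.9 0.2)`.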
AtomSpace is an essential component of OpenCog, “an API for storing and querying hypergraphs, and is the central Knowledge Representation system provided by the OpenCog Framework. The hypergraphs stored in the AtomSpace consist of Atoms (the superclass for Nodes and Links).”
Scheme is a principal dialect of the computer programming language Lisp. Scheme follows a minimalist design philosophy that specifies a small standard core accompanied by powerful tools for language extension. This is in contrast to Common Lisp, which has a relatively richer “core”.
Scheme is used by OpenCog in many places to represent Atoms, for example a simple ConceptNode with a SimpleTruthValue:
(ConceptNode (stv 0.9 0.2) "tree")
and a ReferenceLink with a SimpleTruthValue:
(ReferenceLink (stv 0.987 0.234) (ConceptNode "dog-instance-1") (WordNode "dog"))
OpenCog’s Scheme notation makes it easier to represent and understand simple to moderately complex Atom relationships.
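To make the structure behind this notation concrete, here is a minimal Python sketch that models Nodes and Links as plain data classes and renders them back into the Scheme form shown above. The `Node`, `Link`, and `to_scheme` names are hypothetical helpers for this example and do not correspond to the OpenCog API.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

# (strength, confidence) of a SimpleTruthValue; None means "not specified".
Stv = Optional[Tuple[float, float]]


@dataclass(frozen=True)
class Node:
    atom_type: str   # e.g. "ConceptNode"
    name: str        # e.g. "tree"
    stv: Stv = None


@dataclass(frozen=True)
class Link:
    atom_type: str                 # e.g. "ReferenceLink"
    outgoing: Tuple["Atom", ...]   # ordered outgoing set
    stv: Stv = None


Atom = Union[Node, Link]


def _render_stv(stv: Stv) -> str:
    return "" if stv is None else f" (stv {stv[0]} {stv[1]})"


def to_scheme(atom: Atom) -> str:
    """Render an atom recursively in OpenCog-style Scheme notation."""
    if isinstance(atom, Node):
        return f'({atom.atom_type}{_render_stv(atom.stv)} "{atom.name}")'
    children = "".join(" " + to_scheme(child) for child in atom.outgoing)
    return f"({atom.atom_type}{_render_stv(atom.stv)}{children})"
```

Calling `to_scheme` on a `Link` of type `ReferenceLink` whose outgoing set holds the two Nodes from the example reproduces the ReferenceLink expression character for character.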
AtomSpace works mainly in memory. It supports persisting Atoms to permanent storage via the Backing Store mechanism. AtomSpace supports 5 Backing Store implementations in various states:
file. File-based storage. Works, but deprecated.
hypertable. Experimental HyperTable support. Unmaintained. (Won’t compile at this time.)
memcache. Experimental/broken, uses memcached for persistence. (Won’t compile at this time.)
sql. Works well for most uses – with caveats.
zmq. ZeroMQ-based AtomSpace serialization and deserialization. This is the Backing Store implementation that is used with the Neo4j Graph Backing Store Server.
The AtomSpace hypergraph model and Neo4j graph model have many similarities. However, we still needed to map particular AtomSpace concepts to their representation in Neo4j.
OpenCog Neo4j Graph Backing Store represents all Nodes as graph vertices, and by default also represents Links as graph vertices. The Atom types are mapped one-to-one to Neo4j vertex labels. By default, Atom types are mapped to a vertex label prefixed with …; WordNode maps to ….
Neo4j allows graph vertices to have labels. While multiple labels are possible, OpenCog Neo4j Graph Backing Store only uses one label for each vertex.
Neo4j Backing Store distinguishes between binary Links and n-ary Links. For binary Links, the edge types are:
rdf_subject. Edge that connects to the first element of the outgoing set.
rdf_object. Edge that connects to the second/last element of the outgoing set.
For n-ary Links, Neo4j Backing Store uses these edge types:
opencog_predicate. Only used by PredicateLink, this connects the PredicateLink to its PredicateNode.
opencog_parameter. This connects the Link to each element of the outgoing set. In the case of PredicateLink, the first outgoing edge to the PredicateNode is excluded. Each edge has an integer position property which specifies the parameter index in the outgoing set (starting from 0).
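The edge mapping described here can be sketched in Python as a function that turns one Link vertex and its outgoing set into a list of typed edges. This is an illustration under stated assumptions, not the server's Java code: `link_edges` and the tuple layout are hypothetical, a PredicateLink is assumed always to use the opencog_predicate/opencog_parameter scheme with its PredicateNode first, and `position` is assumed to be the element's index in the full outgoing set.

```python
from typing import Dict, List, Tuple

# (source vertex id, edge type, target vertex id, edge properties)
Edge = Tuple[str, str, str, Dict[str, int]]


def link_edges(link_id: str, link_type: str, outgoing: List[str]) -> List[Edge]:
    """Return the edges from a Link vertex to the vertices of its outgoing set."""
    if link_type == "PredicateLink":
        if not outgoing:
            raise ValueError("a PredicateLink needs a PredicateNode as first element")
        # The first element is reached through opencog_predicate and is
        # excluded from the opencog_parameter edges.
        edges: List[Edge] = [(link_id, "opencog_predicate", outgoing[0], {})]
        parameters = list(enumerate(outgoing))[1:]
    elif len(outgoing) == 2:
        # Binary Link: subject/object edges, no position property needed.
        return [
            (link_id, "rdf_subject", outgoing[0], {}),
            (link_id, "rdf_object", outgoing[1], {}),
        ]
    else:
        edges = []
        parameters = list(enumerate(outgoing))
    # Assumption: position is the index within the full outgoing set.
    edges.extend(
        (link_id, "opencog_parameter", target, {"position": position})
        for position, target in parameters
    )
    return edges
```

Keeping the ordering in an explicit `position` property matters because graph edges are unordered, while the outgoing set of a Link is an ordered sequence.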
Neo4j already has excellent graph traversal performance. To provide faster lookup performance, all graph vertices are also indexed.
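As a rough illustration of an indexed lookup, the Python sketch below builds two Cypher strings: one that creates a schema index on a label and one that finds a vertex through it. The helper names, the `name` property, and the `value` parameter are assumptions made for this example; the paper does not show the actual index definitions. The statements use Neo4j 2.x-era Cypher syntax, and newer Neo4j releases spell both index creation and query parameters differently.

```python
import re

# Cypher labels and property keys cannot be passed as query parameters,
# so this sketch only accepts plain identifiers to keep the text safe.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _checked(identifier: str) -> str:
    if not _IDENTIFIER.fullmatch(identifier):
        raise ValueError(f"not a plain identifier: {identifier!r}")
    return identifier


def index_statement(label: str, prop: str = "name") -> str:
    """Cypher that creates a schema index on label(prop)."""
    return f"CREATE INDEX ON :{_checked(label)}({_checked(prop)})"


def lookup_query(label: str, prop: str = "name") -> str:
    """Cypher that finds vertices by label and property; the value is a parameter."""
    return (
        f"MATCH (n:{_checked(label)}) "
        f"WHERE n.{_checked(prop)} = {{value}} RETURN n"
    )
```

The label passed in would be whatever vertex label the backing store assigns to the Atom type; the label prefix is left to the caller here.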
While Neo4j Server provides a REST/HTTP endpoint out of the box, AtomSpace operations benefit from lower serialization overhead, and the design should also leave room for asynchronous/event-based functionality. Therefore, the Neo4j Graph Backing Store Server supports the OpenCog Protocol, which uses the ZeroMQ transport and Protocol Buffers serialization.
The ZeroMQ transport layer is used to communicate between the Neo4j Graph Backing Store Server and its clients. The primary client is the OpenCog Server.
ZeroMQ has several attractive features:
OpenCog Protocol serializes messages using Protocol Buffers, a language-neutral, platform-neutral extensible mechanism for serializing structured data.
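To give a feel for why this serialization is compact, the following Python sketch hand-encodes two fields in the Protocol Buffers wire format (base-128 varints and length-delimited strings). It is only an illustration: a real client would use classes generated by `protoc` from the project's `.proto` files, and the field numbers used here are hypothetical rather than taken from the OpenCog schema.

```python
WIRE_VARINT = 0            # wire type for integer fields
WIRE_LENGTH_DELIMITED = 2  # wire type for strings, bytes, nested messages


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint, low group first."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        group = value & 0x7F
        value >>= 7
        if value:
            out.append(group | 0x80)  # continuation bit: more groups follow
        else:
            out.append(group)
            return bytes(out)


def encode_uint_field(field_number: int, value: int) -> bytes:
    """Key (field number and wire type) followed by the varint value."""
    return encode_varint(field_number << 3 | WIRE_VARINT) + encode_varint(value)


def encode_string_field(field_number: int, text: str) -> bytes:
    """Key, then byte length, then the UTF-8 payload."""
    payload = text.encode("utf-8")
    key = encode_varint(field_number << 3 | WIRE_LENGTH_DELIMITED)
    return key + encode_varint(len(payload)) + payload
```

With these helpers, an integer field numbered 1 holding 150 takes three bytes (`08 96 01`), and a short node name costs only two bytes of overhead beyond its UTF-8 text.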
The following table lists the core data structures of the OpenCog Protobuf serialization:
In a ZeroMQ message, the client can request a particular function of the OpenCog Neo4j Backing Store Server. We plan to expand these functions to optimize more OpenCog graph operations. The currently supported functions are listed in the following table.
||Get a single atom by handle UUID.|
||Get name of a node.|
||Get multiple atoms by UUID, atomType + node name, or atomType + outgoing set.|
||Store multiple atoms by atomType + node name (for node), or atomType + outgoing set (for link).|
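The semantics of these four functions can be sketched with a small in-memory stand-in written in Python. Everything named here (`InMemoryBackingStore`, `store_atoms`, `get_atom`, `get_node_name`, `get_atoms`, and the dictionary layout of an atom) is hypothetical and chosen for this example; the real server answers Protocol Buffers messages and stores atoms in Neo4j. The sketch assumes that storing an atom whose type and name (or type and outgoing set) already exist returns the existing UUID.

```python
from typing import Dict, Iterable, List, Optional, Tuple


class InMemoryBackingStore:
    """Toy backing store keyed by UUID and by (type, name, outgoing set)."""

    def __init__(self) -> None:
        self._by_uuid: Dict[int, dict] = {}
        self._by_key: Dict[Tuple, int] = {}
        self._next_uuid = 1

    @staticmethod
    def _key(atom: dict) -> Tuple:
        outgoing = atom.get("outgoing")
        return (atom["type"], atom.get("name"),
                None if outgoing is None else tuple(outgoing))

    def store_atoms(self, atoms: Iterable[dict]) -> List[int]:
        """Store nodes (type + name) or links (type + outgoing); return UUIDs."""
        uuids = []
        for atom in atoms:
            key = self._key(atom)
            uuid = self._by_key.get(key)
            if uuid is None:  # first time this atom is seen: assign a UUID
                uuid = self._next_uuid
                self._next_uuid += 1
                self._by_key[key] = uuid
                self._by_uuid[uuid] = dict(atom, uuid=uuid)
            uuids.append(uuid)
        return uuids

    def get_atom(self, uuid: int) -> Optional[dict]:
        """Get a single atom by handle UUID, or None if unknown."""
        return self._by_uuid.get(uuid)

    def get_node_name(self, uuid: int) -> Optional[str]:
        atom = self._by_uuid.get(uuid)
        return None if atom is None else atom.get("name")

    def get_atoms(self, requests: Iterable[dict]) -> List[Optional[dict]]:
        """Each request carries 'uuid', or 'type' plus 'name' or 'outgoing'."""
        results = []
        for request in requests:
            if "uuid" in request:
                uuid = request["uuid"]
            else:
                uuid = self._by_key.get(self._key(request))
            results.append(self._by_uuid.get(uuid))
        return results
```

Batching several lookups or stores into one call, as the last two functions do, saves a network round trip per atom once the store sits behind a message transport.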
OpenCog Neo4j Backing Store Server exposes its functionality through the ZeroMQ-based OpenCog Protocol, which means any client can access it through the appropriate messages. The primary client for OpenCog Neo4j Backing Store Server is the OpenCog Server.
OpenCog Server is the primary software which contains the OpenCog Artificial General Intelligence and the AtomSpace knowledge hypergraph. The AtomSpace hypergraph can be loaded from and persisted to Neo4j using the appropriate Scheme commands. These commands allow AtomSpace to integrate with the Neo4j Backing Store Server using the OpenCog ZeroMQ Protocol. The supported commands are listed in the table below.
||Close the currently open ZeroMQ persistence.|
||Load contents of ZeroMQ persistence.|
||Open connection to ZeroMQ persistence.|
||Save the atomtable on the ZeroMQ persistence.|
A Java-based Command Line Interface (CLI) client is also available for ad-hoc testing of the Neo4j Graph Backing Store Server or any backing store supporting the OpenCog ZeroMQ Protocol.
A typical deployment of an OpenCog application with the Neo4j Graph Backing Store requires the following components:
cogserver Scheme shell at port 17001, or via the OpenCog ZeroMQ Protocol.
OpenCog Framework is a mature and comprehensive open source Artificial General Intelligence (AGI) framework that can be used for various purposes, including humanoid robotics, natural language processing, and probabilistic reasoning. AtomSpace, an essential part of OpenCog, provides the knowledge hypergraph which allows individual artificial intelligence modules to integrate and manipulate knowledge in a uniform way using a common metamodel. OpenCog Neo4j Graph Backing Store enhances this functionality by persisting AtomSpace knowledge in a graph database whose structure is very similar to that of AtomSpace. By supporting a platform-agnostic ZeroMQ-based protocol, OpenCog Neo4j Graph Backing Store opens up potential use cases for integration with intelligent applications. OpenCog Neo4j Graph Backing Store also maps well to Linked Data technologies, especially by reusing the Resource Description Framework (RDF), Simple Knowledge Organization System (SKOS), Schema.org, and YAGO[20, 30, 32] ontologies. This design is intended to ease collaboration with other researchers and technology industry players, particularly from the Linked Data community, YAGO researchers, the Artificial Intelligence community, and Lumen Robot Friend researchers.
We would like to thank the OpenCog Foundation mentors, especially Linas Vepstas, Amen Belayneh, and Ben Goertzel, without whose great efforts we could not have built the Neo4j Graph Backing Store for OpenCog. We thank Fabian M. Suchanek and his team for their excellent work on the YAGO Semantic Knowledge Base[20, 30, 32]. We also thank all former, current, and newer Lumen Robot Friend team members, including Budhi Yulianto, who researched Graph Database for Lumen Robot Friend, Marzuki Syahfirin as coordinator of Lumen Robot Friend, Maria Shusanti for Lumen Robot Friend gesture research, Wahyudi for natural language understanding research, Sigit Ari Wijanarko for researching Lumen Robot Friend augmented reality visualization, Taufiq Nuzwir Nizar for image processing algorithms, Ahmad Syarif for NAO avatar integration, Putri Nhirun for Lumen speech capabilities, and Setyaki Sholata Sya for Lumen visual recognition.