(Draft Paper to be submitted to 4th International Conference on Interactive Digital Media (ICIDM) 2015)
We create a graph backing store[1] for OpenCog[2, 3, 26, 27, 28] using the Neo4j graph database[4]. The GraphBackingStore API extends the current BackingStore C++ API. It accepts special queries that map naturally onto Cypher queries and simple manipulations of Neo4j graph traversals.
The Neo4j node-relationship structures and custom indices are optimized for AtomSpace[5] usage and performance, with considerations for integration with Linked Data[17]. This makes it easier to integrate projects that share a common Linked Data foundation, such as Lumen Robot Friend[29]. Data is accessed through a Protocol Buffers-based[6] protocol over the ZeroMQ[7] messaging transport.
The initial work for Neo4j Graph Backing Store for OpenCog was implemented during Google Summer of Code 2015[8]. All source code is open source and available at https://github.com/ceefour/opencog-neo4j, https://github.com/opencog/atomspace, and https://github.com/opencog/opencog.
Keywords: OpenCog, graph database, Neo4j, artificial intelligence, ZeroMQ, Linked Data, YAGO, semantic web, protocol buffers, knowledge representation
OpenCog “[aims] to create an open source framework for Artificial General Intelligence, intended to one day express general intelligence at the human level and beyond.”[9] OpenCog comprises several integrated open source projects, of which the AtomSpace hypergraph knowledge representation system[5] and Cog Server are the most essential components.
For persisting knowledge, AtomSpace uses a Backing Store implementation. The most mature Backing Store implementation for OpenCog is the SQL Backing Store written by Linas Vepstas. As AtomSpace represents knowledge in a hypergraph structure, persisting AtomSpace knowledge in a graph database is expected to be more intuitive and more portable. The Google Summer of Code 2015 work[8] produced the initial implementation of the Neo4j Graph Backing Store for OpenCog. Neo4j was chosen due to its popularity, flexible structure, good performance, portability, open source license, and availability of commercial support if required.
AtomSpace[5] is embedded within Cog Server, which is written in a combination of the C++, Python, and Scheme programming languages. The AtomSpace accesses the Neo4j Backing Store via a ZeroMQ Backing Store implementation, which implements the AtomSpace C++ BackingStore API using message patterns with Protocol Buffers serialization transported via ZeroMQ.
The Neo4j Backing Store Server itself is implemented in Java and contains the Neo4j graph database service. The Neo4j Backing Store Server listens for and communicates with clients using the OpenCog ZeroMQ protocol, which uses Protocol Buffers serialization and ZeroMQ message transport. The primary client of the Neo4j Backing Store Server is the ZeroMQ Backing Store implementation in AtomSpace.
Atoms are the basic elements of knowledge in AtomSpace. An Atom “captures the notion of an atomic formula (or atom) in mathematical logic.”[10] Every Atom has a type, which inherits from either Node or Link. The type of an Atom is immutable.
15 Node types supported out-of-the-box by AtomSpace are:
ConceptNode
NumberNode
TypeNode
TypeChoice
VariableNode
VariableList
ProcedureNode
GroundedProcedureNode
SchemaNode
DefinedSchemaNode
GroundedSchemaNode
PredicateNode
DefinedPredicateNode
GroundedPredicateNode
AnchorNode
57 Link types supported out-of-the-box by AtomSpace are:
OrderedLink
UnorderedLink
SetLink
ListLink
MemberLink
SubsetLink
IntensionalInheritanceLink
ExtensionalSimilarityLink
IntensionalSimilarityLink
AndLink
OrLink
NotLink
SequentialAndLink
AbsentLink
ChoiceLink
PresentLink
ContextLink
TypedVariableLink
LambdaLink
DefineLink
PutLink
PatternLink
GetLink
SatisfactionLink
BindLink
QuoteLink
UnquoteLink
ForallLink
ExistsLink
AverageLink
SatisfyingSetLink
ScholemLink
ImplicationLink
EquivalenceLink
AssociativeLink
InheritanceLink
SimilarityLink
AttractionLink
FreeLink
EvaluationLink
ExecutionLink
SchemaExecutionLink
SchemaEvaluationLink
QuantityLink
VirtualLink
GreaterThanLink
EqualLink
FunctionLink
ExecutionOutputLink
FoldLink
ArithmeticLink
PlusLink
TimesLink
DeleteLink
AssignLink
InsertLink
RemoveLink
A truth value gives each Atom a valuation or an interpretation; thus all Atoms in a given, fixed AtomSpace always carry a default valuation/interpretation along with them.[10] All truth values expose at least two parameters describing truth:[13] a strength, where 0 denotes Boolean false and 1.0 denotes true, and a confidence.
Currently, OpenCog supports 7 types of truth values:
SimpleTruthValue. Holds two floating-point values, commonly called “strength” and “confidence”.
CountTruthValue. Holds three floating-point values. One is typically a raw count (integer) of having seen some event. Another is typically the logarithm of the normalized frequency (i.e. observed probability) of the event.
IndefiniteTruthValue. Holds four floating-point numbers.
FuzzyTruthValue. Fuzzy truth value.
GenericTruthValue. Generic truth value.
NullTruthValue. A special type of truth value.
ProbabilisticTruthValue. A truth value that stores a mean, a confidence, and the number of observations.
AtomSpace is an essential component of OpenCog: “an API for storing and querying hypergraphs, and is the central Knowledge Representation system provided by the OpenCog Framework. The hypergraphs stored in the AtomSpace consist of Atoms (the superclass for Nodes and Links).”[5]
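The Atom model described above (an immutable type, a Node or Link shape, and an attached truth value) can be illustrated with a minimal Python sketch. It is not the AtomSpace C++ implementation: the class and field names are hypothetical, only SimpleTruthValue is modelled, and the [0, 1] range check on confidence is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class SimpleTruthValue:
    """Two floating-point values: strength and confidence."""
    strength: float
    confidence: float

    def __post_init__(self):
        # Assumption: both parameters are restricted to [0, 1].
        for value in (self.strength, self.confidence):
            if not 0.0 <= value <= 1.0:
                raise ValueError("truth value parameters must lie in [0, 1]")


@dataclass(frozen=True)
class Node:
    """A named Atom, for example a ConceptNode called "tree"."""
    atom_type: str
    name: str
    tv: Optional[SimpleTruthValue] = None


@dataclass(frozen=True)
class Link:
    """An Atom whose outgoing set references other Atoms."""
    atom_type: str
    outgoing: Tuple["Atom", ...]
    tv: Optional[SimpleTruthValue] = None


Atom = Union[Node, Link]

tree = Node("ConceptNode", "tree", SimpleTruthValue(0.9, 0.2))
reference = Link(
    "ReferenceLink",
    (Node("ConceptNode", "dog-instance-1"), Node("WordNode", "dog")),
    SimpleTruthValue(0.987, 0.234),
)
```

Because the dataclasses are frozen, assigning to `tree.atom_type` raises an error, which mirrors the rule that the type of an Atom is immutable.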
Scheme is a principal dialect of the computer programming language Lisp. Scheme follows a minimalist design philosophy that specifies a small standard core accompanied by powerful tools for language extension, in contrast to Common Lisp, which has a relatively richer “core”.[14]
Scheme is used by OpenCog in many places to represent Atoms,[15] for example a simple ConceptNode:
(ConceptNode "tree")
A ConceptNode with a SimpleTruthValue:
(ConceptNode (stv 0.9 0.2) "tree")
A ReferenceLink with a SimpleTruthValue:
(ReferenceLink (stv 0.987 0.234)
    (ConceptNode "dog-instance-1") (WordNode "dog"))
OpenCog’s Scheme notation makes it easier to represent and understand simple to moderately complex Atom relationships.
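As an illustration of how regular this notation is, the following Python sketch renders a nested description of an Atom in the Scheme form shown above. The function name `to_scheme` and the tuple layout it accepts are hypothetical, chosen only for this example. The sketch does not escape quotes inside node names.

```python
def to_scheme(atom):
    """Render an Atom description in OpenCog's Scheme notation.

    Hypothetical input shape:
      ("Type", tv_or_None, "name")      for a Node
      ("Type", tv_or_None, [children])  for a Link
    where tv is a (strength, confidence) pair.
    """
    atom_type, tv, body = atom
    parts = [atom_type]
    if tv is not None:
        strength, confidence = tv
        parts.append(f"(stv {strength} {confidence})")
    if isinstance(body, str):
        # Node: the body is the node name.
        parts.append(f'"{body}"')
    else:
        # Link: the body is the outgoing set, rendered recursively.
        parts.extend(to_scheme(child) for child in body)
    return "(" + " ".join(parts) + ")"


node_text = to_scheme(("ConceptNode", (0.9, 0.2), "tree"))
link_text = to_scheme((
    "ReferenceLink", (0.987, 0.234),
    [("ConceptNode", None, "dog-instance-1"), ("WordNode", None, "dog")],
))
```

Here `node_text` and `link_text` reproduce the second and third examples above, with the Link rendered on a single line.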
AtomSpace works mainly in-memory. It supports persisting Atoms to permanent storage via the Backing Store mechanism. AtomSpace supports 5 Backing Store implementations[16] in various states:
file. File-based storage. Works, but deprecated.
hypertable. Experimental HyperTable support. Unmaintained. (Won’t compile at this time.)
memcache. Experimental/broken, uses memcached for persistence. (Won’t compile at this time.)
sql. Works well for most uses, with caveats.
zmq. ZeroMQ-based AtomSpace serialization and deserialization. This is the Backing Store implementation that is used with the Neo4j Graph Backing Store Server.
The AtomSpace hypergraph model and the Neo4j graph model have many similarities. However, we still needed to map particular AtomSpace concepts to their representation in Neo4j.
OpenCog Neo4j Graph Backing Store represents all Nodes as graph vertices, and also by default represents Links as graph vertices. The Atom types are mapped one-to-one to Neo4j vertex labels. By default, Atom types are mapped to a vertex label prefixed with opencog_, e.g., WordNode maps to opencog_WordNode.
Neo4j allows graph vertices to have labels. While multiple labels are possible, OpenCog Neo4j Graph Backing Store only uses one label for each vertex.
Neo4j Backing Store distinguishes between binary Links and n-ary Links. For binary Links, the edge types are:
rdf_subject. Edge that connects to the first element of the outgoing set.
rdf_object. Edge that connects to the second/last element of the outgoing set.
For n-ary Links, Neo4j Backing Store uses these edge types:
opencog_predicate. Only used by PredicateLink, this connects the PredicateLink to the PredicateNode.
opencog_parameter. This connects the Link to each element of the outgoing set. In the case of PredicateLink, the first outgoing edge, to the PredicateNode, is excluded. Each edge has an integer position property which specifies the parameter index in the outgoing set (starting from 0).
The structure and naming of the vertices, edges, and properties of the OpenCog Neo4j graph database are designed to facilitate integration with Linked Data[17] technologies, with special considerations for the Resource Description Framework (RDF)[18], Simple Knowledge Organization System (SKOS)[31], Schema.org[19], and YAGO[20, 30, 32] ontologies.
The following table lists the Linked Data-friendly Neo4j vertex labels used to represent the AtomSpace knowledge base:
Atom type | Linked Data URI | Vertex label | Ontology |
---|---|---|---|
ConceptNode | http://schema.org/Thing | schema_Thing | Schema.org |
InheritanceLink | http://www.w3.org/2000/01/rdf-schema#subClassOf | rdfs_subClassOf | Resource Description Framework (RDF) |
MemberLink | http://www.w3.org/1999/02/22-rdf-syntax-ns#type | rdf_type | Resource Description Framework (RDF) |
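The mapping from Atom types to vertex labels can be sketched in Python as a lookup with a prefixed fallback. The function names are hypothetical, and the sketch assumes that the Linked Data labels in the table take precedence over the default opencog_ prefix. The actual Java implementation may organize this differently.

```python
# Linked Data-friendly labels taken from the table above.
LINKED_DATA_LABELS = {
    "ConceptNode": "schema_Thing",
    "InheritanceLink": "rdfs_subClassOf",
    "MemberLink": "rdf_type",
}
DEFAULT_PREFIX = "opencog_"


def vertex_label(atom_type):
    """Return the single Neo4j vertex label used for an Atom type."""
    # Assumption: a Linked Data label, when defined, replaces the default.
    return LINKED_DATA_LABELS.get(atom_type, DEFAULT_PREFIX + atom_type)


def atom_type_for(label):
    """Invert vertex_label; the mapping is one-to-one."""
    for atom_type, known_label in LINKED_DATA_LABELS.items():
        if known_label == label:
            return atom_type
    if label.startswith(DEFAULT_PREFIX):
        return label[len(DEFAULT_PREFIX):]
    raise ValueError(f"unknown vertex label: {label}")
```

For example, `vertex_label("WordNode")` yields `opencog_WordNode`, while `vertex_label("ConceptNode")` yields `schema_Thing`.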
The Neo4j edge types for binary Links reuse Linked Data properties from the Resource Description Framework (RDF):
Description | Linked Data URI | Edge type | Ontology |
---|---|---|---|
First outgoing element of binary Link | http://www.w3.org/1999/02/22-rdf-syntax-ns#subject | rdf_subject | Resource Description Framework (RDF) |
Second outgoing element of binary Link | http://www.w3.org/1999/02/22-rdf-syntax-ns#object | rdf_object | Resource Description Framework (RDF) |
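The edge scheme for binary and n-ary Links can be sketched as a single Python function that turns an outgoing set into edges. This is an illustration only. It assumes that "binary" means exactly two outgoing elements, and that for a PredicateLink the position values restart at 0 after the PredicateNode, which the description leaves open. The function name and the triple layout are hypothetical.

```python
def outgoing_edges(link_type, outgoing):
    """Return (edge_type, target, properties) triples for one Link vertex."""
    if link_type == "PredicateLink":
        # The first element is the PredicateNode; the rest are parameters.
        predicate, *parameters = outgoing
        edges = [("opencog_predicate", predicate, {})]
        edges += [
            ("opencog_parameter", target, {"position": index})
            for index, target in enumerate(parameters)
        ]
        return edges
    if len(outgoing) == 2:
        # Binary Link: reuse the RDF subject/object properties.
        return [
            ("rdf_subject", outgoing[0], {}),
            ("rdf_object", outgoing[1], {}),
        ]
    # Any other arity: one positioned edge per outgoing element.
    return [
        ("opencog_parameter", target, {"position": index})
        for index, target in enumerate(outgoing)
    ]


binary = outgoing_edges("InheritanceLink", ["cat", "animal"])
predicate = outgoing_edges("PredicateLink", ["eats", "cat", "fish"])
```

Storing the order in a position property is what lets an ordered outgoing set be rebuilt from graph edges, which carry no inherent order.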
The vertex and edge properties used by OpenCog Neo4j Graph Backing Store are designed to interoperate with Linked Data technologies:
Property | Description | Linked Data URI | Ontology |
---|---|---|---|
nn | Node name or Linked Data QName[21] | http://www.w3.org/1999/02/22-rdf-syntax-ns#about | Resource Description Framework (RDF) |
prefLabel | Preferred label | http://www.w3.org/2008/05/skos#prefLabel | Simple Knowledge Organization System (SKOS) |
isPreferredMeaningOf | Indicates the Node is the preferred meaning of specified textual label | http://yago-knowledge.org/resource/isPreferredMeaningOf | YAGO |
tv | Generic truth value (array of floating-point values) | | |
Neo4j already has excellent graph traversal performance. To make lookups of individual vertices faster as well, the following vertex properties are indexed:
Property | Index type |
---|---|
nn | Unique constraint |
prefLabel | Index |
isPreferredMeaningOf | Index |
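As a rough illustration, the indices in the table could be declared with Cypher schema statements such as those generated by the Python sketch below. The statements use the Neo4j 2.x-era syntax, and newer Neo4j releases use a different form. Neo4j indices are declared per label, so applying them to each vertex label in turn is an assumption about the store, and the function name is hypothetical.

```python
def schema_statements(label):
    """Build Cypher schema statements (Neo4j 2.x syntax) for one vertex label."""
    if not label.isidentifier():
        # Guard against labels that would not be safe to splice into Cypher.
        raise ValueError(f"unsupported label: {label!r}")
    return [
        # nn: unique constraint, which is also backed by an index.
        f"CREATE CONSTRAINT ON (n:{label}) ASSERT n.nn IS UNIQUE",
        # prefLabel and isPreferredMeaningOf: plain indices.
        f"CREATE INDEX ON :{label}(prefLabel)",
        f"CREATE INDEX ON :{label}(isPreferredMeaningOf)",
    ]


statements = schema_statements("opencog_WordNode")
```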
While Neo4j Server provides a REST/HTTP endpoint out-of-the-box, AtomSpace operations benefit from lower serialization overhead, and the protocol should also be ready for asynchronous/event-based functionality. Therefore, the Neo4j Graph Backing Store Server supports the OpenCog Protocol, which uses the ZeroMQ transport and Protocol Buffers serialization.
The ZeroMQ[7] transport layer is used to communicate between the Neo4j Graph Backing Store Server and its clients. The primary client is the OpenCog Server.
ZeroMQ has several attractive features:
OpenCog Protocol serializes messages using Protocol Buffers[6], a language-neutral, platform-neutral extensible mechanism for serializing structured data.
The following table lists the core data structures of the OpenCog Protobuf serialization[22]:
Name | Type |
---|---|
ZMQAttentionValueHolderMessage | Message |
ZMQTruthValueType | Enumeration |
ZMQVersionHandleMessage | Message |
ZMQSingleTruthValueMessage | Message |
ZMQTruthValueMessage | Message |
ZMQTrailMessage | Message |
ZMQAtomType | Enumeration |
ZMQAtomMessage | Message |
ZMQAtomFetchKind | Enumeration |
ZMQAtomFetch | Message |
ZMQFunctionType | Enumeration |
ZMQAtomTypeInfo | Message |
ZMQRequestMessage | Message |
ZMQReplyMessage | Message |
In a ZeroMQ message, the client can request a particular function of the OpenCog Neo4j Backing Store Server. It is planned to expand these functions to optimize more OpenCog graph operations. The currently supported functions are listed in the following table.
Function | Description |
---|---|
ZMQgetAtom | Get a single atom by handle UUID. |
ZMQgetName | Get name of a node. |
ZMQgetAtoms | Get multiple atoms by UUID, atomType + node name, or atomType + outgoing set. |
ZMQstoreAtoms | Store multiple atoms by atomType + node name (for node), or atomType + outgoing set (for link). |
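The request/reply behaviour of these four functions can be sketched with an in-memory stand-in written in Python. This is not the server implementation: it replaces Neo4j with dictionaries and replaces ZeroMQ and Protocol Buffers with plain Python dicts. The class name, the request field names, and the use of an integer counter for handle UUIDs are all assumptions made for illustration.

```python
import itertools


class InMemoryBackingStore:
    """Hypothetical stand-in that answers the four supported functions."""

    def __init__(self):
        self._next_uuid = itertools.count(1)
        self._by_uuid = {}   # handle UUID -> stored atom
        self._by_key = {}    # (type, name) or (type, outgoing) -> handle UUID

    @staticmethod
    def _key(atom):
        # Nodes are identified by type + name, Links by type + outgoing set.
        if "name" in atom:
            return (atom["type"], atom["name"])
        return (atom["type"], tuple(atom["outgoing"]))

    def store_atoms(self, atoms):
        handles = []
        for atom in atoms:
            key = self._key(atom)
            handle = self._by_key.get(key)
            if handle is None:
                handle = next(self._next_uuid)
                self._by_key[key] = handle
            self._by_uuid[handle] = dict(atom, uuid=handle)
            handles.append(handle)
        return handles

    def get_atom(self, handle):
        return self._by_uuid.get(handle)

    def get_name(self, handle):
        atom = self._by_uuid.get(handle)
        return atom.get("name") if atom else None

    def get_atoms(self, fetches):
        # Each fetch is by UUID, by type + name, or by type + outgoing set.
        results = []
        for fetch in fetches:
            if "uuid" in fetch:
                handle = fetch["uuid"]
            else:
                handle = self._by_key.get(self._key(fetch))
            results.append(self._by_uuid.get(handle))
        return results

    def dispatch(self, request):
        handlers = {
            "ZMQgetAtom": lambda r: self.get_atom(r["uuid"]),
            "ZMQgetName": lambda r: self.get_name(r["uuid"]),
            "ZMQgetAtoms": lambda r: self.get_atoms(r["fetches"]),
            "ZMQstoreAtoms": lambda r: self.store_atoms(r["atoms"]),
        }
        return handlers[request["function"]](request)


store = InMemoryBackingStore()
dog, word = store.dispatch({"function": "ZMQstoreAtoms", "atoms": [
    {"type": "ConceptNode", "name": "dog-instance-1"},
    {"type": "WordNode", "name": "dog"},
]})
(ref,) = store.dispatch({"function": "ZMQstoreAtoms", "atoms": [
    {"type": "ReferenceLink", "outgoing": [dog, word]},
]})
```

In this sketch, storing the same type and name twice returns the same handle, so repeated stores update an atom instead of duplicating it. Whether the real server behaves this way is not stated in the function table.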
OpenCog Neo4j Backing Store Server exposes its functionality through the ZeroMQ-based OpenCog Protocol, which means any client can access it through the appropriate messages. The primary client for OpenCog Neo4j Backing Store Server is the OpenCog Server.
OpenCog Server is the primary software which contains the OpenCog Artificial General Intelligence and the AtomSpace knowledge hypergraph. The AtomSpace hypergraph can be loaded from and persisted to Neo4j using the appropriate Scheme commands[23]. These commands allow AtomSpace to integrate with the Neo4j Backing Store Server using the OpenCog ZeroMQ Protocol. The supported commands are listed in the table below.
Command | Description |
---|---|
zmq-close | Close the currently open ZeroMQ persistence. |
zmq-load | Load contents of ZeroMQ persistence. |
zmq-open | Open connection to ZeroMQ persistence. |
zmq-store | Save the atomtable on the ZeroMQ persistence. |
A Java-based Command Line Interface (CLI) client is also available for ad-hoc testing of the Neo4j Graph Backing Store Server or any backing store supporting the OpenCog ZeroMQ Protocol.
A typical deployment of an OpenCog application with the Neo4j Graph Backing Store requires the following components:
cogutils, atomspace, and opencog.
cogserver Scheme shell at port 17001, or via the OpenCog ZeroMQ Protocol.
OpenCog Framework is a mature and comprehensive Artificial General Intelligence (AGI) framework which is open source and can be used for various purposes, including humanoid robotics, natural language processing, and probabilistic reasoning. AtomSpace, an essential part of OpenCog, provides the knowledge hypergraph which allows individual artificial intelligence modules to integrate and manipulate knowledge in a uniform way using a common metamodel. OpenCog Neo4j Graph Backing Store enhances this functionality by providing persistent storage of the AtomSpace knowledge in a graph database whose structure is very similar to that of AtomSpace. By supporting a platform-agnostic ZeroMQ-based protocol, OpenCog Neo4j Graph Backing Store opens up potential use cases of integration with intelligent applications. OpenCog Neo4j Graph Backing Store also maps well to Linked Data technologies, especially by reusing the Resource Description Framework (RDF)[18], Simple Knowledge Organization System (SKOS)[31], Schema.org[19], and YAGO[20, 30, 32] ontologies. This is a deliberate design choice to ease collaboration with other researchers and technology industry players, particularly from the Linked Data community, YAGO researchers, the Artificial Intelligence community, and Lumen Robot Friend[29] researchers.
We would like to thank the OpenCog Foundation mentors, especially Linas Vepstas, Amen Belayneh, and Ben Goertzel, without whose great efforts we could not have built the Neo4j Graph Backing Store for OpenCog. We thank Fabian M. Suchanek and his team for their excellent work on the YAGO Semantic Knowledge Base[20, 30, 32]. We also thank all former, current, and newer Lumen Robot Friend[29] team members, including Budhi Yulianto, who researched Graph Database for Lumen Robot Friend; Marzuki Syahfirin, coordinator of Lumen Robot Friend; Maria Shusanti, for Lumen Robot Friend gesture research; Wahyudi, for natural language understanding research; Sigit Ari Wijanarko, for researching Lumen Robot Friend augmented reality visualization; Taufiq Nuzwir Nizar, for image processing algorithms; Ahmad Syarif, for NAO avatar integration; Putri Nhirun, for Lumen speech capabilities; and Setyaki Sholata Sya, for Lumen visual recognition.
atomspace/opencog/persist/README. https://github.com/opencog/atomspace/blob/master/opencog/persist/README.
atomspace/opencog/persist/zmq/atomspace/ZMQMessages.proto. https://github.com/opencog/atomspace/blob/master/opencog/persist/zmq/atomspace/ZMQMessages.proto.
opencog/opencog/modules/PersistZmqModule.h. https://github.com/ceefour/opencog/blob/persist-zmq/opencog/modules/PersistZmqModule.h.