The GraphMappingConfig that you create specifies how data extracted from a source data store should be loaded into a Neptune DB cluster. Its format differs depending on whether you are loading RDF data or property-graph data.
For RDF data, you can use the W3C R2RML language for mapping relational data to RDF.
If you are loading property-graph data to be queried using Gremlin, you create a JSON object for the GraphMappingConfig.
GraphMappingConfig Layout for RDF/SPARQL Data
If you are loading RDF data to be queried using SPARQL, you write the GraphMappingConfig in R2RML. R2RML is a standard W3C language for mapping relational data to RDF. Here is one example:
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix ex: <http://example.com/ns#> .

<#TriplesMap1>
    rr:logicalTable [ rr:tableName "nodes" ];
    rr:subjectMap [
        rr:template "http://data.example.com/employee/{id}";
        rr:class ex:Employee;
    ];
    rr:predicateObjectMap [
        rr:predicate ex:name;
        rr:objectMap [ rr:column "label" ];
    ] .
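To make the effect of this mapping concrete, the following Python sketch works out by hand the triples that the TriplesMap1 example would be expected to yield for a single row of the nodes table. It is an illustration only, not how Neptune or any R2RML processor is implemented. The helper names (expand_template, triples_for_row) and the sample row are hypothetical, and urllib.parse.quote is used as an approximation of the IRI-safe encoding that R2RML applies to template values.

```python
import re
from urllib.parse import quote

EX = "http://example.com/ns#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def expand_template(template, row):
    """Replace each {column} placeholder with the row's value, percent-encoded.

    Assumption: quote(..., safe="") approximates R2RML's IRI-safe encoding.
    """
    return re.sub(
        r"\{([^{}]+)\}",
        lambda m: quote(str(row[m.group(1)]), safe=""),
        template,
    )


def triples_for_row(row):
    """Hand-written equivalent of <#TriplesMap1> for one row (illustrative)."""
    subject = expand_template("http://data.example.com/employee/{id}", row)
    return [
        # rr:class ex:Employee adds an rdf:type triple for the subject.
        (subject, RDF_TYPE, EX + "Employee"),
        # rr:predicate ex:name with rr:column "label" adds a literal object.
        (subject, EX + "name", row["label"]),
    ]


# Hypothetical row from the "nodes" table.
row = {"id": 42, "label": "Alice"}
triples = triples_for_row(row)
for triple in triples:
    print(triple)
```

Each row of the logical table contributes one subject IRI built from the template, one rdf:type triple from rr:class, and one triple per predicateObjectMap.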
Here is another example:
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix ex: <http://example.com/#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<#TriplesMap2>
    rr:logicalTable [ rr:tableName "Student" ];
    rr:subjectMap [
        rr:template "http://example.com/{ID}{Name}";
        rr:class foaf:Person
    ];
    rr:predicateObjectMap [
        rr:predicate ex:id ;
        rr:objectMap [ rr:column "ID"; rr:datatype xsd:integer ]
    ];
    rr:predicateObjectMap [
        rr:predicate foaf:name ;
        rr:objectMap [ rr:column "Name" ]
    ] .
For details of the language, see the W3C Recommendation R2RML: RDB to RDF Mapping Language.
GraphMappingConfig Layout for Property-Graph/Gremlin Data
A comparable GraphMappingConfig for property-graph data is a JSON object that provides a mapping rule for each graph entity to be generated from the source data. The following template shows what each rule in this object looks like:
{
  "rules": [
    {
      "rule_id": "(an identifier for this rule)",
      "rule_name": "(a name for this rule)",
      "table_name": "(the name of the table or view being loaded)",
      "vertex_definitions": [
        {
          "vertex_id_template": "{col1}",
          "vertex_label": "(the vertex to create)",
          "vertex_definition_id": "(an identifier for this vertex)",
          "vertex_properties": [
            {
              "property_name": "(name of the property)",
              "property_value_template": "{col2} or text",
              "property_value_type": "(data type of the property)"
            }
          ]
        }
      ]
    },
    {
      "rule_id": "(an identifier for this rule)",
      "rule_name": "(a name for this rule)",
      "table_name": "(the name of the table or view being loaded)",
      "edge_definitions": [
        {
          "from_vertex": {
            "vertex_id_template": "{col1}",
            "vertex_definition_id": "(an identifier for the vertex referenced above)"
          },
          "to_vertex": {
            "vertex_id_template": "{col3}",
            "vertex_definition_id": "(an identifier for the vertex referenced above)"
          },
          "edge_id_template": {
            "label": "(the edge label to add)",
            "template": "{col1}_{col3}"
          },
          "edge_properties": [
            {
              "property_name": "(the property to add)",
              "property_value_template": "{col4} or text",
              "property_value_type": "(data type like String, int, double)"
            }
          ]
        }
      ]
    }
  ]
}
Note that the presence of a vertex label implies that the vertex is being created here, whereas its absence implies that the vertex is created by a different source, and this definition is only adding vertex properties.
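Because each edge definition points at a vertex definition through vertex_definition_id, it can be useful to sanity-check a mapping file before using it. The following Python sketch is one possible local check, not part of any AWS tool. The function name check_vertex_references and the small demo configuration are hypothetical, and the check assumes that every reference is meant to resolve to a vertex definition in the same file.

```python
def check_vertex_references(config):
    """Return a message for each edge endpoint whose vertex_definition_id
    does not match any vertex definition in the same config.

    Assumption: references are expected to resolve within this one file.
    """
    defined = {
        vertex["vertex_definition_id"]
        for rule in config.get("rules", [])
        for vertex in rule.get("vertex_definitions", [])
    }
    problems = []
    for rule in config.get("rules", []):
        for edge in rule.get("edge_definitions", []):
            for end in ("from_vertex", "to_vertex"):
                ref = edge[end]["vertex_definition_id"]
                if ref not in defined:
                    problems.append(
                        f"rule {rule['rule_id']}: {end} references "
                        f"unknown vertex definition {ref!r}"
                    )
    return problems


# Hypothetical config: the to_vertex reference "2" is never defined.
demo_config = {
    "rules": [
        {
            "rule_id": "1",
            "vertex_definitions": [
                {"vertex_id_template": "{col1}", "vertex_definition_id": "1"}
            ],
        },
        {
            "rule_id": "2",
            "edge_definitions": [
                {
                    "from_vertex": {"vertex_id_template": "{col1}", "vertex_definition_id": "1"},
                    "to_vertex": {"vertex_id_template": "{col3}", "vertex_definition_id": "2"},
                    "edge_id_template": {"label": "knows", "template": "{col1}_{col3}"},
                }
            ],
        },
    ]
}

problems = check_vertex_references(demo_config)
print(problems)
```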
Here is a sample configuration for employee records, with one rule that creates employee vertices and one rule that creates reportsTo edges:
{
  "rules": [
    {
      "rule_id": "1",
      "rule_name": "vertex_mapping_rule_from_nodes",
      "table_name": "nodes",
      "vertex_definitions": [
        {
          "vertex_id_template": "{emp_id}",
          "vertex_label": "employee",
          "vertex_definition_id": "1",
          "vertex_properties": [
            {
              "property_name": "name",
              "property_value_template": "{emp_name}",
              "property_value_type": "String"
            }
          ]
        }
      ]
    },
    {
      "rule_id": "2",
      "rule_name": "edge_mapping_rule_from_emp",
      "table_name": "nodes",
      "edge_definitions": [
        {
          "from_vertex": {
            "vertex_id_template": "{emp_id}",
            "vertex_definition_id": "1"
          },
          "to_vertex": {
            "vertex_id_template": "{mgr_id}",
            "vertex_definition_id": "1"
          },
          "edge_id_template": {
            "label": "reportsTo",
            "template": "{emp_id}_{mgr_id}"
          },
          "edge_properties": [
            {
              "property_name": "team",
              "property_value_template": "{team}",
              "property_value_type": "String"
            }
          ]
        }
      ]
    }
  ]
}
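As a rough illustration of how these two rules might play out, the Python sketch below applies them to a single hypothetical row and prints the vertex and edge that would be expected to result. It is not the loader's actual implementation. The helpers fill and apply_rules and the sample row values are assumptions, and the sketch assumes that each {column} placeholder is simply replaced by that column's value. Property types are ignored for brevity.

```python
import re

# The sample rules from above, written as a Python dict.
config = {"rules": [
    {"rule_id": "1", "rule_name": "vertex_mapping_rule_from_nodes", "table_name": "nodes",
     "vertex_definitions": [{
         "vertex_id_template": "{emp_id}", "vertex_label": "employee",
         "vertex_definition_id": "1",
         "vertex_properties": [{"property_name": "name",
                                "property_value_template": "{emp_name}",
                                "property_value_type": "String"}]}]},
    {"rule_id": "2", "rule_name": "edge_mapping_rule_from_emp", "table_name": "nodes",
     "edge_definitions": [{
         "from_vertex": {"vertex_id_template": "{emp_id}", "vertex_definition_id": "1"},
         "to_vertex": {"vertex_id_template": "{mgr_id}", "vertex_definition_id": "1"},
         "edge_id_template": {"label": "reportsTo", "template": "{emp_id}_{mgr_id}"},
         "edge_properties": [{"property_name": "team",
                              "property_value_template": "{team}",
                              "property_value_type": "String"}]}]},
]}


def fill(template, row):
    """Substitute each {column} placeholder with the row's value (assumed semantics)."""
    return re.sub(r"\{(\w+)\}", lambda m: str(row[m.group(1)]), template)


def properties(definitions, row):
    """Build a name -> value dict from a list of property definitions."""
    return {p["property_name"]: fill(p["property_value_template"], row)
            for p in definitions}


def apply_rules(config, row):
    """Return the (vertices, edges) that one row would produce under the rules."""
    vertices, edges = [], []
    for rule in config["rules"]:
        for v in rule.get("vertex_definitions", []):
            vertices.append({
                "id": fill(v["vertex_id_template"], row),
                "label": v.get("vertex_label"),  # None when no label is given
                "properties": properties(v.get("vertex_properties", []), row),
            })
        for e in rule.get("edge_definitions", []):
            edges.append({
                "id": fill(e["edge_id_template"]["template"], row),
                "label": e["edge_id_template"]["label"],
                "from": fill(e["from_vertex"]["vertex_id_template"], row),
                "to": fill(e["to_vertex"]["vertex_id_template"], row),
                "properties": properties(e.get("edge_properties", []), row),
            })
    return vertices, edges


# Hypothetical row from the "nodes" table.
row = {"emp_id": 7, "emp_name": "Alice", "mgr_id": 3, "team": "Graph"}
vertices, edges = apply_rules(config, row)
print(vertices)
print(edges)
```

Under these assumptions, the row yields one employee vertex with ID 7 and one reportsTo edge with ID 7_3 running from vertex 7 to vertex 3.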