在 Gremlin 查询中使用 Neptune 全文搜索 - Amazon Neptune

本文属于机器翻译版本。若本译文内容与英语原文存在差异,则一律以英文原文为准。

在 Gremlin 查询中使用 Neptune 全文搜索

对未转换为 Neptune 步骤的 Gremlin 遍历部分,NeptuneSearchStep 启用全文搜索查询。例如,请考虑以下查询:

g.withSideEffect("Neptune#fts.endpoint", "your-es-endpoint-URL") .V() .tail(100) .has("name", "Neptune#fts mark*") <== # Limit the search on name

此查询在 Neptune 中将转换为以下优化遍历。

Neptune steps: [ NeptuneGraphQueryStep(Vertex) { JoinGroupNode { PatternNode[(?1, <~label>, ?2, <~>) . project distinct ?1 .], {estimatedCardinality=INFINITY} }, annotations={path=[Vertex(?1):GraphStep], maxVarId=4} }, NeptuneTraverserConverterStep ] + not converted into Neptune steps: [NeptuneTailGlobalStep(100), NeptuneTinkerpopTraverserConverterStep, NeptuneSearchStep { JoinGroupNode { SearchNode[(idVar=?3, query=mark*, field=name) . project ask .], {endpoint=your-OpenSearch-endpoint-URL} } JoinGroupNode { SearchNode[(idVar=?3, query=mark*, field=name) . project ask .], {endpoint=your-OpenSearch-endpoint-URL} } }]

以下是 Gremlin 对航空航线数据的查询示例:

不区分大小写的 Gremlin 基本 match 查询

g.withSideEffect("Neptune#fts.endpoint", "your-OpenSearch-endpoint-URL") .withSideEffect('Neptune#fts.queryType', 'match') .V().has("city","Neptune#fts dallas") ==>v[186] ==>v[8]

Gremlin match 查询

g.withSideEffect("Neptune#fts.endpoint", "your-OpenSearch-endpoint-URL") .withSideEffect('Neptune#fts.queryType', 'match') .V().has("city","Neptune#fts southampton") .local(values('code','city').fold()) .limit(5) ==>[SOU, Southampton]

Gremlin fuzzy 查询

g.withSideEffect("Neptune#fts.endpoint", "your-OpenSearch-endpoint-URL") .V().has("city","Neptune#fts allas~").values('city').limit(5) ==>Dallas ==>Dallas ==>Walla Walla ==>Velas ==>Altai

Gremlin query_string 模糊查询

g.withSideEffect("Neptune#fts.endpoint", "your-OpenSearch-endpoint-URL") .withSideEffect('Neptune#fts.queryType', 'query_string') .V().has("city","Neptune#fts allas~").values('city').limit(5) ==>Dallas ==>Dallas

Gremlin query_string 正则表达式查询

g.withSideEffect("Neptune#fts.endpoint", "your-OpenSearch-endpoint-URL") .withSideEffect('Neptune#fts.queryType', 'query_string') .V().has("city","Neptune#fts /[dp]allas/").values('city').limit(5) ==>Dallas ==>Dallas

Gremlin 混合查询

此查询在同一查询中使用 Neptune 内部索引和 OpenSearch 索引。

g.withSideEffect("Neptune#fts.endpoint", "your-OpenSearch-endpoint-URL") .V().has("region","GB-ENG") .has('city','Neptune#fts L*') .values('city') .dedup() .limit(10) ==>London ==>Leeds ==>Liverpool ==>Land's End

简单 Gremlin 全文搜索示例

g.withSideEffect("Neptune#fts.endpoint", "your-OpenSearch-endpoint-URL") .V().has('desc','Neptune#fts regional municipal') .local(values('code','desc').fold()) .limit(100) ==>[HYA, Barnstable Municipal Boardman Polando Field] ==>[SPS, Sheppard Air Force Base-Wichita Falls Municipal Airport] ==>[ABR, Aberdeen Regional Airport] ==>[SLK, Adirondack Regional Airport] ==>[BFD, Bradford Regional Airport] ==>[EAR, Kearney Regional Airport] ==>[ROT, Rotorua Regional Airport] ==>[YHD, Dryden Regional Airport] ==>[TEX, Telluride Regional Airport] ==>[WOL, Illawarra Regional Airport] ==>[TUP, Tupelo Regional Airport] ==>[COU, Columbia Regional Airport] ==>[MHK, Manhattan Regional Airport] ==>[BJI, Bemidji Regional Airport] ==>[HAS, Hail Regional Airport] ==>[ALO, Waterloo Regional Airport] ==>[SHV, Shreveport Regional Airport] ==>[ABI, Abilene Regional Airport] ==>[GIZ, Jizan Regional Airport] ==>[USA, Concord Regional Airport] ==>[JMS, Jamestown Regional Airport] ==>[COS, City of Colorado Springs Municipal Airport] ==>[PKB, Mid Ohio Valley Regional Airport]

使用 query_string 以及“+”和“-”运算符的 Gremlin 查询

尽管 query_string 查询类型比默认 simple_query_string 类型要严格得多,但它确实可以获得更精确的查询结果。下面的第一个查询使用 query_string,而第二个查询使用默认的 simple_query_string

g.withSideEffect("Neptune#fts.endpoint", "your-OpenSearch-endpoint-URL") .withSideEffect('Neptune#fts.queryType', 'query_string') . V().has('desc','Neptune#fts +London -(Stansted|Gatwick)') .local(values('code','desc').fold()) .limit(10) ==>[LHR, London Heathrow] ==>[YXU, London Airport] ==>[LTN, London Luton Airport] ==>[SEN, London Southend Airport] ==>[LCY, London City Airport]

请注意以下示例中的 simple_query_string 如何悄悄地忽略“+”和“-”运算符:

g.withSideEffect("Neptune#fts.endpoint", "your-OpenSearch-endpoint-URL") .V().has('desc','Neptune#fts +London -(Stansted|Gatwick)') .local(values('code','desc').fold()) .limit(10) ==>[LHR, London Heathrow] ==>[YXU, London Airport] ==>[LGW, London Gatwick] ==>[STN, London Stansted Airport] ==>[LTN, London Luton Airport] ==>[SEN, London Southend Airport] ==>[LCY, London City Airport] ==>[SKG, Thessaloniki Macedonia International Airport] ==>[ADB, Adnan Menderes International Airport] ==>[BTV, Burlington International Airport]
g.withSideEffect("Neptune#fts.endpoint", "your-OpenSearch-endpoint-URL") .withSideEffect('Neptune#fts.queryType', 'query_string') .V().has('desc','Neptune#fts +(regional|municipal) -(international|bradford)') .local(values('code','desc').fold()) .limit(10) ==>[CZH, Corozal Municipal Airport] ==>[MMU, Morristown Municipal Airport] ==>[YBR, Brandon Municipal Airport] ==>[RDD, Redding Municipal Airport] ==>[VIS, Visalia Municipal Airport] ==>[AIA, Alliance Municipal Airport] ==>[CDR, Chadron Municipal Airport] ==>[CVN, Clovis Municipal Airport] ==>[SDY, Sidney Richland Municipal Airport] ==>[SGU, St George Municipal Airport]

使用 ANDOR 运算符的 Gremlin query_string 查询

g.withSideEffect("Neptune#fts.endpoint", "your-OpenSearch-endpoint-URL") .withSideEffect('Neptune#fts.queryType', 'query_string') .V().has('desc','Neptune#fts (St AND George) OR (St AND Augustin)') .local(values('code','desc').fold()) .limit(10) ==>[YIF, St Augustin Airport] ==>[STG, St George Airport] ==>[SGO, St George Airport] ==>[SGU, St George Municipal Airport]

Gremlin term 查询

g.withSideEffect("Neptune#fts.endpoint", "your-OpenSearch-endpoint-URL") .withSideEffect('Neptune#fts.queryType', 'term') .V().has("SKU","Neptune#fts ABC123DEF9") .local(values('code','city').fold()) .limit(5) ==>[AUS, Austin]

Gremlin prefix 查询

g.withSideEffect("Neptune#fts.endpoint", "your-OpenSearch-endpoint-URL") .withSideEffect('Neptune#fts.queryType', 'prefix') .V().has("icao","Neptune#fts ka") .local(values('code','icao','city').fold()) .limit(5) ==>[AZO, KAZO, Kalamazoo] ==>[APN, KAPN, Alpena] ==>[ACK, KACK, Nantucket] ==>[ALO, KALO, Waterloo] ==>[ABI, KABI, Abilene]

在 Neptune Gremlin 中使用 Lucene 语法

在 Neptune Gremlin 中,您还可以使用 Lucene 查询语法编写功能非常强大的查询。请注意,只有 OpenSearch 中的 query_string 查询支持 Lucene 语法。

假设以下数据:

g.addV("person") .property(T.id, "p1") .property("name", "simone") .property("surname", "rondelli") g.addV("person") .property(T.id, "p2") .property("name", "simone") .property("surname", "sengupta") g.addV("developer") .property(T.id, "p3") .property("name", "simone") .property("surname", "rondelli")

使用 Lucene 语法(当 queryTypequery_string 时调用),您可以按名字和姓氏搜索此数据,如下所示:

g.withSideEffect("Neptune#fts.endpoint", "es_endpoint") .withSideEffect("Neptune#fts.queryType", "query_string") .V() .has("*", "Neptune#fts predicates.name.value:simone AND predicates.surname.value:rondelli") ==> v[p1], v[p3]

请注意,在上面的 has() 步骤中,此字段替换为 "*"。实际上,放在此处的任何值都会被您在查询中访问的字段所覆盖。您可以使用 predicates.name.value, 访问名字字段,因为这是数据模型的结构方式。

您可以按名字、姓氏和标签进行搜索,如下所示:

g.withSideEffect("Neptune#fts.endpoint", getEsEndpoint()) .withSideEffect("Neptune#fts.queryType", "query_string") .V() .has("*", "Neptune#fts predicates.name.value:simone AND predicates.surname.value:rondelli AND entity_type:person") ==> v[p1]

该标签使用 entity_type 进行访问,也是因为这是数据模型的结构方式。

您还可以包括嵌套条件:

g.withSideEffect("Neptune#fts.endpoint", getEsEndpoint()) .withSideEffect("Neptune#fts.queryType", "query_string") .V() .has("*", "Neptune#fts (predicates.name.value:simone AND predicates.surname.value:rondelli AND entity_type:person) OR predicates.surname.value:sengupta") ==> v[p1], v[p2]

插入现代 TinkerPop 图形

g.addV('person').property(T.id, '1').property('name', 'marko').property('age', 29) .addV('personr').property(T.id, '2').property('name', 'vadas').property('age', 27) .addV('software').property(T.id, '3').property('name', 'lop').property('lang', 'java') .addV('person').property(T.id, '4').property('name', 'josh').property('age', 32) .addV('software').property(T.id, '5').property('name', 'ripple').property('lang', 'java') .addV('person').property(T.id, '6').property('name', 'peter').property('age', 35) g.V('1').as('a').V('2').as('b').addE('knows').from('a').to('b').property('weight', 0.5f).property(T.id, '7') .V('1').as('a').V('3').as('b').addE('created').from('a').to('b').property('weight', 0.4f).property(T.id, '9') .V('4').as('a').V('3').as('b').addE('created').from('a').to('b').property('weight', 0.4f).property(T.id, '11') .V('4').as('a').V('5').as('b').addE('created').from('a').to('b').property('weight', 1.0f).property(T.id, '10') .V('6').as('a').V('3').as('b').addE('created').from('a').to('b').property('weight', 0.2f).property(T.id, '12') .V('1').as('a').V('4').as('b').addE('knows').from('a').to('b').property('weight', 1.0f).property(T.id, '8')

按字符串字段值排序示例

g.withSideEffect("Neptune#fts.endpoint", "your-OpenSearch-endpoint-URL") .withSideEffect('Neptune#fts.queryType', 'query_string') .withSideEffect('Neptune#fts.sortOrder', 'asc') .withSideEffect('Neptune#fts.sortBy', 'name') .V().has('name', 'Neptune#fts marko OR vadas OR ripple')

按非字符串字段值排序示例

g.withSideEffect("Neptune#fts.endpoint", "your-OpenSearch-endpoint-URL") .withSideEffect('Neptune#fts.queryType', 'query_string') .withSideEffect('Neptune#fts.sortOrder', 'asc') .withSideEffect('Neptune#fts.sortBy', 'age.value') .V().has('name', 'Neptune#fts marko OR vadas OR ripple')

按 ID 字段值排序示例

g.withSideEffect("Neptune#fts.endpoint", "your-OpenSearch-endpoint-URL") .withSideEffect('Neptune#fts.queryType', 'query_string') .withSideEffect('Neptune#fts.sortOrder', 'asc') .withSideEffect('Neptune#fts.sortBy', 'Neptune#fts.entity_id') .V().has('name', 'Neptune#fts marko OR vadas OR ripple')

按标签字段值排序示例

g.withSideEffect("Neptune#fts.endpoint", "your-OpenSearch-endpoint-URL") .withSideEffect('Neptune#fts.queryType', 'query_string') .withSideEffect('Neptune#fts.sortOrder', 'asc') .withSideEffect('Neptune#fts.sortBy', 'Neptune#fts.entity_type') .V().has('name', 'Neptune#fts marko OR vadas OR ripple')

document_type 字段值排序示例

g.withSideEffect("Neptune#fts.endpoint", "your-OpenSearch-endpoint-URL") .withSideEffect('Neptune#fts.queryType', 'query_string') .withSideEffect('Neptune#fts.sortOrder', 'asc') .withSideEffect('Neptune#fts.sortBy', 'Neptune#fts.document_type') .V().has('name', 'Neptune#fts marko OR vadas OR ripple')