Elasticsearch DSL Queries

太过爱你忘了你带给我的痛 2021-12-03 11:27

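DSL stands for Domain Specific Language.

Elasticsearch provides a rich and flexible query language, the Query DSL. You can use it to build more complex and more powerful queries.

Let's start with a simple example: finding users whose first_name field contains John.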

    GET /alibaba/user/_search
    {
      "query": {
        "match": {
          "first_name": "John"
        }
      }
    }

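It returns the same results as the query-string search in the previous section. The difference is that the query is no longer passed as a query-string request parameter but as a JSON request body.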
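
When the request comes from application code rather than a console, the body is typically built as a native data structure and serialized to JSON. The following is a minimal sketch in Python; the helper name `match_query` is hypothetical, and no Elasticsearch client library is assumed.

```python
import json


def match_query(field, text):
    """Build a request body for a single-field `match` query."""
    return {"query": {"match": {field: text}}}


# Same body as the console example: first_name contains "John".
body = match_query("first_name", "John")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a search request against `/alibaba/user/_search` with any HTTP client.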

Compound queries with bool

Logical AND

Next, a slightly more complex query: find users whose last name contains Smith and who are older than 30.

    GET /alibaba/user/_search
    {
      "query": {
        "bool": {
          "filter": {
            "range": {
              "info.age": { "gt": 30 }
            }
          },
          "must": [{
            "match": {
              "last_name": "smith"
            }
          }]
        }
      }
    }

Result:

    {
      "took": 4,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
      },
      "hits": {
        "total": 1,
        "max_score": 0.2876821,
        "hits": [
          {
            "_index": "alibaba",
            "_type": "user",
            "_id": "3",
            "_score": 0.2876821,
            "_source": {
              "email": "john3@smith.com",
              "first_name": "John3",
              "last_name": "Smith",
              "full_name": "John3 Smith",
              "info": {
                "age": 35,
                "interests": ["sports", "musics"],
                "address": "guangdong shenzhen longhua"
              },
              "created_at": "2019-06-08 09:40:10"
            }
          }
        ]
      }
    }
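
A bool body like the one in the query above can also be assembled programmatically. The sketch below is one possible way to do it in Python; the helper `bool_query` and its parameter names are hypothetical. It passes `filter` as a list, which Elasticsearch accepts in the same way as the single object used in the console example.

```python
import json


def bool_query(must=None, filter_=None, should=None, must_not=None):
    """Assemble a `bool` query body, omitting clause types that are empty."""
    clauses = {
        "must": must,
        "filter": filter_,
        "should": should,
        "must_not": must_not,
    }
    return {"query": {"bool": {k: v for k, v in clauses.items() if v}}}


# last_name contains "smith" AND info.age > 30
body = bool_query(
    must=[{"match": {"last_name": "smith"}}],
    filter_=[{"range": {"info.age": {"gt": 30}}}],
)
print(json.dumps(body, indent=2))
```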

Find users whose last_name contains Smith and whose full_name contains John.

    GET /alibaba/user/_search
    {
      "query": {
        "bool": {
          "must": [
            { "match": { "last_name": "Smith" } },
            { "match": { "full_name": "John" } }
          ]
        }
      }
    }

Result:

    {
      "took": 4,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
      },
      "hits": {
        "total": 1,
        "max_score": 0.5457982,
        "hits": [
          {
            "_index": "alibaba",
            "_type": "user",
            "_id": "1",
            "_score": 0.5457982,
            "_source": {
              "email": "john@smith.com",
              "first_name": "John",
              "last_name": "Smith",
              "full_name": "John Smith",
              "info": {
                "age": 25,
                "interests": ["games", "musics"],
                "address": "guangdong shenzhen nanshan"
              },
              "created_at": "2019-06-06 08:30:10"
            }
          }
        ]
      }
    }

Logical OR

Find users whose info.age is greater than 30 or whose info.address contains baoan.

    GET /alibaba/user/_search
    {
      "query": {
        "bool": {
          "should": [
            {
              "range": {
                "info.age": {
                  "gt": 30
                }
              }
            },
            {
              "match": {
                "info.address": "baoan"
              }
            }
          ]
        }
      }
    }

Result:

    {
      "took": 8,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
      },
      "hits": {
        "total": 2,
        "max_score": 1,
        "hits": [
          {
            "_index": "alibaba",
            "_type": "user",
            "_id": "3",
            "_score": 1,
            "_source": {
              "email": "john3@smith.com",
              "first_name": "John3",
              "last_name": "Smith",
              "full_name": "John3 Smith",
              "info": {
                "age": 35,
                "interests": ["sports", "musics"],
                "address": "guangdong shenzhen longhua"
              },
              "created_at": "2019-06-08 09:40:10"
            }
          },
          {
            "_index": "alibaba",
            "_type": "user",
            "_id": "2",
            "_score": 0.25316024,
            "_source": {
              "email": "john2@smith.com",
              "first_name": "John2",
              "last_name": "Smith",
              "full_name": "John2 Smith",
              "info": {
                "age": 28,
                "interests": ["games", "books"],
                "address": "guangdong shenzhen baoan"
              },
              "created_at": "2019-06-07 08:30:10"
            }
          }
        ]
      }
    }

Find users whose last_name contains Smith or whose full_name contains John.

    GET /alibaba/user/_search
    {
      "query": {
        "bool": {
          "should": [
            { "match": { "last_name": "Smith" } },
            { "match": { "full_name": "John" } }
          ]
        }
      }
    }

Result:

    {
      "took": 7,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
      },
      "hits": {
        "total": 3,
        "max_score": 0.5457982,
        "hits": [
          {
            "_index": "alibaba",
            "_type": "user",
            "_id": "1",
            "_score": 0.5457982,
            "_source": {
              "email": "john@smith.com",
              "first_name": "John",
              "last_name": "Smith",
              "full_name": "John Smith",
              "info": {
                "age": 25,
                "interests": ["games", "musics"],
                "address": "guangdong shenzhen nanshan"
              },
              "created_at": "2019-06-06 08:30:10"
            }
          },
          {
            "_index": "alibaba",
            "_type": "user",
            "_id": "2",
            "_score": 0.2876821,
            "_source": {
              "email": "john2@smith.com",
              "first_name": "John2",
              "last_name": "Smith",
              "full_name": "John2 Smith",
              "info": {
                "age": 28,
                "interests": ["games", "books"],
                "address": "guangdong shenzhen baoan"
              },
              "created_at": "2019-06-07 08:30:10"
            }
          },
          {
            "_index": "alibaba",
            "_type": "user",
            "_id": "3",
            "_score": 0.2876821,
            "_source": {
              "email": "john3@smith.com",
              "first_name": "John3",
              "last_name": "Smith",
              "full_name": "John3 Smith",
              "info": {
                "age": 35,
                "interests": ["sports", "musics"],
                "address": "guangdong shenzhen longhua"
              },
              "created_at": "2019-06-08 09:40:10"
            }
          }
        ]
      }
    }

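Summary

  • bool: a compound query that combines multiple query clauses.
  • must: positive matching; every clause must match, equivalent to AND.
  • should: positive matching; at least one clause must match, equivalent to OR. When the same bool also has must or filter clauses, should clauses become optional by default and only raise the score.
  • must_not: negative matching; none of the clauses may match, equivalent to NOT.
  • filter: clauses must match, but they are not scored, which makes them a fast way to narrow the result set.
  • Inner fields of a document field are accessed with a dot, as in info.age.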
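
The clause semantics listed above can be illustrated with a toy in-memory model. This is a sketch of the matching rules only, not of how Elasticsearch executes queries: scoring is ignored, each clause is a plain Python predicate, and the names `bool_matches`, `older_than_30` and `in_baoan` are hypothetical. The sample documents reuse the ages and addresses from the results in this article.

```python
def bool_matches(doc, must=(), filter_=(), should=(), must_not=()):
    """Toy model of `bool` matching; every clause is a predicate doc -> bool."""
    if not all(clause(doc) for clause in must):
        return False
    if not all(clause(doc) for clause in filter_):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    # Without must/filter clauses, at least one should clause has to match.
    if should and not must and not filter_:
        return any(clause(doc) for clause in should)
    return True


docs = [
    {"_id": "1", "info": {"age": 25, "address": "guangdong shenzhen nanshan"}},
    {"_id": "2", "info": {"age": 28, "address": "guangdong shenzhen baoan"}},
    {"_id": "3", "info": {"age": 35, "address": "guangdong shenzhen longhua"}},
]


def older_than_30(doc):
    return doc["info"]["age"] > 30


def in_baoan(doc):
    return "baoan" in doc["info"]["address"].split()


# OR: older than 30, or address contains "baoan"
either = [d["_id"] for d in docs if bool_matches(d, should=[older_than_30, in_baoan])]
# NOT: address does not contain "baoan"
not_baoan = [d["_id"] for d in docs if bool_matches(d, must_not=[in_baoan])]
print(either, not_baoan)
```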

Full-text search

Here we search for users whose info.address contains "shenzhen baoan".

    GET /alibaba/user/_search
    {
      "query": {
        "match": {
          "info.address": "shenzhen baoan"
        }
      }
    }

Result:

    {
      "took": 5,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
      },
      "hits": {
        "total": 3,
        "max_score": 0.5063205,
        "hits": [
          {
            "_index": "alibaba",
            "_type": "user",
            "_id": "2",
            "_score": 0.5063205,
            "_source": {
              "email": "john2@smith.com",
              "first_name": "John2",
              "last_name": "Smith",
              "full_name": "John2 Smith",
              "info": {
                "age": 28,
                "interests": ["games", "books"],
                "address": "guangdong shenzhen baoan"
              },
              "created_at": "2019-06-07 08:30:10"
            }
          },
          {
            "_index": "alibaba",
            "_type": "user",
            "_id": "1",
            "_score": 0.25316024,
            "_source": {
              "email": "john@smith.com",
              "first_name": "John",
              "last_name": "Smith",
              "full_name": "John Smith",
              "info": {
                "age": 25,
                "interests": ["games", "musics"],
                "address": "guangdong shenzhen nanshan"
              },
              "created_at": "2019-06-06 08:30:10"
            }
          },
          {
            "_index": "alibaba",
            "_type": "user",
            "_id": "3",
            "_score": 0.25316024,
            "_source": {
              "email": "john3@smith.com",
              "first_name": "John3",
              "last_name": "Smith",
              "full_name": "John3 Smith",
              "info": {
                "age": 35,
                "interests": ["sports", "musics"],
                "address": "guangdong shenzhen longhua"
              },
              "created_at": "2019-06-08 09:40:10"
            }
          }
        ]
      }
    }

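As the results show, Elasticsearch automatically split the search string "shenzhen baoan" into two terms, "shenzhen" and "baoan", and then ran a full-text match for them against info.address. A document matches if it contains either term, which is why all 3 users are returned.

By default, the 3 hits are sorted by relevance score in descending order.

The relevance score (_score) rates how relevant a result is, that is, how well a document matches the query. The more relevant a document is, the higher its score and the higher it ranks.

Traditional relational databases have no direct counterpart to this concept: a row either satisfies a condition or it does not.

The query above is equivalent to the following: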

    GET /alibaba/user/_search
    {
      "query": {
        "bool": {
          "should": [
            { "match": { "info.address": "shenzhen" } },
            { "match": { "info.address": "baoan" } }
          ]
        }
      }
    }
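
The rewrite from a multi-word match into a bool with one should clause per term can be sketched in Python as follows. The helper `expand_match` is hypothetical, and lowercasing plus whitespace splitting is only a rough stand-in for the analysis Elasticsearch applies to the query string.

```python
def expand_match(field, text):
    """Rewrite a multi-word `match` as a `bool`/`should` of single-term matches."""
    # Assumption: lowercase + whitespace split approximates the field's analyzer.
    terms = text.lower().split()
    return {
        "query": {
            "bool": {
                "should": [{"match": {field: term}} for term in terms]
            }
        }
    }


expanded = expand_match("info.address", "shenzhen baoan")
print(expanded)
```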

Phrase search

Sometimes we want to match several words exactly as a phrase, with the words adjacent and in the given order. match_phrase does this.

    GET /alibaba/user/_search
    {
      "query": {
        "match_phrase": {
          "info.address": "shenzhen baoan"
        }
      }
    }

Result:

    {
      "took": 21,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
      },
      "hits": {
        "total": 1,
        "max_score": 0.5063205,
        "hits": [
          {
            "_index": "alibaba",
            "_type": "user",
            "_id": "2",
            "_score": 0.5063205,
            "_source": {
              "email": "john2@smith.com",
              "first_name": "John2",
              "last_name": "Smith",
              "full_name": "John2 Smith",
              "info": {
                "age": 28,
                "interests": ["games", "books"],
                "address": "guangdong shenzhen baoan"
              },
              "created_at": "2019-06-07 08:30:10"
            }
          }
        ]
      }
    }
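
Why only one document matches the phrase can be seen from a toy phrase matcher: the query terms must appear next to each other and in the same order. The sketch below is an in-memory illustration, not Elasticsearch's implementation; `phrase_matches` is a hypothetical name and whitespace splitting stands in for real analysis.

```python
def phrase_matches(text, phrase):
    """Return True if the words of `phrase` occur consecutively, in order, in `text`."""
    tokens = text.lower().split()
    wanted = phrase.lower().split()
    n = len(wanted)
    return n > 0 and any(
        tokens[i:i + n] == wanted for i in range(len(tokens) - n + 1)
    )


addresses = {
    "1": "guangdong shenzhen nanshan",
    "2": "guangdong shenzhen baoan",
    "3": "guangdong shenzhen longhua",
}
hits = [doc_id for doc_id, addr in addresses.items()
        if phrase_matches(addr, "shenzhen baoan")]
print(hits)
```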

Highlighting

Sometimes we need to highlight the matched keywords in the results. The highlight option does this.

    GET /alibaba/user/_search
    {
      "query": {
        "match_phrase": {
          "info.address": "shenzhen baoan"
        }
      },
      "highlight": {
        "fields": {
          "info.address": {}
        }
      }
    }

Result:

    {
      "took": 77,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
      },
      "hits": {
        "total": 1,
        "max_score": 0.5063205,
        "hits": [
          {
            "_index": "alibaba",
            "_type": "user",
            "_id": "2",
            "_score": 0.5063205,
            "_source": {
              "email": "john2@smith.com",
              "first_name": "John2",
              "last_name": "Smith",
              "full_name": "John2 Smith",
              "info": {
                "age": 28,
                "interests": ["games", "books"],
                "address": "guangdong shenzhen baoan"
              },
              "created_at": "2019-06-07 08:30:10"
            },
            "highlight": {
              "info.address": [
                "guangdong <em>shenzhen</em> <em>baoan</em>"
              ]
            }
          }
        ]
      }
    }
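
The highlight fragment in the response above wraps each matched term in `<em>` tags. A rough Python sketch of that effect follows; the function `highlight` here is a hypothetical client-side helper that works on whitespace-separated tokens, not the highlighter Elasticsearch runs.

```python
def highlight(text, query, pre="<em>", post="</em>"):
    """Wrap every token of `text` that also appears in `query` with pre/post tags."""
    wanted = set(query.lower().split())
    return " ".join(
        f"{pre}{token}{post}" if token.lower() in wanted else token
        for token in text.split()
    )


fragment = highlight("guangdong shenzhen baoan", "shenzhen baoan")
print(fragment)
```

On the server side, the tags can be changed with the `pre_tags` and `post_tags` options of the highlight section.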

Multi-field search

Sometimes we want to apply the same query string to several fields at once. multi_match does this.

For example, find documents whose first_name or full_name contains "smith".

    GET /alibaba/user/_search
    {
      "query": {
        "multi_match": {
          "query": "smith",
          "fields": ["first_name", "full_name"]
        }
      }
    }
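
The idea of running one query string against several fields can be modeled in memory as "a document matches if any listed field contains any query term". The sketch below ignores scoring entirely; `multi_match` is a hypothetical Python function, and the sample documents reuse the names from the results earlier in this article.

```python
def multi_match(doc, query, fields):
    """Return True if any of `fields` in `doc` shares a token with `query`."""
    wanted = set(query.lower().split())
    return any(
        wanted & set(str(doc.get(field, "")).lower().split())
        for field in fields
    )


users = [
    {"_id": "1", "first_name": "John", "full_name": "John Smith"},
    {"_id": "2", "first_name": "John2", "full_name": "John2 Smith"},
    {"_id": "3", "first_name": "John3", "full_name": "John3 Smith"},
]
matched = [u["_id"] for u in users
           if multi_match(u, "smith", ["first_name", "full_name"])]
print(matched)
```

In this sample data every user matches through full_name; restricting the fields to first_name alone would match none of them.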
