Elastic Cheat Sheet

Elasticsearch is a search engine based on the Lucene library. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents.

Elastic General

cardinality: count the distinct values of a field (the result is approximate)

GET twitter/_search
{
  "size": 0,
  "aggs": {
    "number_of_cities": {
      "cardinality": {
        "field": "city"
      }
    }
  }
}

filters: count documents in several named buckets, one bucket per filter

GET twitter/_search
{
  "size": 0,
  "aggs": {
    "by_cities": {
      "filters": {
        "filters": {
          "beijing": {
            "match": {
              "city": "北京"
            }
          },
          "shanghai": {
            "match": {
              "city": "上海"
            }
          }
        }
      }
    }
  }
}

terms: group documents by the values of a field

GET twitter/_search
{
  "size": 0,
  "aggs": {
    "city": {
      "terms": {
        "field": "city",
        "size": 10
      }
    }
  }
}

bool: combine several query clauses

POST _search
{
  "query": {
    "bool" : {
      "must" : {
        "term" : { "user" : "kimchy" }
      },
      "filter": {
        "term" : { "tag" : "tech" }
      },
      "must_not" : {
        "range" : {
          "age" : { "gte" : 10, "lte" : 20 }
        }
      },
      "should" : [
        { "term" : { "tag" : "wow" } },
        { "term" : { "tag" : "elasticsearch" } }
      ],
      "minimum_should_match" : 1,
      "boost" : 1.0
    }
  }
}

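bool clause types

must
The clause must match the document. A match contributes to the relevance score.
should
The clause is not required to match, but if it does, the relevance score increases.
must_not
The clause must not match the document. It does not contribute to the score (it runs in filter context).
filter
The clause must match the document, like must, but it does not contribute to the score (it runs in filter context).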
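
The four clause types can also be assembled in client code before the request is sent. The following is a minimal Python sketch, assuming a hypothetical helper named build_bool_query (it is not part of any Elasticsearch client); it only builds the request body as a dict and leaves out clause types that are empty.

```python
from typing import Any, Dict, List, Optional

Clause = Dict[str, Any]


def build_bool_query(
    must: Optional[List[Clause]] = None,
    filters: Optional[List[Clause]] = None,
    should: Optional[List[Clause]] = None,
    must_not: Optional[List[Clause]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble a bool query body, skipping empty clause types."""
    clauses = {
        "must": must,          # must match, contributes to the score
        "filter": filters,     # must match, no score contribution
        "should": should,      # optional, raises the score when it matches
        "must_not": must_not,  # must not match, no score contribution
    }
    bool_body = {name: items for name, items in clauses.items() if items}
    return {"query": {"bool": bool_body}}


# Scored part: the user term. Unscored parts: the tag filter and the age exclusion.
body = build_bool_query(
    must=[{"term": {"user": "kimchy"}}],
    filters=[{"term": {"tag": "tech"}}],
    must_not=[{"range": {"age": {"gte": 10, "lte": 20}}}],
)
```

Conditions that only narrow the result set, such as the tag term here, are usually better placed in filter than in must, since they then stay out of the score calculation.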

_msearch: run several searches, each against its own index, in one request

GET /_msearch
{"index":"es_db"}
{"query":{"match_all":{}}, "from":0, "size": 2}
{"index":"article"}
{"query":{"match":{"title":"fox"}}}
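
Outside the Kibana console, the _msearch body has to be sent as newline-delimited JSON: one header line and one body line per search, each on a single line, with a newline after the last line. The following is a minimal Python sketch of that serialization, assuming a hypothetical helper named build_msearch_body; it uses only the standard library and does not send the request.

```python
import json
from typing import Any, Dict, List, Tuple

Search = Tuple[Dict[str, Any], Dict[str, Any]]  # (header, body)


def build_msearch_body(searches: List[Search]) -> str:
    """Hypothetical helper: serialize (header, body) pairs as newline-delimited JSON."""
    lines = []
    for header, body in searches:
        # json.dumps without indent never emits a line break, so each object stays on one line.
        lines.append(json.dumps(header, ensure_ascii=False))
        lines.append(json.dumps(body, ensure_ascii=False))
    # The payload must end with a newline character.
    return "\n".join(lines) + "\n"


payload = build_msearch_body([
    ({"index": "es_db"}, {"query": {"match_all": {}}, "from": 0, "size": 2}),
    ({"index": "article"}, {"query": {"match": {"title": "fox"}}}),
])
```

The resulting string would be posted to /_msearch with the Content-Type header set to application/x-ndjson.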

_mget: fetch several documents by ID in one request (the documents may live in different indices)

GET /_mget
{
  "docs":[
    {"_index":"es_db", "_id":3},
    {"_index":"article", "_id":1}
  ]
}

match_phrase: match the whole phrase (the analyzed terms must appear adjacent and in the same order)

GET /es_db/_search
{
    "query": {
        "match_phrase":{
            "address": "广州白云山"
        }
    }
}

range query

GET twitter/_search
{
  "query": {
    "range": {
      "age": {
        "gte": 30,
        "lte": 40
      }
    }
  }
}

simple_query_string: combine keywords with a simplified operator syntax (+ means AND)

GET /es_db/_search
{
    "query":{
        "simple_query_string":{
            "query": "张三 + 李四"
        }
    }
}

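match: analyze the query text into terms, then sort the hits by score

GET /es_db/_search
{
  "query":{
    "match": {
      "address": {
        "query": "重庆市南岸区",
        "operator": "and"
      }
    }
  }
}
- The query text is first analyzed into terms, and the search then runs on those terms.
- Hits are sorted by relevance score, from highest to lowest.
- The default operator between the analyzed terms is or; this request sets it to and, so every term must match.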
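
The difference between the or and and operators of a match query can be illustrated with a toy model. The following Python sketch is not how Elasticsearch evaluates queries; the function match_terms and the term lists are hypothetical, and the actual terms depend on the analyzer configured for the field.

```python
from typing import Iterable


def match_terms(query_terms: Iterable[str], doc_terms: Iterable[str], operator: str = "or") -> bool:
    """Toy model: decide whether a document matches, given already-analyzed terms."""
    doc = set(doc_terms)
    hits = [term in doc for term in query_terms]
    if operator == "and":
        return all(hits)  # every query term must be present
    if operator == "or":
        return any(hits)  # one matching term is enough
    raise ValueError(f"unsupported operator: {operator}")


# Hypothetical analysis results; a real analyzer may split the text differently.
query_terms = ["重庆市", "南岸区"]
doc_terms = ["重庆市", "渝中区"]

matches_with_or = match_terms(query_terms, doc_terms, "or")    # True: one term matches
matches_with_and = match_terms(query_terms, doc_terms, "and")  # False: 南岸区 is missing
```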

multi_match: run a match query across several fields

GET /es_db/_search
{
    "query":{
        "multi_match":{
            "query":"重庆",
            "fields":["address", "name"]
        }
    }
}

ids: match several document IDs within one index

GET /twitter/_search
{
    "query":{
        "ids":{
            "values":[1, 2]
        }
    }
}

fuzzy query

GET /es_db/_search
{
    "query":{
        "fuzzy":{
            "address":{
                "value": "白云山",
                "fuzziness": 1
            }
        }
    }
}
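- Allows inexact matches, such as typos, extra characters, or missing characters.
- fuzziness: the maximum edit distance allowed, from 0 to 2 (AUTO is also accepted).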
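
The fuzziness value is an edit-distance budget: the number of single-character insertions, deletions, or substitutions needed to turn one term into another. The following Python sketch computes the plain Levenshtein distance to make that concrete; the function names are hypothetical. One known difference: by default Elasticsearch also counts swapping two adjacent characters as a single edit (the transpositions option), whereas plain Levenshtein counts it as two.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance: insertions, deletions, and substitutions cost 1 each."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep when equal
            ))
        previous = current
    return previous[-1]


def within_fuzziness(term: str, candidate: str, fuzziness: int = 1) -> bool:
    """Hypothetical check: would candidate fall inside the given edit-distance budget?"""
    return edit_distance(term, candidate) <= fuzziness


one_char_missing = edit_distance("白云山", "白云")   # 1: one deletion
swapped_chars = edit_distance("ab", "ba")            # 2 under plain Levenshtein
```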

wildcard query

GET twitter/_search
{
  "query": {
    "wildcard": {
      "city.keyword": {
        "value": "*海"
      }
    }
  }
}

query_string: combine keywords with query syntax operators such as AND and OR

GET /es_db/_search
{
    "query":{
        "query_string":{
            "query": "张三 OR 李四"
        }
    }
}

term: exact match without analyzing the search text; best suited to keyword fields

GET /es_db/_search
{
  "query":{
    "term": {
      "address": {
        "value": "重庆"
      }
    }
  }
}