Getting Started with Elasticsearch - Installation, operation and simple use of Elasticsearch and Kibana on Mac

1. Installation of Elasticsearch and Kibana on Mac

Elasticsearch is a search server built on Apache Lucene. It handles all types of data, including text, numbers, geospatial, structured and unstructured data, and it is the core of the ELK stack (E for Elasticsearch, L for Logstash, K for Kibana).

It provides a distributed, scalable, near-real-time search and analytics engine. It is known for its simple REST-style API, its speed and its scalability, which make it a powerful engine for full-text retrieval.

Elasticsearch is created and maintained by Elastic, which also owns the Logstash and Kibana open source projects.

Together, the three open source projects form a strong ecosystem. Simply put, Logstash collects and processes data (enrichment, transformation, etc.), Kibana displays, analyzes and manages it, and Elasticsearch sits at the core, where it searches and analyzes the data quickly.

1.1 Environment and download

Before installing, check the Java version on your machine, because each Elasticsearch release supports only specific Java versions. Kibana should be the same version as Elasticsearch.

My local Mac is using:

java version "1.8.0_121"

elasticsearch-6.8.23 download address: https://www.elastic.co/cn/downloads/elasticsearch

kibana-6.8.23 download address: https://www.elastic.co/cn/downloads/kibana

1.2 Installation and operation

After the download completes, extract both archives into a directory of your choice on the Mac

Then enter the bin directory of each one

The startup command of elasticsearch:

./elasticsearch

kibana's startup command:

./kibana

Once the startup logs settle, visit http://localhost:9200 (returns data in JSON format) and http://localhost:5601 (returns a web page). If both respond normally, the installation works.

Note: Kibana takes a while to start, and it is normal to see no log output immediately after running the command

1.3 Common problems

1.3.1 Other machines cannot access after elasticsearch is installed

If Elasticsearch runs fine on the Mac but a Windows machine on the same network segment cannot reach it, open config/elasticsearch.yml in the installation directory on the Mac and add or modify this line

network.bind_host: 0.0.0.0

Restart Elasticsearch, then verify from the other machine

http://xx.xx.xx.xx:9200

1.3.2 After kibana is installed, other machines cannot access it

Same as above: open config/kibana.yml in the installation directory and add or modify two lines

server.port: 5601
server.host: 0.0.0.0

Restart Kibana, then verify from the other machine

http://xx.xx.xx.xx:5601

2. Common Elasticsearch commands in Kibana

First, here is where to run the commands that follow

Open the Kibana home page and click [Dev Tools] in the left sidebar. The [Console] tab is split into two panes: enter a command in the left pane, click the green triangle button, and the result appears in the right pane.

2.1 View the health status of the cluster

GET _cat/health
================================ result ================================
1673923769 02:49:29 elasticsearch yellow 1 1 7 7 0 0 5 0 - 58.3%

To see what each value means, append ?v to print the column headers

GET _cat/health?v
================================ result ================================
epoch      timestamp cluster       status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1673923848 02:50:48  elasticsearch yellow          1         1      7   7    0    0        5             0                  -                 58.3%

Interpretation of common attributes:

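  • epoch: the current time as a Unix timestamp in seconds
  • timestamp: the current time in UTC (eight hours behind UTC+8)
  • cluster: cluster name
  • status: cluster status; green means all shards are allocated, yellow means all primary shards are allocated but some replicas are not (normal on a single node, which has nowhere to place replicas), red means at least one primary shard is unassigned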
  • node.total: the number of online nodes
  • node.data: the number of online data nodes
  • ...
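
If you want to read these columns from a script rather than by eye, the output can be parsed by position. The following is only a minimal sketch: parse_cat_table is a hypothetical helper (not part of Elasticsearch or Kibana), the sample text is copied from the result above, and the simple whitespace split assumes that no cell is empty.

```python
from datetime import datetime, timezone


def parse_cat_table(text):
    """Parse the output of a _cat API called with ?v into a list of dicts.

    Assumes every cell is non-empty, so splitting on whitespace is enough.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


# Sample copied from the GET _cat/health?v result shown above.
sample = """\
epoch      timestamp cluster       status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1673923848 02:50:48  elasticsearch yellow          1         1      7   7    0    0        5             0                  -                 58.3%
"""

health = parse_cat_table(sample)[0]

# The epoch column is in seconds; converting it to UTC reproduces the
# timestamp column, which shows that timestamp is reported in UTC.
utc_time = datetime.fromtimestamp(int(health["epoch"]), tz=timezone.utc).strftime("%H:%M:%S")

print(health["status"], health["unassign"], health["active_shards_percent"])
print(utc_time)  # 02:50:48
```

The same helper would not work unchanged for outputs with blank cells, where columns would have to be cut at the header offsets instead.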

To see the aliases, mappings and settings of every index, query _all

GET _all
================================ result ================================
#! Deprecation: [types removal] The parameter include_type_name should be explicitly specified in get indices requests to prepare for 7.0. In 7.0 include_type_name will default to 'false', which means responses will omit the type name in mapping definitions.
{
  ".kibana_1" : {
    "aliases" : {
      ".kibana" : { }
    },
    "mappings" : {
      "doc" : {
        "dynamic" : "strict",
        "properties" : {
        ......

2.2 Index

2.2.1 View all indexes

GET _cat/indices
================================ result ================================
yellow open human_index          Mf-9YNYrSdyiLZFgZCP7ow 5 1 4 0 22.6kb 22.6kb
green  open .kibana_task_manager J9YFrgfOS1W2N3dvqXwxOg 1 0 2 0 12.5kb 12.5kb
green  open .kibana_1            hgDx6B-6QmC0KjWLWB3wgQ 1 0 5 1 26.5kb 26.5kb

To see what each value means, append ?v to print the column headers

GET _cat/indices?v
================================ result ================================
health status index                uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   human_index          Mf-9YNYrSdyiLZFgZCP7ow   5   1          4            0     22.6kb         22.6kb
green  open   .kibana_task_manager J9YFrgfOS1W2N3dvqXwxOg   1   0          2            0     12.5kb         12.5kb
green  open   .kibana_1            hgDx6B-6QmC0KjWLWB3wgQ   1   0          5            1     26.5kb         26.5kb

Interpretation of common attributes:

  • health: index health status
  • status: whether the index is open or closed
  • index: index name
  • uuid: the unique identifier of the index
  • pri: number of primary shards
  • rep: number of replicas of each primary shard
  • docs.count: number of documents in the index
  • docs.deleted: number of documents marked as deleted in the index

2.2.2 Add new index

PUT /human_index1
================================ result ================================
#! Deprecation: the default number of shards will change from [5] to [1] in 7.0.0; if you wish to continue using the default of [5] shards, you must manage this on the create index request or with an index template
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "human_index1"
}

2.2.3 View a single index

GET /human_index1
================================ result ================================
#! Deprecation: [types removal] The parameter include_type_name should be explicitly specified in get indices requests to prepare for 7.0. In 7.0 include_type_name will default to 'false', which means responses will omit the type name in mapping definitions.
{
  "human_index1" : {
    "aliases" : { },
    "mappings" : { },
    "settings" : {
      "index" : {
        "creation_date" : "1673926295232",
        "number_of_shards" : "5",
        "number_of_replicas" : "1",
        "uuid" : "i9ESnW6ETN2n5C6V5PLZ8Q",
        "version" : {
          "created" : "6082399"
        },
        "provided_name" : "human_index1"
      }
    }
  }
}

2.2.4 Delete a single index

DELETE /human_index1
================================ result ================================
{
  "acknowledged" : true
}

2.3 View node list

GET _cat/nodes
================================ result ================================
10.197.29.203 21 45 9 2.11   mdi * 2FgJQbJ

or

GET _cat/nodes?v
================================ result ================================
ip            heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.197.29.203           23          45   9    1.99                  mdi       *      2FgJQbJ

Interpretation of common attributes:

  • ip: the IP address of the node
  • heap.percent: percentage of heap memory in use
  • ram.percent: percentage of RAM in use
  • cpu: CPU usage percentage
  • load_1m: 1 minute system load
  • node.role: the role of the node
  • master: whether it is the master node
  • name: node name

2.4 Creating, querying, modifying and deleting documents

2.4.1 Create a document

put /human_index/user/1
{
  "name": "hh",
  "desc": "my name is hh",
  "age": 25,
  "country": "China GuangDong",
  "sex": "female"
}
================================ result ================================
{
  "_index" : "human_index",
  "_type" : "user",
  "_id" : "1",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "created": true,
  "_seq_no" : 1,
  "_primary_term" : 2
}

The request above specifies the document id (1) explicitly. If you don't need a custom id, use POST without an id instead:

POST /human_index/user
{
  "name": "id_test",
  "desc": "test no id"
}
================================ result ================================
{
  "_index" : "human_index",
  "_type" : "user",
  "_id" : "MnS2woUBq_u6VYKKJjno",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 10,
  "_primary_term" : 3
}

The automatically generated id is MnS2woUBq_u6VYKKJjno

When a document is created, the index (human_index) and type (user) named in the request are created automatically if they do not exist yet.

2.4.2 Querying Documents

Query a single document

get /human_index/user/1
================================ result ================================
{
  "_index" : "human_index",
  "_type" : "user",
  "_id" : "1",
  "_version" : 1,
  "_seq_no" : 1,
  "_primary_term" : 2,
  "found" : true,
  "_source" : {
    "name" : "hh",
    "desc" : "my name is hh",
    "age" : 25,
    "country" : "China GuangDong",
    "sex" : "female"
  }
}

Query all documents

get /human_index/user/_search
================================ result ================================
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 4,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "human_index",
        "_type" : "user",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "name" : "sb",
          "desc" : "my name is sb",
          "age" : 25,
          "country" : "China GuangDong Jieyang",
          "sex" : "female"
        }
      },
      {
        "_index" : "human_index",
        "_type" : "user",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "doc" : {
            "name" : "lmc hh",
            "country" : "China GuangDong Jieyang",
            "sex" : "male",
            "desc" : "my name is leemon",
            "age" : 11
          }
        }
      },
      {
        "_index" : "human_index",
        "_type" : "user",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "name" : "hh",
          "desc" : "my name is hh",
          "age" : 25,
          "country" : "China GuangDong",
          "sex" : "female"
        }
      },
      {
        "_index" : "human_index",
        "_type" : "user",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "age" : 24,
          "country" : "China GuangDong Shenzhen",
          "sex" : "male",
          "name" : "ln",
          "desc" : "my name is lee nai"
        }
      }
    ]
  }
}

or

get /human_index/user/_search
{
  "query":{
    "match_all": {}
  }
}

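Several documents are returned because I had already indexed more records while working through these steps

Field explanation:

  • took: time spent (milliseconds)
  • _shards: how many shards were searched, and how many succeeded, were skipped or failed
  • hits: the matching documents
    • total: the total number of matching documents
    • max_score: the highest relevance score among the matching documents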

2.4.3 Modify the document

Documents can be modified with either PUT or POST, but the two behave differently

  • PUT replaces the whole document, so any field missing from the request body is lost
  • POST to the _update endpoint is a partial update that leaves the other fields unchanged; the fields to change must be wrapped in the key doc in the request body

PUT

With PUT, if the document already exists, it is replaced entirely by the new body

put /human_index/user/1
{
  "sex": "female"
}
================================ result ================================
{
  "_index" : "human_index",
  "_type" : "user",
  "_id" : "1",
  "_version" : 3,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 2,
  "_primary_term" : 2
}

Continue to view:

get /human_index/user/1
================================ result ================================
{
  "_index" : "human_index",
  "_type" : "user",
  "_id" : "1",
  "_version" : 3,
  "_seq_no" : 2,
  "_primary_term" : 2,
  "found" : true,
  "_source" : {
    "sex" : "female"
  }
}

Every field except sex is gone

POST

First restore the document

put /human_index/user/1
{
  "name": "hh",
  "desc": "my name is hh",
  "age": 25,
  "country": "China GuangDong",
  "sex": "female"
}

Then modify it via POST

post /human_index/user/1/_update
{
  "doc": {
    "sex": "male"
  }
}
================================ result ================================
{
  "_index" : "human_index",
  "_type" : "user",
  "_id" : "1",
  "_version" : 7,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 6,
  "_primary_term" : 2
}

Check again

get /human_index/user/1
================================ result ================================
{
  "_index" : "human_index",
  "_type" : "user",
  "_id" : "1",
  "_version" : 7,
  "_seq_no" : 6,
  "_primary_term" : 2,
  "found" : true,
  "_source" : {
    "name" : "hh",
    "desc" : "my name is hh",
    "age" : 25,
    "country" : "China GuangDong",
    "sex" : "male"
  }
}

This time all other fields are preserved and only sex has changed, which is a partial update
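
The difference between the two requests can be imitated with plain dictionaries. This is only a local sketch of the semantics, not how Elasticsearch stores documents: put_replace and post_update are hypothetical names, and the merge is shallow, whereas the real _update endpoint merges nested objects recursively.

```python
def put_replace(store, doc_id, body):
    """Imitate PUT /index/type/id: the previous _source is discarded."""
    store[doc_id] = dict(body)


def post_update(store, doc_id, body):
    """Imitate POST /index/type/id/_update: merge body["doc"] into _source."""
    store[doc_id] = {**store[doc_id], **body["doc"]}


original = {
    "name": "hh",
    "desc": "my name is hh",
    "age": 25,
    "country": "China GuangDong",
    "sex": "female",
}

store = {}
put_replace(store, "1", original)

# PUT with a single field: everything else is lost.
put_replace(store, "1", {"sex": "female"})
after_put = dict(store["1"])

# Restore, then apply a partial update: only sex changes.
put_replace(store, "1", original)
post_update(store, "1", {"doc": {"sex": "male"}})
after_post = dict(store["1"])

print(after_put)   # {'sex': 'female'}
print(after_post)  # all five fields, with sex set to 'male'
```

One consequence worth noticing: sending a body wrapped in doc through PUT would store the wrapper itself as a field, because PUT treats the whole body as the document.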

2.4.4 Delete document

DELETE /human_index/user/1
================================ result ================================
{
  "_index" : "human_index",
  "_type" : "user",
  "_id" : "1",
  "_version" : 8,
  "result" : "deleted",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 7,
  "_primary_term" : 2
}

Continue to view

get /human_index/user/1
================================ result ================================
{
  "_index" : "human_index",
  "_type" : "user",
  "_id" : "1",
  "found" : false
}

found is false, so the document was deleted successfully

2.5 Query

Before querying, all records under the index type user are as follows

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 6,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "human_index",
        "_type" : "user",
        "_id" : "5",
        "_score" : 1.0,
        "_source" : {
          "name" : "ln-1",
          "country" : "China GuangDong Jieyang",
          "sex" : "male",
          "desc" : "my name is leemon-1",
          "age" : 21
        }
      },
      {
        "_index" : "human_index",
        "_type" : "user",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "name" : "sb",
          "desc" : "my name is sb",
          "age" : 25,
          "country" : "China GuangDong Jieyang",
          "sex" : "female"
        }
      },
      {
        "_index" : "human_index",
        "_type" : "user",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "doc" : {
            "name" : "lmc hh",
            "country" : "China GuangDong Jieyang",
            "sex" : "male",
            "desc" : "my name is leemon",
            "age" : 11
          }
        }
      },
      {
        "_index" : "human_index",
        "_type" : "user",
        "_id" : "6",
        "_score" : 1.0,
        "_source" : {
          "name" : "ln sb",
          "country" : "China GuangDong Jieyang",
          "sex" : "male",
          "desc" : "my name is sb leemon",
          "age" : 27
        }
      },
      {
        "_index" : "human_index",
        "_type" : "user",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "name" : "hh",
          "desc" : "my name is hh",
          "age" : 25,
          "country" : "China GuangDong",
          "sex" : "female"
        }
      },
      {
        "_index" : "human_index",
        "_type" : "user",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "age" : 24,
          "country" : "China GuangDong Shenzhen",
          "sex" : "male",
          "name" : "ln",
          "desc" : "my name is lee nai"
        }
      }
    ]
  }
}

2.5.1 Single/full table query

See 2.4.2 for details

2.5.2 Tokenized (match) query

get /human_index/user/_search
{
  "query": {
    "match": {
      "name": "ln"
    }
  }
}

As a result, three records will be found (some results are omitted)

{
    "_index" : "human_index",
    "_type" : "user",
    "_id" : "6",
    "_score" : 0.6099695,
    "_source" : {
        "name" : "ln sb",
        "country" : "China GuangDong Jieyang",
        "sex" : "male",
        "desc" : "my name is sb leemon",
        "age" : 27
    }
},
{
    "_index" : "human_index",
    "_type" : "user",
    "_id" : "5",
    "_score" : 0.2876821,
    "_source" : {
        "name" : "ln-1",
        "country" : "China GuangDong Jieyang",
        "sex" : "male",
        "desc" : "my name is leemon-1",
        "age" : 21
    }
},
{
    "_index" : "human_index",
    "_type" : "user",
    "_id" : "3",
    "_score" : 0.2876821,
    "_source" : {
        "age" : 24,
        "country" : "China GuangDong Shenzhen",
        "sex" : "male",
        "name" : "ln",
        "desc" : "my name is lee nai"
    }
}

A match query analyzes (tokenizes) both the query text and the field value, so a document matches as soon as ln appears as one of the tokens of its name field. That is why "ln sb" and "ln-1" match as well as "ln".

2.5.3 Tokenized query on a sub-field

get /human_index/user/_search
{
  "query": {
    "match": {
      "doc.name": "hh"
    }
  }
}

One record is found

{
    "_index" : "human_index",
    "_type" : "user",
    "_id" : "4",
    "_score" : 0.2876821,
    "_source" : {
        "doc" : {
            "name" : "lmc hh",
            "country" : "China GuangDong Jieyang",
            "sex" : "male",
            "desc" : "my name is leemon",
            "age" : 11
        }
    }
}

2.5.4 Phrase query

The previous queries searched for a single word. A phrase is a sequence of several words that must appear together

get /human_index/user/_search
{
  "query": {
    "match_phrase": {
      "country": "GuangDong Jieyang"
    }
  }
}

Three records are found, with ids 2, 5 and 6

If match_phrase is changed to match, a document is returned as soon as either GuangDong or Jieyang appears in country. In other words, the query text is tokenized first, and the union of the documents matching each token is returned
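
The difference can be reproduced locally on the country values of the sample documents. This is a rough sketch, not Elasticsearch code: analyze is a simplified stand-in for the standard analyzer (lowercase, split on non-alphanumeric characters), and document 4 is left out because its country sits under doc.country rather than country.

```python
import re

# country values of the documents listed at the start of section 2.5
COUNTRIES = {
    "1": "China GuangDong",
    "2": "China GuangDong Jieyang",
    "3": "China GuangDong Shenzhen",
    "5": "China GuangDong Jieyang",
    "6": "China GuangDong Jieyang",
}


def analyze(text):
    """Simplified analyzer: lowercase and split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())


def match(field_text, query):
    """match with the default OR operator: any query token occurs in the field."""
    tokens = set(analyze(field_text))
    return any(term in tokens for term in analyze(query))


def match_phrase(field_text, query):
    """match_phrase: the query tokens occur adjacently and in the same order."""
    tokens, phrase = analyze(field_text), analyze(query)
    n = len(phrase)
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))


query = "GuangDong Jieyang"
phrase_ids = sorted(i for i, c in COUNTRIES.items() if match_phrase(c, query))
match_ids = sorted(i for i, c in COUNTRIES.items() if match(c, query))

print(phrase_ids)  # ['2', '5', '6']
print(match_ids)   # ['1', '2', '3', '5', '6']
```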

2.5.5 Fuzzy query

Fuzzy queries here are quite different from fuzzy queries in a relational database. A relational fuzzy query (LIKE) is closer to the tokenized and phrase queries above, whereas an Elasticsearch fuzzy query returns documents containing a term whose edit distance from the query term is at most 2 (the default AUTO setting allows fewer edits for short terms)

get /human_index/user/_search
{
  "query": {
    "fuzzy": {
      "country": "Jieyank"
    }
  }
}

or

get /human_index/user/_search
{
  "query": {
    "fuzzy": {
      "country": "Jieyamg"
    }
  }
}

and so on.

Jieyank and Jieyamg are each within an edit distance of 2 from Jieyang, so both fuzzy queries match. The ids of the returned records are 2, 5 and 6
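
To make "edit distance" concrete, here is a small sketch that computes the plain Levenshtein distance (insertions, deletions and substitutions). It is an illustration only: edit_distance is a hypothetical helper, the comparison is done on lowercased strings for simplicity, and Elasticsearch by default also counts a transposition of two adjacent characters as a single edit, which this version does not.

```python
def edit_distance(a, b):
    """Levenshtein distance between a and b using a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or keep when equal)
            ))
        prev = curr
    return prev[-1]


indexed_term = "jieyang"
for typed in ("Jieyank", "Jieyamg", "Jiexxxg"):
    distance = edit_distance(typed.lower(), indexed_term)
    verdict = "within 2" if distance <= 2 else "too far"
    print(typed, distance, verdict)
```

The first two inputs are one substitution away from jieyang, while the third needs three substitutions and would not be matched.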

2.5.6 Sorting

get /human_index/user/_search
{
  "query": {
    "match": {
      "country": "Jieyang"
    }
  },
  "sort":[
    {
      "_id":{
        "order": "desc"
      }
    }
  ]
}

The same documents as in 2.5.5 are returned, sorted by id in descending order

2.5.7 Pagination query

get /human_index/user/_search
{
  "query": {
    "match_all": {}
  },
  "sort":[
    {
      
      "age": {
        "order": "asc"
      }
    }
  ],
  "from": 0,
  "size": 3
}

This returns the three documents with the smallest age. The ids of the returned records are, in order: 5, 3, 2
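
from is the offset of the first hit and size is the page length, so page n starts at (n - 1) * size. The sketch below builds the request body for a given page; page_body is a hypothetical helper, and the sort on age simply mirrors the request above.

```python
import json


def page_body(page, size, sort_field="age", order="asc"):
    """Build a _search body for a 1-based page number."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    return {
        "query": {"match_all": {}},
        "sort": [{sort_field: {"order": order}}],
        "from": (page - 1) * size,  # offset of the first hit on this page
        "size": size,               # number of hits per page
    }


first = page_body(1, 3)   # same from/size as the request above
second = page_body(2, 3)  # skips the first three hits

print(json.dumps(second, indent=2))
```

Deep pages get expensive with this approach: by default Elasticsearch rejects requests where from + size exceeds 10000 (the index.max_result_window setting).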

2.5.8 Specified Field Query

get /human_index/user/_search
{
  "query": {
    "match": {
      "country": "Jieyang"
    }
  },
  "_source": ["name"]
}

The query results are as follows:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "human_index",
        "_type" : "user",
        "_id" : "5",
        "_score" : 0.2876821,
        "_source" : {
          "name" : "ln-1"
        }
      },
      {
        "_index" : "human_index",
        "_type" : "user",
        "_id" : "2",
        "_score" : 0.18232156,
        "_source" : {
          "name" : "sb"
        }
      },
      {
        "_index" : "human_index",
        "_type" : "user",
        "_id" : "6",
        "_score" : 0.18232156,
        "_source" : {
          "name" : "ln sb"
        }
      }
    ]
  }
}

2.5.9 Multi-condition query

To combine multiple query conditions, use a bool query

A bool query combines several query clauses with Boolean logic. It supports the following clauses:

  • must: every clause must match, equivalent to AND
  • must_not: no clause may match, equivalent to NOT
  • should: equivalent to OR when the bool query has no must or filter clause; otherwise the clauses are optional unless minimum_should_match is set
  • filter: every clause must match, like must, but without contributing to the score

Search for documents where Jieyang appears in country, sb appears in name, and age is in 24-26

get /human_index/user/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "country": "Jieyang"
          }
        },
        {
          "match": {
            "name": "sb"
          }
        }
      ],
      "filter": {
        "range": {
          "age": {
            "gte": 24,
            "lte": 26
          }
        }
      }
    }
  }
}

Only the document with id 2 is found

Regarding range queries:

  • gte: greater than or equal to
  • gt: greater than
  • lte: less than or equal to
  • lt: less than

Next, replace must with should, with the intention of finding documents where Jieyang appears in country or sb appears in name, and the age is between 24 and 26

get /human_index/user/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "country": "Jieyang"
          }
        },
        {
          "match": {
            "name": "sb"
          }
        }
      ],
      "filter": {
        "range": {
          "age": {
            "gte": 24,
            "lte": 26
          }
        }
      }
    }
  }
}

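Documents with ids 1, 2 and 3 are returned. Only id 2 actually satisfies one of the should clauses. Ids 1 and 3 are returned as well because, in Elasticsearch 6.x, should clauses become optional (they only influence the score) when the same bool query, used in query context, also contains a must or filter clause. To require at least one should clause to match, add "minimum_should_match": 1 to the bool query.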
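
The interplay of must, should and filter can be checked against the sample data with a small local model. This is a sketch under stated assumptions, not the Elasticsearch implementation: bool_query, has_term and age_between are hypothetical helpers, the tokenization is a plain whitespace split, document 4 is omitted because its fields sit under doc, and the default for minimum_should_match follows the 6.x query-context rule (0 when a must or filter clause is present, 1 otherwise).

```python
USERS = {
    "1": {"name": "hh", "country": "China GuangDong", "age": 25},
    "2": {"name": "sb", "country": "China GuangDong Jieyang", "age": 25},
    "3": {"name": "ln", "country": "China GuangDong Shenzhen", "age": 24},
    "5": {"name": "ln-1", "country": "China GuangDong Jieyang", "age": 21},
    "6": {"name": "ln sb", "country": "China GuangDong Jieyang", "age": 27},
}


def has_term(field, term):
    """Clause: the field contains the term as a whitespace-separated token."""
    return lambda doc: term.lower() in doc[field].lower().split()


def age_between(low, high):
    """Clause: low <= age <= high, like a range query with gte and lte."""
    return lambda doc: low <= doc["age"] <= high


def bool_query(doc, must=(), should=(), filters=(), minimum_should_match=None):
    if minimum_should_match is None:
        # Assumed 6.x default in query context.
        minimum_should_match = 0 if (must or filters or not should) else 1
    should_hits = sum(1 for clause in should if clause(doc))
    return (all(clause(doc) for clause in must)
            and all(clause(doc) for clause in filters)
            and should_hits >= minimum_should_match)


def search(**clauses):
    return sorted(i for i, doc in USERS.items() if bool_query(doc, **clauses))


conditions = [has_term("country", "Jieyang"), has_term("name", "sb")]
age_filter = [age_between(24, 26)]

must_ids = search(must=conditions, filters=age_filter)
should_ids = search(should=conditions, filters=age_filter)
strict_should_ids = search(should=conditions, filters=age_filter, minimum_should_match=1)

print(must_ids)           # ['2']
print(should_ids)         # ['1', '2', '3']
print(strict_should_ids)  # ['2']
```

With this data the strict should variant happens to return the same single document as the must variant, because id 2 is the only document in the age range that satisfies either condition.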

2.5.10 Highlighting

When the query returns results, highlight the content of the query condition

get /human_index/user/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "country": "Jieyang"
          }
        },
        {
          "match": {
            "name": "sb"
          }
        }
      ],
      "filter": {
        "range": {
          "age": {
            "gte": 24,
            "lte": 26
          }
        }
      }
    }
  },
  "highlight": {
    "fields": {
      "country": {}
    }
  }
}

Result

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.39343074,
    "hits" : [
      {
        "_index" : "human_index",
        "_type" : "user",
        "_id" : "2",
        "_score" : 0.39343074,
        "_source" : {
          "name" : "sb",
          "desc" : "my name is sb",
          "age" : 25,
          "country" : "China GuangDong Jieyang",
          "sex" : "female"
        },
        "highlight" : {
          "country" : [
            "China GuangDong <em>Jieyang</em>"
          ]
        }
      }
    ]
  }
}

2.6 Aggregation analysis

2.6.1 Simple grouping

Group by each word (term) in country and count the number of documents (users) in which it appears

get /human_index/user/_search
{
  "aggs": {
    "group_by_tag": {
      "terms": {
        "field": "country"
        
      }
    }
  }
}

Result

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [country] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "human_index",
        "node": "2FgJQbJ5QhWVXfvoaI2kqQ",
        "reason": {
          "type": "illegal_argument_exception",
          "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [country] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
        }
      }
    ],
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [country] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.",
      "caused_by": {
        "type": "illegal_argument_exception",
        "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [country] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
      }
    }
  },
  "status": 400
}

The request fails, but not because the command is malformed. country is a text field, and fielddata is disabled on text fields by default, so a terms aggregation cannot run on it. Enable fielddata on the field first by updating the mapping with a PUT request (as the error message notes, this can use significant memory, and a keyword field is the alternative).

PUT /human_index/_mapping/user
{
  "properties": {
    "country": {
      "type": "text",
      "fielddata": true
    }
  }
}
================================ result ================================
{
  "acknowledged" : true
}

Execute the statistics command again and get the result:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 6,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "human_index",
        "_type" : "user",
        "_id" : "5",
        "_score" : 1.0,
        "_source" : {
          "name" : "ln-1",
          "country" : "China GuangDong Jieyang",
          "sex" : "male",
          "desc" : "my name is leemon-1",
          "age" : 21
        }
      },
      {
        "_index" : "human_index",
        "_type" : "user",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "name" : "sb",
          "desc" : "my name is sb",
          "age" : 25,
          "country" : "China GuangDong Jieyang",
          "sex" : "female"
        }
      },
      {
        "_index" : "human_index",
        "_type" : "user",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "doc" : {
            "name" : "lmc hh",
            "country" : "China GuangDong Jieyang",
            "sex" : "male",
            "desc" : "my name is leemon",
            "age" : 11
          }
        }
      },
      {
        "_index" : "human_index",
        "_type" : "user",
        "_id" : "6",
        "_score" : 1.0,
        "_source" : {
          "name" : "ln sb",
          "country" : "China GuangDong Jieyang",
          "sex" : "male",
          "desc" : "my name is sb leemon",
          "age" : 27
        }
      },
      {
        "_index" : "human_index",
        "_type" : "user",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "name" : "hh",
          "desc" : "my name is hh",
          "age" : 25,
          "country" : "China GuangDong",
          "sex" : "female"
        }
      },
      {
        "_index" : "human_index",
        "_type" : "user",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "age" : 24,
          "country" : "China GuangDong Shenzhen",
          "sex" : "male",
          "name" : "ln",
          "desc" : "my name is lee nai"
        }
      }
    ]
  },
  "aggregations" : {
    "group_by_tag" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "china",
          "doc_count" : 5
        },
        {
          "key" : "guangdong",
          "doc_count" : 5
        },
        {
          "key" : "jieyang",
          "doc_count" : 3
        },
        {
          "key" : "shenzhen",
          "doc_count" : 1
        }
      ]
    }
  }
}

The aggregations section lists, for each term, the number of documents in which it appears
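
The bucket counts can be reproduced by hand from the country values. The sketch below is a local approximation only: terms_agg is a hypothetical helper, tokens are produced by lowercasing and splitting on whitespace, and document 4 contributes nothing because its country is stored under doc.country.

```python
from collections import Counter

# Top-level country values of documents 5, 2, 6, 1 and 3.
COUNTRIES = [
    "China GuangDong Jieyang",
    "China GuangDong Jieyang",
    "China GuangDong Jieyang",
    "China GuangDong",
    "China GuangDong Shenzhen",
]


def terms_agg(values):
    """Count, for each token, the number of documents that contain it."""
    counts = Counter()
    for value in values:
        # set() so a token repeated inside one document is counted once
        counts.update(set(value.lower().split()))
    # Highest count first; ties ordered by key for a stable listing.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


buckets = terms_agg(COUNTRIES)
for key, doc_count in buckets:
    print(key, doc_count)
```

The keys come out lowercased (china, guangdong, ...) for the same reason as in the real result: the aggregation runs on the analyzed tokens, not on the original text.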

2.6.2 Group Statistics

Group by sex, compute the average age of each group, and sort the groups by average age in descending order. Before running the query, remember to enable fielddata on sex as well

get /human_index/user/_search
{
  "aggs": {
    "group_by_tag": {
      "terms": {
        "field": "sex",
        "order": {
          "avg_age": "desc"
        }
      },
      "aggs": {
        "avg_age": {
          "avg": {
            "field": "age"
          }
        }
      }
    }
  }
}

The result looks like this:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 6,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "human_index",
        "_type" : "user",
        "_id" : "5",
        "_score" : 1.0,
        "_source" : {
          "name" : "ln-1",
          "country" : "China GuangDong Jieyang",
          "sex" : "male",
          "desc" : "my name is leemon-1",
          "age" : 21
        }
      },
      {
        "_index" : "human_index",
        "_type" : "user",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "name" : "sb",
          "desc" : "my name is sb",
          "age" : 25,
          "country" : "China GuangDong Jieyang",
          "sex" : "female"
        }
      },
      {
        "_index" : "human_index",
        "_type" : "user",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "doc" : {
            "name" : "lmc hh",
            "country" : "China GuangDong Jieyang",
            "sex" : "male",
            "desc" : "my name is leemon",
            "age" : 11
          }
        }
      },
      {
        "_index" : "human_index",
        "_type" : "user",
        "_id" : "6",
        "_score" : 1.0,
        "_source" : {
          "name" : "ln sb",
          "country" : "China GuangDong Jieyang",
          "sex" : "male",
          "desc" : "my name is sb leemon",
          "age" : 27
        }
      },
      {
        "_index" : "human_index",
        "_type" : "user",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "name" : "hh",
          "desc" : "my name is hh",
          "age" : 25,
          "country" : "China GuangDong",
          "sex" : "female"
        }
      },
      {
        "_index" : "human_index",
        "_type" : "user",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "age" : 24,
          "country" : "China GuangDong Shenzhen",
          "sex" : "male",
          "name" : "ln",
          "desc" : "my name is lee nai"
        }
      }
    ]
  },
  "aggregations" : {
    "group_by_tag" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "female",
          "doc_count" : 2,
          "avg_age" : {
            "value" : 25.0
          }
        },
        {
          "key" : "male",
          "doc_count" : 3,
          "avg_age" : {
            "value" : 24.0
          }
        }
      ]
    }
  }
}
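
The bucket values can be checked by hand. The following is a minimal Python sketch, not part of the original walkthrough: it assumes the terms aggregation runs on the top-level sex field (as the bucket keys suggest), and avg_age_by_sex is a hypothetical helper name. The data is copied from the six hits above. Document 4 stores its fields under a nested doc object, so it has no top-level sex or age, which would explain why the buckets cover only 5 of the 6 hits.

```python
from collections import defaultdict

# Top-level "sex" and "age" values copied from the six hits above.
# Document 4 keeps its fields under a nested "doc" object, so it has
# no top-level "sex" or "age" and is represented here as an empty dict.
docs = {
    "5": {"sex": "male", "age": 21},
    "2": {"sex": "female", "age": 25},
    "4": {},
    "6": {"sex": "male", "age": 27},
    "1": {"sex": "female", "age": 25},
    "3": {"sex": "male", "age": 24},
}


def avg_age_by_sex(documents):
    """Group documents by top-level "sex" and return (doc_count, avg_age) per group."""
    ages = defaultdict(list)
    for source in documents.values():
        # Skip documents that lack the top-level fields (document 4 here).
        if "sex" in source and "age" in source:
            ages[source["sex"]].append(source["age"])
    return {sex: (len(values), sum(values) / len(values)) for sex, values in ages.items()}


result = avg_age_by_sex(docs)
for sex, (count, avg) in sorted(result.items()):
    print(sex, count, avg)
```

This prints "female 2 25.0" and "male 3 24.0", matching the doc_count and avg_age values in the response.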

2.6.3 Interval grouping

Split the ages into ranges and group the documents by age range. Within each range, group by sex, calculate the average age of each sex group, and sort those groups by average age in descending order

get /human_index/user/_search
{
  "aggs": {
    "group_age_range": {
      "range": {
        "field": "age",
        "ranges": [
          {
            "from": 0,
            "to": 10
          },
          {
            "from": 11,
            "to": 20
          },
          {
            "from": 21,
            "to": 25
          },
          {
            "from": 25,
            "to": 30
          }
        ]
      },
      "aggs": {
        "group_by_sex": {
          "terms": {
            "field": "sex",
            "order": {
              "avg_age": "desc"
            }
          },
          "aggs": {
            "avg_age": {
              "avg": {
                "field": "age"
              }
            }
          }
        }
      }
    }
  }
}

The aggregations section of the response is as follows:

{
    "group_age_range" : {
        "buckets" : [
            {
                "key" : "0.0-10.0",
                "from" : 0.0,
                "to" : 10.0,
                "doc_count" : 0,
                "group_by_sex" : {
                    "doc_count_error_upper_bound" : 0,
                    "sum_other_doc_count" : 0,
                    "buckets" : [ ]
                }
            },
            {
                "key" : "11.0-20.0",
                "from" : 11.0,
                "to" : 20.0,
                "doc_count" : 0,
                "group_by_sex" : {
                    "doc_count_error_upper_bound" : 0,
                    "sum_other_doc_count" : 0,
                    "buckets" : [ ]
                }
            },
            {
                "key" : "21.0-25.0",
                "from" : 21.0,
                "to" : 25.0,
                "doc_count" : 2,
                "group_by_sex" : {
                    "doc_count_error_upper_bound" : 0,
                    "sum_other_doc_count" : 0,
                    "buckets" : [
                        {
                            "key" : "male",
                            "doc_count" : 2,
                            "avg_age" : {
                                "value" : 22.5
                            }
                        }
                    ]
                }
            },
            {
                "key" : "25.0-30.0",
                "from" : 25.0,
                "to" : 30.0,
                "doc_count" : 3,
                "group_by_sex" : {
                    "doc_count_error_upper_bound" : 0,
                    "sum_other_doc_count" : 0,
                    "buckets" : [
                        {
                            "key" : "male",
                            "doc_count" : 1,
                            "avg_age" : {
                                "value" : 27.0
                            }
                        },
                        {
                            "key" : "female",
                            "doc_count" : 2,
                            "avg_age" : {
                                "value" : 25.0
                            }
                        }
                    ]
                }
            }
        ]
    }
}
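
In a range aggregation, each range includes its from value and excludes its to value. That is why the two documents with age 25 are counted in 25.0-30.0 rather than in 21.0-25.0. It also means the ranges in this request leave gaps: an age of exactly 10 or 20 would match no bucket. The following is a rough Python sketch of the same bucketing, offered as an illustration rather than as how Elasticsearch implements it. The function range_buckets is a hypothetical name, and the data covers only the five documents that have top-level sex and age fields.

```python
from collections import defaultdict

# (sex, age) pairs for the five documents with top-level fields (ids 5, 2, 6, 1, 3).
people = [("male", 21), ("female", 25), ("male", 27), ("female", 25), ("male", 24)]
# Same boundaries as the "ranges" array in the request.
ranges = [(0, 10), (11, 20), (21, 25), (25, 30)]


def range_buckets(people, ranges):
    """Mimic range -> terms(sex) -> avg(age), ordering sex groups by avg_age descending."""
    buckets = {}
    for lower, upper in ranges:
        by_sex = defaultdict(list)
        for sex, age in people:
            if lower <= age < upper:  # "from" is inclusive, "to" is exclusive
                by_sex[sex].append(age)
        averages = [(sex, sum(a) / len(a)) for sex, a in by_sex.items()]
        averages.sort(key=lambda item: item[1], reverse=True)
        buckets[f"{float(lower)}-{float(upper)}"] = averages
    return buckets


buckets = range_buckets(people, ranges)
for key, averages in buckets.items():
    print(key, averages)
```

The first two buckets come out empty, 21.0-25.0 holds male with 22.5, and 25.0-30.0 holds male with 27.0 followed by female with 25.0, in the same order as the response above.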

2.7 Mapping

Through _mapping, you can view and set the data type and other attributes of each field in each type.

2.7.1 View the mappings of all types

get /human_index/_mapping
================================ result ================================
#! Deprecation: [types removal] The parameter include_type_name should be explicitly specified in get mapping requests to prepare for 7.0. In 7.0 include_type_name will default to 'false', which means responses will omit the type name in mapping definitions.
{
  "human_index" : {
    "mappings" : {
      "user" : {
        "properties" : {
          "age" : {
            "type" : "long"
          },
          "country" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            },
            "fielddata" : true
          },
          "desc" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "doc" : {
            "properties" : {
              "age" : {
                "type" : "long"
              },
              "country" : {
                "type" : "text",
                "fields" : {
                  "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                  }
                }
              },
              "desc" : {
                "type" : "text",
                "fields" : {
                  "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                  }
                }
              },
              "name" : {
                "type" : "text",
                "fields" : {
                  "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                  }
                }
              },
              "sex" : {
                "type" : "text",
                "fields" : {
                  "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                  }
                }
              }
            }
          },
          "name" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "sex" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            },
            "fielddata" : true
          },
          "tags" : {
            "type" : "text",
            "fielddata" : true
          }
        }
      }
    }
  }
}
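
The response is deeply nested: object fields such as doc keep their sub-fields under properties, and multi-fields such as the keyword variant live under fields. To get a quick overview, the structure can be flattened into dotted field paths. The following is a small Python sketch under stated assumptions: flatten is a hypothetical helper, and the mapping dict is an abridged copy of the response above that keeps only age, sex and part of doc.

```python
# Abridged copy of the response above: only "age", "sex" and part of the
# nested "doc" object are kept to keep the example short.
mapping = {
    "human_index": {"mappings": {"user": {"properties": {
        "age": {"type": "long"},
        "sex": {
            "type": "text",
            "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
            "fielddata": True,
        },
        "doc": {"properties": {"age": {"type": "long"}}},
    }}}}
}


def flatten(properties, prefix=""):
    """Yield (field_path, type) pairs, descending into objects and multi-fields."""
    for name, spec in properties.items():
        path = prefix + name
        if "type" in spec:
            yield path, spec["type"]
        # Object fields nest their sub-fields under "properties";
        # multi-fields such as sex.keyword live under "fields".
        for child_key in ("properties", "fields"):
            if child_key in spec:
                yield from flatten(spec[child_key], path + ".")


fields = dict(flatten(mapping["human_index"]["mappings"]["user"]["properties"]))
for path, field_type in fields.items():
    print(path, field_type)
```

The flattened list makes it easy to see that doc.age is a separate field from age, so a query or aggregation on age does not touch values stored under doc.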

2.7.2 View the mapping of a single type

get /human_index/_mapping/user

Because there is currently only one index, human_index, and it contains only one type, user, the result is essentially the same as in 2.7.1

2.7.3 Modify mapping

To modify a mapping, refer to how the fielddata attribute was changed in 2.6.1


Posted by ksmatthews on Thu, 19 Jan 2023 10:18:07 +0530