6

ElasticSearch学习笔记

 3 years ago
source link: https://jiajunhuang.com/articles/2020_07_04-elasticsearch.md.html
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

ElasticSearch学习笔记

搭建单节点ES

用Docker来搭建是比较简单的方式:

$ docker pull elasticsearch
$ docker run -d --name elasticsearch -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" elasticsearch

ES引入了几个新的概念,我们和数据库对比一下:

ES vs MySQL

可以看到ES对应数据库的概念,然后看下ES在URL里如何对应这些概念:

http://host:port/[index]/[type]/[_action/id]

还有一个概念就是document,其实就是每一个JSON就是一个document。比如插入一个document:

$ http POST :9200/customer/doc2/1 name="John Doe"

这里,customer 就是index,doc2就是type,1 就是id,如果不想指定id,想要实现数据库里自增id的方式,就这样:

$ http POST :9200/customer/doc/ name="John Doye"
HTTP/1.1 201 Created
Location: /customer/doc/AXMTYwk4PDh6wP1Uklpb
content-encoding: gzip
content-length: 156
content-type: application/json; charset=UTF-8

{
    "_id": "AXMTYwk4PDh6wP1Uklpb",
    "_index": "customer",
    "_shards": {
        "failed": 0,
        "successful": 1,
        "total": 2
    },
    "_type": "doc",
    "_version": 1,
    "created": true,
    "result": "created"
}

看,返回结果中, 就会把自动生成的id一起返回。 获取文档就要用到这个id:

$ http :9200/customer/doc/AXMTYwk4PDh6wP1Uklpb
HTTP/1.1 200 OK
content-encoding: gzip
content-length: 137
content-type: application/json; charset=UTF-8

{
    "_id": "AXMTYwk4PDh6wP1Uklpb",
    "_index": "customer",
    "_source": {
        "name": "John Doye"
    },
    "_type": "doc",
    "_version": 1,
    "found": true
}

ES本身就是为了搜索的,我们来看下如何搜索,搜索就是往 _search 这个endpoint请求:

$ http :9200/customer/_search
HTTP/1.1 200 OK
content-encoding: gzip
content-length: 258
content-type: application/json; charset=UTF-8

{
    "_shards": {
        "failed": 0,
        "skipped": 0,
        "successful": 5,
        "total": 5
    },
    "hits": {
        "hits": [
            {
                "_id": "AXMTWg-BPDh6wP1UklpX",
                "_index": "customer",
                "_score": 1.0,
                "_source": {
                    "name": "John Doge"
                },
                "_type": "doc"
            },
            {
                "_id": "2",
                "_index": "customer",
                "_score": 1.0,
                "_source": {
                    "name": "John Dog"
                },
                "_type": "doc"
            },
            {
                "_id": "AXMTYwk4PDh6wP1Uklpb",
                "_index": "customer",
                "_score": 1.0,
                "_source": {
                    "name": "John Doye"
                },
                "_type": "doc"
            },
            {
                "_id": "1",
                "_index": "customer",
                "_score": 1.0,
                "_source": {
                    "name": "John Doe"
                },
                "_type": "doc"
            },
            {
                "_id": "1",
                "_index": "customer",
                "_score": 1.0,
                "_source": {
                    "name": "John Doe"
                },
                "_type": "doc2"
            },
            {
                "_id": "_create",
                "_index": "customer",
                "_score": 1.0,
                "_source": {
                    "name": "John Doge"
                },
                "_type": "doc"
            }
        ],
        "max_score": 1.0,
        "total": 6
    },
    "timed_out": false,
    "took": 62
}

瞧,这样,不带条件,就把所有的文档搜出来了。如果只想搜索一个 type 里的,那就:http :9200/customer/doc2/_search。 如果想搜索整个ES里的,那就:http :9200/_search

更复杂的搜索,就要用到Elastic的Query DSL来进行操作了。

Query DSL

Query DSL比较灵活,代价就是相对复杂,其实是用JSON的形式,来表达查询规则。分为两种:

  • query。query会模糊查找文档,然后根据匹配程度有一个打分,根据打分来排序。
  • filter。filter就是看是否匹配,结果要么匹配,要么不匹配。相对简单。

最简单的DSL如下:

{
  "query":{
    "match_all": {}
  }
}

作用就是查询所有的文档。

可以根据 sizefrom 来指定从何处开始取结果,取多少:

$ cat query.json 
{
  "query":{
    "match_all": {}
  },
  "from": 2,
  "size": 1
}

$ http GET :9200/_search < query.json 
HTTP/1.1 200 OK
content-encoding: gzip
content-length: 203
content-type: application/json; charset=UTF-8

{
    "_shards": {
        "failed": 0,
        "skipped": 0,
        "successful": 5,
        "total": 5
    },
    "hits": {
        "hits": [
            {
                "_id": "AXMTYwk4PDh6wP1Uklpb",
                "_index": "customer",
                "_score": 1.0,
                "_source": {
                    "name": "John Doye"
                },
                "_type": "doc"
            }
        ],
        "max_score": 1.0,
        "total": 6
    },
    "timed_out": false,
    "took": 2
}

除了上面的 match_all,ES还有好几个查询语句,他们都是放在 query 里面,我们来看看:

  • match_all: 查询全部
  • match: 简单匹配
  • multi_match: 在多个字段上执行相同的match查询
  • query_string: 可以在查询里边使用AND或者OR来完成复杂的查询
  • term: term可以用来精确匹配,精确匹配的值可以是数字、时间、布尔值
  • range: range用来查询落在指定区间内的数字或者时间
  • bool: bool可以通过 must, must_not, filter, should 把多个查询条件组合起来

聚合查询就更强大了,这个还是直接看文档吧:https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations.html


微信公众号
关注公众号,获得及时更新

在KVM里安装Minikube

搞定面试中的系统设计题

Crontab + Sendmail实现定时任务并且通知

Nginx设置Referer来防止盗图

Graphviz dot简明教程

jQuery简明教程

Python RQ(Redis Queue)添加gevent支持

读《超级运营术》- 如何做社区?

技术人,光有技术是不行的

搭建aria2服务器

使用shairport-sync搭建airplay音频服务器

VirtManager Windows自适应屏幕

使用btrfs组建RAID1

Swagger? 不好用

Golang/Python最佳实践




Recommend

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK