Elasticsearch

A distributed search engine.

  1. Download: https://www.elastic.co/cn/downloads/elasticsearch
  2. Extract the archives
    Extract Elasticsearch: unzip elasticsearch-5.3.2.zip
    Extract Kibana: tar zxvf kibana-5.3.2-darwin-x86_64.tar.gz
  3. Start Elasticsearch
    cd elasticsearch-5.3.2
    Run Elasticsearch: ./bin/elasticsearch
    Open 127.0.0.1:9200 in a browser to see basic information about the node:
     {
         "name": "umA0x2J",
         "cluster_name": "elasticsearch",
         "cluster_uuid": "Z7YY0ULAT2OMGxVlTlDhRA",
         "version": {
             "number": "5.3.2",
             "build_hash": "3068195",
             "build_date": "2017-04-24T16:15:59.481Z",
             "build_snapshot": false,
             "lucene_version": "6.4.2"
         },
         "tagline": "You Know, for Search"
     }
    
    127.0.0.1:9200/_cat lists the available _cat endpoints
    127.0.0.1:9200/_cat/health shows the cluster health:
    1539651266 08:54:26 elasticsearch green 1 1 0 0 0 0 0 0 - 100.0%
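
The health row above is a list of unlabeled columns, so a small parser can make it easier to read. The following is a minimal sketch; the column names are an assumption based on the default `_cat/health` header (appending `?v` to the URL prints the header, which should be used to confirm them), and `parse_cat_health` is a hypothetical helper.

```python
# Column names assumed to match the default `_cat/health` header, in order.
HEALTH_COLUMNS = [
    "epoch", "timestamp", "cluster", "status",
    "node.total", "node.data", "shards", "pri", "relo", "init",
    "unassign", "pending_tasks", "max_task_wait_time",
    "active_shards_percent",
]


def parse_cat_health(line: str) -> dict:
    """Split one whitespace-separated `_cat/health` row into named fields."""
    values = line.split()
    if len(values) != len(HEALTH_COLUMNS):
        raise ValueError(
            f"expected {len(HEALTH_COLUMNS)} columns, got {len(values)}"
        )
    return dict(zip(HEALTH_COLUMNS, values))


row = parse_cat_health(
    "1539651266 08:54:26 elasticsearch green 1 1 0 0 0 0 0 0 - 100.0%"
)
print(row["cluster"], row["status"], row["active_shards_percent"])
```

All values stay strings here; convert the numeric ones as needed.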

In this state anyone who can reach port 9200 can read and write data freely. To prevent that, install X-Pack to add security.

Installing X-Pack

Install X-Pack: ./bin/elasticsearch-plugin install file:///Users/albin/Downloads/x-pack-5.3.2.zip
Restart Elasticsearch: ./bin/elasticsearch
Initialize the passwords of the built-in users:
./bin/x-pack/setup-passwords auto

Changed password for user kibana
PASSWORD kibana = 4USn7ktCajr1FhrurORZ

Changed password for user logstash_system
PASSWORD logstash_system = 2uIzU9rUktZWOvC3StP0

Changed password for user elastic
PASSWORD elastic = 8DI6Wylew640GBji0UZ8

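Starting Kibana

cd kibana-6.2.4-darwin-x86_64 # the archive extracted above was 5.3.2; use the directory of the version you installed
vim config/kibana.yml # open the Kibana configuration
#elasticsearch.username: "user" => "kibana" # uncomment and set the user Kibana connects to Elasticsearch as
#elasticsearch.password: "pass" => "4USn7ktCajr1FhrurORZ" # uncomment and set that user's password
./bin/kibana    # start Kibana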

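Using Kibana
Open http://localhost:5601 in a browser and log in with the username elastic and the password 8DI6Wylew640GBji0UZ8.
Log in to Kibana as the elastic user rather than kibana. The kibana user is not an administrator: it cannot view users or import a license, and requests such as creating an index are rejected with the error below. If that happens, log in again as elastic.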

{
  "error": {
    "root_cause": [
      {
        "type": "security_exception",
        "reason": "action [indices:admin/create] is unauthorized for user [kibana]"
      }
    ],
    "type": "security_exception",
    "reason": "action [indices:admin/create] is unauthorized for user [kibana]"
  },
  "status": 403
}
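
Once security is enabled, requests sent outside Kibana also need credentials. The sketch below builds an HTTP Basic `Authorization` header with the standard library only; `authenticated_request` is a hypothetical helper, the password is a placeholder, and the request is built but not sent.

```python
import base64
import urllib.request


def authenticated_request(url: str, user: str, password: str) -> urllib.request.Request:
    """Build a request carrying HTTP Basic credentials (nothing is sent here)."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


# Placeholder password: substitute the one printed by setup-passwords.
req = authenticated_request("http://127.0.0.1:9200/", "elastic", "<elastic-password>")
print(req.get_header("Authorization"))
# urllib.request.urlopen(req) would send the request to a running node.
```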

Dev Tools

Run REST commands against Elasticsearch to perform CRUD operations.

GET /
Response 200
{
  "name": "KXsFyxR",
  "cluster_name": "elasticsearch",
  "cluster_uuid": "KbMywZ8TSSa2ZczKB390Hw",
  "version": {
    "number": "6.2.4",
    "build_hash": "ccec39f",
    "build_date": "2018-04-12T20:37:28.497551Z",
    "build_snapshot": false,
    "lucene_version": "7.2.1",
    "minimum_wire_compatibility_version": "5.6.0",
    "minimum_index_compatibility_version": "5.0.0"
  },
  "tagline": "You Know, for Search"
}
CRUD

Writing data

POST twitter/doc/1
Request body
{
    "user": "model",
    "uid" : 1,
    "city" : "changsha",
    "provice" : "Hunan",
    "country" : "China"
}

### Parameters
- POST: writes the data to Elasticsearch with an HTTP POST request
- twitter: the index name
- doc: the type
- 1: the document ID

Response
{
  "_index": "twitter", // index
  "_type": "doc", // type
  "_id": "1", // ID
  "_version": 1,
  "result": "created", // outcome of the operation
  "_shards": { // shard counts for this write
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 0,
  "_primary_term": 1
}
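
To make the role of each URL segment concrete, here is a minimal sketch that assembles the same request from its parts. `index_request` is a hypothetical helper; it only builds the method, path, and JSON body and does not contact a server.

```python
import json


def index_request(index: str, doc_type: str, doc_id, source: dict):
    """Return (method, path, body) for indexing `source` under an explicit ID."""
    path = f"{index}/{doc_type}/{doc_id}"
    body = json.dumps(source, ensure_ascii=False)
    return "POST", path, body


method, path, body = index_request(
    "twitter", "doc", 1,
    {"user": "model", "uid": 1, "city": "changsha",
     "provice": "Hunan", "country": "China"},
)
print(method, path)
print(body)
```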

Reading the document back

# When the document exists
GET twitter/doc/1
Response [found: true]
{
  "_index": "twitter",
  "_type": "doc",
  "_id": "1",
  "_version": 1,
  "found": true,
  "_source": {
    "user": "model",
    "uid": 1,
    "city": "changsha",
    "provice": "Hunan",
    "country": "China"
  }
}

# When the document does not exist
GET twitter/doc/2
Response [found: false]
{
  "_index": "twitter",
  "_type": "doc",
  "_id": "2",
  "found": false
}
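
The two responses differ in the `found` flag, and `_source` is present only when the document exists. A minimal sketch of handling both cases in client code follows; `extract_source` is a hypothetical helper and the sample dictionaries are trimmed copies of the responses above.

```python
from typing import Optional


def extract_source(response: dict) -> Optional[dict]:
    """Return the stored document, or None when `found` is false."""
    if response.get("found"):
        return response["_source"]
    return None


existing = {
    "_index": "twitter", "_type": "doc", "_id": "1", "_version": 1,
    "found": True,
    "_source": {"user": "model", "uid": 1, "city": "changsha"},
}
missing = {"_index": "twitter", "_type": "doc", "_id": "2", "found": False}

print(extract_source(existing))
print(extract_source(missing))
```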

Updating a document (PUT replaces the whole document stored under that ID)

PUT twitter/doc/1
Request 
{
    "user": "model",
    "uid" : 1,
    "city" : "长沙",
    "provice" : "湖南",
    "country" : "中国",
    "location" : {
      "lat" : "29.084661",
      "lon" : "111.335210"
    }
}
Response
{
  "_index": "twitter",
  "_type": "doc",
  "_id": "1",
  "_version": 5,
  "result": "updated",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 4,
  "_primary_term": 1
}

Deleting a document

DELETE twitter/doc/1

Response
{
  "_index": "twitter",
  "_type": "doc",
  "_id": "1",
  "_version": 6,
  "result": "deleted",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 5,
  "_primary_term": 3
}

Searching

GET twitter/_search

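Use the _bulk API to index several documents in one request. The body is newline-delimited JSON: each pair of lines is an action line (here an index operation) followed by the document source.

POST _bulk
{"index":{"_index":"twitter","_type":"doc"}}
{"user":"双榆树-张三","message":"今儿天气不错啊,出去转转去","uid":2,"age":20,"city":"北京","provice":"北京","country":"中国","address":"中国北京市海淀区","location":{"lat":"39.970718","lon":"116.325747"}}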
{"index":{"_index":"twitter","_type":"doc"}}
{"user":"东城区-老刘","message":"出发,下一站云南!","uid":3,"age":30,"city":"北京","provice":"北京","country":"中国","address":"中国北京市东城区台基厂条三号","location":{"lat":"39.9043413","lon":"116.412754"}}
{ "index" : {"_index" : "twitter","_type" : "doc"}}
{ "user" : "东城区-李四","message":"happy birthday!","uid":4,"age":30,"city":"北京","provice":"北京","country":"中国","address":"中国北京市东城区","location":{"lat":"39.893801","lon":"116.408986"}}
{"index":{"_index":"twitter","_type":"doc"}}
{ "user" : "朝阳区-老贾","message":"123,gogogo","uid":5,"age":35,"city":"北京","provice":"北京","country":"中国","address":"中国北京市朝阳区建国门","location":{"lat":"39.718256","lon":"116.367910"}}
{ "index" : {"_index" : "twitter","_type" : "doc"}}
{ "user" : "朝阳区-老王","message":"Happy Birthday My friend!","uid":6,"age":50,"city":"北京","provice":"北京","country":"中国","address":"中国北京市朝阳区国贸","location":{"lat":"39.918256","lon":"116.467910"}}
{ "index" : {"_index" : "twitter","_type" : "doc"}}
{ "user" : "虹桥-老吴","message":"好友来了都今天我生日好友来了,什么 birthday happy 就成!","uid":7,"age":90,"city":"上海","provice":"上海","country":"中国","address":"中国上海市闵行区","location":{"lat":"31.175927","lon":"121.383328"}}
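
Writing the alternating action and source lines by hand is error-prone, so the body is usually generated. The following is a minimal sketch that produces the same newline-delimited format; `build_bulk_body` is a hypothetical helper and the two documents are shortened versions of the ones above.

```python
import json


def build_bulk_body(index: str, doc_type: str, docs: list) -> str:
    """Return an NDJSON bulk body: one action line, then one source line per doc."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc, ensure_ascii=False))
    # The bulk body has to end with a newline character.
    return "\n".join(lines) + "\n"


docs = [
    {"user": "双榆树-张三", "uid": 2, "age": 20, "city": "北京"},
    {"user": "虹桥-老吴", "uid": 7, "age": 90, "city": "上海"},
]
body = build_bulk_body("twitter", "doc", docs)
print(body, end="")
```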

Filtering by condition

# Find documents whose city is 北京 (Beijing)
GET twitter/_search
Request
{
    "query":{"match":{
        "city":"北京"
    }}
}
Response
{
  "took": 11,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 6,
    "max_score": 1.3862944,
    "hits": [
      {
        "_index": "twitter",
        "_type": "doc",
        "_id": "P2PmhGYBdftA3KUc0kxE",
        "_score": 1.3862944,
        "_source": {
          "user": "东城区-李四",
          "message": "happy birthday!",
          "uid": 4,
          "age": 30,
          "city": "北京",
          "provice": "北京",
          "country": "中国",
          "address": "中国北京市东城区",
          "location": {
            "lat": "39.893801",
            "lon": "116.408986"
          }
        }
      },
      {
        "_index": "twitter",
        "_type": "doc",
        "_id": "PWPmhGYBdftA3KUc0kxE",
        "_score": 0.5753642,
        "_source": {
          "user": "双榆树-张三",
          "message": "今儿天气不错啊,出去转转去",
          "uid": 2,
          "age": 20,
          "city": "北京",
          "provice": "北京",
          "country": "中国",
          "address": "中国北京市海淀区",
          "location": {
            "lat": "39.970718",
            "lon": "116.325747"
          }
        }
      },
      {
        "_index": "twitter",
        "_type": "doc",
        "_id": "pmPlhGYBdftA3KUcNUvi",
        "_score": 0.5753642,
        "_source": {
          "user": "双榆树-张三",
          "message": "今儿天气不错啊,出去转转去",
          "uid": 2,
          "age": 20,
          "city": "北京",
          "provice": "北京",
          "country": "中国",
          "address": "中国北京市海淀区",
          "location": {
            "lat": "39.970718",
            "lon": "116.325747"
          }
        }
      },
      {
        "_index": "twitter",
        "_type": "doc",
        "_id": "PmPmhGYBdftA3KUc0kxE",
        "_score": 0.5753642,
        "_source": {
          "user": "东城区-老刘",
          "message": "出发,下一站云南!",
          "uid": 3,
          "age": 30,
          "city": "北京",
          "provice": "北京",
          "country": "中国",
          "address": "中国北京市东城区台基厂条三号",
          "location": {
            "lat": "39.9043413",
            "lon": "116.412754"
          }
        }
      },
      {
        "_index": "twitter",
        "_type": "doc",
        "_id": "QGPmhGYBdftA3KUc0kxE",
        "_score": 0.36464313,
        "_source": {
          "user": "朝阳区-老贾",
          "message": "123,gogogo",
          "uid": 5,
          "age": 35,
          "city": "北京",
          "provice": "北京",
          "country": "中国",
          "address": "中国北京市朝阳区建国门",
          "location": {
            "lat": "39.718256",
            "lon": "116.367910"
          }
        }
      },
      {
        "_index": "twitter",
        "_type": "doc",
        "_id": "QWPmhGYBdftA3KUc0kxE",
        "_score": 0.36464313,
        "_source": {
          "user": "朝阳区-老王",
          "message": "Happy Birthday My friend!",
          "uid": 6,
          "age": 50,
          "city": "北京",
          "provice": "北京",
          "country": "中国",
          "address": "中国北京市朝阳区国贸",
          "location": {
            "lat": "39.918256",
            "lon": "116.467910"
          }
        }
      }
    ]
  }
}
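
The matching documents sit under `hits.hits`, each with its relevance `_score` and the original `_source`. A minimal sketch of pulling out the ranked results follows; `summarize_hits` is a hypothetical helper and the sample is a trimmed copy of the response above.

```python
def summarize_hits(response: dict) -> list:
    """Return (score, user) pairs in the order Elasticsearch ranked them."""
    return [
        (hit["_score"], hit["_source"]["user"])
        for hit in response["hits"]["hits"]
    ]


sample = {
    "hits": {
        "total": 2,
        "max_score": 1.3862944,
        "hits": [
            {"_id": "P2PmhGYBdftA3KUc0kxE", "_score": 1.3862944,
             "_source": {"user": "东城区-李四", "city": "北京"}},
            {"_id": "PWPmhGYBdftA3KUc0kxE", "_score": 0.5753642,
             "_source": {"user": "双榆树-张三", "city": "北京"}},
        ],
    }
}

for score, user in summarize_hits(sample):
    print(f"{score:.4f}  {user}")
```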


# Multiple conditions: city is 北京 and age is 30
GET twitter/_search
Request
{
  "query":{
    "bool":{
      "must":[
        {
          "match":{
            "city":"北京"
          }
        },
        {
          "match":{
            "age":"30"
          }
        }
      ]
    }
  }
}
Response
{
  "took": 17,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 2.3862944,
    "hits": [
      {
        "_index": "twitter",
        "_type": "doc",
        "_id": "P2PmhGYBdftA3KUc0kxE",
        "_score": 2.3862944,
        "_source": {
          "user": "东城区-李四",
          "message": "happy birthday!",
          "uid": 4,
          "age": 30,
          "city": "北京",
          "provice": "北京",
          "country": "中国",
          "address": "中国北京市东城区",
          "location": {
            "lat": "39.893801",
            "lon": "116.408986"
          }
        }
      },
      {
        "_index": "twitter",
        "_type": "doc",
        "_id": "PmPmhGYBdftA3KUc0kxE",
        "_score": 1.5753641,
        "_source": {
          "user": "东城区-老刘",
          "message": "出发,下一站云南!",
          "uid": 3,
          "age": 30,
          "city": "北京",
          "provice": "北京",
          "country": "中国",
          "address": "中国北京市东城区台基厂条三号",
          "location": {
            "lat": "39.9043413",
            "lon": "116.412754"
          }
        }
      }
    ]
  }
}

# Multiple conditions: city is not 北京 and age is not 30 (just replace must with must_not)
GET twitter/_search
{
  "query":{
    "bool":{
      "must_not":[
        {
          "match":{
            "city":"北京"
          }
        },
        {
          "match":{
            "age":"30"
          }
        }
      ]
    }
  }
}

should: match any one of the conditions

GET twitter/_search
{
  "query": {
    "bool": {"should": [
      {"match": {
        "city": "北京"
      }},{
        "match": {
          "city": "上海"
        }
      }
    ]}
  }
}
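
The must, must_not, and should requests share one shape, so the bodies can be generated from a single function. This is a minimal sketch; `match` and `bool_query` are hypothetical helpers that only build the JSON body.

```python
import json


def match(field: str, value) -> dict:
    """One `match` clause on a single field."""
    return {"match": {field: value}}


def bool_query(must=None, must_not=None, should=None) -> dict:
    """Combine clauses into a `bool` query, omitting empty sections."""
    clauses = {}
    if must:
        clauses["must"] = list(must)
    if must_not:
        clauses["must_not"] = list(must_not)
    if should:
        clauses["should"] = list(should)
    return {"query": {"bool": clauses}}


both = bool_query(must=[match("city", "北京"), match("age", "30")])
either = bool_query(should=[match("city", "北京"), match("city", "上海")])
print(json.dumps(both, ensure_ascii=False, indent=2))
```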

Counting the matching documents

GET twitter/_count
{
  "query": {
    "bool": {"should": [
      {"match": {
        "city": "北京"
      }},{
        "match": {
          "city": "上海"
        }
      }
    ]}
  }
}
Response
{
  "count": 7,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  }
}

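Geo-location queries (Mapping)

GET twitter/_mapping # shows the mapping that was created dynamically

# Dynamic mapping does not infer the intended field types (location, for instance, is not detected as a geo_point), so the mapping has to be defined by hand. Delete the existing index first with `DELETE twitter`.

# Recreate the index (with the number of shards set to 1)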
PUT twitter
{
    "settings": {"number_of_shards": 1}
}
Response
{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "twitter"
}

# Define the field types
PUT twitter/doc/_mapping
{
  "properties": {
    "address":{
      "type": "text",
      "fields": {
        "keyword":{
          "type": "keyword",
          "ignore_above": 256
        }
      }
    },
    "city":{
      "type": "keyword" // not analyzed: the value is indexed as one term, which keeps the index smaller
    },
    "country":{
      "type": "keyword"
    },
    "location":{
      "type": "geo_point"
    },
    "provice":{
      "type": "keyword"
    },
    "uid":{
      "type": "long"
    },
    "user":{
      "type": "text",
      "fields": {
        "keyword":{
          "type": "keyword",
          "ignore_above": 256
        }
      }
    }
  }
}
Response
{
  "acknowledged": true
}
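
The mapping body repeats two patterns: plain typed fields, and text fields with a `keyword` sub-field. A minimal sketch of generating it from a compact field list follows; `build_mapping`, `text_with_keyword`, and the `"text+keyword"` shorthand are hypothetical names used only for this illustration.

```python
import json


def text_with_keyword(ignore_above: int = 256) -> dict:
    """A text field with a `keyword` sub-field for exact matching and sorting."""
    return {
        "type": "text",
        "fields": {"keyword": {"type": "keyword", "ignore_above": ignore_above}},
    }


def build_mapping(field_types: dict) -> dict:
    """Expand {field: type} into a `properties` mapping body."""
    properties = {}
    for name, kind in field_types.items():
        if kind == "text+keyword":  # hypothetical shorthand for this sketch
            properties[name] = text_with_keyword()
        else:
            properties[name] = {"type": kind}
    return {"properties": properties}


mapping = build_mapping({
    "address": "text+keyword",
    "city": "keyword",
    "country": "keyword",
    "location": "geo_point",
    "provice": "keyword",
    "uid": "long",
    "user": "text+keyword",
})
print(json.dumps(mapping, indent=2))
```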

# Re-run the _bulk request above to index the documents again


### Searching by location with the new mapping

# Find documents whose address matches 北京
GET twitter/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "address": "北京"
          }
        }
      ]
    }
  },
  "post_filter": { // and located within 3 km of Chaowai SOHO (39.920086,116.454182)
    "geo_distance": {
      "distance": "3km",
      "location": { 
        "lat": 39.920086,
        "lon": 116.454182
      }
    }
  },
  "sort": [ // then sort the results in a custom order
    {
      "_geo_distance": { // sort by distance from this point
        "location": "39.920086,116.454182",
        "order": "asc",
        "unit": "km" // unit of the distance reported in each hit's sort value
      }
    }
  ]
      }
    }
  ]
}
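
To see which of the sample documents a 3 km filter would keep, the distances can be estimated locally. The sketch below uses the haversine formula with a mean Earth radius of 6371 km; both are assumptions of this illustration and may differ slightly from the calculation Elasticsearch performs. The coordinates are taken from the bulk data above.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an assumption of this sketch


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


CENTER = (39.920086, 116.454182)  # Chaowai SOHO, from the query above
POINTS = {
    "朝阳区-老王": (39.918256, 116.467910),
    "东城区-李四": (39.893801, 116.408986),
    "东城区-老刘": (39.9043413, 116.412754),
}

distances = {user: haversine_km(*CENTER, *point) for user, point in POINTS.items()}
# Keep points within 3 km, nearest first, like the filter plus the asc sort.
nearby = sorted((d, user) for user, d in distances.items() if d <= 3.0)
for d, user in nearby:
    print(f"{user}: {d:.2f} km")
```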

range: searching within a range

# range query: match values within a range (gte and lte include the bounds)
GET twitter/_search
{
  "query": {
    "range": {
      "age": {
        "gte": 30,
        "lte": 40
      }
    }
  },
  "sort": [ // sort by uid
    {
      "uid": {
        "order": "desc"
      }
    }
  ]
}

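Full-text search

# Phrase search ("^" mode): match_phrase requires the terms to appear next to each other, in this order
GET twitter/_search
{
  "query": {
    "match_phrase": {
      "message": "Happy birthday"
    }
  }
}

# Term search ("like" mode): match returns documents that contain any of the terms
GET twitter/_search
{
  "query": {
    "match": {
      "message": "Happy birthday"
    }
  },
  "highlight": { // highlight the matched keywords (wraps them in em tags)
    "fields": {
      "message": {}
    }
  }
}
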
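
The difference between the two queries shows up on the sample messages. The sketch below emulates it locally under a simplifying assumption: text is lowercased and split into runs of ASCII letters and digits, which is much cruder than the real analyzer (Chinese characters are ignored entirely). `tokenize`, `match_any`, and `match_phrase` are hypothetical helpers.

```python
import re


def tokenize(text: str) -> list:
    """Very rough tokenizer: lowercase, keep runs of ASCII letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def match_any(message: str, query: str) -> bool:
    """Like `match` with its default OR: at least one query term occurs."""
    terms = set(tokenize(query))
    return any(token in terms for token in tokenize(message))


def match_phrase(message: str, query: str) -> bool:
    """Like `match_phrase`: the query terms occur adjacent and in order."""
    tokens, phrase = tokenize(message), tokenize(query)
    n = len(phrase)
    return n > 0 and any(
        tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1)
    )


MESSAGES = [
    "happy birthday!",
    "Happy Birthday My friend!",
    "好友来了都今天我生日好友来了,什么 birthday happy 就成!",
    "123,gogogo",
]
for message in MESSAGES:
    print(match_any(message, "Happy birthday"),
          match_phrase(message, "Happy birthday"), message)
```

The third message contains both words but in the reverse order, so it satisfies the term search and not the phrase search.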
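Aggregating search results

# range aggregation: count documents per age bucket (from is inclusive, to is exclusive)
GET twitter/_search
{
  "size": 0, // with size 0 no hits are returned, only the aggregation results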
  "aggs": {
    "age": {
      "range": {
        "field": "age",
        "ranges": [
          {
            "from": 20,
            "to": 30
          },
          {
            "from": 30,
            "to": 40
          },
          {
            "from": 40,
            "to": 50
          }
        ]
      }
    }
  }
}
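
Because each bucket includes `from` and excludes `to`, a document aged 30 lands in the 30 to 40 bucket, and documents aged 50 or 90 fall outside all three buckets. A local emulation makes this visible; it assumes the six bulk documents above are indexed once each, and `range_buckets` is a hypothetical helper that does not reproduce the actual response format.

```python
AGES = [20, 30, 30, 35, 50, 90]          # ages of the six bulk documents
RANGES = [(20, 30), (30, 40), (40, 50)]  # (from, to) pairs from the request


def range_buckets(values: list, ranges: list) -> dict:
    """Count values per bucket; `from` is inclusive, `to` is exclusive."""
    return {
        (low, high): sum(1 for value in values if low <= value < high)
        for low, high in ranges
    }


buckets = range_buckets(AGES, RANGES)
for (low, high), count in buckets.items():
    print(f"{low}-{high}: {count}")
```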

# terms aggregation: count the search results by the value of a field
GET twitter/_search
{
  "query": {
    "match": {
      "message": "happy birthday"
    }
  },
  "size": 5,
  "aggs": {
    "city": {
      "terms": {
        "field": "city",
        "size": 10
      }
    }
  }
}
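
A terms aggregation groups the documents matched by the query by the distinct values of one field. The following local emulation is only a sketch: it assumes the three bulk documents whose message contains "happy" or "birthday" are the matches, and `terms_buckets` is a hypothetical helper.

```python
from collections import Counter


def terms_buckets(docs: list, field: str, size: int = 10) -> list:
    """Count documents per distinct value of `field`, most frequent first."""
    return Counter(doc[field] for doc in docs).most_common(size)


# Documents from the bulk data whose message contains "happy" or "birthday".
MATCHED = [
    {"user": "东城区-李四", "city": "北京"},
    {"user": "朝阳区-老王", "city": "北京"},
    {"user": "虹桥-老吴", "city": "上海"},
]

for city, count in terms_buckets(MATCHED, "city"):
    print(city, count)
```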

Analyzers (the default is standard)

# The standard analyzer splits the text into two tokens and lowercases them
GET twitter/_analyze
{
  "text": ["Happy Birthday"],
  "analyzer": "standard"
}

# Splitting on a period
GET twitter/_analyze
{
  "text": ["Happy.Birthday"], // the standard analyzer does not split on a period between letters; the simple analyzer does
  "analyzer": "simple"
}

# Custom tokenizer and filter: the keyword tokenizer keeps the whole text as a single token
GET twitter/_analyze
{
  "text": ["Happy.Birthday"],
  "tokenizer": "keyword",
  "filter": ["lowercase"] // token filter applied to the tokenizer output: convert to lowercase
}
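
The last two requests can be approximated locally to show what tokens to expect. This is a rough stand-in, not the real analysis chain: the simple analyzer is emulated by splitting on anything that is not a letter and lowercasing, and the keyword tokenizer by returning the whole input as one token. Both function names are hypothetical.

```python
import re


def simple_analyzer(text: str) -> list:
    """Rough stand-in for `simple`: split on non-letters, then lowercase."""
    return [token.lower() for token in re.findall(r"[^\W\d_]+", text)]


def keyword_lowercase(text: str) -> list:
    """Keyword tokenizer (whole input as one token) plus a lowercase filter."""
    return [text.lower()]


print(simple_analyzer("Happy.Birthday"))
print(keyword_lowercase("Happy.Birthday"))
```

The `_analyze` API remains the reference for the tokens a real index produces.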