Elastic Search

I Elastic Search

1 Introduction

See the reference materials.

2 Installation

2.1 Step 1: Install

##1. Extract the archive
[root@qphone01 software]# tar -zxvf elasticsearch-6.5.3.tar.gz -C /opt/apps/

##2. Configure environment variables
[root@qphone01 elasticsearch-6.5.3]# vi /etc/profile
#environment
export JAVA_HOME=/opt/apps/jdk1.8.0_45
export HADOOP_HOME=/opt/apps/hadoop-2.6.0-cdh5.7.6
export SCALA_HOME=/opt/apps/scala-2.11.8
export SPARK_HOME=/opt/apps/spark-2.2.0
export HIVE_HOME=/opt/apps/hive-1.1.0-cdh5.7.6
export ZOOKEEPER_HOME=/opt/apps/zookeeper-3.4.5-cdh5.7.6
export KAFKA_HOME=/opt/apps/kafka-2.11
export FLUME_HOME=/opt/apps/flume-1.9.0
export REDIS_HOME=/opt/apps/redis-3.2.8
export REDIS_CONF=$REDIS_HOME/conf
export ELASTICSEARCH_HOME=/opt/apps/elasticsearch-6.5.3
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$SCALA_HOME/bin:$HIVE_HOME/bin:$REDIS_HOME/bin
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin:$ZOOKEEPER_HOME/bin:$KAFKA_HOME/bin:$FLUME_HOME/bin:$ELASTICSEARCH_HOME/bin

##3. Configure elasticsearch.yml

cluster.name: es-hzbigdata2002
node.name: qphone01
node.master: true
node.data: true
path.data: /opt/apps/elasticsearch-6.5.3/data
path.logs: /opt/apps/elasticsearch-6.5.3/logs
network.host: 0.0.0.0
discovery.zen.ping.unicast.hosts: ["qphone01", "qphone02", "qphone03"]

##4. Create a regular (non-root) user

[root@qphone01 config]# useradd qphone01
[root@qphone01 config]# passwd qphone01
Changing password for user qphone01.

##5. Grant sudo privileges
[root@qphone01 config]# vi /etc/sudoers

## Allow root to run any commands anywhere
root    ALL=(ALL)       ALL
qphone01 ALL=(ALL)       ALL

##6. Give the new user ownership of the whole directory
[root@qphone01 apps]# chown -R qphone01:qphone01 elasticsearch-6.5.3/

2.2 Step 2: Fix environment issues

[qphone01@qphone01 bin]$ sudo vi /etc/security/limits.conf
*       soft    nofile  65536
*       hard    nofile  131072
*       soft    nproc   2048
*       hard    nproc   4096

[qphone01@qphone01 bin]$ sudo vi /etc/security/limits.d/20-nproc.conf

*          soft    nproc     4096
root       soft    nproc     unlimited

[bigdata@qphone01 limits.d]$ sudo vi /etc/sysctl.conf

vm.max_map_count=262144
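tip:
Reboot after making these changes. (Without a reboot, the limits apply from the next login and `sudo sysctl -p` reloads /etc/sysctl.conf.)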
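
After the reboot it can help to confirm that the new values are really in effect before starting ES. The following is a minimal bash sketch, not part of the original setup: `check_min` is a hypothetical helper, and the thresholds (65536 open files, 4096 processes, 262144 map areas) are assumed to be the minimums the ES 6.x bootstrap checks ask for.

```shell
#!/usr/bin/env bash
# Sketch: compare the limits of the current user with the minimums assumed above.
# check_min is a hypothetical helper, not an Elasticsearch tool.
check_min() {
  # $1 = label, $2 = actual value, $3 = required minimum
  local label=$1 actual=$2 required=$3
  if [ "$actual" = "unlimited" ] || [ "$actual" -ge "$required" ]; then
    echo "OK   $label=$actual (>= $required)"
  else
    echo "LOW  $label=$actual (< $required)"
    return 1
  fi
}

# Run this as the user that will start Elasticsearch.
check_min nofile "$(ulimit -n)" 65536 || true
check_min nproc "$(ulimit -u)" 4096 || true
check_min vm.max_map_count "$(cat /proc/sys/vm/max_map_count 2>/dev/null || echo 0)" 262144 || true
```

A line starting with LOW points at the setting that still needs attention.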

2.3 Test

http://192.168.49.111:9200/

{
  "name" : "qphone01",
  "cluster_name" : "es-hzbigdata2002",
  "cluster_uuid" : "iUEJ5-BRRsieI0vd7Uooww",
  "version" : {
    "number" : "6.5.3",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "159a78a",
    "build_date" : "2018-12-06T20:11:28.826501Z",
    "build_snapshot" : false,
    "lucene_version" : "7.5.0",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}

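2.4 Install the head plugin (Google Chrome)

2.4.1 Download Google Chrome

2.4.2 Install the head plugin

3 Usage

3.1 Introduction to REST

See the reference materials.

3.2 curl

3.2.1 HTTP methods used for CRUD in ES

Method | Collection URI, e.g. http://example.com/res/ | Element URI, e.g. http://example.com/res/123
GET | List the URIs of the collection's members, optionally with details of each | Retrieve the details of the specified resource in a suitable media type (XML, JSON, etc.)
PUT | Replace the entire collection with the given one | Replace the specified resource, or create it and add it to the collection
POST | Create/append a new resource in the collection; usually returns the new resource's URL | Treat the specified resource as a collection and create/append a new element under it
DELETE | Delete the entire collection | Delete the specified element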

3.2.2 curl

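  • Special endpoints
URL | Description
/index/_search | Search the data in the given index
/_aliases | Get or manipulate index aliases
/index/ | Show details of the given index
/index/type/ | Create or operate on a type
/index/_mapping | Create or operate on the mapping
/index/_settings | Create or operate on settings (e.g. number_of_shards)
/index/_open | Open a closed index
/index/_close | Close the given index
/index/_refresh | Refresh the index (makes newly added content visible to search; does not guarantee that data is written to disk)
/index/_flush | Flush the index (triggers a Lucene commit)
  • Basic usage: three main options
-X  HTTP request method: HEAD, PUT, GET, POST, DELETE
-d  data to send in the request body
-H  request header
  • First example: create an index
curl -XPUT 'http://hbase1:9200/bigdata' ## send a PUT request to the ES cluster to create an index named bigdata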

3.3 CRUD operations in ES

3.3.1 put

 curl -H "Content-Type:application/json" -XPUT 'http://qphone01:9200/bigdata/emp/1' -d '{"name":"lixi", "age":34}'
{"_index":"bigdata","_type":"emp","_id":"1","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":0,"_primary_term":1}

curl -H "Content-Type:application/json" -XPUT 'http://qphone01:9200/bigdata/emp/3' -d '{"name":"蒼老師", "age":40}'
{"_index":"bigdata","_type":"emp","_id":"3","_version":2,"result":"updated","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":1,"_primary_term":1}

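tip:
1. An index can have only one type; index/type can be viewed as one table
2. The 1 is the doc id, the identifier of a document; in ES each record is a document
3. One index/type can hold many docs
4. The data in a doc must be JSON, and different docs need not have the same fields

"_index":"bigdata" : name of the index
"_type":"emp" : the type is emp; think of it as a table named emp inside the bigdata database
"_id":"1" : id of the doc
"_shards":{"total":2,"successful":2,"failed":0} : shard copies written; total 2 means one primary plus one replica

The default number of primary shards is 5 and the default replication factor is 1
Status | Description
green | all primary and replica shards are available
yellow | all primary shards are available, but not all replica shards
red | not all primary shards are available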
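
The three states can be read as a simple decision rule. The sketch below only illustrates that rule in bash; `health_color` is a hypothetical helper, and in practice the status comes from the cluster itself rather than from a script.

```shell
#!/usr/bin/env bash
# Sketch of the rule behind green / yellow / red.
# health_color is a hypothetical helper for illustration only.
health_color() {
  # $1 = number of unassigned primary shards
  # $2 = number of unassigned replica shards
  local unassigned_primaries=$1 unassigned_replicas=$2
  if [ "$unassigned_primaries" -gt 0 ]; then
    echo red       # some data cannot be served at all
  elif [ "$unassigned_replicas" -gt 0 ]; then
    echo yellow    # all data is served, but redundancy is missing
  else
    echo green
  fi
}

health_color 0 0   # prints green
health_color 0 5   # prints yellow
health_color 1 0   # prints red
```

The middle call matches a single-node setup with the defaults: the 5 replicas have no second node to live on, so the status stays yellow.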

3.3.2 POST: create or modify data in an index

curl -H "Content-Type:application/json" -XPOST 'http://qphone01:9200/bigdata/emp/1' -d '{"name":"程志遠", "age":18}'
{"_index":"bigdata","_type":"emp","_id":"1","_version":2,"result":"updated","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":1,"_primary_term":1}

curl -H "Content-Type:application/json" -XPOST 'http://qphone01:9200/bigdata/emp/4' -d '{"name":"李洪良", "age":22}'
{"_index":"bigdata","_type":"emp","_id":"4","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":1,"_primary_term":1}

tip:
Both PUT and POST can add data and modify data. POST is additionally used for other operations, such as the partial update in 3.3.4.

3.3.3 Get

##1. Fetch a single document by id
curl -H "Content-Type:application/json" -XGET 'http://qphone01:9200/bigdata/emp/1'
{"_index":"bigdata","_type":"emp","_id":"1","_version":2,"found":true,"_source":{"name":"程志遠", "age":18}}

##2. Fetch a document and pretty-print the JSON
curl -H "Content-Type:application/json" -XGET 'http://qphone01:9200/bigdata/emp/1?pretty'
{
  "_index" : "bigdata",
  "_type" : "emp",
  "_id" : "1",
  "_version" : 2,
  "found" : true,
  "_source" : {
    "name" : "程志遠",
    "age" : 18
  }
}


##3. Fetch all documents
curl -H "Content-Type:application/json" -XGET 'http://qphone01:9200/bigdata/_search?pretty'
{
  "took" : 7,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 4,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "bigdata",
        "_type" : "emp",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "name" : "rock",
          "age" : 35
        }
      },
      {
        "_index" : "bigdata",
        "_type" : "emp",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "name" : "李洪良",
          "age" : 22
        }
      },
      {
        "_index" : "bigdata",
        "_type" : "emp",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "name" : "程志遠",
          "age" : 18
        }
      },
      {
        "_index" : "bigdata",
        "_type" : "emp",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "name" : "蒼老師",
          "age" : 40
        }
      }
    ]
  }
}

##4. Query with a condition
curl -XGET 'http://qphone01:9200/bigdata/_search?q=name:rock&pretty'
{
  "took" : 6,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.87138504,
    "hits" : [
      {
        "_index" : "bigdata",
        "_type" : "emp",
        "_id" : "2",
        "_score" : 0.87138504,
        "_source" : {
          "name" : "rock",
          "age" : 35
        }
      }
    ]
  }
}

##5. Query with a condition, returning only the name field
curl -XGET 'http://qphone01:9200/bigdata/_search?q=name:rock&_source=name&pretty'
{
  "took" : 7,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.87138504,
    "hits" : [
      {
        "_index" : "bigdata",
        "_type" : "emp",
        "_id" : "2",
        "_score" : 0.87138504,
        "_source" : {
          "name" : "rock"
        }
      }
    ]
  }
}

##6. Pagination
curl -XGET 'http://qphone01:9200/bigdata/_search?from=0&size=2&pretty'
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 4,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "bigdata",
        "_type" : "emp",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "name" : "rock",
          "age" : 35
        }
      },
      {
        "_index" : "bigdata",
        "_type" : "emp",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "name" : "李洪良",
          "age" : 22
        }
      }
    ]
  }
}
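
The `from` parameter is an offset into the result list, not a page number. A small bash sketch of the conversion follows; `page_to_query` is a hypothetical helper, and the commented curl call assumes the cluster from this article is reachable.

```shell
#!/usr/bin/env bash
# Sketch: turn a 1-based page number and a page size into from/size parameters.
# page_to_query is a hypothetical helper for illustration only.
page_to_query() {
  # $1 = page number (1-based), $2 = page size
  local page=$1 size=$2
  local from=$(( (page - 1) * size ))
  echo "from=${from}&size=${size}"
}

page_to_query 1 2   # prints from=0&size=2 (the request shown above)
page_to_query 2 2   # prints from=2&size=2

# Usage against the cluster from this article:
# curl -XGET "http://qphone01:9200/bigdata/_search?$(page_to_query 2 2)&pretty"
```

Deep pages get expensive, and by default ES rejects requests where from + size exceeds 10000 (the index.max_result_window setting).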

3.3.4 POST: partial update

curl -H "Content-Type:application/json" -XPOST 'http://qphone01:9200/bigdata/emp/4/_update?pretty' -d '{"doc": {"name":"lixi"}}'

{
  "_index" : "bigdata",
  "_type" : "emp",
  "_id" : "4",
  "_version" : 2,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 2,
  "_primary_term" : 1
}

3.3.5 Delete

##1. Delete a document by id
curl -H "Content-Type:application/json" -XDELETE 'http://qphone01:9200/bigdata/emp/3?pretty'
##2. Delete the whole index
curl -H "Content-Type:application/json" -XDELETE 'http://qphone01:9200/bigdata?pretty'

3.3.6 batch

  • Bulk insert
curl -H 'Content-Type:application/json' -i -XPUT 'http://qphone01:9200/qphone/student/_bulk?pretty' \
-d '
{"index":{"_id":"3"}}
{"name":"李洪浪", "sex":"男", "age":32}
{"index":{"_id":"4"}}
{"name":"李洪風", "sex":"男", "age":45}
{"index":{"_id":"5"}}
{"name":"李洪云", "sex":"男", "age":67}
{"index":{"_id":"6"}}
{"name":"李洪雨", "sex":"男", "age":8}
{"index":{"_id":"7"}}
{"name":"李洪雷", "sex":"男", "age":56}
{"index":{"_id":"8"}}
{"name":"李洪火", "sex":"男", "age":15}
'
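
The bulk body is newline-delimited JSON: one action line, then one source line, and every line (including the last) ends with a newline. For larger batches it is usually easier to build the body in a file and send it with `--data-binary`. The bash sketch below assumes that approach; `bulk_lines`, the ids 9 and 10, and the demo documents are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: build a _bulk body from "id<TAB>json" records read on stdin.
# bulk_lines and the sample records are hypothetical.
bulk_lines() {
  local id doc
  while IFS=$'\t' read -r id doc; do
    # action line followed by the document source, each on its own line
    printf '{"index":{"_id":"%s"}}\n%s\n' "$id" "$doc"
  done
}

body=$(mktemp)
printf '%s\t%s\n' \
  9  '{"name":"demo-a", "age":20}' \
  10 '{"name":"demo-b", "age":21}' | bulk_lines > "$body"
cat "$body"

# Send it (assumes the cluster from this article is reachable):
# curl -H 'Content-Type: application/x-ndjson' -XPOST \
#   'http://qphone01:9200/qphone/student/_bulk?pretty' --data-binary "@$body"
```

`--data-binary` sends the file unchanged, whereas `-d @file` strips the newlines that the bulk format depends on.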

4 Kibana, a companion tool for ES

4.1 Installation

##1. Extract the archive
[root@qphone01 software]# tar -zxvf kibana-6.5.3-linux-x86_64.tar.gz -C /opt/apps/
##2. Environment variables
export KIBANA_HOME=/opt/apps/kibana-6.5.3
export PATH=$PATH:$KIBANA_HOME/bin
##3. kibana.yml

server.port: 5601
server.host: "192.168.49.111"
server.name: "qphone01"
elasticsearch.url: "http://qphone01:9200"

##4. Start Kibana
nohup kibana serve > /dev/null 2>&1 &

4.2 Test result

001.png

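II ES Concepts

1 General concepts

1.1 Index and index "database"

    An index is the logical storage for logical data in Elasticsearch, so it can be split into smaller parts. You can loosely think of it as the primary-key index over a table's data in an RDBMS.
    The index "database" can be thought of as a DATABASE in an RDBMS. ES can keep an index on one machine or spread it across several servers; each index has one or more shards, and each shard can have multiple replicas.

1.2 Document

    The main entity stored in Elasticsearch is the document. Compared with an RDBMS, a document corresponds to one row of a table.
    A doc is the basic unit of information that can be indexed. Documents are expressed as JSON and stored under an index/type.

1.2.1 Creating a document

    Documents are indexed through the index API, which makes the data storable and searchable. First decide where the document lives: the combination index/type/id identifies it uniquely.
Syntax:
    PUT {index}/{type}/{id} -d '{"":""}'
    POST {index}/{type} -d '{"":""}'    (no id given; ES generates one)
e.g.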

[root@hbase1 kibana-6.5.3]# curl -H 'Content-Type:application/json' -XPUT 'http://hbase1:9200/bigdata/emp/4' -d '{"name":"wyl", "age":18}'

1.2.2 Retrieving a document

1. Basic query
Use the same index/type/id, but with the GET method, to retrieve the document
e.g.
curl -H 'Content-Type:application/json' -XGET 'http://hbase1:9200/bigdata/emp/4?pretty'
{
  "_index" : "bigdata",
  "_type" : "emp",
  "_id" : "4",
  "_version" : 1,
  "found" : true,
  "_source" : {
    "name" : "wyl",
    "age" : 18
  }
}

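tip:
pretty : adding the pretty parameter to any query string makes ES pretty-print the output, so the JSON response is easier to read.
_source : holds the data of the document as it was submitted.
"found" : true : the document was found. Requesting a document that does not exist still returns JSON, with found set to false.


2. Query that also shows the HTTP response status (curl -i)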
curl -H 'Content-Type:application/json' -i -XGET 'http://hbase1:9200/bigdata/emp/4?pretty'

[root@hbase1 kibana-6.5.3]# curl -H 'Content-Type:application/json' -i -XGET 'http://hbase1:9200/bigdata/emp/4?pretty'
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 153

{
  "_index" : "bigdata",
  "_type" : "emp",
  "_id" : "4",
  "_version" : 1,
  "found" : true,
  "_source" : {
    "name" : "wyl",
    "age" : 18
  }
}

3. Retrieve only part of a document
[root@hbase1 kibana-6.5.3]# curl -H 'Content-Type:application/json' -i -XGET 'http://hbase1:9200/bigdata/emp/4?_source=name&pretty'
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 137

{
  "_index" : "bigdata",
  "_type" : "emp",
  "_id" : "4",
  "_version" : 1,
  "found" : true,
  "_source" : {
    "name" : "wyl"
  }
}

1.2.3 Updating a document

//1. Full update: the document is overwritten and the version increases by 1
[root@hbase1 kibana-6.5.3]# curl -H 'Content-Type:application/json' -i -XPOST 'http://hbase1:9200/bigdata/emp/4?pretty' \
> -d '{"name":"yl", "age":27}'
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 220

{
  "_index" : "bigdata",
  "_type" : "emp",
  "_id" : "4",
  "_version" : 2,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 5,
  "_primary_term" : 2
}

//2. Partial update
[root@hbase1 kibana-6.5.3]# curl -H 'Content-Type:application/json' -i -XPOST 'http://hbase1:9200/bigdata/emp/4/_update?pretty' \
> -d '{"doc":{"name":"wyl"}}'
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 220

{
  "_index" : "bigdata",
  "_type" : "emp",
  "_id" : "4",
  "_version" : 3,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 6,
  "_primary_term" : 2
}

1.2.4 Deleting a document

[root@hbase1 kibana-6.5.3]# curl -H 'Content-Type:application/json' -i -XDELETE 'http://hbase1:9200/bigdata/emp/4?pretty'
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 220

{
  "_index" : "bigdata",
  "_type" : "emp",
  "_id" : "4",
  "_version" : 4,
  "result" : "deleted",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 7,
  "_primary_term" : 2
}

1.2.5 Bulk insert

[root@hbase1 kibana-6.5.3]# curl -H 'Content-Type:application/json' -i -XPOST 'http://hbase1:9200/blog/emp/_bulk?pretty' \
> -d '
> {"index":{"_id":"1"}}
> {"name":"James", "sex":"man", "salary":50000000}
> {"index":{"_id":"2"}}
> {"name":"Kobe", "sex":"man", "salary":60000000}
> '

1.2.6 Retrieving multiple documents

curl -H 'Content-Type:application/json' -i -XGET 'http://qphone01:9200/_mget?pretty' \
-d '{
"docs":[
{
"_index":"qphone",
"_type":"student",
"_id":1,
"_source":"name"
},
{
"_index":"qphone",
"_type":"student",
"_id":2
}
]
}'
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 437

{
  "docs" : [
    {
      "_index" : "qphone",
      "_type" : "student",
      "_id" : "1",
      "_version" : 1,
      "found" : true,
      "_source" : {
        "name" : "程志遠"
      }
    },
    {
      "_index" : "qphone",
      "_type" : "student",
      "_id" : "2",
      "_version" : 1,
      "found" : true,
      "_source" : {
        "name" : "李洪良",
        "sex" : "男",
        "age" : 19
      }
    }
  ]
}

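1.3 Type

Document type
    In ES, one index can store objects serving many different purposes; for example, a blog can hold both articles and comments. Document types let us easily tell apart the different kinds of object in a single index. Each document can have a different structure, and in real deployments separating documents by type helps a lot operationally. There is one restriction: different document types cannot define different data types for the same property. For example, a field called title must have the same type in every document type of the same index.
    From ES 6 on, an index can have only one type, which is why the second request below is rejected.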
    
curl -H "Content-Type:application/json" -XPOST 'http://qphone01:9200/blog/article/1' -d '{"title":"lijieweishenmzhemshuai", "content":"yinweitabenlaijiuhenshuai"}'

curl -H "Content-Type:application/json" -XPOST 'http://qphone01:9200/blog/comment/1' -d '{"title":"lijieweishenmzhemshuai", "content":"yinweitayongpiaorou", "user":"wangyushan"}'

{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Rejecting mapping update to [blog] as the final mapping would have more than 1 type: [comment, article]"}],"type":"illegal_argument_exception","reason":"Rejecting mapping update to [blog] as the final mapping would have more than 1 type: [comment, article]"},"status":400}

1.4 Field (data types)

1.4.1 Basic data types

String: text, keyword
Numeric: long, integer, short, byte, double, float, half_float, scaled_float
Date: date
Boolean: boolean
Binary: binary
Range: integer_range, float_range, long_range, double_range, date_range

1.4.2 Complex data types

Array: array
Object: object
Nested: nested object

1.4.3 Geo data types

geo_point (point), geo_shape (shape)

1.4.4 Specialised types

IP address: ip
Auto-completion: completion
Token count: token_count

1.4.5 Specifying field types manually with a mapping

##1. Run the following command, then look at the resulting index information
curl -H "Content-Type:application/json" -XPOST 'http://qphone01:9200/blog/article/2' -d '{"title":"qphoneshizuihaodepeixunjigou", "content":"bigdatashizuihaodexuek", "author":"lixi", "dt":"2020-09-04"}'

##2. Look at the index information:
002.png

003.png
ES converts the fields of the JSON into the corresponding ES field types; this conversion happens automatically.

##3. Specify the type manually
curl -XPUT -H "Content-Type:application/json" 'http://qphone01:9200/spark?pretty' -d \
'{
"mappings":{
"sparkcore":{
"properties":{
"scala":{
"type":"double"
}
}
}
}
}'


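tip:
##1) Field types can be specified manually, but this has to be done on a newly created index (the type of an existing field cannot be changed)
##2) They must be specified through mappings
##3) Fields that are not listed are still mapped automatically
##4) A custom field definition is only a declaration; documents may or may not use it. In production, however, the defined fields are a convention, and adding fields without approval is generally not allowed.

##4. With the following command the date ends up as text rather than date, because no format was given for recognising it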
curl -H "Content-Type:application/json" -XPOST 'http://qphone01:9200/blog/article/5' -d '{"title":"qphoneshizuihaodepeixunjigou", "content":"bigdatashizuihaodexuek", "author":"rock", "dt2":"20200904"}'

##5. Add a date detection format
curl -XPUT -H "Content-Type:application/json" 'http://qphone01:9200/blog2?pretty' -d \
'{
"mappings":{
"article":{
"dynamic_date_formats":["yyyyMMdd"]
}
}}'

curl -H "Content-Type:application/json" -XPOST 'http://qphone01:9200/blog2/article/1?pretty' -d '{"title":"qphoneshizuihaodepeixunjigou", "content":"bigdatashizuihaodexuek", "author":"rock", "dt":"20200904"}'

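##6. Disable automatic date detection (blog2 was already created in step 5, so delete it first or use a new index name)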
curl -XPUT -H "Content-Type:application/json" 'http://qphone01:9200/blog2?pretty' -d \
'{"mappings":{
"article":{
"date_detection":false
}
}}'


##7. Enable detection of all-digit strings as long
curl -XPUT -H "Content-Type:application/json" 'http://qphone01:9200/blog3?pretty' -d \
'{
"mappings":{
"article":{
"numeric_detection":true
}
}}'

curl -H "Content-Type:application/json" -XPOST 'http://qphone01:9200/blog3/article/1?pretty' -d '{"title":"qphoneshizuihaodepeixunjigou", "content":"bigdatashizuihaodexuek", "author":"rock", "dt":"20200904", "num":"111"}'

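1.5 Core concepts

1.5.1 Cluster

    An ES cluster consists of multiple nodes, one of which is the master, chosen by election. Master and non-master roles only matter inside the cluster. ES is described as decentralised: seen from outside there is no master, the cluster is logically one unit, and talking to any node is equivalent to talking to the whole cluster.
    The master node manages the state of the whole cluster, including the state of shards and replicas, the discovery of new nodes, and the removal of nodes.
    Before ES 2.0, ES nodes started in the same network segment discovered each other and formed a cluster automatically; from ES 2.0 on this no longer happens.
    Checking the cluster health: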
[root@hbase1 config]# curl -XGET -H "Content-Type:application/json" 'http://hbase1:9200/_cluster/health?pretty'
{
  "cluster_name" : "bigdata-etc",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 6,
  "active_shards" : 12,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

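1.5.2 Shards

    The number of shards can be set when creating an index; a shard is comparable to a partition in an RDD or in Kafka.
For example:
curl -H 'Content-Type: application/json' -XPUT 'ip:port/index' -d '{"settings":{"number_of_shards":3}}'
    By default every index has 5 shards.
    Note that once an index has been created, its number of shards cannot be changed.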
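
The reason the shard count is fixed comes from routing: ES picks the shard as hash(routing value) modulo the number of primary shards, where the routing value defaults to the document id. The bash sketch below illustrates the effect. It is an assumption-laden toy: `shard_for` is a hypothetical helper and `cksum` stands in for the hash function ES really uses, so the numbers will not match a real cluster.

```shell
#!/usr/bin/env bash
# Sketch: why changing the number of primary shards would misplace documents.
# shard_for is hypothetical; cksum is only a stand-in for the real hash.
shard_for() {
  # $1 = routing value (the document id by default), $2 = number of primary shards
  local routing=$1 shards=$2
  local hash
  hash=$(printf '%s' "$routing" | cksum | cut -d ' ' -f 1)
  echo $(( hash % shards ))
}

for id in 1 2 3 4; do
  echo "id=$id  with 5 shards -> $(shard_for "$id" 5)   with 3 shards -> $(shard_for "$id" 3)"
done
```

For most ids the two columns differ, so a document written under one shard count would be looked up on the wrong shard under another.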

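1.5.3 Replicas

    The replicas of an index. Replicas provide fault tolerance: when a node goes down, the data can be recovered from a replica.
For example:
curl -H 'Content-Type: application/json' -XPUT 'ip:port/index' -d '{"settings":{"number_of_replicas":3}}'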

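1.5.4 Data redistribution (recovery)

    Recovery, also called data redistribution. When a node joins or leaves, ES reallocates index shards according to machine load; a failed node also recovers its data when it restarts.

1.5.5 Data persistence (gateway)

    The gateway is how ES persists data. By default ES first keeps the index in memory and writes it to disk when memory fills up. When the cluster is shut down and restarted, the index data is read back from the gateway.
    ES supports several gateway types: the local file system (default) and distributed file systems such as HDFS, Amazon, etc.

1.5.6 Automatic discovery

    The mechanism by which ES discovers nodes. ES is a p2p-based system: it first looks for existing nodes by broadcast, then nodes communicate over multicast, and point-to-point (unicast) interaction is supported as well.
    Enabling or disabling automatic (multicast) discovery (a pre-2.0 setting; see the note in 1.5.1):
    discovery.zen.ping.multicast.enabled : true/false
    List of hosts a newly started node tries to contact: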
    discovery.zen.ping.unicast.hosts: ["hbase1", "hbase2", "hbase3"]
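
With three master-eligible nodes in the unicast list, ES 6.x setups usually also set discovery.zen.minimum_master_nodes to a majority, so that a network split cannot produce two masters. That setting does not appear in the configuration shown earlier, so treat the bash sketch below as an optional addition; `quorum` is a hypothetical helper that applies the usual (N / 2) + 1 rule.

```shell
#!/usr/bin/env bash
# Sketch: majority of master-eligible nodes, (N / 2) + 1 with integer division.
# quorum is a hypothetical helper for illustration only.
quorum() {
  local master_eligible=$1
  echo $(( master_eligible / 2 + 1 ))
}

# Three master-eligible nodes, as in the host list above
echo "discovery.zen.minimum_master_nodes: $(quorum 3)"
# prints: discovery.zen.minimum_master_nodes: 2
```

The printed line is in the form that would go into elasticsearch.yml on every master-eligible node.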

III Java API

1 Add dependencies

<dependencies>
    <!-- es -->
    <dependency>
        <groupId>org.elasticsearch.client</groupId>
        <artifactId>transport</artifactId>
        <version>6.5.3</version>
    </dependency>

    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <version>1.18.8</version>
    </dependency>

    <!-- json -->
    <dependency>
        <groupId>com.alibaba</groupId>
        <artifactId>fastjson</artifactId>
        <version>1.2.71</version>
    </dependency>
</dependencies>

2 Getting started

2.1 elasticSearchUtils

package cn.qphone.es.api;

import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.TransportAddress;
import org.elasticsearch.transport.client.PreBuiltTransportClient;

import java.net.InetAddress;

public class ElasticSearchUtils {
    private static TransportClient client;

    static {
        try {
            //1. Settings object
            Settings settings = Settings.builder()
                    .put("cluster.name", "es-hzbigdata2002")
                    .build();
            //2. Transport client
            client = new PreBuiltTransportClient(settings);
            //3. Addresses of the ES cluster nodes
            TransportAddress[] trans = {
                    new TransportAddress(InetAddress.getByName("qphone01"), 9300),
                    new TransportAddress(InetAddress.getByName("qphone02"), 9300),
                    new TransportAddress(InetAddress.getByName("qphone03"), 9300)
            };
            //4. Connect to the ES servers
            client.addTransportAddresses(trans);
        }catch (Exception e) {
            e.printStackTrace();
        }
    }

    /**
     * Returns the client object connected to ES
     */
    public static TransportClient getClient()  {
        return client;
    }
}

2.2 quickstart

package cn.qphone.es.api;

import org.elasticsearch.action.get.GetResponse;
import org.elasticsearch.client.transport.TransportClient;

import java.net.UnknownHostException;
import java.util.Map;

public class Demo1_quickstart {
    public static void main(String[] args) throws UnknownHostException {
        //1. Get the core client for working with ES
        TransportClient client = ElasticSearchUtils.getClient();

        //2. Work with ES
        //2.1 Index a document (the index is created if it does not exist yet)
        // curl -XPOST -H 'Content-Type: application/json' 'xxxxxx/index/type' -d '{"name":"lixi"}'
//        String json = "{\"name\":\"wyl\", \"age\":18}";
//        IndexResponse response = client.prepareIndex("hadoop", "hdfs")
//                .setSource(json, XContentType.JSON)
//                .get();
//        System.out.println("create json version:" + response.getVersion());
//        System.out.println(response.getIndex());
//        System.out.println(response.getType());

        //2.2 Delete a document
//        DeleteResponse deleteResponse = client.prepareDelete("hadoop", "hdfs", "2")
//                .get();
//        System.out.println(deleteResponse.getIndex() + "/" + deleteResponse.getType() + "/" + deleteResponse.getId());

        //2.3 get
        GetResponse getResponse = client.prepareGet("hadoop", "hdfs", "ZZFVZnQBTuYsqQgZhqPE")
                .get();
        Map<String, Object> parm = getResponse.getSourceAsMap();
        System.out.println(parm.get("name"));
        System.out.println(parm.get("age"));
    }
}

IV Chinese word segmentation

1 Testing the default ES analyzer

curl -H 'Content-Type: application/json' -XGET 'http://qphone01:9200/_analyze?&pretty' -d '{
"text":"i am a big big boy"
}'

curl -H 'Content-Type: application/json' -XGET 'http://qphone01:9200/_analyze?&pretty' -d '{
"text":"這里是好記性不如爛筆頭感嘆號的博客們"
}'

2 Chinese analyzer: the IK analyzer

2.1 Installation

##1. Install an unzip tool
yum -y install unzip

##2. Upload the IK analyzer
##3. Move the IK analyzer into the ES plugins directory
mkdir -p /opt/apps/elasticsearch-6.5.3/plugins/ik && mv /opt/software/elasticsearch-analysis-ik-6.5.3.zip /opt/apps/elasticsearch-6.5.3/plugins/ik && cd /opt/apps/elasticsearch-6.5.3/plugins/ik

##4. Unzip
unzip elasticsearch-analysis-ik-6.5.3.zip && rm -f elasticsearch-analysis-ik-6.5.3.zip

##5. Distribute to the other nodes
scp -r ik qphone02:/opt/apps/elasticsearch-6.5.3/plugins/ && scp -r ik qphone03:/opt/apps/elasticsearch-6.5.3/plugins/

##6. Restart the ES cluster

2.2 Test

curl -H 'Content-Type: application/json' -XGET 'http://qphone01:9200/_analyze?&pretty' -d \
'{
"analyzer":"ik_max_word",
"text":"這里是好記性不如爛筆頭感嘆號的博客們"
}'

curl -H 'Content-Type: application/json' -XGET 'http://qphone01:9200/_analyze?&pretty' -d \
'{
"analyzer":"ik_max_word",
"text":"i am a big big girl"
}'

##2. Create the chinese index and specify its analyzer
curl -H 'Content-Type: application/json' -XPUT 'http://qphone01:9200/chinese?pretty' -d \
'
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1,
    "analysis": {
      "analyzer": {
        "ik": {
          "tokenizer": "ik_max_word"
        }
      }
    }
  },
  "mappings": {
    "test1":{
      "properties": {
        "content": {
          "type": "text",
          "analyzer": "ik_max_word",
          "search_analyzer": "ik_max_word"
        }
      }
    }
  }
}'

##3. Load data into chinese
curl -H 'Content-Type: application/json' -XPUT 'http://qphone01:9200/chinese/test1/1?pretty' -d \
'
{
  "content": "里皮是一位牌面足夠大、支持率足夠高的教練"
}
'

curl -H 'Content-Type: application/json' -XPUT 'http://qphone01:9200/chinese/test1/2?pretty' -d \
'
{
  "content": "他不僅在意大利國家隊取得過成功"
}
'

curl -H 'Content-Type: application/json' -XPUT 'http://qphone01:9200/chinese/test1/3?pretty' -d \
'
{
  "content": "教練還帶領(lǐng)廣州恒大稱霸中超并首次奪得亞冠聯(lián)賽冠軍"
}
'

##4. Search chinese for the keyword 教練 (coach)
curl -H 'Content-Type: application/json' -XGET 'http://qphone01:9200/chinese/_search?pretty' -d \
'
{
  "query": {
    "match": {
      "content": "教練"
    }
  }
}
'

3 Full-text search with the Java API

package cn.qphone.es.api;

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.search.SearchType;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;

public class Demo2_Search {

    private static final String  INDEX = "chinese";

    public static void main(String[] args) {
        //1. Get the core client
        TransportClient client = ElasticSearchUtils.getClient();

        //2. Query via _search
        /*
         * matchAll --> select * from t
         * matchQuery --> select * from t where name like "%baby%"
         * termQuery --> select * from t where name = baby
         */
        SearchResponse response = client.prepareSearch(INDEX)
                .setSearchType(SearchType.QUERY_THEN_FETCH)
                .setQuery(QueryBuilders.matchQuery("content", "意大利"))
                .get();

        //3. Read the search results
        SearchHits hits = response.getHits();
        long totalHits = hits.totalHits; // total number of hits
        float maxScore = hits.getMaxScore(); // highest score
        System.out.println("total hits: " + totalHits);
        System.out.println("max score : " + maxScore);
        SearchHit[] searchHits = hits.getHits(); // the individual hits
        for (SearchHit hit : searchHits) {
            System.out.println("index : " + hit.getIndex());
            System.out.println("current score: " + hit.getScore());
            System.out.println("content : " + hit.getSourceAsString());
        }

    }
}
