Search Engine ElasticSearch (1): Architecture Overview and Basic Service Setup

1. ElasticSearch Basic Terms and Concepts

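1.1. Term

A term in ElasticSearch is an exact value that can be indexed. It may be the whole value of a document field, or a single token produced when an analyzer splits that field. Terms can be searched exactly with a term query.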
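To make the term query concrete, here is a minimal sketch of its request body built as a Python dict. The field name "status" and the value "published" are hypothetical; the body would be sent to the _search endpoint of an index.

```python
import json


def term_query(field, value):
    """Build the body of a term query: an exact match, with no analysis of the input."""
    return {"query": {"term": {field: {"value": value}}}}


# "status" and "published" are made-up names used only for illustration.
body = term_query("status", "published")
print(json.dumps(body, indent=2))
```

Because the search value is not analyzed, a term query is usually a better fit for exact-value fields than for analyzed free text.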

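1.2. Text

Text is ordinary unstructured content. An analyzer breaks it into terms, and those terms are used to build the inverted index. At search time the query string is broken into terms in the same way, and the terms are matched exactly against the inverted index to retrieve the relevant documents.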
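The idea of an inverted index can be sketched in a few lines of Python. This is a toy model, not how Lucene stores its data: the tokenizer only lowercases and splits on whitespace, the sample documents are invented, and the search requires every query term to be present (an assumption made here for simplicity).

```python
from collections import defaultdict


def tokenize(text):
    """Toy analyzer: lowercase the text and split it on whitespace."""
    return text.lower().split()


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every term of the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Invented sample documents.
docs = {
    1: "Elasticsearch is a search engine",
    2: "Lucene is a search library",
    3: "Kibana draws charts",
}
index = build_inverted_index(docs)
print(sorted(search(index, "search engine")))
```

The query is tokenized with the same function as the documents, which mirrors the point above: both sides must be reduced to the same terms for the exact match to succeed.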

1.3. Analysis

Analysis is the process of converting text into terms; different analyzers apply different steps. Typical steps include lowercasing, stripping punctuation, and removing stop words.
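The three steps named above can be sketched as a small Python function. It is only an illustration of the idea, not one of ElasticSearch's built-in analyzers, and the stop-word list is an arbitrary short sample.

```python
import string

# Arbitrary, illustrative stop-word list.
STOP_WORDS = {"a", "an", "the", "is", "of", "and"}


def analyze(text):
    """Lowercase, strip ASCII punctuation, split on whitespace, drop stop words."""
    lowered = text.lower()
    cleaned = lowered.translate(str.maketrans("", "", string.punctuation))
    return [token for token in cleaned.split() if token not in STOP_WORDS]


print(analyze("The Quick, Brown Fox is FAST!"))
```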

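1.4. Cluster

An ElasticSearch cluster consists of multiple distributed nodes that share the same cluster name. Nodes discover each other and join the cluster through unicast, using a configured list of seed hosts. Multicast discovery existed only in early releases and is not available in the 7.x series installed later in this article.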

1.5. Node

A node is a physically independent service: one node is one ElasticSearch process. A node stores data and takes part in the cluster's indexing and search operations. Every node keeps a complete copy of the cluster state and configuration, and each node in a cluster must have a unique name. If no node name is configured, one is assigned automatically at startup.

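1.6. Shard

A shard is a single Lucene index instance, and an index is a logical collection of one or more shards. The number of primary shards and the number of replicas can be specified when an index is created. Once the index exists, the number of primary shards cannot be changed, but the number of replicas can. ElasticSearch automatically rebalances shards across nodes to achieve load balancing and high availability.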
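As a rough illustration of the two settings involved, the sketch below builds the request for creating an index with a given number of primary shards and replicas, and the request for changing the replica count afterwards. The index name "articles" is hypothetical, and the functions only build the requests; sending them with an HTTP client is left out.

```python
import json


def create_index_request(index, primaries, replicas):
    """Request that creates an index with a fixed primary count and a replica count."""
    body = {"settings": {"number_of_shards": primaries, "number_of_replicas": replicas}}
    return "PUT", f"/{index}", body


def update_replicas_request(index, replicas):
    """Request that changes the replica count of an existing index."""
    body = {"index": {"number_of_replicas": replicas}}
    return "PUT", f"/{index}/_settings", body


# "articles" is a made-up index name.
for method, path, body in (
    create_index_request("articles", 3, 1),
    update_replicas_request("articles", 2),
):
    print(method, path, json.dumps(body))
```

Only the replica count appears in the second request, which matches the rule above: replicas can be adjusted on a live index, the primary count cannot.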

1.7. Primary shard

Each document in an index is stored in exactly one primary shard and in that shard's replicas.

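1.8. Replica shard

Each primary shard can have zero or more replica shards, each holding a copy of the primary shard's data. Replica shards serve two purposes: (a) higher availability: when a primary shard fails, one of its replicas is automatically promoted to primary, which keeps the system available and stable; (b) better performance: a search can be served by either the primary or a replica, so additional replicas increase concurrent search throughput.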

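1.9. Routing

When a document is stored or retrieved, routing determines the single primary shard it belongs to. The shard is computed by hashing a routing value and taking the result modulo the number of primary shards. By default the routing value is the document id, which is either generated by ElasticSearch or supplied by the user.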
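The routing rule above can be sketched as follows. ElasticSearch uses its own hash function internally; CRC32 from the standard library is used here only as a stand-in, and the document ids are invented.

```python
import zlib


def route_to_shard(routing_value, num_primary_shards):
    """Pick a shard number: hash(routing value) mod number of primary shards."""
    if num_primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    # CRC32 is a stand-in hash for illustration, not the one ElasticSearch uses.
    digest = zlib.crc32(str(routing_value).encode("utf-8"))
    return digest % num_primary_shards


# Hypothetical document ids routed across 5 primary shards.
doc_ids = ["user-1", "user-2", "user-3", "user-4"]
placement = {doc_id: route_to_shard(doc_id, 5) for doc_id in doc_ids}
print(placement)
```

Since the divisor is the number of primary shards, changing that number would send existing ids to different shards, which is one way to see why the primary count is fixed when the index is created.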

1.10. Replication

When a node in the cluster fails, replication allows failover so that the system stays highly available. Replication takes place between a primary shard and its replica shards.

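1.11. Index

An index is a collection of documents with a similar structure. It is a logical concept: in earlier versions an index could contain multiple types, while in the 7.x series types are deprecated and an index effectively holds a single type, _doc. Physically, an index consists of one or more shards; the index data is spread across the shards, and the shards are stored on multiple nodes.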

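1.12. Type

A type is a logical partition of an index, and earlier versions allowed several types to be defined in one index. Types are deprecated in the 7.x series.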

1.13. Document

A document is a JSON string stored in ElasticSearch, comparable to a row in a relational database. Every document has a unique id. Within an ElasticSearch cluster, index + type + document id together identify exactly one document.
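The index + type + id triple maps directly onto the REST path of a document. The helper below is a small sketch of that mapping; the function name and the sample index "articles" are hypothetical, and the type defaults to _doc, the single type used by 7.x.

```python
from urllib.parse import quote


def document_path(index, doc_id, doc_type="_doc"):
    """Build the REST path that addresses one document: /index/type/id."""
    parts = (index, doc_type, doc_id)
    # Percent-encode each part so that an id containing "/" stays one path segment.
    return "/" + "/".join(quote(str(part), safe="") for part in parts)


print(document_path("articles", 42))
```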

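1.14. Mapping

A mapping is similar to a table schema in a relational database: it defines the types in an index, the structure of the documents, and the data type of each field. Shard configuration belongs to the index settings rather than to the mapping. A mapping can be defined when the index is created, or it can be inferred automatically by ElasticSearch the first time a document is stored.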
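A minimal sketch of an explicit mapping, written in the typeless form used by 7.x, is shown below. The index name and the three fields are invented for illustration; the function only builds the request and does not send it.

```python
import json


def create_index_with_mapping(index):
    """Request that creates an index with an explicit mapping (7.x typeless form)."""
    body = {
        "mappings": {
            "properties": {
                "title": {"type": "text"},         # analyzed full text
                "tag": {"type": "keyword"},        # exact value, not analyzed
                "published_at": {"type": "date"},
            }
        }
    }
    return "PUT", f"/{index}", body


# "articles" and its fields are made-up names.
method, path, body = create_index_with_mapping("articles")
print(method, path)
print(json.dumps(body, indent=2))
```

The text and keyword field types line up with sections 1.1 and 1.2: a text field is analyzed into terms, while a keyword field is indexed as a single exact term.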

1.15. Field

A document contains zero or more fields. A field can hold a simple type such as a number or a date, or it can be an array or a nested object.

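1.16. Source field

When ElasticSearch stores a document, it analyzes certain fields and builds the inverted index from them. The source field (_source) keeps the original JSON of the document so that it can be returned with search results. The source field itself is stored but not indexed, so it cannot be searched directly.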
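The sketch below shows where the original document appears in a search response: each hit carries it under the _source key. The sample response is hand-written to mimic the general shape of a response and is not real server output; the index name and field values are invented.

```python
def extract_sources(response):
    """Return (id, original document) pairs from a search response."""
    return [(hit["_id"], hit["_source"]) for hit in response["hits"]["hits"]]


# Hand-written sample shaped like a search response; not real server output.
sample_response = {
    "hits": {
        "total": {"value": 1, "relation": "eq"},
        "hits": [
            {
                "_index": "articles",
                "_id": "1",
                "_score": 0.2876821,
                "_source": {"title": "Intro to search", "views": 10},
            }
        ],
    }
}

for doc_id, source in extract_sources(sample_response):
    print(doc_id, source["title"])
```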

1.17. ID

The id is the unique identifier of a document. If no id is given when a document is stored, ElasticSearch generates one automatically. Index + type + id uniquely identify a document.

2. ElasticSearch Environment Setup

2.1. Downloading ElasticSearch

ElasticSearch can be downloaded from the official site at https://www.elastic.co/cn/downloads/ or from the Huawei mirror at https://mirrors.huaweicloud.com/

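2.2. File layout

After downloading and extracting the archive, the directory looks like this:

[Figure: directory listing]

The bin directory holds the ElasticSearch executables, config holds the configuration files, data is the default data directory, and logs is the default log directory.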

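2.3. Installation

No configuration changes are required; ElasticSearch can be started directly with the default files in config. The default HTTP port is 9200. From the bin directory, run: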

$ ./elasticsearch.bat

The startup log looks like this:

future versions of Elasticsearch will require Java 11; your Java version from [C:\Program Files\Java\jdk1.8.0_211\jre] does not meet this requirement
Warning: with JDK 8 on Windows, Elasticsearch may be unable to derive correct
  ergonomic settings due to a JDK issue (JDK-8074459). Please use a newer
  version of Java.
Warning: MaxDirectMemorySize may have been miscalculated due to JDK-8074459.
  Please use a newer version of Java or set MaxDirectMemorySize explicitly.
[2020-03-21T21:58:07,990][INFO ][o.e.e.NodeEnvironment    ] [test-node-1] using [1] data paths, mounts [[NewDisk (D:)]], net usable_space [63.6gb], net total_space [137.9gb], types [NTFS]
[2020-03-21T21:58:07,995][INFO ][o.e.e.NodeEnvironment    ] [test-node-1] heap size [990.7mb], compressed ordinary object pointers [true]
[2020-03-21T21:58:08,752][INFO ][o.e.n.Node               ] [test-node-1] node name [test-node-1], node ID [_tCUuwGSQ1CiF1vxKS42nA], cluster name [test-cluster]
[2020-03-21T21:58:08,754][INFO ][o.e.n.Node               ] [test-node-1] version[7.6.0], pid[21580], build[default/zip/7f634e9f44834fbc12724506cc1da681b0c3b1e3/2020-02-06T00:09:00.449973Z], OS[Windows 10/10.0/amd64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_211/25.211-b12]
[2020-03-21T21:58:08,754][INFO ][o.e.n.Node               ] [test-node-1] JVM home [C:\Program Files\Java\jdk1.8.0_211\jre]
[2020-03-21T21:58:08,755][INFO ][o.e.n.Node               ] [test-node-1] JVM arguments [-Des.networkaddress.cache.ttl=60, -Des.networkaddress.cache.negative.ttl=10, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dio.netty.allocator.numDirectArenas=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.locale.providers=COMPAT, -Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -Djava.io.tmpdir=C:\Users\zhaozhou\AppData\Local\Temp\elasticsearch, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=data, -XX:ErrorFile=logs/hs_err_pid%p.log, -XX:+PrintGCDetails, -XX:+PrintGCDateStamps, -XX:+PrintTenuringDistribution, -XX:+PrintGCApplicationStoppedTime, -Xloggc:logs/gc.log, -XX:+UseGCLogFileRotation, -XX:NumberOfGCLogFiles=32, -XX:GCLogFileSize=64m, -XX:MaxDirectMemorySize=536870912, -Delasticsearch, -Des.path.home=D:\soft\elasticsearch-7.6.0-windows-x86_64\elasticsearch-7.6.0, -Des.path.conf=D:\soft\elasticsearch-7.6.0-windows-x86_64\elasticsearch-7.6.0\config, -Des.distribution.flavor=default, -Des.distribution.type=zip, -Des.bundled_jdk=true]
[2020-03-21T21:58:16,134][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [aggs-matrix-stats]
[2020-03-21T21:58:16,136][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [analysis-common]
[2020-03-21T21:58:16,137][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [flattened]
[2020-03-21T21:58:16,137][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [frozen-indices]
[2020-03-21T21:58:16,137][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [ingest-common]
[2020-03-21T21:58:16,137][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [ingest-geoip]
[2020-03-21T21:58:16,138][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [ingest-user-agent]
[2020-03-21T21:58:16,138][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [lang-expression]
[2020-03-21T21:58:16,138][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [lang-mustache]
[2020-03-21T21:58:16,138][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [lang-painless]
[2020-03-21T21:58:16,138][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [mapper-extras]
[2020-03-21T21:58:16,138][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [parent-join]
[2020-03-21T21:58:16,139][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [percolator]
[2020-03-21T21:58:16,139][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [rank-eval]
[2020-03-21T21:58:16,139][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [reindex]
[2020-03-21T21:58:16,139][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [repository-url]
[2020-03-21T21:58:16,139][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [search-business-rules]
[2020-03-21T21:58:16,140][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [spatial]
[2020-03-21T21:58:16,140][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [transform]
[2020-03-21T21:58:16,140][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [transport-netty4]
[2020-03-21T21:58:16,140][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [vectors]
[2020-03-21T21:58:16,140][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [x-pack-analytics]
[2020-03-21T21:58:16,141][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [x-pack-ccr]
[2020-03-21T21:58:16,141][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [x-pack-core]
[2020-03-21T21:58:16,141][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [x-pack-deprecation]
[2020-03-21T21:58:16,141][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [x-pack-enrich]
[2020-03-21T21:58:16,141][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [x-pack-graph]
[2020-03-21T21:58:16,142][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [x-pack-ilm]
[2020-03-21T21:58:16,142][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [x-pack-logstash]
[2020-03-21T21:58:16,142][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [x-pack-ml]
[2020-03-21T21:58:16,142][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [x-pack-monitoring]
[2020-03-21T21:58:16,142][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [x-pack-rollup]
[2020-03-21T21:58:16,142][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [x-pack-security]
[2020-03-21T21:58:16,143][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [x-pack-sql]
[2020-03-21T21:58:16,143][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [x-pack-voting-only-node]
[2020-03-21T21:58:16,143][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded module [x-pack-watcher]
[2020-03-21T21:58:16,144][INFO ][o.e.p.PluginsService     ] [test-node-1] loaded plugin [analysis-ik]
[2020-03-21T21:58:24,489][INFO ][o.e.x.s.a.s.FileRolesStore] [test-node-1] parsed [0] roles from file [D:\soft\elasticsearch-7.6.0-windows-x86_64\elasticsearch-7.6.0\config\roles.yml]
[2020-03-21T21:58:26,058][INFO ][o.e.x.m.p.l.CppLogMessageHandler] [test-node-1] [controller/17176] [Main.cc@110] controller (64 bit): Version 7.6.0 (Build 1c8cca13fa9631) Copyright (c) 2020 Elasticsearch BV

Opening http://127.0.0.1:9200/ in a browser produces output like the following.

[Figure: output of the root endpoint]
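The same check can be scripted instead of using a browser. The sketch below assumes a node is listening on the default address used in this article; the function names are hypothetical, and the opener parameter exists only so that the HTTP call can be swapped out. The root endpoint returns JSON that includes the node name, the cluster name, and the version number.

```python
import json
import urllib.request


def fetch_node_info(url="http://127.0.0.1:9200/", opener=urllib.request.urlopen, timeout=5):
    """GET the root endpoint and decode its JSON body."""
    with opener(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def summarize(info):
    """Condense the root endpoint's answer into one line."""
    return f'{info["name"]} in {info["cluster_name"]} runs {info["version"]["number"]}'


# Usage against a running node (left commented out so nothing is contacted on import):
# print(summarize(fetch_node_info()))
```

With the node from the startup log above, the summary would be expected to name test-node-1, test-cluster, and version 7.6.0.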

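2.4. Installing the Elasticsearch-Head plugin

The Elasticsearch-Head extension can be installed in Chrome, either by searching the Chrome Web Store or from https://chrome.google.com/webstore/detail/elasticsearch-head/ffmkiejjmecolpfloofpjologoblkegm/related

Once installed, it shows the status of the Elasticsearch nodes:

[Figure: Elasticsearch-Head overview]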