
Normalized (third normal form) modeling vs. denormalized modeling

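import java.util.List;

// One department has many employees (one-to-many).
// In a real project each class lives in its own file.
class Department {

    private Integer deptId;
    private String name;
    private String desc;
    private List<Employee> employees;

}

// Each employee keeps a reference back to its department.
class Employee {

    private Integer empId;
    private String name;
    private Integer age;
    private String gender;
    private Department dept;

}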

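Approach 1: normalized modeling. Users and blogs are separate documents, and a blog refers to its author only by userId.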

PUT /website/users/1 
{
  "name":     "小魚兒",
  "email":    "xiaoyuer@sina.com",
  "birthday":      "1980-01-01"
}

PUT /website/blogs/1
{
  "title":    "我的第一篇博客",
  "content":     "這是我的第一篇博客，開通啦！！！",
  "userId":     1 
}

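To find all blogs published by 小魚兒, you first search for 小魚兒's userId and then search blogs by that userId.
Pros: no redundant data, easy to maintain.
Cons: the join happens in the application layer. If many records are related, the second query becomes very large and performance suffers, so this approach is generally avoided.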
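
The two-step lookup described above can be sketched in plain Python. This is only an illustration of an application-side join: the in-memory lists stand in for the users and blogs types, and the sample records and the helper name `blogs_by_author` are hypothetical.

```python
# In-memory stand-ins for the users and blogs types (hypothetical data).
users = [{"id": 1, "name": "小魚兒"}, {"id": 2, "name": "花無缺"}]
blogs = [
    {"id": 1, "title": "post A", "userId": 1},
    {"id": 2, "title": "post B", "userId": 2},
    {"id": 3, "title": "post C", "userId": 1},
]


def blogs_by_author(name):
    # Query 1: resolve the user name to a set of user ids.
    user_ids = {u["id"] for u in users if u["name"] == name}
    # Query 2: fetch every blog whose userId is in that set.
    return [b for b in blogs if b["userId"] in user_ids]


titles = [b["title"] for b in blogs_by_author("小魚兒")]
print(titles)
```

The second step grows with the number of ids returned by the first, which is the cost the cons line refers to.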

Approach 2: denormalized modeling. Use the document data model with redundant data, copying the user fields the blog needs into the blog document, to link users and blogs.


PUT /website/users/1
{
  "name":     "小魚兒",
  "email":    "xiaoyuer@sina.com",
  "birthday":      "1980-01-01"
}

PUT /website/blogs/1
{
  "title": "小魚兒的第一篇博客",
  "content": "大家好，我是小魚兒。。。",
  "userInfo": {
    "userId": 1,
    "username": "小魚兒"
  }
}

Searching blogs through the redundant user data

GET /website/blogs/_search 
{
  "query": {
    "term": {
      "userInfo.username.keyword": {
        "value": "小魚兒"
      }
    }
  }
}

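Pros: high performance, because a single search is enough.
Cons: redundant data and a higher maintenance cost. Whenever a username changes, both the users type and every blog document that embeds it must be updated.
In general, for a NoSQL-style data store such as Elasticsearch, the redundant (denormalized) model is the usual choice.

Redundant fields can also be aggregated on directly. The following request groups blogs by username and returns up to 5 blog titles per user: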

GET /website/blogs/_search 
{
  "size": 0, 
  "aggs": {
    "group_by_username": {
      "terms": {
        "field": "userInfo.username.keyword"
      },
      "aggs": {
        "top_blogs": {
          "top_hits": {
            "_source": {
              "include": "title"
            }, 
            "size": 5
          }
        }
      }
    }
  }
}

Modeling a multi-level file system in Elasticsearch

The path_hierarchy tokenizer splits a path into itself plus all of its ancestor paths: /a/b/c/d --> /a/b/c/d, /a/b/c, /a/b, /a
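
As a rough illustration of what the tokenizer emits, here is a small pure-Python imitation. It is a sketch under the assumption of absolute paths with the default "/" delimiter; the function name `path_hierarchy_tokens` is hypothetical and this is not Elasticsearch's implementation.

```python
def path_hierarchy_tokens(path, delimiter="/"):
    """Return every ancestor prefix of an absolute path, shortest first."""
    parts = [p for p in path.split(delimiter) if p]
    tokens = []
    prefix = ""
    for part in parts:
        prefix = prefix + delimiter + part
        tokens.append(prefix)
    return tokens


tokens = path_hierarchy_tokens("/workspace/projects/helloworld")
print(tokens)

# A term query on an analyzed sub-field behaves like a membership test
# against these tokens.
is_under_workspace = "/workspace" in tokens
```

Because every ancestor directory becomes its own token, an exact term lookup for a directory finds all files below it at any depth.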

PUT /fs
{
  "settings": {
    "analysis": {
      "analyzer": {
        "paths": { 
          "tokenizer": "path_hierarchy"
        }
      }
    }
  }
}

fs: filesystem

PUT /fs/_mapping/file
{
  "properties": {
    "name": { 
      "type":  "keyword"
    },
    "path": { 
      "type":  "keyword",
      "fields": {
        "tree": { 
          "type":     "text",
          "analyzer": "paths"
        }
      }
    }
  }
}

PUT /fs/file/1
{
  "name":     "README.txt", 
  "path":     "/workspace/projects/helloworld", 
  "contents": "這是我的第一個elasticsearch程序"
}

File search requirement 1: find files whose contents include elasticsearch and that sit directly in the /workspace/projects/helloworld directory

GET /fs/file/_search 
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "contents": "elasticsearch"
          }
        },
        {
          "constant_score": {
            "filter": {
              "term": {
                "path": "/workspace/projects/helloworld"
              }
            }
          }
        }
      ]
    }
  }
}

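Search requirement 2: find all files under the /workspace directory, at any depth, whose contents include elasticsearch

The path.tree sub-field of doc1 is indexed with one token per ancestor directory, so a term query on any of these values matches doc1:

/workspace/projects/helloworld doc1
/workspace/projects doc1
/workspace doc1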

GET /fs/file/_search 
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "contents": "elasticsearch"
          }
        },
        {
          "constant_score": {
            "filter": {
              "term": {
                "path.tree": "/workspace"
              }
            }
          }
        }
      ]
    }
  }
}

Global lock (locking the whole index) for concurrency control

Acquire a global lock on the fs index:

PUT /fs/lock/global/_create
{}

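fs: the index you want to lock
lock: a type you designate for holding the global lock of this index
global: the id of the doc that represents the global lock
_create: forces a create operation. If the doc /fs/lock/global already exists, the create fails with an error
If another thread tries to acquire the lock at this point, Elasticsearch returns an error, and it keeps doing so until the global lock is deleted: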

DELETE /fs/lock/global

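Pros and cons of the global lock

Pros: very simple to operate, very easy to use, low cost
Cons: the whole index is locked, so every operation on every doc in the index is blocked, and the concurrency of the whole system is very low
It is a convenient choice when locking and unlocking are infrequent and the work done while holding the lock is short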
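
The create-if-absent behaviour that the global lock relies on can be imitated with a dictionary. This is a minimal single-process sketch, not a client for Elasticsearch: the `locks` dict stands in for the lock type, and the names `DocumentExists`, `create`, `acquire_global_lock` and `release_global_lock` are hypothetical.

```python
class DocumentExists(Exception):
    """Raised when a create targets an id that is already present."""


locks = {}  # stands in for the /fs/lock type


def create(doc_id, body):
    # Mirrors _create semantics: fail instead of overwriting.
    if doc_id in locks:
        raise DocumentExists(doc_id)
    locks[doc_id] = body


def acquire_global_lock():
    try:
        create("global", {})
        return True
    except DocumentExists:
        return False


def release_global_lock():
    # Mirrors DELETE /fs/lock/global.
    locks.pop("global", None)


first = acquire_global_lock()   # lock is free
second = acquire_global_lock()  # already held, so this fails
release_global_lock()
third = acquire_global_lock()   # free again after the delete
print(first, second, third)
```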

Pessimistic lock (locking a single doc) for concurrency control

Pessimistic lock acquisition syntax

The request refers to a script file named judge-lock with the following Groovy content:
if ( ctx._source.process_id != process_id ) { assert false }; ctx.op = 'noop';

POST /fs/lock/1/_update
{
  "upsert": { "process_id": 123 },
  "script": {
    "lang": "groovy",
    "file": "judge-lock",
    "params": {
      "process_id": 123
    }
  }
}

Unlock syntax:

DELETE /fs/lock/1

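/fs/lock is fixed: the lock type under fs is dedicated to locking
/fs/lock/id, for example 1: the id is the id of the doc you want to lock, so this lock doc represents the lock on that data doc
_update + upsert: performs an upsert operation
params: contains process_id, the unique id of the process that is about to perform the create, update, or delete. For example, a Java system could generate a UUID for each thread at startup and use it as the thread id.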

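process_id is important: it is stored in the lock doc to record which process locked the corresponding doc, so that other processes arriving later can tell that the data is already locked by someone else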

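If the document has not been locked before, /fs/lock/1 does not exist, meaning nobody holds a lock on doc id=1. With the upsert syntax, an index operation runs and creates the /fs/lock/id doc,
using the upsert body as the lock data. process_id is set to 123 and the script does not run. From this point on, the process with process_id=123 holds the lock on that doc.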

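If the document is locked, /fs/lock/1 already exists, meaning doc id=1 has been locked by some process. An update operation runs the script, which compares process_id. If it is the same, the same process locked this doc earlier and has come back,
so it may operate on the doc directly: no error is raised, nothing is changed (noop), and success is returned. If process_id does not match, the doc is locked by another process, and an error is raised.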
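
The upsert-plus-script flow described above can be sketched as follows. This is a single-process imitation under simplifying assumptions: the `locks` dict stands in for the lock type, and `LockConflict`, `lock_doc` and `unlock_doc` are hypothetical names, not part of any Elasticsearch API.

```python
class LockConflict(Exception):
    """Raised when another process already holds the lock."""


locks = {}  # stands in for the /fs/lock type, keyed by the locked doc id


def lock_doc(doc_id, process_id):
    existing = locks.get(doc_id)
    if existing is None:
        # upsert branch: the lock doc is created, the script is skipped
        locks[doc_id] = {"process_id": process_id}
        return "created"
    if existing["process_id"] != process_id:
        # script branch: the equivalent of "assert false" in judge-lock
        raise LockConflict(doc_id)
    # same owner again: nothing changes, like ctx.op = 'noop'
    return "noop"


def unlock_doc(doc_id):
    locks.pop(doc_id, None)


results = [lock_doc(1, 123), lock_doc(1, 123)]
try:
    lock_doc(1, 456)
    conflict = False
except LockConflict:
    conflict = True
unlock_doc(1)
results.append(lock_doc(1, 456))
print(results, conflict)
```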

Shared read locks and exclusive write locks in Elasticsearch

1. Shared locks and exclusive locks explained

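Shared lock: the data is shared. Multiple threads can each acquire a shared lock on the same data and then read it
Exclusive lock: exclusive by nature. Only one thread can hold it, and that thread may then create, update, or delete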

Separating read and write locks

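If threads only need to read, any number of them can come in and read at the same time, each acquiring a shared lock
If a thread then wants to modify the data, it tries to acquire an exclusive lock. Exclusive and shared locks are mutually exclusive, so while anyone holds a shared lock the exclusive lock cannot be acquired and the writer has to wait. In other words, while someone is reading, nobody may modify the data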

The reverse also holds

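If someone is modifying the data, an exclusive lock is held
A thread that comes to read at that moment tries to acquire a shared lock and fails, because shared and exclusive locks conflict
Another thread that comes to modify the data tries to acquire an exclusive lock and also fails with a lock conflict. It must wait, because only one thread can modify the data at a time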

Acquiring a shared lock

Locking script judge-lock-2 (Groovy): if (ctx._source.lock_type == 'exclusive') { assert false }; ctx._source.lock_count++

POST /fs/lock/1/_update 
{
  "upsert": { 
    "lock_type":  "shared",
    "lock_count": 1
  },
  "script": {
    "lang": "groovy",
    "file": "judge-lock-2"
  }
}

Releasing a shared lock. The syntax here is approximate, but the idea holds.
Unlock script unlock-shared (Groovy): if (--ctx._source.lock_count == 0) { ctx.op = 'delete' };

POST /fs/lock/1/_update
{
  "script": {
    "lang": "groovy",
    "file": "unlock-shared"
  }
}

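Each shared unlock first decrements lock_count by 1. If the result is 0, all shared locks have been released, so /fs/lock/1 is deleted and the doc is fully unlocked

Acquiring an exclusive lock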

PUT /fs/lock/1/_create
{ "lock_type": "exclusive" }

Releasing an exclusive lock:
DELETE /fs/lock/1
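
The interaction between the two lock types can be sketched in a few lines of Python. This is a single-process imitation of the counting logic only: the `locks` dict stands in for the lock type and all four function names are hypothetical.

```python
locks = {}  # stands in for the /fs/lock type, keyed by the locked doc id


def lock_shared(doc_id):
    lock = locks.get(doc_id)
    if lock is None:
        # upsert branch: first reader creates the lock doc
        locks[doc_id] = {"lock_type": "shared", "lock_count": 1}
        return True
    if lock["lock_type"] == "exclusive":
        return False  # a writer holds the doc
    lock["lock_count"] += 1
    return True


def unlock_shared(doc_id):
    lock = locks[doc_id]
    lock["lock_count"] -= 1
    if lock["lock_count"] == 0:
        del locks[doc_id]  # last reader gone, remove the lock doc


def lock_exclusive(doc_id):
    # Mirrors _create: fails if any lock doc exists, shared or exclusive.
    if doc_id in locks:
        return False
    locks[doc_id] = {"lock_type": "exclusive"}
    return True


def unlock_exclusive(doc_id):
    locks.pop(doc_id, None)


two_readers = lock_shared(1) and lock_shared(1)
writer_while_reading = lock_exclusive(1)
unlock_shared(1)
unlock_shared(1)
writer_after_readers = lock_exclusive(1)
reader_while_writing = lock_shared(1)
print(two_readers, writer_while_reading, writer_after_readers, reader_while_writing)
```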

Searching nested objects

Index some data

PUT /website/blogs/6
{
  "title": "花無缺發表的一篇帖子",
  "content":  "我是花無缺，大家要不要考慮一下投資房產和買股票的事情啊。。。",
  "tags":  [ "投資", "理財" ],
  "comments": [ 
    {
      "name":    "小魚兒",
      "comment": "什么股票啊？推薦一下唄",
      "age":     28,
      "stars":   4,
      "date":    "2016-09-01"
    },
    {
      "name":    "黃藥師",
      "comment": "我喜歡投資房產，風險大收益也大",
      "age":     31,
      "stars":   5,
      "date":    "2016-10-22"
    }
  ]
}

Search for blogs commented on by a 28-year-old 黃藥師

GET /website/blogs/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "comments.name": "黃藥師" }},
        { "match": { "comments.age":  28      }} 
      ]
    }
  }
}

The expected result is no hits, yet the document is returned. Why?

How object-type data is stored under the hood:

{
  "title":            [ "花無缺", "發表", "一篇", "帖子" ],
  "content":             [ "我", "是", "花無缺", "大家", "要不要", "考慮", "一下", "投資", "房產", "買", "股票", "事情" ],
  "tags":             [ "投資", "理財" ],
  "comments.name":    [ "小魚兒", "黃藥師" ],
  "comments.comment": [ "什么", "股票", "推薦", "我", "喜歡", "投資", "房產", "風險", "收益", "大" ],
  "comments.age":     [ 28, 31 ],
  "comments.stars":   [ 4, 5 ],
  "comments.date":    [ "2016-09-01", "2016-10-22" ]
}

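The object type flattens the objects of a JSON array into per-field value lists, so the link between the name and the age of a single comment is lost. The query hits this document because comments.name contains 黃藥師 and comments.age contains 28, even though the two values come from different comments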
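
The effect of flattening can be reproduced with a few lines of Python. This is an illustrative sketch only: `flatten` and `object_match` are hypothetical helpers that imitate the per-field value lists shown above, not Elasticsearch internals.

```python
comments = [
    {"name": "小魚兒", "age": 28},
    {"name": "黃藥師", "age": 31},
]


def flatten(objects, prefix="comments."):
    """Merge an array of objects into one value list per field."""
    flat = {}
    for obj in objects:
        for key, value in obj.items():
            flat.setdefault(prefix + key, []).append(value)
    return flat


def object_match(flat, name, age):
    # Each condition is checked against its own list, independently.
    return name in flat["comments.name"] and age in flat["comments.age"]


flat = flatten(comments)
false_positive = object_match(flat, "黃藥師", 28)
print(flat)
print(false_positive)
```

No single comment has both values, yet the check succeeds, which is the unexpected hit seen in the search above.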
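2. Use the nested object type to solve the problem caused by this flattened storage
Change the mapping and set the type of comments from object to nested. The type of an existing field cannot be changed in place, so the index has to be recreated with the new mapping and the data indexed again: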

PUT /website
{
  "mappings": {
    "blogs": {
      "properties": {
        "comments": {
          "type": "nested", 
          "properties": {
            "name":    { "type": "string"  },
            "comment": { "type": "string"  },
            "age":     { "type": "short"   },
            "stars":   { "type": "short"   },
            "date":    { "type": "date"    }
          }
        }
      }
    }
  }
}

With nested, each comment is stored as its own hidden document, followed by the parent document:

{ 
  "comments.name":    [ "小魚兒" ],
  "comments.comment": [ "什么", "股票", "推薦" ],
  "comments.age":     [ 28 ],
  "comments.stars":   [ 4 ],
  "comments.date":    [ "2016-09-01" ]
}
{ 
  "comments.name":    [ "黃藥師" ],
  "comments.comment": [ "我", "喜歡", "投資", "房產", "風險", "收益", "大" ],
  "comments.age":     [ 31 ],
  "comments.stars":   [ 5 ],
  "comments.date":    [ "2016-10-22" ]
}
{ 
  "title":            [ "花無缺", "發表", "一篇", "帖子" ],
  "content":          [ "我", "是", "花無缺", "大家", "要不要", "考慮", "一下", "投資", "房產", "買", "股票", "事情" ],
  "tags":             [ "投資", "理財" ]
}

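Search again, this time wrapping the conditions in a nested query with path set to comments, and the result is as expected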
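
For reference, a nested search body could look like the dictionary below, followed by a tiny imitation of how matching works once each comment is evaluated on its own. The query shape follows the nested query of the Elasticsearch query DSL; the helper `nested_match` and the trimmed sample data are hypothetical.

```python
# Request body for GET /website/blogs/_search, written as a Python dict.
nested_query = {
    "query": {
        "nested": {
            "path": "comments",
            "query": {
                "bool": {
                    "must": [
                        {"match": {"comments.name": "黃藥師"}},
                        {"match": {"comments.age": 28}},
                    ]
                }
            },
        }
    }
}

comments = [
    {"name": "小魚兒", "age": 28},
    {"name": "黃藥師", "age": 31},
]


def nested_match(objects, name, age):
    # Both conditions must hold inside the same comment object.
    return any(o["name"] == name and o["age"] == age for o in objects)


wrong_pair = nested_match(comments, "黃藥師", 28)
right_pair = nested_match(comments, "黃藥師", 31)
print(wrong_pair, right_pair)
```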

Aggregation requirement 1: bucket the comments by comment date, one bucket per month, and compute the average star rating of the comments in each month

GET /website/blogs/_search 
{
  "size": 0, 
  "aggs": {
    "comments_path": {
      "nested": {
        "path": "comments"
      }, 
      "aggs": {
        "group_by_comments_date": {
          "date_histogram": {
            "field": "comments.date",
            "interval": "month",
            "format": "yyyy-MM"
          },
          "aggs": {
            "avg_stars": {
              "avg": {
                "field": "comments.stars"
              }
            }
          }
        }
      }
    }
  }
}
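
To make the expected shape of the result concrete, the same grouping can be computed by hand in Python over the two sample comments. This is only a sketch of the arithmetic: `avg_stars_by_month` is a hypothetical helper and does not reproduce the response format of the aggregation.

```python
from collections import defaultdict

comments = [
    {"stars": 4, "date": "2016-09-01"},
    {"stars": 5, "date": "2016-10-22"},
]


def avg_stars_by_month(items):
    buckets = defaultdict(list)
    for item in items:
        month = item["date"][:7]  # "yyyy-MM" prefix of the ISO date
        buckets[month].append(item["stars"])
    return {month: sum(s) / len(s) for month, s in sorted(buckets.items())}


monthly_avg = avg_stars_by_month(comments)
print(monthly_avg)
```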

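Parent-child relationship modeling

Modeling with object has a drawback: it works like redundant data, keeping multiple entities together in one document, so the maintenance cost is relatively high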

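Parent-child modeling is closer to third-normal-form modeling in a relational database. Entities are split apart and linked through a parent-child relationship, so the data does not all have to live together, and updating a parent doc or a child doc does not affect the other
This makes one-to-many relationships easier to maintain. As noted earlier, relational-style modeling with an application-side join performs poorly because it needs multiple searches. The parent-child model avoids that: although the entities are separate, Elasticsearch handles the underlying relationship at search time and takes steps to keep the search efficient.
Compared with the nested model, the advantage of the parent-child model is that parent docs and child docs can be updated independently of each other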

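Key point: Elasticsearch maintains metadata that maps parents to children in order to keep queries fast, but there is one restriction: a parent and its children must live on the same shard
Because the related docs sit on one shard together with the metadata that maps their relationship, a parent-child search never has to cross shards. Each shard resolves the relationship locally, which is why performance is good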

Example: employee management for R&D centers. An IT company has several R&D centers, and each R&D center has several employees

PUT /company
{
  "mappings": {
    "rd_center": {},
    "employee": {
      "_parent": {
        "type": "rd_center" 
      }
    }
  }
}

The core of parent-child modeling: types are related as parent and child, and _parent specifies the parent type

POST /company/rd_center/_bulk
{ "index": { "_id": "1" }}
{ "name": "北京研發總部", "city": "北京", "country": "中國" }
{ "index": { "_id": "2" }}
{ "name": "上海研發中心", "city": "上海", "country": "中國" }
{ "index": { "_id": "3" }}
{ "name": "硅谷人工智能實驗室", "city": "硅谷", "country": "美國" }

When routing to a shard, the rd_center doc with id=1 is routed by its own id by default and lands on some shard

PUT /company/employee/1?parent=1 
{
  "name":  "張三",
  "birthday":   "1970-10-24",
  "hobby": "爬山"
}

The core of maintaining the relationship: parent=1 specifies the id of this doc's parent doc

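The parent-child relationship thus guarantees that a parent doc and its child docs are stored on the same shard. Under the hood this is still doc routing: the employee is routed by its parent id, which is the same value the rd_center was routed by, so both land on the same shard
The employee doc with id=1 is therefore not routed by its own id but by parent=1, the id of its parent doc. This routing mechanism is what keeps parent and child data on one shard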
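
The routing rule can be illustrated with a simplified shard function. Everything here is an assumption made for the sketch: Elasticsearch uses its own hash function rather than CRC32, and the shard count of 5 as well as the names `shard_for` and `shard_for_doc` are hypothetical.

```python
import zlib

NUM_PRIMARY_SHARDS = 5  # hypothetical shard count


def shard_for(routing_value):
    # Stand-in hash; only the "hash modulo shard count" idea matters here.
    digest = zlib.crc32(str(routing_value).encode("utf-8"))
    return digest % NUM_PRIMARY_SHARDS


def shard_for_doc(doc_id, parent=None):
    # With ?parent=..., the parent id replaces the doc id as routing value.
    return shard_for(parent if parent is not None else doc_id)


rd_center_shard = shard_for_doc("1")
employee_shards = [shard_for_doc(emp_id, parent="1") for emp_id in ("1", "2")]
print(rd_center_shard, employee_shards)
```

Whatever hash is used, children that share a parent id always compute the same shard as the parent, because all of them hash the same routing value.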

POST /company/employee/_bulk
{ "index": { "_id": 2, "parent": "1" }}
{ "name": "李四", "birthday": "1982-05-16", "hobby": "游泳" }
{ "index": { "_id": 3, "parent": "2" }}
{ "name": "王二", "birthday": "1979-04-01", "hobby": "爬山" }
{ "index": { "_id": 4, "parent": "3" }}
{ "name": "趙五", "birthday": "1987-05-11", "hobby": "騎馬" }

With the parent-child data model in place, we can run searches and aggregations on it

1. Search for R&D centers that have employees born in 1980 or later

GET /company/rd_center/_search
{
  "query": {
    "has_child": {
      "type": "employee",
      "query": {
        "range": {
          "birthday": {
            "gte": "1980-01-01"
          }
        }
      }
    }
  }
}

2. Search for R&D centers that have an employee named 張三

GET /company/rd_center/_search
{
  "query": {
    "has_child": {
      "type":       "employee",
      "query": {
        "match": {
          "name": "張三"
        }
      }
    }
  }
}

3. Search for R&D centers that have at least 2 employees

GET /company/rd_center/_search
{
  "query": {
    "has_child": {
      "type":         "employee",
      "min_children": 2, 
      "query": {
        "match_all": {}
      }
    }
  }
}
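
What min_children asks for can be spelled out over the sample data in plain Python. This is a sketch of the counting logic only, with the parent ids copied from the bulk requests above; `centers_with_min_children` is a hypothetical helper.

```python
from collections import Counter

rd_centers = {"1": "北京研發總部", "2": "上海研發中心", "3": "硅谷人工智能實驗室"}
employees = [
    {"name": "張三", "parent": "1"},
    {"name": "李四", "parent": "1"},
    {"name": "王二", "parent": "2"},
    {"name": "趙五", "parent": "3"},
]


def centers_with_min_children(min_children):
    # Count children per parent id, then keep parents that reach the minimum.
    counts = Counter(e["parent"] for e in employees)
    return [cid for cid in rd_centers if counts[cid] >= min_children]


at_least_two = centers_with_min_children(2)
print(at_least_two)
```

With this data only the center with id 1 has two employees, so it is the only one that qualifies.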

4. Search for employees of R&D centers located in China (中國)

GET /company/employee/_search 
{
  "query": {
    "has_parent": {
      "parent_type": "rd_center",
      "query": {
        "term": {
          "country.keyword": "中國"
        }
      }
    }
  }
}

5. Count, for each country, how many employees like each hobby

GET /company/rd_center/_search 
{
  "size": 0,
  "aggs": {
    "group_by_country": {
      "terms": {
        "field": "country.keyword"
      },
      "aggs": {
        "group_by_child_employee": {
          "children": {
            "type": "employee"
          },
          "aggs": {
            "group_by_hobby": {
              "terms": {
                "field": "hobby.keyword"
              }
            }
          }
        }
      }
    }
  }
}

Three-level (grandparent, parent, child) data modeling and search

PUT /company
{
  "mappings": {
    "country": {},
    "rd_center": {
      "_parent": {
        "type": "country" 
      }
    },
    "employee": {
      "_parent": {
        "type": "rd_center" 
      }
    }
  }
}

country -> rd_center -> employee: a three-level data model

POST /company/country/_bulk
{ "index": { "_id": "1" }}
{ "name": "中國" }
{ "index": { "_id": "2" }}
{ "name": "美國" }

POST /company/rd_center/_bulk
{ "index": { "_id": "1", "parent": "1" }}
{ "name": "北京研發總部" }
{ "index": { "_id": "2", "parent": "1" }}
{ "name": "上海研發中心" }
{ "index": { "_id": "3", "parent": "2" }}
{ "name": "硅谷人工智能實驗室" }

PUT /company/employee/1?parent=1&routing=1
{
  "name":  "張三",
  "dob":   "1970-10-24",
  "hobby": "爬山"
}

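A grandchild must specify routing manually, set to the id of its grandparent doc
country is routed by its own id. rd_center specifies parent and is routed by the country's id. If employee specified only parent, it would be routed by the rd_center's id, and the three generations could end up on different shards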
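
The need for an explicit routing value on the grandchild can be seen with a simplified shard function. As before this is a sketch under assumptions: CRC32 stands in for Elasticsearch's own hash, the shard count is hypothetical, and the employee id "4" is made up for the example.

```python
import zlib

NUM_PRIMARY_SHARDS = 5  # hypothetical shard count


def shard_for(routing_value):
    # Stand-in hash; Elasticsearch uses a different hash function.
    digest = zlib.crc32(str(routing_value).encode("utf-8"))
    return digest % NUM_PRIMARY_SHARDS


def shard_for_doc(doc_id, parent=None, routing=None):
    # Explicit routing wins, then parent, then the doc's own id.
    if routing is not None:
        key = routing
    elif parent is not None:
        key = parent
    else:
        key = doc_id
    return shard_for(key)


country_shard = shard_for_doc("2")                      # country id 2
rd_center_shard = shard_for_doc("3", parent="2")        # routed by country id
employee_parent_only = shard_for_doc("4", parent="3")   # routed by rd_center id
employee_with_routing = shard_for_doc("4", parent="3", routing="2")
print(country_shard, rd_center_shard, employee_parent_only, employee_with_routing)
```

With parent alone, the employee hashes the rd_center id, which is a different routing value from the one used by the country and the rd_center. Passing the country id as routing makes all three generations hash the same value.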

Search for countries that have employees whose hobby is 爬山 (hiking)

GET /company/country/_search
{
  "query": {
    "has_child": {
      "type": "rd_center",
      "query": {
        "has_child": {
          "type": "employee",
          "query": {
            "match": {
              "hobby": "爬山"
            }
          }
        }
      }
    }
  }
}

Elasticsearch Usage and Internals 7 -- Data Modeling
