03_Lucene Study Notes

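1. Lucene getting-started program

Required jar files:
    MySQL 5.1 driver: mysql-connector-java-5.1.7-bin.jar
    Lucene core: lucene-core-4.10.3.jar
    Common analyzers: lucene-analyzers-common-4.10.3.jar
    JUnit: junit-4.9.jar
    Commons IO: a commons-io jar (provides the FileUtils class used below)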

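// Imports needed by the test class (place them at the top of the file)
import java.io.File;
import java.io.IOException;

import org.apache.commons.io.FileUtils;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.LongField;
import org.apache.lucene.document.StoredField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;
import org.junit.Test;

/**
 * Create the index.
 */
@Test
public void testIndex() throws IOException {
    // Step 1. Create an IndexWriter (working backwards from what it requires):
    //   1. a Directory that specifies where the index is stored
    //   2. an Analyzer that tokenizes the document content
    Directory directory = FSDirectory.open(new File("D:/lucence/index"));
    Analyzer analyzer = new StandardAnalyzer();
    IndexWriterConfig config = new IndexWriterConfig(Version.LATEST, analyzer);

    // try-with-resources closes the IndexWriter (step 5) even if indexing fails
    try (IndexWriter indexWriter = new IndexWriter(directory, config)) {
        // Step 2. Get the source data
        File sourceDir = new File("D:/searchsource");
        File[] files = sourceDir.listFiles();
        if (files == null) {
            throw new IOException("Cannot list source directory: " + sourceDir);
        }
        // Step 3. Walk the source data, create Field objects and add them to a Document
        for (File f : files) {
            if (!f.isFile()) {
                continue; // skip subdirectories; only regular files can be read as text
            }
            Document document = new Document();
            // File name
            String fileName = f.getName();
            Field fileNameField = new TextField("fileName", fileName, Field.Store.YES);
            // File size
            long fileSize = FileUtils.sizeOf(f);
            Field fileSizeField = new LongField("fileSize", fileSize, Field.Store.YES);
            // File location (stored only, not searchable)
            String path = f.getPath();
            Field filePathField = new StoredField("filePath", path);
            // File content (read with the platform default charset)
            String fileContent = FileUtils.readFileToString(f);
            Field fileContentField = new TextField("fileContent", fileContent, Field.Store.YES);

            // Add each Field to the Document
            document.add(fileNameField);
            document.add(fileSizeField);
            document.add(filePathField);
            document.add(fileContentField);
            // Step 4. Write the Document through the IndexWriter; the index entries are
            // built during this call and written to the index together with the Document
            indexWriter.addDocument(document);
        }
    }
}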


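// Additional imports needed for searching
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;

/**
 * Search the index.
 */
@Test
public void testQuery() throws IOException {
    // Step 1: create a Directory for the index location on disk, D:\lucence\index
    Directory directory = FSDirectory.open(new File("D:/lucence/index"));
    // Step 2: create an IndexReader from the Directory;
    // try-with-resources closes it even if the search fails
    try (IndexReader indexReader = DirectoryReader.open(directory)) {
        // Step 3: create an IndexSearcher from the IndexReader
        IndexSearcher indexSearcher = new IndexSearcher(indexReader);
        // Step 4: create a TermQuery with the field to search and the keyword.
        // The term is not analyzed, so it must equal an indexed token exactly
        // (StandardAnalyzer lower-cases tokens at index time).
        Query query = new TermQuery(new Term("fileName", "web"));
        // Step 5: run the query. Argument 1: the query; argument 2: maximum number of hits
        TopDocs topDocs = indexSearcher.search(query, 10);
        // Step 6: iterate over the results and print them
        ScoreDoc[] docs = topDocs.scoreDocs;
        System.out.println(docs.length);
        for (ScoreDoc scoreDoc : docs) {
            // Internal id of the matching document
            int doc = scoreDoc.doc;
            // Load the stored document by its id
            Document document = indexSearcher.doc(doc);

            // Print the stored field values to verify the index
            // File name
            String fileName = document.get("fileName");
            System.out.println(fileName);
            // File size
            String fileSize = document.get("fileSize");
            System.out.println(fileSize);
            // File location
            String filePath = document.get("filePath");
            System.out.println(filePath);
            // File content
            String fileContent = document.get("fileContent");
            System.out.println(fileContent);
        }
    }
}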

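2. Field

2.1 Field attributes

A Field is one field of a document and consists of a name and a value. A document can contain multiple Fields;
the Document is only a container for them. Field values are the content that is indexed and therefore the content that is searched.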

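Tokenized
- Yes: the field content is split into individual terms. Tokenization exists to support indexing.
  - Product name, product description.
- No: the content is kept whole and treated as a single term.
  - Product ID, ID-card number, image path.

Indexed
- Yes: the field value is indexed. Indexing exists to support searching.
  - Product name, product description.
- No: no index is built for the field.
  - Image path.

Stored
- Yes: the field value is stored so that it can be displayed on a page.
  - Product name, image path.
- No: the field value is not stored.
  - Product description. If it has to be shown, look it up in the database by ID and display it on the detail page.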
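
The three attributes can be illustrated with a toy model. This is a minimal sketch, not Lucene's implementation: the class FieldAttributeSketch, its methods and the sample field names are all hypothetical, and tokenization is simplified to lower-casing and splitting on non-word characters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of the tokenized / indexed / stored flags; not Lucene code. */
public class FieldAttributeSketch {
    // term ("field:token") -> ids of the documents that contain it
    private final Map<String, Set<Integer>> invertedIndex = new HashMap<>();
    // document id -> stored field values that can be returned with a hit
    private final List<Map<String, String>> storedFields = new ArrayList<>();

    /** Starts a new empty document and returns its id. */
    public int newDocument() {
        storedFields.add(new HashMap<String, String>());
        return storedFields.size() - 1;
    }

    public void addField(int docId, String name, String value,
                         boolean tokenized, boolean indexed, boolean stored) {
        if (indexed) {
            // Tokenized: one term per lower-cased word. Otherwise the whole value is one term.
            List<String> terms = tokenized
                    ? Arrays.asList(value.toLowerCase(Locale.ROOT).split("\\W+"))
                    : Arrays.asList(value);
            for (String term : terms) {
                if (term.isEmpty()) {
                    continue;
                }
                invertedIndex.computeIfAbsent(name + ":" + term, k -> new TreeSet<>()).add(docId);
            }
        }
        if (stored) {
            storedFields.get(docId).put(name, value);
        }
    }

    /** Exact term lookup: only indexed fields can ever produce a hit. */
    public Set<Integer> search(String name, String term) {
        return invertedIndex.getOrDefault(name + ":" + term, new TreeSet<Integer>());
    }

    /** Returns the stored value, or null if the field was not stored. */
    public String stored(int docId, String name) {
        return storedFields.get(docId).get(name);
    }

    public static void main(String[] args) {
        FieldAttributeSketch index = new FieldAttributeSketch();
        int doc = index.newDocument();
        index.addField(doc, "name", "Spring Web Guide", true, true, true);
        index.addField(doc, "id", "P-1001", false, true, true);
        index.addField(doc, "imagePath", "/img/p1001.png", false, false, true);
        index.addField(doc, "description", "An introduction to web development", true, true, false);

        System.out.println(index.search("name", "web"));                 // [0]
        System.out.println(index.search("id", "P-1001"));                // [0]
        System.out.println(index.search("imagePath", "/img/p1001.png")); // []
        System.out.println(index.stored(doc, "imagePath"));              // /img/p1001.png
        System.out.println(index.stored(doc, "description"));            // null
    }
}
```

The image path can be displayed but never found by a search, while the description can be found but not displayed, which mirrors the product examples in the list above.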

2.2 Common Field types

[Figure 1.png: table of the common Field types (image not included)]

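3. Third-party Chinese analyzer: IK-analyzer

The default analyzer that ships with Lucene does not meet the needs of Chinese word segmentation (StandardAnalyzer splits Chinese text into single characters).

A third-party Chinese analyzer, IK-analyzer, is therefore used.

Usage:

IKAnalyzer extends Lucene's abstract Analyzer class, so it is used in the same way as Lucene's built-in analyzers: replace the Analyzer in the test code with IKAnalyzer to see how it segments Chinese text.

An extension dictionary and a stop-word dictionary can be configured.

If the Chinese analyzer ik-analyzer is used, the indexing code and the search code must both use it, so that the two sides produce the same terms.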
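
Why the two sides must agree can be sketched with two stand-in analyzers. This is a toy illustration, not Lucene or IK code: perCharacter and perWord are hypothetical helpers, and whitespace splitting stands in for real dictionary-based segmentation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

/** Toy illustration of index-time vs. search-time analysis; not Lucene or IK code. */
public class AnalyzerMismatchSketch {

    /** Hypothetical analyzer: one token per non-whitespace character. */
    static List<String> perCharacter(String text) {
        List<String> tokens = new ArrayList<>();
        for (char c : text.toCharArray()) {
            if (!Character.isWhitespace(c)) {
                tokens.add(String.valueOf(c));
            }
        }
        return tokens;
    }

    /** Hypothetical analyzer: one token per whitespace-separated word. */
    static List<String> perWord(String text) {
        return Arrays.asList(text.trim().split("\\s+"));
    }

    /** Index time: the analyzer decides which terms exist in the index. */
    static Set<String> index(String text, Function<String, List<String>> analyzer) {
        return new HashSet<>(analyzer.apply(text));
    }

    /** Search time: every token of the analyzed query must be an indexed term. */
    static boolean matches(Set<String> indexedTerms, String query,
                           Function<String, List<String>> analyzer) {
        return indexedTerms.containsAll(analyzer.apply(query));
    }

    public static void main(String[] args) {
        Set<String> terms = index("高富帥 二維表", AnalyzerMismatchSketch::perWord);
        // Same analyzer on both sides: the query term exists in the index.
        System.out.println(matches(terms, "高富帥", AnalyzerMismatchSketch::perWord));      // true
        // Different analyzer at search time: single characters were never indexed.
        System.out.println(matches(terms, "高富帥", AnalyzerMismatchSketch::perCharacter)); // false
    }
}
```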


1. Add the IK-analyzer jar file
2. Copy IK-analyzer's three configuration files onto the classpath
3. Use the analyzer


Configuration files:
IKAnalyzer.cfg.xml:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer extension configuration</comment>
    <!-- Configure your own extension dictionary here -->
    <entry key="ext_dict">mydict.dic;</entry>

    <!-- Configure your own extension stop-word dictionary here -->
    <entry key="ext_stopwords">ext_stopword.dic</entry>
</properties>

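ext.dic: extension dictionary (the file name must match the ext_dict entry in IKAnalyzer.cfg.xml, which is mydict.dic in the configuration above)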
    高富帥
    二維表

stopword.dic: stop-word dictionary (the file name must match the ext_stopwords entry in IKAnalyzer.cfg.xml)
    我
    是
    用
    的
    二
    維
    表
    來
    a
    an

------------------------------------------

Usage:
     Analyzer analyzer = new IKAnalyzer();

4. Index maintenance

Here the code that obtains the IndexWriter is extracted into a helper method:

// Additional import for the Chinese analyzer
import org.wltea.analyzer.lucene.IKAnalyzer;

private IndexWriter getIndexWriter() throws IOException {
    // Index directory
    Directory directory = FSDirectory.open(new File("D:/lucence/index3"));
    // Analyzer: use the Chinese analyzer IKAnalyzer instead of the default
    //Analyzer analyzer = new StandardAnalyzer();
    Analyzer analyzer = new IKAnalyzer();
    IndexWriterConfig config = new IndexWriterConfig(Version.LATEST, analyzer);
    // The caller is responsible for closing the returned IndexWriter
    return new IndexWriter(directory, config);
}

4.1 Adding to the index

See the getting-started program above: call indexWriter.addDocument(document);

4.2 Deleting the entire index

@Test
public void testDeleteAllIndex() throws IOException {
    IndexWriter indexWriter = getIndexWriter();
    indexWriter.deleteAll();
    indexWriter.close();
}

4.3 Deleting documents by field term

@Test
public void testDeleteIndex() throws IOException {
    IndexWriter indexWriter = getIndexWriter();
    indexWriter.deleteDocuments(new Term("fileName","web"));
    indexWriter.close();
}

4.4 Updating the index

@Test
public void testUpdateIndex() throws IOException {
    IndexWriter indexWriter = getIndexWriter();
    //build update document
    Document document = new Document();
    document.add(new TextField("fileName","testUpdate",Field.Store.YES));

    //update
    indexWriter.updateDocument(new Term("fileName","welcome"),document);
    //close indexWriter
    indexWriter.close();
}
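
In Lucene, updateDocument first deletes every document that contains the given term and then adds the new document, so fields of the old document are not carried over. The toy model below sketches these semantics. It is not Lucene code: the class UpdateSemanticsSketch, its methods and the sample file names are hypothetical, and term matching is simplified to lower-casing and splitting on non-word characters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Toy model of delete-by-term and update-by-term; not Lucene code. */
public class UpdateSemanticsSketch {
    private final List<Map<String, String>> documents = new ArrayList<>();

    public void addDocument(Map<String, String> document) {
        documents.add(document);
    }

    /** Removes every document whose field contains the term; returns how many were removed. */
    public int deleteDocuments(String field, String term) {
        int before = documents.size();
        documents.removeIf(document -> containsTerm(document.get(field), term));
        return before - documents.size();
    }

    /** Update = delete all matches, then add the replacement as a brand-new document. */
    public void updateDocument(String field, String term, Map<String, String> replacement) {
        deleteDocuments(field, term);
        addDocument(replacement);
    }

    public List<Map<String, String>> documents() {
        return documents;
    }

    // Simplified stand-in for analysis: lower-case the value and split on non-word characters.
    private static boolean containsTerm(String value, String term) {
        return value != null
                && Arrays.asList(value.toLowerCase(Locale.ROOT).split("\\W+")).contains(term);
    }

    /** Builds a sample document; fileSize may be null to leave the field out. */
    static Map<String, String> doc(String fileName, String fileSize) {
        Map<String, String> document = new HashMap<>();
        document.put("fileName", fileName);
        if (fileSize != null) {
            document.put("fileSize", fileSize);
        }
        return document;
    }

    public static void main(String[] args) {
        UpdateSemanticsSketch index = new UpdateSemanticsSketch();
        index.addDocument(doc("welcome notes", "12"));
        index.addDocument(doc("web notes", "34"));

        index.updateDocument("fileName", "welcome", doc("testUpdate", null));

        System.out.println(index.documents().size());                 // 2
        System.out.println(index.documents().get(1).get("fileName")); // testUpdate
        System.out.println(index.documents().get(1).get("fileSize")); // null
    }
}
```

If no document matches the term, nothing is deleted and the replacement is still added, so an update can also grow the index.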
