Clustering and Ranking (CaR): Code Framework Walkthrough

Paper and Repository

Paper link
Official code

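Core Contributions

Main contribution: experiments on the Alpaca dataset produce a sampled subset for training. Training on this subset saves compute and time while still achieving better results.
Advantage: scoring is done with a small model (IQS), which is very fast to train and is tailored to the task.
Drawback: training IQS requires annotated data. Scoring with a larger model such as GPT-4 would need no annotation, but there is a catch: if your SFT data was itself generated by a sufficiently strong LLM (for example GPT-4), scoring it with that same model may not be very accurate.
Limitation: the cluster and rank steps each introduce a small model, an embedding model and IQS respectively, so the ceiling of these small models is the ceiling of the whole method.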

Main Steps

  1. Score the SFT dataset with a scoring model (score), then sort the samples by score (rank).
  2. Cluster the SFT data into N clusters and pick num samples from each cluster.

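Code Framework Walkthrough

I. Scoring and Ranking

  1. Fine-tune with the COMET project on the annotated dataset ([source of the original annotated data]) to obtain an IQS model (download link).

  2. Run inference with the IQS model to obtain a score for each sample.

  3. Download link for the original dataset used to train IQS: Expert Revision Dataset. The annotated data has the format below. Distance is the edit distance between the data before and after revision (Instruction+Input+Response); pairs whose distance is too small are dropped using a threshold of roughly 200, which leaves 1211 samples.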

    ```json
    {
            "Raw Instruction": "Select two of the given topics and write a paragraph discussing the similarities between them.",
            "Raw Input": "The Internet of Things, Artificial Intelligence, and Blockchain Technology",
            "Raw Response": "The Internet of Things (IoT) and Artificial Intelligence (AI) both involve the use of machines to collect and process data. Both technologies rely heavily on automation and machine learning in order to operate efficiently and accurately. Similarly, both technologies have the potential to improve existing processes, automate tedious tasks, and create new opportunities. Furthermore, both technologies require a large amount of data to be collected and analyzed to become truly intelligent and efficient.\\n\\nSimilarly, blockchain technology and AI both involve the use of computers to store and process data. This data can be used to create new applications and services, or to improve existing ones. Additionally, both technologies are designed to be secure, decentralized, and resilient, making them suitable for applications across a variety of fields.",
            "Revised Instruction": "Select two of the given topics and write a paragraph discussing the similarities between them.",
            "Revised Input": "The Internet of Things, Artificial Intelligence, and Blockchain Technology",
            "Revised Response": "Here is a paragraph discussing the similarities between Artificial Intelligence and Blockchain Technology:\\n\\nArtificial Intelligence (AI) and Blockchain Technology are two emerging fields that have the potential to transform various industries and sectors. Both of them rely on data and algorithms to perform complex tasks and solve problems. AI uses data to train machines to learn from experience and perform human-like functions, such as speech recognition, image analysis, and natural language processing. Blockchain Technology uses data to create a distributed ledger that records transactions and ensures their security, transparency, and immutability. Both of them can enhance efficiency, accuracy, and innovation in various domains, such as healthcare, finance, education, and governance.",
            "Distance": 640
    }
    ```
    
  4. The dataset used for COMET fine-tuning has the format below (data path: Ranking/data/expert-revised-comet):

    {
        "src": "Instruction: Given natural language sentences, identify the relations between entities. Input: Sharad and Sayansh are brothers. ",
        "pos": "Response: The relation between Sharad and Sayansh is brothers.",
        "neg": "Response: Sharad and Sayansh: Relation - Brothers"
    }
    
    
  5. The dataset used for IQS fine-tuning has the format below (binary labels: 0 for the text before revision, 1 for the text after revision; data path: Ranking/data/expert-revised):

    {"src": "Instruction: List four countries in Africa Response: Egypt, South Africa, Nigeria, Morocco.", 
    "score": 0.0}
    {"src": "Instruction: List four countries in Africa Response: Here are four countries in Africa: Egypt, South Africa, Nigeria, Morocco.", 
    "score": 1.0}
    
    
  6. Train the IQS model: change into the Ranking directory and run the following command in a terminal:

    python -m comet.cli.train --cfg configs/models/instruction_score.yaml --early_stopping configs/early_stopping.yaml --model_checkpoint configs/model_checkpoint.yaml --trainer configs/trainer.yaml --instruction_metric configs/models/instruction_score.yaml
    
    

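    Note: you may occasionally see an error such as “No action for key "amp_backend" to check its value”. It occurs because the arguments accepted by Trainer changed between pytorch_lightning versions. A likely fix is to remove the offending key from configs/trainer.yaml or to install the pytorch_lightning version the repository expects.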
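
The data preparation described in steps 3 to 5 can be sketched roughly as follows. This is a minimal illustration, not the repository's own code: the function names (`edit_distance`, `record_distance`, `filter_by_distance`, `to_comet_triplet`, `to_scored_pairs`) are hypothetical, the edit distance is assumed to be character-level Levenshtein over the plainly concatenated fields, the `>=` comparison against the threshold is a guess, and the exact whitespace of the `src` strings is inferred from the examples above.

```python
def edit_distance(a: str, b: str) -> int:
    """Character-level Levenshtein distance (assumed metric)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def record_distance(rec: dict) -> int:
    """Distance between Instruction+Input+Response before and after revision."""
    raw = rec["Raw Instruction"] + rec["Raw Input"] + rec["Raw Response"]
    revised = (rec["Revised Instruction"] + rec["Revised Input"]
               + rec["Revised Response"])
    return edit_distance(raw, revised)


def filter_by_distance(records: list, threshold: int = 200) -> list:
    """Drop pairs that were barely changed (threshold of about 200 per the text)."""
    return [r for r in records if r["Distance"] >= threshold]


def format_prompt(instruction: str, input_text: str) -> str:
    """Build the 'Instruction: ... Input: ...' prefix; Input is omitted if empty."""
    prompt = f"Instruction: {instruction}"
    if input_text:
        prompt += f" Input: {input_text}"
    return prompt


def to_comet_triplet(rec: dict) -> dict:
    """Step 4 format: revised response as 'pos', raw response as 'neg' (assumed)."""
    return {
        "src": format_prompt(rec["Revised Instruction"], rec["Revised Input"]),
        "pos": f"Response: {rec['Revised Response']}",
        "neg": f"Response: {rec['Raw Response']}",
    }


def to_scored_pairs(rec: dict) -> list:
    """Step 5 format: score 0.0 before revision, 1.0 after revision."""
    raw_src = (format_prompt(rec["Raw Instruction"], rec["Raw Input"])
               + f" Response: {rec['Raw Response']}")
    revised_src = (format_prompt(rec["Revised Instruction"], rec["Revised Input"])
                   + f" Response: {rec['Revised Response']}")
    return [{"src": raw_src, "score": 0.0}, {"src": revised_src, "score": 1.0}]
```

Each retained record yields one triplet for COMET fine-tuning and two scored lines for IQS fine-tuning, so the two data directories can be produced from the same filtered list.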

II. Clustering

  1. Vectorize the data with an embedding model.
  2. Reduce the dimensionality of the sentence vectors with PCA.
  3. Cluster the reduced vectors. For Alpaca this gives 161 clusters, from sqrt(52k/2) ≈ 161.
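
The three steps above can be sketched with scikit-learn as follows. This is only an illustration under several assumptions: random vectors stand in for real sentence embeddings, k-means is assumed as the clustering algorithm (the text only says "clustering"), the PCA output size of 8 is arbitrary, and the helper names `rule_of_thumb_k` and `cluster_embeddings` are hypothetical.

```python
import math

import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA


def rule_of_thumb_k(n_samples: int) -> int:
    """Number of clusters from the sqrt(n/2) rule mentioned in the text."""
    return max(1, round(math.sqrt(n_samples / 2)))


def cluster_embeddings(embeddings: np.ndarray, n_components: int,
                       n_clusters: int, seed: int = 0) -> np.ndarray:
    """Reduce dimensionality with PCA, then assign a cluster label per sample."""
    reduced = PCA(n_components=n_components, random_state=seed).fit_transform(embeddings)
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return kmeans.fit_predict(reduced)


# Alpaca has roughly 52k samples, which gives 161 clusters.
alpaca_k = rule_of_thumb_k(52_000)

# Small stand-in for real embeddings: 200 samples with 32 dimensions each.
rng = np.random.default_rng(0)
fake_embeddings = rng.normal(size=(200, 32))
demo_k = rule_of_thumb_k(len(fake_embeddings))
labels = cluster_embeddings(fake_embeddings, n_components=8, n_clusters=demo_k)
```

In a real run, `fake_embeddings` would be replaced by the output of the embedding model from step 1, with one row per SFT sample.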

III. Data Selection

  1. First take the top 1000 samples by score after ranking.
  2. Then, using the clustering result, take the n highest-scoring samples from each cluster.
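
A minimal sketch of this selection, assuming the two picks are merged as a de-duplicated union (the text does not say how overlaps are handled). The function name `select_subset` and its parameters are hypothetical and not taken from the repository.

```python
from collections import defaultdict


def select_subset(scores, labels, top_k=1000, per_cluster=1):
    """Return sample indices: global top_k by score plus the best per_cluster of each cluster."""
    # Indices sorted from highest to lowest score (the "rank" step).
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    # Step 1: the globally best top_k samples.
    selected = set(order[:top_k])

    # Step 2: walk the ranking and keep the first per_cluster hits of every cluster.
    taken = defaultdict(int)
    for i in order:
        if taken[labels[i]] < per_cluster:
            selected.add(i)
            taken[labels[i]] += 1

    # Assumed merge rule: union of both picks, reported in score order.
    return sorted(selected, key=lambda i: scores[i], reverse=True)


# Toy example: five samples in three clusters.
demo_scores = [0.9, 0.8, 0.1, 0.7, 0.2]
demo_labels = [0, 0, 1, 0, 2]
demo_selection = select_subset(demo_scores, demo_labels, top_k=2, per_cluster=1)
# demo_selection is [0, 1, 4, 2]: the top two overall, plus the best of clusters 2 and 1.
```

The per-cluster pass is what keeps low-scoring but distinct clusters represented, which a pure top-k cut would miss.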