Counting the Number of Keys

Problem Description

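A table contains many rows of data, and the first column of each row is an ID number. How can we tell whether that ID is the table's primary key, that is, whether every value in the column is unique?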

Solution Approach

wc -l file
awk -F"," '{print $1}' file | sort | uniq | wc -l

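Compare the two line counts reported by wc. If they are equal, every ID in the first column is unique and the column can serve as the primary key; if the second count is smaller, some IDs are repeated.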
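
The comparison can also be scripted. The following is a minimal sketch, assuming a comma-separated file as in the commands above; the function name is_unique_key is hypothetical, and sort -u is used as a shorthand for sort | uniq.

```shell
#!/bin/sh
# is_unique_key FILE (hypothetical helper, not from the original post):
# succeeds when the first comma-separated field is distinct on every line.
is_unique_key() {
    file=$1
    # Total number of rows in the file.
    total=$(wc -l < "$file")
    # Number of distinct values in the first column.
    distinct=$(awk -F"," '{print $1}' "$file" | sort -u | wc -l)
    # Equal counts mean no ID appears more than once.
    [ "$total" -eq "$distinct" ]
}
```

It could be used as `is_unique_key file && echo "unique" || echo "duplicates found"`. Note that wc -l counts newline characters, so a file whose last line has no trailing newline reports one line fewer than it contains.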

Notes

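  1. Why is sort needed?
  • Answer: uniq only compares adjacent lines when it looks for duplicates, and duplicate values are often not next to each other. Sorting first brings identical lines together so that uniq can collapse them.
  2. Example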
[root@localhost ~]# cat uniqtest    # test file
this is a test  
this is a test  
this is a test  
i am tank  
i love tank  
i love tank  
this is a test  
whom have a try  
WhoM have a try  
you  have a try  
i want to abroad  
those are good men  
we are good men  

[zhangy@BlackGhost mytest]$ uniq -c uniqtest    # uniq only compares adjacent lines, so non-adjacent duplicates are counted separately
 3 this is a test
 1 i am tank
 2 i love tank
 1 this is a test          # same content as the first line, but counted separately
 1 whom have a try
 1 WhoM have a try
 1 you  have a try
 1 i want to abroad
 1 those are good men
 1 we are good men

[zhangy@BlackGhost mytest]$ sort uniqtest |uniq -c      # sorting first solves the problem seen in the previous example
 1 WhoM have a try  
 1 i am tank  
 2 i love tank  
 1 i want to abroad  
 4 this is a test  
 1 those are good men  
 1 we are good men  
 1 whom have a try  
 1 you  have a try  
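
When the two counts differ, it is usually helpful to see which IDs are repeated. The sketch below builds on the same sort-then-uniq idea and relies on uniq -d, which prints one copy of each line that occurs more than once. The function name duplicate_keys is hypothetical, and the comma delimiter is assumed from the commands earlier in the post.

```shell
#!/bin/sh
# duplicate_keys FILE (hypothetical helper, not from the original post):
# prints each first-column value that appears on more than one line.
duplicate_keys() {
    # Extract the ID column, sort so equal IDs become adjacent,
    # then let uniq -d keep only the IDs that repeat.
    awk -F"," '{print $1}' "$1" | sort | uniq -d
}
```

Empty output from `duplicate_keys file` means the first column contains no repeated values.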