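NLTK, the subject of this section, is a very popular toolkit for natural language processing in the Python interpreter environment. For its users it works like an extremely efficient linguist, quickly carrying out deep processing and analysis of natural language text.
Without natural language processing techniques, there seem to be few ways to process and analyze text beyond the bag-of-words method covered in 3.1.1.1.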

Vectorizing the example text with the bag-of-words method

sent1='The cat is walking in the bedroom.'
sent2='A dog was running across the kitchen.'

from sklearn.feature_extraction.text import CountVectorizer
count_vec=CountVectorizer()

sentences=[sent1,sent2]
# Print the vectorized representation of the two sentences
print(count_vec.fit_transform(sentences).toarray())

[[0 1 1 0 1 1 0 0 2 1 0]
 [1 0 0 1 0 0 1 1 1 0 1]]

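# Print the feature (word) that each dimension of the vector stands for
# (get_feature_names() was removed in scikit-learn 1.2; get_feature_names_out() replaces it)
print(count_vec.get_feature_names_out().tolist())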

['across', 'bedroom', 'cat', 'dog', 'in', 'is', 'kitchen', 'running', 'the', 'walking', 'was']
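
To make the meaning of the two count vectors concrete, the following is a minimal pure-Python sketch of the bag-of-words idea. It is not scikit-learn's implementation; it only assumes CountVectorizer's documented defaults (lowercasing, and tokens of two or more word characters). The helper name `simple_tokens` is hypothetical.

```python
import re
from collections import Counter

sent1 = 'The cat is walking in the bedroom.'
sent2 = 'A dog was running across the kitchen.'

def simple_tokens(text):
    # Hypothetical helper: lowercase the text and keep runs of two or more
    # word characters, so the one-letter word "A" never becomes a feature.
    return re.findall(r'\b\w\w+\b', text.lower())

sentences = [sent1, sent2]

# The vocabulary is the sorted set of all tokens seen in any sentence.
vocabulary = sorted({tok for s in sentences for tok in simple_tokens(s)})

# Each sentence becomes a vector of counts, one position per vocabulary word.
vectors = []
for s in sentences:
    counts = Counter(simple_tokens(s))
    vectors.append([counts[w] for w in vocabulary])

print(vocabulary)
print(vectors)
```

The 2 in the first vector is the word "the", which occurs twice in the first sentence once case is ignored. Word order and grammar are lost entirely, which is the limitation the NLTK analysis addresses.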

Linguistic analysis of the example text with NLTK

import nltk
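# Download the tokenizer data that word_tokenize relies on.
# If NLTK raises a LookupError naming a different resource, download the name shown in that message.
nltk.download('punkt')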

# Tokenize and normalize each sentence; some forms are split further, e.g. aren't into are and n't, or I'm into I and 'm.
tokens_1=nltk.word_tokenize(sent1)
print(tokens_1)

['The', 'cat', 'is', 'walking', 'in', 'the', 'bedroom', '.']

tokens_2=nltk.word_tokenize(sent2)
print(tokens_2)

['A', 'dog', 'was', 'running', 'across', 'the', 'kitchen', '.']
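
Neither example sentence contains a contraction, so the splitting rule mentioned in the comment is not visible in the output above. The sketch below imitates that rule with a single regular expression. It is a toy illustration, not how `nltk.word_tokenize` is implemented, and `toy_tokenize` is a hypothetical name.

```python
import re

# Order matters: first a word that is directly followed by "n't", then "n't"
# itself, then clitics such as 'm or 's, then plain words, then punctuation.
TOKEN_RE = re.compile(r"\w+(?=n't)|n't|'\w+|\w+|[^\w\s]")

def toy_tokenize(text):
    # Hypothetical helper: return every non-overlapping match, left to right.
    return TOKEN_RE.findall(text)

contraction_tokens = toy_tokenize("They aren't sure I'm right.")
plain_tokens = toy_tokenize('The cat is walking in the bedroom.')

print(contraction_tokens)
print(plain_tokens)
```

For the plain sentence the toy tokenizer gives the same tokens as the NLTK output shown above; real text has many more edge cases (abbreviations, numbers, quotes) that a trained tokenizer handles and this pattern does not.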

# Build the vocabulary of each sentence and print it in ASCII sort order
vocab_1=sorted(set(tokens_1))
print(vocab_1)

['.', 'The', 'bedroom', 'cat', 'in', 'is', 'the', 'walking']

vocab_2=sorted(set(tokens_2))
print(vocab_2)

['.', 'A', 'across', 'dog', 'kitchen', 'running', 'the', 'was']
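
Because the vocabularies are plain Python sets before sorting, ordinary set operations can compare the two sentences. The sketch below hard-codes the token lists printed earlier so that it runs without NLTK; the variable names `shared`, `combined`, and `vocab_1_lower` are illustrative only.

```python
# Token lists copied from the word_tokenize output shown earlier.
tokens_1 = ['The', 'cat', 'is', 'walking', 'in', 'the', 'bedroom', '.']
tokens_2 = ['A', 'dog', 'was', 'running', 'across', 'the', 'kitchen', '.']

# Tokens that appear in both sentences.
shared = sorted(set(tokens_1) & set(tokens_2))

# Vocabulary of the two sentences together. In ASCII order punctuation
# comes first, then uppercase letters, then lowercase letters.
combined = sorted(set(tokens_1) | set(tokens_2))

# Lowercasing first merges 'The' and 'the' into a single entry.
vocab_1_lower = sorted({t.lower() for t in tokens_1})

print(shared)
print(combined)
print(vocab_1_lower)
```

Note that 'The' and 'the' count as different entries unless the tokens are lowercased first, which is one reason normalization usually precedes vocabulary building.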

# Initialize the stemmer and reduce each token to its stem (root form)
stemmer=nltk.stem.PorterStemmer()
stem_1=[stemmer.stem(t) for t in tokens_1]
print(stem_1)

['the', 'cat', 'is', 'walk', 'in', 'the', 'bedroom', '.']

stem_2=[stemmer.stem(t) for t in tokens_2]
print(stem_2)

['A', 'dog', 'wa', 'run', 'across', 'the', 'kitchen', '.']
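
The stem 'wa' for 'was' looks odd until one recalls that a stemmer strips suffixes by rule and has no dictionary. The sketch below is a heavily simplified imitation of two early Porter-style rules (plural-like endings and "-ing"). It is an assumption-laden toy, not NLTK's PorterStemmer; `toy_stem` is a hypothetical name, and the input is assumed to be lowercase already.

```python
VOWELS = set('aeiou')

def toy_stem(word):
    # Hypothetical helper; very short words are left untouched.
    if len(word) <= 2:
        return word
    # Rule 1: plural-like endings. A trailing "s" is dropped unless
    # the word ends in "ss", which is why "was" turns into "wa".
    if word.endswith('sses') or word.endswith('ies'):
        word = word[:-2]
    elif word.endswith('ss'):
        pass
    elif word.endswith('s'):
        word = word[:-1]
    # Rule 2: drop "-ing" when the remaining part contains a vowel,
    # then undo a doubled final consonant ("runn" becomes "run").
    if word.endswith('ing') and any(c in VOWELS for c in word[:-3]):
        word = word[:-3]
        if (len(word) >= 2 and word[-1] == word[-2]
                and word[-1] not in VOWELS and word[-1] not in 'lsz'):
            word = word[:-1]
    return word

toy_stems = [toy_stem(w) for w in ['walking', 'running', 'was', 'across', 'kitchen']]
print(toy_stems)
```

A stem is therefore not guaranteed to be a real word. When a dictionary form such as 'be' for 'was' is needed, lemmatization is the usual alternative to stemming.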

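# Download the model used by the part-of-speech tagger.
# If NLTK raises a LookupError naming a different resource, download the name shown in that message.
nltk.download('averaged_perceptron_tagger')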

# Tag each token with its part of speech
pos_tag_1=nltk.tag.pos_tag(tokens_1)
print(pos_tag_1)

[('The', 'DT'), ('cat', 'NN'), ('is', 'VBZ'), ('walking', 'VBG'), ('in', 'IN'), ('the', 'DT'), ('bedroom', 'NN'), ('.', '.')]

pos_tag_2=nltk.tag.pos_tag(tokens_2)
print(pos_tag_2)

[('A', 'DT'), ('dog', 'NN'), ('was', 'VBD'), ('running', 'VBG'), ('across', 'IN'), ('the', 'DT'), ('kitchen', 'NN'), ('.', '.')]
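
The tags come from the Penn Treebank tag set: DT is a determiner, NN a singular noun, IN a preposition, and VBZ, VBD and VBG are verb forms (third-person present, past tense, and present participle). As one possible use of the tagged output, the sketch below picks out nouns and verbs by tag prefix. The tagged lists are copied from the output above so that the code runs without NLTK; `words_with_tag_prefix` is a hypothetical helper.

```python
# Tagged tokens copied from the pos_tag output shown above.
pos_tag_1 = [('The', 'DT'), ('cat', 'NN'), ('is', 'VBZ'), ('walking', 'VBG'),
             ('in', 'IN'), ('the', 'DT'), ('bedroom', 'NN'), ('.', '.')]
pos_tag_2 = [('A', 'DT'), ('dog', 'NN'), ('was', 'VBD'), ('running', 'VBG'),
             ('across', 'IN'), ('the', 'DT'), ('kitchen', 'NN'), ('.', '.')]

def words_with_tag_prefix(tagged, prefix):
    # Hypothetical helper: keep the words whose tag starts with the prefix,
    # so 'NN' also covers NNS/NNP and 'VB' covers VBZ/VBD/VBG.
    return [word for word, tag in tagged if tag.startswith(prefix)]

nouns = words_with_tag_prefix(pos_tag_1 + pos_tag_2, 'NN')
verbs = words_with_tag_prefix(pos_tag_1 + pos_tag_2, 'VB')

print(nouns)
print(verbs)
```

Unlike the bag-of-words vectors, this view keeps track of which words act as nouns and which as verbs, which is the kind of information the word counts alone cannot provide.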

3.2.1 Natural Language Processing Toolkit (NLTK)
