wordcloud usage summary

While using wordcloud I ran into this error:

ValueError: We need at least 1 word to plot a word cloud, got 0.
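The error appeared when I used wordcloud to process Chinese text. Text that had first been segmented with jieba, however, caused no problem when passed to wordcloud.generate(). Printing the jieba output on its own showed that it was unicode, so I tried converting my content to unicode before passing it to generate(), and the error disappeared.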
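A likely explanation is that the word-matching regular expression finds nothing in undecoded bytes. The sketch below reproduces that effect with the standard library only. The pattern is assumed to resemble wordcloud's default word regex (this may differ between versions), and `to_text` is a hypothetical helper, not part of wordcloud.

```python
import re

def to_text(data, encoding="utf-8"):
    """Return data as a unicode string, decoding it first if it is bytes."""
    if isinstance(data, bytes):
        return data.decode(encoding)
    return data

# Assumed to be similar to wordcloud's default word pattern.
WORD_PATTERN = r"\w[\w']+"

raw = u"古天樂 郭富城".encode("utf-8")  # undecoded bytes, e.g. read from a file

# In a bytes pattern \w matches ASCII only, so no Chinese word is found.
words_from_bytes = re.findall(WORD_PATTERN.encode("ascii"), raw)

# After decoding, \w matches CJK characters and both names are found.
words_from_text = re.findall(WORD_PATTERN, to_text(raw))

print(words_from_bytes)  # []
print(words_from_text)   # ['古天樂', '郭富城']
```

With zero words found, there is nothing to plot, which matches the "got 0" in the error message.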

Here is the code:

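# -*- coding: utf-8 -*-
from os import path

import jieba  # not used below; needed only when segmenting raw text
import numpy as np
from PIL import Image
from wordcloud import WordCloud

# Directory containing this script
d = path.dirname(__file__)

# Unicode strings; converting to unicode made the ValueError above go away
words = [
    u'古天樂',
    u'郭富城',
    u'劉德華',
    u'周杰倫',
]

# Loaded here but not passed to WordCloud, so the output is rectangular
mask = np.array(Image.open(path.join(d, "xxx.jpg")))

# A font that contains Chinese glyphs is needed to render Chinese words
font = r'C:\Windows\Fonts\simfang.ttf'
wordcloud = WordCloud(font_path=font, width=500, height=600, margin=5,
                      background_color="white").generate(" ".join(words))
wordcloud.to_file(path.join(d, "sb_mask.png"))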

The result looks like this:
sb_mask.png

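By default the output is a rectangular image. You can also pick a background image of your own so that the cloud takes the shape drawn in it. Note that everything in the background image outside the desired shape must be pure white (255, 255, 255).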

mask = np.array(Image.open(path.join(d, "xxx.jpg")))

Pass mask to WordCloud through its mask parameter, and the cloud will be drawn in the shape of the mask.
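As a minimal sketch of that step, the function below loads a mask image and hands it to WordCloud. The name `make_shaped_cloud` and its arguments are hypothetical. The imports sit inside the function only so that the sketch can be loaded without the packages installed.

```python
def make_shaped_cloud(text, mask_path, font_path, out_path):
    """Render text as a word cloud shaped by the image at mask_path."""
    # Imported here so this sketch can be loaded without the packages installed.
    import numpy as np
    from PIL import Image
    from wordcloud import WordCloud

    # Pure white (255, 255, 255) pixels mark the area where no words are drawn.
    mask = np.array(Image.open(mask_path))

    cloud = WordCloud(
        font_path=font_path,
        mask=mask,  # the output size follows the mask, not width/height
        margin=5,
        background_color="white",
    ).generate(text)
    cloud.to_file(out_path)
    return cloud
```

With the variables from the script above, a call could look like make_shaped_cloud(" ".join(words), path.join(d, "xxx.jpg"), font, path.join(d, "sb_mask.png")).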

wordcloud command-line parameters

--text
specify file of words to build the word cloud (default: stdin)

Default: -

--regexp
override the regular expression defining what constitutes a word

--stopwords
specify file of stopwords (containing one word per line) to remove from the given text after parsing

--imagefile
file the completed PNG image should be written to (default: stdout)

Default: -

--fontfile
path to font file you wish to use (default: DroidSansMono)

--mask
mask to use for the image form

--colormask
color mask to use for image coloring

--contour_width
if greater than 0, draw mask contour (default: 0)

Default: 0

--contour_color
use given color as mask contour color - accepts any value from PIL.ImageColor.getcolor

Default: “black”

--relative_scaling
scaling of words by frequency (0 - 1)

Default: 0

--margin
spacing to leave around words

Default: 2

--width
define output image width

Default: 400

--height
define output image height

Default: 200

--color
use given color as coloring for the image - accepts any value from PIL.ImageColor.getcolor

--background
use given color as background color for the image - accepts any value from PIL.ImageColor.getcolor

Default: “black”

--no_collocations
do not add collocations (bigrams) to word cloud (default: add unigrams and bigrams)

Default: True

--include_numbers
include numbers in wordcloud?

Default: False

--min_word_length
only include words with more than X letters

Default: 0

--prefer_horizontal
ratio of times to try horizontal fitting as opposed to vertical

Default: 0.9

--scale
scaling between computation and drawing

Default: 1

--colormap
matplotlib colormap name

Default: “viridis”

--mode
use RGB or RGBA for transparent background

Default: “RGB”

--max_words
maximum number of words

Default: 200

--min_font_size
smallest font size to use

Default: 4

--max_font_size
maximum font size for the largest word

--font_step
step size for the font

Default: 1

--random_state
random seed

--no_normalize_plurals
whether to remove trailing ‘s’ from words

Default: True

--repeat
whether to repeat words and phrases

Default: False

--version
show program’s version number and exit
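To show how a few of these options combine, here is a small sketch that assembles a command line for the `wordcloud_cli` program that ships with the package. The helper name `build_wordcloud_command` and the file names are assumptions for illustration. The sketch only builds and prints the command; it does not run it.

```python
import shlex

def build_wordcloud_command(text_file, image_file, font_file=None,
                            mask_file=None, background="white", max_words=200):
    """Return the argument list for a wordcloud_cli invocation."""
    cmd = [
        "wordcloud_cli",
        "--text", text_file,
        "--imagefile", image_file,
        "--background", background,
        "--max_words", str(max_words),
    ]
    if font_file:
        # Needed for Chinese text, since the default font lacks CJK glyphs.
        cmd += ["--fontfile", font_file]
    if mask_file:
        cmd += ["--mask", mask_file]
    return cmd

# Hypothetical input and output file names.
cmd = build_wordcloud_command("words.txt", "cloud.png",
                              font_file="simfang.ttf", mask_file="xxx.jpg")
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) once the input files exist.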