Debugging and Testing DeepSpeech

Note: the environment is macOS 10.13.3 [MacBook Pro (Retina, 15-inch, Mid 2015)].

DeepSpeech is a speech recognition engine that Mozilla implemented with TensorFlow; see https://github.com/mozilla/DeepSpeech.

1) Create the project directory

mkdir deepspeech

cd deepspeech


2) Create a virtual environment

virtualenv env-deepspeech --system-site-packages


3) Activate the virtual environment

source env-deepspeech/bin/activate


4) Install deepspeech

pip install deepspeech

At this point you can run deepspeech -h to check that the environment works. When I ran it, I got the following error:

RuntimeError: module compiled against API version 0xc but this version of numpy is 0x9

Upgrading numpy is enough to fix it:

pip install --upgrade numpy


5) Download the pre-trained model and extract it

wget https://github.com/mozilla/DeepSpeech/releases/download/v0.1.1/deepspeech-0.1.1-models.tar.gz

tar xvzf deepspeech-0.1.1-models.tar.gz

This creates a models folder in the current directory, which holds the model trained by the DeepSpeech project:

-rw-r--r--  1 none  staff         329 11 18 03:25 alphabet.txt

-rw-r--r--  1 none  staff  1601028778 11 18 03:25 lm.binary

-rw-r--r--  1 none  staff   490978889  1 17 22:09 output_graph.pb

-rw-r--r--  1 none  staff    43550345 11 18 03:25 trie
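
Because the archive is large, it can be worth confirming that extraction finished before moving on. The following is a minimal sketch that checks for the four file names shown in the listing; the helper name missing_model_files and the default folder name "models" are assumptions made for illustration.

```python
from pathlib import Path

# File names taken from the directory listing of the models folder.
EXPECTED_FILES = ("alphabet.txt", "lm.binary", "output_graph.pb", "trie")


def missing_model_files(models_dir):
    """Return the expected model files that are absent or empty in models_dir."""
    root = Path(models_dir)
    missing = []
    for name in EXPECTED_FILES:
        path = root / name
        # A zero-byte file usually means an interrupted download or extraction.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    absent = missing_model_files("models")
    if absent:
        print("missing: " + ", ".join(absent))
    else:
        print("all model files present")
```

An empty list means all four files exist and are non-empty; it says nothing about whether their contents are valid.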


6) Prepare a 16 kHz, 16-bit, mono WAV file

I converted an MP3 from my daughter's English listening exercises into a 16 kHz, 16-bit, mono file and ran deepspeech on it:

deepspeech models/output_graph.pb models/alphabet.txt models/lm.binary models/trie test.wav
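
The input format matters here, so a quick check of the WAV header before running the command can rule out one class of problems. This is a minimal sketch using Python's standard wave module; the function name wav_format_problems is hypothetical, and the defaults simply restate the 16 kHz, 16-bit, mono requirement from this step.

```python
import wave


def wav_format_problems(path, rate=16000, sample_width_bytes=2, channels=1):
    """Return a list of mismatches between a WAV file and the expected format."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != rate:
            problems.append(
                f"sample rate is {wav.getframerate()} Hz, expected {rate} Hz"
            )
        # getsampwidth() reports bytes per sample, so 16-bit audio is 2.
        if wav.getsampwidth() != sample_width_bytes:
            problems.append(
                f"sample width is {wav.getsampwidth() * 8} bits, "
                f"expected {sample_width_bytes * 8} bits"
            )
        if wav.getnchannels() != channels:
            problems.append(
                f"channel count is {wav.getnchannels()}, expected {channels}"
            )
    return problems
```

Calling wav_format_problems("test.wav") returns an empty list when the file matches the expected format, and one message per mismatch otherwise.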

The deepspeech command fails and exits with the following error:

libc++abi.dylib: terminating with uncaught exception of type lm::FormatLoadException: native_client/kenlm/lm/read_arpa.cc:65 in void lm::ReadARPACounts(util::FilePiece &, std::vector &) threw FormatLoadException.

first non-empty line was "1414678853" not \data\. Byte: 11
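
The number quoted in the error message is worth a closer look. As a small sketch, the snippet below treats it as a 4-byte big-endian integer and prints the bytes; the function name decode_magic is hypothetical.

```python
def decode_magic(number_text):
    """Interpret a decimal string as a 4-byte big-endian integer and return its bytes."""
    return int(number_text).to_bytes(4, byteorder="big")


print(decode_magic("1414678853"))  # b'TRIE'
```

The value decodes to the ASCII bytes TRIE, which hints that the language-model loader may have been handed the trie file rather than lm.binary, for example if the pip-installed client expects its arguments in a different order from the one used here. That is an inference from the number alone and is not confirmed anywhere in this walkthrough.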

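I suspected the deepspeech version installed through pip was at fault, so I cloned the source repository and obtained a native client from it (the taskcluster.py script fetches a prebuilt binary rather than compiling locally):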

git clone https://github.com/mozilla/DeepSpeech

cd DeepSpeech/

python util/taskcluster.py --arch osx --target .

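A deepspeech binary now appears in the directory, which shows that the step succeeded. Copy deepspeech into the original project directory and run it again: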

./deepspeech models/output_graph.pb models/alphabet.txt models/lm.binary models/trie test.wav

The recognition result is as follows:

he aihtbebyureunittwoo smell and taste lisenanti one the water melon is big and round to here are two parts on the tall tree lets get them three taste these grapes are the nice for what a nice lemon it smells good five what would you like id like some oranges six what of those they are strawverieslisenancircle one is it lemenjuce or oringtuce to taste these grapes or they taste three taste what is it here are some strowberies for you five or those pairs sweet or sour six what would you like watermolanjuceororangejucepage nine beat yo year listen choose and complete one its nice here lets ever pigknack what do you have jo guess they are round they are or range they smell nice what are the kitty oh they are oringers i like sweetorringers here you are thank you to we do you have alice close your eyes now taste it oudesitase its soar is it a lemon yes youre right it sour but nice we can make some lemenjece three im thirsty what do you have beat look a big water melon wow thats great i like what i mean for what do you have in your pack kitty at them are the grape so strawberies there small and round their grapes i think yes they are grapes do you like grapes pen no i like stroperies five what do you like del uatamolans or apples guess they are sweet big and round a pulse no youre wrong i like water melons

Frankly, the recognition accuracy is still rather low, and the run takes a very long time:

real 4m31.706s

user 8m35.168s

sys 0m14.004s
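
To put these three figures in relation to each other, the sketch below converts them to seconds and compares total CPU time (user plus sys) with wall-clock time. The helper name to_seconds is hypothetical, and the parser only handles the "NmS.SSSs" form printed above.

```python
import re


def to_seconds(stamp):
    """Convert a shell `time` value such as '4m31.706s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


real = to_seconds("4m31.706s")
cpu = to_seconds("8m35.168s") + to_seconds("0m14.004s")
print(f"wall clock: {real:.1f} s, CPU time: {cpu:.1f} s, ratio: {cpu / real:.2f}")
```

The ratio comes out at roughly 1.95, which suggests that about two cores were busy on average during the run. The length of the audio file is not stated here, so a real-time factor cannot be derived from these numbers.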


Note: running deepspeech may fail because the libsox2 library cannot be found; installing sox with brew fixes this.
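
As a rough way to see whether a sox shared library is visible on the system, the sketch below asks Python's ctypes.util.find_library for it. Using "sox" as the lookup name is an assumption, and a hit does not guarantee that it is the exact library version the deepspeech binary links against.

```python
from ctypes.util import find_library


def sox_library_path():
    """Return the name or path of a sox shared library if one is found, else None."""
    return find_library("sox")


location = sox_library_path()
if location:
    print("sox library found: " + location)
else:
    print("sox library not found; installing sox with brew may help")
```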
