I searched online and found plenty of articles about this problem, but none of them worked.
Install pytesseract
pip install pytesseract
When running the Python recognition program, the following statement raises an error:
pytesseract.image_to_string(Image.open(filename))
Error:
Error opening data file /usr/local/share/tessdata/eng.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory. Failed loading language 'eng' Tesseract couldn't load any languages! Could not initialize tesseract.
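The message says Tesseract could not find eng.traineddata in the location derived from TESSDATA_PREFIX. Below is a minimal sketch for checking where the file actually is before calling pytesseract. The helper name find_traineddata is hypothetical. It looks in the prefix directory itself and in a "tessdata" subdirectory, on the assumption (not confirmed by this post) that Tesseract versions differ in which of the two they expect.

```python
import os
from pathlib import Path
from typing import Optional


def find_traineddata(lang: str = "eng", prefix: Optional[str] = None) -> Optional[Path]:
    """Return the path of <lang>.traineddata under TESSDATA_PREFIX, or None.

    Hypothetical helper: it looks in the prefix directory itself and in a
    "tessdata" subdirectory, since the expected layout may vary by version.
    """
    if prefix is None:
        prefix = os.environ.get("TESSDATA_PREFIX")
    if not prefix:
        return None  # the variable is not set at all
    base = Path(prefix)
    candidates = (
        base / f"{lang}.traineddata",
        base / "tessdata" / f"{lang}.traineddata",
    )
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    found = find_traineddata("eng")
    print(found if found else "eng.traineddata not found; check TESSDATA_PREFIX")
```

If this prints "not found", either the variable is unset in the shell that launches Python or it points at a directory that holds no language files.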
On Windows, Tesseract itself must be installed
1. Download and install tesseract-ocr-setup-4.00.00dev.exe
2. Create a new user environment variable
TESSDATA_PREFIX
D:\Program Files (x86)\Tesseract-OCR
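3. Run the program again; it now fails with the following error
tesseract.exe has stopped working
pytesseract.pytesseract.TesseractError:(3221225477, '')
The exit code 3221225477 is 0xC0000005, the Windows access-violation code, so tesseract.exe itself crashed. In my case the cause was the installed 4.0 version. The fix is to uninstall it, then download and install tesseract 3.02.02. SourceForge hosts the older installers and the Chinese language pack: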
https://sourceforge.net/projects/tesseract-ocr-alt/files/
https://nchc.dl.sourceforge.net/project/tesseract-ocr-alt/tesseract-ocr-setup-3.02.02.exe
4. Download the Simplified Chinese pack chi_sim and extract it to D:\Program Files (x86)\Tesseract-OCR\tessdata
https://nchc.dl.sourceforge.net/project/tesseract-ocr-alt/tesseract-ocr-3.02.chi_sim.tar.gz
5. Run the program. It works now, but the recognition quality is poor:
"材料成分" (material composition) was recognized as "材料咸分".
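Steps 3 and 4 above could be scripted roughly as follows. This is a sketch under stated assumptions, not the installer's own procedure. The function names are hypothetical, the version parser assumes the banner starts with "tesseract" followed by a dotted number, and the extractor assumes the language pack only needs to contribute its *.traineddata files.

```python
import re
import shutil
import subprocess
import tarfile
from pathlib import Path
from typing import List, Optional, Tuple


def parse_tesseract_version(banner: str) -> Optional[Tuple[int, ...]]:
    """Extract the numeric version from the text printed by `tesseract --version`."""
    match = re.search(r"tesseract\s+v?(\d+(?:\.\d+)*)", banner)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def installed_version(executable: str = "tesseract") -> Optional[Tuple[int, ...]]:
    """Run the executable found on PATH and parse its version, or return None."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    # Read both streams, in case the banner is not written to stdout.
    return parse_tesseract_version(result.stdout + result.stderr)


def extract_traineddata(archive: str, tessdata_dir: str) -> List[str]:
    """Copy every *.traineddata file from a .tar.gz archive into tessdata_dir.

    The folder layout inside the archive is not assumed; files are written
    flat, using only their base names.
    """
    target = Path(tessdata_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    with tarfile.open(archive, "r:gz") as tar:
        for member in tar.getmembers():
            name = Path(member.name).name
            if member.isfile() and name.endswith(".traineddata"):
                source = tar.extractfile(member)
                (target / name).write_bytes(source.read())
                written.append(name)
    return written
```

With the files from this post, a call might look like extract_traineddata("tesseract-ocr-3.02.chi_sim.tar.gz", r"D:\Program Files (x86)\Tesseract-OCR\tessdata"). Writing under Program Files may require an elevated prompt. Once chi_sim.traineddata is in place, pytesseract.image_to_string accepts lang="chi_sim" to select it.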
Installing tesseract on CentOS 7
sudo yum install tesseract -y
pip3 install pytesseract
vi ~/.bash_profile
export TESSDATA_PREFIX=/usr/share/tesseract/tessdata
source ~/.bash_profile
yum install -y tesseract-langpack-chi_sim  # Chinese language pack
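To confirm that the language pack landed where TESSDATA_PREFIX points, something like the following sketch could be run after the commands above. The helper name installed_languages is hypothetical, and the directory is simply the one from the export line; adjust it if your packages install elsewhere.

```python
import os
from pathlib import Path
from typing import List

# Path taken from the export line above; an assumption on other systems.
TESSDATA_DIR = "/usr/share/tesseract/tessdata"


def installed_languages(tessdata_dir: str = TESSDATA_DIR) -> List[str]:
    """Return the language codes that have a .traineddata file in tessdata_dir."""
    directory = Path(tessdata_dir)
    if not directory.is_dir():
        return []
    return sorted(p.stem for p in directory.glob("*.traineddata"))


if __name__ == "__main__":
    # Setting the variable in-process affects this script and its child
    # processes only; it does not replace the ~/.bash_profile entry.
    os.environ.setdefault("TESSDATA_PREFIX", TESSDATA_DIR)
    langs = installed_languages()
    print("installed languages:", langs)
    if "chi_sim" not in langs:
        print("chi_sim is missing; install tesseract-langpack-chi_sim")
```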
Reference:
https://www.devzoneoriginal.com/2020/11/how-to-install-tesseract-on-centos.html