
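Hi everyone, I'm zeroing~

1. Preface

When I covered OCR for text in images earlier, I wrote an article introducing the Python package pytesseract; for details, see

"Introducing a Python package that does OCR text recognition in a few lines of code!". pytesseract is a wrapper around Tesseract. It supports many languages, but its accuracy varies by language: it is high for English and noticeably lower for Chinese.

English characters are recognized almost without error, but Chinese characters in images often come out garbled or are not recognized at all.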

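2. Introducing EasyOCR

Today I'll introduce a newer Python package for OCR, EasyOCR. It is built on pretrained deep learning models that cover two tasks: text detection and text recognition.

EasyOCR has been open source for less than 10 months, yet it already has 10k+ stars on GitHub. It has gone through four releases so far and has the following features:

1. It currently supports 70+ languages, including English, Chinese, Japanese, and Korean.
2. Because it is based on deep learning, recognition accuracy is high; on clean, ordinary images it can approach 100%.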

image-20210121151536805

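3. It works for single-language images and also for multilingual ones (for example, one image containing Chinese, English, and Japanese), although not every combination of languages can be loaded into a single Reader.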

image-20210121151525119

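4. It supports GPU acceleration; recognition on a GPU is about 6 to 7 times faster than on a CPU (this requires a Python environment with CUDA, PyTorch, and torchvision set up in advance).

Unlike traditional OCR tools that only recognize text, EasyOCR also performs text detection: for each text box found in the image, it returns the corner coordinates in the order top-left, top-right, bottom-right, bottom-left. The result looks like this: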

image-20210121000941176

In the figure above, EasyOCR's actual output is the text information on the right; the red boxes on the left were drawn afterwards.
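To make the corner ordering concrete, here is a minimal sketch that turns one four-point box (top-left, top-right, bottom-right, bottom-left) into a left/top/right/bottom rectangle. The helper name `corners_to_rect` and the sample coordinates are illustrative assumptions, not part of EasyOCR.

```python
def corners_to_rect(box):
    """Convert a 4-point box (TL, TR, BR, BL) to (left, top, right, bottom)."""
    xs = [point[0] for point in box]
    ys = [point[1] for point in box]
    # Using min/max instead of fixed indices also copes with slightly tilted boxes.
    return min(xs), min(ys), max(xs), max(ys)


# Sample box in the corner order described above (illustrative values).
box = [[158, 2], [238, 2], [238, 34], [158, 34]]
left, top, right, bottom = corners_to_rect(box)
print(left, top, right, bottom)    # 158 2 238 34
print(right - left, bottom - top)  # 80 32  (width, height)
```

A rectangle in this form is convenient for cropping or drawing, which is how the red boxes in a figure like the one above could be produced.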

3. Using EasyOCR

With that overview out of the way, here is the basic usage.

Installation

EasyOCR is published on PyPI and can be installed with pip:

<pre spellcheck="false" class="md-fences md-end-block ty-contain-cm modeLoaded" lang="" cid="n113" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: var(--monospace); font-size: 0.9em; display: block; break-inside: avoid; text-align: left; white-space: normal; background-image: inherit; background-position: inherit; background-size: inherit; background-repeat: inherit; background-attachment: inherit; background-origin: inherit; background-clip: inherit; background-color: rgb(248, 248, 248); position: relative !important; border: 1px solid rgb(231, 234, 237); border-radius: 3px; padding: 8px 4px 6px; margin-bottom: 15px; margin-top: 15px; width: inherit; color: rgb(51, 51, 51); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">pip install easyocr</pre>

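EasyOCR's models are trained with PyTorch, so installing easyocr also pulls in additional Python packages such as PyTorch and torchvision, and the install takes a while. (Note that this installs the CPU build of PyTorch by default; if you need GPU support, look up a guide for setting up the GPU build of PyTorch.)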
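Since the default install may only give you the CPU build, it can help to check what PyTorch actually sees before deciding on the `gpu` flag. The following is a small sketch under the assumption that `torch.cuda.is_available()` is the right signal; the helper name `gpu_available` is made up for this example.

```python
import importlib.util


def gpu_available():
    """Return True only if PyTorch is installed and can see a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed at all, so there is nothing to accelerate.
        return False
    import torch
    return bool(torch.cuda.is_available())


use_gpu = gpu_available()
print("value to pass as gpu=... to easyocr.Reader:", use_gpu)
```

Passing the result as `gpu=use_gpu` avoids hard-coding the flag on machines that differ in hardware.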

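Usage

Installing EasyOCR takes only one command, but in use you may hit mismatched package versions or missing environment components. I ran into two environment errors that made the package unusable; I list them below with the fixes in case you hit them too.

1. from ._remap import _map_array ImportError: DLL load failed: The specified module could not be found

This error is caused by a missing C++ runtime. To fix it, install the runtime from a terminal: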

<pre spellcheck="false" class="md-fences md-end-block ty-contain-cm modeLoaded" lang="" cid="n119" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: var(--monospace); font-size: 0.9em; display: block; break-inside: avoid; text-align: left; white-space: normal; background-image: inherit; background-position: inherit; background-size: inherit; background-repeat: inherit; background-attachment: inherit; background-origin: inherit; background-clip: inherit; background-color: rgb(248, 248, 248); position: relative !important; border: 1px solid rgb(231, 234, 237); border-radius: 3px; padding: 8px 4px 6px; margin-bottom: 15px; margin-top: 15px; width: inherit; color: rgb(51, 51, 51); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">pip install msvc-runtime</pre>

2. train error: ImportError: cannot import name 'Optional'

This error is caused by mismatched PyTorch and torchvision versions. Installing easyocr pulled in torchvision 0.5.0, whose compatible PyTorch version is 1.4.0, but installing that version with the command below

<pre spellcheck="false" class="md-fences md-end-block ty-contain-cm modeLoaded" lang="" cid="n122" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: var(--monospace); font-size: 0.9em; display: block; break-inside: avoid; text-align: left; white-space: normal; background-image: inherit; background-position: inherit; background-size: inherit; background-repeat: inherit; background-attachment: inherit; background-origin: inherit; background-clip: inherit; background-color: rgb(248, 248, 248); position: relative !important; border: 1px solid rgb(231, 234, 237); border-radius: 3px; padding: 8px 4px 6px; margin-bottom: 15px; margin-top: 15px; width: inherit; color: rgb(51, 51, 51); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">pip install torch==1.4.0</pre>

fails. The fix is to use a different install command:

<pre spellcheck="false" class="md-fences md-end-block ty-contain-cm modeLoaded" lang="" cid="n124" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: var(--monospace); font-size: 0.9em; display: block; break-inside: avoid; text-align: left; white-space: normal; background-image: inherit; background-position: inherit; background-size: inherit; background-repeat: inherit; background-attachment: inherit; background-origin: inherit; background-clip: inherit; background-color: rgb(248, 248, 248); position: relative !important; border: 1px solid rgb(231, 234, 237); border-radius: 3px; padding: 8px 4px 6px; margin-bottom: 15px; margin-top: 15px; width: inherit; color: rgb(51, 51, 51); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">pip install torch==1.4.0+cpu torchvision==0.5.0+cpu -f https://download.pytorch.org/whl/torch_stable.htm</pre>
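A quick way to catch this kind of mismatch is to compare the installed versions against a table of known-good pairs. The sketch below is only an illustration: the table holds just the single pair mentioned above (torchvision 0.5.0 with PyTorch 1.4.0), and the helper names are assumptions made for this example.

```python
from importlib import metadata

# torchvision version -> matching PyTorch version.
# Only the pair named in this article is listed; extend it as needed.
COMPATIBLE = {"0.5.0": "1.4.0"}


def base_version(version):
    """Strip a local build tag such as '+cpu' from a version string."""
    return version.split("+", 1)[0]


def is_compatible(torch_version, torchvision_version):
    """Check a torch/torchvision pair against the table above."""
    expected = COMPATIBLE.get(base_version(torchvision_version))
    return expected is not None and expected == base_version(torch_version)


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print(is_compatible("1.4.0+cpu", "0.5.0+cpu"))  # True
print(installed_version("torch"), installed_version("torchvision"))
```

A pair that is absent from the table is reported as incompatible, which is deliberately conservative for such a small table.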

easyocr wraps all of its functionality in a single class, Reader, which exposes three methods: readtext, detect, and recognize.

image-20210126144903635

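The detect method finds the text boxes in an image. It returns two lists describing where the boxes are: horizontal_list, whose entries have the form [x_min, x_max, y_min, y_max], and free_list, whose entries have the form [[x1, y1], [x2, y2], [x3, y3], [x4, y4]].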
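The two formats describe the same kind of region in different ways. As a small sketch, a horizontal_list entry can be expanded into the four-corner form like this; the helper name `horizontal_to_corners` is an assumption for illustration and is not an EasyOCR function.

```python
def horizontal_to_corners(box):
    """Expand [x_min, x_max, y_min, y_max] into four corner points.

    Corners are returned in the order top-left, top-right,
    bottom-right, bottom-left.
    """
    x_min, x_max, y_min, y_max = box
    return [[x_min, y_min], [x_max, y_min], [x_max, y_max], [x_min, y_max]]


print(horizontal_to_corners([11, 133, 11, 31]))
# [[11, 11], [133, 11], [133, 31], [11, 31]]
```

Note the axis order in the horizontal form: both x values come first, then both y values.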

ceshi

The image above is the captcha dialog that Bilibili shows at login. It is used as the sample image in all the examples that follow. detect is used like this:

<pre spellcheck="false" class="md-fences md-end-block ty-contain-cm modeLoaded" lang="" cid="n129" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: var(--monospace); font-size: 0.9em; display: block; break-inside: avoid; text-align: left; white-space: normal; background-image: inherit; background-position: inherit; background-size: inherit; background-repeat: inherit; background-attachment: inherit; background-origin: inherit; background-clip: inherit; background-color: rgb(248, 248, 248); position: relative !important; border: 1px solid rgb(231, 234, 237); border-radius: 3px; padding: 8px 4px 6px; margin-bottom: 15px; margin-top: 15px; width: inherit; color: rgb(51, 51, 51); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">import easyocr
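# Load the Simplified Chinese and English models on the CPU; model files are stored in ./model
reader = easyocr.Reader(['ch_sim','en'],gpu=False,model_storage_directory='./model')
# Only locate the text boxes; no text is recognized at this step
result = reader.detect('ceshi.png')
print(result)

output
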
([[11, 133, 11, 31], [158, 238, 2, 34], [199, 235, 315, 333]], [])</pre>

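To use it, first create a Reader instance, passing a few parameters:

lang_list: a list of the language codes to recognize (for example Chinese and English). Some of the codes are shown below; see the official site for the full list:

image-20210121123724942

gpu: boolean, whether to use the GPU; defaults to True.
model_storage_directory: string, the directory where the network models are stored; defaults to ~/.EasyOCR/. Specifying your own path is recommended.

detect then returns two lists, horizontal_list and free_list.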

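recognize performs the recognition step. It takes three arguments, image, horizontal_list, and free_list, and is meant to be paired with detect:

image: the image;
horizontal_list, free_list: the lists of axis-aligned boxes and free-form (four-point) boxes, i.e. the two lists returned by detect.

It is used like this: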

<pre spellcheck="false" class="md-fences md-end-block ty-contain-cm modeLoaded" lang="" cid="n148" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: var(--monospace); font-size: 0.9em; display: block; break-inside: avoid; text-align: left; white-space: normal; background-image: inherit; background-position: inherit; background-size: inherit; background-repeat: inherit; background-attachment: inherit; background-origin: inherit; background-clip: inherit; background-color: rgb(248, 248, 248); position: relative !important; border: 1px solid rgb(231, 234, 237); border-radius: 3px; padding: 8px 4px 6px; margin-bottom: 15px; margin-top: 15px; width: inherit; color: rgb(51, 51, 51); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">import easyocr

# Load the models, then recognize the text inside the boxes that detect returned earlier
reader = easyocr.Reader(['ch_sim','en'],gpu=False,model_storage_directory='./model')
result = reader.recognize('ceshi.png',horizontal_list=[[11, 133, 11, 31], [158, 238, 2, 34], [199, 235, 315, 333]],free_list=[])

print(result)

output

[([[158, 2], [238, 2], [238, 34], [158, 34]], '帶魚', 0.48613545298576355), ([[11, 11], [133, 11], [133, 31], [11, 31]], '清在下圖依次點擊:', 0.46184659004211426), ([[199, 315], [235, 315], [235, 333], [199, 333]], '確認', 0.31680089235305786)]</pre>

recognize returns the text found in each text box, together with the box coordinates and a confidence score.
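Because every entry carries a confidence score, low-confidence text can be filtered out before further processing. The sketch below reuses the sample output shown above; the helper name `filter_by_confidence` and the 0.4 threshold are arbitrary assumptions chosen for illustration.

```python
# Sample results in the (box, text, confidence) form shown above.
results = [
    ([[158, 2], [238, 2], [238, 34], [158, 34]], "帶魚", 0.48613545298576355),
    ([[11, 11], [133, 11], [133, 31], [11, 31]], "清在下圖依次點擊:", 0.46184659004211426),
    ([[199, 315], [235, 315], [235, 333], [199, 333]], "確認", 0.31680089235305786),
]


def filter_by_confidence(results, threshold):
    """Keep (text, confidence) pairs whose confidence is at least threshold."""
    return [(text, conf) for _box, text, conf in results if conf >= threshold]


kept = filter_by_confidence(results, threshold=0.4)
for text, conf in kept:
    print(f"{conf:.2f}  {text}")
```

With this threshold the third entry (confidence about 0.32) is dropped and the other two are kept.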

readtext combines detect and recognize: it first calls detect to locate the text boxes in the image, passes those coordinate lists to recognize, and returns each piece of text together with its coordinates. The overall flow is:

image-20210121125421038

<pre spellcheck="false" class="md-fences md-end-block ty-contain-cm modeLoaded" lang="" cid="n152" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: var(--monospace); font-size: 0.9em; display: block; break-inside: avoid; text-align: left; white-space: normal; background-image: inherit; background-position: inherit; background-size: inherit; background-repeat: inherit; background-attachment: inherit; background-origin: inherit; background-clip: inherit; background-color: rgb(248, 248, 248); position: relative !important; border: 1px solid rgb(231, 234, 237); border-radius: 3px; padding: 8px 4px 6px; margin-bottom: 15px; margin-top: 15px; width: inherit; color: rgb(51, 51, 51); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">import easyocr

# readtext runs detection and recognition in a single call
reader = easyocr.Reader(['ch_sim','en'],gpu=False,model_storage_directory='./model')
result = reader.readtext('ceshi.png')
print(result)

output

Using CPU. Note: This module is much faster with a GPU.
[([[158, 2], [238, 2], [238, 34], [158, 34]], '帶魚', 0.48613545298576355), ([[11, 11], [133, 11], [133, 31], [11, 31]], '清在下圖依次點擊:', 0.46184659004211426), ([[199, 315], [235, 315], [235, 333], [199, 333]], '確認', 0.31680089235305786)]</pre>
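The detect-then-recognize flow that readtext wraps can be sketched by hand as follows. This is only an illustration: it assumes detect returns two flat lists as in the output shown earlier (some EasyOCR releases may nest these lists per image, in which case they would need unwrapping first), and `FakeReader` is a stand-in written for this example so the snippet runs without the models installed.

```python
def read_text_two_step(reader, image):
    """Run detection first, then recognition on the detected boxes."""
    horizontal_list, free_list = reader.detect(image)
    return reader.recognize(
        image, horizontal_list=horizontal_list, free_list=free_list
    )


class FakeReader:
    """Stand-in with the same two method names; not part of EasyOCR."""

    def detect(self, image):
        # One axis-aligned box, no free-form boxes.
        return [[11, 133, 11, 31]], []

    def recognize(self, image, horizontal_list=None, free_list=None):
        # Pretend every box contains the word "text" with confidence 1.0.
        return [(box, "text", 1.0) for box in horizontal_list]


print(read_text_two_step(FakeReader(), "ceshi.png"))
# [([11, 133, 11, 31], 'text', 1.0)]
```

With a real `easyocr.Reader` in place of `FakeReader`, splitting the steps like this makes it possible to inspect or adjust the boxes before recognition.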

With the coordinates in hand, you can draw the text boxes onto the image with PIL to check the detection result visually:

<pre spellcheck="false" class="md-fences md-end-block ty-contain-cm modeLoaded" lang="" cid="n154" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: var(--monospace); font-size: 0.9em; display: block; break-inside: avoid; text-align: left; white-space: normal; background-image: inherit; background-position: inherit; background-size: inherit; background-repeat: inherit; background-attachment: inherit; background-origin: inherit; background-clip: inherit; background-color: rgb(248, 248, 248); position: relative !important; border: 1px solid rgb(231, 234, 237); border-radius: 3px; padding: 8px 4px 6px; margin-bottom: 15px; margin-top: 15px; width: inherit; color: rgb(51, 51, 51); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">import easyocr
from PIL import Image,ImageDraw

reader = easyocr.Reader(['ch_sim','en'],gpu=False,model_storage_directory='./model')
result = reader.readtext('ceshi.png')

img = Image.open('ceshi.png')
draw = ImageDraw.Draw(img)

# Each result is (box, text, confidence); box[0] is the top-left corner, box[2] the bottom-right
for i in result:
    draw.rectangle((tuple(i[0][0]),tuple(i[0][2])),fill=None,outline='red',width=2)
img.save("ceshi3.png")</pre>

The result:

ceshi2

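Apart from the 帶魚 characters in the middle of the image, which were not recognized, the text in the other regions was detected and recognized well.

Here is my explanation for the failure. If you look closely, the image above is not a real photograph but a synthetic image generated with deep learning, for example by a GAN. The text in it is not simply pasted onto the picture; my guess is that it has been put through some kind of obfuscation.

To test this guess, I tried the API of the 超級鷹 (Chaojiying) captcha-solving platform, but it did not produce a good result either (the blue text in the image marks what the platform recognized).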

image-20210121153608649

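The above covers only the common parameters of easyocr's methods; many others that have default values were left out. For example, batch_size controls how many images are recognized per batch and enables batched recognition, provided the GPU has plenty of memory; adjust_contrast sets the target contrast level applied to low-contrast text boxes.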
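As a rough sketch of how these two parameters might be chosen, the helper below builds keyword arguments for readtext. The helper name `build_readtext_options` and the batch size of 8 are assumptions for illustration; suitable values depend on the available GPU memory.

```python
def build_readtext_options(has_gpu):
    """Pick readtext keyword arguments based on whether a GPU is available."""
    return {
        # Larger batches trade memory for speed; 8 is an arbitrary example value.
        "batch_size": 8 if has_gpu else 1,
        # Target contrast level for low-contrast text boxes.
        "adjust_contrast": 0.5,
    }


options = build_readtext_options(has_gpu=False)
print(options)  # {'batch_size': 1, 'adjust_contrast': 0.5}

# With a Reader created as in the earlier examples, the options would be passed as:
# result = reader.readtext('ceshi.png', **options)
```

Keeping the options in one dictionary makes it easy to reuse the same settings across several readtext calls.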

For more details on easyocr, see the official documentation: Document API.

That's all for this article. Thanks for reading!
