An Image-to-Text Tool Implemented in Python

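When a PDF book is a scanned copy, taking notes while reading means typing the text out by hand. I find this extremely tedious (mostly because I am lazy), so I decided to build a simple image-to-text tool to improve my reading efficiency.
Operating system: Ubuntu 18.04
Programming language: Python 3.6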

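1. Watching the folder

Use the system's built-in screenshot tool (screenshot on Ubuntu, shortcut Shift+PrtSc) to capture a region of the screen. screenshot automatically saves the captured image to the "/home/<user>/圖片" directory (the user's Pictures folder), so I only need to watch that directory to get each new screenshot.
Directory watching is implemented with pyinotify, a Python file-monitoring library. Install the module with pip3 install pyinotify. The watching code is as follows: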

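import os
from pyinotify import WatchManager, Notifier, ProcessEvent, IN_CREATE

class EventHandler(ProcessEvent):
    def process_IN_CREATE(self, event):
        # Called for every file created under the watched directory.
        text = os.path.join(event.path, event.name)
        print(text)

def auto_compile(path='.'):
    wm = WatchManager()
    mask = IN_CREATE  # only react to file-creation events
    # A plain Notifier is driven by the loop below. A ThreadedNotifier runs
    # its own loop in a separate thread, so it should not be mixed with
    # a manual event loop.
    notifier = Notifier(wm, EventHandler())
    wm.add_watch(path, mask, rec=True, auto_add=True)
    while True:
        try:
            notifier.process_events()
            if notifier.check_events():
                notifier.read_events()
        except KeyboardInterrupt:
            notifier.stop()
            break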

Code reference
pyinotify is built on the Linux kernel's inotify mechanism.

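2. Recognizing the captured image with tesseract-ocr

Once the path of the captured image is known, the Python tesserocr module can call tesseract-ocr to recognize the text in the image. The code is as follows: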

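import time
import tesserocr
from PIL import Image

def image2word(path):
    try:
        time.sleep(1)  # wait 1 second, otherwise reading the image may fail
        image = Image.open(path)
        words = tesserocr.image_to_text(image)
        print(words)
    except OSError as err:
        # Raised when the file is missing or is not a readable image.
        print("os error:", err)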
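The fixed time.sleep(1) above is there because the creation event can arrive before the screenshot has been fully written. As a rough alternative, the sketch below polls the file size until it stops changing. The helper name wait_until_stable and its interval and timeout parameters are assumptions made for illustration; they are not part of the original tool.

```python
import os
import time


def wait_until_stable(path, interval=0.2, timeout=5.0):
    """Return True once the size of `path` stops changing, False on timeout.

    Hypothetical helper: it treats two equal, non-zero size readings in a
    row as "the writer has finished".
    """
    deadline = time.monotonic() + timeout
    last_size = -1
    while time.monotonic() < deadline:
        try:
            size = os.path.getsize(path)
        except OSError:
            size = -1  # the file is not visible yet
        if size > 0 and size == last_size:
            return True
        last_size = size
        time.sleep(interval)
    return False
```

image2word could call wait_until_stable(path) in place of the fixed sleep and skip the file when it returns False.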

Installing tesserocr on Ubuntu

Install tesseract-ocr: sudo apt-get install -y tesseract-ocr libtesseract-dev libleptonica-dev.
List the supported languages: tesseract --list-langs.
Install language data:
git clone https://github.com/tesseract-ocr/tessdata.git
sudo mv tessdata/* /usr/share/tesseract-ocr/tessdata
Install tesserocr: pip3 install tesserocr pillow.
During installation I ran into the error: command 'x86_64-linux-gnu-gcc' failed with exit status 1. Solution.
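Once extra language data is installed, the language can be passed to tesserocr explicitly. The sketch below is one way to do that, assuming the Simplified Chinese data file chi_sim is present. The function names choose_lang and image_to_text_zh are made up for this example, and combining languages with "+" follows Tesseract's usual language-string convention.

```python
def choose_lang(available, preferred=("chi_sim", "eng")):
    """Build a Tesseract language string such as 'chi_sim+eng'.

    Falls back to 'eng' when none of the preferred languages is installed.
    """
    chosen = [code for code in preferred if code in available]
    return "+".join(chosen) if chosen else "eng"


def image_to_text_zh(path):
    """Hypothetical variant of image2word that returns Chinese and English text."""
    # Imported lazily so choose_lang can be used without the OCR stack.
    import tesserocr
    from PIL import Image

    # get_languages() returns the tessdata path and the installed languages.
    _tessdata_dir, available = tesserocr.get_languages()
    with Image.open(path) as image:
        return tesserocr.image_to_text(image, lang=choose_lang(available))
```

If chi_sim does not appear in the output of tesseract --list-langs, choose_lang falls back to English only.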

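3. Adding the recognized text to the clipboard with pyperclip

On Ubuntu, this module requires xsel and xclip to be installed first.
sudo apt-get install xsel
sudo apt-get install xclip
Install pyperclip with pip: pip3 install pyperclip.
The code itself takes a single line: pyperclip.copy(words).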
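The three steps can be tied together roughly as follows. This is only a sketch: handle_new_image is a hypothetical helper, and the OCR and clipboard functions are passed in as arguments so the glue logic can be tried without tesseract or a clipboard backend installed. It also assumes an OCR function that returns the text rather than printing it.

```python
def handle_new_image(path, ocr, copy):
    """Run OCR on `path` and put any non-empty result on the clipboard.

    `ocr` is any callable taking a path and returning text;
    `copy` is any callable taking a string, for example pyperclip.copy.
    """
    if not path.lower().endswith((".png", ".jpg", ".jpeg")):
        return None  # ignore non-image files created in the watched folder
    words = ocr(path).strip()
    if words:
        copy(words)
    return words
```

Inside process_IN_CREATE this could be called as handle_new_image(text, some_ocr_function, pyperclip.copy), where some_ocr_function stands for a version of image2word that returns words.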

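Source code

Improvement goals

Take and parse the screenshot from Python instead of relying on the system's built-in screenshot tool.
Improve the recognition rate for Chinese.
Solve the cross-platform problem.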
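For the cross-platform goal, one possible direction is to replace the inotify-based watcher, which is Linux-only, with plain polling from the standard library. The sketch below illustrates that idea and is not the original implementation; the names snapshot, new_files and poll_folder are invented for the example.

```python
import os
import time


def snapshot(folder):
    """Return the set of file names currently in `folder`."""
    return set(os.listdir(folder))


def new_files(folder, seen):
    """Return paths of files that appeared since `seen`, updating `seen` in place."""
    current = snapshot(folder)
    added = sorted(current - seen)
    seen.clear()
    seen.update(current)
    return [os.path.join(folder, name) for name in added]


def poll_folder(folder, on_new, interval=1.0):
    """Call `on_new(path)` for each new file; runs until interrupted."""
    seen = snapshot(folder)
    while True:
        for path in new_files(folder, seen):
            on_new(path)
        time.sleep(interval)
```

Polling trades some latency and a little CPU time for portability, since it relies only on os.listdir.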
