AI Java SDK: Text Recognition (OCR) Toolbox

Contents:

http://www.aias.top/

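Text Recognition (OCR) Toolbox

Text recognition (OCR) is now widely used across many industries: document recognition and data entry in finance, invoice recognition in catering,
ticket recognition in transportation, form recognition in enterprises, and recognition of ID cards, driver's licenses, and passports in everyday work and life.
OCR is a commonly used AI capability today.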

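OCR toolbox features:

  1. Direction detection
  • 0 degrees
  • 90 degrees
  • 180 degrees
  • 270 degrees
    detect_direction
  2. Image rotation

  3. Text recognition (three models provided)
  • mobile model
  • light model
  • server-side model
  4. Layout analysis (supports 5 classes; used in pipelines together with text recognition and table recognition)
  • Text
  • Title
  • List
  • Table
  • Figure
  5. Table recognition
  • Generates an HTML table
  • Generates an Excel file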

Running the OCR examples

1.1 Text direction detection:

  • Example code: OcrDetectionExample.java
  • After a successful run, the command line should show the following:
[INFO ] - Result image has been saved in: build/output/detect_result.png
[INFO ] - [
    class: "0", probability: 1.00000, bounds: [x=0.073, y=0.069, width=0.275, height=0.026]
    class: "0", probability: 1.00000, bounds: [x=0.652, y=0.158, width=0.222, height=0.040]
    class: "0", probability: 1.00000, bounds: [x=0.143, y=0.252, width=0.144, height=0.026]
    class: "0", probability: 1.00000, bounds: [x=0.628, y=0.328, width=0.168, height=0.026]
    class: "0", probability: 1.00000, bounds: [x=0.064, y=0.330, width=0.450, height=0.023]
]
  • The output image looks like this:


    detect_result
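
All `bounds` values in the log above are below 1, which suggests they are fractions of the image width and height rather than pixel coordinates. The following is a minimal sketch under that assumption: it parses one log line and scales the box to pixels. The class `BoundsToPixels`, the method `toPixels`, and the 1000 x 800 image size are hypothetical names and values chosen for illustration; they are not part of the SDK.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the SDK.
// ASSUMPTION: the bounds in the log are fractions of the image size.
final class BoundsToPixels {
    private static final Pattern BOUNDS = Pattern.compile(
            "x=([0-9.]+), y=([0-9.]+), width=([0-9.]+), height=([0-9.]+)");

    /** Returns {x, y, width, height} in pixels, each rounded to the nearest integer. */
    static int[] toPixels(String logLine, int imageWidth, int imageHeight) {
        Matcher m = BOUNDS.matcher(logLine);
        if (!m.find()) {
            throw new IllegalArgumentException("No bounds found in: " + logLine);
        }
        double x = Double.parseDouble(m.group(1));
        double y = Double.parseDouble(m.group(2));
        double w = Double.parseDouble(m.group(3));
        double h = Double.parseDouble(m.group(4));
        return new int[] {
            (int) Math.round(x * imageWidth),
            (int) Math.round(y * imageHeight),
            (int) Math.round(w * imageWidth),
            (int) Math.round(h * imageHeight)
        };
    }

    public static void main(String[] args) {
        String line = "class: \"0\", probability: 1.00000, "
                + "bounds: [x=0.073, y=0.069, width=0.275, height=0.026]";
        // For a 1000 x 800 image this prints [73, 55, 275, 21]
        System.out.println(Arrays.toString(toPixels(line, 1000, 800)));
    }
}
```

If the SDK turns out to report pixel coordinates directly, the scaling step is unnecessary and only the parsing part applies.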

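1.2 Text direction detection helper class (also displays confidence values to make debugging easier):

  • Example code: OcrDetectionHelperExample.java
  • After a successful run, the command line should show the following: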
[INFO ] - Result image has been saved in: build/output/detect_result_helper.png
[INFO ] - [
    class: "0 :1.0", probability: 1.00000, bounds: [x=0.073, y=0.069, width=0.275, height=0.026]
    class: "0 :1.0", probability: 1.00000, bounds: [x=0.652, y=0.158, width=0.222, height=0.040]
    class: "0 :1.0", probability: 1.00000, bounds: [x=0.143, y=0.252, width=0.144, height=0.026]
    class: "0 :1.0", probability: 1.00000, bounds: [x=0.628, y=0.328, width=0.168, height=0.026]
    class: "0 :1.0", probability: 1.00000, bounds: [x=0.064, y=0.330, width=0.450, height=0.023]
]
  • The output image looks like this:


    detect_result_helper

2. Image rotation:

Each call to the rotateImg method rotates the image 90 degrees counterclockwise.

  • Example code: RotationExample.java
  • Image before rotation:


    ticket_0
  • Image after rotation:


    rotate_result
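
To make the 90-degree counterclockwise turn concrete, here is a minimal sketch that uses only java.awt. It is not the SDK's rotateImg, whose implementation is not shown in this article; the class `RotateSketch` and its methods are hypothetical stand-ins that illustrate the same geometric operation on a BufferedImage.

```java
import java.awt.image.BufferedImage;

// Hypothetical stand-in for illustration only; not the SDK's rotateImg.
final class RotateSketch {
    /** Returns a new image that is src rotated 90 degrees counterclockwise. */
    static BufferedImage rotate90Ccw(BufferedImage src) {
        int w = src.getWidth();
        int h = src.getHeight();
        // Width and height swap after a quarter turn.
        BufferedImage dst = new BufferedImage(h, w, BufferedImage.TYPE_INT_ARGB);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                // Pixel (x, y) moves to (y, w - 1 - x): the top-right corner
                // of the source becomes the top-left corner of the result.
                dst.setRGB(y, w - 1 - x, src.getRGB(x, y));
            }
        }
        return dst;
    }

    /** Applies the quarter turn repeatedly; zero effective turns return src itself. */
    static BufferedImage rotateCcw(BufferedImage src, int quarterTurns) {
        BufferedImage out = src;
        int turns = Math.floorMod(quarterTurns, 4);
        for (int i = 0; i < turns; i++) {
            out = rotate90Ccw(out);
        }
        return out;
    }

    public static void main(String[] args) {
        BufferedImage img = new BufferedImage(300, 200, BufferedImage.TYPE_INT_ARGB);
        BufferedImage turned = rotateCcw(img, 1);
        System.out.println(turned.getWidth() + " x " + turned.getHeight());
    }
}
```

Because each call is a quarter turn, two calls give 180 degrees, three give 270 degrees, and four bring the image back to its original orientation.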

3. Text recognition:

Before using this method, call the methods above so that the text in the image is horizontal (0 degrees).
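
One way to connect direction detection and rotation is sketched here. It takes the direction labels of the detected boxes ("0", "90", "180", "270"), picks the most frequent one, and turns it into a number of 90-degree counterclockwise rotations. The class `DirectionFix` and its methods are hypothetical, and the sketch assumes that a label gives the clockwise angle by which the text is currently rotated. The article does not state the convention, so check it against a test image before relying on this mapping.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the SDK.
final class DirectionFix {
    /** Returns the most frequent label; on a tie, the label that first reaches the top count wins. */
    static String majorityLabel(List<String> labels) {
        Map<String, Integer> counts = new HashMap<>();
        String best = null;
        int bestCount = 0;
        for (String label : labels) {
            int c = counts.merge(label, 1, Integer::sum);
            if (c > bestCount) {
                best = label;
                bestCount = c;
            }
        }
        if (best == null) {
            throw new IllegalArgumentException("No direction labels given");
        }
        return best;
    }

    /**
     * Number of 90-degree counterclockwise turns needed to reach 0 degrees.
     * ASSUMPTION: the label is the clockwise angle of the text in the image.
     */
    static int counterclockwiseTurns(String label) {
        int angle = Integer.parseInt(label);
        if (angle % 90 != 0) {
            throw new IllegalArgumentException("Unexpected direction label: " + label);
        }
        return Math.floorMod(angle / 90, 4);
    }

    public static void main(String[] args) {
        List<String> labels = Arrays.asList("90", "90", "0", "90");
        String direction = majorityLabel(labels);
        System.out.println(direction + " degrees: rotate "
                + counterclockwiseTurns(direction) + " time(s)");
    }
}
```

If the labels turn out to follow the opposite convention, the number of turns becomes `(4 - turns) % 4` instead.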

  • Example code: LightOcrRecognitionExample.java
  • After a successful run, the command line should show the following:
[INFO ] - [
    class: "你", probability: -1.0e+00, bounds: [x=0.319, y=0.164, width=0.050, height=0.057]
    class: "永遠都", probability: -1.0e+00, bounds: [x=0.329, y=0.349, width=0.206, height=0.044]
    class: "無法叫醒一個", probability: -1.0e+00, bounds: [x=0.328, y=0.526, width=0.461, height=0.044]
    class: "裝睡的人", probability: -1.0e+00, bounds: [x=0.330, y=0.708, width=0.294, height=0.043]
]
  • The output image looks like this:


    ocr_result

4. Layout analysis:

  • After a successful run, the command line should show the following:
[INFO ] - [
    class: "Text", probability: 0.98750, bounds: [x=0.081, y=0.620, width=0.388, height=0.183]
    class: "Text", probability: 0.98698, bounds: [x=0.503, y=0.464, width=0.388, height=0.167]
    class: "Text", probability: 0.98333, bounds: [x=0.081, y=0.465, width=0.387, height=0.121]
    class: "Figure", probability: 0.97186, bounds: [x=0.074, y=0.091, width=0.815, height=0.304]
    class: "Table", probability: 0.96995, bounds: [x=0.506, y=0.712, width=0.382, height=0.143]
]
  • The output image looks like this:


    layout
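
Since layout analysis is meant to feed text recognition and table recognition in a pipeline, each detected region can be routed by its class. The following is a minimal sketch of that dispatch step. The class `LayoutRouter`, the `Step` enum, and the choice to skip Figure regions are assumptions made for illustration; the SDK's actual pipeline code is not shown in this article.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical dispatch step, not part of the SDK.
final class LayoutRouter {
    enum Step { TEXT_RECOGNITION, TABLE_RECOGNITION, SKIP }

    /** Maps one of the five layout classes to the next processing step. */
    static Step route(String layoutClass) {
        switch (layoutClass) {
            case "Text":
            case "Title":
            case "List":
                return Step.TEXT_RECOGNITION;
            case "Table":
                return Step.TABLE_RECOGNITION;
            case "Figure":
                // ASSUMPTION: figures carry no text worth recognizing here.
                return Step.SKIP;
            default:
                throw new IllegalArgumentException("Unknown layout class: " + layoutClass);
        }
    }

    public static void main(String[] args) {
        // Classes taken from the sample log output of the layout analysis run.
        List<String> detected = Arrays.asList("Text", "Text", "Text", "Figure", "Table");
        for (String layoutClass : detected) {
            System.out.println(layoutClass + " -> " + route(layoutClass));
        }
    }
}
```

In a real pipeline each region would first be cropped from the page using its bounds and then passed to the step chosen here.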

5. Table recognition:

  • After a successful run, the command line should show the following:
<html>
 <body>
  <table>
   <thead>
    <tr>
     <td>Methods</td>
     <td>R</td>
     <td>P</td>
     <td>F</td>
     <td>FPS</td>
    </tr>
   </thead>
   <tbody>
    <tr>
     <td>SegLink[26]</td>
     <td>70.0</td>
     <td>86.0</td>
     <td>770</td>
     <td>89</td>
    </tr>
    <tr>
     <td>PixelLink[4j</td>
     <td>73.2</td>
     <td>83.0</td>
     <td>77.8</td>
     <td></td>
    </tr>
...
   </tbody>
  </table> 
 </body>
</html>
  • The output image looks like this:


    table
  • The generated Excel file looks like this:


    excel
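
The HTML output above is already row and cell structured, so it can be flattened into a spreadsheet-friendly form with the standard library alone. The sketch below converts such a table to CSV, which Excel can open. This is not how the SDK produces its Excel file, which the article does not show; the class `HtmlTableToCsv` is hypothetical, and the regular expressions assume attribute-free `<tr>`, `<td>`, and `<th>` tags like those in the sample output.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stdlib-only stand-in; not the SDK's Excel export.
// ASSUMPTION: tags have no attributes and tables are not nested.
final class HtmlTableToCsv {
    private static final Pattern ROW = Pattern.compile("<tr>(.*?)</tr>", Pattern.DOTALL);
    private static final Pattern CELL = Pattern.compile("<t[dh]>(.*?)</t[dh]>", Pattern.DOTALL);

    /** Returns one CSV line per table row, each terminated by a newline. */
    static String toCsv(String html) {
        StringBuilder csv = new StringBuilder();
        Matcher row = ROW.matcher(html);
        while (row.find()) {
            List<String> cells = new ArrayList<>();
            Matcher cell = CELL.matcher(row.group(1));
            while (cell.find()) {
                cells.add(quote(cell.group(1).trim()));
            }
            csv.append(String.join(",", cells)).append('\n');
        }
        return csv.toString();
    }

    /** Quotes a value only when it contains a comma, a quote, or a line break. */
    private static String quote(String value) {
        if (value.contains(",") || value.contains("\"") || value.contains("\n")) {
            return "\"" + value.replace("\"", "\"\"") + "\"";
        }
        return value;
    }

    public static void main(String[] args) {
        String html = "<table><thead><tr><td>Methods</td><td>R</td><td>P</td></tr></thead>"
                + "<tbody><tr><td>SegLink[26]</td><td>70.0</td><td>86.0</td></tr></tbody></table>";
        System.out.print(toCsv(html));
    }
}
```

For HTML with attributes, nested tables, or character entities, a real HTML parser would be a safer choice than regular expressions.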