最美分享Coder, 2020-04-02 22:28

Tesseract.js: A Pure-JavaScript OCR Engine

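Introduction

Tesseract.js is a popular OCR engine written in pure JavaScript. The library supports more than 100 languages (Chinese included), automatic text orientation and script detection, and a simple interface for reading paragraphs, words, and character bounding boxes. Tesseract.js runs both in the browser and on a Node.js server.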
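To make the bounding-box interface more concrete, here is a minimal sketch of how word-level boxes might be post-processed. It does not call the library: `describeWords` and `sampleData` are hypothetical names, and the word shape (`text`, `confidence`, `bbox` with `x0`, `y0`, `x1`, `y1`) is an assumption based on how v2 results are commonly shown, so check it against the version you install.

```javascript
// Hypothetical helper: summarise word-level boxes from a recognition result.
// Assumes each word looks like { text, confidence, bbox: { x0, y0, x1, y1 } }.
function describeWords(data, minConfidence = 0) {
  return (data.words || [])
    .filter((word) => word.confidence >= minConfidence)
    .map(({ text, confidence, bbox }) => ({
      text,
      confidence,
      x: bbox.x0,
      y: bbox.y0,
      width: bbox.x1 - bbox.x0,
      height: bbox.y1 - bbox.y0,
    }));
}

// Hand-written sample standing in for `data` from a real recognize() call.
const sampleData = {
  words: [
    { text: 'Hello', confidence: 96, bbox: { x0: 10, y0: 20, x1: 70, y1: 40 } },
    { text: 'w0rld', confidence: 41, bbox: { x0: 80, y0: 20, x1: 140, y1: 40 } },
  ],
};

// Keep only words recognised with a confidence of at least 60.
console.log(describeWords(sampleData, 60));
```

Filtering on confidence before drawing or storing boxes is one simple way to drop misread words; the threshold of 60 here is arbitrary.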

GitHub

https://github.com/naptha/tesseract.js

Usage

    # For v2
    npm install tesseract.js
    yarn add tesseract.js

    # For v1
    npm install tesseract.js@1
    yarn add tesseract.js@1

It can be bundled with webpack or included directly in the browser.
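The usage snippets in this section pass a `logger` callback that simply prints each message. As a rough sketch of something slightly more readable, the helper below turns a message into a one-line progress string. `formatProgress` is a hypothetical name, and the message shape (a `status` string plus a `progress` number between 0 and 1) is an assumption about what the library sends, not something the article confirms.

```javascript
// Hypothetical helper: turn a logger message into a one-line progress string.
// Assumes the message has a `status` string and a `progress` number in [0, 1].
function formatProgress(message) {
  const status = message.status || 'working';
  const progress = Number.isFinite(message.progress) ? message.progress : 0;
  // Clamp so that unexpected values never print below 0% or above 100%.
  const clamped = Math.min(1, Math.max(0, progress));
  return `${status}: ${Math.round(clamped * 100)}%`;
}

// A callback with the same shape as the `logger` option in the snippets.
const logger = (message) => console.log(formatProgress(message));

// Hand-written message standing in for one emitted during recognition.
logger({ status: 'recognizing text', progress: 0.5 });
```

Under those assumptions, the callback could be passed as `createWorker({ logger })` in place of `m => console.log(m)`.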

    import Tesseract from 'tesseract.js';

    // One-off recognition; the logger receives progress messages.
    Tesseract.recognize(
      'url.png',
      'eng',
      { logger: m => console.log(m) }
    ).then(({ data: { text } }) => {
      console.log(text);
    });

    import { createWorker } from 'tesseract.js';

    // Reusable worker: load, select a language, recognize, then clean up.
    const worker = createWorker({
      logger: m => console.log(m)
    });

    (async () => {
      await worker.load();
      await worker.loadLanguage('eng');
      await worker.initialize('eng');
      const { data: { text } } = await worker.recognize('url.png');
      console.log(text);
      await worker.terminate();
    })();

Use cases

You can use it wherever you need OCR. The project provides 10 official usage examples:

Offline version

https://github.com/jeromewu/tesseract.js-offline

Electron version

https://github.com/jeromewu/tesseract.js-electron

Custom trained data

https://github.com/jeromewu/tesseract.js-custom-traineddata

Chrome extension

https://github.com/jeromewu/tesseract.js-chrome-extension

Chrome Extension #2: https://github.com/fxnoob/image-to-text

Vue version

https://github.com/jeromewu/tesseract.js-vue-app

React version

https://github.com/jeromewu/tesseract.js-react-app

Angular version

https://github.com/jeromewu/tesseract.js-angular-app

TypeScript version

https://github.com/jeromewu/tesseract.js-typescript

Real-time video recognition

https://github.com/jeromewu/tesseract.js-video

Summary

OCR comes up fairly often in day-to-day development. If you happen to have that need, give Tesseract.js a try. Enjoy!
