OPTICAL CHARACTER RECOGNITION (OCR)
- Data entry, what's that?
- There's a technology called "Optical Character Recognition" (OCR) that does "image to text".
- There's an open source OCR library called TesseractJS.
(1) INPUT IMAGE TO TEXT
1-select.html
<!-- (A) LOAD TESSERACT -->
<!-- https://cdnjs.com/libraries/tesseract.js -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/tesseract.js/4.1.1/tesseract.min.js"></script>
<!-- (B) FILE SELECTOR & RESULT -->
<input type="file" id="select" accept="image/png, image/gif, image/webp, image/jpeg">
<textarea id="result"></textarea>
- (A) Load TesseractJS from the CDN. You can also download and host on your own server.
- (B) For this example, we will pick a file and show the result in the <textarea>.
1-select.js
window.addEventListener("load", async () => {
  // (A) GET HTML ELEMENTS
  const hSel = document.getElementById("select"),
        hRes = document.getElementById("result");
  // (B) CREATE ENGLISH TESSERACT WORKER
  const worker = await Tesseract.createWorker();
  await worker.loadLanguage("eng");
  await worker.initialize("eng");
  // (C) ON FILE SELECT - IMAGE TO TEXT
  hSel.onchange = async () => {
    const res = await worker.recognize(hSel.files[0]);
    hRes.value = res.data.text;
  };
});
- (A) On window load, get the HTML file picker and textarea.
- (B) Create a Tesseract worker, set it to recognize English.
- (C) On picking a file, send it to Tesseract. Put the results into the textarea.
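Step (C) passes hSel.files[0] straight to the worker. The accept attribute only filters what the file dialog offers, so the selection can still be empty or a non-image file. Below is a minimal sketch of a guard that could run first. The helper name pickImage and the ACCEPTED list are assumptions made for illustration (the list simply mirrors the accept attribute in 1-select.html); neither is part of TesseractJS or the original script.

```javascript
// Hypothetical helper (not part of TesseractJS): returns the first selected
// file if it is one of the accepted image types, or null otherwise.
// ACCEPTED mirrors the accept attribute of the file input in 1-select.html.
const ACCEPTED = ["image/png", "image/gif", "image/webp", "image/jpeg"];

function pickImage(files, accepted = ACCEPTED) {
  // files is a FileList (or any array-like); it is empty when nothing is selected
  if (!files || files.length === 0) { return null; }
  const file = files[0];
  return accepted.includes(file.type) ? file : null;
}

// Possible wiring inside (C), assuming hSel, hRes and worker from 1-select.js:
// hSel.onchange = async () => {
//   const file = pickImage(hSel.files);
//   if (file === null) { hRes.value = "Please pick a PNG, GIF, WEBP or JPEG image."; return; }
//   const res = await worker.recognize(file);
//   hRes.value = res.data.text;
// };
```

The check is kept as a plain function so it can be tried out without a browser or a Tesseract worker.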
(2) FETCH IMAGE TO TEXT
2-fetch.html
<!-- (A) LOAD TESSERACT -->
<!-- https://cdnjs.com/libraries/tesseract.js -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/tesseract.js/4.1.1/tesseract.min.js"></script>
<!-- (B) RESULT -->
<textarea id="result"></textarea>
No file picker in this example; we will fetch the image directly from the server.
2-fetch.js
window.addEventListener("load", () => {
  // (A) FETCH IMAGE
  fetch("text.png")
  .then(res => res.blob())
  .then(async (blob) => {
    // (B) CREATE ENGLISH WORKER
    const worker = await Tesseract.createWorker();
    await worker.loadLanguage("eng");
    await worker.initialize("eng");
    // (C) RESULT
    const res = await worker.recognize(blob);
    document.getElementById("result").value = res.data.text;
  });
});
- (A) Fetch image from server.
- (B) On fetch, create Tesseract worker.
- (C) Pass the image to the worker, output the text in the textarea.
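One thing to watch in step (A): fetch() still resolves when the server answers with an error status such as 404, so the worker could end up receiving an error page instead of an image. A rough sketch of a stricter fetch step follows. The helper name fetchImageBlob is hypothetical, and its second parameter is an assumption added only so the network call can be swapped out when trying the function in isolation.

```javascript
// Hypothetical helper (not part of TesseractJS): fetch an image and return
// its Blob, but fail loudly when the server responds with an error status.
// fetchFn defaults to the global fetch; it is a parameter only for testing.
async function fetchImageBlob(url, fetchFn = fetch) {
  const res = await fetchFn(url);
  if (!res.ok) {
    throw new Error("Failed to fetch " + url + " (HTTP " + res.status + ")");
  }
  return res.blob();
}

// Possible wiring, assuming the same worker setup as 2-fetch.js:
// fetchImageBlob("text.png")
//   .then(async (blob) => { /* (B) create worker, (C) recognize(blob) */ })
//   .catch(err => console.error(err));
```

With the .catch() in place, a missing text.png shows up as a clear error in the console.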
(3) WEBCAM TO TEXT
3-cam.html
<!-- (A) LOAD TESSERACT -->
<!-- https://cdnjs.com/libraries/tesseract.js -->
<script defer src="https://cdnjs.cloudflare.com/ajax/libs/tesseract.js/4.1.1/tesseract.min.js"></script>
<!-- (B) WEBCAM & RESULT -->
<video id="vid" autoplay></video>
<button id="go">Go!</button>
<textarea id="result"></textarea>
- Last example - Use the webcam to capture an image, send to Tesseract.
- <video> Webcam live feed.
- <button> Snapshot button.
- <textarea> Result.
3-cam.js
var webkam = {
  // (A) INITIALIZE
  worker : null, // tesseract worker
  hVid : null, hGo : null, hRes : null, // html elements
  init : () => {
    // (A1) GET HTML ELEMENTS
    webkam.hVid = document.getElementById("vid");
    webkam.hGo = document.getElementById("go");
    webkam.hRes = document.getElementById("result");
    // (A2) GET USER PERMISSION TO ACCESS CAMERA
    navigator.mediaDevices.getUserMedia({ video: true })
    .then(async (stream) => {
      // (A2-1) CREATE ENGLISH WORKER
      webkam.worker = await Tesseract.createWorker();
      await webkam.worker.loadLanguage("eng");
      await webkam.worker.initialize("eng");
      // (A2-2) WEBCAM LIVE STREAM
      webkam.hVid.srcObject = stream;
      webkam.hGo.onclick = webkam.snap;
    })
    .catch(err => console.error(err));
  },
  // (B) SNAP VIDEO FRAME TO TEXT
  snap : async () => {
    // (B1) CREATE NEW CANVAS
    let canvas = document.createElement("canvas"),
        ctx = canvas.getContext("2d"),
        vWidth = webkam.hVid.videoWidth,
        vHeight = webkam.hVid.videoHeight;
    // (B2) CAPTURE VIDEO FRAME TO CANVAS
    canvas.width = vWidth;
    canvas.height = vHeight;
    ctx.drawImage(webkam.hVid, 0, 0, vWidth, vHeight);
    // (B3) CANVAS TO IMAGE, IMAGE TO TEXT
    const res = await webkam.worker.recognize(canvas.toDataURL("image/png"));
    webkam.hRes.value = res.data.text;
  },
};
window.addEventListener("load", webkam.init);
- (A) On window load:
- (A1) Get the HTML elements (video, button, textarea).
- (A2) Get the user's permission to access the webcam. Thereafter, create Tesseract worker and enable "snapshot" button.
- (B) On clicking "snapshot":
- (B1 & B2) Create an empty canvas. Capture the current video frame onto the canvas.
- (B3) Pass the video frame to Tesseract, output the result.
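A small caveat for (B1 & B2): videoWidth and videoHeight are 0 until the video has loaded its metadata, so a very quick click on "Go!" would capture an empty canvas. Here is a minimal sketch of a check that could run at the top of the snapshot function. The helper name hasFrame is an assumption for illustration and is not part of the original script or TesseractJS.

```javascript
// Hypothetical helper: true once the video element reports real frame
// dimensions, i.e. there is actually something to draw onto the canvas.
function hasFrame(video) {
  return video.videoWidth > 0 && video.videoHeight > 0;
}

// Possible wiring at the top of webkam.snap in 3-cam.js:
// snap : async () => {
//   if (!hasFrame(webkam.hVid)) { return; } // video not ready yet, skip this click
//   ... (B1) to (B3) as before ...
// },
```

Skipping the click is only one option; showing a "camera not ready" message in the textarea would work just as well.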
THE END
That's all for this condensed tutorial. Here are the links to the GIST and more.