DEV Community

Code Boxx

JavaScript Image To Text (OCR)

OPTICAL CHARACTER RECOGNITION (OCR)

  • Data entry, what's that?
  • "Optical Character Recognition" (OCR) is the technology that converts an image of text into machine-readable text.
  • Tesseract.js is an open-source OCR library that runs in the browser.

(1) INPUT IMAGE TO TEXT

1-select.html

<!-- (A) LOAD TESSERACT -->
<!-- https://cdnjs.com/libraries/tesseract.js -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/tesseract.js/4.1.1/tesseract.min.js"></script>

<!-- (B) FILE SELECTOR & RESULT -->
<input type="file" id="select" accept="image/png, image/gif, image/webp, image/jpeg">
<textarea id="result"></textarea>
  • (A) Load Tesseract.js from the CDN. You can also download it and host it on your own server.
  • (B) For this example, we will pick a file and show the result in the <textarea>.
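
The accept attribute in (B) only filters what the file dialog shows; a user can usually still pick some other file type. A minimal sketch of a check that mirrors the same list is shown here. The names ACCEPTED_TYPES and isSupportedImage are hypothetical and not part of the original example.

```javascript
// MIME types copied from the accept attribute of the file picker in 1-select.html
const ACCEPTED_TYPES = ["image/png", "image/gif", "image/webp", "image/jpeg"];

// Hypothetical helper: true only for a file object whose type is in the list
function isSupportedImage (file) {
  return Boolean(file) && ACCEPTED_TYPES.includes(file.type);
}

// Possible use at the top of the file picker's onchange handler:
// if (!isSupportedImage(hSel.files[0])) { return; }
```

The helper only looks at the type reported by the browser, so it is a convenience check and not a guarantee that the file is a readable image.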

1-select.js

window.addEventListener("load", async () => {
  // (A) GET HTML ELEMENTS
  const hSel = document.getElementById("select"),
        hRes = document.getElementById("result");

  // (B) CREATE ENGLISH TESSERACT WORKER
  const worker = await Tesseract.createWorker();
  await worker.loadLanguage("eng");
  await worker.initialize("eng");

  // (C) ON FILE SELECT - IMAGE TO TEXT
  hSel.onchange = async () => {
    // NO FILE PICKED (E.G. DIALOG CANCELLED) - NOTHING TO DO
    if (!hSel.files.length) { return; }
    const res = await worker.recognize(hSel.files[0]);
    hRes.value = res.data.text;
  };
});
  • (A) On window load, get the HTML file picker and textarea.
  • (B) Create a Tesseract worker, set it to recognize English.
  • (C) On picking a file, send it to Tesseract. Put the results into the textarea.
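
Recognition can take a few seconds, so it may help to show progress. Tesseract.js 4.x accepts a logger callback in the options passed to createWorker. The sketch below formats those messages, assuming they carry a status string and a progress value between 0 and 1; formatProgress is a hypothetical helper name.

```javascript
// Hypothetical helper: turn a Tesseract.js logger message into a short status line.
// Assumes messages shaped like { status: "recognizing text", progress: 0.5 }.
function formatProgress (m) {
  const pct = Math.round((m.progress || 0) * 100);
  return `${m.status}: ${pct}%`;
}

// Possible use when creating the worker in (B):
// const worker = await Tesseract.createWorker({
//   logger: m => console.log(formatProgress(m))
// });
```

The same string could be written into the page instead of the console, for example into a status element next to the textarea.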

(2) FETCH IMAGE TO TEXT

2-fetch.html

<!-- (A) LOAD TESSERACT -->
<!-- https://cdnjs.com/libraries/tesseract.js -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/tesseract.js/4.1.1/tesseract.min.js"></script>

<!-- (B) RESULT -->
<textarea id="result"></textarea>

There is no file picker in this example; the image is fetched directly from the server.

2-fetch.js

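window.addEventListener("load", () => {
  // (A) FETCH IMAGE
  fetch("text.png")
  .then(res => {
    // STOP ON HTTP ERRORS (E.G. 404) INSTEAD OF SENDING AN ERROR PAGE TO TESSERACT
    if (!res.ok) { throw new Error(`Fetch failed: ${res.status}`); }
    return res.blob();
  })
  .then(async (blob) => {
    // (B) CREATE ENGLISH WORKER
    const worker = await Tesseract.createWorker();
    await worker.loadLanguage("eng");
    await worker.initialize("eng");

    // (C) RESULT
    const res = await worker.recognize(blob);
    document.getElementById("result").value = res.data.text;

    // ONE-OFF JOB - RELEASE THE WORKER WHEN DONE
    await worker.terminate();
  })
  .catch(err => console.error(err));
});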
  • (A) Fetch image from server.
  • (B) On fetch, create Tesseract worker.
  • (C) Pass the image to the worker, output the text in the textarea.
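
Creating a worker is the slow part, since it has to load the OCR engine and the language data. If a page converts several images, one worker can be shared between them. The sketch below wraps any worker factory so that it runs at most once; makeWorkerGetter is a hypothetical helper, not part of Tesseract.js.

```javascript
// Hypothetical helper: wrap a worker factory so it runs at most once.
// Every call returns the same promise, so all callers share one worker.
function makeWorkerGetter (factory) {
  let pending = null;
  return () => {
    if (pending === null) { pending = factory(); }
    return pending;
  };
}

// Possible use with the worker setup from (B):
// const getWorker = makeWorkerGetter(async () => {
//   const worker = await Tesseract.createWorker();
//   await worker.loadLanguage("eng");
//   await worker.initialize("eng");
//   return worker;
// });
// const res = await (await getWorker()).recognize(blob);
```

Caching the promise, not the resolved worker, means that two calls made before the worker is ready still end up with the same worker.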

(3) WEBCAM TO TEXT

3-cam.html

<!-- (A) LOAD TESSERACT  -->
<!-- https://cdnjs.com/libraries/tesseract.js -->
<script defer src="https://cdnjs.cloudflare.com/ajax/libs/tesseract.js/4.1.1/tesseract.min.js"></script>

<!-- (B) WEBCAM & RESULT -->
<video id="vid" autoplay></video>
<button id="go">Go!</button>
<textarea id="result"></textarea>
  • Last example: use the webcam to capture an image and send it to Tesseract.
  • <video> Webcam live feed.
  • <button> Snapshot button.
  • <textarea> Result.

3-cam.js

var webkam = {
  // (A) INITIALIZE
  worker : null, // tesseract worker
  hVid : null, hGo : null, hRes : null, // html elements
  init : () => {
    // (A1) GET HTML ELEMENTS
    webkam.hVid = document.getElementById("vid");
    webkam.hGo = document.getElementById("go");
    webkam.hRes = document.getElementById("result");

    // (A2) GET USER PERMISSION TO ACCESS CAMERA
    navigator.mediaDevices.getUserMedia({ video: true })
    .then(async (stream) => {
      // (A2-1) CREATE ENGLISH WORKER
      webkam.worker = await Tesseract.createWorker();
      await webkam.worker.loadLanguage("eng");
      await webkam.worker.initialize("eng");

      // (A2-2) WEBCAM LIVE STREAM
      webkam.hVid.srcObject = stream;
      webkam.hGo.onclick = webkam.snap;
    })
    .catch(err => console.error(err));
  },

  // (B) SNAP VIDEO FRAME TO TEXT
  snap : async () => {
    // (B1) CREATE NEW CANVAS
    let canvas = document.createElement("canvas"),
        ctx = canvas.getContext("2d"),
        vWidth = webkam.hVid.videoWidth,
        vHeight = webkam.hVid.videoHeight;

    // (B2) CAPTURE VIDEO FRAME TO CANVAS
    canvas.width = vWidth;
    canvas.height = vHeight;
    ctx.drawImage(webkam.hVid, 0, 0, vWidth, vHeight);

    // (B3) CANVAS TO IMAGE, IMAGE TO TEXT
    const res = await webkam.worker.recognize(canvas.toDataURL("image/png"));
    webkam.hRes.value = res.data.text;
  },
};
window.addEventListener("load", webkam.init);
  • (A) On window load:
    • (A1) Get the HTML elements (video, button, textarea).
    • (A2) Get the user's permission to access the webcam. Thereafter, create the Tesseract worker and enable the "snapshot" button.
  • (B) On clicking "snapshot":
    • (B1 & B2) Create an empty canvas. Capture the current video frame onto the canvas.
    • (B3) Pass the video frame to Tesseract, output the result.
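
The example above keeps the camera running for as long as the page is open. A minimal sketch of releasing it is shown here, assuming the stream is still attached to the video element as in (A2-2); stopCamera is a hypothetical helper name.

```javascript
// Hypothetical helper: stop every track of the stream attached to a <video>
// element, then detach the stream so the camera is released.
function stopCamera (video) {
  const stream = video.srcObject;
  if (!stream) { return 0; }
  const tracks = stream.getTracks();
  tracks.forEach(track => track.stop());
  video.srcObject = null;
  return tracks.length; // number of tracks that were stopped
}

// Possible use once no more snapshots are needed:
// stopCamera(webkam.hVid);
// await webkam.worker.terminate();
```

Stopping the tracks is also what usually turns off the camera indicator in the browser.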

THE END

That's all for this condensed tutorial. Here are the links to the GIST and more.
