I used Node.js to OCR "Meme Monday" threads

#javascript #tutorial #programming #ai

I love programming-related memes and jokes, and I'm sure you do as well. @ben's weekly "Meme Monday" posts are an amazing source of humor that I always look forward to.

What we're building

We will build a simple project that outputs a markdown file with all the memes from a Meme Monday thread. Each meme will be output alongside the text detected in it by OCR (optical character recognition).

OCR detection will be done with Tesseract.

Setup

Spin up a Node.js Repl on Replit.

Installing Tesseract

If you run tesseract in the shell, you will notice the command does not exist since it isn't installed.

In the top-right corner of the file tree, click the three dots and select "Show hidden files".

Navigate to the replit.nix configuration file and add pkgs.tesseract4 to the package dependency list.

{ pkgs }: {
  deps = [
    pkgs.tesseract4
    pkgs.nodejs-18_x
    pkgs.nodePackages.typescript-language-server
    pkgs.yarn
    pkgs.replitPackages.jest
  ];
}

Run tesseract in the shell. It should show some options now.

Dependencies

Install node-tesseract-ocr and node-fetch. We pin node-fetch to version 2 because version 3 is ESM-only and can't be loaded with require.

npm install node-tesseract-ocr node-fetch@2

We're all set, let's get coding.

Building the thing

Navigate to index.js.

Require/import the following dependencies at the top of the file.

const tesseract = require("node-tesseract-ocr");
const fetch = require("node-fetch");
const fs = require("fs");

Fetching article comments

Create an asynchronous function fetchArticleComments that takes a slug argument.

const fetchArticleComments = async (slug) => {

}

Let's hit the dev.to API and get an article by its slug. If the request fails, let's throw an error.

const articleRes = await fetch("https://dev.to/api/articles/" + slug)

if (!articleRes.ok) throw new Error("Failed to fetch article")

const article = await articleRes.json();

Derive the article's ID and fetch the article comments with it. Return the comments if the response is successful.

const fetchArticleComments = async (slug) => {
  const articleRes = await fetch("https://dev.to/api/articles/" + slug)

  if (!articleRes.ok) throw new Error("Failed to fetch article")

  const article = await articleRes.json();

  const commentsRes = await fetch("https://dev.to/api/comments?a_id=" + article.id);

  if (!commentsRes.ok) throw new Error("Failed to fetch comments")

  return await commentsRes.json();
}
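One thing to keep in mind: the comments endpoint returns threads, so replies don't show up as top-level entries. As far as I can tell from the Forem API docs, each comment carries a children array with its replies. The sketch below assumes that shape. flattenComments is a hypothetical helper (not part of the code above) that you could run on the result so memes posted in replies get picked up too.

```javascript
// Hypothetical helper: flattens threaded comments into a single list.
// Assumes each comment may have a `children` array of nested replies.
const flattenComments = (comments) => {
  const flat = [];

  for (const comment of comments) {
    flat.push(comment);

    // Recurse into replies, if there are any
    if (Array.isArray(comment.children) && comment.children.length) {
      flat.push(...flattenComments(comment.children));
    }
  }

  return flat;
};

// Made-up data shaped like the assumed API response
const sample = [
  {
    id_code: "a",
    body_html: "<p>top</p>",
    children: [{ id_code: "b", body_html: "<p>reply</p>", children: [] }],
  },
  { id_code: "c", body_html: "<p>another</p>", children: [] },
];

console.log(flattenComments(sample).map((c) => c.id_code)); // [ 'a', 'b', 'c' ]
```

If you go this route, you could return flattenComments(await commentsRes.json()) from fetchArticleComments instead of the raw response.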

Extracting URLs

Create and call an asynchronous main function at the end of the file.

async function main() {

}

main();

Within the main function, fetch the comments of a dev.to article and create a urls array in which we'll store the extracted URLs.

const comments = await fetchArticleComments("ben/meme-monday-59gk");

// Embedded Image URLs found in the comments
const urls = [];

Create a for loop and iterate through the comments. For each comment, let's use a regular expression to match the image URLs in src attributes and push them to urls.

for (const comment of comments) {
  // Get embedded images from the comment
  const images = comment.body_html.match(/src=\"[^\"]+\.(jpg|png|webp|jpeg)\"/g);

  // Extract the image URLs from the embedded images
  if (images?.length) {
    const imageUrls = images.map(str => str.replace(/src="/, "").replace(/"/, ""));

    urls.push(...imageUrls);
  }
}
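If you'd rather skip the two replace calls, a capture group can do the same job in one pass. Here's a rough sketch of that idea. extractImageUrls is a name I made up for illustration, and the sample HTML is invented.

```javascript
// Hypothetical helper: returns the image URLs found in src attributes
const extractImageUrls = (html) => {
  // The capture group holds just the URL, without src=" and the closing quote
  const pattern = /src="([^"]+\.(?:jpg|jpeg|png|webp))"/g;

  return [...html.matchAll(pattern)].map((match) => match[1]);
};

// Made-up comment body for demonstration
const html =
  '<p>lol</p><img src="https://example.com/meme.png" alt="meme">' +
  '<a href="https://example.com/page">link</a>' +
  '<img src="https://example.com/cat.jpeg">';

console.log(extractImageUrls(html));
// [ 'https://example.com/meme.png', 'https://example.com/cat.jpeg' ]
```

Inside the loop, this would boil down to urls.push(...extractImageUrls(comment.body_html)).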

OCR Text Extraction

Create an array variable images for storing URLs and the extracted OCR text.

const images = [];

Create a for loop to iterate through urls. Use fetch and res.ok to ensure that the image exists.

for (const i in urls) {
  const url = urls[i];

  // Make sure the image exists
  const res = await fetch(url);

  if (res.ok) {

  }
}

Within the if (res.ok) statement, use await tesseract.recognize(url) to get the text from the respective URL and push it to images.

if (res.ok) {
  const text = await tesseract.recognize(url);

  images.push({
    url,
    text
  });

  console.log("Finished Processing URL", Number(i) + 1, "of", urls.length);
}

Finally, at the end of the main function, use fs.writeFileSync to write the results to a file named memes.md.

fs.writeFileSync(
  "memes.md",
  images
    .map(({ url, text }) => {
      // Sanitize the text to be an image alt by escaping special markdown tokens and replacing newlines with spaces
      const sanitizedText = text.replace(/\[|\]|\"/g, c => "\\" + c).replaceAll("\n", " ").trim();

      // Return the text followed by a markdown-formatted image
      return `${text}\n\n![${sanitizedText}](${url})`
    })
    .join("\n\n")
);
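To see what the sanitizing step does on its own, here's a small standalone sketch. toMarkdown is a hypothetical helper and the sample text is made up. This version turns line breaks into spaces in the alt text and trims surrounding whitespace, which is my own choice rather than anything Tesseract requires.

```javascript
// Hypothetical helper: formats one OCR result as markdown
const toMarkdown = ({ url, text }) => {
  const altText = text
    // Escape characters that would break the ![alt](url) syntax
    .replace(/[\[\]"]/g, (c) => "\\" + c)
    // Collapse line breaks into single spaces
    .replace(/\s*\n\s*/g, " ")
    .trim();

  // The raw text first, then the image with the sanitized alt text
  return `${text.trim()}\n\n![${altText}](${url})`;
};

// Made-up OCR output for demonstration
const sample = {
  url: "https://example.com/meme.png",
  text: 'IT WORKS\non my "machine"\n',
};

console.log(toMarkdown(sample));
```

With a helper like this, the write step could shrink to fs.writeFileSync("memes.md", images.map(toMarkdown).join("\n\n")).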

Run the Repl. You should see progress logged as each image gets processed, and at the end you will find a memes.md file full of the memes along with their OCR-extracted text.

If you use the Markdown tool, you can preview the output markdown file.