Aashish Panthi


Introducing juhu - an open source search engine

Overview of My Submission

Juhu is an open-source search engine built with Node.js and the Redis database. You have probably already guessed the use case, so why not try it yourself? Search for Elon Musk.

I was always curious about how search engines are built and, most importantly, how SEO works. Throughout the journey of building this search engine, I learned a lot about how data should be displayed on a website and which HTML tags are needed to help a crawler understand what your website is about. Also, using pre-styled HTML elements (built for a specific purpose) works better than styling generic elements with CSS later.
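As an illustration of the kind of tags a crawler looks at, here is a toy sketch (not Juhu's actual code) that pulls the `<title>` and `<meta name="description">` out of raw HTML with plain regular expressions. A real crawler would use a proper HTML parser, and this sketch assumes the attributes appear in `name` then `content` order:

```javascript
// Toy sketch: extract the title and meta description from raw HTML.
// Regexes are enough for a demo; real crawlers use a full HTML parser.
function extractMeta(html) {
  const title = (html.match(/<title[^>]*>([^<]*)<\/title>/i) || [])[1] || "";
  const descMatch = html.match(
    /<meta\s+name=["']description["']\s+content=["']([^"']*)["']/i
  );
  return { title: title.trim(), description: descMatch ? descMatch[1] : "" };
}

const sample = `<html><head>
  <title>Juhu - open source search engine</title>
  <meta name="description" content="A privacy-friendly search engine">
</head><body></body></html>`;

console.log(extractMeta(sample));
// { title: 'Juhu - open source search engine',
//   description: 'A privacy-friendly search engine' }
```

These two tags are also what a search engine typically shows as the clickable title and snippet on a results page, which is why getting them right matters so much for SEO.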

Submission Category:

MEAN/MERN Mavericks — yes, I used the MERN stack, but with the M replaced by an R: Redis is the primary database.

Video Explainer of My Project

Language Used:

I used JavaScript as the primary language and Node.js as the runtime environment. I used several libraries/packages to build this application. Some of them are:

  • Puppeteer -> browser automation tool used to scrape websites

  • Redis OM -> used to interact with Redis services (Redis Search / JSON database)

  • Express -> to create the server

  • React -> to create the client side (frontend)

  • Mongoose -> to use the secondary database (MongoDB)

Link to Code

The MERN application repository:

aashishpanthi / search-engine

This is an open source search engine built using redis and puppeteer

Juhu - search engine

Juhu is an open-source search engine that doesn't track users and is fully customizable.

[Screenshots: home page, search results, image search]

Overview video

Here's a short video that explains the project and how it uses Redis:


How it works

First of all, a bot runs through different websites, checking whether each URL is allowed to be crawled. If it can be crawled, it can be indexed, so the bot scrapes the data from that website, filters it, and stores it in a form that makes the scraped data easier to search and index. The little architecture diagram below should clear up what I just said:

Juhu architecture
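The "allowed to crawl" check comes from the site's robots.txt file. Here is a simplified sketch of that check (illustrative only, not the bot's actual code — real parsers also honour `Allow` rules, wildcards, and crawl delays):

```javascript
// Simplified robots.txt check: is `path` allowed for our user agent?
// Real implementations also handle Allow rules, wildcards, and Crawl-delay.
function isAllowed(robotsTxt, path, userAgent = "*") {
  let applies = false;
  const disallowed = [];
  for (const raw of robotsTxt.split("\n")) {
    const line = raw.split("#")[0].trim();
    if (/^user-agent:/i.test(line)) {
      const agent = line.slice(line.indexOf(":") + 1).trim();
      applies = agent === "*" || agent === userAgent;
    } else if (applies && /^disallow:/i.test(line)) {
      const rule = line.slice(line.indexOf(":") + 1).trim();
      if (rule) disallowed.push(rule);
    }
  }
  return !disallowed.some((rule) => path.startsWith(rule));
}

const robots = `User-agent: *
Disallow: /admin
Disallow: /private`;

console.log(isAllowed(robots, "/search"));      // true  -> safe to crawl & index
console.log(isAllowed(robots, "/admin/users")); // false -> skip this URL
```

If a path is disallowed, the bot simply skips it and moves on to the next URL in its queue.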

How the data is stored:

First of all, our server needs to be connected to the Redis database. It's a long piece of code, but it works :).

import { Client }
…
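To give an idea of what "a form that is easier to search" can look like, here is a toy inverted index in plain JavaScript. This is only an illustration — Juhu actually delegates indexing to Redis Search over JSON documents, and the field names below (`url`, `title`, `body`) are hypothetical:

```javascript
// Toy inverted index: maps each token to the list of page URLs containing it.
// Juhu delegates this work to Redis Search; this only illustrates the idea.
const tokenize = (text) => text.toLowerCase().split(/\W+/).filter(Boolean);

function buildIndex(pages) {
  const index = new Map();
  for (const page of pages) {
    // Index each distinct token from the page's title and body once.
    for (const token of new Set(tokenize(page.title + " " + page.body))) {
      if (!index.has(token)) index.set(token, []);
      index.get(token).push(page.url);
    }
  }
  return index;
}

const pages = [
  { url: "https://example.com/a", title: "Elon Musk", body: "SpaceX founder" },
  { url: "https://example.com/b", title: "Redis", body: "In-memory database" },
];

const index = buildIndex(pages);
console.log(index.get("musk")); // [ 'https://example.com/a' ]
```

Looking up a query term is then a single map access instead of a scan over every stored page, which is essentially what makes full-text search fast.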

The bot repository:

Web-crawler

This is a web crawler made using the Puppeteer library in JavaScript. It visits different websites, scrapes information, and stores it in a database that the search engine can access.
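The crawler's visiting loop can be pictured as a breadth-first frontier over URLs. The sketch below assumes a `scrape(url)` function that returns the links found on a page — in the real bot that role is played by the Puppeteer-driven page visit:

```javascript
// Breadth-first crawl frontier: visit each URL once, queue its outgoing links.
// `scrape` is a stand-in for the real Puppeteer-based page visit.
async function crawl(seedUrl, scrape, maxPages = 100) {
  const visited = new Set();
  const queue = [seedUrl];
  while (queue.length > 0 && visited.size < maxPages) {
    const url = queue.shift();
    if (visited.has(url)) continue;
    visited.add(url);
    const links = await scrape(url); // fetch the page, return discovered links
    for (const link of links) {
      if (!visited.has(link)) queue.push(link);
    }
  }
  return [...visited];
}

// Demo with a fake link graph instead of real network calls:
const graph = { "/a": ["/b", "/c"], "/b": ["/a"], "/c": [] };
crawl("/a", async (url) => graph[url] || []).then((pages) =>
  console.log(pages) // [ '/a', '/b', '/c' ]
);
```

The `visited` set is what keeps the bot from crawling the same page twice, and `maxPages` caps a run so the frontier can't grow without bound.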




Architecture:

Juhu architecture

Links:

Some Screenshots of the website

  • Home page

Home juhu

  • Text Result (Search result) page

Search result

  • Image result (Search result) page

Image search result

Collaborators

@roshanacharya and me @aashishpanthi

