Introduction
You might wonder why we need another search engine implementation, especially one using Rust and WebAssembly. While it's true that many developers start with a TODO app, I wanted to challenge myself and explore something different for my first blog post. I chose to create an Algolia-like search engine because that the first thing I came across Algolia on dev.to and I had no better idea.
What is Algolia?
Algolia is a powerful search provider for web applications. It's known for its speed and efficiency in delivering search results. Their tagline, "Show users what they need,"
Our Project: A WebAssembly Search Engine
In this project, we'll create a WebAssembly (Wasm) package using Rust, compile it to JavaScript, and run it in the browser. While you could build a similar search engine using JavaScript frameworks or vanilla JS (possibly with comparable performance), our goal is to explore the possibilities of Rust and WebAssembly.
Let's get started!
Setting Up the Rust Project
First, create a new Rust library project:
cargo new --lib algolia-clone-wasm
Add the following to your Cargo.toml
:
[lib]
crate-type = ["cdylib"]
[dependencies]
wasm-bindgen = "0.2"
serde = { version = "1.0", features = ["derive"] }
serde-wasm-bindgen = "0.4"
Implementing the Search Engine
Let's define the core types for our search engine:
use serde::{Serialize, Deserialize};
use std::collections::{HashMap, HashSet};
use wasm_bindgen::prelude::*;
#[derive(Serialize, Deserialize)]
struct Document {
id: String,
content: String,
}
#[derive(Serialize, Deserialize)]
struct SearchResult {
id: String,
score: usize,
}
#[wasm_bindgen]
pub struct SimpleSearch {
documents: Vec<Document>,
index: HashMap<String, HashSet<String>>,
visited_documents: HashMap<String, Document>,
}
The core of our simple search engine implementation consists of three main components:
Data Structures:
Document: Represents a searchable item with an ID and content.
SearchResult: Represents a search result with an ID and a relevance score.
SimpleSearch: The main struct that holds our search engine's state.
Now, let's implement the SimpleSearch
struct and its methods:
#[wasm_bindgen]
impl SimpleSearch {
#[wasm_bindgen(constructor)]
pub fn new() -> Self {
SimpleSearch {
documents: Vec::new(),
index: HashMap::new(),
visited_documents: HashMap::new(),
}
}
pub fn add_document(&mut self, id: String, content: String) {
let doc = Document {
id: id.clone(),
content: content.clone(),
};
self.documents.push(doc);
for word in content.to_lowercase().split_whitespace() {
self.index
.entry(word.to_string())
.or_insert_with(HashSet::new)
.insert(id.clone());
}
}
pub fn search(&self, query: String, limit: usize) -> Result<JsValue, JsValue> {
let lowercase_query = query.to_lowercase();
let query_words: Vec<&str> = lowercase_query.split_whitespace().collect();
let mut scores: HashMap<String, usize> = HashMap::new();
for word in query_words {
if let Some(matching_docs) = self.index.get(word) {
for doc_id in matching_docs {
*scores.entry(doc_id.clone()).or_insert(0) += 1;
}
}
}
let mut results: Vec<SearchResult> = scores
.into_iter()
.map(|(id, score)| SearchResult { id, score })
.collect();
results.sort_by(|a, b| b.score.cmp(&a.score));
results.truncate(limit);
serde_wasm_bindgen::to_value(&results).map_err(|e| e.into())
}
pub fn get_document(&self, id: String) -> Result<JsValue, JsValue> {
if let Some(doc) = self.documents.iter().find(|d| d.id == id) {
serde_wasm_bindgen::to_value(doc).map_err(|e| e.into())
} else {
Ok(JsValue::NULL)
}
}
}
The search method implements a basic search algorithm:
It splits the search query into individual words.
For each word, it looks up matching documents in the inverted index.
It keeps a score for each document, incrementing it for each matching word.
Finally, it sorts the results by score and returns the top matches.
The wasm-bindgen
crate provides a bridge between JavaScript and Rust types, allowing seamless interaction between the two languages.
Compiling to WebAssembly
To compile our Rust code to WebAssembly, run:
wasm-pack build --target web
This command creates a pkg
folder in your project root, containing the necessary files to run your WebAssembly code.
Using the Search Engine in a Web Page
Here's a simple HTML file demonstrating how to use our search engine:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Rust + WebAssembly Search Engine</title>
<script type="module">
import init, { SimpleSearch } from './pkg/algolia_clone_wasm.js';
async function run() {
await init();
const searchEngine = new SimpleSearch();
// Add sample documents
searchEngine.add_document("1", "TypeScript is a typed superset of JavaScript");
searchEngine.add_document("2", "JavaScript is a popular programming language");
searchEngine.add_document("3", "Algolia provides powerful search capabilities");
// Set up event listener for search
document.getElementById('searchForm').addEventListener('submit', (e) => {
e.preventDefault();
const query = document.getElementById('searchInput').value;
const results = searchEngine.search(query, 10);
displayResults(results);
});
function displayResults(results) {
const resultsContainer = document.getElementById('results');
resultsContainer.innerHTML = '';
results.forEach(result => {
const doc = searchEngine.get_document(result.id);
const resultElement = document.createElement('div');
resultElement.innerHTML = `<p><strong>ID:</strong> ${result.id}, <strong>Score:</strong> ${result.score}</p><p><strong>Content:</strong> ${doc.content}</p>`;
resultsContainer.appendChild(resultElement);
});
}
}
run();
</script>
</head>
<body>
<h1>Rust + WebAssembly Search Engine</h1>
<form id="searchForm">
<input type="text" id="searchInput" placeholder="Enter search query">
<button type="submit">Search</button>
</form>
<div id="results"></div>
</body>
</html>
Conclusion
Forgive me if you thought this will be helpful it was just for fun and as i was bored and wanted to try something it was about demonstrating the potential of combining Rust and WebAssembly to create a simple search engine. While this implementation may not be as feature-rich as Algolia, it showcases the possibilities of using Rust in web development.
In future iterations, we could:
- Implement an in-memory database to store visited pages
- Benchmark this implementation against JavaScript frameworks
- Add more advanced search features, such as fuzzy matching or relevance scoring
Thank you for reading!
Top comments (0)