When I say I created a NoSQL db using Rust, I mean kind of. People ask me why I would do that, and I always ask myself: why not? Someone once thought making JS in 10 days was a good idea, so I really don't think this counts as the worst idea ever.
So, a NoSQL db. Let's start with the basics: what is a database? From computer science class we all know that a database is a tool for storing and accessing data.
A database consists of three main parts. It can have more components, but these are the basic ones:
1) Data model (how the data stored in the db is represented)
2) Storage engine (responsible for storing data and for managing disk and memory when data is accessed or deleted)
3) Query engine (responsible for talking to the db)
OK, now let's define the document model first. I'm designing this after the MongoDB document model. MongoDB is written in C++ and supports many more data types, but we will make a very simple implementation of it.
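// Deriving Serialize/Deserialize for Uuid and DateTime needs the "serde"
// feature of the uuid and chrono crates.
use std::collections::HashMap;
use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};
use uuid::Uuid;

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Document {
    id: Uuid,
    created_at: DateTime<Utc>,
    updated_at: DateTime<Utc>,
    data: HashMap<String, Value>,
}
// the document model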
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
pub enum Value {
    Null,
    Boolean(bool),
    Integer(i64),
    Float(f64),
    String(String),
    Array(Vec<Value>),
    Object(HashMap<String, Value>),
    Date(DateTime<Utc>),
}
// basically we can put any kind of data into our db
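To make "any kind of data" a bit more concrete, here is a small sketch of how nested values compose. It uses a trimmed copy of `Value` (the `Date` variant is dropped so it compiles with the standard library alone), and `get_path` and `sample_data` are hypothetical helpers I made up for illustration, not part of the project.

```rust
use std::collections::HashMap;

// Trimmed copy of the `Value` enum above. `Date` is left out so this
// sketch needs no external crates.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Boolean(bool),
    Integer(i64),
    Float(f64),
    String(String),
    Array(Vec<Value>),
    Object(HashMap<String, Value>),
}

// Hypothetical helper: follow a dotted path such as "address.city"
// through nested objects and return the value found there, if any.
pub fn get_path<'a>(data: &'a HashMap<String, Value>, path: &str) -> Option<&'a Value> {
    let mut parts = path.split('.');
    let mut current = data.get(parts.next()?)?;
    for part in parts {
        match current {
            Value::Object(map) => current = map.get(part)?,
            _ => return None, // cannot descend into a non-object value
        }
    }
    Some(current)
}

// Hypothetical example: the `data` map of a document mixing several types.
pub fn sample_data() -> HashMap<String, Value> {
    let mut address = HashMap::new();
    address.insert("city".to_string(), Value::String("New York".to_string()));
    address.insert("zip".to_string(), Value::Integer(10001));

    let mut data = HashMap::new();
    data.insert("name".to_string(), Value::String("Alice".to_string()));
    data.insert("active".to_string(), Value::Boolean(true));
    data.insert(
        "tags".to_string(),
        Value::Array(vec![Value::String("admin".to_string()), Value::Null]),
    );
    data.insert("address".to_string(), Value::Object(address));
    data
}
```

With this, `get_path(&sample_data(), "address.city")` walks into the nested object, while a path that runs through a non-object (like `"name.first"`) gives `None`.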
So now we know what our data is going to look like once it gets stored, but how to store it is the real challenge. Remember, the most important job of a database is saving something and retrieving it. We judge a database's performance by its read and write speed, so how you store your data is absolutely crucial.
I took a very simple approach here: I save the data to disk and keep a cache in RAM for faster retrieval.
#[derive(Debug, Clone, Serialize, Deserialize)]
struct BlockMetadata {
    id: String,
    size: usize,
    next_block: Option<u64>,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
struct Block {
    metadata: BlockMetadata,
    data: Vec<u8>,
}
// we use blocks to store the data on disk
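The `next_block` field suggests that a document larger than one block is stored as a chain of blocks. Here is a minimal sketch of that idea, assuming a fixed block size and consecutive block numbers. `split_into_blocks` and `join_blocks` are names I invented for illustration, and the real engine may lay blocks out differently.

```rust
// Same shape as the structs above, minus the serde derives so the
// sketch compiles with the standard library alone.
#[derive(Debug, Clone)]
pub struct BlockMetadata {
    pub id: String,
    pub size: usize,
    pub next_block: Option<u64>,
}

#[derive(Debug, Clone)]
pub struct Block {
    pub metadata: BlockMetadata,
    pub data: Vec<u8>,
}

// Hypothetical helper: cut `payload` into blocks of at most `block_size`
// bytes. Assumption: block i will live at block number `first_block + i`,
// so every block except the last points at the next number.
pub fn split_into_blocks(id: &str, payload: &[u8], first_block: u64, block_size: usize) -> Vec<Block> {
    assert!(block_size > 0, "block_size must be positive");
    let chunks: Vec<&[u8]> = payload.chunks(block_size).collect();
    let last = chunks.len().saturating_sub(1);
    chunks
        .iter()
        .enumerate()
        .map(|(i, chunk)| Block {
            metadata: BlockMetadata {
                id: id.to_string(),
                size: chunk.len(),
                next_block: if i < last { Some(first_block + i as u64 + 1) } else { None },
            },
            data: chunk.to_vec(),
        })
        .collect()
}

// Hypothetical helper: glue the payload back together from a chain that
// is already in order, reading `size` bytes from each block.
pub fn join_blocks(blocks: &[Block]) -> Vec<u8> {
    blocks
        .iter()
        .flat_map(|b| b.data[..b.metadata.size].iter().copied())
        .collect()
}
```

Reading a document back would then mean starting at its first block and following `next_block` until it is `None`.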
use std::collections::{HashMap, VecDeque};
use std::fs::File;

struct DiskManager {
    file: File,
    free_blocks: VecDeque<u64>,
}

struct Cache {
    blocks: HashMap<u64, Block>,
    lru: VecDeque<u64>,
}
// `lru` is a deque tracking the least recently used (LRU) blocks for cache eviction
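To show how a `HashMap` plus a `VecDeque` can give you LRU eviction, here is a rough sketch. It is not the project's actual cache: I store raw bytes instead of `Block` to keep it short, and the `capacity` field and the `new`, `get`, `put` and `contains` methods are my own assumptions.

```rust
use std::collections::{HashMap, VecDeque};

pub struct Cache {
    blocks: HashMap<u64, Vec<u8>>, // block number -> cached bytes
    lru: VecDeque<u64>,            // front = least recently used
    capacity: usize,               // assumed maximum number of cached blocks
}

impl Cache {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be positive");
        Cache { blocks: HashMap::new(), lru: VecDeque::new(), capacity }
    }

    // Move `block_no` to the back of the deque (most recently used).
    // This is O(n) in the cache size, which is fine for a small sketch.
    fn touch(&mut self, block_no: u64) {
        self.lru.retain(|&n| n != block_no);
        self.lru.push_back(block_no);
    }

    // Look up a block and mark it as recently used on a hit.
    pub fn get(&mut self, block_no: u64) -> Option<&Vec<u8>> {
        if self.blocks.contains_key(&block_no) {
            self.touch(block_no);
        }
        self.blocks.get(&block_no)
    }

    // Insert or overwrite a block, evicting the least recently used
    // entry first when the cache is full.
    pub fn put(&mut self, block_no: u64, data: Vec<u8>) {
        if !self.blocks.contains_key(&block_no) && self.blocks.len() >= self.capacity {
            if let Some(oldest) = self.lru.pop_front() {
                self.blocks.remove(&oldest);
            }
        }
        self.blocks.insert(block_no, data);
        self.touch(block_no);
    }

    pub fn contains(&self, block_no: u64) -> bool {
        self.blocks.contains_key(&block_no)
    }
}
```

So with a capacity of 2, putting blocks 1 and 2, reading block 1, and then putting block 3 throws out block 2, because block 1 was touched more recently.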
pub struct StorageEngine {
    disk: DiskManager,
    cache: Cache,
    index: HashMap<String, u64>, // Document ID to first block number
}
// StorageEngine is the core structure that ties everything together
Now we know how to store our data and what it looks like once it is stored. Next we need to access it, and for that we'll create a query engine of our own. I thought of creating a new template language for accessing the data, but then I thought we are long past the days of useless query template languages like the Velocity Template Language (VTL) used in AWS. Honestly, who thought that was a good idea?
Anyway, for our simple program I just put a bunch of operations in an enum and voilà:
#[derive(Debug, Clone)]
pub enum Operator {
    Eq,
    Ne,
    Gt,
    Lt,
    Gte,
    Lte,
    In,
    Nin,
}
// very basic: equals, not equals, greater than, less than, greater than or equal, less than or equal, in the array, not in the array
#[derive(Debug, Clone)]
pub struct Condition {
    field: String,
    operator: Operator,
    value: Value,
}

#[derive(Debug, Clone)]
pub struct Query {
    conditions: Vec<Condition>,
}
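The structs above only describe a query, so here is a hedged sketch of how one could be evaluated against a document's `data` map. The `compare`, `condition_matches` and `query_matches` functions are hypothetical, the `Value` enum is trimmed to keep the block dependency-free, and two behaviors are my own assumptions: all conditions are combined with AND, and a missing field never matches.

```rust
use std::cmp::Ordering;
use std::collections::HashMap;

// Trimmed copies of the types above so the sketch stands alone.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Boolean(bool),
    Integer(i64),
    Float(f64),
    String(String),
    Array(Vec<Value>),
}

#[derive(Debug, Clone)]
pub enum Operator { Eq, Ne, Gt, Lt, Gte, Lte, In, Nin }

#[derive(Debug, Clone)]
pub struct Condition { pub field: String, pub operator: Operator, pub value: Value }

#[derive(Debug, Clone)]
pub struct Query { pub conditions: Vec<Condition> }

// Order two values when that makes sense (numbers with numbers,
// strings with strings); otherwise there is no ordering.
fn compare(a: &Value, b: &Value) -> Option<Ordering> {
    match (a, b) {
        (Value::Integer(x), Value::Integer(y)) => Some(x.cmp(y)),
        (Value::Float(x), Value::Float(y)) => x.partial_cmp(y),
        (Value::Integer(x), Value::Float(y)) => (*x as f64).partial_cmp(y),
        (Value::Float(x), Value::Integer(y)) => x.partial_cmp(&(*y as f64)),
        (Value::String(x), Value::String(y)) => Some(x.cmp(y)),
        _ => None,
    }
}

pub fn condition_matches(cond: &Condition, data: &HashMap<String, Value>) -> bool {
    let actual = match data.get(&cond.field) {
        Some(v) => v,
        None => return false, // assumption: a missing field never matches
    };
    let ord = compare(actual, &cond.value);
    match cond.operator {
        Operator::Eq => actual == &cond.value,
        Operator::Ne => actual != &cond.value,
        Operator::Gt => ord == Some(Ordering::Greater),
        Operator::Lt => ord == Some(Ordering::Less),
        Operator::Gte => matches!(ord, Some(Ordering::Greater | Ordering::Equal)),
        Operator::Lte => matches!(ord, Some(Ordering::Less | Ordering::Equal)),
        // In / Nin expect the condition value to be an array of candidates.
        Operator::In => matches!(&cond.value, Value::Array(items) if items.contains(actual)),
        Operator::Nin => matches!(&cond.value, Value::Array(items) if !items.contains(actual)),
    }
}

// Assumption: a query matches only when every condition matches (AND).
pub fn query_matches(query: &Query, data: &HashMap<String, Value>) -> bool {
    query.conditions.iter().all(|c| condition_matches(c, data))
}
```

For example, a query with `age Gte 30` and `city In ["New York", "Boston"]` would match a document with `age = 30` and `city = "New York"`, and filtering a collection is just calling `query_matches` on each document.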
Now for the final part: testing out your own db. I'm not gonna lie, it's not a good experience. I still haven't created a library for talking to this db easily, so you have to set some stuff up yourself.
First, create a function for initializing a new document:
// Document::new() and insert() are not shown in this post.
fn create_test_document(name: &str, age: i64, city: &str) -> Document {
    let mut doc = Document::new();
    doc.insert("name".to_string(), Value::String(name.to_string()));
    doc.insert("age".to_string(), Value::Integer(age));
    doc.insert("city".to_string(), Value::String(city.to_string()));
    doc
}
Then you can use this function to add as much data as you want:
// Needs `use std::path::Path;` at the top of the file.
let db_path = Path::new("test.bin");
let mut storage = StorageEngine::new(db_path).unwrap();

let doc1 = create_test_document("Alice", 30, "New York");
let doc2 = create_test_document("Bob", 25, "San Francisco");
let doc3 = create_test_document("Charlie", 35, "New York");

// Each document is serialized to JSON bytes and written under its own id.
storage
    .write(
        doc1.id().to_string().as_str(),
        &serde_json::to_vec(&doc1).unwrap(),
    )
    .expect("Failed to write doc1");
storage
    .write(
        doc2.id().to_string().as_str(),
        &serde_json::to_vec(&doc2).unwrap(),
    )
    .expect("Failed to write doc2");
storage
    .write(
        doc3.id().to_string().as_str(),
        &serde_json::to_vec(&doc3).unwrap(),
    )
    .expect("Failed to write doc3");
Judging just by the performance test I ran on my own computer, I'm hopeful this can become an actual thing someday. Even though I only created this as a fun project, I still have many things planned, so if you are interested and want to be part of the project, just open a PR on GitHub.
This was made purely for learning and understanding, so if you have any suggestions or complaints, feel free to raise them on GitHub.
Here is the GitHub project link:
GitHub
Thank you for sticking around to the very end.