
Amr Hesham


How I created a query language for .git files (GQL)

Hello everyone. Last month I got interested in the Rust programming language and wanted to discover more about it, so I started learning the basics and looking at open source projects written in Rust. I also opened one PR in the rust-analyzer project; it did not depend on my knowledge of Rust but on my general knowledge of compilers and static analysis. As usual, I love to learn new things by creating new projects around ideas that I am interested in.

The idea

I started to think about small ideas that I love to use, for example, a faster search CLI or some utility apps. But then I got a new cool idea.

While reading the Building Git book (a book about building Git from scratch), I learned what each file inside the .git folder does and how Git stores commits, branches and other data and manages its own database. So what if we had a query language that runs on those files?

The Git Query Language (GQL)

I decided to implement this query language, and I named it GQL. I was very excited to start this project because it was my first time implementing a query language. I decided to implement it from scratch, not by converting .git files into an SQLite database and running normal SQL queries. And I thought it would be cool if, in the future, I could use the GQL engine as part of a Git client or analyzer.

The implementation of GQL

The goal is to implement it in two parts. The first part converts the GQL query into an AST. The second part is the engine, which walks the AST and executes it as an interpreter. In the future, the AST could instead be compiled into GQL bytecode instructions for a virtual machine.
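As a rough illustration of the first part, the sketch below parses a tiny subset of the select syntax into an AST node. The names `SelectStatement` and `parse_select` are hypothetical and not taken from the GQL source; a real parser would also handle conditions, ordering, and proper tokenization.

```rust
/// Hypothetical AST node for a `select` query (not the actual GQL types).
#[derive(Debug, PartialEq)]
pub struct SelectStatement {
    /// Requested field names; `["*"]` means every field.
    pub fields: Vec<String>,
    /// Table to read from, e.g. "commits", "branches" or "tags".
    pub table: String,
}

/// Parses `select <fields> from <table>` into an AST node.
/// Returns an error message when the query does not match that shape.
pub fn parse_select(query: &str) -> Result<SelectStatement, String> {
    // Split on whitespace, then strip the commas that separate field names.
    let tokens: Vec<String> = query
        .split_whitespace()
        .map(|t| t.trim_matches(',').to_string())
        .filter(|t| !t.is_empty())
        .collect();

    if tokens.first().map(|t| t.to_lowercase()) != Some("select".to_string()) {
        return Err("expected `select`".to_string());
    }

    // Everything between `select` and `from` is a field name.
    let from_index = tokens
        .iter()
        .position(|t| t.to_lowercase() == "from")
        .ok_or_else(|| "expected `from`".to_string())?;

    let fields: Vec<String> = tokens[1..from_index].to_vec();
    if fields.is_empty() {
        return Err("expected at least one field".to_string());
    }

    let table = tokens
        .get(from_index + 1)
        .cloned()
        .ok_or_else(|| "expected a table name".to_string())?;

    Ok(SelectStatement { fields, table })
}
```

An engine could then match on the resulting node and decide which data to load, which keeps parsing and execution independent of each other.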

The engine works with .git files through git2, the Rust bindings for the libgit2 library, so it can perform select, update and delete tasks. It also stores the selected data in a data structure so we can filter or sort it.

To simplify this implementation, I created a struct called GQLObject that can represent a commit, branch, tag or any other object in this engine. It also makes it easy to perform sorting, searching, and filtering with single functions that deal with this one type.

use std::collections::HashMap;
pub struct GQLObject {
  pub attributes: HashMap<String, String>,
}

The GQLObject is just a map with strings as keys and values, so it is general enough to hold the info of any type. Features like comparison, filtering or sorting can now be implemented easily on this map of strings.
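To make that concrete, here is a minimal sketch of filtering and sorting over such a string map. The struct mirrors the one shown in this post, but `filter_contains`, `sort_by_field` and `object_from` are hypothetical helpers written for this example, not functions from the GQL engine.

```rust
use std::collections::HashMap;

pub struct GQLObject {
    pub attributes: HashMap<String, String>,
}

/// Keeps only the objects whose `field` value contains `needle`.
/// Objects that lack the field are dropped.
pub fn filter_contains(objects: Vec<GQLObject>, field: &str, needle: &str) -> Vec<GQLObject> {
    objects
        .into_iter()
        .filter(|o| o.attributes.get(field).map_or(false, |v| v.contains(needle)))
        .collect()
}

/// Sorts objects in place by the string value of `field`.
/// Objects without the field sort first, because `None < Some(_)`.
pub fn sort_by_field(objects: &mut [GQLObject], field: &str) {
    objects.sort_by(|a, b| a.attributes.get(field).cmp(&b.attributes.get(field)));
}

/// Hypothetical helper that builds an object from key/value pairs.
pub fn object_from(pairs: &[(&str, &str)]) -> GQLObject {
    let attributes = pairs
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect();
    GQLObject { attributes }
}
```

Because every value is a string, the sort is lexicographic, so numeric or date fields would need to be parsed before comparing them.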

The current state

Over the last week, I implemented the select feature with conditions, filtering and sorting, plus optional limit and offset, so you can write queries like this:

select * from commits
select name, email from commits
select name, email from commits order by name
select name, email from commits where name contains "gmail" order by name
select * from commits where name.lower = "amrdeveloper"

select * from branches
select * from branches where ishead = "true"
select * from branches where name ends_with "master"
select * from branches where name contains "origin"

select * from tags
select * from tags offset 1 limit 1
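The conditions and the offset/limit clause in these queries could be evaluated along the following lines. This is only a sketch: the `Operator` enum and the `evaluate` and `paginate` functions are hypothetical names, and applying offset before limit is an assumption borrowed from SQL rather than something confirmed by the GQL source.

```rust
/// Hypothetical comparison operators, matching the ones used in the queries above.
#[derive(Debug, Clone, Copy)]
pub enum Operator {
    Equal,
    Contains,
    EndsWith,
}

/// Evaluates `value <operator> expected`, e.g. `name ends_with "master"`.
pub fn evaluate(value: &str, operator: Operator, expected: &str) -> bool {
    match operator {
        Operator::Equal => value == expected,
        Operator::Contains => value.contains(expected),
        Operator::EndsWith => value.ends_with(expected),
    }
}

/// Skips `offset` rows and then keeps at most `limit` rows.
/// Assumption: offset is applied before limit, as in SQL.
pub fn paginate<T>(rows: Vec<T>, offset: usize, limit: usize) -> Vec<T> {
    rows.into_iter().skip(offset).take(limit).collect()
}
```

With three tags in a repository, `offset 1 limit 1` would then return only the second one.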

The next step

The next step is to optimize the code and start supporting more features. For example, imagine a query for deleting all branches except master:

delete * from branches where name ! "master"

Or pushing all or some branches to a remote repository using a single query. Or maybe grouping and analyzing how many commits each user made this month, and many other things we can do.
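The grouping idea is not implemented yet, but counting commits per user could look roughly like the sketch below. The function `commits_per_author` is a hypothetical name, and it assumes the engine has already collected one author name per commit; restricting the count to the current month is left out.

```rust
use std::collections::BTreeMap;

/// Counts commits per author; the input holds one author name per commit.
/// A BTreeMap keeps the result ordered by author name.
pub fn commits_per_author(authors: &[&str]) -> BTreeMap<String, usize> {
    let mut counts = BTreeMap::new();
    for author in authors {
        // Insert the author with a count of 0 on first sight, then increment.
        *counts.entry(author.to_string()).or_insert(0) += 1;
    }
    counts
}
```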


I am looking forward to your opinion and feedback 😋.


You can find me on: GitHub, LinkedIn, and Twitter.

Enjoy Programming 😋.
