DEV Community

Eugene Tolbakov
Eugene Tolbakov

Posted on

Enhancing the SQL Interval syntax: A story of Open Source contribution

There are many reasons why developers dive into the world of Open Source contributions.

Contributing can be a way to give back to the community and use your skills for the greater good. It's a fantastic environment that allows you to network with talented developers, build relationships, and potentially find mentors or collaborators. For those seeking career advancement, contributions become a public portfolio showcasing your skills and experience. Sometimes, it's about a personal itch! You might encounter a bug or limitation in a project you rely on, and contributing a fix not only solves your frustration but benefits everyone who uses that software.

The recipe for a successful Open Source contribution goes beyond just code. A strong desire to learn fuels the journey, as you navigate unfamiliar codebases and tackle new challenges. But learning flourishes best within a supportive environment.

A responsive community acts as a safety net, offering guidance and feedback to help you refine your contributions. Ideally, you can also "dogfood" the tool you contribute to, using it in your work or personal projects. This firsthand experience provides valuable context for your contributions and ensures they address real-world needs. With these elements in place, you're well on your way to making a lasting impact on the Open Source community.

While I've been building my software development skills for a while now, Rust still feels like a whole new world to explore. This “newbie” feeling might make some shy away from contribution, fearing their code won't be good enough. I use silly mistakes as stepping stones to improve my skills, not a reason to feel discouraged.

Over a year of contributing to GreptimeDB has been a rewarding journey filled with learning experiences. Today, I'd like to walk you through one of those. Let's get our hands dirty (or should I say, claws? 🦀)
This time we will be enhancing the Interval syntax by allowing the [shortened version]

Motivation

This time I chose a task to enhance the Interval syntax by allowing a shortened version. The SQL standard specifies a syntax format such as:

select interval '1 hour';
Enter fullscreen mode Exit fullscreen mode

My objective was to ensure the functionality of the following alternative(shortened) format:

select interval '1h';
Enter fullscreen mode Exit fullscreen mode

Diving right into the code, I discovered that the core functionality for handling transformations already exists. The scope of a change boils down to adding a new rule specifically for the Interval data type: intervals with abbreviated time formats will be automatically expanded to their full versions. Let's take a closer look at a specific section of the code that does the logic

fn visit_expr(&self, expr: &mut Expr) -> ControlFlow<()> {
    match expr {
        Expr::Interval(interval) => match *interval.value.clone() {
            Expr::Value(Value::SingleQuotedString(value)) => {
                if let Some(data) = expand_interval_name(&value) {
                    *expr = create_interval_with_expanded_name(
                        interval,
                        single_quoted_string_expr(data),
                    );
                }
            }
            Expr::BinaryOp { left, op, right } => match *left {
                Expr::Value(Value::SingleQuotedString(value))=> {
                    if let Some(data) = expand_interval_name(&value) {
                        let new_value = Box::new(Expr::BinaryOp {
                            left: single_quoted_string_expr(data),
                            op,
                            right,
                        });
                        *expr = create_interval_with_expanded_name(interval, new_value);
                    }
............
}
Enter fullscreen mode Exit fullscreen mode

Code Review

An experienced Rust developer, Dennis (CEO & Co-Founder @ Greptime), quickly identified an area for improvement and suggested the fix:

Dennis spotted an improvement opportunity
Code review shines as a learning tool.

Beyond simply accepting the suggested change (though the rationale for “efficiency” was clear!), I decided to take a deep dive. Analyzing the proposed improvement and explaining it to myself(in the form of this post), helped me better understand the Rust code and its recommended practices.

Avoiding unnecessary cloning and ownership transfers

Originally I used .clone() on interval.value:

match *interval.value.clone() { ... }
Enter fullscreen mode Exit fullscreen mode

Cloning here creates a new instance of the data each time, which can be inefficient if the data structure is large or cloning is expensive. The suggested version avoids this by using references:

match &*interval.value { ... }
Enter fullscreen mode Exit fullscreen mode

Matching on a reference (&*interval.value) eliminates the cost of cloning. The same improvement is applied to the match on left in the binary operation case:

match &**left { ... }
Enter fullscreen mode Exit fullscreen mode

This one is slightly more involved: it uses a double dereference to get a reference to the value inside a Box, which is more efficient than cloning.

Clearer Pattern Matching

Using references in pattern matching emphasizes the intention of only reading data, not transferring ownership:

match &*interval.value { ... }
Enter fullscreen mode Exit fullscreen mode

This shows explicitly that the matched value is not being moved. It helps with reasoning about the code, especially in a context with complex ownership and borrowing rules.

Reduced Cloning Inside the Binary Operation

In the original code, the op and right fields of the Expr::BinaryOp variant are cloned unconditionally.

let new_value = Box::new(Expr::BinaryOp {
    left: single_quoted_string_expr(data),
    op,
    right,
});
Enter fullscreen mode Exit fullscreen mode

However, they only need to be cloned if the left field is an Expr::Value variant with a string value. The suggested enhancement moves the cloning inside the if let block, so it only happens when necessary.

let new_value = Box::new(Expr::BinaryOp {
    left: single_quoted_string_expr(data),
    op: op.clone(),
    right: right.clone(),
});
Enter fullscreen mode Exit fullscreen mode

Using references instead of cloning:

In the original code, expand_interval_name(&value) is used, which borrows value. However, value is of type String, which implements the AsRef<str> trait. This means that it can be automatically dereferenced to a &str when necessary. However, value is of type String, which implements the Deref<Target=str> trait (more information could be found here). This means that it can be automatically dereferenced to a &str when necessary. The improved version uses expand_interval_name(value), which avoids the need for explicit reference.

Summary

So in the context of this change “efficiency” stands for:
• Avoiding unnecessary cloning, reducing overhead.
• Making the borrowing and ownership patterns clearer and safer.
• Enhancing overall readability and maintainability.

This is how the visit_expr function looks like after all suggestions have been applied:

fn visit_expr(&self, expr: &mut Expr) -> ControlFlow<()> {
        match expr {
            Expr::Interval(interval) => match &*interval.value {
                Expr::Value(Value::SingleQuotedString(value))
                | Expr::Value(Value::DoubleQuotedString(value)) => {
                    if let Some(expanded_name) = expand_interval_name(value) {
                        *expr = update_existing_interval_with_value(
                            interval,
                            single_quoted_string_expr(expanded_name),
                        );
                    }
                }
                Expr::BinaryOp { left, op, right } => match &**left {
                    Expr::Value(Value::SingleQuotedString(value))
                    | Expr::Value(Value::DoubleQuotedString(value)) => {
                        if let Some(expanded_name) = expand_interval_name(value) {
                            let new_expr_value = Box::new(Expr::BinaryOp {
                                left: single_quoted_string_expr(expanded_name),
                                op: op.clone(),
                                right: right.clone(),
                            });
                            *expr = update_existing_interval_with_value(interval, new_expr_value);
                        }
                    }
                    _ => {}
                },
    .....................
    }
Enter fullscreen mode Exit fullscreen mode

Open Source contribution has been an incredible way to accelerate my Rust learning curve. My efforts on this #GoodFirstIssue serve as an illustration of how to improve skills through collaborations. Depending on your feedback, I'm excited to share more of these learning experiences in future posts!

A huge thanks to the entire Greptime team, especially Dennis, for their support and guidance! Let’s keep the contributing/learning going!

UPD: Thanks to @tisonkun for the thorough review!

Top comments (2)

Collapse
 
tisonkun profile image
Zili Chen

Using references instead of cloning:

I noticed that this part is not quite correct. The reason you don't need to add an explicit & is because String implements Deref<Target=str>, not the AsRef<str> trait.

You can read this doc: doc.rust-lang.org/std/ops/trait.De...

Furthermore, in the clone version, value is an owned struct, so you need &value to explicitly indicate no ownership transfer.

Collapse
 
etolbakov profile image
Eugene Tolbakov

Thanks @tisonkun I'll make the update in post