
Laysa Uchoa

Originally published at aiven.io

Write search queries with Python and OpenSearch®

A Python-perfect dinner party with OpenSearch®

Introduction

When I plan a dinner party, I want my guests to have a great experience, and I definitely do not want anyone to go hungry. I need to check the ingredients as well as my guests' dietary restrictions and preferences. If you also find planning that special dinner challenging, you are in for a treat.

In this post, I'll show how to find delicious recipes the Pythonic way. We just need data, Python, and the power of OpenSearch® to plan a perfect dinner party!

Our support material on this learning journey is a small CLI application that lets you explore common types of OpenSearch queries and even run them yourself.

Getting started

Some may agree that our search results can only be as good as our dataset. But do not worry: we will be using a high-quality dataset from Epicurious, which contains over 20,000 recipes with ratings and nutrition information.

I'll be using Aiven's fully managed OpenSearch service to get our cluster up and running. I've prepared a demo that contains all the code to connect, send data, and run the search queries.

Everything is explained in the project README.rst, so you can just focus on understanding the queries.

Ingest data to the OpenSearch cluster

The first step is to load the data into our OpenSearch cluster, so we can start to query. Check out how to load the data to your cluster using the Python client.
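As a rough idea of what loading involves, here is a minimal sketch of turning recipe dictionaries into bulk actions. The index name "recipes", the helper name to_bulk_actions, and the two sample documents are assumptions made for illustration; the demo project may organise this differently.

```python
import json


def to_bulk_actions(recipes, index_name):
    """Yield one bulk "index" action per recipe dictionary."""
    for recipe in recipes:
        # Each action names the target index and carries the document itself.
        yield {"_index": index_name, "_source": recipe}


# Two made-up documents standing in for the real dataset.
sample_recipes = [
    {"title": "Example Soup", "sodium": 120},
    {"title": "Example Salad", "sodium": 95},
]

# "recipes" is a hypothetical index name.
actions = list(to_bulk_actions(sample_recipes, "recipes"))
print(json.dumps(actions[0]))

# With a connected opensearch-py client, the same generator could be handed to
# opensearchpy.helpers.bulk(client, to_bulk_actions(sample_recipes, "recipes")).
```

Using a generator means the whole dataset does not need to be held in memory as a list of actions before it is sent.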

Now we can start to play with the data!

Finding recipes

Range query

Rumor has it that my grandfather is coming to the dinner, so I should have at least one recipe that is low in sodium. Low-sodium meals are recommended for reducing blood pressure, and they are a healthy option for everyone.

We can use a range query to find documents where a field's value (in this case, sodium) falls within a certain range.

Recipes under 140 mg of sodium per serving are considered low-sodium meals. So let's look for recipes in the 100 to 140 mg range and build the query as:

{
   "query":{
      "range":{
         "sodium":{
            "gte":100,
            "lte":140
         }
      }
   }
}
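In Python, the same body can be built as a plain dictionary and passed to a client. The sketch here assumes an opensearch-py style client object with a search(index=..., body=...) method; the helper names build_range_query and search_titles are made up for this example and are not taken from the demo.

```python
def build_range_query(field, gte, lte):
    """Build a range query body for values between gte and lte, inclusive."""
    return {"query": {"range": {field: {"gte": gte, "lte": lte}}}}


def search_titles(client, index_name, body):
    """Run the query and return the title of every hit.

    `client` is assumed to behave like an opensearch-py OpenSearch client.
    """
    response = client.search(index=index_name, body=body)
    return [hit["_source"]["title"] for hit in response["hits"]["hits"]]


low_sodium = build_range_query("sodium", 100, 140)
print(low_sodium)
```

Keeping the body builder separate from the client call makes it easy to inspect the query before anything is sent to the cluster.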

We can use the demo program to see the range query in action by running:

python search.py "sodium" 100 140

I'm curious to see what kind of recipes we get under this condition, so here we go:

['Salsa Verde ',
 'Green Bean and Red Onion Salad with Warm Cider Vinaigrette ',
 'Toasted-Pecan Pie ',
 'Provençal Chicken and Tomato Roast ',
 'Sauteed Cod Provençale ',
 'Roasted Potatoes and Asparagus with Parmesan ',
 'Sweet-and-Sour Baby Carrots ',
 'Ricotta Puddings with Glazed Rhubarb ',
 'Butternut Squash and White Bean Soup ',
 'Turkish Zucchini Pancakes ']

'Turkish Zucchini Pancakes' seems like a delicious recipe, so this would be my choice.

Match-phrase query

Also, I need to find a delicious salad recipe for the occasion 🥗. It's radish season and this vegetable goes really well in summer salads, so let's use match_phrase to look for a "title" containing "Salad with Radish".

{
   "query":{
      "match_phrase":{
         "title":{
            "query":"Salad with radish"
         }
      }
   }
}

We can run this query using the demo program:

python search.py match-phrase "title" "Salad with Radish"

And here is our result:

['Green Bean and Red Onion Salad with Radish Dressing ']

We only got one match, and it seems that radish is only part of the dressing. The reason is that word order matters when you use match_phrase: the terms must appear in the same order as in the query. Here, the exact phrase "Salad with Radish" appears only once, hence our single result.

We can fix that by adding some flexibility to our search. match_phrase has a powerful parameter that defines how far apart the search terms may be from each other. This parameter is called slop (default 0). So let's try again with slop set to 3.

{
   "query":{
      "match_phrase":{
         "title":{
            "query":"Salad with radish",
            "slop":3
         }
      }
   }
}
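Built from Python, the only difference between the strict and the relaxed phrase query is the slop value. This is a small sketch with a made-up helper name, build_match_phrase_query; a slop of 0 behaves like leaving the parameter out.

```python
def build_match_phrase_query(field, phrase, slop=0):
    """Build a match_phrase query body; slop=0 requires the exact phrase."""
    return {"query": {"match_phrase": {field: {"query": phrase, "slop": slop}}}}


strict = build_match_phrase_query("title", "Salad with radish")
relaxed = build_match_phrase_query("title", "Salad with radish", slop=3)
print(strict)
print(relaxed)
```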

We can run this query using the demo program:

python search.py match-phrase "title" "Salad with Radish" --slop 3

Not surprisingly, we got more results this time:

['Green Bean and Red Onion Salad with Radish Dressing ',
 'Winter Salad with Black Radish, Apple, and Escarole ',
 'Avocado Radish Salad with Lime Dressing ',
 'Chickpea Salad Sandwich With Creamy Carrot-Radish Slaw ',
 'Mâche, Frisée, and Radish Salad with Mustard Vinaigrette ',
 'Frisée and Radish Salad with Goat Cheese Croutons ',
 'Endive, Mâche, and Radish Salad with Champagne Vinaigrette ',
 'Butter Lettuce and Radish Salad with Fresh Spring Herbs ',
 'Butter Lettuce and Radish Salad with Lemon-Garlic Vinaigrette ',
 'Shaved Carrot and Radish Salad With Herbs and Pumpkin Seeds ']

Now our results also match "Radish Salad with", "Salad with <something else> Radish", and so on.
We can pick one and move on to find a dessert.
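To get a feel for why a slop of 3 is enough for a title like "Avocado Radish Salad with Lime Dressing", here is a simplified model of how far the terms have to move. It is an illustration only, not OpenSearch's implementation: the tokenizer is a rough stand-in for the index analyzer, and min_slop is a hypothetical helper that measures how much each term is shifted relative to its place in the query phrase.

```python
import re
from itertools import product


def tokenize(text):
    # Simplified stand-in for an analyzer: lowercase, split on non-word characters.
    return re.findall(r"\w+", text.lower())


def min_slop(phrase, title):
    """Return the smallest slop at which the phrase would match, or None."""
    query_terms = tokenize(phrase)
    title_terms = tokenize(title)
    # For every query term, collect the positions where it occurs in the title.
    positions = [
        [i for i, term in enumerate(title_terms) if term == query_term]
        for query_term in query_terms
    ]
    if any(not found for found in positions):
        return None  # a term is missing, so no slop value can help
    best = None
    for combo in product(*positions):
        # Shift of each term relative to its place in the query phrase.
        shifts = [pos - i for i, pos in enumerate(combo)]
        needed = max(shifts) - min(shifts)
        if best is None or needed < best:
            best = needed
    return best


for title in [
    "Green Bean and Red Onion Salad with Radish Dressing ",
    "Winter Salad with Black Radish, Apple, and Escarole ",
    "Avocado Radish Salad with Lime Dressing ",
]:
    print(min_slop("Salad with radish", title), title)
```

Under this model the three titles need a slop of 0, 1, and 3 respectively, which is consistent with the first title matching the strict query and the others appearing only once slop is set to 3.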

Match query

Let's explore how the match query works by building a query to find "Chocolate Carrot Cake" in our "title".

{
   "query":{
      "match":{
         "title":{
            "query":"Chocolate Carrot Cake",
            "operator": "and"
         }
      }
   }
}

The match query returns results sorted by how closely they relate to "Chocolate Carrot Cake" 🥕. By default, match uses the "OR" operator, returning titles that contain "Chocolate" or "Carrot" or "Cake". However, I want all of these terms to be included in the "title". We can use the "AND" operator for that:

python search.py match "title" "Chocolate Carrot Cake" --operator "and"

Here are our cake results.

['Chocolate-Orange Carrot Cake ',
 'Milk Chocolate Semifreddo with Star Anise Carrot Cake ']
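The difference between the two operators can be mimicked in plain Python on a handful of titles from this post. This is a simplified sketch: the tokenizer only approximates what the index analyzer does, the matches helper is made up for illustration, and relevance scoring is ignored entirely.

```python
import re


def tokenize(text):
    # Simplified stand-in for an analyzer: lowercase, split on non-word characters.
    return re.findall(r"\w+", text.lower())


def matches(query, title, operator="or"):
    """Check whether a title satisfies the query under the given operator."""
    query_terms = set(tokenize(query))
    title_terms = set(tokenize(title))
    if operator == "and":
        # Every query term must be present in the title.
        return query_terms <= title_terms
    # With "or", a single shared term is enough.
    return bool(query_terms & title_terms)


titles = [
    "Chocolate-Orange Carrot Cake ",
    "Milk Chocolate Semifreddo with Star Anise Carrot Cake ",
    "Chickpea Salad Sandwich With Creamy Carrot-Radish Slaw ",
    "Toasted-Pecan Pie ",
]

with_and = [t for t in titles if matches("Chocolate Carrot Cake", t, "and")]
with_or = [t for t in titles if matches("Chocolate Carrot Cake", t, "or")]
print(with_and)
print(with_or)
```

With "and", only the two cake titles survive; with "or", the chickpea sandwich also gets through because its title contains "Carrot".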

Everything seems delicious and we are ready for the party 🥳!

And what's for your dinner? You can play around with writing your own search queries and find your own perfect dinner.

I'll be happy to hear from you or answer any questions you may have at laysa.uchoa@gmail.com.

Happy meal, everyone!

Examples and other resources
