DEV Community

Cover image for Introducing Calorie GPT: Track your Calories using LLMs - Part 1
VG
VG

Posted on • Originally published at app.langdb.ai

Introducing Calorie GPT: Track your Calories using LLMs - Part 1

Calorie GPT

Table of Contents

NOTE: Original article can be found at https://app.langdb.ai/share/apps/319346f1-4616-42a9-bf35-57dc28d41c06

Introduction

I have always wanted to figure out my intake without having to fill up information manually. So here is a solution for all the lazy people out there. Calorie GPT will get the composition using LLM VIsion APIs. Through this we can also explore various ways of improving accuracy. This article will be the first of the series. I'm currently testing this for myself, with plans to make it available for general usage later on. In this article, we'll explore how to build this solution using LangDB, which allows to build full LLM applications just using SQL. Let's dive in.

At the end of the exercise we can do the following

1) Extract calorie information from images
2) Get uploads by user (THis could be integrated through an API)
3) Weekly and Daily Statistics

For example, here is a query to get weekly analytics.

WITH
    (SELECT
        toStartOfDay(toTimeZone(now(), 'Asia/Singapore')) AS today) AS today
SELECT
    date,
    total_calories,
    groupArray((nutrient_name, total_amount, unit)) AS nutrients,
    user_id
FROM (
    SELECT
        date,
        SUM(total_calories) AS total_calories,
        nutrient_name,
        ceil(SUM(amount), 1) AS total_amount,
        MAX(unit) AS unit,
        user_id
    FROM (
        SELECT
            toStartOfDay(toTimeZone(created_at, 'Asia/Singapore')) AS date,
            calories_per_serving AS total_calories,
            tupleElement(n, 1) AS nutrient_name,
            tupleElement(n, 2) AS amount,
            tupleElement(n, 3) AS unit,
            user_id
        FROM
            food_activity
        ARRAY JOIN nutrition_items AS n
        WHERE
            created_at >= today - INTERVAL 5 DAY
    )
    GROUP BY
        date, nutrient_name, user_id
)
GROUP BY
    date, total_calories, user_id
ORDER BY
    date DESC
LIMIT 5;
Enter fullscreen mode Exit fullscreen mode
date total_calories nutrients user_id
2024-08-06 00:00:00 1268 [["fat",43,"g"],["vitamin A",428.5,"mg"],["protein",67,"g"]] hgOzDg4yXCOnk3leagMV7onEARO2

Building Calorie GPT with LangDB

Our journey begins with building a LangDB model that utilizes vision APIs to analyze food items and determine their nutritional content. To ensure accuracy and reliability, we'll experiment with various providers later in the series.

This model will be responsible for identifying the food items and extracting nutritional information such as protein, fat, vitamins, and total calories. We'll start with a single provider and validate the data to ensure it meets our required format.

We ll do the following

  • Create a prompt with the instructions
  • Create a validator
  • Create a model with tools

Step 1: Create a Prompt

FIrst lets create a prompt and instruct to return JSON in a format that we are interested.

Step 2: Validating the Data

Data validation is crucial to ensure the information returned by the LLMs is accurate and in the format we need. We'll create a validator that checks the data for consistency and correctness.

Here we can simply use SQL to validate our JSON response.

CREATE PROMPT nutrient_info_prompt (
    system '
    You are a model that processes images of meals to determine the nutrient content of each food item. 
    Upon receiving an image, analyze it to identify each distinct food item, estimate the quantity, and calculate the corresponding nutrient content. 

    = Always Return the result in a valid JSON format.
    - Always use the test_calorie_json tool to check if the format is right. 
    - Only respond json and nothing else. 
    - Units can only be in mg and g.


    [{"name": "item","quantity": <quantity>,"default_serving": <usual_serving>,"unit": <unit>,"nutrients": [    {"type": "protein", "amount": <amount>, "unit": "g"},    {"type": "fat", "amount": <amount>,"unit": "g"},    {"type": "vitamin A", "amount": <amount>,"unit": "mg"}],"calories_per_serving": <calories>}]

    Example:
    [{"name": "apple","quantity": 1,"default_serving": 10,"unit": "g","calories_per_serving": 95,"nutrients": [    {"type": "protein", "amount": 30, "unit": "g"  },    {"type": "fat", "amount": 20, "unit": "g"},    {"type": "vitamin A", "amount": 0.54, "unit": "mg"}]}]
    ',
    human $input
)

Enter fullscreen mode Exit fullscreen mode
CREATE VIEW test_calorie_json (
    json_data string "Input must be a JSON array string"
) AS(
    SELECT 
        (COUNT(JSONExtractArrayRaw($json_data)) > 0) AS valid
    WHERE 
        isValidJSON($json_data) AND JSONExtractString($json_data, '') IS NOT NULL
)
Enter fullscreen mode Exit fullscreen mode

Lets test with one sample json output

Invalid Example

SELECT * from test_calorie_json('[{
        "name": "apple",
        "quantity": 1,
        "default_serving": 10,
        "unit": "g",
        "calories_per_serving": 95,
    }]')
Enter fullscreen mode Exit fullscreen mode
valid
0

Valid Example

SELECT * from test_calorie_json('[{
        "name": "apple",
        "quantity": 1,
        "default_serving": 10,
        "unit": "g",
        "calories_per_serving": 95,
        "nutrients": [
            {"type": "protein", "amount": 30, "unit": "g"  },
            {"type": "fat", "amount": 20, "unit": "g"},
            {"type": "vitamin A", "amount": 0.54, "unit": "mg"}
        ]
    }]')
Enter fullscreen mode Exit fullscreen mode
valid
1

Step 3: Create a model with tools

CREATE MODEL calorie_tracker (
    input
) 
USING open_ai_provider(model_name='gpt-4o-mini')
PROMPT nutrient_info_prompt
TOOLS (
    test_calorie_json comment 'tests for valid calorie response or returns an error'
)

Enter fullscreen mode Exit fullscreen mode

We can already using this to model to get calories. Lets try with a couple of images.

CREATE TABLE calorie_images (
    name String, 
    link String
) Engine = Memory()
Enter fullscreen mode Exit fullscreen mode
INSERT into calorie_images(name, link) values 
('Pasta', ''),
('Burrito','')
Enter fullscreen mode Exit fullscreen mode

Lets visualize the images using python

%export df%
select * from calorie_images
Enter fullscreen mode Exit fullscreen mode

Variable df exported

%python%
from IPython import display
from base64 import b64decode

html = ''
for image_url in df['link']:    
    html = html + f'<img style="width:200px" src="{image_url}" />'
display.HTML(html)
Enter fullscreen mode Exit fullscreen mode

Burrito

Pasta

Tracing

If we look at the traces, we can notice the following

  • It took about 20 seconds to analyize the picture
  • First tool call failed with the wrong json
  • Second call worked and responded with the json

Tracing

Step 3: Introducing Human-in-the-Loop Mechanism

In the mobile app, we also introduce a human in the loop screen for the human to fix the numbers if LLMs are producing incorrect information.

Step 4: Storing & Accessing Data

Once the data is validated and refined, we'll store it in our database. We'll use the uploads table for raw data storage and the food_activity table for structured data. The food_activity table schema includes fields for user ID, food item, nutritional information, total calories, and timestamps, enabling efficient tracking and querying.

CREATE TABLE IF NOT EXISTS uploads
(
    upload_id UUID,
    user_id String,
    image Nullable(String),
    created_at DateTime DEFAULT now()
) 
ENGINE = MergeTree()
PARTITION BY toYYYYMM(created_at)
ORDER BY (created_at, upload_id);
Enter fullscreen mode Exit fullscreen mode
CREATE TABLE IF NOT EXISTS food_activity
(
    id UUID DEFAULT generateUUIDv4(),
    upload_id UUID, 
    user_id String,
    name String,
    quantity Float32,
    calories_per_serving UInt32,
    unit String,
    nutrition_items Array(Tuple(String, Float32, String)),  -- Array of nutrients like protein, fat, vitamin A, etc., with their units
    created_at DateTime DEFAULT now()
) 
ENGINE = MergeTree()
PARTITION BY toYYYYMM(created_at)
ORDER BY (created_at, id);
Enter fullscreen mode Exit fullscreen mode

Running queries against a user

Uploads by user and upload_id


SELECT
      user_id,
      u.upload_id,
      COUNT(*) AS no_items,
      SUM(f.calories_per_serving) AS total_calories,
      u.created_at AS created_at,
      groupArray((name,quantity,calories_per_serving,unit,nutrition_items)) AS items
  FROM
      food_activity AS f
  JOIN
      uploads AS u
  ON
      f.upload_id = u.upload_id
  GROUP BY
      u.upload_id, u.created_at, u.image, user_id
  ORDER BY
      u.created_at DESC;
Enter fullscreen mode Exit fullscreen mode
user_id u.upload_id no_items total_calories created_at items
hgOzDg4yXCOnk3leagMV7onEARO2 a11a0b1e-3663-4a06-8704-0d4c39b48fa1 3 268 2024-08-06 14:14:49 [["spaghetti",150,158,"g",[["protein",5,"g"],["fat",1,"g"],["vitamin A",0,"mg"]]],["fried egg",1,90,"unit",[["protein",6,"g"],["fat",7,"g"],["vitamin A",270,"mg"]]],["bell pepper",50,20,"g",[["protein",1,"g"],["fat",0,"g"],["vitamin A",157,"mg"]]]]
hgOzDg4yXCOnk3leagMV7onEARO2 c4a1a434-fba7-4b45-ab02-537d18265b72 1 500 2024-08-06 14:11:30 [["chicken_burrito",1,500,"piece",[["protein",30,"g"],["fat",20,"g"],["vitamin A",0.5,"mg"]]]]
hgOzDg4yXCOnk3leagMV7onEARO2 db8d380b-3677-417f-a134-3d140208ddf6 1 500 2024-08-06 13:39:22 [["chicken burrito",1,500,"g",[["protein",25,"g"],["fat",15,"g"],["vitamin A",1,"mg"]]]]

Weekly & Daily Statistics

WITH
    (SELECT
        toStartOfDay(toTimeZone(now(), 'Asia/Singapore')) AS today) AS today
SELECT
    date,
    total_calories,
    groupArray((nutrient_name, total_amount, unit)) AS nutrients,
    user_id
FROM (
    SELECT
        date,
        SUM(total_calories) AS total_calories,
        nutrient_name,
        ceil(SUM(amount), 1) AS total_amount,
        MAX(unit) AS unit,
        user_id
    FROM (
        SELECT
            toStartOfDay(toTimeZone(created_at, 'Asia/Singapore')) AS date,
            calories_per_serving AS total_calories,
            tupleElement(n, 1) AS nutrient_name,
            tupleElement(n, 2) AS amount,
            tupleElement(n, 3) AS unit,
            user_id
        FROM
            food_activity
        ARRAY JOIN nutrition_items AS n
        WHERE
            created_at >= today - INTERVAL 5 DAY
    )
    GROUP BY
        date, nutrient_name, user_id
)
GROUP BY
    date, total_calories, user_id
ORDER BY
    date DESC
LIMIT 5;
Enter fullscreen mode Exit fullscreen mode
date total_calories nutrients user_id
2024-08-06 00:00:00 1268 [["fat",43,"g"],["vitamin A",428.5,"mg"],["protein",67,"g"]] hgOzDg4yXCOnk3leagMV7onEARO2

Upcoming Articles in the Series

Our journey doesn't end here. In the next articles, we'll delve deeper into the following topics:

  • Evaluating Multiple Providers: We'll assess various vision API providers to measure accuracy and reliability, helping us choose the best options for our system.

  • Q&A: Using SQL generation and metadata tables we can power workflows of providing interesting insights to the user with personalisation

  • Integrating with React Native: Learn how to integrate Calorie GPT with React Native or other applications, making it accessible on mobile devices.

  • Incorporating RAG Techniques: We'll explore Retrieval-Augmented Generation (RAG) techniques to improve the accuracy of our models further.

  • Building a Copilot Experience: Discover how to enhance Calorie GPT with automatic notifications and personalized suggestions, modeling user behavior for a more interactive and supportive experience.

Stay tuned as we continue to refine and expand Calorie GPT, transforming the way we track and manage our daily calorie intake. With LangDB at the core, we're excited to bring you a powerful tool that combines the latest in AI and database technology for effortless calorie tracking.

Future Version

Image description

Top comments (0)