Shure

Posted on Aug 22

🚀 Introducing LogLLM: Automate Your ML Experiment Logging with LLMs

#llm #wandb #chatgpt #openai

Project page: https://logllm.tiiny.site/

After getting tired of manually logging my experiments into Weights & Biases (W&B), I decided to develop LogLLM a few days ago. This tool automates the extraction of experimental conditions from your Python scripts using GPT-4 and logs them directly into W&B.

How It Works:

LLM(Our Prompt + Your ML Script) = Extracted Experimental Conditions

LogLLM uses GPT-4 to analyze your ML scripts and extract key experimental conditions, which are then logged into W&B. It simplifies the process, allowing you to focus more on your experiments and less on the logging process.

Our prompt:

You are an advanced machine learning experiment designer.
Extract all experimental conditions and results for logging via W&B API.
Add your original parameters in your JSON response if you want to log other parameters.
Extract all information you can find in the given script as int, bool, or float values.
If you cannot describe conditions with int, bool, or float values, use a list of natural language.
Give advice to improve accuracy.
If you use natural language, answers should be very short.
Do not include information already provided in param_name_1 for `condition_as_natural_langauge`.
Output JSON schema example:
This is just an example, make changes as you see fit.

Example ML Script: `svc-sample.ipynb`

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

iris = datasets.load_iris()

X = iris.data[iris.target != 2]
y = iris.target[iris.target != 2]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = SVC(kernel='linear')
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)

print(f"Accuracy: {accuracy:.2f}")

Extracted Experimental Conditions:

{
    "method": "SVC",
    "dataset": "Iris",
    "task": "classification",
    "is_advanced_method": false,
    "is_latest_method": "",
    "accuracy": 1.00,
    "kernel": "linear",
    "test_size": 0.2,
    "random_state": 42,
    "condition_as_natural_langauge": [
        "Using linear kernel on SVC model.",
        "Excluding class 2 from Iris dataset.",
        "Splitting data into 80% training and 20% testing."
    ],
    "advice_to_improve_acc": [
        "Confirm dataset consistency.",
        "Consider cross-validation for validation."
    ]
}

Get Started:

Clone the repo: git clone https://github.com/shure-dev/logllm.git
Install the package: pip install -e .

Usage:

Simply use log_llm in your scripts to start logging. Check out the GitHub repo for more details.

GitHub Repo

Looking for Contributors

This is an ongoing project!!
I'm actively seeking contributors to help improve LogLLM. Whether it's adding new features, refining the code, or enhancing documentation, your help would be greatly appreciated. Let's make ML experiment logging smarter and easier together.

DEV Community

🚀 Introducing LogLLM: Automate Your ML Experiment Logging with LLMs

How It Works:

Our prompt:

Example ML Script: `svc-sample.ipynb`

Extracted Experimental Conditions:

Get Started:

Usage:

Looking for Contributors

Top comments (0)

Read next

Chunking Techniques Every Developer Should Know for Enhanced RAG Applications!

Genz Hiring

Part 3: Building Powerful Chains and Agents in LangChain

AI Chatbot Architecture

How It Works:

Our prompt:

Example ML Script: svc-sample.ipynb

Extracted Experimental Conditions:

Get Started:

Usage:

Looking for Contributors

Read next

Chunking Techniques Every Developer Should Know for Enhanced RAG Applications!

Genz Hiring

Part 3: Building Powerful Chains and Agents in LangChain

AI Chatbot Architecture

Example ML Script: `svc-sample.ipynb`