DEV Community

Shure
Shure

Posted on

🚀 Introducing LogLLM: Automate Your ML Experiment Logging with LLMs

Image description

Project page: https://logllm.tiiny.site/

After getting tired of manually logging my experiments into Weights & Biases (W&B), I decided to develop LogLLM a few days ago. This tool automates the extraction of experimental conditions from your Python scripts using GPT-4 and logs them directly into W&B.

How It Works:

LLM(Our Prompt + Your ML Script) = Extracted Experimental Conditions

LogLLM uses GPT-4 to analyze your ML scripts and extract key experimental conditions, which are then logged into W&B. It simplifies the process, allowing you to focus more on your experiments and less on the logging process.

Our prompt:

You are an advanced machine learning experiment designer.
Extract all experimental conditions and results for logging via W&B API.
Add your original parameters in your JSON response if you want to log other parameters.
Extract all information you can find in the given script as int, bool, or float values.
If you cannot describe conditions with int, bool, or float values, use a list of natural language.
Give advice to improve accuracy.
If you use natural language, answers should be very short.
Do not include information already provided in param_name_1 for `condition_as_natural_langauge`.
Output JSON schema example:
This is just an example, make changes as you see fit.
Enter fullscreen mode Exit fullscreen mode

Example ML Script: svc-sample.ipynb

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

iris = datasets.load_iris()

X = iris.data[iris.target != 2]
y = iris.target[iris.target != 2]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = SVC(kernel='linear')
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)

print(f"Accuracy: {accuracy:.2f}")
Enter fullscreen mode Exit fullscreen mode

Extracted Experimental Conditions:

{
    "method": "SVC",
    "dataset": "Iris",
    "task": "classification",
    "is_advanced_method": false,
    "is_latest_method": "",
    "accuracy": 1.00,
    "kernel": "linear",
    "test_size": 0.2,
    "random_state": 42,
    "condition_as_natural_langauge": [
        "Using linear kernel on SVC model.",
        "Excluding class 2 from Iris dataset.",
        "Splitting data into 80% training and 20% testing."
    ],
    "advice_to_improve_acc": [
        "Confirm dataset consistency.",
        "Consider cross-validation for validation."
    ]
}
Enter fullscreen mode Exit fullscreen mode

Get Started:

  1. Clone the repo: git clone https://github.com/shure-dev/logllm.git
  2. Install the package: pip install -e .

Usage:

Simply use log_llm in your scripts to start logging. Check out the GitHub repo for more details.

GitHub Repo

Looking for Contributors

This is an ongoing project!!
I'm actively seeking contributors to help improve LogLLM. Whether it's adding new features, refining the code, or enhancing documentation, your help would be greatly appreciated. Let's make ML experiment logging smarter and easier together.

Top comments (0)