Justine "Jart"'s invention of one-file AI is stunning. Congrats for Mozilla in sponsoring and promoting her work. Here's the easiest way to download and run a local, secure AI:
https://simonwillison.net/2023/Nov/29/llamafile/
AI 101
Basically, once you download a "Llamafile" corresponding to a specific AI model, you can run it and either ask questions via a web page or give a prompt directly on the command line.
Example: download and check a Llamafile for coding:
curl -L -o wizardcoder-python-13b.llamafile 'https://huggingface.co/jartine/wizardcoder-13b-python/resolve/main/wizardcoder-python-13b.llamafile?download=true'
chmod +x ./wizardcoder-python-13b.llamafile
./wizardcoder-python-13b.llamafile --version
output:
llamafile v0.8.7
write code using a one-off prompt:
./wizardcoder-python-13b.llamafile -p 'be brief. code only. Write program to print numbers to 10 in C language'
output:
(lots of noise)
#include <stdio.h>
int main() {
    int i;
    for (i = 1; i <= 10; i++) {
        printf("%d\n", i);
    }
    return 0;
}
(more noise)
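To sanity-check the answer, you can compile and run the snippet (assuming gcc, and saving it as count.c; both names are mine, not the model's):
# compile the generated snippet and run it (count.c is a hypothetical filename)
gcc -o count count.c && ./count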
The speed is good, but oddly the model is extremely chatty.
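Most of that noise is log output on stderr, so one way to quiet it down (a sketch, assuming the usual llama.cpp flags that llamafile inherits; -n caps the generated tokens):
# send the log chatter to /dev/null and stop after 128 tokens
./wizardcoder-python-13b.llamafile -p 'be brief. code only. Write program to print numbers to 10 in C language' -n 128 2>/dev/null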
Linux skills FTW: one-line AI server
Normally a Llamafile is run in "server mode", so prompts and other settings can be entered in a good old web browser. The server listens on localhost:8080.
I'm running it on a real server, not my local computer, so the browser and the server are on two different machines. Normally I'd tunnel port 8080 or something, but this trick is easier:
./wizardcoder-python-13b.llamafile --server --host 0
By specifying the host as 0 (shorthand for 0.0.0.0) I'm telling it to listen on all network interfaces -- my AI service is now public. I can access it by opening SERVER-IP:8080 in the browser on my laptop.
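The web page isn't the only way in, either. Llamafile embeds the llama.cpp server, which also speaks an HTTP completion API, so a quick smoke test from the laptop could look like this (the /completion endpoint and JSON fields come from llama.cpp's server; treat the exact shape as an assumption):
# POST a prompt straight to the server; n_predict caps the response length
curl http://SERVER-IP:8080/completion -d '{"prompt": "Write hello world in C", "n_predict": 64}'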
Exposing a service on all interfaces like this is normally terrible if your computer is on the internet. In my case the server sits on my LAN, so it isn't a big deal: my local browser can reach the AI service, and that's it.
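If the server were internet-facing I'd fall back to the tunnel instead; a sketch, assuming SSH access to the box (user and SERVER-IP are placeholders):
# forward my laptop's port 8080 to the server's loopback over SSH
ssh -L 8080:localhost:8080 user@SERVER-IP
# then leave --host at its loopback default and browse to localhost:8080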