Overview
- We will start from this template provided by Vercel, built with the Next.js 13 App Router (with Server Actions enabled) and the Vercel AI SDK for the streaming chat UI.
- Configure all necessary services step by step.
- All the services have a free plan with no payment method required.
- Implement configurable rate limiting to prevent abuse on your deployment.
- We will use the Vercel platform, but familiarity with Vercel is not a prerequisite.
- The code is available on GitHub.
Table of Contents
Configuring services
Model Provider
We will use Hugging Face for our model.
Why use Hugging Face instead of OpenAI? Simply because it has a free plan and you don't need to attach a payment method.
- Create an account on Hugging Face.
- Go to account settings and create a token.
- Save the token as `HUGGINGFACE_API_KEY` in the .env file.
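For reference, the resulting .env entry looks like this (the token value below is a placeholder, not a real key):

```shell
# .env — placeholder value; paste your own token from the Hugging Face settings page
HUGGINGFACE_API_KEY=hf_xxxxxxxxxxxxxxxxxxxxxxxx
```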
Auth
We will use GitHub OAuth for authentication via NextAuth.
- Register a new OAuth app on GitHub from here.
- Give the app a name and fill in the rest like this:
- Note that this is for local development; for deployment, register a new app and replace every instance of `http://localhost:3000` with the URL of that deployment.
- Generate a new client secret.
- Copy the Client ID as `AUTH_GITHUB_ID` and the Client secret as `AUTH_GITHUB_SECRET` in the .env file.
- Go to this link to generate a secure random string and save it as `AUTH_SECRET` in the .env file. This is needed for NextAuth to work in production.
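If you prefer not to use a website, you can generate the random string locally instead. A minimal sketch using `openssl`, which most systems ship with:

```shell
# Generate 32 random bytes, base64-encoded (~44 characters)
openssl rand -base64 32
```

Paste the output as the value of the secret in your .env file.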
Database
For storing chats, we will use Vercel's KV database.
- Sign up or log in to Vercel.
- If you signed up with email, connect your GitHub account from settings.
- Go to Storage and create a KV database. Note that you can only have one KV database on the free plan.
- Select the database you created and click on the `.env.local` tab. Copy the contents into your .env file.
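The copied block will look roughly like this (variable names as Vercel KV generated them at the time of writing; the values here are placeholders):

```shell
# .env — placeholder values copied from the KV database's .env.local tab
KV_URL="redis://..."
KV_REST_API_URL="https://..."
KV_REST_API_TOKEN="..."
KV_REST_API_READ_ONLY_TOKEN="..."
```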
Upstash
We will use Upstash for rate limiting. If you don't need rate limiting, you can skip this step.
- Go to the Upstash console.
- Click on Create Database, choose your preferred region, and enable the `Eviction` option.
- In Details, go to the `REST API` section and click on the .env tab.
- Copy the values and paste them into your .env file.
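The two values you need are the REST URL and token (names as Upstash generates them; the values shown are placeholders):

```shell
# .env — placeholder values from the Upstash console's REST API section
UPSTASH_REDIS_REST_URL="https://..."
UPSTASH_REDIS_REST_TOKEN="..."
```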
Rate Limiting
Per Account
app/api/chat/route.ts:
```ts
const ratelimit = new Ratelimit({
  redis: redis,
  limiter: Ratelimit.slidingWindow(15, '1 d')
})

export async function POST(req: NextRequest) {
  const { userId } = auth()
  const { success, reset } = await ratelimit.limit(userId!)
  if (!success) {
    return new Response(
      `Your rate limit has been exceeded. You can chat again from ${new Date(
        reset
      ).toLocaleString()} GMT`
    )
  }
  // ...
}
```
This is the portion of the code responsible for rate limiting. You can choose values to suit your own needs; here, I have set it to 15 requests per day. You can use 'm' for minutes and 's' for seconds. You can also modify the message shown when the rate limit is exceeded. The `reset` variable returns a Unix timestamp in milliseconds of when the rate limit will reset for this user, which is converted to a string such as 7/18/2023, 6:00:00 AM.
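As a small sketch of how that conversion works (the timestamp below is made up to match the example date):

```typescript
// A hypothetical `reset` value as returned by ratelimit.limit():
// a Unix timestamp in milliseconds.
const reset = 1689660000000

// Convert it to a human-readable string, as the route handler does.
// The exact output depends on the server's locale and timezone,
// e.g. "7/18/2023, 6:00:00 AM" when running in GMT.
console.log(new Date(reset).toLocaleString())
```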
Per IP
You can rate limit per IP address as well. Instead of using `userId`, pass `ip` as the parameter to `ratelimit.limit`.
```ts
export async function POST(req: NextRequest) {
  const ip = req.ip ?? "127.0.0.1"
  const { success, reset } = await ratelimit.limit(ip)
  // ...
}
```
`req.ip` is undefined on localhost; that's why we place `127.0.0.1` as the fallback value.
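If `req.ip` is unavailable on your runtime, a common alternative is to read the `x-forwarded-for` header, which Vercel populates on deployed requests. A minimal sketch, with a helper name of my own choosing (not part of the template):

```typescript
// Hypothetical helper: extract the client IP from a Headers-like object,
// falling back to localhost for local development where no proxy header exists.
function getClientIp(headers: { get(name: string): string | null }): string {
  // x-forwarded-for can hold a comma-separated chain of proxies;
  // the original client IP comes first.
  const forwarded = headers.get('x-forwarded-for')
  if (forwarded) return forwarded.split(',')[0].trim()
  return '127.0.0.1'
}
```

Inside the route handler you would then call `getClientIp(req.headers)` and pass the result to `ratelimit.limit`.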
If you don't want rate limiting, remove the imports of `redis` and `Ratelimit` at the top of the file and delete all lines that reference them.
Deployment
Before proceeding further, make sure that you have populated the values of all necessary environment variables in the .env file. If you want to run locally, clone the repo and install packages with `pnpm install`. Then run the dev server with `pnpm dev`.
Set up a new project on Vercel
- Fork the repo.
- Create a new project on Vercel.
- Import the forked repo.
- Set `Environment Variables` according to your .env file. Note that you don't need to copy the key-value pairs one by one: just copy the whole file, put the cursor in the Name field, and press Ctrl-V.
- Click on Deploy.
But we are not done yet. We hit Deploy so that a new project is created, but Vercel assigns unique domains to deployments, and you need to use a fixed domain for GitHub OAuth to work.
Configure GitHub OAuth App
- Go to the Vercel dashboard and select your project.
- Click on the `Settings` tab and go to the `Domains` section.
- Set the domain to your preference and copy the full URL.
- Register a new GitHub OAuth app from here.
- Set the Homepage URL to the domain you copied earlier.
- Set the Authorization callback URL to `your-domain-url/api/auth/callback`.
- Now go to the project Settings on Vercel and then go to the `Environment Variables` section.
- Edit `AUTH_GITHUB_ID` and `AUTH_GITHUB_SECRET`, replacing them with the values from the new GitHub OAuth app as before.
- Ideally, you should use a different `AUTH_SECRET` in the local and production environments for better security. Get a random string from here and edit `AUTH_SECRET`.
Switch LLM
You can easily switch models by changing the model parameter in `Hf.textGenerationStream()` in `/api/chat/route.ts`. Hugging Face provides access to a lot of models, and you can use one without any major code modification as long as it supports streaming and has a small size. You do have to modify the `buildOpenAssistantPrompt` method to format prompts according to the model's specification. The model we are using takes the user prompt as `<|prompter|>${prompt}<|endoftext|>`, and the model's previous replies are expected in the format `<|assistant|>${content}<|endoftext|>`. This format will vary depending on the model.
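As a sketch of what such a prompt builder looks like under the format just described (the template's actual implementation may differ in details):

```typescript
interface Message {
  role: 'user' | 'assistant'
  content: string
}

// Concatenate the chat history into the OpenAssistant prompt format,
// ending with an open <|assistant|> tag so the model knows to reply next.
function buildOpenAssistantPrompt(messages: Message[]): string {
  return (
    messages
      .map(({ role, content }) =>
        role === 'user'
          ? `<|prompter|>${content}<|endoftext|>`
          : `<|assistant|>${content}<|endoftext|>`
      )
      .join('') + '<|assistant|>'
  )
}
```

Swapping models then means rewriting this one function with the new model's special tokens.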
You can use LLM providers other than Hugging Face, such as Anthropic, LangChain, and OpenAI, as well. Follow this doc to change the route handler (`/api/chat/route.ts`), referring to the section for your chosen LLM provider. You can test and compare the models in the Vercel AI Playground.