DEV Community

Cover image for 29+ AI projects you can build today
Anmol Baranwal Subscriber for CopilotKit

Posted on • Edited on

29+ AI projects you can build today

Today, we're covering 30 or more projects that you can build with AI.

All the projects are open source so you can contribute to make it better.

Some of the projects may have a large codebase but you can get inspiration from it and build a cool side project.

Trust me, if this list doesn't surprise you, nothing will :)

Let's jump in!

Image description

Star ⭐️


1. CopilotKit - AI Copilots for your product in hours.

copilotKit

 

Integrating AI features in React is tough, that's where Copilot comes into the picture. A simple and fast solution to integrate production-ready Copilots into any product!

You can integrate key AI features into React apps using two React components. They also provide built-in (fully-customizable) Copilot-native UX components like <CopilotKit />, <CopilotPopup />, <CopilotSidebar />, <CopilotTextarea />.

Get started with the following npm command.

npm i @copilotkit/react-core @copilotkit/react-ui
Enter fullscreen mode Exit fullscreen mode

Copilot Portal is one of the components provided with CopilotKit which is an in-app AI chatbot that can see the current app state and take action inside your app. It communicates with the app frontend and backend, as well as 3rd party services via plugins.

This is how you can integrate a Chatbot.

A CopilotKit must wrap all components which interact with CopilotKit. It’s recommended you also get started with CopilotSidebar (you can swap to a different UI provider later).

"use client";
import { CopilotKit } from "@copilotkit/react-core";
import { CopilotSidebar } from "@copilotkit/react-ui";
import "@copilotkit/react-ui/styles.css"; 

export default function RootLayout({children}) {
  return (
    <CopilotKit url="/path_to_copilotkit_endpoint/see_below">
      <CopilotSidebar>
        {children}
      </CopilotSidebar>
    </CopilotKit>
  );
}
Enter fullscreen mode Exit fullscreen mode

You can set up Copilot Backend endpoints using this quickstart quide.

After this, you can let Copilot take action. You can read on how to provide external context. You can do so using useMakeCopilotReadable and useMakeCopilotDocumentReadable react hooks.

"use client";

import { useMakeCopilotActionable } from '@copilotkit/react-core';

// Let the copilot take action on behalf of the user.
useMakeCopilotActionable(
  {
    name: "setEmployeesAsSelected", // no spaces allowed in the function name
    description: "\"Set the given employees as 'selected'\","
    argumentAnnotations: [
      {
        name: "employeeIds",
        type: "array", items: { type: "string" }
        description: "\"The IDs of employees to set as selected\","
        required: true
      }
    ],
    implementation: async (employeeIds) => setEmployeesAsSelected(employeeIds),
  },
  []
);
Enter fullscreen mode Exit fullscreen mode

You can read the docs and check the demo video.

You can integrate Vercel AI SDK, OpenAI APIs, Langchain, and other LLM providers with ease. You can follow this guide to integrate a chatbot into your application.

The basic idea is to build AI Chatbots in minutes that can be useful for LLM-based applications.

The use cases are huge, and as developers, we should definitely try to use CopilotKit in our next project.

CopilotKit has 5.7k+ Stars on GitHub with 200+ releases meaning they're constantly improving.

Star CopilotKit ⭐️


2. AgentGPT - Assemble, configure, and deploy autonomous AI Agents.

agentGPT

 

AgentGPT allows you to configure and deploy Autonomous AI agents.

It will attempt to reach the goal by thinking of tasks to do, executing them, and learning from the results :)

It is built using:

  • Bootstrapping: create-t3-app + FastAPI-template.
  • Framework: Nextjs 13 + Typescript + FastAPI
  • Auth: Next-Auth.js
  • ORM: Prisma & SQLModel.
  • Database: Planetscale.
  • Styling: TailwindCSS + HeadlessUI.
  • Schema Validation: Zod + Pydantic.
  • LLM Tooling: Langchain.

Get started with this guide to install it locally.

You can see the demo of the app and check live website.

demo

They have 29k+ stars on GitHub and are on the v1 release.

Star AgentGPT ⭐️


3. Private GPT - ask questions about your documents without the internet.

private GPT

 

PrivateGPT is a production-ready AI project that allows you to ask questions about your documents using the power of Large Language Models (LLMs), even in scenarios without an internet connection.

100% private meaning no data leaves your execution environment at any point.

The API is divided into two logical blocks:

a. High-level API, which abstracts all the complexity of a RAG (Retrieval Augmented Generation) pipeline implementation:

  • Ingestion of documents: internally managing document parsing, splitting, metadata extraction, embedding generation, and storage.
  • Chat & Completions using context from ingested documents: abstracting the retrieval of context, the prompt engineering, and the response generation.

b. Low-level API, which allows advanced users to implement their complex pipelines:

  • Embeddings generation: based on a piece of text.
  • Contextual chunks retrieval: given a query, returns the most relevant chunks of text from the ingested documents.

You can read the installation guide to get started.

You can read the docs and the detailed architecture that is involved.

PrivateGPT is now evolving towards becoming a gateway to generative AI models and primitives, including completions, document ingestion, RAG pipelines, and other low-level building blocks.

They have 51k+ Stars on GitHub and evolving at a rapid pace.

Star Private GPT ⭐️


4. Instrukt - Integrated AI in the terminal.

instrukt

 

Instrukt is a terminal-based AI-integrated environment. It offers a platform where users can:

  • Create and instruct modular AI agents.
  • Generate document indexes for question-answering.
  • Create and attach tools to any agent.

Instruct them in natural language and, for safety, run them inside secure containers (currently implemented with Docker) to perform tasks in their dedicated, sandboxed space.

Built using Langchain, Textual, and Chroma.

Get started with the following command.

pip install instrukt[all]
Enter fullscreen mode Exit fullscreen mode

instrukt

There are a lot of exciting features such as:

  • A terminal-based interface for power keyboard users to instruct AI agents without ever leaving the keyboard.
  • Index your data and let agents retrieve it for question-answering. You can create and organize your indexes with an easy UI.
  • Index creation will auto-detect programming languages and optimize the splitting/chunking strategy accordingly.
  • Run agents inside secure docker containers for safety and privacy.
  • Integrated REPL-Prompt for quick interaction with agents, and a fast feedback loop for development and testing.
  • You can automate repetitive tasks with custom commands. It also has a built-in prompt/chat history.

You can read about all the features.

You can read the installation guide.

You can also debug and introspect agents using an in-built IPython console which is a neat little feature.

console debugging

Instrukt is licensed with an AGPL license meaning that it can be used by anyone for whatever purpose.

It is safe to say that Instrukt is a Terminal AI Commander at your fingertips.

It is a new project so they have around 200+ stars on GitHub but the use case is very good.

Star Instrukt ⭐️


5. Voice Assistant on Mac - Your voice-controlled Mac assistant.

gpt automator

 

Your voice-controlled Mac assistant. GPT Automator lets you perform tasks on your Mac using your voice. For example, opening applications, looking up restaurants, and synthesizing information. Awesome :D

It was built during the London Hackathon.

It has two main parts:

a. Voice to command: It generates the command using Whisper running locally (a fork of Buzz).

b. Command to Action: You give the command to a LangChain agent equipped with custom tools we wrote. These tools include controlling the operating system of the computer using AppleScript and controlling the active browser using JavaScript. Finally, like any good AI, we have the agent speak out the final result using AppleScript saying "{Result}" (try typing "Hello World!" into your Mac terminal if you haven’t used it before”).

A custom tool we made to have the LLM control the computer using AppleScript. The prompt is the docstring:

@tool
def computer_applescript_action(apple_script):
    """
    Use this when you want to execute a command on the computer. The command should be in AppleScript.

    Here are some examples of good AppleScript commands:

    Command: Create a new page in Notion
    AppleScript: tell application "Notion"
        activate
        delay 0.5
        tell application "System Events" to keystroke "n" using {{command down}}
    end tell

    ...

    Write the AppleScript for the Command:
    Command: 
    """
    p = subprocess.Popen(['osascript', '-'], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

    stdout, stderr = p.communicate(applescript.encode('utf-8'))

    if p.returncode != 0:
        raise Exception(stderr)

    decoded_text = stdout.decode("utf-8")

    return decoded_text
Enter fullscreen mode Exit fullscreen mode

If you are wondering how it works, GPT Automator converts your audio input to text using OpenAI's Whisper. Then, it uses a LangChain Agent to choose a set of actions, including generating AppleScript (for desktop automation) and JavaScript (for browser automation) commands from your prompt using OpenAI's GPT-3 ("text-davinci-003") and then executing the resulting script.

Just remember, this is not for production use. This project executes code generated from natural language and may be susceptible to prompt injection and similar attacks. This work was made as a proof-of-concept.

You can read the installation guide.

Let's look at some of the prompts and what it will do:

✅ Find the result of a calculation.

Prompt: "What is 2 + 2?"

It will write AppleScript to open up a calculator and type in 5 * 5.

✅ Find restaurants nearby.

Prompt: "Find restaurants near me"

It will open up Chrome, google search for a restaurant nearby, parse the page and then return the top results. Sometimes it’s cheeky, and instead will open up the Google Maps result and say “The best restaurants are the ones at the top of the page on Google Maps”. Other times it opens the top link on Google - and gets stuck on the Google accessibility page…

Here’s what’s printed to the terminal as it runs:

Command: Find a great restaurant near Manchester.

> Entering new AgentExecutor chain...
 I need to search for a restaurant near Manchester.
Action: chrome_open_url
Action Input: https://www.google.com/search?q=restaurant+near+Manchester
Observation:

Thought: I need to read the page
Action: chrome_read_the_page
Action Input: 
Observation: Accessibility links
Skip to main content
... # Shortned for brevity
Dishoom Manchester
4.7
(3.3K) · £££ · Indian
32 Bridge St · Near John Rylands Library
Closes soon ⋅ 11 pm
Stylish eatery for modern Indian fare
San Carlo
4.2
(2.8K) · £££ · Italian
42 King St W · Near John Rylands Library
Closes soon ⋅ 11 pm
Posh, sceney Italian restaurant
Turtle Bay Manchester Northern Quarter
4.7

Thought: I now know the final answer
Final Answer: The 15 best restaurants in Manchester include El Gato Negro, Albert's Schloss, The Refuge, Hawksmoor, On The Hush, Dishoom, Banyan, Zouk Tea Room & Grill, Edison Bar, MyLahore Manchester, Turtle Bay Manchester Northern Quarter, San Carlo, The Black Friar, Mana, and Tast Cuina Catalana.
Enter fullscreen mode Exit fullscreen mode

I cannot guarantee that those restaurants are worth it, visit at your own risk. haha!

✅ If you ask GPT Automator to wipe your computer it will.

Yes, it will wipe your computer if you ask!
My inner self screaming to do it :)

 

You can see the full demo here!

 

You can read more on Chidi's blog.

It is more like a side project so they have around 200 stars on GitHub but it is very cool.

Star GPT Automator ⭐️


6. Flowise - Drag & drop UI to build your customized LLM flow.

flowiseai

 

Flowise is an open source UI visual tool to build your customized LLM orchestration flow & AI agents.

Get started with the following npm command.

npm install -g flowise
npx flowise start
OR
npx flowise start --FLOWISE_USERNAME=user --FLOWISE_PASSWORD=1234
Enter fullscreen mode Exit fullscreen mode

This is how you integrate the API.

import requests

url = "/api/v1/prediction/:id"

def query(payload):
  response = requests.post(
    url,
    json = payload
  )
  return response.json()

output = query({
  question: "hello!"
)}
Enter fullscreen mode Exit fullscreen mode

integrations

You can read the docs.

flowise AI

Cloud host is not available so you would have to self-host using these instructions.

Let's explore some of the use cases:

  • Let's say you have a website (could be a store, an e-commerce site, or a blog), and you want to scrap all the relative links of that website and have LLM answer any question on your website. You can follow this step-by-step tutorial on how to achieve the same.

scraper

  • You can also create a custom tool that will be able to call a webhook endpoint and pass in the necessary parameters into the webhook body. Follow this guide which will be using Make.com to create the webhook workflow.

webhook

There are a lot of other use cases such as building a SQL QnA or interacting with API.

FlowiseAI has 27.5k+ Stars on GitHub and has more than 10k forks so it has a good overall ratio.

Star Flowise ⭐️


7. Twitter Agent - Scrape data from social media and chat with it using Langchain.

twitter agent

 

Media Agent scrapes Twitter and Reddit submissions, summarizes them, and chats with them in an interactive terminal. Such a cool concept!

You can read the instructions to install it locally.

It is built using:

  • Langchain 🦜 to build and compose LLMs.
  • ChromaDB to store vectors (a.k.a embeddings) and query them to build conversational bots.
  • Tweepy to connect to your Twitter API and extract Tweets and metadata.
  • Praw to connect to Reddit API.
  • Rich to build a cool terminal UX/UI.
  • Poetry to manage dependencies.

Some of the awesome features:

  • Scrapes tweets/submissions on your behalf either from a list of user accounts or a list of keywords.
  • Embeds the tweets/submissions using OpenAI.
  • Creates a summary of the tweets/submissions and provides potential questions to answer.
  • Opens a chat session on top of the tweets.
  • Saves the conversation with its metadata.
  • A rich terminal UI and logging features.

You can watch the demo!

 

It has close to 100 stars on GitHub and it isn't maintained anymore. You can use it to build something better.

Star Twitter Agent ⭐️


8. GPT Migrate - Easily migrate your codebase from one framework or language to another.

GPT Migrate

 

If you've ever faced the pain of migrating a codebase to a new framework or language, this project is for you.

I think we can all agree that we need this at some point. You can do it using a workflow as well which Stripe did as far as I remember to convert their entire JS codebase to TS.

Migration is a costly, tedious, and non-trivial problem.
Do not trust the current version blindly and please use it responsibly. Please also be aware that costs can add up quickly as GPT-Migrate is designed to write (and potentially re-write) the entirety of a codebase.

You can install it using Poetry and read on how it works.

Please note.

GPT-Migrate is currently in development alpha and is not yet ready for production use. For instance, on the relatively simple benchmarks, it gets through "easy" languages like Python or JavaScript without a hitch ~50% of the time, and cannot get through more complex languages like C++ or Rust without some human assistance.

You can watch the demo here!

GPT Migrate

They have 6.5k+ stars on GitHub and the last commit was 6 months ago so I don't think it's maintained anymore!

Star GPT Migrate ⭐️


9. Plandex - AI coding engine for building complex, real-world software with LLMs.

plandex

 

Plandex uses long-running agents to complete tasks that span multiple files and require many steps. It breaks up large tasks into smaller subtasks, then implements each one, continuing until it finishes the job.

It helps you churn through your backlog, work with unfamiliar technologies, get unstuck, and spend less time on the boring stuff.

You can see the demo here!

The changes are accumulated in a protected sandbox so that you can review them before automatically applying them to your project files. Built-in version control allows you to easily go backward and try a different approach. Branches allow you to try multiple approaches and compare the results.

You can manage context efficiently in the terminal. Easily add files or entire directories to context, and keep them updated automatically as you work so that models always have the latest state of your project.

Plandex relies on the OpenAI API and requires an OPENAI_API_KEY env variable.

Plandex supports Mac, Linux, FreeBSD, and Windows. It runs from a single binary with no dependencies.

You can even try different models and model settings, and then compare the results.

You can read the installation instructions.

Plandex Cloud is the easiest and most reliable way to use Plandex. You'll be prompted to start an anonymous trial (no email required) when you create your first plan with plandex new. Trial accounts are limited to 10 plans and 10 AI model replies per plan. Plandex Cloud accounts are free for now so that's a good thing.

Plandex has 8k+ stars on GitHub and is built using Go.

Star PLandex ⭐️


10. SQL Translator - a tool for converting natural language queries into SQL code using AI.

sql translator

 

I was trying to build a similar project and found that it already exists.

This tool is designed to make it easy for anyone to translate SQL (Structured Query Language) commands into natural language and vice versa.

SQL is a programming language used to manage and manipulate data in relational databases, and while it's a powerful tool, it can also be quite complex and difficult to understand.

On the other hand, natural language is the language that we speak and write in everyday life, and it's often the preferred way to communicate for people who are not familiar with technical jargon.

With the SQL and Natural Language Translator, you don't need to be an SQL expert to understand what's going on in your database or to write SQL queries. You can simply type in your query in natural language and get the corresponding SQL code, or vice versa.

Some of the features are:

  • Dark mode.
  • Lowercase/uppercase toggle.
  • Copy to clipboard.
  • SQL syntax highlighting.
  • Schema awareness (beta).
  • Query history.

You can read the installation structions and it's very simple since it uses Nextjs.

This query is for YOU. haha!

cool query

SQL Translator has 4k stars on GitHub and is built using TypeScript.

Star SQL Translator ⭐️


11. WingmanAI - Real-time transcription of audio, integrated with ChatGPT.

WingmanAI

 

WingmanAI is a powerful tool for interacting with real-time transcription of both system and microphone audio. Powered by ChatGPT, this tool lets you interact in real time with the transcripts as an extensive memory base for the bot, providing a unique communication platform.

The bot can answer questions about past conversations when you load the transcripts for a designated person.

You can read the installation instructions.

You can watch the demo here!

demo

Some of the neat features are:

  • WingmanAI can transcribe both system output and microphone input audio, allowing you to view the live transcription in an easy-to-read format.

  • You can chat with a ChatGPT-powered bot that reads your transcripts in real time.

  • The bot maintains a record of the conversation but in a token-efficient manner, as only the current chunk of the transcript is passed to the bot.

  • WingmanAI allows you to save transcripts for future use. You can load them up anytime later, and any query made to the bot will be cross-referenced with a vector database of the saved transcript, providing the bot with a richer context.

  • You can keep appending to the saved transcripts, building a vast database over time for the bot to pull from.

It has 420 stars on GitHub and isn't maintained anymore.

Star WingmanAI ⭐️


12. Lively - allows users to set animated desktop wallpapers and screensavers.

lively

 

This is just for fun and we can learn a lot using the code on how it's done.

You can see this video on how insane it looks.

custom

They provide three types of wallpapers including video/GIF, Webpage, and Application/Games.

It is built on C# and some of the cool features that lively supports:

  1. Lively can be controlled with command line arguments from the terminal. You can integrate this with other languages like Python or scripting software AutoHotKey.

  2. Powerful set of API for developers to create interactive wallpapers. Get hardware readings, audio graphs, music information, and more.

  3. Wallpaper playback pauses when fullscreen applications/games run on the machine (~0% CPU, GPU usage).

  4. You can also leverage Machine Learning inference to create dynamic wallpapers. You can predict the distance from the camera of any 2D image and generate a 3D-like parallax effect. Cool :D

I've seen a lot of people using it and many of them aren't even aware that it's open source.

You can download it using installer or through microsoft store.

It was the winner of 2023 on the Microsoft Store.
It has 13k+ Stars on GitHub with 60 releases.

Star Lively ⭐️


13. RestGPT - LM-based autonomous agent controlling apps via RESTful APIs.

RestGPT

 

This work aims to construct a large language model-based autonomous agent, RestGPT, to control real-world applications, such as movie databases and music players. To achieve this, we connect LLMs with RESTful APIs and tackle the practical challenges of planning, API calls, and response parsing. To fully evaluate the performance of RestGPT, we propose RestBench, a high-quality benchmark that consists of two real-world scenarios and human-annotated instructions with gold solution paths.

RestGPT adopts an iterative coarse-to-fine online planning framework and uses an executor to call RESTful APIs. Here is an overview of RestGPT.

working

You can read the docs to evaluate the performance of RestGPT using RestBench.

An example of using the TMDB movie database to search for the number of movies directed by Sofia Coppola.

example

You can read the research paper for code that is published under Cornell University: RestGPT - Connecting Large Language Models with Real-World RESTful APIs.

They have 1.2k Stars on GitHub and it isn't something very huge but covers an excellent use case.

Star RestGPT ⭐️


14. ChatFiles - Upload your file and have a conversation with it.

ChatFiles

 

Document Chatbot — multiple files and powered by GPT / Embedding. You can upload any documents and have a conversation with it, the UI is very good considering they have used another famous open source project for it.

It uses Langchain and Chatbot-ui under the hood. Built using Nextjs, TypeScript, Tailwind, and Supabase (Vector DB).

If you're wondering about the approach and the technical architecture, then here it is!

architecture

The environment is only for trial and supports a maximum file size of 10 MB which is a drawback, if you want a bigger size then you can install it locally.

They have provided starter questions that you can use. You can check the live demo.

They have 3k stars on GitHub and are on the v0.3 release.

Star ChatFiles ⭐️


15. MindsDB - The platform for customizing AI from enterprise data.

MindsDB

 

MindsDB is the platform for customizing AI from enterprise data.

With MindsDB, you can deploy, serve, and fine-tune models in real-time, utilizing data from databases, vector stores, or applications, to build AI-powered apps - using universal tools developers already know.

With MindsDB and its nearly 200 integrations to data sources and AI/ML frameworks, any developer can use their enterprise data to customize AI for their purpose, faster and more securely.

how MindsDB works

You can read the docs and quickstart guide to get started.

They currently support a total of 3 SDKs that is using using Mongo-QL, Python, and JavaScript.

There are several applications of MindsDB such as integrating with numerous data sources and AI frameworks so you can easily bring data and AI together to create and automate custom workflows.

The other common use cases include fine-tuning models, chatbots, alert systems, content generation, natural language processing, classification, regressions, and forecasting. Read more about the use cases and each of them has an architecture diagram with a little info.

use cases

For instance, the chatbot architecture diagram with MindsDB. You can read about all the solutions provided along with their SQL Query examples.

// SQL Query Example for Chatbot
CREATE CHATBOT slack_bot USING database='slack',agent='customer_support'; 
Enter fullscreen mode Exit fullscreen mode

chatbot

Just to tell you about the overall possibilities, you can check out How to Forecast Air Temperatures with AI + IoT Sensor Data. Exciting right :)

mindsdb

They have 21k+ stars on GitHub and are on the v24.4.3.0 with more than 200 releases. By the way, this is the first time I've seen 4 parts in any release as I always followed the semantic release.

Star MindsDB ⭐️


16. Quivr - your GenAI Second Brain.

quivr

 

Quivr, your second brain, utilizes the power of GenerativeAI to be your personal assistant! Think of it as Obsidian, but turbocharged with AI capabilities.

stats

You can read the installation guide.

You can read the docs and see the demo video.

They could provide a better free tier plan but it's more than enough to test things on your end.

It has 30k+ Stars on GitHub with 220+ releases which means they're constantly improving.

Star Quivr ⭐️


17. Animated Drawings - A Method for Animating Children's Drawings of the Human Figure.

animated drawings

 

I mean WOW! Such a cool concept. I don't know about you but I'm damn excited.

This is an open source project by Facebook mainly for research purposes and contains an implementation of the algorithm described in the paper, A Method for Animating Children's Drawings of the Human Figure.

This project has been tested with macOS Ventura 13.2.1 and Ubuntu 18.04. If you're installing on another operating system, you may encounter issues.

They strongly recommend activating a Python virtual environment before installing Animated Drawings.

Read more on the installation instructions and how to quickly get started.

You can follow this complete guide to animate your drawing including how to add multiple characters in scenes, adding a background image and more exciting things.

They have 10k+ stars on GitHub and are solely for research purposes with an MIT license.

Star Animated Drawings ⭐️


18. Background Remover - lets you Remove Background from images and video using AI with a simple CLI.

background remover

 

This is a command line tool to remove background from images and videos using AI.

Get started by installing backgroundremover from pypi.

pip install --upgrade pip
pip install backgroundremover
Enter fullscreen mode Exit fullscreen mode

It is also possible to run this without installing it via pip, just clone the git to locally start a virtual env install requirements and run.

Some of the commands that you can use:

  • Remove the background from a local file image
backgroundremover -i "/path/to/image.jpeg" -o "output.png"
Enter fullscreen mode Exit fullscreen mode
  • remove the background from the local video and overlay it over an image
backgroundremover -i "/path/to/video.mp4" -toi "/path/to/videtobeoverlayed.mp4" -o "output.mov"
Enter fullscreen mode Exit fullscreen mode

You can check all the commands that you can use with CLI.

You can even use it as a library.

from backgroundremover.bg import remove
def remove_bg(src_img_path, out_img_path):
    model_choices = ["u2net", "u2net_human_seg", "u2netp"]
    f = open(src_img_path, "rb")
    data = f.read()
    img = remove(data, model_name=model_choices[0],
                 alpha_matting=True,
                 alpha_matting_foreground_threshold=240,
                 alpha_matting_background_threshold=10,
                 alpha_matting_erode_structure_size=10,
                 alpha_matting_base_size=1000)
    f.close()
    f = open(out_img_path, "wb")
    f.write(img)
    f.close()
Enter fullscreen mode Exit fullscreen mode

You can read the installation instructions and see the live demo.

The input vs The Output.

input image

They have 6k stars on GitHub and we can definitely learn some crucial concepts using this.

Star Background Remover ⭐️


19. Lobe Chat - modern-design LLMs/AI chat framework.

lobe chat

 

An open-source, modern-design ChatGPT/LLMs UI/Framework.
Supports speech-synthesis, multi-modal, and extensible (function call) plugin systems. You can deploy your private OpenAI with one click.

journey

Let's see some of the exciting features of LobeChat:

✅ Multi-Model Service Provider Support.

multi service

They have expanded our support to multiple model service providers, rather than being limited to a single one, to offer users a more diverse and rich selection of conversations.

Find the complete list of 10+ model service providers that they support.

✅ Assistant Market.

Assistant Market

In LobeChat's Assistant Market, creators can discover a vibrant and innovative community that brings together numerous carefully designed assistants. These assistants not only play a crucial role in work scenarios but also provide great convenience in the learning process. Here, everyone can contribute their wisdom and share their personally developed assistants.

market

There are so many awesome applications there. WOW!

✅ Model Vision Recognition.

Model Vision Recognition

LobeChat now supports large language models with visual recognition capabilities such as OpenAI's gpt-4-vision, Google Gemini Pro vision, and Zhipu GLM-4 Vision, enabling LobeChat to have multimodal interaction capabilities. Users can easily upload or drag and drop images into the chat box, and the assistant will be able to recognize the content of the images and engage in intelligent conversations based on them, creating more intelligent and diverse chat scenarios.

✅ Text to Image Generation.

Text to Image Generation

Supporting the latest text-to-image generation technology, LobeChat now enables users to directly utilize the Text-to-image tool during conversations with the assistant. By harnessing the capabilities of AI tools such as DALL-E 3, MidJourney, and Pollinations, assistants can now transform your ideas into images.

✅ Local Large Language Model (LLM) Support.

Local Large Language Model (LLM) Support.

With the powerful infrastructure of Ollama AI and the community's collaborative efforts, you can now engage in conversations with a local LLM (Large Language Model) in LobeChat!

By running the following Docker command, you can experience conversations with a local LLM in LobeChat.

docker run -d -p 3210:3210 -e OLLAMA_PROXY_URL=http://host.docker.internal:11434/v1 lobehub/lobe-chat
Enter fullscreen mode Exit fullscreen mode

✅ Progressive Web App (PWA).

Progressive Web App (PWA)

They have adopted Progressive Web App PWA technology, which is a modern web technology that elevates web applications to a near-native app experience. Through PWA, LobeChat can provide a highly optimized user experience on both desktop and mobile devices, while maintaining lightweight and high-performance characteristics.

✅ Custom Themes.

custom themes

LobeChat places a strong emphasis on personalized user experiences in its interface design and thus introduces flexible and diverse theme modes, including a light mode for daytime and a dark mode for nighttime.

In addition to theme mode switching, we also provide a series of color customization options, allowing users to adjust the application's theme colors according to their preferences.

 

Read about all of the features and use cases.

You can self-host or deploy it using docker. The ecosystem of lobe chat provides 4 packages: lobehub/ui, lobehub/icons, lobehub/tts, and lobehub/lint.

They also provide plugins market where you can find lots of useful plugins that can be used to introduce new function calls and even new ways to render message results. If you want to develop your own plugin, refer to 📘 Plugin Development Guide in the wiki.

plugins market

You can read the docs.

You can check the live demo. It's pretty cool!

demo snapshot

They have 28k+ stars on GitHub with more than 500 releases.

Star Lobe Chat ⭐️


20. Microagents - Agents Capable of Self-Editing Their Prompts.

microagents

 

It's an experimental framework for dynamically creating self-improving agents in response to tasks.

Microagents represent a new approach to creating self-improving agents. Small, microservice-sized (hence, microagents) agents are dynamically generated in response to tasks assigned by the user to the assistant, assessed for their functionality, and, upon successful validation, stored for future reuse.

This enables learning across chat sessions, enabling the system to independently deduce methods for task execution.

This is built using Python, OpenAI's GPT-4 Turbo and Text-Embedding-Ada-002.

You can read the installation instructions. They have mentioned that you should have an OpenAI account with access to gpt-4-turbo and text-embedding-ada-002.

Let's see an example of fetching a Weather Forecast Agent.

You are an adept weather informant. Fetch the weather forecast by accessing public API data using this Python code snippet:

``python
import requests
import json

def fetch_weather_forecast(location, date):
    response = requests.get(f"https://api.met.no/weatherapi/locationforecast/2.0/compact?lat={location[0]}&lon={location[1]}")
    weather_data = response.json()
    for day_data in weather_data['properties']['timeseries']:
        if date in day_data['time']:
            print(day_data['data']['instant']['details'])
            break
``
# Example usage: fetch_weather_forecast((47.3769, 8.5417), '2024-01-22T12:00:00Z')
Note: Replace the (47.3769, 8.5417) with the actual latitude and longitude of the location and the date string accordingly.
Enter fullscreen mode Exit fullscreen mode

If you're wondering how agents are created, then this architectural diagram explains it.

diagram

You can see the working demo.

They have around 700 stars on GitHub and are worth checking out.

Star Microagents ⭐️


21. GPT-4 & LangChain - GPT4 & LangChain Chatbot for large PDF docs.

chat architecture

 

This can be used for the new GPT-4 API to build a chatGPT chatbot for multiple Large PDF files.

The system is built using LangChain, Pinecone, Typescript, OpenAI, and Next.js. LangChain is a framework that simplifies the development of scalable AI/LLM applications and chatbots. Pinecone serves as a vector store for storing embeddings and your PDFs in text format, enabling the retrieval of similar documents later on.

You can read the development guide that involved cloning, installing dependencies, and setting up environments API keys.

You can see the YouTube video on how to follow along and use this.

They have 14k+ Stars on GitHub with just 34 commits. Try it out in your next AI app!

Star GPT-4 &amp; LangChain ⭐️


22. Buzz - transcribes and translates audio offline on your personal computer.

buzz

 

Transcribe and translate audio offline on your personal computer using the power of OpenAI's Whisper.

Buzz is even on the App Store. Get a Mac-native version of Buzz with a cleaner look, audio playback, drag-and-drop import, transcript editing, search, and much more.

You can read the installation instructions.

Exciting features:

  • Import audio and video files and export transcripts to TXT, SRT, and VTT (Demo).
  • Transcription and translation from your computer's microphones to text (Resource-intensive and may not be real-time.
  • It's available on Mac, Windows, and Linux.
  • There is also an option of CLI.

See the demo here!

 

You can read the docs.

They have almost 10k stars on GitHub and are still maintained since the last commit was 2 weeks ago.

Star Buzz ⭐️


23. Deepgram - Build Voice AI into your apps.

deepgram

 

From startups to NASA, Deepgram APIs are used to transcribe and understand millions of audio minutes every day. Fast, accurate, scalable, and cost-effective.

It provides speech-to-text and audio intelligence models for developers.

deepgram options

Even though they have a freemium model, the limits on the free tier are sufficient to get you started.

The visualization is next-level. You can check live streaming response, or audio files and compare the intelligence levels of audio.

streaming

sentiment analysis

You can read the docs.

You can also read a sample blog by Deepgram on How to Add Speech Recognition to Your React and Node.js Project.

If you want to try the APIs to see for yourself with flexibility in models, do check out their API Playground.

Star Deepgram ⭐️


24. OpenDevin - Code Less, Make More.

opendevin

opendevin

 

This an open source project aiming to replicate Devin, an autonomous AI software engineer who is capable of executing complex engineering tasks and collaborating actively with users on software development projects. This project aspires to replicate, enhance, and innovate upon Devin through the power of the open source community.

Just to let you know, this was way before Devin was introduced.

You can read the installation instructions with the requirements.

They use LiteLLM, so you can run OpenDevin with any foundation model, including OpenAI, Claude, and Gemini under the hood.

You can see the demo and contributing guidelines if you're looking to contribute to OpenDevin.

It has 10.7k+ Stars on GitHub and is growing at a rapid pace.

Star OpenDevin ⭐️


25. NPM Copilot - CLI tool for Next.js that can analyze logs in real-time.

npm copilot

 

npm/yarn/pnpm copilot is a command-line tool that uses OpenAI's GPT-3 language model to provide suggestions for fixing errors in your code.

The CLI tool detects the project type and package manager being used in the current directory.
It then runs the appropriate development server command (e.g., npm run dev, yarn run dev, pnpm run dev) and listens for logs generated by the running application.
When an error is encountered, the CLI tool provides suggestions for error fixes in real-time.

Get started by installing the npm-copilot package with the following npm command.

npm install -g npm-copilot
Enter fullscreen mode Exit fullscreen mode

The CLI tool will begin monitoring the logs generated by the Next.js application and provide suggestions for error fixes in real time.

You can use this command to use it in the project.

npm-copilot
Enter fullscreen mode Exit fullscreen mode

They have 338 stars on GitHub and support Next,js, React, Angular, and Vue.js.

Star NPM Copilot ⭐️


26. Mentat - The AI Coding Assistant.

mentat

 

Mentat is the AI tool that assists you with any coding task, right from your command line.

Unlike Copilot, Mentat coordinates edits across multiple locations and files. And unlike ChatGPT, Mentat already has the context of your project - no copy and pasting required!

You can watch this demo to understand a basic overview.

You can read the installation instructions or watch a tutorial to install if you need help.

You can read the docs.

They have 2.3k stars on GitHub and are on the v1 release.

Star Mentat ⭐️


27. FlowGPT - generate flowcharts with AI.

flowgpt

 

FlowGPT is a tool to generate a flowchart with ai (gpt-3.5).

It's built using Next.js, Langchain, Mermaid, and DaisyUI.

You can read the installation instructions.

You can check the gif demo.

It has only 11 commits but has 238 stars on GitHub and was built using TypeScript. It's worth checking out as a minor project.

Star FlowGPT ⭐️


28. reor - Self-organizing AI note-taking app.

reor

 

One of the most exciting projects that I've seen so far, especially because it runs models locally.

Reor is an AI-powered desktop note-taking app: it automatically links related notes, answers questions on your notes, and provides semantic search.

Everything is stored locally and you can edit your notes with an Obsidian-like markdown editor. The project hypothesizes that AI tools for thought should run models locally by default.

Reor stands on the shoulders of the giants Ollama, Transformers.js & LanceDB to enable both LLMs and embedding models to run locally. Connecting to OpenAI or OpenAI-compatible APIs like Oobabooga is also supported.

I know you're wondering How can it possibly be self-organizing?

a. Every note you write is chunked and embedded into an internal vector database.
b. Related notes are connected automatically via vector similarity.
c. LLM-powered Q&A does RAG on the corpus of notes.
d. Everything can be searched semantically.

You can watch the demo here!

demo

One way to think about Reor is as a RAG app with two generators: the LLM and the human. In Q&A mode, the LLM is fed retrieved-context from the corpus to help answer a query.

Similarly, in editor mode, the human can toggle the sidebar to reveal related notes "retrieved" from the corpus. This is quite a powerful way of "augmenting" your thoughts by cross-referencing ideas in a current note against related ideas from your corpus.

You can read the docs and download from the website. Mac, Linux & Windows are all supported.

They have also provided starter guides so they can help you get started.

get started guides

They have 4.2k stars on GitHub and are built using TypeScript.

Star reor ⭐️


29. Amica - allows you to easily chat with 3D characters in your browser.

amica

 

Amica is an open source interface for interactive communication with 3D characters with voice synthesis and speech recognition.

You can import VRM files, adjust the voice to fit the character, and generate response text that includes emotional expressions.

They use three.js, OpenAI, Whisper, Bakllava for vision, and many more. You can read on How Amica Works with the core concepts involved.

You can clone the repo and use this to get started.

npm i

npm run dev
Enter fullscreen mode Exit fullscreen mode

You can read the docs and see the demo and it's pretty damn awesome :D

demo

You can watch this short video on what it can do.

Amica uses Tauri to build the desktop application.

They have 400+ Stars on GitHub, and it seems very easy to use.

Star Amica ⭐️


30. Continue - enable you to create an AI software development system.

continue

 

Continue to keep developers in the flow. Our open-source VS Code and JetBrains extensions enable you to easily create your own modular AI software development system that you can improve.

They have a lot of awesome features so let's see about some of those:

Easily understand code sections.

code sections

Tab to autocomplete code suggestions.

autocomplete

Ask questions about your codebase.

questions

Quickly use documentation as context.

docs

Understand terminal errors immediately.

errors

Kick off actions with slash commands.

commands

Refactor functions where you are coding.

refactor

Read about all the features.

You will have to install the VSCode extension from the marketplace and then read the quickstart guide.

You can read the docs.

They have 10k+ stars on GitHub and are built using TypeScript.

Star Continue ⭐️


I've never covered so many projects in so much detail!
I hope this will help you create something inspirational.

Please share more projects or anything you want that others can learn from!

Have a great day! Till next time.

I create tech content to help others grow 1% daily so you can follow me on Twitter and LinkedIn to get daily insights.

If you like this kind of stuff,
please follow me for more :)
profile of Twitter with username Anmol_Codes profile of GitHub with username Anmol-Baranwal profile of LinkedIn with username Anmol-Baranwal

Follow Copilotkit for more content like this.

Top comments (25)

Collapse
 
artydev profile image
artydev

Speechless :-)
Will there be an AI fatigue ?

Collapse
 
anmolbaranwal profile image
Anmol Baranwal

Loved the term, haha!
Let's hope not.

Collapse
 
qingyicode profile image
QINGYICODE

aa

Collapse
 
qingyicode profile image
QINGYICODE

aa

Collapse
 
akmojahid profile image
Mujahid Ahmed

That's a big list, but really useful 💞

Collapse
 
jitendrachoudhary profile image
Jitendra Choudhary

Awesome, this is an exhausting list

Collapse
 
milkymaru profile image
junyiwang

This checklist is awesome! Thanks for sharing. I've been using an AI coding tool called MarsCode recently, and it's been helpful.

Collapse
 
anmolbaranwal profile image
Anmol Baranwal

Yep, a promising tool like Cursor but it isn't open source.
Still, thanks for sharing!

Collapse
 
axiommanifold profile image
simon • Edited

nice! thank you O.O

Collapse
 
ferguson0121 profile image
Ferguson

Cool

Collapse
 
benjamin00112 profile image
Benjamin

Wow, that's a lot haha...

Collapse
 
anmolbaranwal profile image
Anmol Baranwal

Thanks Benjamin!
I tried to cover these so that developers can find a lot of useful ones at one place.
Most of these are very new, like Instrukt. It took me a lot of time to find good projects.
I realize now that this is very long, haha!

Collapse
 
herberthk profile image
herberthk • Edited

I'm speechless but AMICA made my day thank you.

Collapse
 
jeremiah_the_dev_man profile image
Jeremiah

Nice one Anmol! 🎯

Collapse
 
anmolbaranwal profile image
Anmol Baranwal

Thanks, Jeremiah.
I know this is quite lengthy, so maybe you could save it to your reading list and come back to it later.

Collapse
 
uliyahoo profile image
uliyahoo

Another incredible list. Great job man!

Collapse
 
anmolbaranwal profile image
Anmol Baranwal

Thanks Uli :)

Some comments may only be visible to logged-in visitors. Sign in to view all comments. Some comments have been hidden by the post's author - find out more