Question: When you said GPT-4 do you mean latest GPT-4-turbo or original GPT-4?
Anyway, I think you should include all three: GPT-3.5-turbo, GPT-4, and GPT-4-turbo.
Also, I recently wrote something more high-level but relevant to your post here:
Optimizing Codebases for AI Development Era
Dom Sipowicz ・ Feb 11
I've added the info to the article. The recent GPT-4 Turbo (gpt-4-0125-preview) has been used. GPT-3.5 didn't seem interesting to me, as its performance is usually on par with the open-source models. I might add it in the future.
Okay, that was interesting, but a question arises. Most developers work on laptops that don't have enough power to run models locally, so it seems most will do this on servers. And that raises the question: is it profitable at all? Yes, you get control, but what about the financial side of the coin? For example, GitHub Copilot costs $100 per year; how much would it cost to deploy your own model, for example CodeLlama70b? I'm more than sure it will be expensive. I'm not trying to convince you of my point of view, but it seems to me that this is how things are. Until hardware becomes powerful and accessible enough, I think it's too early to talk about how cool it is to deploy your own assistant.
Analyzing the costs would need another article on its own, I think.
Mixtral8x7B can be run on a Mac or a PC with a high-end GPU; neither of those options comes cheaply.
That's why I said we should test our own use cases. I like the flexibility of the setup I described. You can, for example, use CodeLlama70B through a service like Together.ai on a regular basis, but fire up a dedicated instance on your favorite cloud provider for clients that require maximum privacy (imagine a public-sector or healthcare agency).
Accessing the LLMs through the API is quite cheap (well, GPT-4 is not so cheap, really).
There are tons of other tests that could be made, but time is limited. I hope others will share their own tests (for example, it would be interesting to thoroughly compare GitHub Copilot, AWS CodeWhisperer, TabNine, ...)
You used GPT-4 via the OpenAI API, right? Not ChatGPT Plus?
It's a very useful article, thanks!
Yes, all the LLMs are accessed through the API. For the article, I used the OpenAI API to access GPT-4, but I also tested the Azure OpenAI API (not all models are available there, though). Codellama has been accessed through together.ai and Mixtral8x7B through mistral.ai. Cost-wise, I spent a little less than $2 on all the tests I did for this article (some of them run multiple times). GPT-4 accounted for the largest part of that money.
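Since the providers mentioned above (OpenAI, together.ai, mistral.ai) all expose OpenAI-style chat-completion endpoints, one request shape works for each of them. The sketch below only assembles the request payload, without sending it; the base URLs and model identifiers are assumptions for illustration, not taken from the article's actual setup.

```python
# Minimal sketch: building an OpenAI-style chat-completion request for
# any of the three providers discussed in this thread. The base_url and
# model values below are illustrative assumptions; check each provider's
# docs for the current identifiers.

PROVIDERS = {
    "openai":   {"base_url": "https://api.openai.com/v1",
                 "model": "gpt-4-0125-preview"},
    "together": {"base_url": "https://api.together.xyz/v1",
                 "model": "codellama/CodeLlama-70b-Instruct-hf"},
    "mistral":  {"base_url": "https://api.mistral.ai/v1",
                 "model": "open-mixtral-8x7b"},
}

def build_chat_request(provider: str, prompt: str) -> dict:
    """Assemble the URL and JSON body for a POST to {base_url}/chat/completions."""
    cfg = PROVIDERS[provider]
    return {
        "url": f"{cfg['base_url']}/chat/completions",
        "body": {
            "model": cfg["model"],
            "messages": [{"role": "user", "content": prompt}],
        },
    }

req = build_chat_request("together", "Write a Python function that reverses a string.")
print(req["url"])  # -> https://api.together.xyz/v1/chat/completions
```

Because the shape is shared, switching providers (say, from GPT-4 to a cheaper model) is just a matter of changing the base URL, the API key, and the model name.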
Thanks! I know why you decided to test GPT-4 only, but I think comparing against GPT-3.5 would also be very useful. Currently, GPT-3.5-turbo is 20 times cheaper than GPT-4-preview, so if the performance is okay, it would be a very budget-friendly solution.
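As a back-of-envelope check using only the two figures mentioned in this thread (roughly $2 spent on the tests, mostly on GPT-4, and a ~20x price gap between GPT-3.5-turbo and GPT-4-preview):

```python
# Rough cost estimate using only the numbers quoted in this thread.
gpt4_spend = 2.00      # approximate total spend reported above, in USD
cheaper_factor = 20    # stated GPT-3.5-turbo vs GPT-4-preview price ratio

gpt35_estimate = gpt4_spend / cheaper_factor
print(f"Estimated GPT-3.5-turbo cost for the same tests: ${gpt35_estimate:.2f}")
# -> Estimated GPT-3.5-turbo cost for the same tests: $0.10
```

So the same test suite would plausibly cost around ten cents on GPT-3.5-turbo, assuming comparable token usage.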
I am currently learning JavaScript. At what point should I start using a coding assistant?
My view: you should use an assistant while learning.
Whatever you are trying to accomplish (for example solve an exercise) you can compare your solution with one suggested by the assistant. And if what the assistant suggests is wrong (it happens) all the better: trying to understand why something doesn't work is even more instructive than trying to understand why it works.
Also, it's very instructive to learn how to ask the right question to solve your problem. It may be simple for trivial things, but the more you use an assistant, the more you'll learn to phrase your questions effectively.
And you can always ask for more in-depth explanations. You may want to consider using services like Together.ai to access Codellama, or use Mixtral8x7B directly from Mistral.ai. They will cost you a fraction of what GPT-4 costs.
Of course, one may be tempted to leave the solution to the assistant, using it without really understanding the outcome. But this is not an issue specific to AI assistants; I've seen too many "Stack Overflow programmers" around.
Any metrics for Claude?
It's been a while since I last looked at Claude. I'll check on it and might consider expanding the article.
EDIT: Bad luck. It's not available in my country :(
Such an assistant can be very helpful
Is there any solution for a local LLM with a code assistant?
Thanks for sharing.