Refact AI for Refact AI

Originally published at refact.ai

We trained a small 1.6b code model that reaches 32% HumanEval

Today we're introducing Refact LLM: a 1.6B code model with real-time code completion (including fill-in-the-middle (FIM) capability) and chat.
Refact LLM achieves state-of-the-art HumanEval performance among code LLMs of similar size: it comes close to StarCoder on HumanEval while being roughly 10x smaller, and it beats other code models such as StableCode, CodeGen and ReplitCode on the HumanEval metric.

Summary:

  • 1.6b parameters
  • 20 programming languages
  • 4096 tokens context
  • code completion and chat capabilities
  • SoTA on HumanEval benchmark among similar code models
  • pre-trained on permissively licensed code and available for commercial use

Model              Model Size   HumanEval pass@1
DeciCoder-1b       1b           19.1%
Refact-1.6-fim     1.6b         32.0%
StableCode         3b           20.2%
ReplitCode v1      3b           21.9%
CodeGen2.5-multi   7b           28.4%
CodeLlama          7b           33.5%
StarCoder          15b          33.6%
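
For readers unfamiliar with the metric, here is a minimal sketch of how a pass@k score is commonly computed. It uses the unbiased estimator popularized by the HumanEval benchmark; the post does not say how many samples per problem were drawn for the numbers above, so the function names and the sample counts in the usage note are illustrative assumptions, not our evaluation harness.

```python
from math import comb
from typing import Iterable, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for a single problem.

    n: number of samples generated for the problem
    c: number of those samples that passed all unit tests
    k: number of attempts the metric allows
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: every k-subset contains a pass.
        return 1.0
    # 1 minus the probability that all k drawn samples are failures.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: Iterable[Tuple[int, int]], k: int) -> float:
    """Average pass@k over (n, c) pairs, one pair per benchmark problem."""
    scores = [pass_at_k(n, c, k) for n, c in results]
    return sum(scores) / len(scores)
```

With one sample per problem, pass@1 is simply the fraction of problems solved: `benchmark_pass_at_k([(1, 1), (1, 0), (1, 0), (1, 0)], k=1)` gives 0.25 for this made-up four-problem run.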

The base model was trained on our own dataset of permissively licensed code and on open text datasets (the text-to-code ratio was 50:50). In total, we trained our base model on 1.2T tokens of code on our cluster.

The model was then fine-tuned on open code instruction-following datasets filtered for quality and on a synthetic dataset based on The Stack dedup v1.1, to improve FIM and boost the base model's performance.
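
To make the FIM idea concrete, here is a minimal sketch of how a fill-in-the-middle prompt is typically assembled: the code before and after the cursor is wrapped in sentinel tokens, and the model generates the missing middle. The sentinel strings below follow the StarCoder-style naming and are an assumption in this sketch, as are the helper names; check the released tokenizer for the exact special tokens.

```python
from typing import Tuple

# ASSUMPTION: StarCoder-style sentinel tokens, not confirmed by this post.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"


def split_at_cursor(source: str, cursor: int) -> Tuple[str, str]:
    """Split a source file into the text before and after the cursor offset."""
    if not 0 <= cursor <= len(source):
        raise ValueError("cursor must lie within the source text")
    return source[:cursor], source[cursor:]


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Arrange the context in prefix-suffix-middle order.

    The model is expected to continue after the final sentinel with the
    code that belongs between prefix and suffix.
    """
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"
```

An editor plugin would call `split_at_cursor` on the open file, pass both halves to `build_fim_prompt`, and insert the model's completion at the cursor position.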

You can read more about the architecture decisions that we made in the blog post.

We aim for the model to be accessible to everyone, so we're releasing it for commercial use under the BigScience OpenRAIL-M license and making the weights available on HuggingFace.

While the recent trend has been for model sizes to get bigger, we wanted to lower the barrier to entry and make the model a versatile tool for developers with varying hardware setups. With the smaller size, running the model is much faster and more affordable: it can be served on most modern GPUs, requires just 3 GB of RAM, and works great for real-time code completion tasks.
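
As a rough sanity check on the memory figure, the weights-only footprint can be estimated as parameter count times bytes per parameter. The sketch below assumes 16-bit weights, which the post does not state explicitly, and it ignores activations and the attention cache, so real usage will be somewhat higher.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str = "fp16") -> float:
    """Weights-only memory estimate in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# 1.6B parameters at an ASSUMED 16-bit precision: about 3.2 GB of weights.
refact_fp16_gb = weight_memory_gb(1.6e9, "fp16")
# For comparison, a 15B model at the same precision: about 30 GB.
large_fp16_gb = weight_memory_gb(15e9, "fp16")
```

Under this assumption the 1.6B model needs about 3.2 GB for its weights, in line with the figure quoted above, while a 15B model at the same precision needs about 30 GB.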

Refact LLM can be easily integrated into existing developer workflows with an open-source Docker container and VS Code and JetBrains plugins. With Refact's intuitive user interface, developers can use the model for a variety of coding tasks. Fine-tuning is available in the self-hosted (Docker) and Enterprise versions, making suggestions more relevant to your private codebase.

Refact 1.6B LLM is the third model in the family of our code models, with CodeContrast 3b and CodeContrast 0.3b released previously. We aim to continue with our research and future updates to improve the LLM's performance and capabilities. We would love to get community contributions and feedback to enhance the model further. For any questions and ideas, please visit our Discord.
