DEV Community

Wincent Balin

After a pause, this series comes to a conclusion, mostly because of the rapid developments in the field of large language models.

Original intention

At the beginning I intended to create a language model that, given the prompt "Geschirrabwaschgesetz" (a law about washing dishes), would write a corresponding law text in German.

I was discouraged from training the original char RNN by the daunting training time on 110 M of training data. Therefore I went with fine-tuning a German GPT-2 model (and later a better one; thanks Jo!). The fine-tuning process for such a model is described here or here, for example.
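The fine-tuning step can be sketched with Hugging Face Transformers roughly as follows. This is a minimal sketch under assumptions: the model name (`dbmdz/german-gpt2`), the corpus file `gesetze.txt`, and all hyperparameters are illustrative placeholders, not the exact setup used in this series.

```python
# Minimal sketch of fine-tuning a German GPT-2 on a plain-text law corpus.
# Model name, file path, and hyperparameters are assumptions for illustration.

def chunk_token_ids(ids, block_size):
    """Split a flat list of token ids into fixed-size training blocks,
    dropping the trailing remainder (standard causal-LM packing)."""
    n = (len(ids) // block_size) * block_size
    return [ids[i:i + block_size] for i in range(0, n, block_size)]

def main():
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              Trainer, TrainingArguments)

    tokenizer = AutoTokenizer.from_pretrained("dbmdz/german-gpt2")
    model = AutoModelForCausalLM.from_pretrained("dbmdz/german-gpt2")

    # Hypothetical corpus of German law texts, one big plain-text file.
    text = open("gesetze.txt", encoding="utf-8").read()
    ids = tokenizer(text)["input_ids"]

    # For causal LM fine-tuning, the labels are the inputs themselves.
    blocks = chunk_token_ids(ids, block_size=512)
    dataset = [{"input_ids": b, "labels": b} for b in blocks]

    args = TrainingArguments(output_dir="gpt2-gesetze",
                             num_train_epochs=3,
                             per_device_train_batch_size=4)
    Trainer(model=model, args=args, train_dataset=dataset).train()

if __name__ == "__main__":
    main()
```

The packing helper simply drops the incomplete final block; a production setup would typically use the `datasets` library and grouping utilities instead.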

(Un-)expected discovery

I happened to discover that my intended use case is covered almost perfectly by the LLAMA 2 Chat German model (almost, because of a few grammatical errors). This is very likely because it was fine-tuned on, among other datasets, the German Legal SQuAD dataset.

I do not want to withhold the result (produced in LM Studio) from you: Output to "Geschirrabwaschgesetz"
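Outside of LM Studio, the same prompt can be wrapped in the standard LLaMA 2 chat template and fed to the model programmatically. A minimal sketch, assuming the model is available on the Hugging Face Hub under a placeholder name (`jphme/Llama-2-13b-chat-german` is an assumption, check the actual model card):

```python
# Sketch of prompting a LLaMA 2 chat model with the standard chat template.
# The model identifier below is an assumption for illustration.

def format_llama2_prompt(user_msg, system_msg=None):
    """Wrap a user message in the LLaMA 2 chat template
    ([INST] ... [/INST], with an optional <<SYS>> system block)."""
    if system_msg:
        user_msg = f"<<SYS>>\n{system_msg}\n<</SYS>>\n\n{user_msg}"
    return f"[INST] {user_msg} [/INST]"

def main():
    from transformers import pipeline

    generator = pipeline("text-generation",
                         model="jphme/Llama-2-13b-chat-german")
    prompt = format_llama2_prompt("Geschirrabwaschgesetz")
    print(generator(prompt, max_new_tokens=512)[0]["generated_text"])

if __name__ == "__main__":
    main()
```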

Just look at this beauty! It even defined "Hygiene" in the last subparagraph! And hence this series is concluded.
