DEV Community

Sabah Shariq

A Novice Guide to Large Language Model (LLM)

What is a Large Language Model?

A large language model is an artificial intelligence (AI) model that is trained to understand and generate human language text. For example, as humans, when we want to express something we use a sentence. A sentence is a collection of words put together one after another. But to a computer, it is just a string of characters arranged in a particular order.
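To see the difference between how we read a sentence and how a computer sees it, here is a tiny sketch. The whitespace "tokenizer" below is a deliberate simplification; real LLM tokenizers split text into subword units:

```python
# A sentence is just a string of characters to the computer.
sentence = "The cat sat on the mat"

chars = list(sentence)     # how the computer sees it: a sequence of characters
tokens = sentence.split()  # a very crude "tokenizer": split on spaces
print(chars[:7])           # ['T', 'h', 'e', ' ', 'c', 'a', 't']
print(tokens)              # ['The', 'cat', 'sat', 'on', 'the', 'mat']

# Models work on numbers, so each token is mapped to an integer id.
token_ids = {tok: i for i, tok in enumerate(sorted(set(tokens)))}
print([token_ids[t] for t in tokens])  # the sentence as numbers the model can use
```

Real tokenizers (such as byte-pair encoding) build their vocabularies from huge corpora, but the idea is the same: text in, numbers out.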

In LLM, the term "Large" refers to the number of parameters used to train the model. Based on the data it is fed and trained on, it possesses the capability to provide answers and solutions to questions and problems. The larger the model, the more data it can process and the more complex the patterns it can learn. Large language models are built on the Transformer architecture.


For example, an AI system using a large language model can learn from a database of short stories and then use that knowledge to generate new short stories beginning "Once upon a time".


Pre-trained Transformers:

This means that the model's wealth of knowledge is limited to the vast amount of data it was trained on. If some data was never fed to the model, the model does not know it, and therefore it cannot generate content about it or answer questions on it.



Transformers are a type of neural network capable of understanding the context of sequential data, such as sentences, by analyzing the relationships between the words.

The Transformer in NLP is a novel artificial neural network (ANN) architecture that aims to solve sequence-to-sequence tasks while handling long-range dependencies with ease. It relies entirely on "self-attention" to compute representations of its input and output, without using sequence-aligned RNNs or convolutions.


Let us start by revisiting what attention is in the NLP universe.

Attention allows us to focus on parts of the input sequence while predicting the output sequence. In simpler terms, self-attention helps us create similar connections, but within the same sentence. Look at the following example:

  • “I poured water from the bottle into the cup until it was full.” it => cup
  • “I poured water from the bottle into the cup until it was empty.” it=> bottle

By changing one word, "full" to "empty", the object that "it" refers to changed. If we are translating such a sentence, we need to know what the word "it" refers to.

Given everything discussed above, how do we actually get an LLM to provide answers and solutions to our questions and problems? By using "prompts".


Prompts are a set of instructions given to the LLM by a human in order to obtain a solution, guide its behavior, or generate desired outputs.
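At its simplest, a prompt is just text assembled for the model to continue. Here is a hypothetical sketch (the field names and wording are illustrative, not any particular API's format):

```python
def build_prompt(task, context, question):
    """Combine an instruction, some context, and a question into one prompt string."""
    return (
        f"Instruction: {task}\n"
        f"Context: {context}\n"
        f"Question: {question}\n"
        "Answer:"
    )

prompt = build_prompt(
    task="Answer the question using only the given context.",
    context="I poured water from the bottle into the cup until it was full.",
    question="Which container ended up full?",
)
print(prompt)
```

This string would then be sent to an LLM, which generates the text that follows "Answer:". Crafting such instructions well is what prompt engineering is about.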


Large language models are deep learning models that can be used for everything from generating creative content to aiding in language translation, summarization, and much more. LLMs showcase the incredible potential of machine learning and deep neural networks.

