LARGE LANGUAGE MODEL AND SMALL LANGUAGE MODEL
How Does an LLM Work
Large language models (LLMs) work through a step-by-step process that involves training and
inference. Here is a detailed explanation of how LLMs function.
Step I: Data collection
The first step in training an LLM is to collect a vast amount of textual data. This data can
come from books, articles, websites, and other sources of written text. The more diverse and
comprehensive the dataset, the better the LLM's understanding of language and the world.
Step II: Tokenization
Once the training data is collected, it undergoes a process called tokenization. Tokenization
involves breaking down the text into smaller units called tokens. Tokens can be words,
subwords, or characters, depending on the specific model and language. Tokenization allows
the model to process and understand text at a granular level.
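To make this concrete, here is a minimal sketch in Python of a toy word-level tokenizer. The vocabulary, the example corpus, and the helper names are illustrative assumptions; production LLMs typically use subword schemes such as byte-pair encoding rather than whole words.

```python
# Minimal sketch of a toy word-level tokenizer (illustrative only;
# real LLMs usually use subword schemes such as byte-pair encoding).

def build_vocab(corpus):
    """Assign an integer id to every unique whitespace-separated token."""
    vocab = {"<unk>": 0}  # id 0 reserved for out-of-vocabulary tokens
    for text in corpus:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab

def tokenize(text, vocab):
    """Map text to a list of token ids, falling back to <unk> for unknowns."""
    return [vocab.get(token, vocab["<unk>"]) for token in text.lower().split()]

corpus = ["the model learns language patterns",
          "the model predicts the next token"]
vocab = build_vocab(corpus)
print(tokenize("the model predicts patterns", vocab))  # -> [1, 2, 6, 5]
```

Each sentence becomes a sequence of integers the model can operate on; unseen words map to the `<unk>` id, which is one reason subword tokenizers are preferred in practice.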
Step III: Pre-training
The LLM then undergoes pre-training, learning from the tokenized text data. The model learns
to predict the next token in a sequence, given the preceding tokens. This self-supervised
learning process helps the LLM absorb language patterns, grammar, and semantics.
Pre-training typically uses a variant of the transformer architecture, which incorporates
self-attention mechanisms to capture relationships between tokens.
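The sketch below, written in Python with PyTorch, illustrates the next-token prediction objective on top of a single transformer layer with a causal mask. The model size, token ids, and hyperparameters are placeholders for illustration, not a real training setup.

```python
# Minimal sketch of the next-token prediction objective (toy scale).
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab_size, d_model, seq_len = 100, 64, 8

embed = nn.Embedding(vocab_size, d_model)
block = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
head = nn.Linear(d_model, vocab_size)

tokens = torch.randint(0, vocab_size, (1, seq_len))  # a toy token sequence
inputs, targets = tokens[:, :-1], tokens[:, 1:]      # shift by one position

# Causal mask so each position attends only to the tokens before it.
mask = nn.Transformer.generate_square_subsequent_mask(inputs.size(1))

hidden = block(embed(inputs), src_mask=mask)
logits = head(hidden)                                # (batch, seq-1, vocab)

# Cross-entropy between predicted distributions and the actual next tokens.
loss = F.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))
loss.backward()  # gradients for one optimization step
print(loss.item())
```

In actual pre-training this same loss is averaged over vast numbers of token sequences and minimized with a gradient-based optimizer, which is how the model gradually picks up the patterns of language described above.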