Page 7 - LARGE LANGUAGE MODEL AND SMALL LANGUAGE MODEL
Step VIII: Beam search
During the inference phase, LLMs often employ a technique called beam search to
generate the most likely sequence of tokens. Beam search is a search algorithm that
explores several candidate continuations in parallel: at each decoding step it keeps a
fixed number of the highest-scoring partial sequences (the "beams"), extends each one
with possible next tokens, and retains only the top candidates ranked by cumulative
score (typically the sum of token log-probabilities). By considering multiple paths
rather than committing greedily to a single token at each step, this approach helps
generate more coherent and higher-quality text outputs.
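The beam search procedure described above can be sketched as follows. This is a minimal illustration, not an LLM's actual decoder: the `next_token_log_probs` function below is a hypothetical stand-in that returns fixed log-probabilities, whereas a real model would compute them from the sequence so far with a neural network.

```python
import math

# Hypothetical toy "model": log-probabilities of each next token.
# A real LLM would condition these on the sequence generated so far.
def next_token_log_probs(sequence):
    return {"the": -0.5, "cat": -1.0, "sat": -1.5, "<eos>": -2.0}

def beam_search(beam_width=2, max_len=3):
    # Each beam entry is (cumulative log-probability, token sequence).
    beams = [(0.0, [])]
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            # Extend every beam with every possible next token.
            for token, lp in next_token_log_probs(seq).items():
                candidates.append((score + lp, seq + [token]))
        # Keep only the beam_width highest-scoring partial sequences.
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_width]
    return beams
```

With `beam_width=1` this reduces to greedy decoding; larger widths trade extra computation for a broader search over candidate sequences.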
Step IX: Response generation
LLMs generate responses autoregressively: the model predicts the next token from the
input context and its learned knowledge, appends that token to the context, and
repeats until a stopping condition is met (such as an end-of-sequence token or a
length limit). Generated responses can be diverse, creative, and contextually
relevant, mimicking human-like language generation.
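The token-by-token generation loop can be sketched as below. This is a hedged illustration under toy assumptions: the `next_token_distribution` function is hypothetical and returns hand-written probabilities, where a real LLM would produce a distribution over its full vocabulary from its learned weights. Sampling from the distribution (rather than always taking the most likely token) is one common way to obtain the diverse, creative outputs mentioned above.

```python
import random

# Hypothetical toy next-token distribution conditioned on the context.
# A real LLM would compute these probabilities with a neural network.
def next_token_distribution(context):
    if context and context[-1] == "brown":
        return {"fox": 0.9, "dog": 0.1}
    return {"the": 0.4, "quick": 0.3, "brown": 0.2, "<eos>": 0.1}

def generate(prompt, max_tokens=10, seed=0):
    rng = random.Random(seed)       # fixed seed for reproducibility
    tokens = list(prompt)
    for _ in range(max_tokens):
        dist = next_token_distribution(tokens)
        # Sample the next token in proportion to its probability.
        token = rng.choices(list(dist), weights=list(dist.values()))[0]
        if token == "<eos>":        # stop at the end-of-sequence token
            break
        tokens.append(token)
    return tokens
```

Each iteration feeds the growing sequence back in as context, which is why generation cost grows with output length and why a stopping condition is needed.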