
These notes aim to provide a systematic and concise introduction, in Q&A form, to the fundamentals of Large Language Models, serving as a knowledge base for beginners and interviewees. They are motivated by the fact that existing general ML FAQ repos (1, 2, 3) are not up-to-date on Natural Language Processing tasks, due to the fast pace of advancement in LLMs.

After reading this blog, you should be able to understand the following diagrams.

*Figure: Applications of LLM as a service. Credit: Nicole Choi.*

*Figure: Diagram of training a Large Language Model. Credit: Chip Huyen.*

*Figure: Improving LLM performance. Credit: Miguel Carreira Neves.*

*Figure: Diagram of knowledge distillation. Credit: Xu et al., 2024.*

1. What is word embedding?

The term “embedding” originates from mathematics, referring to how one instance of a structure/space/set is contained within another. As a natural language processing method, word embedding maps the space of natural language into a normed vector space, such as a Euclidean space. It is a numerical representation of words and phrases that aims to capture not only their meaning but also their semantic relationships through the norm-induced distance.
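As a minimal sketch of this idea, consider the toy example below (the vectors are made-up illustrative values, not from a trained model): words with related meanings should have a smaller distance, or equivalently a higher cosine similarity, than unrelated ones.

```python
import numpy as np

# Toy 3-dimensional "embeddings" (hypothetical values for illustration only).
embeddings = {
    "cat": np.array([0.9, 0.1, 0.0]),
    "dog": np.array([0.8, 0.2, 0.1]),
    "car": np.array([0.1, 0.9, 0.3]),
}

def cosine_similarity(u, v):
    # Similarity induced by the Euclidean inner product and norm.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine_similarity(embeddings["cat"], embeddings["dog"]))  # high, ~0.98
print(cosine_similarity(embeddings["cat"], embeddings["car"]))  # low,  ~0.21
```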

This requirement on word embeddings, referred to as linguistic regularities (Mikolov et al., 2013), rests on a manifold hypothesis about the space of natural language: high-dimensional numerical representations of words or phrases are assumed to lie near a low-dimensional manifold.
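A well-known consequence of such regularities is that analogies become approximately linear: vec("king") - vec("man") + vec("woman") lands near vec("queen"). The sketch below checks this with pretrained GloVe vectors via gensim; it assumes gensim is installed and the model can be downloaded, and the exact neighbor scores will vary by model.

```python
import gensim.downloader as api

# Download and load 50-dimensional GloVe vectors (Wikipedia + Gigaword).
vectors = api.load("glove-wiki-gigaword-50")

# vec("king") - vec("man") + vec("woman") should be closest to vec("queen").
result = vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1)
print(result)  # expected: [('queen', <similarity score>)]
```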

Further reading: 1 2
