Few-Shot Learning
Short Definition
Few-shot learning is a machine learning approach in which a model learns to perform a new task from only a handful of labeled examples, rather than the thousands typically required by traditional training.
Full Definition
Few-shot learning addresses one of the most significant practical challenges in machine learning: the need for large labeled datasets. While humans can learn to recognize a new type of bird from seeing just one or two examples, traditional machine learning models typically require thousands or millions of labeled examples. Few-shot learning aims to bridge this gap.

The field has two main branches. In computer vision, meta-learning approaches like Prototypical Networks learn to compare new examples with class prototypes, MAML (Model-Agnostic Meta-Learning) learns initializations that adapt quickly to new tasks, and Siamese Networks learn similarity functions between pairs of examples.

In NLP, few-shot learning has been revolutionized by large language models. GPT-3 demonstrated that providing a few examples in the prompt (in-context learning) enables the model to perform new tasks remarkably well. This in-context few-shot capability has become the primary way users interact with LLMs: showing the model what you want through examples rather than fine-tuning on large datasets.

Few-shot learning has enormous practical value: it enables AI deployment in domains where labeled data is expensive (medical imaging), rare (manufacturing defects), or constantly changing (new product categories). It also powers rapid prototyping, allowing developers to test AI solutions with minimal data investment before committing to full-scale data collection.
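The in-context learning pattern described above can be sketched as a simple prompt builder. This is an illustrative example, not tied to any particular LLM API; the sentiment task and example pairs are made up for demonstration.

```python
# Minimal sketch of assembling a k-shot in-context prompt for an LLM.
# The task (sentiment classification) and the example pairs are invented
# purely for illustration.
examples = [
    ("The movie was wonderful", "positive"),
    ("Total waste of time", "negative"),
    ("I loved every minute", "positive"),
]

def build_prompt(shots, new_input):
    """Format k labeled demonstrations followed by the unlabeled query."""
    lines = [f"Input: {x} -> Output: {y}" for x, y in shots]
    lines.append(f"Input: {new_input} -> Output:")
    return "\n".join(lines)

prompt = build_prompt(examples, "Best purchase I ever made")
print(prompt)
```

The resulting string ends with an open `Output:` slot, so the model's natural continuation is the label for the new input; no gradient updates or fine-tuning are involved.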
Technical Explanation
Prototypical Networks compute each class prototype as the mean embedding of its support examples: c_k = (1/|S_k|) * sum_{x_i in S_k} f_theta(x_i), then classify a query by its distance to the prototypes: P(y=k|x) = softmax(-d(f_theta(x), c_k)). MAML learns an initialization theta that can be quickly adapted to each task T_i with a gradient step: theta'_i = theta - alpha * grad_theta L_{T_i}(theta). Siamese Networks learn a similarity function over pairs: sim(x_1, x_2) = sigma(|f(x_1) - f(x_2)|). For LLMs, in-context learning provides k labeled examples directly in the prompt: "Input: X1 -> Output: Y1, Input: X2 -> Output: Y2, Input: X_new -> Output:". Performance typically improves with more examples (1-shot < 3-shot < 5-shot) but plateaus beyond a certain point.
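The prototype and classification equations above can be sketched numerically. This is a toy NumPy illustration with hand-made 2-D "embeddings" standing in for the output of a learned encoder f_theta; squared Euclidean distance is assumed for d, as in the original Prototypical Networks paper.

```python
import numpy as np

def prototypes(support_emb, support_labels, n_classes):
    """c_k = mean embedding of the support examples of class k."""
    return np.stack([
        support_emb[support_labels == k].mean(axis=0)
        for k in range(n_classes)
    ])

def classify(query_emb, protos):
    """P(y=k|x) = softmax(-d(f(x), c_k)), d = squared Euclidean distance."""
    d = np.sum((protos - query_emb) ** 2, axis=1)
    logits = -d
    exp = np.exp(logits - logits.max())  # stable softmax
    return exp / exp.sum()

# Toy 5-way 1-shot episode: with one shot per class, each prototype
# is just that class's single support embedding.
support = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.], [2., 2.]])
labels = np.array([0, 1, 2, 3, 4])
protos = prototypes(support, labels, n_classes=5)

probs = classify(np.array([0.1, 0.1]), protos)
print(probs.argmax())  # query sits nearest prototype 0
```

Because classification reduces to nearest-prototype matching in embedding space, adding a new class at test time only requires computing one more mean; no retraining is needed.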
Use Cases
Advantages
Disadvantages
Schema Type
Featured Snippet Candidate
Difficulty Level