Unsupervised Learning
Short Definition
Unsupervised learning is a machine learning paradigm in which an algorithm discovers patterns and structure in unlabeled data, without correct answers to guide the training process.
Full Definition
Unsupervised learning represents one of the three main paradigms of machine learning, alongside supervised and reinforcement learning. Unlike in supervised learning, there are no labeled examples or correct answers to guide the training process; the algorithm must discover the inherent structure of the data on its own. This makes unsupervised learning both more challenging and more versatile, as it can reveal patterns that humans might not have anticipated.

The most common unsupervised learning tasks include clustering (grouping similar data points together, as in customer segmentation), dimensionality reduction (compressing high-dimensional data while preserving important structure, as in PCA and t-SNE), anomaly detection (identifying unusual data points), and generative modeling (learning the underlying data distribution in order to generate new samples).

Unsupervised learning also plays a crucial role in modern AI beyond standalone applications. Self-supervised learning, which powers the pre-training of large language models such as GPT and BERT, is closely related: the model creates its own supervisory signal from the structure of unlabeled data. Autoencoders and variational autoencoders learn compressed representations without supervision. In practice, unsupervised learning often serves as a preprocessing step, helping to reveal data structure before supervised methods are applied.
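The clustering task mentioned above can be illustrated with a minimal K-Means sketch in plain NumPy. Everything here is an illustrative assumption, not part of any particular library: the `kmeans` helper, the deterministic farthest-first seeding (chosen over the usual random or k-means++ initialization purely to keep the sketch reproducible), and the two synthetic "customer segments".

```python
import numpy as np

def farthest_first_init(X, k):
    """Deterministic seeding (farthest-first traversal): start from X[0],
    then repeatedly add the point farthest from the centroids so far."""
    centroids = [X[0]]
    for _ in range(k - 1):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centroids], axis=0)
        centroids.append(X[d2.argmax()])
    return np.array(centroids)

def kmeans(X, k, iters=100):
    """Minimal K-Means: alternately assign points to their nearest centroid
    and move each centroid to the mean of its points, which locally
    minimizes the within-cluster sum of squares."""
    centroids = farthest_first_init(X, k)
    for _ in range(iters):
        # Assignment step: label each point with its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: recompute each centroid as the mean of its cluster.
        new = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids, labels

# Two synthetic, well-separated "customer segments" (illustrative data).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.5, (50, 2)),
               rng.normal(5.0, 0.5, (50, 2))])
centroids, labels = kmeans(X, k=2)
```

On this toy data the two recovered centroids land near (0, 0) and (5, 5); on real segmentation data, the number of clusters k would itself have to be chosen or tuned.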
Technical Explanation
Clustering algorithms partition data into groups: K-Means minimizes the within-cluster sum of squares, DBSCAN finds density-connected regions, and hierarchical clustering builds a dendrogram. Dimensionality reduction techniques include PCA (finding orthogonal directions of maximum variance), t-SNE (preserving local structure in low dimensions), and UMAP (preserving both local and global structure). Generative models learn the data distribution p(x): GANs use adversarial training, VAEs maximize a variational lower bound on the log-likelihood, and autoregressive models decompose p(x) into a product of conditional distributions. The Expectation-Maximization (EM) algorithm fits mixture models by alternating between estimating cluster assignments (the E-step) and updating the model parameters (the M-step).
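The EM alternation described above can be sketched for a one-dimensional Gaussian mixture. The `em_gmm_1d` helper, the quantile-based initialization, and the synthetic two-component data are all illustrative assumptions of this sketch:

```python
import numpy as np

def em_gmm_1d(x, k=2, iters=200):
    """EM for a 1D Gaussian mixture: alternate soft cluster assignments
    (E-step) with parameter re-estimation (M-step)."""
    n = len(x)
    pi = np.full(k, 1.0 / k)                       # mixture weights
    mu = np.quantile(x, np.linspace(0.1, 0.9, k))  # spread-out initial means
    var = np.full(k, x.var())                      # component variances
    for _ in range(iters):
        # E-step: responsibility of each component for each point,
        # proportional to weight times Gaussian density.
        dens = (np.exp(-((x[:, None] - mu) ** 2) / (2 * var))
                / np.sqrt(2 * np.pi * var))
        resp = pi * dens
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update weights, means, and variances from responsibilities.
        nk = resp.sum(axis=0)
        pi = nk / n
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return pi, mu, var

# Synthetic sample from two unit-variance Gaussians centred at -3 and +3.
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-3.0, 1.0, 300), rng.normal(3.0, 1.0, 300)])
pi, mu, var = em_gmm_1d(x, k=2)
```

With this well-separated data, the recovered means approach -3 and +3 and the weights approach 0.5 each; in general EM only finds a local optimum, so initialization matters.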
Use Cases
Customer segmentation via clustering; anomaly detection, such as flagging unusual data points; data compression and visualization via dimensionality reduction; generating new samples with generative models; pre-training large language models via self-supervised learning; exploratory preprocessing before applying supervised methods.
Advantages
Requires no labeled examples; can reveal patterns that humans might not have anticipated; versatile enough to support clustering, compression, anomaly detection, and generation; useful as a preprocessing step for supervised pipelines.
Disadvantages
More challenging than supervised learning because there is no ground truth to guide training or evaluate results; discovered structure (e.g. cluster assignments) can be hard to validate and interpret; outcomes are often sensitive to algorithm and hyperparameter choices, such as the number of clusters.
Schema Type
Featured Snippet Candidate
Difficulty Level