Federated Learning

Short Definition

Federated Learning is a decentralized machine learning approach where models are trained collaboratively across multiple devices or organizations while keeping raw data local and private. Only model updates are shared with a central server, preserving data privacy and enabling learning from distributed sensitive data.

Full Definition

Federated Learning addresses one of the most critical challenges in modern AI: how to train models on sensitive data distributed across many devices or organizations without compromising privacy. Introduced by Google in 2016, the approach allows multiple participants to collaboratively train a shared model while each participant’s data remains on their local device.

The process works in rounds: a central server sends the current model to participating devices, each device trains the model locally on its own data, only the model updates (gradients or weight differences) are sent back to the server, and the server aggregates these updates to improve the global model. This is fundamentally different from traditional centralized training, where all data must be collected in one place.

Federated learning has found significant adoption in privacy-sensitive domains. Google uses it to improve keyboard predictions on Android phones without uploading what users type. Apple uses it to improve Siri without centralizing voice data. Healthcare consortiums use it to train diagnostic models across hospitals without sharing patient records. Financial institutions use it for fraud detection models that benefit from collective intelligence without sharing customer data.

Challenges include handling non-IID data distributions across devices, communication efficiency, dealing with unreliable or adversarial participants, and providing formal privacy guarantees through combination with differential privacy.
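The round-based process above can be sketched as a small simulation. This is a minimal illustration, not a production framework: the linear model, the local gradient-descent step, the synthetic client datasets, and all hyperparameters (learning rate, epochs, round count) are illustrative assumptions.

```python
import numpy as np

def local_update(global_w, X, y, lr=0.1, epochs=1):
    """Client-side step: train the received model on local data only."""
    w = global_w.copy()
    for _ in range(epochs):
        # Gradient of mean squared error for a linear model.
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def federated_round(global_w, client_data):
    """One round: broadcast, local training, weighted aggregation."""
    updates, sizes = [], []
    for X, y in client_data:
        updates.append(local_update(global_w, X, y))
        sizes.append(len(y))
    # Server sees only model updates, never the raw (X, y) data.
    weights = np.array(sizes) / sum(sizes)
    return sum(p * w_k for p, w_k in zip(weights, updates))

# Synthetic setup: 3 clients share an underlying linear relationship.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    clients.append((X, X @ true_w + 0.01 * rng.normal(size=50)))

w = np.zeros(2)
for _ in range(20):
    w = federated_round(w, clients)
```

After 20 rounds the global model approaches the shared underlying parameters, even though no client's data ever left its "device".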

Technical Explanation

FedAvg (Federated Averaging) is the standard algorithm. In each round, the server sends the global model w_t to a selected subset of clients; each client k trains locally for E epochs on its own data to obtain w_t^k; and the server aggregates via w_{t+1} = sum_k (n_k/n) * w_t^k, where n_k is client k’s data size and n is the total across selected clients. Communication efficiency is improved through gradient compression, quantization, and reducing the number of communication rounds. Differential privacy adds calibrated Gaussian noise to each update: w_t^k + N(0, sigma^2 * S^2 / n_k), where S is the sensitivity bound. Non-IID challenges arise when local data distributions differ significantly across clients; FedProx mitigates the resulting client drift by adding a proximal term to each local objective. Secure aggregation uses cryptographic protocols so that the server sees only the aggregate, never any individual update.
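The FedAvg aggregation and the noising step can be written out directly. This is a sketch under stated assumptions: the client models, client sizes, and the noise parameters sigma and S are illustrative values chosen here, not prescribed by the algorithm.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical local models w_t^k from three clients, with data sizes n_k.
client_models = [np.array([1.0, 2.0]),
                 np.array([1.2, 1.8]),
                 np.array([0.9, 2.1])]
client_sizes = [100, 300, 600]

# FedAvg aggregation: w_{t+1} = sum_k (n_k / n) * w_t^k
n = sum(client_sizes)
w_next = sum((n_k / n) * w_k
             for n_k, w_k in zip(client_sizes, client_models))

def privatize(w_k, n_k, S=1.0, sigma=0.5):
    """Add Gaussian noise N(0, sigma^2 * S^2 / n_k) per coordinate,
    matching the formula in the text; S is the sensitivity bound."""
    std = sigma * S / np.sqrt(n_k)
    return w_k + rng.normal(0.0, std, size=w_k.shape)

noisy_update = privatize(client_models[0], client_sizes[0])
```

Note how larger clients (bigger n_k) pull the average harder and receive less noise, since the same sensitivity bound is spread over more examples.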

Use Cases

Privacy-preserving mobile keyboard prediction | Healthcare model training across hospitals | Financial fraud detection across institutions | Edge device model improvement | Cross-organizational AI collaboration | Smart home device personalization | Telecommunications network optimization | Insurance risk modeling

Advantages

Strong data privacy protection | Enables collaboration without sharing sensitive data | Reduces data transfer and storage costs | Complies with data protection regulations | Leverages distributed data sources | Works with edge devices and mobile phones

Disadvantages

Slower convergence than centralized training | Communication overhead between participants | Non-IID data distributions complicate training | Vulnerable to adversarial participants | Difficult to debug and monitor | Formal privacy guarantees add noise and reduce accuracy

Primary Keyword

Federated Learning

Schema Type

DefinedTerm

Last Verified Date

17/04/2026

Difficulty Level

Beginner