Asynchronous Federated Retrieval-Augmented Generation
Supervisor: Rajshekar Kolichala
Author: N/A
Abstract
This thesis aims to design and deploy an asynchronous Federated RAG system: a privacy-focused framework for medical question answering (QA), trained on partitioned datasets and optimized for heterogeneous client environments. The work offers practical experience across the end-to-end development pipeline, including data partitioning and embedding, fine-tuning compact language models with triplet and cross-entropy losses, implementing RAG pipelines with vector databases, and applying staleness-aware aggregation strategies tailored to federated deployment. Participants will gain exposure to real-world system integration, applied federated machine learning workflows, and collaborative research in AI privacy and efficiency.
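
To make "staleness-aware aggregation" concrete, here is a minimal sketch of one possible strategy. The abstract does not specify which strategy the thesis will use, so everything here is an assumption for illustration: the polynomial discount, the function names `staleness_weight` and `async_aggregate`, and the parameters `base_alpha` and `decay` are all hypothetical. The idea is that the server mixes each arriving client model into the global model, and the mixing weight shrinks the more server rounds have passed since that client last pulled the global model.

```python
from typing import List


def staleness_weight(staleness: int, base_alpha: float = 0.5, decay: float = 0.5) -> float:
    """Hypothetical polynomial discount: staler updates get a smaller mixing weight."""
    if staleness < 0:
        raise ValueError("staleness must be non-negative")
    return base_alpha * (1.0 + staleness) ** (-decay)


def async_aggregate(
    global_weights: List[float],
    client_weights: List[float],
    server_round: int,
    client_round: int,
    base_alpha: float = 0.5,
    decay: float = 0.5,
) -> List[float]:
    """Blend one client's model into the global model as soon as it arrives.

    client_round is the server round at which the client last fetched the
    global model; the difference to server_round is the update's staleness.
    """
    if len(global_weights) != len(client_weights):
        raise ValueError("weight vectors must have the same length")
    alpha = staleness_weight(server_round - client_round, base_alpha, decay)
    # Convex combination: (1 - alpha) * global + alpha * client.
    return [(1.0 - alpha) * g + alpha * c for g, c in zip(global_weights, client_weights)]


# A fresh update (staleness 0) is mixed in with the full base weight,
# while an update that is 3 rounds old is mixed in with half of it.
fresh = async_aggregate([0.0, 0.0], [1.0, 2.0], server_round=5, client_round=5)
stale = async_aggregate([0.0, 0.0], [1.0, 2.0], server_round=5, client_round=2)
```

In a real deployment the flat lists would be replaced by model parameter tensors, and the discount function would be one of the design choices evaluated in the thesis rather than fixed in advance.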
