
Professional Expertise
- ML Research Engineer with 3+ years of pioneering work in distributed machine learning systems.
- Designed and deployed enterprise-grade MLOps pipelines incorporating CI/CD workflows, serving 100K+ users with 99.9% system uptime.
- Developed quantitative risk models using XGBoost and LightGBM in production systems.
- Optimized high-performance APIs using FastAPI and gRPC, achieving sub-50ms response times in latency-sensitive AI applications.
News & Honors
Co-authored research paper with MIT Media Lab, Meta, and Stanford: "An Index-based Federated Registry Architecture for Secure, Capability-Aware Agent Discovery"
Selected as MIT NANDA Radius Fellow for 2025 - joining an elite cohort of researchers working on the NANDA project.
Project accepted to the 2025 NSF AI-SDM Workshop on Human-AI Complementarity for Decision Making
Accepted to Algoverse AI Research Program 2025 - Selected among top candidates for advanced AI research initiative
Contributed to the Terminal Bench project with Prof. Ludwig at Stanford - Developing benchmarking tools for AI terminal applications
Graduated from UC San Diego with Master's degree in Machine Learning
Published at AAAI 2024 (Stanford):
"Building Communication Efficient Asynchronous Peer-to-Peer Federated LLMs with Blockchain"
SB Balija, A Nanda, D Sahoo. Proceedings of the AAAI Symposium Series 3(1), 288-292
Professional Experience

- Developed an AI-powered financial risk engine using TensorFlow, XGBoost, and PyTorch Geometric, deploying real-time fraud detection and credit scoring for 100K+ customers in under 3 months
- Built a customer analytics system with SQL, Pandas, and LangChain, reducing data retrieval time from 30 minutes to under 5 minutes for financial reporting teams
- Built an Agentic AI chatbot using GPT-4o, RAG, and AutoGen on Django, cutting support workload by 120+ hrs/month and slashing response time by 65% via real-time semantic caching and multi-agent automation
- Deployed on serverless AWS with auto-scaling Kubernetes, achieving 99.9% uptime while reducing API costs by 30% through LoRA fine-tuning and Redis Vector Search

- Deployed a scalable backend microservice using Java, Spring Boot, and REST APIs within 6 months, improving data processing speed by 3x for enterprise clients
- Developed a data visualization dashboard using JavaScript and Elasticsearch in 5 months, enabling real-time monitoring of over 500K events per day
- Implemented a secure OAuth2 authentication system, covering API development and system integration, reducing unauthorized access incidents by 60% in the first year
- Accelerated cloud-based deployment on GCP, reducing infrastructure costs by 25% while maintaining 79% uptime

- Developed a hybrid sparse attention mechanism combined with precision fine-tuning across three domain-specific datasets, improving task accuracy by 15% while cutting inference time by 50%
- Engineered memory-optimized transformer pipelines, reducing GPU memory consumption by over 4GB per training cycle
- Developed AI-driven clustering techniques to streamline token embeddings, reducing pre-processing overhead by 40% and accelerating end-to-end model training
Featured Projects


Education

Relevant Coursework: Deep Generative Models, AI Learning Algorithms, Sensing and Estimation, Random Processes, Recommender Systems and Data Mining, Search and Optimization, Linear Algebra and its Applications, Introduction to Deep Learning and Applications, Programming for Data Analysis, Statistical Learning

Relevant Coursework: Algorithms, DBMS, Data Structures, Probabilistic Graphical Models, Artificial Intelligence
Technical Skills
Programming
- Python (Expert)
- Java
- JavaScript
- C++
- R
ML/AI
- Machine Learning
- Deep Learning
- NLP
- Federated Learning
- Transformers
Frameworks
- TensorFlow
- PyTorch
- XGBoost
- LangChain
- Hugging Face
Tools
- Kubernetes
- Docker
- Git
- CI/CD
- MLOps
© 2024 Sree Bhargavi Balija. All rights reserved.