Sree Bhargavi Balija

Machine Learning Research Engineer

Professional Expertise

  • ML Research Engineer with 3+ years of pioneering work in distributed machine learning systems.
  • Designed and deployed enterprise-grade MLOps pipelines incorporating CI/CD workflows, serving 100K+ users with 99.9% system uptime.
  • Developed quantitative risk models using XGBoost and LightGBM in production systems.
  • Optimized high-performance APIs using FastAPI and gRPC, achieving sub-50 ms response times in latency-sensitive AI applications.

News & Honors

  • 🏆 MIT NANDA Radius Fellow 2025
  • 🎓 UC San Diego ML Graduate
  • 💻 Terminal Bench Contributor - Stanford
  • 📜 AAAI 2024 Presenter - Stanford
  • 🔬 Algoverse AI Research Fellow
  • 💡 NSF AI-SDM Workshop Presenter - CMU
  • 🧠 EleutherAI Open AI Research Fellow
  • 🧬 BioSYS 2024 Presenter
  • 🏆 Academic Excellence Award, IIT Hyderabad
  • 🔭 Skill Development Award, ServiceNow
  • 🏅 IMMO Silver Medalist

JUL

Co-authored research paper with MIT Media Lab, Meta, and Stanford: "An Index-based Federated Registry Architecture for Secure, Capability-Aware Agent Discovery"

JUL

Selected as an MIT NANDA Radius Fellow for 2025, joining an elite cohort of researchers working on the NANDA project.

JUN

Project accepted to the 2025 NSF AI-SDM Workshop on Human-AI Complementarity for Decision Making

MAY

Accepted to the Algoverse AI Research Program 2025, selected among top candidates for this advanced AI research initiative

MAY

Contributed to the Terminal Bench project with Prof. Ludwig at Stanford, developing benchmarking tools for AI terminal applications

JUN

Graduated from UC San Diego with Master's degree in Machine Learning

APR

Published at AAAI 2024 (Stanford): "Building Communication Efficient Asynchronous Peer-to-Peer Federated LLMs with Blockchain"
SB Balija, A Nanda, D Sahoo. Proceedings of the AAAI Symposium Series 3(1), 288-292

  • 25+ Citations
  • 7 Publications
  • 5 Fellowships
  • 3 Research Programs

Professional Experience

Machine Learning Engineer
Akdene Technologies
Mar 2025 – Present | USA
  • Developed an AI-powered financial risk engine using TensorFlow, XGBoost, and PyTorch Geometric, deploying real-time fraud detection and credit scoring for 100K+ customers in under 3 months
  • Built a customer analytics system with SQL, Pandas, and LangChain, reducing data retrieval time from 30 minutes to under 5 minutes for financial reporting teams
  • Built an Agentic AI chatbot using GPT-4o, RAG, and AutoGen on Django, cutting support workload by 120+ hrs/month and slashing response time by 65% via real-time semantic caching and multi-agent automation
  • Deployed on serverless AWS with auto-scaling Kubernetes, achieving 99.9% uptime while reducing API costs by 30% through LoRA fine-tuning and Redis Vector Search
Software Engineer
ServiceNow
Jun 2020 – Aug 2022 | India
  • Deployed a scalable backend microservice using Java, Spring Boot, and REST APIs within 6 months, improving data processing speed by 3x for enterprise clients
  • Developed a real-time data visualization dashboard using JavaScript and Elasticsearch in 5 months, enabling real-time monitoring for over 500K events per day
  • Implemented a secure OAuth2-based authentication system, including API development and system integration, reducing unauthorized access incidents by 60% in the first year
  • Accelerated cloud-based deployment on GCP, reducing infrastructure costs by 25% while maintaining 79% uptime
Lead Researcher - Advanced Transformer Optimization
US Meta Research Team
Apr 2024 | USA
  • Pioneered a cutting-edge hybrid sparse attention mechanism combined with precision fine-tuning across three domain-specific datasets, enhancing task accuracy by 15% while slashing inference time by 50%
  • Engineered ultra-efficient transformer pipelines with revolutionary memory optimization, reducing GPU memory consumption by over 4GB per training cycle
  • Innovated AI-driven clustering techniques to streamline token embeddings, reducing pre-processing overhead by 40% and dramatically accelerating end-to-end model training

Featured Projects

AutoPatch+
Next-gen AI code validation platform with hallucination detection for generative coding, showcased at MIT AI Summit 2025
PixPrompt
Small multimodal LLM with 125M parameters for efficient image-text generation
Peer-to-Peer Federated LLMs
Communication-efficient asynchronous federated learning framework for LLMs with blockchain
CPTQuant
Novel mixed-precision quantization techniques for efficient deployment of large language models

Education

Master of Science in Machine Learning and Data Science
University of California, San Diego
Sep 2022 - Jun 2024

Relevant Coursework: Deep Generative Models, AI Learning Algorithms, Sensing and Estimation, Random Processes, Recommender Systems and Data Mining, Search and Optimization, Linear Algebra and its Applications, Introduction to Deep Learning and Applications, Programming for Data Analysis, Statistical Learning

Bachelor's in Engineering
Indian Institute of Technology, Hyderabad
Jul 2016 - May 2020

Relevant Coursework: Algorithms, DBMS, Data Structures, Probabilistic Graphical Models, Artificial Intelligence

Technical Skills

Programming

  • Python (Expert)
  • Java
  • JavaScript
  • C++
  • R

ML/AI

  • Machine Learning
  • Deep Learning
  • NLP
  • Federated Learning
  • Transformers

Frameworks

  • TensorFlow
  • PyTorch
  • XGBoost
  • LangChain
  • Hugging Face

Tools

  • Kubernetes
  • Docker
  • Git
  • CI/CD
  • MLOps

© 2024 Sree Bhargavi Balija. All rights reserved.