Professional Overview
I am a Senior AI Engineer with 4+ years of experience building and scaling production-grade AI systems, specializing in large language models (LLMs), retrieval-augmented generation (RAG), and intelligent agent architectures. I have hands-on expertise in fine-tuning foundation models with techniques such as LoRA, QLoRA, and RL-based optimization, and in deploying high-performance inference systems with vLLM and TensorRT-LLM. My work focuses on delivering real-world impact: reducing latency, lowering costs, and improving model accuracy at scale.