Location: Göttingen, Germany (work authorization available)
Remote: Yes (preferred)
Willing to relocate: Yes (anywhere in Germany, open to EU)
Technologies: Python, Go, PyTorch, vLLM, TensorRT, Kubernetes, Elasticsearch, Vespa, Apache Beam, PySpark, AWS, GCP
Résumé/CV: https://bluenotebook.io/about
Email: nikhil.kasukurthi [at] gmail [dot] com
ML Engineer with 8 years building production AI systems.
Most recently, as Lead Data Scientist at Eka.care (healthcare, 100K+ doctors), I deployed a Speech LLM (Whisper + Gemma 2) via custom vLLM plugins, cutting inference costs by 60%.
Built medical search end to end, from query log analysis through query decomposition, on Elasticsearch (nDCG@10 +55%, relevance +160%).
Designed MedAssist, an agentic LLM platform with MCP, adopted by Apollo Hospitals. Open-sourced KARMA (https://github.com/eka-care/KARMA-OpenMedEvalKit), an LLM evaluation library for Indian healthcare scenarios.
Architected model serving on Kubernetes (vLLM, Ray Serve, TensorRT), cutting costs by 50% vs SageMaker.
Before that: search ranking at Udaan (India's largest B2B marketplace, +10% conversion via A/B-tested LTR models), research at NCBS-TIFR (published in Bioinformatics), and clinical AI at SigTuple (retinal disease detection through CE certification).
3 peer-reviewed papers (IEEE ISBI, Bioinformatics). Recently completed Stanford CS336 (LLMs from scratch) with distributed training on H100 clusters.
Looking for: Senior/Staff ML Engineer, Applied Scientist, or AI Engineer roles. Especially interested in search/retrieval, LLM infrastructure, or applied ML.
LinkedIn: https://linkedin.com/in/nikhil-kasukurthi
GitHub: https://github.com/nikhil-kasukurthi