nikJ13/README.md

Hi there, I'm Niket Jain 👋

MS Computational Data Science @ CMU (GPA 4.03) | ML Intern @ Honeywell | NeurIPS 2025 Datasets Track | Building scalable LLM inference engines & agentic systems

🚀 Featured Experience

| Role | Key Impact | Tech Stack |
| --- | --- | --- |
| Honeywell ML Intern (Jun-Aug 2025) | 2x faster sentence-transformer inference via ONNX + Kubernetes HPA; 35% better RAG w/ multimodal agent; Hackathon winner: 41% maintenance cost reduction | ONNX, vLLM, Kubernetes, MCP |
| CMU Research Assistant (Prof. Carolyn Rose, Jan-May 2025) | 40x speedup in image-guided code editing w/ QwenVL-32B agentic workflow; 150-instance dataset + SOTA baselines (0.7686 similarity) | vLLM, QwenCoder-32B, GPT-4o |
| UBS Software Engineer (Jul 2022-Jul 2024) | 87.5% faster doc processing w/ GPT-3.5 OCR/ETL; 60% data latency reduction via Kafka pipelines | OpenAI APIs, Kafka, Java |

🛠️ Core Skills

Languages & Databases

Python • Java • SQL • C++ • Scala • R
MySQL • PostgreSQL • MongoDB • Redis

ML Frameworks & Infrastructure

PyTorch • Hugging Face • vLLM • TensorFlow • ONNX
Kubernetes • Docker • AWS • Ray • LangChain

DevOps & Tools

Git • SLURM • Helm • Terraform • FastAPI • Kafka
MLflow • Databricks • Modal • CUDA

🔥 Featured Projects

CMU, Dec 2025 | PyTorch, vLLM, Kubernetes

300 concurrent requests on 2x A100s with dynamic batching from first principles. Task routing: Self-Consistency (MMLU), Tool-use (Graph), Greedy (InfoBench).
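For readers unfamiliar with per-task routing, here is a minimal sketch of the idea, not the project's actual code. It assumes a caller-supplied `generate(prompt, temperature)` function; the names `STRATEGY_BY_TASK`, `pick_strategy`, `majority_vote`, and `answer`, the sample count, and the temperatures are all hypothetical. Dynamic batching and the tool-calling loop are left out.

```python
from collections import Counter
from typing import Callable, Dict, List

# Hypothetical routing table mirroring the task list above.
STRATEGY_BY_TASK: Dict[str, str] = {
    "mmlu": "self_consistency",
    "graph": "tool_use",
    "infobench": "greedy",
}


def pick_strategy(task: str) -> str:
    """Look up the decoding strategy for a task; unknown tasks fall back to greedy."""
    return STRATEGY_BY_TASK.get(task, "greedy")


def majority_vote(answers: List[str]) -> str:
    """Return the most frequent answer; ties go to the answer seen first."""
    return Counter(answers).most_common(1)[0][0]


def answer(
    task: str,
    prompt: str,
    generate: Callable[[str, float], str],
    n_samples: int = 5,
) -> str:
    """Route one request to a decoding strategy (illustrative only)."""
    strategy = pick_strategy(task)
    if strategy == "self_consistency":
        # Sample several completions at a nonzero temperature, then vote.
        samples = [generate(prompt, 0.7) for _ in range(n_samples)]
        return majority_vote(samples)
    if strategy == "tool_use":
        # A real system would run a tool-calling loop here.
        raise NotImplementedError("tool-calling loop is out of scope for this sketch")
    # Greedy decoding: a single deterministic completion at temperature 0.
    return generate(prompt, 0.0)
```

In a serving setup, `generate` would typically wrap a batched model call, so the self-consistency samples can share one batch rather than run sequentially.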

CMU, Dec 2025 | C++, CUDA, Python

PyTorch-like framework with custom CPU/GPU backends for sparse ops. Optimized efficient GNN training and inference.

NeurIPS 2025 Datasets Track | Python, Hugging Face, Ray

Benchmarked 8 data attribution methods across Pythia-1B to Llama-3.1-8B. Hugging Face leaderboard with 70% evaluation burden reduction.

Jupyter Notebook | vLLM, LangChain, FastAPI

Scalable multi-agent orchestration framework for complex workflows and distributed inference.

📚 Education & Publications

Carnegie Mellon University: MS Computational Data Science (Aug 2024 - Dec 2025)

  • GPA: 4.0/4.0
  • TA: Mathematical Foundations of ML, Interactive Data Science
  • Coursework: LLM Methods, Deep Learning Systems, LM Inference, Cloud Computing

NeurIPS 2025: Data Attribution Benchmark for LLMs (Datasets & Benchmarks Track) [Paper Link]

🌐 Let's Connect

LinkedIn • Google Scholar • Email


🔍 Open to ML Engineer roles in inference, distributed systems, & LLM infrastructure | Always excited to discuss deep learning systems and scalable ML solutions!

Last updated: December 2025

Pinned

  1. Sportsera Public

    This web application lets users book a sporting venue of their choice. Venues are recommended based on the safety guidelines published by the Government. This application also provides a fe…

    JavaScript 1

  2. Agentic-Inference-Systems Public

    Jupyter Notebook

  3. Inference-Engine-for-Heterogeneous-LLM-Workloads Public

    Python

  4. KV-Caching-and-Speculative-Decoding Public

    Python

  5. Language-Model-Math-Solver-via-Code-Generation Public

    TeX

  6. LLM-Inference-Strategies Public

    Python