I am a results-driven AI Engineer specializing in Generative AI, Hybrid RAG, and Agentic AI Systems.
With 1.8 years of hands-on experience across the full AI project lifecycle, I design, build, optimize, and deploy scalable AI-driven solutions for real-world use cases.
- Agentic AI & Orchestration: CrewAI, LangChain Agents, LangGraph (stateful & multi-step agents)
- RAG Architectures: Hybrid RAG (Dense + Sparse), LightRAG, Traditional RAG Pipelines
- Search & Retrieval: Elasticsearch (BM25 + Vector Search), Hybrid Semantic Search
- LLM Inference & Optimization: llama.cpp server, vLLM, Ollama
- Generative AI Applications: Conversational AI, Knowledge Assistants, Autonomous & Tool-Using Agents
- Fine-tuning & Cloud Training: RunPod GPU Cloud, LoRA/QLoRA fine-tuning
- Deployment & Frontend: Streamlit Apps, Docker, API-driven integrations
- Cloud & Infra: AWS ECR, AWS ECS (Fargate/EC2), Task Definitions
- Programming Languages: Python (primary), with a focus on AI frameworks
- 1.8+ years building and deploying LLM-powered systems in production environments.
- Designed and implemented Hybrid RAG pipelines combining Elasticsearch BM25 + vector embeddings for high-precision retrieval.
- Built LightRAG-based systems for low-latency, cost-efficient knowledge retrieval and reasoning.
- Developed LangGraph-based agents for stateful workflows, tool execution, memory handling, and complex multi-step reasoning.
- Architected and optimized RAG-powered chatbots, enterprise knowledge assistants, and multi-agent workflows using CrewAI, LangChain, and LangGraph.
- Optimized inference pipelines with llama.cpp, vLLM, and Ollama to reduce latency and infrastructure cost.
- Built interactive Streamlit dashboards and conversational UIs for rapid experimentation and stakeholder demos.
- Implemented end-to-end AWS ECR + ECS deployments with secure secrets management, autoscaling, and ALB routing.
- Fine-tuned models with LoRA/QLoRA and deployed and served them on RunPod GPU Cloud.
- LightRAG Contributor
- Added a Google Gemini-based demo
- Covers Naive, Local, Global, and Hybrid retrieval modes
- Merged upstream into the main branch
HKUDS/LightRAG#2538
