vishvaRam/README.md

πŸ’« About Me - Generative AI Engineer

A passionate, results-driven AI Engineer specializing in Generative AI, Hybrid RAG, and Agentic AI systems.
With 1.8 years of hands-on experience across the full AI project lifecycle, I design, build, optimize, and deploy scalable AI-driven solutions for real-world use cases.


πŸš€ Key Skills

  • Agentic AI & Orchestration: CrewAI, LangChain Agents, LangGraph (stateful & multi-step agents)
  • RAG Architectures: Hybrid RAG (Dense + Sparse), LightRAG, Traditional RAG Pipelines
  • Search & Retrieval: Elasticsearch (BM25 + Vector Search), Hybrid Semantic Search
  • LLM Inference & Optimization: Llama.cpp Server, vLLM, Ollama
  • Generative AI Applications: Conversational AI, Knowledge Assistants, Autonomous & Tool-Using Agents
  • Fine-tuning & Cloud Training: RunPod GPU Cloud, LoRA/QLoRA fine-tuning
  • Deployment & Frontend: Streamlit Apps, Docker, API-driven integrations
  • Cloud & Infra: AWS ECR, AWS ECS (Fargate/EC2), Task Definitions
  • Programming Languages: Python (core), with a focus on AI frameworks

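The LoRA/QLoRA entry above can be illustrated with a minimal NumPy sketch of the core idea: the base weight matrix stays frozen, and training only updates a low-rank pair of matrices whose product is added to it. Shapes and values below are illustrative assumptions, not taken from any specific project.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16):
    """Forward pass through a linear layer with a LoRA adapter.

    W: frozen base weight, shape (d_out, d_in)
    A: trainable, shape (r, d_in); B: trainable, shape (d_out, r)
    Effective weight is W + (alpha / r) * B @ A, a rank-<=r update.
    """
    r = A.shape[0]
    delta = (alpha / r) * (B @ A)  # low-rank update
    return (W + delta) @ x

rng = np.random.default_rng(0)
d_in, d_out, r = 8, 6, 2
W = rng.standard_normal((d_out, d_in))
A = rng.standard_normal((r, d_in)) * 0.01
B = np.zeros((d_out, r))  # B starts at zero, so the adapter is initially a no-op
x = rng.standard_normal(d_in)

# With B == 0 the output equals the frozen base layer's output.
assert np.allclose(lora_forward(x, W, A, B), W @ x)
```

Because only `A` and `B` (2 × r × d parameters instead of d_out × d_in) are trained, fine-tuning fits on far smaller GPUs, which is what makes cloud training on services like RunPod economical.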
πŸ’Ό Experience

  • 1.8+ years building and deploying LLM-powered systems in production environments.
  • Designed and implemented Hybrid RAG pipelines combining Elasticsearch BM25 + vector embeddings for high-precision retrieval.
  • Built LightRAG-based systems for low-latency, cost-efficient knowledge retrieval and reasoning.
  • Developed LangGraph-based agents for stateful workflows, tool execution, memory handling, and complex multi-step reasoning.
  • Architected and optimized RAG-powered chatbots, enterprise knowledge assistants, and multi-agent workflows using CrewAI, LangChain, and LangGraph.
  • Optimized inference pipelines with Llama.cpp, vLLM, and Ollama to reduce latency and infrastructure cost.
  • Built interactive Streamlit dashboards and conversational UIs for rapid experimentation and stakeholder demos.
  • Implemented end-to-end AWS ECR + ECS deployments with secure secrets management, autoscaling, and ALB routing.
  • Deployed and served fine-tuned models on RunPod GPU Cloud, leveraging LoRA/QLoRA for efficient training and inference.
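Hybrid retrieval of the kind described above has to merge two differently scaled rankings (BM25 and vector similarity). One common fusion method is Reciprocal Rank Fusion (RRF), sketched below with made-up document IDs; this is a generic illustration, not the production Elasticsearch pipeline itself.

```python
def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists with Reciprocal Rank Fusion.

    rankings: list of ranked doc-id lists (best first).
    Each doc scores sum(1 / (k + rank)) over the lists it appears in,
    so documents ranked well by *both* retrievers rise to the top.
    """
    scores = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25 = ["doc3", "doc1", "doc7"]  # lexical (BM25) ranking
vect = ["doc1", "doc9", "doc3"]  # dense vector ranking
fused = rrf_fuse([bm25, vect])   # doc1 and doc3 appear in both lists
```

RRF needs only ranks, not scores, which is why it works well when combining retrievers whose raw scores are not comparable.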

🌍 Open Source

  • 🧠 LightRAG Contributor
    • Added a Google Gemini-based demo covering Naive, Local, Global, and Hybrid retrieval modes
    • Merged upstream into the main branch: πŸ”— HKUDS/LightRAG#2538

🌐 Socials

LinkedIn · Instagram


πŸ’» Tech Stack

Python
LangChain
LangGraph
CrewAI
LightRAG
Elasticsearch
Llama.cpp
vLLM
Ollama
Streamlit
Docker
AWS ECR
AWS ECS
RunPod
PyTorch
TensorFlow


πŸ“Š GitHub Stats




πŸ“Œ Pinned Repositories

  1. Structured-Output-Examples-for-LLMs

    This repository demonstrates structured data extraction using various language models and frameworks. It includes examples of generating JSON outputs for name and age extraction from text prompts. …

    Python

  2. Data-Prep-for-LLM-fine-tuning

    This repository helps prepare datasets for fine-tuning Large Language Models (LLMs). It includes tools for cleaning, formatting, and augmenting data to improve model performance. Designed for resea…

    Jupyter Notebook

  3. Blog-Writing-Agentic-RAG-CrewAI

    An automated blog writing system that leverages CrewAI to create high-quality, well-researched blog posts. The project implements a multi-agent workflow for researching topics, generating content, …

    Python

  4. Unsloth-FineTuning

    Fine-tuning Qwen 2.5 3B on Reserve Bank of India (RBI) regulations using Unsloth for efficient training. Achieved 57.6% accuracy, an 8.2x improvement over the base model.

    Jupyter Notebook