<hello world/>

I'm Nirajan Paudel

ML Engineer

M.S. Computer Science @ CU Boulder
Building intelligent systems with AI, Cloud & NLP

// About Me

I'm a Master's student in Computer Science at the University of Colorado Boulder, passionate about building intelligent systems that solve real-world problems.

My journey spans from developing AI-driven chatbots to teaching programming fundamentals as a teaching assistant. I thrive at the intersection of Machine Learning, Cloud Infrastructure, and NLP.

Will humans be able to preserve their culture and language in AI models that persist longer than humans themselves?

Research Focus

  • Multilingual NLP & Low-resource Languages
  • Cross-lingual Transfer Learning
  • LLM Adaptation & Fine-tuning
  • Computer Vision & Autonomous Systems

// Projects

Audio RAG Assistant

Transcribes audio with OpenAI Whisper and enables intelligent Q&A using Retrieval-Augmented Generation.

Whisper RAG Python

digitalME Personal Assistant

Personal AI chatbot for natural conversations, document retrieval, and database queries.

Python NLP RAG

PDF Chat Assistant

Upload PDFs and chat with them using Mistral-7B-Instruct for natural language document interaction.

Mistral LLM Python

Nepali Image Captioning

Transformer-based model generating paragraph-length Nepali captions with Inception V3 feature extraction.

Transformer CNN NLP

Tour Recommender - Pokhara

Content-based recommendation system for Pokhara destinations using descriptions, genres, and keywords.

Python ML NLP
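
A minimal sketch of the content-based idea behind the recommender, assuming bag-of-words vectors and cosine similarity. The catalogue text, the function names, and the scoring choice are all illustrative assumptions, not the project's actual data or code.

```python
import math
from collections import Counter

# Hypothetical catalogue: the keyword text is illustrative, not the project's data.
DESTINATIONS = {
    "Phewa Lake": "lake boating sunset relaxing nature",
    "Sarangkot": "sunrise viewpoint hiking mountain nature",
    "World Peace Pagoda": "hiking viewpoint culture stupa lake",
    "Mahendra Cave": "cave adventure geology",
}


def vectorize(text):
    """Turn a description into a bag-of-words term-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(liked, top_k=2):
    """Rank the other destinations by similarity to the one the user liked."""
    query = vectorize(DESTINATIONS[liked])
    scores = [
        (name, cosine(query, vectorize(text)))
        for name, text in DESTINATIONS.items()
        if name != liked
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_k]
```

Calling `recommend("Sarangkot")` ranks the entries that share the most keywords first. A fuller version would likely weight terms (for example with TF-IDF) so that common words count for less.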

Music Separation as a Service

Scalable microservices on GKE for AI-driven music source separation. Separates each MP3 into four stems through an async pipeline.

GKE Redis Flask Demucs

Overview

The system processes user-uploaded MP3s through an asynchronous pipeline, separating each one into four distinct stems (vocals, drums, bass, other) for download.
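
The shape of that pipeline can be sketched as a producer and a consumer that share a queue. This is a simplified illustration, not the deployed code: Python's in-process `queue.Queue` stands in for Redis so the example runs without a server, `separate` is a placeholder for the model call, and every name here is hypothetical. In a Redis-backed version, list commands such as LPUSH and BRPOP could fill the same role.

```python
import queue
import uuid

STEMS = ("vocals", "drums", "bass", "other")

jobs = queue.Queue()  # stand-in for the Redis queue (assumption)
status = {}           # job_id -> state; stand-in for a shared status store


def submit(filename):
    """API side: enqueue the job and return at once with an id to poll."""
    job_id = uuid.uuid4().hex
    status[job_id] = {"state": "queued", "stems": []}
    jobs.put((job_id, filename))
    return job_id


def separate(filename):
    """Placeholder for the slow model call; it only derives output names."""
    base = filename.rsplit(".", 1)[0]
    return [f"{base}-{stem}.mp3" for stem in STEMS]


def work_once():
    """Worker side: take one job off the queue and record its result."""
    job_id, filename = jobs.get()  # blocks until a job is available
    status[job_id]["state"] = "processing"
    status[job_id]["stems"] = separate(filename)
    status[job_id]["state"] = "done"
    jobs.task_done()
```

The point of the split is that `submit` never waits on the model, so the API can answer quickly while a separate worker process spends minutes on each song.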

Key Achievements

  • Infrastructure Migration: Spearheaded critical migration from self-hosted MinIO to Google Cloud Storage (GCS), moving data from ephemeral storage to 99.99% durability. Eliminated single point of failure and reduced operational overhead.
  • Asynchronous Processing: Engineered async processing queue using Redis, decoupling lightweight Flask REST API from resource-intensive ML worker. Reduced API response time to under 200ms while handling ML inference tasks averaging 2-5 minutes per song.
  • Scalability & Cost-Efficiency: Designed worker pool for horizontal scaling on GKE. Architecture capable of processing hundreds of songs per hour by dynamically adjusting worker pod replicas based on queue depth.
  • Resource Management: Implemented precise Kubernetes resource requests and limits (6Gi RAM for Demucs worker pods). Prevented pod evictions from memory spikes and ensured predictable performance.
  • Real-time Frontend: Developed dynamic frontend with JavaScript providing real-time job status updates. UI transitions from upload → processing → download links seamlessly.
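
One way the queue-depth scaling rule above could be expressed is a small function that maps backlog to a replica count. This is a sketch under assumptions: the function name, the jobs-per-worker target, and the replica bounds are hypothetical and not taken from the deployed configuration.

```python
import math


def desired_replicas(queue_depth, jobs_per_worker=5, min_replicas=1, max_replicas=10):
    """Return a worker count proportional to the backlog, clamped to a range."""
    if queue_depth < 0:
        raise ValueError("queue_depth must be non-negative")
    # One worker per `jobs_per_worker` queued songs, rounded up.
    wanted = math.ceil(queue_depth / jobs_per_worker)
    return max(min_replicas, min(max_replicas, wanted))
```

With these defaults, an empty queue keeps one worker running and a long backlog is capped at ten. At 2-5 minutes per song, a single worker clears roughly 12 to 30 songs an hour, so reaching hundreds per hour depends on running many workers in parallel.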

Skills Demonstrated

Cloud: Google Kubernetes Engine (GKE), Google Cloud Storage, container orchestration

Backend: Flask, Redis, asynchronous processing, microservices architecture

ML/AI: Demucs model deployment, resource optimization for ML workloads

// Skills & Courses

Languages

Python C++ C SQL Java

AI & ML

TensorFlow PyTorch LangChain/LangGraph Scikit-learn Transformers CNNs RNNs

Cloud & DevOps

AWS GCP Kubernetes Docker Terraform GitHub Actions

Tools & Frameworks

FastAPI Flask Django Git Pandas NumPy

Graduate Coursework

CSCI 5253 Data Center Scale Computing
CSCI 5448 Object Oriented Analysis and Design

// Publications

// Blog