Ritik Bompilwar
AI Researcher & Engineer
Video Understanding
Multimodal AI
Agentic Harness
Hello, I am Ritik. I build AI systems that combine strong research with practical product development. My work spans agentic AI, multimodal models, video understanding, deep learning, and computer vision, with expertise in three areas:
Agentic Harness & AI Engineering: As part of an academic collaboration, I developed AI agents to optimize procurement operations for the Commonwealth of Massachusetts, contributing to work recognized with a NASPO award. I also built AdGen, a multimodal creative storytelling platform that lets users generate AI images and videos, and Ragnarok, an MCP server that enables secure web search for open-source LLMs.
Multimodal Models & Video Understanding: My master's thesis focuses on pose-based temporal activity segmentation and shot progress in cricket. For it, I built a 72,000-frame fine-grained video understanding dataset and developed a novel method for biomechanical analysis and shot recognition in cricket videos. QualiVision, my work on quality evaluation of AI-generated videos, was featured on the challenge leaderboard and published at ICCVW alongside the official VQualA 2025 Challenge paper. I also built AdRec, a multimodal product recommendation system based on product images.
Deep Learning & Computer Vision: At Dalhousie University, I developed a machine vision system for weed detection in wild blueberry fields to help farmers reduce spraying costs. I built semi-supervised data-labeling workflows, benchmarked object-detection models, and deployed optimized models for real-time inference on edge devices. At Big Vision, I developed applied computer-vision systems for medical imaging and aerial object detection, and authored a three-part series on TensorFlow Lite model optimization for LearnOpenCV.