Research
My research focuses primarily on multimodal learning, audio-visual learning, and agentic systems. Below is some of my recent work.
Other Projects
Here are some other projects that I've worked on as part of my coursework or out of personal interest.
Beyond Text: An LLM Agent Approach to Multimodal Reference-Guided Image Editing
Advait Gupta, Rishie Raj, Nithin Skantha Murugan
CMSC848M, 2025
paper
A novel LLM-driven agentic framework that handles indirect, multimodal instructions without dedicated agent retraining.
A Novel Approach for Detecting AI-Generated Images in Zero-Shot Setting
Nithin Skantha Murugan, Krishna Taduri, Rishie Raj
CMSC848K, 2024
paper
A novel approach for detecting AI-generated images using perplexity and cross-perplexity scores computed with autoregressive image generation models.
Parameterized Defogging Network for Object Detection in Adverse Weather Conditions
Rishie Raj, Uthappa Madettira, Nathan Nussbaumer, Sahaj Singh, Lucas Leitao
CMSC472, 2024
paper / poster
A small convolutional neural network that predicts the parameters of differentiable image processing functions in order to defog input images.
Real Time Semantic Segmentation using Efficient Neural Network
Rishie Raj, Uthappa Madettira
ENPM673, 2024
paper / slides
An implementation of the Efficient Neural Network (E-Net), an architecture designed for tasks requiring low-latency operation, fine-tuned for real-time semantic segmentation.
Project NavGuide: A Swarm Navigation Robot System for Warehouse Applications
Rishie Raj, Uthappa Madettira
ENPM700, 2024
code / video
The design and implementation of a multi-robot swarm system for warehouse applications, aimed at improving safety and efficiency in material handling operations.