Pranay Gupta

I am an MSR student at the Robotics Institute at Carnegie Mellon University, where I am co-advised by Prof. Henny Admoni and Prof. David Held. Previously, I was a PreDoc Apprentice at TCS Research, working under Ms. Ramya Hebbalaguppe and Dr. Rahul Narain. I did my undergraduate studies in computer science at IIIT Hyderabad, where I worked as a research assistant at CVIT under the guidance of Dr. Ravi Kiran Sarvadevabhatla.

At CMU, I have been working on problems in the domain of assistive driving. At TCS, I worked on single-view 3D reconstruction. At CVIT, I worked on skeleton-based action recognition, including the zero-shot and generalised zero-shot settings. I spent the summer of 2020 as an Applied Scientist intern at Amazon India, where I worked on semantic text similarity using BERT-based Siamese networks. I have also worked under the supervision of Dr. Manish Gupta on knowledge-aware video question answering.

Email  /  CV  /  Google Scholar  /  Twitter  /  Github

Research

I am currently developing methods for real-time estimation of a driver's Situational Awareness (SA) from their eye gaze. Prior to this, I worked on estimating an object's importance for making safe driving decisions. My research interests revolve around human-robot interaction and human nonverbal behaviour understanding. I am also interested in exploring multimodal learning.

Object Importance Estimation using Counterfactual Reasoning for Intelligent Driving
Pranay Gupta, Abhijat Biswas, Henny Admoni, David Held
IEEE RA-L (under review)
Project Page / Code & Dataset / arXiv

The ability to identify important objects in a complex and dynamic driving environment is essential for autonomous driving agents to make safe and efficient driving decisions. It also helps assistive driving systems decide when to alert drivers. We tackle object importance estimation in a data-driven fashion and introduce HOIST (Human-annotated Object Importance in Simulated Traffic). HOIST contains driving scenarios with human-annotated importance labels for vehicles and pedestrians. We additionally propose a novel approach that relies on counterfactual reasoning to estimate an object's importance. We generate counterfactual scenarios by modifying the motion of objects and ascribe importance based on how the modifications affect the ego vehicle's driving. Our approach outperforms strong baselines for the task of object importance estimation on HOIST. We also perform ablation studies to justify our design choices and show the significance of the different components of our proposed approach.

NEWSKVQA: Knowledge-Aware News Video Question Answering
Pranay Gupta, Manish Gupta
PAKDD, 2022
Dataset / arXiv

In this paper, we explore knowledge-based question answering in the context of news videos. To this end, we curate a new dataset with over 1M multiple-choice question-answer pairs. Using this dataset, we propose a novel approach, NEWSKVQA (Knowledge-Aware News Video Question Answering), which performs multi-modal inference over textual multiple-choice questions, videos, their transcripts, and a knowledge base.

Quo Vadis, Skeleton Action Recognition?
Pranay Gupta, Anirudh Thatipelli, Aditya Aggarwal, Shubh Maheshwari, Neel Trivedi,
Sourav Das, Ravi Kiran Sarvadevabhatla
IJCV, Special Issue on Human pose, Motion, Activities and Shape in 3D, 2021
project page / code / arXiv

In this paper, we study current and upcoming frontiers across the landscape of skeleton-based human action recognition. We introduce Skeletics-152, a large-scale in-the-wild skeleton action dataset. We extend our analysis to out-of-context actions by introducing the Skeleton-Mimetics dataset. Finally, we introduce Metaphorics, a dataset with caption-style annotated YouTube videos of the popular social game Dumb Charades and interpretative dance performances. We benchmark state-of-the-art models on the NTU-120 dataset and provide a multi-layered assessment of the results.

Syntactically Guided Generative Embeddings for Zero-Shot Skeleton Action Recognition
Pranay Gupta, Divyanshu Sharma, Ravi Kiran Sarvadevabhatla
ICIP, 2021
project page / code / arXiv

In this paper, we study the effect of learning Part-of-Speech aware generative embeddings for zero-shot and generalised zero-shot skeleton action recognition.

Service
Teaching Assistant, Computer Vision course, Spring 2020




Design and source code from Jon Barron's website