September 2024,
FlexCap: Describe Anything in Images in Controllable Detail is accepted at the Conference on Neural Information Processing Systems (NeurIPS) 2024. |
🦾🤖📚we’ve been exploring the landscape of foundational models in robotics—unveiling insights on current trends and open challenges. A must-read for those interested in the path towards general-purpose robotics. #Robotics #FoundationModels #SurveyPaper https://t.co/VziYf3VScn
— Vidhi Jain (@viddivj) December 16, 2023
The winning team from the OVMM competition has a writeup now: https://t.co/Kk9kizAvb4 pic.twitter.com/tZDyHEaDmv
— Chris Paxton (@chris_j_paxton) December 20, 2023
One (sad?) takeaway for me: when we see planning-based and learning methods compare on even footing, in terms of time invested, we basically never see learning-based methods working better.
— Chris Paxton (@chris_j_paxton) December 15, 2023
HomeRobot is 100% a test of generalization, as object *classes* + envs are totally unseen https://t.co/3nBBudNgns
2. Spatial-Language Attention Policies for Efficient Robot Learning (SLAP)
The future of robot butlers starts with mobile manipulation.
— Chris Paxton (@chris_j_paxton) June 21, 2023
We’re announcing the NeurIPS 2023 Open-Vocabulary Mobile Manipulation Challenge!
- Full robot stack ✅
- Parallel sim and real evaluation ✅
- No robot required ✅👀https://t.co/mggAbRhrLP pic.twitter.com/Wartsmkyyl
Excited that our work on SLAP will be appearing at CoRL 2023 @corl_conf
— Priyam Parashar (@Priyam8Parashar) September 11, 2023
See you there and looking forward to chatting about it!
Work with @chris_j_paxton @XiaohanZhang220 @jdvakil @ybisk @viddivj Sam Powers. https://t.co/CU30b3whgS
RT-X: generalist AI models lead to 50% improvement over RT-1 and 3x improvement over RT-2, our previous best models. 🔥🥳🧵
— Quan Vuong (@QuanVng) October 3, 2023
Project website: https://t.co/GAlvFdqwx5 pic.twitter.com/Jzy8b2eOjf
Excited to share our work on using few examples to learn manipulation skills. https://t.co/zPhUMLik1a
— Vidhi Jain (@viddivj) July 19, 2023
Excited to start as a student researcher at Google Deepmind Robotics :) pic.twitter.com/XW2lRqf0qH
— Vidhi Jain (@viddivj) June 6, 2023
Check out Home Robot Challenge @NeurIPSConf 2023! Let’s build robot policies for rearranging homes :) https://t.co/TA6WwceyQc
— Vidhi Jain (@viddivj) June 21, 2023
(1/5) Every home is different, and every person likes things done in their particular way. Therefore, home robots of the future need to both reason about the sequential nature of day-to-day tasks and generalize to user's preferences.
— Vidhi Jain (@viddivj) December 14, 2022
Research |
ANAVI: Audio Noise Awareness using Visuals of Indoors for NAVIgation. Vidhi Jain, Rishi Veerapaneni, Yonatan Bisk. 8th Annual Conference on Robot Learning (CoRL) 2024. webpage | arXiv | video | code | reviews | poster |
Vid2Robot: End-to-end Video-conditioned Policy Learning with Cross-Attention Transformers. Vidhi Jain, Maria Attarian, Nikhil J Joshi, Ayzaan Wahid, Danny Driess, Quan Vuong, Pannag R Sanketi, Pierre Sermanet, Stefan Welker, Christine Chan, Igor Gilitschenski, Yonatan Bisk, Debidatta Dwibedi. 20th Edition of Robotics: Science and Systems (RSS) Conference 2024. webpage | arXiv | video |
Towards General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis. Yafei Hu*, Quanting Xie*, Vidhi Jain*, Jonathan Francis, Jay Patrikar, Nikhil Keetha, Seungchan Kim, Yaqi Xie, Tianyi Zhang, Shibo Zhao, Yu Quan Chong, Chen Wang, Katia Sycara, Matthew Johnson-Roberson, Dhruv Batra, Xiaolong Wang, Sebastian Scherer, Zsolt Kira, Fei Xia, Yonatan Bisk. Preprint 2024. webpage | arXiv | code |
FlexCap: Generating Rich, Localized, and Flexible Captions in Images. Debidatta Dwibedi, Vidhi Jain, Jonathan Tompson, Andrew Zisserman, Yusuf Aytar. 38th Annual Conference on Neural Information Processing Systems (NeurIPS) 2024. webpage | arXiv |
How to Prompt Your Robot: A PromptBook for Manipulation Skills with Code as Policies. Montserrat Gonzalez Arenas, Ted Xiao, Sumeet Singh, Vidhi Jain, Allen Z. Ren, Quan Vuong, Jacob Varley, Alexander Herzog, Isabel Leal, Sean Kirmani, Mario Prats, Dorsa Sadigh, Vikas Sindhwani, Kanishka Rao, Jacky Liang, Andy Zeng. 40th IEEE International Conference on Robotics and Automation (ICRA) 2023. arXiv |
Open X-Embodiment: Robotic Learning Datasets and RT-X Models. Open X-Embodiment Collaboration, Abhishek Padalkar, Acorn Pooley, Ajinkya Jain, Alex Bewley, Alex Herzog, Alex Irpan, Alexander Khazatsky, Anant Rai, Anikait Singh, Anthony Brohan, Antonin Raffin, Ayzaan Wahid, Ben Burgess-Limerick, Beomjoon Kim, Bernhard Schölkopf, Brian Ichter, Cewu Lu, Charles Xu, Chelsea Finn, Chenfeng Xu, Cheng Chi, Chenguang Huang, Christine Chan, Chuer Pan, Chuyuan Fu, Coline Devin, Danny Driess, Deepak Pathak, Dhruv Shah, Dieter Büchler, Dmitry Kalashnikov, Dorsa Sadigh, Edward Johns, Federico Ceola, Fei Xia, Freek Stulp, Gaoyue Zhou, Gaurav S. Sukhatme, Gautam Salhotra, Ge Yan, Giulio Schiavi, Hao Su, Hao-Shu Fang, Haochen Shi, Heni Ben Amor, Henrik I Christensen, Hiroki Furuta, Homer Walke, Hongjie Fang, Igor Mordatch, Ilija Radosavovic, Isabel Leal, Jacky Liang, Jaehyung Kim, Jan Schneider, Jasmine Hsu, Jeannette Bohg, Jeffrey Bingham, Jiajun Wu, Jialin Wu, Jianlan Luo, Jiayuan Gu, Jie Tan, Jihoon Oh, Jitendra Malik, Jonathan Tompson, Jonathan Yang, Joseph J. Lim, João Silvério, Junhyek Han, Kanishka Rao, Karl Pertsch, Karol Hausman, Keegan Go, Keerthana Gopalakrishnan, Ken Goldberg, Kendra Byrne, Kenneth Oslund, Kento Kawaharazuka, Kevin Zhang, Keyvan Majd, Krishan Rana, Krishnan Srinivasan, Lawrence Yunliang Chen, Lerrel Pinto, Liam Tan, Lionel Ott, Lisa Lee, Masayoshi Tomizuka, Maximilian Du, Michael Ahn, Mingtong Zhang, Mingyu Ding, Mohan Kumar Srirama, Mohit Sharma, Moo Jin Kim, Naoaki Kanazawa, Nicklas Hansen, Nicolas Heess, Nikhil J Joshi, Niko Suenderhauf, Norman Di Palo, Nur Muhammad Mahi Shafiullah, Oier Mees, Oliver Kroemer, Pannag R Sanketi, Paul Wohlhart, Peng Xu, Pierre Sermanet, Priya Sundaresan, Quan Vuong, Rafael Rafailov, Ran Tian, Ria Doshi, Roberto Martín-Martín, Russell Mendonca, Rutav Shah, Ryan Hoque, Ryan Julian, Samuel Bustamante, Sean Kirmani, Sergey Levine, Sherry Moore, Shikhar Bahl, Shivin Dass, Shuran Song, Sichun Xu, Siddhant Haldar, Simeon Adebola, Simon Guist, Soroush Nasiriany, Stefan Schaal, Stefan Welker, Stephen Tian, Sudeep Dasari, Suneel Belkhale, Takayuki Osa, Tatsuya Harada, Tatsuya Matsushima, Ted Xiao, Tianhe Yu, Tianli Ding, Todor Davchev, Tony Z. Zhao, Travis Armstrong, Trevor Darrell, Vidhi Jain, Vincent Vanhoucke, Wei Zhan, Wenxuan Zhou, Wolfram Burgard, Xi Chen, Xiaolong Wang, Xinghao Zhu, Xuanlin Li, Yao Lu, Yevgen Chebotar, Yifan Zhou, Yifeng Zhu, Ying Xu, Yixuan Wang, Yonatan Bisk, Yoonyoung Cho, Youngwoon Lee, Yuchen Cui, Yueh-hua Wu, Yujin Tang, Yuke Zhu, Yunzhu Li, Yusuke Iwasawa, Yutaka Matsuo, Zhuo Xu, Zichen Jeff Cui. 40th IEEE International Conference on Robotics and Automation (ICRA) 2023. webpage | arXiv | code |
Spatial-Language Attention Policies for Efficient Robot Learning. Priyam Parashar, Vidhi Jain, Xiaohan Zhang, Jay Vakil, Sam Powers, Yonatan Bisk and Chris Paxton. 7th Annual Conference on Robot Learning (CoRL) 2023. webpage | arXiv | code | reviews |
HomeRobot: Open-Vocabulary Mobile Manipulation. Sriram Yenamandra, Arun Ramachandran, Karmesh Yadav, Austin S Wang, Mukul Khanna, Theophile Gervet, Tsung-Yen Yang, Vidhi Jain, Alexander Clegg, John M Turner, Zsolt Kira, Manolis Savva, Angel X Chang, Devendra Singh Chaplot, Dhruv Batra, Roozbeh Mottaghi, Yonatan Bisk, Chris Paxton. 7th Annual Conference on Robot Learning (CoRL) 2023. webpage | arXiv | code | reviews |
Transformers are Adaptable Task Planners. Vidhi Jain, Yixin Lin, Eric Undersander, Yonatan Bisk and Akshara Rai. 6th Annual Conference on Robot Learning (CoRL) 2022. webpage | arXiv | video | code | reviews |
MAEA: Multimodal Attribution in Embodied AI. Vidhi Jain, Jayant Sravan Tamarapalli, Sahiti Yerramilli, and Yonatan Bisk. NeurIPS Workshop on Trustworthy Embodied AI 2022. webpage | arXiv | video | reviews |
Towards Explainable Embodied AI. Vidhi Jain. Master's thesis 2021. pdf |
Learning to capture spatial semantic priors for indoor navigation. Vidhi Jain, Shishir Patil, Prakhar Agarwal and Katia Sycara. NeurIPS Object Representations for Learning and Reasoning (ORLR) 2020. pdf | webpage | arXiv | video | code |
Predicting strategies in simulated search and rescue tasks. Vidhi Jain, Rohit Jena, Huao Li, Tejus Gupta, Dana Hughes, Michael Lewis and Katia Sycara. NeurIPS AI for Humanitarian Assistance and Disaster Response (AIADR) 2020. arXiv | video | slides |
Learning to navigate in unseen cluttered environments. Vidhi Jain, Ganesh Iyer and Katia Sycara. NeurIPS Women in Machine Learning workshop (WiML) 2020. pdf | poster |
Coping with sample inefficiency in deep reinforcement learning. Vidhi Jain, Simin Liu, and Ganesh Iyer. ICML Women in Machine Learning Un-Workshop (WiML) 2020. pdf | slides |
Investigating the viability of Generative Models for Novelty Detection. Vidhi Jain. Bachelor's thesis 2018. pdf |
Symptomatic Diagnosis and Prognosis of Psychiatric Disorders through Personal Gadgets. Vidhi Jain, Prakhar Agarwal. ACM CHI Extended Abstracts (CHI EA'17) 2017. pdf | webpage | slides | poster |
Model Selection Scores for Multi-Relational Bayesian Networks. Sajjad Gholami, Oliver Schulte, Vidhi Jain, Qiang Zhao. IJCAI Declarative Learning Based Programming (DeLBP) 2017. pdf | code |
Empowering API Consumer Community: Collaborative Annotation of Web API Documentation for Semantically Structured Format. Vidhi Jain and Matthias Frank. Grace Hopper Conference India (GHCI) 2016. pdf | poster |
Talks |
Education |
Design and source code from Leonid Keselman's Jekyll fork and Jon Barron's website |