Paola Cascante-Bonilla

Dr. Paola Cascante-Bonilla received her Ph.D. in Computer Science at Rice University in 2024, advised by Professor Vicente Ordóñez Román, working on Computer Vision, Natural Language Processing, and Machine Learning. She is the recipient of the Ken Kennedy Institute SLB Graduate Fellowship (2022/23), and was selected as a Future Faculty Fellow by Rice's George R. Brown School of Engineering (2023) and as a Rising Star in EECS (2023). She received a Master of Computer Science at the University of Virginia and a B.S. in Engineering at the Tecnológico de Costa Rica. Previously, she interned at the Mitsubishi Electric Research Laboratories (MERL) and twice at the MIT-IBM Watson AI Lab. Before that, she spent 10 years working as a Software Engineer at different tech companies. Here is my CV and a Research Summary of my work during my graduate studies.

               


I'm joining Stony Brook University (SUNY) as an Assistant Professor in the Department of Computer Science. I'm looking for students to join my lab in Fall 2025.
If you're interested in doing some exciting research with me, please send me an email.

Research Topics

Vision and Language & Multi-modal learning:
Zero/few-shot learning, representation learning, continual learning.
Visual-question answering, crossmodal retrieval, multi-hop reasoning.

Synthetic data generation for compositionality and privacy protection:
Simulated environments to provide a safe, controlled setting where agents can learn.
Virtual playgrounds that allow systems to experience and interact within the 3D space.

Dynamic evaluations and real-world applications:
Data distribution and bias mitigation.
Assessing the performance and effectiveness of models under varying conditions.
News
05/2024. Pleased to be recognized as an Outstanding Reviewer for #CVPR2024!
02/2024. One paper accepted to #CVPR2024!
02/2024. I was invited to give a talk at the AI Safety research club @ UCLA.
12/2023. Our second edition of "What is Next in Multimodal Foundation Models? - MMFM Workshop" is accepted to #CVPR2024. See you in Seattle! ✨
09/2023. One paper accepted to #NeurIPS2023 as spotlight! Quite fun to work with LLMs for Vision+Language~!
08/2023. I've been selected for Rising Stars in EECS (2023), to be held at Georgia Tech, Atlanta!
08/2023. I've been accepted to the #ICCV2023 Doctoral Consortium & granted the DEI award for attending ICCV!
07/2023. I've been selected as a Future Faculty Fellow for the 2023-2024 academic year! 📢
07/2023. Going beyond nouns... is accepted to #ICCV2023! - Work featured in MIT News and Rice CS News.
05/2023. Pleased to be recognized as an Outstanding Reviewer for #CVPR2023!
03/2023. I'm co-organizing the Women in Computer Vision (WiCV) and What is Next in Multimodal Foundation Models? (MMFM) workshops at #ICCV2023! See you in Paris!
03/2023. I accepted a Research PhD Internship at MERL this Summer to work on Few-shot Action Recognition!
02/2023. Two papers accepted to #CVPR2023 on lifelong/continual learning!
01/2023. I got awarded the Ken Kennedy Institute 2022/23 SLB Graduate Fellowship.
03/2022. SimVQA is accepted to #CVPR2022. Work featured in Rice News.
01/2022. Co-organizing the LatinXinCV research workshop at CVPR 2022. Co-chairing the Mentorship Program.
01/2022. I'm moving to Houston, TX to continue my PhD at Rice University.
12/2021. I'm returning to the MIT-IBM Watson AI Lab as a PhD Intern Researcher next Summer.
11/2021. One paper accepted to BMVC 2021. It's all about image patches and evolution!
09/2021. Got my Master's in Computer Science at the University of Virginia. GPA 4.0.
05/2021. Featured as the talent of May in the Costa Rican Talent Network Abroad (Ticotal). National Science Academy, Costa Rica.
01-10/2021. Co-organizing the LatinXinCV research workshop at CVPR 2021, ICML 2021, ICCV 2021. Co-chairing the Mentorship Program.
02/2021. Curriculum Labeling got accepted to AAAI 2021.
01/2021. Accepted a Summer PhD Internship at the MIT-IBM Watson AI Lab.
08/2020. Invited to give a workshop at the International Meeting on Artificial Intelligence and its Applications (RIIAA).


Preprints
PropTest: Automatic Property Testing for Improved Visual Programming.
Jaywon Koo, Ziyan Yang, Paola Cascante-Bonilla, Baishakhi Ray, Vicente Ordóñez.
March 2024.
[project page] [bibtex]
Learning from Models and Data for Visual Grounding.
Ruozhen He, Paola Cascante-Bonilla, Ziyan Yang, Alexander C. Berg, Vicente Ordóñez.
March 2024.
[project page] [bibtex]
Grounding Language Models for Visual Entity Recognition.
Zilin Xiao, Ming Gong, Paola Cascante-Bonilla, Xingyao Zhang, Jie Wu, Vicente Ordonez.
February 2024.
[code] [bibtex]
On the Transferability of Visual Features in Generalized Zero-Shot Learning.
Paola Cascante-Bonilla, Leonid Karlinsky, James Seale Smith, Yanjun Qi, Vicente Ordóñez.
November 2022.
[code] [bibtex]

Publications
Improved Visual Grounding through Self-Consistent Explanations.
Ruozhen He, Paola Cascante-Bonilla, Ziyan Yang, Alexander C. Berg, Vicente Ordóñez.
2024 Conference on Computer Vision and Pattern Recognition. CVPR 2024.
Seattle, Washington. June 2024. [project page] [bibtex]
Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models (spotlight).
Sivan Doveh, Assaf Arbelle, Sivan Harary, Roei Herzig, Donghyun Kim, Paola Cascante-Bonilla, Amit Alfassy, Rameswar Panda, Raja Giryes, Rogerio Feris, Shimon Ullman, Leonid Karlinsky.
Thirty-seventh Conference on Neural Information Processing Systems. NeurIPS 2023.
New Orleans, Louisiana. December 2023. [arxiv] [bibtex]
Going Beyond Nouns With Vision & Language Models Using Synthetic Data.
Paola Cascante-Bonilla, Khaled Shehada, James Seale Smith, Sivan Doveh, Donghyun Kim, Rameswar Panda, Gül Varol, Aude Oliva, Vicente Ordonez, Rogerio Feris, Leonid Karlinsky.
The 19th International Conference on Computer Vision. ICCV 2023.
Paris, France. October 2023. [arxiv] [project page] [bibtex]
ConStruct-VL: Data-Free Continual Structured VL Concepts Learning.
James Seale Smith, Paola Cascante-Bonilla, Assaf Arbelle, Donghyun Kim, Rameswar Panda, David Cox, Diyi Yang, Zsolt Kira, Rogerio Feris, Leonid Karlinsky.               
2023 Conference on Computer Vision and Pattern Recognition. CVPR 2023.
Vancouver, Canada. June 2023. [arxiv] [bibtex]
CODA-Prompt: COntinual Decomposed Attention-based Prompting for Rehearsal-Free Continual Learning.
James Seale Smith, Leonid Karlinsky, Vyshnavi Gutta, Paola Cascante-Bonilla, Donghyun Kim, Assaf Arbelle, Rameswar Panda, Rogerio Feris, Zsolt Kira.               
2023 Conference on Computer Vision and Pattern Recognition. CVPR 2023.
Vancouver, Canada. June 2023. [arxiv] [bibtex]
SimVQA: Exploring Simulated Environments for Visual Question Answering.
Paola Cascante-Bonilla, Hui Wu, Letao Wang, Rogerio Feris, Vicente Ordonez.
2022 Conference on Computer Vision and Pattern Recognition. CVPR 2022.
New Orleans, Louisiana. June 2022. [arxiv] [project page] [bibtex]
Evolving Image Compositions for Feature Representation Learning.
Paola Cascante-Bonilla, Arshdeep Sekhon, Yanjun Qi, Vicente Ordonez.
The 32nd British Machine Vision Conference. BMVC 2021.
Virtual Conference. November 2021. [arxiv] [project page] [bibtex]
Curriculum Labeling: Revisiting Pseudo-Labeling for Semi-Supervised Learning.
Paola Cascante-Bonilla, Fuwen Tan, Yanjun Qi, Vicente Ordonez.
The 35th AAAI Conference on Artificial Intelligence. AAAI 2021.
Virtual Conference. February 2021. [arxiv] [code] [bibtex]
Drill-down: Interactive Retrieval of Complex Scenes using Natural Language Queries.
Fuwen Tan, Paola Cascante-Bonilla, Xiaoxiao Guo, Hui Wu, Song Feng, Vicente Ordonez.
Conference on Neural Information Processing Systems. NeurIPS 2019.
Vancouver, Canada. December 2019. [arxiv] [code] [bibtex]
Moviescope: Large-scale Analysis of Movies using Multiple Modalities.
Paola Cascante-Bonilla, Kalpathy Sitaraman, Mengjia Luo, Vicente Ordonez.               
August 2019. [arxiv] [project page] [bibtex]
Media coverage: techxplore article
Chat-crowd: A Dialog-based Platform for Visual Layout Composition.
Paola Cascante-Bonilla, Xuwang Yin, Vicente Ordonez, Song Feng.
North American Chapter of the Association for Computational Linguistics. NAACL 2019. System Demonstrations.
Minneapolis, Minnesota. June 2019. [arxiv] [project page] [code] [bibtex]
Teaching

Work Experience

Mitsubishi Electric Research Laboratories (MERL).
Research Intern in the Computer Vision Group. Working on Few-shot Action Recognition.
Host: Anoop Cherian.
May 2023 - Present.

IBM Research.
Research Intern in the Vision Group at the MIT-IBM Watson AI Lab.
Mentor: Leonid Karlinsky. Manager: Rogerio Feris.
May 2022 - March 2023.

IBM Research.
Research Intern in the Vision Group at the MIT-IBM Watson AI Lab.
Exploring simulated environments for Visual Question Answering.
Mentor: Hui Wu. Manager: Rogerio Feris.
May 2021 - Aug 2021.

N3.
Senior Software Engineer.
Dec 2012 - Jul 2018.

Growth Acceleration Partners.
Software Engineer.
Feb 2010 - Oct 2012.

Intel.
Software Engineer.
Nov 2009 - Feb 2010.

Professional Service
Program Chair: MMFM Workshop @ ICCV 2023 & CVPR 2024, WiCV Workshop @ ICCV2023.
Reviewer: CVPR 2022-2024, ECCV 2022-2024, ICCV 2023, WACV 2024, NeurIPS 2022-2024, AAAI 2022-2024, ICLR 2024, ICML 2024, BMVC 2021, ACMMM 2022 Industry Track.
Co-organizer for the What is Next in Multimodal Foundation Models? - MMFM Workshop @ ICCV2023 & CVPR2024.

Diversity and Inclusion
Co-organizing the Women in Computer Vision - WiCV Workshop @ ICCV2023.
Co-organizing the LatinX in CV Workshop @ CVPR2021, ICCV 2021, ICML 2021, CVPR2022. Co-chairing the Mentorship Program.

Extra
Drummer.
2002 - 2016.