Ellis Brown
PhD Student @ NYU Courant

ellis.brown at nyu dot edu

I am a CS PhD Student at NYU Courant advised by Profs. Saining Xie and Rob Fergus. My research is supported by the NDSEG Fellowship. I have interned at Meta FAIR (w/ Shengyi Qian) and at Ai2 PRIOR (w/ Ross Girshick).

Before NYU, I completed a Master’s at Carnegie Mellon, where I was advised by Profs. Deepak Pathak and Alyosha Efros. Before that, I was a founding research engineer at BlackRock AI Labs, where I worked with Profs. Mykel Kochenderfer, Stephen Boyd, and Trevor Hastie on applied research & finance, and a non-degree grad student at Stanford and Columbia. I did my undergrad at Vanderbilt, where I majored in CS & Math and did research in CogSci & Vision with Prof. Maithilee Kunda. I’m originally from St. Louis, MO and am a proud member of the Osage Nation.

→ If you haven’t made time for a regular check-in with a doctor recently, please do! Even if you feel perfectly healthy.

news

May, 2025 Honored to be recognized as a CVPR 2025 Outstanding Reviewer!
Sep., 2024 Cambrian was accepted to NeurIPS 2024 as an oral presentation 🪼🎉
Mar., 2024 Thrilled to have been awarded the NDSEG Fellowship to support my PhD research at NYU!
Feb., 2024 I will be joining AllenAI (AI2) as a Research Intern this summer in Seattle, working with Ross Girshick!
Aug., 2023 Excited to be starting my PhD at NYU advised by Profs. Saining Xie and Rob Fergus 🎉🗽

research (all)

My research interests lie at the intersection of deep learning, computer vision, and robotics—particularly in the areas of (multimodal) representation learning, self-supervised learning, open-endedness, and agents.

publications

2025

  1. arXiv
    Cambrian-S: Towards Spatial Supersensing in Video
    Shusheng Yang*, Jihan Yang*, Pinzhi Huang, Ellis Brown, Zihao Yang, Yue Yu, Shengbang Tong, Zihan Zheng, Yifan Xu, Muhan Wang, Danhao Lu, Rob Fergus, Yann LeCun, Li Fei-Fei, and Saining Xie
    arXiv preprint arXiv:2511.04670, 2025
  2. arXiv
    Benchmark Designers Should “Train on the Test Set” to Expose Exploitable Non-Visual Shortcuts
    Ellis Brown, Jihan Yang, Shusheng Yang, Rob Fergus, and Saining Xie
    arXiv preprint arXiv:2511.04655, 2025
  3. arXiv
    SIMS-V: Simulated Instruction-Tuning for Spatial Video Understanding
    Ellis Brown, Arijit Ray, Ranjay Krishna, Ross Girshick, Rob Fergus, and Saining Xie
    arXiv preprint arXiv:2511.04668, 2025
  4. COLM
    SAT: Dynamic Spatial Aptitude Training for Multimodal Language Models
    Arijit Ray, Jiafei Duan, Ellis Brown, Reuben Tan, Dina Bashkirova, Rose Hendrix, Kiana Ehsani, Aniruddha Kembhavi, Bryan A. Plummer, Ranjay Krishna, Kuo-Hao Zeng, and Kate Saenko
    In COLM, 2025

2024

  1. Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs
    Shengbang Tong*, Ellis Brown*, Penghao Wu*, Sanghyun Woo, Manoj Middepogu, Sai Charitha Akula, Jihan Yang, Shusheng Yang, Adithya Iyer, Xichen Pan, Ziteng Wang, Rob Fergus, Yann LeCun, and Saining Xie
    In NeurIPS, 2024
  2. V-IRL: Grounding Virtual Intelligence in Real Life
    Jihan Yang, Runyu Ding, Ellis Brown, Xiaojuan Qi, and Saining Xie
    In ECCV, 2024

2023

  1. Your Diffusion Model is Secretly a Zero-Shot Classifier
    In ICCV, 2023
  2. Internet Explorer: Targeted Representation Learning on the Open Web
    In ICML, 2023