Jon Saad-Falcon


Education

June 2023 - June 2027
Ph.D. in Computer Science
Stanford University, Palo Alto, CA
- Overall GPA - 4.00/4.00, Focus on Natural Language Processing, Information Retrieval, and ML Systems
- GEM Fellowship and EDGE Fellowship
Aug. 2018 — May 2022
B.S. - M.S. Program in Computer Science, Minors in Linguistics and Mathematics
Georgia Institute of Technology, Atlanta, GA
- Overall GPA - 3.96/4.00, Concentrations in Intelligence and Theory
- Stamps President's Scholar, Honors Program, Tapia Scholarship, and U.N. Millennium Fellowship
- Undergraduate Research Thesis - PeopleMap, NLP and Visualization Tool for Mapping Out Researchers
Summer 2019
Berlin Summer Program
Technische Universität Berlin, Germany
Exchange program with classes in Machine Learning, Agile Product Development, Statistics/Probability, and Entrepreneurship.

Academic Research Experience

Aug. 2022 - May 2023
Stanford Artificial Intelligence Lab, Stanford University, Palo Alto, CA
Research Assistant, Natural Language Processing Group
Mentors: Christopher Potts, Matei Zaharia
Exploring domain-adaptive techniques for information retrieval models to improve zero-shot and few-shot accuracy
Summer 2021
Center for the Study of Language and Information (CSLI), Stanford University, Palo Alto, CA (remote)
Research Intern, Stanford AI Lab, Natural Language Processing Group
Mentors: Christopher Potts, Matei Zaharia
Accepted into a cohort of 12 students out of 723 applicants. Constructing a new benchmark dataset for naturalistic, technical question-answering models.
Aug. 2019 — May 2022
Georgia Institute of Technology, Atlanta, GA
Undergraduate Research Assistant, Polo Club of Data Science
Advisor: Duen Horng (Polo) Chau
Projects focusing on machine learning and data visualization.
Aug. 2019 — May 2022
Georgia Institute of Technology, Atlanta, GA
Undergraduate Research Assistant, Speech and Language Technologies Laboratory
Advisor: Diyi Yang
Projects focusing on natural language processing and computational social science.

Industry Research Experience

Aug. 2023 - Sep. 2023
Databricks, San Francisco, CA
ML Research / Engineering Intern
Mentor: Matei Zaharia
Leveraged statistical inference strategies to create a novel automated evaluation system for retrieval-augmented generation (RAG).
Aug. 2021 - Aug. 2022
Allen Institute for Artificial Intelligence (AI2), Seattle, WA (remote)
Predoctoral Young Investigator, Semantic Scholar
Mentor: Doug Downey
Designed new efficiency techniques for caching and reusing sequence representations in language models.
Spring 2021
Allen Institute for Artificial Intelligence (AI2), Seattle, WA (remote)
Research Intern, Semantic Scholar
Mentors: Dan Weld, Tom Hope
Constructed a search engine for exploring challenges and future directions within COVID-19 literature. Collaborated with 10+ medical professionals to design better tools for assisting scientific research.
Summer 2020
Goldman Sachs, New York City, NY (remote)
Research Analyst, Global Investment Research
Mentor: Michael Lapides
Initiated an independent project using quantitative techniques and NLP tools to perform valuation of companies. Collaborated with traders and external clients to develop reliable Python tools for extending coverage of sectors.

Honors and Awards

2018 - 2022
Stamps President’s Scholarship, Georgia Tech
Premier full-ride scholarship given to 40 undergraduate students out of 30,000 applicants to Georgia Tech
Spring 2022
Fulbright Scholarship, Research Award, Germany
Awarded to top U.S. students interested in research and cultural exchange abroad
Spring 2023
Gates-Cambridge Scholarship, Bill & Melinda Gates Foundation
Full-ride graduate scholarship awarded to 75 scholars out of more than 6,000 applicants worldwide
Spring 2022
Knight-Hennessy Scholarship, Finalist, Stanford
Full-ride graduate scholarship awarded for academic excellence, leadership, and civic engagement
Summer 2021
Donald V. Jackson Fellowship, Georgia Tech College of Computing
Awarded to a well-rounded, first-year master’s student who embodies values of academic excellence and leadership
Spring 2021
Computer Science Research Mentorship Program, Google
Apprenticeship program for matching university students with mentors in Google Research
Fall 2021
U.N. Millennium Fellowship, United Nations
Semester-long leadership development program for supporting student leaders whose organizations address the UN Academic Impact Principles.
Fall 2020
President's Undergraduate Research Award (PURA)
$1,500 award to support undergraduate research with a Georgia Tech faculty advisor
Summer 2020
Pathways to Graduate School for Rising College Seniors, Princeton University
Competitive program designed to prepare students for graduate school applications
Summer 2020
Oxford Machine Learning Summer School, Oxford University
Competitive two-week summer school for researchers and engineers in academia and industry
Spring 2020
Citadel Securities Trading Challenge, Citadel Securities
Placed first out of 26 competing teams in a simulated trading challenge at a Georgia Tech competition

Publications

Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT
Jon Saad-Falcon, Daniel Y. Fu, Simran Arora, Neel Guha, Christopher Ré
ICML. 2024.
Project PDF Code
ARES: An Automated Evaluation Framework for Retrieval-Augmented Generation Systems
Jon Saad-Falcon, Omar Khattab, Christopher Potts, Matei Zaharia
NAACL. 2024.
Project PDF Code
UDAPDR: Unsupervised Domain Adaptation via LLM Prompting and Distillation of Rerankers
Jon Saad-Falcon, Omar Khattab, Keshav Santhanam, Radu Florian, Martin Franz, Salim Roukos, Avirup Sil, Md Arafat Sultan, Christopher Potts
EMNLP. 2023.
Project PDF Code
Moving Beyond Downstream Task Accuracy for Information Retrieval Benchmarking
Keshav Santhanam*, Jon Saad-Falcon*, Martin Franz, Omar Khattab, Avirup Sil, Radu Florian, Md Arafat Sultan, Salim Roukos, Matei Zaharia, Christopher Potts
ACL Findings. 2023.
Project PDF Code
Embedding Recycling for Language Models
Jon Saad-Falcon, Amanpreet Singh, Luca Soldaini, Michael D'Arcy, Arman Cohan, Doug Downey
EACL. 2023.
Project PDF Code
ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction
Keshav Santhanam*, Omar Khattab*, Jon Saad-Falcon, Christopher Potts, Matei Zaharia
Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL). 2022.
Project PDF Code
A Search Engine for Discovery of Scientific Challenges and Directions
Dan Lahav, Jon Saad-Falcon, Bailey Kuehl, Sophie Johnson, Sravanthi Parasa, Noam Shomron, Duen Horng Chau, Diyi Yang, Eric Horvitz, Daniel S. Weld, Tom Hope
AAAI Conference on Artificial Intelligence (AAAI). 2022.
Project Demo PDF Code (Oral Presentation)
Argo Scholar: Interactive Visual Exploration of Literature in Browsers
Kevin Li, Haoyang Yang, Anish Upadhayay, Zhiyan Zhou, Jon Saad-Falcon, Duen Horng (Polo) Chau
IEEE Visualization Conference (VIS). 2021.
Project PDF Video (Best Poster, Honorable Mention)
Large-Scale Analysis of Career Transitions: The Impact of Human Capital, Job History, and Language Factors
Austin P Wright, Caleb Ziems, Haekyu Park, Jon Saad-Falcon, Duen Horng (Polo) Chau, Diyi Yang, Maria Tomprou
Pre-print. 2021.
Project PDF
EnergyVis: Interactively Tracking and Exploring Energy Consumption for ML Models
Omar Shaikh, Jon Saad-Falcon, Austin P Wright, Nilaksh Das, Scott Freitas, Omar Isaac Asensio, Duen Horng (Polo) Chau
CHI '21 Extended Abstracts (CHI LBW). 2021.
Project PDF Video
Examining the Ordering of Rhetorical Strategies in Persuasive Requests
Omar Shaikh, Jiaao Chen, Jon Saad-Falcon, Duen Horng (Polo) Chau, Diyi Yang
Findings of the Association for Computational Linguistics: Empirical Methods in Natural Language Processing (EMNLP Findings). 2020.
Project PDF Video Code
Mapping Researchers with PeopleMap
Jon Saad-Falcon, Omar Shaikh, Zijie J. Wang, Austin P. Wright, Sasha Richardson, Duen Horng (Polo) Chau
IEEE Visualization Conference (VIS). 2020.
Project Demo PDF Code (Best Poster, Honorable Mention)

Press

Apr. 2021
"This New Tool Can Track the Environmental Cost of Your Machine Learning Model," Georgia Tech, College of Computing
Nov. 2020
"Being Polite Can Be Essential to Getting a Loan," Georgia Tech, College of Computing
Oct. 2020
"Georgia Tech Researchers Contribute 13 Papers to Premier Visualization Conference," Georgia Tech, College of Computing

Service

Conference Volunteering
Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL) 2022
Conference on Empirical Methods in Natural Language Processing (EMNLP) 2020
Annual Meeting of the Association for Computational Linguistics (ACL) 2020
Institutional Volunteering
January 2021 — May 2022
Georgia Tech Venture Capital Fund
Co-founder and Student Director
Raising an inaugural multi-million-dollar early-stage fund to begin supporting Georgia Tech startups by Spring 2022. Created an educational curriculum to teach 40+ students the fundamental strategies of venture capital funds.
July 2020 — May 2022
Computer Science Outreach Club
Founding Member and Volunteer
Develop coding workshops and lectures to teach underprivileged youth from Atlanta-area high schools. Assist students in their college application process and guide their preparation for summer internships.
Aug. 2019 — May 2021
Georgia Tech Investments Committee
Research Analyst, Fixed Income Sector
Collaborate with fellow analysts to research sector trends, hedge investments, and determine market signals. Assist in developing forecasts and recommendations for a portfolio worth over $1.4 million.

Technology Skills

Coding Languages: Python, JavaScript, C, Java, C#, Scala

Technologies: PyTorch, TensorFlow, Keras, Apache Spark, Node.js, Svelte.js, React.js

Languages: English (native), Spanish (native), Arabic (beginner), French (beginner)

References

Dr. Christopher Potts, Professor and Chair
Department of Linguistics
Stanford University
Dr. Doug Downey, Professor and Research Manager
Semantic Scholar
Allen Institute for Artificial Intelligence (AI2)
Dr. Polo Chau, Associate Professor
School of Computational Science and Engineering
Georgia Institute of Technology
Dr. Diyi Yang, Assistant Professor
Department of Computer Science
Stanford University