Using NLP and Generative AI in Survey Research

This course covers the principles and application of Natural Language Processing (NLP) and Generative AI for survey research. As advanced AI methodologies become increasingly integrated into survey research, this course aims to equip participants with the necessary theoretical and practical knowledge to harness these methods effectively.

The course begins with an introduction to the foundational concepts of NLP and Generative AI, establishing a theoretical framework for their application within survey research. This foundational understanding is crucial for appreciating subsequent discussions on the utility and limitations of these methods.

Following this, we will provide an overview of the key developments in the integration of NLP and Generative AI into surveys, focusing on innovations in the recent methodological literature.

The course then delves into the areas where NLP and Generative AI demonstrate exceptional utility, including analyzing open-ended survey responses, dynamic survey probing, conversational interviewing, iterative survey design, tailored survey experiments, and missing value imputation. These applications underscore the transformative potential of AI in enhancing the precision, efficiency, and adaptability of survey research.

Conversely, the course will critically assess the limitations and challenges inherent in applying NLP and Generative AI to survey research. Specific focus will be placed on the complexities associated with synthetic respondents, the generation of survey questions de novo, the prediction of public opinion, and the application of these technologies in cross-cultural contexts. Understanding these limitations is essential for the responsible and informed application of AI in survey research.

By the conclusion of this course, participants will have acquired a rigorous understanding of both the capabilities and constraints of NLP and Generative AI in survey research, enabling them to apply these tools with greater sophistication and discernment in their professional practice.

Our speakers are:

Soubhik Barari is a research methodologist at NORC at the University of Chicago, specializing in statistical and computational methods for social science research. With over seven years of experience, he applies machine learning, generative AI, causal inference, and natural language processing to areas such as social media analysis, government program evaluation, and pre-election polling. At NORC, he leads key quantitative projects, including survey weighting for the National Center for Health Statistics’ Rapid Survey System and Pew Research Center’s Religious Landscape Study III, as well as data visualization tools for survey quality control and political polling. He has also evaluated mode effects in surveys and generative AI for interviews. Previously, Soubhik was a research scientist at SurveyMonkey and has held roles at Microsoft Research, Harvard, and MIT. His work has been published in Nature Scientific Data, featured in The Atlantic and Scientific American, and is often presented at academic and industry conferences, including AAPOR. Learn more about his work here.

Joshua Lerner is a research methodologist at NORC specializing in AI, NLP, machine learning, and causal inference for program evaluation. Since joining in 2021, he has led initiatives integrating NLP into survey research, including automating transcript analysis for the America in One Room deliberative poll and designing models to assess the long-term effects of deliberation on voting and polarization. He has also developed econometric models for large-scale evaluations, such as ACO REACH and SAPRO, and introduced innovations in difference-in-differences modeling for healthcare research. A political scientist with publications in American Political Science Review and Journal of Politics, he has taught graduate courses on machine learning at Duke and led workshops at Northwestern. His research spans political ideology, institutional economics, and NLP applications in legislative analysis. Learn more about his work here.