Causal Representation Learning with Generative Artificial Intelligence


4/24/25 | 4:15pm | E51-376


Kosuke Imai

Professor of Government and of Statistics
Harvard University


Abstract: In this paper, we demonstrate how to enhance the validity of causal inference with unstructured high-dimensional treatments, such as texts and images, by leveraging the power of generative Artificial Intelligence. Specifically, we propose to use a deep generative model, such as a large language model (LLM), to efficiently generate treatments and to use its internal representation for subsequent causal effect estimation. We show that the knowledge of this true internal representation helps disentangle the treatment features of interest, such as specific sentiments and certain topics of texts, from other possibly unknown confounding features. Unlike existing methods, our proposed approach eliminates the need to learn a causal representation from the data, and hence produces more accurate and efficient estimates. We formally establish the conditions required for the nonparametric identification of the average treatment effect, propose an estimation strategy that avoids the violation of the overlap assumption, and derive the asymptotic properties of the proposed estimator through the application of double machine learning. Finally, using an instrumental variables approach, we extend the proposed methodology to settings in which the treatment feature is based on human perception rather than assumed to be fixed given the treatment object. The proposed methodology is also applicable to text or image reuse, where an LLM is used to regenerate existing texts or images. We conduct simulation and empirical studies, using text data generated by an open-source LLM, Llama 3, to illustrate the advantages of our estimator over state-of-the-art causal representation learning algorithms.

Bio: Kosuke Imai is a professor in the Department of Government and the Department of Statistics at Harvard University. He is also an affiliate of the Institute for Quantitative Social Science. Before moving to Harvard in 2018, Imai taught at Princeton University for 15 years, where he was the founding director of the Program in Statistics and Machine Learning. Imai specializes in the development of statistical methods and machine learning algorithms and their applications to social science research. His areas of expertise include causal inference, computational social science, and survey methodology. Imai leads the Algorithm-Assisted Redistricting Methodology (ALARM) Project, serving as an expert witness for several high-profile legislative redistricting cases. Outside of Harvard, Imai served as the President of the Society for Political Methodology from 2017 to 2019. His research has been supported by the Guggenheim Fellowship and grants from the National Science Foundation, Sloan Foundation, and other agencies and organizations.

Event Time:
4:15pm – 5:15pm