Entry-Specific Matrix Estimation under Arbitrary Sampling Patterns through the Lens of Network Flows


2/27/25 | 4:15pm | E51-376


Christina Yu

Christina Yu

Assistant Professor
Cornell University, Operations Research and Information Engineering


Abstract: Matrix completion is a powerful tool for predicting missing values in data, with applications in econometrics, recommendation systems, healthcare, and social sciences. Traditional methods often assume that missing data follows specific patterns, but real-world data could lack the assumed structures. We address this gap by developing a matrix completion algorithm that works effectively for any arbitrary given pattern of missing data, for the subclass of matrices that exhibit additivity or rank-1 structure. Our method, based on network flows in the bipartite graph induced by the observation pattern, models the connections between the observed and missing data, allowing us to measure the difficulty of estimating unobserved entries at a granular level. For additive matrices, we show that the optimal entry-specific estimator can be constructed using electrical flows, and is equivalent to the least squares estimator. We establish error upper bounds customized to each entry as a function of the observation set, along with matching minimax lower bounds. Our results show that the minimax squared error for recovery of a matrix entry is proportional to the effective resistance of the corresponding graph edge, giving clear intuition to the impact of the observation pattern on entry specific estimation. Applying our estimator to the two-way fixed effects model, we show that our estimator recovers established estimators: the Difference-in-Differences estimator and Two Way Fixed Effect regression. For rank-1 matrices, we use edge-disjoint paths to form an estimator that achieves minimax optimality under sufficiently dense sampling. Our discovery introduces a new family of estimators parametrized by network flows, offering a fine-grained and intuitive understanding of how sampling patterns influence estimation difficulty at an entry-specific level, moving beyond traditional global error metrics.

Bio: Christina Lee Yu is an Assistant Professor at Cornell University in the School of Operations Research and Information Engineering. Prior to Cornell, she was a postdoc at Microsoft Research New England. She received her PhD in 2017 and MS in 2013 in Electrical Engineering and Computer Science from Massachusetts Institute of Technology in the Laboratory for Information and Decision Systems. She received her BS in Computer Science from California Institute of Technology in 2011. She received honorable mention for the 2018 INFORMS Dantzig Dissertation Award. She is a recipient of the 2021 Intel Rising Stars Award, a JPMorgan Faculty Research Award, a NSF CAREER Award, and the 2024 ACM SIGMETRICS Rising Stars Award. Her work is supported by grants from the National Science Foundation and the Air Force Office of Scientific Research. Her research interests include algorithm design and analysis, high dimensional statistics, inference over networks, sequential decision making under uncertainty, online learning, and network causal inference.

Event Time:
4:15pm – 5:15pm