I am a Data Scientist at the U.S. Securities and Exchange Commission working on artificial intelligence and data science. I received my Ph.D. from the University of Illinois Chicago, where I was advised by Elena Zheleva.
CV (updated August 2023)
We propose a feature selection method for estimating heterogeneous treatment effects (HTE) in observational and experimental data. Unlike existing data-driven HTE estimation methods that make strong assumptions about observed features and ignore causal model structure, we consider each feature’s value for HTE estimation and learn relevant parts of the causal structure from data.
Our paper proposes causal inference methods for estimating individual thresholds in the Linear Threshold Model, a widely used model for describing information diffusion through a social network. Traditional applications of the model assume that all nodes have the same threshold or randomly distributed thresholds, ignoring individual-level differences. We introduce the concept of heterogeneous peer effects and develop a Structural Causal Model that supports identification and estimation of these effects.
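As a rough illustration of the diffusion model itself (not the estimation method from the paper), the sketch below simulates the Linear Threshold Model with per-node thresholds. The function name `linear_threshold_spread`, the toy graph, and the choice of uniform edge weights (a node activates once the fraction of its active neighbors reaches its threshold) are assumptions made for this example.

```python
def linear_threshold_spread(neighbors, thresholds, seeds):
    """Simulate Linear Threshold Model diffusion until no new node activates.

    neighbors:  dict mapping each node to a list of its neighbors
    thresholds: dict mapping each node to a threshold in [0, 1]
    seeds:      iterable of initially active nodes
    Assumes uniform influence weights: a node's exposure is the
    fraction of its neighbors that are currently active.
    """
    active = set(seeds)
    while True:
        newly_active = set()
        for node, nbrs in neighbors.items():
            if node in active or not nbrs:
                continue
            exposure = sum(1 for n in nbrs if n in active) / len(nbrs)
            if exposure >= thresholds[node]:
                newly_active.add(node)
        if not newly_active:
            return active
        # Synchronous update: all nodes that qualify this round activate together.
        active |= newly_active


# Hypothetical toy network.
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}

# Same threshold for everyone: the cascade started at "a" reaches every node.
homogeneous = {node: 0.5 for node in graph}
spread_homogeneous = linear_threshold_spread(graph, homogeneous, {"a"})

# One harder-to-influence node ("b") is enough to stop the cascade at the seed.
heterogeneous = dict(homogeneous, b=0.9)
spread_heterogeneous = linear_threshold_spread(graph, heterogeneous, {"a"})
```

In this toy network, changing a single node's threshold changes the outcome of the whole cascade, which is why assuming identical or random thresholds can be misleading.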
Our paper examines the impact of software patches on player performance and game balance in League of Legends, a popular team-based multiplayer online game. Using causal inference, we show that game patches have different impacts on players depending on their skill level and whether they take breaks between games.
We propose a tool called Aletheia that helps users identify and manage sensitive, unwanted files in cloud storage. By predicting a file’s perceived sensitivity and usefulness, as well as its desired management, Aletheia improves over state-of-the-art baselines and validates a human-centric approach to feature selection in security-related tasks. The tool also helps minimize the attack surface of cloud accounts, making it a valuable asset for cloud storage users concerned about their privacy and security.
We study heterogeneous treatment effect estimation, the problem of estimating how treatment effects differ across individuals, in settings where the treatment must reach an individual-level threshold to trigger an effect. We propose a tree-based learning method to find the heterogeneity in the treatment effects and demonstrate through experimental results on multiple datasets that our approach learns the triggers better than existing approaches.
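To make the idea of a trigger concrete, here is a deliberately simplified sketch, not the tree-based method from the paper. It searches for the single treatment level that best separates outcomes above and below it. The function name `best_trigger`, the `min_group` parameter, and the toy data are all hypothetical, and the naive difference in means ignores confounding.

```python
def best_trigger(doses, outcomes, min_group=2):
    """Return (threshold, effect) maximizing the difference in mean outcome
    between units with dose >= threshold and units with dose < threshold.

    Returns None if no candidate threshold leaves at least `min_group`
    units on each side. This is a naive single-split search for illustration.
    """
    pairs = list(zip(doses, outcomes))
    best = None
    for threshold in sorted(set(doses)):
        above = [y for d, y in pairs if d >= threshold]
        below = [y for d, y in pairs if d < threshold]
        if len(above) < min_group or len(below) < min_group:
            continue
        effect = sum(above) / len(above) - sum(below) / len(below)
        if best is None or effect > best[1]:
            best = (threshold, effect)
    return best


# Hypothetical data: the outcome switches on once the dose reaches 4.
doses = [1, 2, 3, 4, 5, 6]
outcomes = [0, 0, 0, 1, 1, 1]
trigger = best_trigger(doses, outcomes)
```

A tree-based learner can be thought of as repeating this kind of split search within subgroups defined by individual features, so that different subpopulations may end up with different triggers.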