Content

Speaker

Purva Pruthi

Abstract

Many real-world systems are modular and heterogeneous, composed of interacting components. Examples include computational systems, such as query processors and compilers; natural systems, such as cells and ecosystems; and social systems, such as families and organizations. However, current formalisms for causal reasoning typically treat such systems as single units and represent them with a fixed set of variables sampled from a fixed causal graph, assuming a homogeneous data-generating process.

In this thesis, I propose a compositional framework for causal reasoning in modular, heterogeneous systems, where each unit is represented by an instance-specific composition of multiple heterogeneous components. Consider query execution in relational databases as an example: each query execution plan consists of a specific composition of database operations (scan, sort, aggregate), with the number and kind of components varying across query plans. I present a formalism for the compositional approach to causal effect estimation, in which unit-level causal queries are decomposed into component-level causal queries. I demonstrate the compositional approach using modular neural network architectures whose explicit structure is instantiated for each unit. To facilitate the development of effect estimation methods for modular systems, I provide and use a set of three realistic benchmarks: query execution in relational databases, matrix operations processing, and a manufacturing assembly simulator. I empirically demonstrate the benefits of the compositional approach to causal effect estimation: accurate estimation of causal effects for structured data, better sample efficiency, improved overlap between treatment groups, and compositional generalization to units with unseen combinations of components.
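As a rough illustration of decomposing a unit-level causal query into component-level queries, the sketch below treats a query plan as a list of components and sums hypothetical per-component outcome models under treatment and control. Everything in it is an assumption made for illustration and is not taken from the thesis: the additive aggregation of component outcomes, the hand-written stand-in models (which in practice would be learned from data), the binary treatment, and the names `COMPONENT_MODELS`, `Component`, `unit_outcome`, and `unit_effect`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical component-level outcome models (assumption, not from the thesis).
# Each maps (input rows, binary treatment) to a predicted runtime.
ComponentModel = Callable[[float, int], float]

COMPONENT_MODELS: Dict[str, ComponentModel] = {
    "scan": lambda rows, t: 0.01 * rows * (0.5 if t else 1.0),
    "sort": lambda rows, t: 0.02 * rows * (0.8 if t else 1.0),
    "aggregate": lambda rows, t: 0.005 * rows,  # assumed unaffected by treatment
}


@dataclass
class Component:
    kind: str    # which shared component model to use
    rows: float  # component-level covariate (input size)


def unit_outcome(plan: List[Component], treatment: int) -> float:
    """Predict the unit-level outcome by summing component-level predictions.

    Additive composition is an assumption made for this sketch.
    """
    return sum(COMPONENT_MODELS[c.kind](c.rows, treatment) for c in plan)


def unit_effect(plan: List[Component]) -> float:
    """Unit-level effect as the difference of composed potential outcomes."""
    return unit_outcome(plan, 1) - unit_outcome(plan, 0)


# Two plans with different numbers and kinds of components share the same models.
plan_a = [Component("scan", 1000), Component("sort", 1000), Component("aggregate", 1000)]
plan_b = [Component("scan", 2000), Component("aggregate", 2000)]

effect_a = unit_effect(plan_a)
effect_b = unit_effect(plan_b)
```

Because the component models are shared across units, a plan with a previously unseen combination of operations can still be scored, which is the intuition behind compositional generalization in this toy setting.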

Building on these findings, I propose a systematic study of when compositional approaches outperform standard methods for causal effect estimation, analyzing the impact of model misspecification, composition structure, observational bias, and system heterogeneity. This analysis aims to provide theoretical guarantees for sample efficiency, generalization risk, and identifiability of causal queries. Further, I propose to study the conditions required for the identification and specialization of underlying “causal” components in specific model classes: over-parameterized neural networks and large language models (LLMs). Finally, by viewing compositionality through a causal lens, this thesis aims to provide a unified view of compositional reasoning and generalization across machine learning domains (vision, language, and reinforcement learning) while establishing fundamental relationships between interventions and compositionality.

Advisor

David Jensen