General Response to the Reviewers

We thank all reviewers for their valuable comments and are glad that they acknowledge the importance of our work and the value of our results. We agree that the presentation of the paper can be improved; below we summarize the main changes we plan to make in the final version. Further questions are answered in the individual rebuttals.

Introduction: We will improve the introduction by replacing Figure 1 with a figure that clearly explains the setup in which a context variable is endogenous, and we will move the discussion around the current Figure 1 to Section 4. We will then move Example 3.1 to the introduction to better illustrate our goals. We will considerably improve the third paragraph of the introduction by removing unnecessary details and jargon, and by explaining more clearly how our work relates to methods describing context-specific independencies (CSIs) and how our assumptions allow us to interpret the CSIs in terms of structural causal models (SCMs).

Graphical objects: We agree that the graphical objects may be hard to understand at first sight in the current form of the paper. We therefore propose to add the following table (formatted as best as possible) which underlines the differences and connections between the different graphical objects:

| | Observable... | ...graphs... | ... |
|---|---|---|---|
| Symbol | | | |
| Name | descriptive | physical | union ('standard') |
| Model | or | | |
| Observational Support | | | |
| Captured Information | independence-structure | altered mechanisms | union mechanisms |
| Context-Specific (-dep.) | yes | yes | no |
| Used here primarily for | discovery | proofs | relation to literature |
| Node Sets | system | system | system |
| Edge Sets | active in context | present in context | in any context |

We will also combine the graphical objects and the discovery goals sections for better reading flow.
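
To make the "Edge Sets" row more concrete, the relation between context-specific edge sets and the union graph can be sketched as follows. This is only an illustration: the variables `X`, `Y`, `Z`, the two contexts, and the helper names `union_graph` and `has_cycle` are hypothetical and not taken from the paper. The toy example is chosen so that each context-specific graph is acyclic while their union contains a cycle.

```python
# Hypothetical toy system; the edge sets below are illustrative assumptions.
context_edges = {
    0: {("X", "Y"), ("Y", "Z")},  # edges present in context 0
    1: {("Y", "X"), ("Y", "Z")},  # edges present in context 1
}


def union_graph(edges_by_context):
    """An edge is in the union graph if it is present in any context."""
    union = set()
    for edges in edges_by_context.values():
        union |= edges
    return union


def has_cycle(edges):
    """Return True if the directed graph given by `edges` contains a cycle."""
    children = {}
    for src, dst in edges:
        children.setdefault(src, set()).add(dst)
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # reached a node that is still on the current path
        visiting.add(node)
        found = any(visit(child) for child in children.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in list(children))


union_edges = union_graph(context_edges)
acyclic_per_context = all(not has_cycle(e) for e in context_edges.values())
union_is_cyclic = has_cycle(union_edges)
```

In this toy system, each context taken separately admits a directed acyclic graph, whereas the union graph contains both `X -> Y` and `Y -> X`.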

Overall readability: We will revise the notation and try to simplify it further. We will explain the assumptions better. However, we believe that the exact definitions are better left to the Appendix. We will fix all discovered typos, including those in Algorithm 1.

Context variable: We will revise our definition of a context variable to make it clearer that our method is modular and can be combined with any anomaly or regime-detection method. This change also makes it easier to explain why the current setup, in which the context variable is measured and known, is also plausible in real-world scenarios.

Redraft of third paragraph of introduction

The third paragraph of the introduction did not convey well what we actually wanted to say. The following is an early redraft that still has to be fitted together with the remainder of the introduction, so it is likely subject to change:

Multiple context-specific graphs can contain more qualitative information than a single union graph, as illustrated below:

Example 3.1: Given a binary context indicator variable and a multivariate mechanism of the form $𝟙$ the dependence is present in the context , but absent for . This entails a context-specific independence (CSI) .
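
A context-specific independence of this kind can be illustrated numerically with a minimal simulation. The sketch below is an assumption-laden toy: the symbols `R` (binary context indicator), `X`, `Y`, the linear mechanism, the Gaussian noise, and the exogenous sampling of `R` are all chosen for illustration and are not the notation or model of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical variables: R is a binary context indicator, X a potential cause.
R = rng.integers(0, 2, size=n)
X = rng.normal(size=n)

# Assumed mechanism: Y depends on X only in the context R == 1.
Y = (R == 1) * X + rng.normal(size=n)


def corr(a, b):
    """Pearson correlation of two one-dimensional samples."""
    return float(np.corrcoef(a, b)[0, 1])


corr_r1 = corr(X[R == 1], Y[R == 1])  # dependence present in this context
corr_r0 = corr(X[R == 0], Y[R == 0])  # close to zero: X and Y independent here
```

Under these assumptions, the sample correlation is clearly nonzero for `R == 1` and close to zero for `R == 0`, which is what a CSI between `X` and `Y` in the context `R == 0` would predict.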

As indicated in the example, such additional information can be captured via context-specific independence (CSI). Graphical independence models describing the CSI structure of a dataset, for example LDAGs, have been studied before [22]. However, a causal analysis, i.e., understanding interventional properties of the context-specific model, requires knowledge about the properties of the causal model. In the single-context case, under the faithfulness assumption, knowledge about the causal properties of a model is directly connected to its independence structure. As will be explored in detail in §3.1, this simple correspondence between model and independence structure cannot generally hold in the multi-context case. Thus, an important open problem for the causal analysis of multi-context systems is how the CSI structure connects to the underlying causal model.

We provide a connection between the CSI structure and the underlying causal model, and we specify assumptions under which an efficiently computable subset of CSIs, together with independencies on the pooled data, can be given an interpretation in terms of structural causal model (SCM) properties. The resulting context-specific graphs have several desirable properties. For instance, context-specific modeling can avoid spurious edges that arise from cycles in the union graph, as shown in Figure 1. Furthermore, CSI testing poses multiple finite-sample challenges: it dramatically increases the search space of independence testing, and CSIs are only tested on a subset of samples, which increases the per-test error rate. Our approach, which requires the framework connecting CSI to causal properties, adaptively decides whether a specific test can be run on the pooled data or must be run on the subset of samples associated with a specific context. It executes only one of the two tests and uses as many samples as possible, thereby improving finite-sample performance.
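
The finite-sample effect of testing on a context subset instead of the pooled data can be illustrated with a small sketch. It does not implement our decision rule; it only assumes a Fisher z-test for a (partial) correlation as the independence test, and the helper name `fisher_z_test`, the effect size, and the sample sizes are hypothetical.

```python
import math


def fisher_z_test(r, n, num_conditioned=0):
    """Two-sided test of H0: (partial) correlation is zero, via Fisher's z-transform.

    r: sample (partial) correlation, n: number of samples,
    num_conditioned: size of the conditioning set.
    """
    z = math.atanh(r)
    standard_error = 1.0 / math.sqrt(n - num_conditioned - 3)
    statistic = abs(z) / standard_error
    # Two-sided p-value under a standard normal null distribution.
    p_value = math.erfc(statistic / math.sqrt(2))
    return statistic, p_value


# Same assumed correlation, tested on pooled data and on a context subset.
pooled_stat, pooled_p = fisher_z_test(r=0.1, n=1000)
subset_stat, subset_p = fisher_z_test(r=0.1, n=250)
```

With the same assumed correlation, the test on the smaller subset has a larger standard error and therefore a weaker test statistic, which is why running a test on the pooled data, whenever that is valid, uses the available samples more effectively.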