Thank you for your thoughtful response and your time in reviewing our initial rebuttal!

We believe that our algorithm introduces meaningful novelty that aligns with ICLR’s focus on impactful contributions. ACES implements a hierarchical, recursive analysis framework that incorporates event-bounded aggregation for cohort extraction, a concept not employed in other tools. While SQL-based function encapsulations can handle temporal aggregation, ACES’ event-bounded aggregation windows go beyond these by enabling deterministic, transparent, and user-friendly cohort definitions. This design is especially critical for EHR data, where building robust and reproducible extraction processes through SQL queries is non-trivial. Because ACES is configured through a no-code configuration file, it lowers the technical barrier for researchers compared to the complex SQL querying and environment setup often required by existing tools.
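To make the notion of an event-bounded window concrete, here is a minimal plain-Python sketch. It is not ACES' implementation or its configuration syntax; the event codes, the `event_bounded_windows` helper, and the toy patient record are all hypothetical and only illustrate the idea that a window is opened by one event and closed by the next matching event rather than by a fixed duration.

```python
from datetime import datetime

# Hypothetical event stream for a single patient: (timestamp, event code).
events = [
    (datetime(2024, 1, 1, 8), "ADMISSION"),
    (datetime(2024, 1, 1, 9), "LAB"),
    (datetime(2024, 1, 2, 10), "LAB"),
    (datetime(2024, 1, 3, 12), "DISCHARGE"),
    (datetime(2024, 2, 1, 7), "ADMISSION"),
    (datetime(2024, 2, 1, 20), "DEATH"),
]


def event_bounded_windows(events, start_code, end_codes):
    """Yield (start_time, end_time, inner_events) for every window that is
    opened by `start_code` and closed by the next event whose code is in
    `end_codes`. Windows with no closing event are skipped."""
    ordered = sorted(events)
    for i, (t_start, code) in enumerate(ordered):
        if code != start_code:
            continue
        for j in range(i + 1, len(ordered)):
            if ordered[j][1] in end_codes:
                # Events strictly between the two boundary events.
                yield t_start, ordered[j][0], ordered[i + 1 : j]
                break


def count(inner_events, code):
    """Aggregate: number of events with the given code inside a window."""
    return sum(1 for _, c in inner_events if c == code)


windows = list(event_bounded_windows(events, "ADMISSION", {"DISCHARGE", "DEATH"}))
lab_counts = [count(inner, "LAB") for _, _, inner in windows]
```

In this toy record the first window runs from the first admission to the discharge and the second from the second admission to the death event, so the window length is determined by the data rather than by a fixed offset.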

Quantitative Evidence

In response to requests for quantitative evidence, we have conducted additional experiments evaluating ACES’ efficiency and accuracy compared to similar abstractions.

Using the OMOP version of MIMIC-IV-Demo, as well as a synthetic dataset of 1,000 patients generated using Synthea and converted into OMOP, we queried four tasks using ACES, OMOP-learn, and DPM360.
We collected the following metrics: script runtime, peak memory usage (in MiB), lines of code required (including configuration files and any template code needed to execute extraction), and human time spent.
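For readers who want to collect comparable numbers, the sketch below shows one generic way to time a callable and record its peak memory in Python. It is an assumption-laden illustration, not the measurement harness used for the table: `profile` and `toy_extraction` are hypothetical names, and `tracemalloc` only tracks allocations made through Python's allocator, so it will under-report memory used by native libraries or by an external database process.

```python
import time
import tracemalloc


def profile(fn, *args, **kwargs):
    """Run fn once and return (result, runtime in seconds, peak memory in MiB).

    Peak memory covers Python-level allocations only (tracemalloc)."""
    tracemalloc.start()
    t0 = time.perf_counter()
    result = fn(*args, **kwargs)
    runtime_s = time.perf_counter() - t0
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, runtime_s, peak_bytes / 2**20  # bytes -> MiB


def toy_extraction(n):
    """Hypothetical stand-in for a cohort extraction script."""
    return sum([i * i for i in range(n)])


result, runtime_s, peak_mib = profile(toy_extraction, 100_000)
```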

Full results are shown in the table below. ACES was generally more computationally efficient, and it does not rely on SQL connections or long, complex SQL scripts, which can be brittle and time-intensive to implement. We acknowledge the potential for bias in estimating human time, so these estimates are offered only as a rough measure. Nonetheless, the results suggest a clear advantage for ACES in terms of user experience: it worked out of the box with no errors to resolve, unlike the existing SQL-based solutions.

Method	Dataset	Task	Runtime (s)	Peak Memory (MiB)	Lines of Code (#)	Human Time (s)	Adaptability to Other Tasks
ACES via MEDS	Synthea-1000	First 24h in-hospital mortality	0.386	389	35	120	Edits to task configuration file
ACES via MEDS	Synthea-1000	30d post-hospital-discharge mortality	0.236	351	32	90	-
ACES via MEDS	Synthea-1000	30d re-admission	0.337	355	22	60	-
ACES via MEDS	Synthea-1000	End-of-Life prediction	0.449	421	28	120	-
DPM360 via OMOP	Synthea-1000	First 24h in-hospital mortality	5.932	390	205	2126	Requires new ATLAS or custom SQL queries
DPM360 via OMOP	Synthea-1000	30d post-hospital-discharge mortality	4.188	550	257	1200	-
DPM360 via OMOP	Synthea-1000	30d re-admission	6.26	870	288	2020	-
DPM360 via OMOP	Synthea-1000	End-of-Life prediction	4.901	387	222	1500	-
ACES via MEDS	MIMIC-IV-Demo	First 24h in-hospital mortality	0.617	545	35	180	Edits to task configuration file
ACES via MEDS	MIMIC-IV-Demo	30d post-hospital-discharge mortality	0.301	509	32	90	-
ACES via MEDS	MIMIC-IV-Demo	30d re-admission	0.455	532	22	90	-
ACES via MEDS	MIMIC-IV-Demo	End-of-Life prediction	0.349	589	28	300	-
OMOP-learn via OMOP	MIMIC-IV-Demo	First 24h in-hospital mortality	12.22	688	172	3623	Requires new SQL scripts and changes to Python parameters
OMOP-learn via OMOP	MIMIC-IV-Demo	30d post-hospital-discharge mortality	8.608	587	199	2441	-
OMOP-learn via OMOP	MIMIC-IV-Demo	30d re-admission	19.71	640	168	2998	-
OMOP-learn via OMOP	MIMIC-IV-Demo	End-of-Life prediction	24.54	932	251	12000	-
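
As a quick sanity check on the runtime column, the following sketch recomputes the aggregate runtime ratio per dataset directly from the values in the table above. The dictionary layout and the `runtime_ratio` helper are just one hypothetical way to organize the arithmetic; no new measurements are introduced.

```python
# Runtimes in seconds, copied from the table above, in task order:
# 24h in-hospital mortality, 30d post-discharge mortality,
# 30d re-admission, End-of-Life prediction.
runtimes = {
    "Synthea-1000": {
        "ACES": [0.386, 0.236, 0.337, 0.449],
        "DPM360": [5.932, 4.188, 6.26, 4.901],
    },
    "MIMIC-IV-Demo": {
        "ACES": [0.617, 0.301, 0.455, 0.349],
        "OMOP-learn": [12.22, 8.608, 19.71, 24.54],
    },
}


def runtime_ratio(per_method):
    """Total baseline runtime divided by total ACES runtime over the four tasks."""
    (baseline,) = [m for m in per_method if m != "ACES"]
    return baseline, sum(per_method[baseline]) / sum(per_method["ACES"])


ratios = {dataset: runtime_ratio(methods) for dataset, methods in runtimes.items()}

for dataset, (baseline, ratio) in ratios.items():
    print(f"{dataset}: {baseline} total runtime is {ratio:.1f}x that of ACES")
```

On these numbers the baseline's total runtime is roughly 15 times that of ACES on Synthea-1000 and roughly 38 times on MIMIC-IV-Demo.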

We also conducted experiments to demonstrate the necessity of a DSL like ACES, which is not yet established in ML4H cohort querying. For example, we recreated cohorts from past studies using MIMIC-IV and observed deviations of more than 15% when using existing cohort definitions. This variation underscores that natural-language cohort descriptions, even when paired with codebases, lack sufficient precision. Tools like OMOP-learn and DPM360, while valuable, often require significant effort to adapt code, resolve errors, and ensure compatibility across environments. ACES overcomes these limitations by encapsulating the process in a standardized, reproducible, and low-barrier framework that is accessible to researchers regardless of technical background.

Ultimately, ACES addresses a critical problem in ML for healthcare by providing an accessible, robust, and deterministic solution for cohort extraction. This approach facilitates reproducibility and lowers the entry barrier for researchers, regardless of their technical expertise or familiarity with common data models.

We hope this additional context and evidence help address your concerns, and we would be grateful if you could re-evaluate your rating in light of these contributions. Thank you again for your time and constructive feedback, which have been very helpful for ACES and our paper!