Overview of Approach

IMPACC Monitoring Framework

Development of Final Measure Sets

We developed a rapid, consensus-based process to establish health system-wide AI monitoring measures, piloting it with our system-wide AI scribe implementation. The goal was to create an initial set of feasible metrics focused on immediate priorities such as safety, equity, and workflow impact. First, a literature review identified measure concepts evaluated in prior studies. Using the UCSF IMPACC AI Monitoring Metrics Framework, which builds on the HHS Trustworthy AI Playbook [1], we identified key domains to prioritize for AI scribe monitoring and mapped the measures from the literature, along with additional measures, to the framework. Next, we convened a modified Delphi consensus panel of AI officers, informaticists, data scientists, clinicians, and researchers, who rated the importance and feasibility of each measure on a 1-9 scale and added new concepts as needed. Measures rated high in importance (≥7) and at least moderate in feasibility (≥5) advanced to specification. Interdisciplinary groups then detailed the sampling frames, data sources, and analytic strategies for each measure. Finally, the full group met again to review and finalize the measures before system-wide implementation.
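The advancement rule described above can be illustrated with a minimal sketch. The thresholds (importance ≥7, feasibility ≥5) come from the text. How panelist ratings were combined into a single score is not stated here, so the use of the median is an assumption, and the measure names and ratings are hypothetical placeholders rather than actual panel data.

```python
from statistics import median

IMPORTANCE_MIN = 7   # "high importance" threshold from the text
FEASIBILITY_MIN = 5  # "at least moderate feasibility" threshold from the text


def advances(importance_ratings, feasibility_ratings):
    """Return True if a measure would advance to specification.

    Assumption: panelist ratings (1-9) are summarized by the median;
    the source does not specify the aggregation method.
    """
    return (median(importance_ratings) >= IMPORTANCE_MIN
            and median(feasibility_ratings) >= FEASIBILITY_MIN)


# Hypothetical measures and ratings, for illustration only.
ratings = {
    "measure_a": {"importance": [8, 7, 9, 8], "feasibility": [6, 5, 7, 5]},
    "measure_b": {"importance": [6, 7, 5, 6], "feasibility": [8, 8, 7, 9]},
    "measure_c": {"importance": [9, 8, 8, 7], "feasibility": [3, 4, 5, 4]},
}

# Keep only the measures that clear both thresholds.
advanced = [name for name, r in ratings.items()
            if advances(r["importance"], r["feasibility"])]
print(advanced)
```

In this made-up example only the first measure clears both thresholds: the second falls short on importance and the third on feasibility.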

Want more information?  

To access the measure sets and specifications, or for additional information about the measure set development process described above, please reach out via email.