Using the PARAFAC2 tensor factorization on EHR audit data to understand PCP desktop work

Ioakeim Perros, Xiaowei Yan, J. B. Jones, Jimeng Sun, Walter F. Stewart

Research output: Contribution to journal › Article › peer-review


Background: Activity or audit log data are required for EHR privacy and security management but may also be useful for understanding desktop workflow.
Objective: We determined if the EHR audit log file, a rich source of complex time-stamped data on desktop activities, could be processed to derive primary care provider (PCP) level workflow measures.
Methods: We analyzed audit log data on 876 PCPs across 17,455 ambulatory care encounters that generated 578,394 time-stamped records. Each individual record represents a user interaction (e.g., point and click) that reflects all or part of a specific activity (e.g., order entry access). No dictionary exists to define how to combine clusters of sequential audit log records to represent identifiable PCP tasks. We determined if PARAFAC2 tensor factorization could: (1) learn to identify audit log record clusters that specifically represent defined PCP tasks; and (2) identify variation in how tasks are completed without the need for ground-truth labels. To interpret the result, we used the following PARAFAC2 factors: a matrix representing the task definitions and a matrix containing the frequency measure of each task for each encounter.
Results: PARAFAC2 automatically identified 4 clusters of audit log records that represent 4 common clinical encounter tasks: (1) medications’ access, (2) notes’ access, (3) order entry access, and (4) diagnosis modification. PARAFAC2 also identified the most common variants in how PCPs accomplish these tasks. It discovered variation in how the notes’ access task was done, including identification of 9 distinct variants of notes’ access that explained 77% of the input data variation for notes. The discovered variants mapped to two known workflows for notes’ access and to two distinct PCP user groups who accessed notes by either using the Visit Navigator or the Wrap-Up option.
Conclusions: Our results demonstrate that EHR audit log data can be rapidly processed to create higher-level constructed features that represent time-stamped PCP tasks.
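
Illustrative sketch (not from the article): PARAFAC2 operates on a collection of matrices that share one dimension but may differ in the other. The Python below shows one plausible way to arrange raw audit log records into such a collection, with one count matrix per encounter whose rows are time bins and whose columns are action types. The record layout, the action names, the `build_slices` helper, and the 60-second bin width are all assumptions made for illustration; the abstract does not specify how the authors constructed their input.

```python
from collections import defaultdict

# Hypothetical audit log records:
# (encounter_id, seconds_since_encounter_start, action).
records = [
    ("enc1", 5, "notes_access"),
    ("enc1", 20, "order_entry_access"),
    ("enc1", 70, "notes_access"),
    ("enc2", 3, "medications_access"),
    ("enc2", 10, "medications_access"),
    ("enc2", 130, "diagnosis_modification"),
    ("enc2", 200, "notes_access"),
]


def build_slices(records, bin_seconds=60):
    """Return (actions, slices) with one count matrix per encounter.

    Each matrix has one row per time bin and one column per action type,
    so the number of rows varies by encounter while columns are shared.
    """
    actions = sorted({action for _, _, action in records})
    col = {action: j for j, action in enumerate(actions)}

    # Group events by encounter as (time_bin, action_column) pairs.
    by_encounter = defaultdict(list)
    for enc, t, action in records:
        by_encounter[enc].append((t // bin_seconds, col[action]))

    slices = {}
    for enc, events in by_encounter.items():
        n_bins = max(b for b, _ in events) + 1
        matrix = [[0] * len(actions) for _ in range(n_bins)]
        for b, j in events:
            matrix[b][j] += 1  # count interactions per bin and action
        slices[enc] = matrix
    return actions, slices


actions, slices = build_slices(records)
for enc, matrix in slices.items():
    print(enc, len(matrix), "time bins x", len(actions), "action types")
```

A factorization routine would then take these per-encounter matrices as input. Under the assumed layout, the shared action-type dimension is where a task-definition factor of the kind described in Methods would live, while the per-encounter weights would play the role of the task frequency measure.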

Original language: English (US)
Article number: 103312
Journal: Journal of Biomedical Informatics
State: Published - Jan 2020
Externally published: Yes


Keywords

  • Electronic health records
  • Workflow analysis

ASJC Scopus subject areas

  • Computer Science Applications
  • Health Informatics

