The main objective of a central enterprise master patient index (eMPI) is to consolidate patient records from across institutions, creating a unique, non-duplicated medical history for each patient.  The greatest challenge to this effort is that identifying patient information is sometimes missing and typographical errors make duplicate records difficult to detect automatically.  The algorithms used to differentiate records must be tuned to minimize the human review needed to confirm records as duplicates.  BIG is investigating the feasibility of automating the reconciliation process by implementing the Java Mural Project, open source software that is the most comprehensive eMPI product available.
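To illustrate why typographical errors and missing values complicate duplicate detection, here is a minimal Python sketch of an approximate field comparison. It uses a normalized Levenshtein edit distance, which is an assumption chosen for illustration and is not necessarily the comparison Mural applies; the function names `edit_distance` and `similarity` are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character insertions,
    deletions, or substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Score two field values from 0.0 (no agreement) to 1.0 (identical)."""
    a, b = a.strip().lower(), b.strip().lower()
    if not a or not b:
        # A missing value gives no evidence that two records agree.
        return 0.0
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))


print(similarity("Jonathan", "Jonathon"))  # one substitution in eight characters
print(similarity("", "Smith"))             # missing value
```

An exact string comparison would treat "Jonathan" and "Jonathon" as different people, while a graded score lets a matching engine weigh near misses against the rest of the record.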

Where We Started

This project began with an evaluation of the Java Mural Project to confirm that the capabilities of the software would meet our needs.  Mural provides users with a web application for searching through matched and potential duplicate records.  Communication in and out of Mural can be HL7 encoded.  Mural includes default record comparison algorithms and allows custom code to be written and easily plugged into the matching engine.  The matching process used during the creation of the eMPI can be run without creating the master index and is designed to be run many times to allow fine-tuning of the matching algorithm.  This feature was especially useful because the frequency of duplicates and the quality of the patient data were unknown at the start.

Where We Are

Currently we are applying Match Analysis Methodology to the patient data in the source databases to determine the number of duplicates that must be handled.  Analyzing patient records reveals which characteristics identify a patient and provides an estimate of the error frequency in each data source.  This information is used to fine-tune the matching algorithm, reducing run time and improving match accuracy.  When an automated process is used, it is essential that the master index does not merge records that belong to different patients.  Any record that cannot be definitively classified as a match is referred to as a potential duplicate and requires manual review to confirm or reject that status.  The data analysis also includes determining how many records within each source database are duplicates.  The goal is to maximize the number of automatically detected matches, minimize the manual review of potential duplicates and prevent matching errors.  Our currently tuned matching algorithm returns 8% of the records from a source database as potential duplicates.
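The classification described above can be sketched with two thresholds on a comparison score. This is a simplified illustration, not Mural's configuration: the threshold values, the 0-to-1 score scale, and the names `classify` and `review_rate` are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical thresholds; real values come out of the tuning process.
MATCH_THRESHOLD = 0.90      # at or above: merged automatically
DUPLICATE_THRESHOLD = 0.70  # at or above (but below match): manual review


def classify(score: float) -> str:
    """Assign a record to one of three classes from its best comparison score."""
    if score >= MATCH_THRESHOLD:
        return "match"
    if score >= DUPLICATE_THRESHOLD:
        return "potential_duplicate"
    return "non_match"


def review_rate(scores: list[float]) -> float:
    """Fraction of records that land in the manual review queue."""
    if not scores:
        return 0.0
    counts = Counter(classify(s) for s in scores)
    return counts["potential_duplicate"] / len(scores)


# Made-up scores, one per record, for illustration only.
best_scores = [0.95, 0.75, 0.20, 0.10]
print(review_rate(best_scores))
```

Raising the match threshold guards against merging different patients but pushes more records into manual review, so tuning amounts to narrowing the band between the two thresholds as far as the data quality allows.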

Future Direction

The next step is to use the optimized matching scheme to create an eMPI based on the Allscripts and Axiom patient databases.  The system will be evaluated on the accuracy of automatic match detection and the number of potential duplicates needing manual review.  The research eMPI will serve as a benchmark for determining the requirements of operating a production eMPI database.
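One way such an evaluation could be computed is sketched below in Python. It assumes that a manually confirmed set of true duplicate pairs is available for comparison, which the project description does not state; the function name `evaluate` and the record identifiers are hypothetical.

```python
def evaluate(auto_matches: set, true_matches: set,
             potential_duplicates: set) -> dict:
    """Summarize automatic matching against manually confirmed pairs.

    Each pair is a tuple of two record identifiers in sorted order.
    """
    correct = len(auto_matches & true_matches)
    return {
        # Share of automatic merges that were correct; with no automatic
        # merges there are no false merges, so this is reported as 1.0.
        "precision": correct / len(auto_matches) if auto_matches else 1.0,
        # Share of true duplicates that were found automatically.
        "recall": correct / len(true_matches) if true_matches else 1.0,
        # Merges of records that belong to different patients.
        "false_merges": len(auto_matches - true_matches),
        # Pairs left for a person to confirm or reject.
        "manual_review": len(potential_duplicates),
    }


# Made-up record identifiers, for illustration only.
result = evaluate(
    auto_matches={("A1", "B7"), ("A3", "B4")},
    true_matches={("A1", "B7"), ("A5", "B6")},
    potential_duplicates={("A5", "B6"), ("A8", "B9"), ("A2", "B2")},
)
print(result)
```

The false merge count is reported separately because merging records of different patients is the error the master index most needs to avoid.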