Insights

5 Reasons Healthcare Data Is Unique and Difficult to Measure

By Dan LeSueur

Copyright 2017 Health Catalyst

Those of us who work with data tend to think in very structured, linear terms. We like B to follow A and C to follow B, not just some of the time, but all the time. Healthcare data isn’t that way. It’s both diverse and complex, making linear analysis useless.

There are several characteristics of healthcare data that make it unique. Here are five, in particular:

1. Much of the data is in multiple places.

Healthcare data tends to reside in multiple places: in different source systems, like EMRs or HR software, and in different departments, like radiology or pharmacy. The data comes from all over the organization. Aggregating this data into a single, central system, such as an enterprise data warehouse (EDW), makes this data accessible and actionable.

Healthcare data also occurs in different formats (e.g., text, numeric, paper, digital, pictures, videos, multimedia). Radiology uses images, old medical records exist in paper format, and today’s EMRs can hold hundreds of rows of textual and numerical data.

Sometimes the same data exists in different systems and in different formats. Such is the case with claims data versus clinical data. A patient’s broken arm looks like an image in the medical record, but appears as ICD-9 code 813.8 in the claims data.

And it looks like the future holds even more sources of data, such as patient-generated tracking from devices like fitness monitors and blood pressure sensors.
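To make the claims-versus-clinical point concrete, here is a minimal Python sketch of the same broken arm arriving from two systems in two formats and being grouped under one patient key. The record types, field names, patient IDs, and the aggregate_by_patient helper are all assumptions made for illustration; they are not taken from any real EMR, claims system, or EDW product.

```python
from dataclasses import dataclass


@dataclass
class ClaimLine:
    """One row from a hypothetical claims feed."""
    patient_id: str
    icd9_code: str  # the diagnosis expressed as a billing code


@dataclass
class ImagingStudy:
    """One row from a hypothetical radiology system."""
    patient_id: str
    modality: str
    finding: str  # free-text impression attached to the image


def aggregate_by_patient(claims, studies):
    """Group records from both sources under a shared patient key."""
    combined = {}
    for claim in claims:
        entry = combined.setdefault(claim.patient_id, {"claims": [], "imaging": []})
        entry["claims"].append(claim)
    for study in studies:
        entry = combined.setdefault(study.patient_id, {"claims": [], "imaging": []})
        entry["imaging"].append(study)
    return combined


# The same event in two formats: a billing code and an image with a finding.
claims = [ClaimLine("P001", "813.8")]
studies = [
    ImagingStudy("P001", "X-ray", "broken arm"),
    ImagingStudy("P002", "MRI", "no acute findings"),
]

patients = aggregate_by_patient(claims, studies)
for patient_id, records in sorted(patients.items()):
    print(patient_id, len(records["claims"]), "claim(s),", len(records["imaging"]), "study(ies)")
```

Even in this toy form, the hard part is visible: the sketch assumes both systems share a clean patient identifier, which real source systems often do not.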

2. The data is structured and unstructured.

Electronic medical record software has provided a platform for consistent data capture, but the reality is that data capture is anything but consistent. For years, documenting clinical facts and findings on paper has trained an industry to capture data in whatever way is most convenient for the care provider, with little regard for how this data could eventually be aggregated and analyzed. EMRs attempt to standardize the data capture process, but care providers are reluctant to adopt a one-size-fits-all approach to documentation. Thus, unstructured data capture is often allowed in order to appease frustrated EMR users and avoid hindering the care delivery process. As a result, much of the data captured in this manner is difficult to aggregate and analyze in any consistent manner. As EMR products improve, as users become trained to standard workflows, and as care providers become more accustomed to entering data in structured fields as designed, we will have more and better data for analytics.

An example of the above phenomenon is found in a recent initiative to reduce unnecessary C-sections at a large health system in the Northwest. The first task for the team was to understand how the indications for C-section were documented in the EMR. It turned out that there were only two options to choose from: 1) fetal indication and 2) maternal indication. Because these were the only two options, delivering clinicians would often choose to document the true indication for C-section in a free-text form, while others did not document it at all. This was not conducive to understanding the root cause of unnecessary C-sections. So, the team worked with an analyst to modify the list of available options in the EMR so that more detail could be added. After making this slight modification to the data capture process, the team gained tremendous insight and identified opportunities to standardize care delivery and reduce unnecessary C-sections.

3. Inconsistent/variable definitions; evidence-based practice and new research are coming out every day.

Oftentimes, healthcare data can have inconsistent or variable definitions. For example, one group of clinicians may define a cohort of asthmatic patients differently than another group of clinicians. Ask two clinicians what criteria are necessary to identify someone as a diabetic and you may get three different answers. There may simply be no consensus about a particular treatment or cohort definition.
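The effect of competing cohort definitions can be sketched in a few lines of Python. Everything here is hypothetical: the patient records, the field names, and both selection criteria are invented for illustration and are not clinical guidance. The point is only that two reasonable-sounding rules applied to the same data return different cohorts.

```python
# Hypothetical patient records; all values are made up for illustration.
patients = [
    {"id": "P001", "hba1c": 7.4, "on_diabetes_medication": True},
    {"id": "P002", "hba1c": 6.8, "on_diabetes_medication": True},
    {"id": "P003", "hba1c": 7.1, "on_diabetes_medication": False},
    {"id": "P004", "hba1c": 5.6, "on_diabetes_medication": False},
]


def cohort_by_lab(records, threshold=7.0):
    """Definition A (assumed): a lab value above a chosen threshold."""
    return {r["id"] for r in records if r["hba1c"] > threshold}


def cohort_by_medication(records):
    """Definition B (assumed): any patient on a diabetes medication."""
    return {r["id"] for r in records if r["on_diabetes_medication"]}


lab_cohort = cohort_by_lab(patients)
med_cohort = cohort_by_medication(patients)

# Patients that exactly one of the two definitions includes.
disagreement = lab_cohort ^ med_cohort

print("Lab-based cohort:       ", sorted(lab_cohort))
print("Medication-based cohort:", sorted(med_cohort))
print("Only in one definition: ", sorted(disagreement))
```

Half of the four patients land in one cohort but not the other, so any measure computed over "the diabetic population" depends on which group of clinicians wrote the definition.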

Also, even when there is consensus, the experts who agree are constantly discovering new knowledge and revising what they agree on. As we learn more about how the body works, our understanding of what is important, what to measure, how and when to measure it, and the goals to target continues to change. For example, this year most clinicians agree that a diabetes diagnosis is an HbA1c value above 7, but next year it’s possible the agreement will be something different.

There are best practices established in the industry, but there’s always ongoing discussion about the way those things are defined. That means you’re trying to create order out of chaos and hit a target that’s not only moving, but seems to be moving in a way you can’t predict.

4. The data is complex.

Claims data has been around for years, and thus it has been standardized and scrubbed. But this type of data is incomplete. Clinical data from sources like EMRs gives a more complete picture of the patient’s story.

While developing standard processes that improve quality is one of the goals in healthcare, the number of data variables involved makes that goal far more challenging. You’re not working with a finite number of identical parts to create identical outcomes. Instead, you’re looking at an amalgam of individual systems that are so complex we don’t even begin to profess we understand how they work together (that is to say, the human body). Managing the data related to each of those systems (which is often being captured in disparate applications), and turning it into something usable across a population, requires a far more sophisticated set of tools than is needed for other industries like manufacturing.

5. Changing regulatory requirements.

Regulatory and reporting requirements also continue to increase and evolve. CMS needs quality reports around measures like readmissions, and healthcare reform means more transparent quality and pricing information for the public. The shift to value-based purchasing models will only add to the reporting burden for healthcare organizations.

Healthcare Data Will Only Get More Complex

Healthcare data will not get simpler in the future. If anything, this list will grow. Healthcare faces unique challenges, and with them come unique data challenges.

Because healthcare data is so uniquely complex, it’s clear that traditional approaches to managing data will not work in healthcare. A different approach is needed that can handle the multiple sources, the structured and unstructured data, the inconsistency, the variability, and the complexity within an ever-changing regulatory environment. The solution for this unpredictable change and complexity is an agile approach, tuned for healthcare. As with a professional athlete, the ability to change directions on a dime when the environment around you is in constant flux is a valuable attribute to have. If I start out from point A in direct route to point B and the location of point B suddenly changes or an obstacle arises, I certainly wouldn’t want to have to retrace my steps back to point A, redefine my coordinates, and set off on the new course. Rather, I need to take one step at a time, reevaluate, and pivot in flight when necessary.

Agility Compensates for Complexity and Uncertainty

Those are the core issues with healthcare data, and they are very real. Given that, and the fact that some of those issues will never change, the question becomes how to work within those limitations to deliver better information to those who need it.

The generally accepted method of aggregating data from disparate source systems so it can be analyzed is to create an enterprise data warehouse (EDW). It is a method common across many industries. Just as a physical warehouse is used to store all sorts of goods in bulk until they’re needed, an EDW houses data from across the enterprise in a single place.

Yet how you aggregate that data can have a huge impact on your ability to gain maximum value from it. The early-binding methods that are prevalent in manufacturing, retail, and financial services don’t work very well in healthcare, because they depend on making business rule decisions before you know what you want to do with the data. It is like warehousing goods on the assumption that you must store everything you could ever want in the future: you pay for all the storage space and the overhead that comes along with it, but you’re not using it.

Traditionally, other industries look ahead at what business questions they’ll want to answer. They know exactly what information they’ll need. Their data warehouses, then, store everything they need in the way that they need it.

Healthcare is not like those industries where business rules and definitions are fixed for long periods of time. The volatility of healthcare data means a rule set today may not be a best practice tomorrow. The industry is filled with instances of EDW projects that never deliver results or even come close to completion because the rules and definitions keep changing.

A better approach is to use a Late-Binding Data Warehouse. With this approach, data is brought into the EDW from the source applications as-is and placed into a source data mart. When you need to turn it into information, it is then transformed into exactly what the analysis requires. If there is a change to the business rules or definitions, such as what constitutes an at-risk patient, that change can be applied within the application data mart rather than having to transform and reload all the data from the source.

That is how Late-Binding supports the discovery process so important to healthcare. When frontline business users enter into a clinical analysis of the data, you want them to start free of any preconceived data models.

Late-Binding allows you to aggregate data quickly and develop business rules on the fly so users can develop hypotheses, use the data to prove them right or wrong, and continue the discovery process until they are able to make scientific, evidence-based decisions.
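The late-binding idea can be sketched in Python under some loudly stated assumptions: the source rows, the field names, both "at-risk" rules, and the build_application_mart helper are hypothetical and are not Health Catalyst's implementation. The sketch shows only the principle described above: source data is stored untouched, and a business rule is bound when the analysis is built, so changing the rule does not require reloading the source.

```python
# Source data mart (assumed shape): rows are kept as they arrived from the
# source application, with no business rule applied at load time.
source_mart = [
    {"patient_id": "P001", "readmissions_last_year": 3, "age": 71},
    {"patient_id": "P002", "readmissions_last_year": 1, "age": 80},
    {"patient_id": "P003", "readmissions_last_year": 2, "age": 40},
    {"patient_id": "P004", "readmissions_last_year": 0, "age": 67},
]


def at_risk_v1(row):
    """Hypothetical rule in force today: two or more readmissions."""
    return row["readmissions_last_year"] >= 2


def at_risk_v2(row):
    """Hypothetical revised rule: any readmission for patients 65 and over."""
    return row["readmissions_last_year"] >= 1 and row["age"] >= 65


def build_application_mart(source_rows, rule):
    """Bind a business rule at analysis time; source rows are copied, not changed."""
    return [{**row, "at_risk": rule(row)} for row in source_rows]


def at_risk_ids(mart):
    """List the patient IDs flagged by whichever rule built the mart."""
    return [row["patient_id"] for row in mart if row["at_risk"]]


mart_today = build_application_mart(source_mart, at_risk_v1)
mart_revised = build_application_mart(source_mart, at_risk_v2)

print("At risk under rule v1:", at_risk_ids(mart_today))
print("At risk under rule v2:", at_risk_ids(mart_revised))
```

When the definition changes, only the rule function passed to build_application_mart changes. An early-binding design would instead have stored the at-risk flag at load time and needed a full transform and reload to revise it.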

ABOUT HEALTH CATALYST

Health Catalyst is a mission-driven data warehousing, analytics, and outcomes improvement company that helps healthcare organizations of all sizes perform the clinical, financial, and operational reporting and analysis needed for population health and accountable care. Our proven enterprise data warehouse (EDW) and analytics platform helps improve quality, add efficiency, and lower costs in support of more than 50 million patients for organizations ranging from the largest US health system to forward-thinking physician practices.

For more information, visit, and follow us on Twitter, LinkedIn, and Facebook.

About the Author

Dan has been developing and implementing the core products and services of Health Catalyst since February of 2011. He started as a data architect, moved into a technical director role, and is now a Vice President of Client and Technical Operations. Prior to joining Health Catalyst, Dan owned and operated a management consultancy for five years that assisted ambulatory practices in the implementation of electronic health records and data-driven management methodologies. In this venture he served as data architect, business-intelligence developer, and strategic advisor to physicians and practice owners in the strategic management and growth of their practices. Dan holds Master’s degrees in Business Administration and Health-Sector Management from Arizona State University and a Bachelor of Arts degree in Economics from Brigham Young University.
