Articles | August 20, 2015

August 2015
Volume 2
Issue 3

Big Data: Here to Stay, for Better or Worse

Health care data streams offer great promise, but pharmacy stakeholders must beware of improper data mining and false conclusions.

“Big data” has been described by various futurists and health industry pundits as the source of the answers to all problems in health care. The promise of vast amounts of data about patients, doctors, and hospitals that can be sorted and summarized in numerous ways has led some stakeholders to conclude that the reasons for any problem can be determined and the solution predicted with great certainty. The reality: although the data streams are now quite voluminous and opportunities abound, there are plenty of caveats to the notion of elegant and accurate data sorting and analysis, as well as to the prediction of outcomes.

Big data describes any large data set that has the potential to be mined for information. It is high volume, has a wide variety, can be quickly generated and aggregated, and is often (regrettably) incompatible with different databases. Big data allows for faster identification of high-risk patients, more effective interventions, and closer monitoring. It has earned the label of “big” because it comes from so many more sources than in the past. Everything everyone does is now capable of being stored in, and potentially recalled from, a computer, cell phone, tablet, etc. Every clinical outcome or lab, charge, cost, and provider identity can be both stored and combined with another database for any patient, disease, or doctor. Detailed reports can be prepared for any category and drilled down to any population subset within seconds rather than days.

Big data now come from many diverse corners of the health care system: research from drug manufacturers, digitized patient records, clinical trial information, and claims databases from public payers such as Medicare and Medicaid. In addition, an individual patient’s clinical data now come from a variety of sources: payers, hospitals, outpatient clinics, doctors’ offices, and patients themselves. Electronic medical records (EMRs) have become a major source of data thanks to federal incentives. With EMRs, every lab, drug, intervention, order sheet, physician order, progress note, and (potential) clinical outcome is available for aggregation by population and for identification of future trends.

Big Data Takeaways

Big data streams are high volume and quickly generated and aggregated.
Big data allows for faster identification of high-risk and high-cost patients.
Self-reported data by patients are particularly powerful in predictive terms and useful for patient satisfaction, quality of life, and correlation with clinical data; they come via cellphone, social media, or online surveys.
Big data enables searching of data for relationships and trends between outcomes, costs, providers, hospitals, and certain disease states that can be used to predict future behavior and performance or identify areas for improvement.

Big Data Caveats

Big data streams often reside in separate databases that may be incompatible with other databases.
Missing, unverified, or incomplete data can limit usefulness. Different databases may lack standardization of definitions and terminology.
Big data doesn’t equal big evidence; well-designed research to build a case is still needed. Big data correlations do not necessarily establish cause and effect, and can result in ridiculous conclusions. The potential for serious sampling errors exists with retrospective analysis of big data, as opposed to more rigorous (but also more expensive and time-consuming) clinical trials.
Data mining (also known as data dredging) may enable business intelligence, but may be problematic when attempting to establish relationships between costs, outcomes, and providers. It is easy to reach “accidental,” and incorrect, conclusions with this approach.

Examples of Contemporary Uses of Big Data

Prediction of patient behavior (adherence, emergency department utilization, and other outcomes and behaviors of interest).
Establish cost-effectiveness and use patterns among competing hospitals, drugs, and providers by comparing costs and clinical outcomes across providers and facilities, with patients grouped by characteristics they have in common (eg, location, disease state, age, gender).
Develop recommendations for clinical pathways, clinical guidelines, protocols, and formularies for better outcomes based on past experience.
Enable the targeting of patient groups and focused interventions for patients who are the most expensive in a system (eg, high-cost patients, readmissions, patients whose condition worsens, adverse events, and patients with complicated, multiorgan diseases).
Determine which drugs are associated with high rates of adverse events.
Monitor patients and providers for compliance with treatment guidelines, and educate or penalize those who fail to comply (or reward the compliant ones).
Pharmacoeconomic analysis to determine which drug, device, or service is the most cost-effective.

Big Data vs Big Evidence: What’s the Difference?

Doing research is just like cooking food your guests will enjoy and want more of: you need the right ingredients (data) and a method of preparing and combining them (research and statistics) to create the end product (evidence). Until all the ingredients come together properly, you do not have something worthy of presentation. According to the International Society for Pharmacoeconomics and Outcomes Research Task Force Report on Real World Data, “Evidence is generated according to a research plan and interpreted accordingly, whereas data is but one component of the research plan. Evidence is shaped, while data simply are raw materials and alone are noninformative.”¹

Much has been written in recent years suggesting that medical decision making should be evidence-based. Evidence-based medicine (EBM) has been defined as “the conscientious, explicit, and judicious use of current best evidence in making decisions about the care of individual patients.”² Although EBM requires data, it also requires rigor in both the collection and the analysis of the data so that the conclusions are not due to bias, statistical sampling error, or errors in data analysis. Simply copying data to a spreadsheet and looking at the average values may be quick, but it may also be inaccurate. No matter how much data exist, researchers still need to ask the right questions to create a hypothesis, design a test, and use the data to determine whether their hypothesis is true.

Self-Reported Data

Big data includes data from patients or unverified sources: patient registries, social media, and government sites that allow users and providers to enter data directly. These data can be aggregated and sorted anonymously or, with patient permission, be tied directly to objective clinical data, charges, cost of care, their disease states, and the medications they are taking. They can be used to measure a patient’s quality of life; their experience with physicians, hospitals, or other providers; or even home monitoring. Using GPS-enabled devices and smartphone apps, it is possible to directly report heart rate, blood pressure, arrhythmias, medication use or refill information, and blood glucose levels.

Data Mining and Correlations

A trend seen among less-experienced database users is the improper use of data mining. Individuals may search and re-sort the database until they find something that looks significant, even if it seems illogical. There are 3 problems with this approach:

As you data mine, you tend to shrink the size of the sample because fewer and fewer people have all the characteristics you add. If you search long enough, you can find results that look meaningful but may not be statistically significant.
It is easy to bias the data mining by first looking at which drugs had desirable outcomes and then choosing patients to match.
It is possible to generate spurious correlations, or things that correlate with each other but in fact have no relationship with each other. There is an entire website (www.tylervigen.com) devoted to such correlations between unrelated findings, such as the number of films a particular actor appeared in compared with the number of drowning deaths during a given time period.

The biggest risk of error from these correlation conclusions is inferring cause and effect. When 2 things occur together (which is all that correlation confirms), the researcher has the chance to show bias by declaring which happened first, naming which is the cause and which is the effect.
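The risk of spurious correlations from repeated searching can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the patient count, the number of candidate variables, and the random data are all hypothetical, and the cutoff of 0.444 is the approximate value of |r| that standard tables give for a 2-sided P of .05 with 20 observations. Every candidate variable is pure noise, so any "significant" correlation it finds is accidental.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def dredge(n_patients=20, n_variables=1000, threshold=0.444, seed=0):
    """Correlate one random outcome with many unrelated random variables.

    All values are simulated noise (hypothetical data), so every
    correlation that passes the threshold is spurious by construction.
    Returns (number of 'significant' variables, strongest |r| seen).
    """
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_patients)]
    hits = 0
    strongest = 0.0
    for _ in range(n_variables):
        candidate = [rng.gauss(0, 1) for _ in range(n_patients)]
        r = pearson_r(candidate, outcome)
        if abs(r) > threshold:
            hits += 1
        strongest = max(strongest, abs(r))
    return hits, strongest


hits, strongest = dredge()
print(f"{hits} of 1000 unrelated variables pass the cutoff; "
      f"strongest |r| = {strongest:.2f}")
```

With a cutoff set at the .05 level, roughly 1 in 20 unrelated variables would be expected to pass by chance alone, which is why a finding that surfaces only after many searches needs to be tested on fresh data before it is believed.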

Predictive Modelling and Confounders

Big data by itself has limited value. The usefulness lies in the ability of pharmacy stakeholders to determine the trends and relationships between data points for any single population member. There are always “hidden variables,” or confounders, that may not be seen in the data but could be important predictors of the outcome. These confounders include other concurrent therapy, severity of an illness, standard of care, concurrent diseases, and a patient’s genetic makeup.
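A short worked example may help show how a confounder such as severity of illness can reverse a conclusion. The counts below are invented purely for illustration (they are not from any study), and the names Drug A and Drug B are placeholders. In this hypothetical data set, Drug A was given mostly to severely ill patients and Drug B mostly to mildly ill patients.

```python
# Hypothetical counts, invented for illustration only:
# (patients who improved, patients treated) per drug and severity group.
outcomes = {
    "Drug A": {"mild": (90, 100), "severe": (210, 300)},
    "Drug B": {"mild": (255, 300), "severe": (65, 100)},
}


def success_rate(improved, treated):
    """Share of treated patients who improved."""
    return improved / treated


def pooled_rate(strata):
    """Success rate when severity is ignored and all patients are pooled."""
    total_improved = sum(improved for improved, _ in strata.values())
    total_treated = sum(treated for _, treated in strata.values())
    return total_improved / total_treated


for drug, strata in outcomes.items():
    by_group = ", ".join(
        f"{group} {success_rate(*counts):.0%}" for group, counts in strata.items()
    )
    print(f"{drug}: pooled {pooled_rate(strata):.0%} ({by_group})")
```

Pooled, Drug B appears better (80% vs 75%). Within each severity group, however, Drug A does better (90% vs 85% in mild illness, 70% vs 65% in severe illness). If severity is not recorded in the database, the pooled comparison is all that can be seen, and it points to the wrong drug.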

A field of science called “predictive analytics” is used to predict how a situation will play out in the future based on results from the past. A prediction can be used to treat an entire population similar to the one studied or can be used to tailor treatments for individual patients based on determining the provider, hospitals, or medications most likely to achieve a given outcome. In essence, big data serves to substitute the experience of thousands of similar patients who had a variety of outcomes for the clinical judgment of the treating physician.

Can Big Data Be Used to Tell Us Which Care Is Cost Effective?

To choose a cost-effective intervention (whether medications, clinical services, or devices), provider, or facility for a patient or group of patients, providers need to know the cost from the perspective of the user. Costs differ from the provider and payer perspectives, and a hospital's costs and its charges are not the same. We also need to know how effective an intervention is in achieving the primary clinical outcome, whatever that may be: a quicker cure, a longer life, a disability prevented, a successful surgery. Costs and efficacy may then be compared, and the following rules developed by the author of this article may be used to determine the most cost-effective treatment:

If 2 drugs have the same cost, choose the more effective drug.
If 2 drugs have equal efficacy, choose the less expensive drug.
If 1 drug costs less and is more effective, choose it because it is dominant (a no-brainer).
If 1 drug costs more and is more effective, the more expensive drug is considered cost effective if the extra benefits are worth the extra cost (ie, it has greater value).
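The 4 rules above can be sketched as a small decision function. This is a minimal illustration, not a validated pharmacoeconomic tool: the names Option and choose are hypothetical, costs are assumed to come from a single consistent perspective, and effect is assumed to be a single number where higher is better. The article does not say how to judge whether extra benefits are "worth" the extra cost, so the sketch assumes the user supplies a maximum acceptable extra cost per extra unit of effect (max_cost_per_unit); without it, the function reports the trade-off and leaves the decision to the reader.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Option:
    name: str
    cost: float    # cost from one consistent perspective (assumption)
    effect: float  # primary clinical outcome; higher is better (assumption)


def choose(a: Option, b: Option, max_cost_per_unit: Optional[float] = None) -> str:
    """Apply the 4 cost-effectiveness rules to 2 options."""
    if a.cost == b.cost and a.effect == b.effect:
        return "either (same cost and same efficacy)"
    if a.cost == b.cost:
        # Rule 1: same cost, choose the more effective option.
        return max(a, b, key=lambda o: o.effect).name
    if a.effect == b.effect:
        # Rule 2: equal efficacy, choose the less expensive option.
        return min(a, b, key=lambda o: o.cost).name
    cheaper, pricier = sorted((a, b), key=lambda o: o.cost)
    if cheaper.effect > pricier.effect:
        # Rule 3: costs less and works better, so it is dominant.
        return cheaper.name
    # Rule 4: costs more and works better. Whether the extra benefit is
    # worth the extra cost depends on a threshold the user must supply.
    extra_cost_per_unit = (pricier.cost - cheaper.cost) / (pricier.effect - cheaper.effect)
    if max_cost_per_unit is None:
        return (f"judgment call: {pricier.name} costs "
                f"{extra_cost_per_unit:.2f} more per extra unit of effect")
    return pricier.name if extra_cost_per_unit <= max_cost_per_unit else cheaper.name


# Hypothetical example: Y costs 50 more and cures 10 more patients per 100.
x = Option("X", cost=100, effect=0.80)
y = Option("Y", cost=150, effect=0.90)
print(choose(x, y))                          # reports the trade-off
print(choose(x, y, max_cost_per_unit=1000))  # Y, if 1000 per unit is acceptable
```

The function only makes sense when both options are costed from the same perspective and measured against the same outcome.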

Potential sources of error in cost-effectiveness research include:

Mixing perspectives.
Failure to capture costs outside the area served by the database, such as the cost of care of the home-care provider or a physician paid directly by the patient rather than the insurer.
Insufficient clinical data or faulty assumptions for missing data.
Cost figures that are an ill-defined mix of direct and indirect costs, and fixed and variable costs.

Conclusion

Big data seems to offer some real potential to improve the quality of care and related outcomes by trying to determine which procedures and providers offer cost-effective treatment. Database users need to know the information they will be accessing is complete and accurate. Big data needs to provide enough detail so that when users need to “drill down” to a specific treatment, patient category, or provider, the data can be accessed and summarized. Assuming that all the relevant databases can be tied together, providers will have more information from the multiple places where care has been provided: pharmacists, labs, physicians, hospitals, nursing homes, emergency departments, and outpatient surgery centers.

Privacy and security will remain concerns: significant database breaches are reported weekly among large companies, so it will be important to ensure that data do not fall into the wrong hands. Consumers will always be concerned that some of these data may fall into the hands of an employer, insurance company, or even an ex-spouse and be used in a prejudicial manner. How to collect and sort these data while keeping them away from the “wrong” people and getting them to the “right” people will be a challenge for years to come.

The potential reward of big data is tremendous, but it coexists with the possibility of serious problems resulting from its misuse. Unverified data from patients; databases that cannot communicate and share information; the risk of missing, incomplete, or inaccurate information; and a lack of rigor in research, including false conclusions of cause and effect based on incorrect association or correlation, are all issues that must be addressed.

Lorne Basskin, PharmD, is a consultant on outcomes research, formulary decision making and pharmacoeconomics, and teaches in the School of Public Health at Brown University.

References

1. Garrison LP Jr, Neumann PJ, Erickson P, Marshall D, Mullins CD. Using real-world data for coverage and payment decisions: the ISPOR Real-World Data Task Force report. Value Health. 2007;10(5):326-335.
2. Sackett DL, Rosenberg WM, Gray JA, Haynes RB, Richardson WS. Evidence based medicine: what it is and what it isn’t. BMJ. 1996;312(7023):71-72.
