August 20, 2015

August 2015
Volume 2
Issue 3

Big Data: Here to Stay, for Better or Worse

Health care data streams offer great promise, but pharmacy stakeholders must beware of improper data mining and false conclusions.

“Big data” has been described by various futurists and health industry pundits as the source of the answers to all problems in health care. The promise of vast amounts of data about patients, doctors, and hospitals that can be sorted and summarized in numerous ways has led some stakeholders to conclude that the reasons for any problem can be determined and the solution predicted with great certainty. The reality: although it is true that the data streams are now quite voluminous and opportunities abound, there are plenty of caveats to the notion of elegant and accurate data sorting and analysis, as well as the prediction of outcomes.

Big data describes any large data set that has the potential to be mined for information. It is high volume, has a wide variety, can be quickly generated and aggregated, and is often (regrettably) incompatible with different databases. Big data allows for faster identification of high-risk patients, more effective interventions, and closer monitoring. It has earned the label of “big” because it comes from so many more sources than in the past. Everything everyone does can now be stored in, and potentially recalled from, a computer, cell phone, tablet, or similar device. Every clinical outcome or lab, charge, cost, and provider identity can be both stored and combined with another database for any patient, disease, or doctor. Detailed reports can be prepared for any category and drilled down to any population subset within seconds rather than days.

Big data now comes from many diverse corners of the health care system: research from drug manufacturers, digitized patient records, clinical trial information, and claims databases from public payers such as Medicare and Medicaid. In addition, an individual patient’s clinical data now come from a variety of sources: payers, hospitals, outpatient clinics, doctors’ offices, and patients themselves. Electronic medical records (EMRs) have become a major source of data thanks to federal incentives. With EMRs, every lab, drug, intervention, order sheet, physician order, progress note, and (potential) clinical outcome is available for aggregation by population and identification for future trends.

Big Data Takeaways

Big data streams are high volume and quickly generated and aggregated.
Big data allows for faster identification of high-risk and high-cost patients.
Self-reported data by patients are particularly powerful in predictive terms and useful for patient satisfaction, quality of life, and correlation with clinical data; they come via cellphone, social media, or online surveys.
Big data enables searching of data for relationships and trends among outcomes, costs, providers, hospitals, and certain disease states that can be used to predict future behavior and performance or identify areas for improvement.

Big Data Caveats

Big data streams often reside in separate databases that may be incompatible with other databases.
Missing, unverified, or incomplete data can limit usefulness. Different databases may lack standardization in definitions and terminology.
Big data doesn’t equal big evidence; well-designed research to build a case is still needed. Big data correlations do not necessarily establish cause and effect, and can result in ridiculous conclusions. Potential for serious sampling errors exists with retrospective analysis of big data as opposed to more rigorous (but also more expensive and time-consuming) clinical trials.
Data mining (also known as data dredging) may enable business intelligence, but may be problematic when attempting to establish relationships between costs, outcomes, and providers. It is easy to reach “accidental” and incorrect conclusions with this approach.

Examples of Contemporary Uses of Big Data

Predict patient behavior (adherence, emergency department utilization, and other outcomes and behaviors of interest).
Establish cost-effectiveness and use patterns among competing hospitals, drugs, and providers by comparing costs and clinical outcomes across providers and facilities, with patients grouped by characteristics they have in common (eg, location, disease state, age, gender).
Develop recommendations for clinical pathways, clinical guidelines, protocols, and formularies for better outcomes based on past experience.
Enable the targeting of patient groups and focused interventions for patients who are the most expensive in a system (eg, high-cost patients, readmissions, patients whose condition worsens, adverse events, and patients with complicated, multiorgan diseases).
Determine which drugs are associated with high rates of adverse events.
Monitor patients and providers for compliance with treatment guidelines, and educate or penalize those who fail to comply (or reward the compliant ones).
Perform pharmacoeconomic analysis to determine which drug, device, or service is the most cost-effective.

Big Data vs Big Evidence: What’s the Difference?

Doing research is just like cooking food your guests will enjoy and want more of: you need the right ingredients (data) and a method of preparing and combining them (research and statistics) to create the end product (evidence). Until all the ingredients come together properly, you do not have something worthy of presentation. According to the International Society for Pharmacoeconomics and Outcomes Research Task Force Report on Real World Data, “Evidence is generated according to a research plan and interpreted accordingly, whereas data is but one component of the research plan. Evidence is shaped, while data simply are raw materials and alone are noninformative.”¹

Much has been written in recent years suggesting that medical decision making should be evidence-based. Evidence-based medicine (EBM) has been defined as “the conscientious, explicit, and judicious use of current best evidence in making decisions about the care of individual patients.”² Although EBM requires data, it also requires rigor in both the collection and the analysis of the data so that the conclusions are not due to bias, statistical sampling error, or errors in data analysis. Simply copying data to a spreadsheet and looking at the average values may be quick, but it may also be inaccurate. No matter how much data exist, researchers still need to ask the right questions to create a hypothesis, design a test, and use the data to determine whether their hypothesis is true.
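To make the step from "looking at the average values" to "design a test" concrete, here is a minimal Python sketch of a permutation test, one of several ways to check whether a difference in averages could plausibly be due to chance. The function name, the patient values, and the number of shuffles are hypothetical choices made for illustration; none of them come from a real data set.

```python
import random
import statistics


def permutation_p_value(group_a, group_b, n_shuffles=10_000, seed=0):
    """Estimate how often random relabeling of patients produces a gap in
    means at least as large as the one actually observed."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # pretend the group labels carry no information
        gap = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if gap >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_shuffles


# Hypothetical reductions in systolic blood pressure (mm Hg) on 2 drugs.
drug_x = [12, 9, 15, 11, 8, 14, 10, 13]
drug_y = [10, 7, 12, 9, 11, 6, 8, 13]

print("difference in means:", statistics.fmean(drug_x) - statistics.fmean(drug_y))
print("permutation p-value:", permutation_p_value(drug_x, drug_y))
```

A small p-value suggests the gap is unlikely to come from chance alone. It says nothing about bias in how patients ended up in each group, which is why the research design still matters.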

Self-Reported Data

Big data includes data from patients or unverified sources: patient registries, social media, and government sites that allow users and providers to enter data directly. These data can be aggregated and sorted anonymously or, with patient permission, be tied directly to objective clinical data, charges, cost of care, their disease states, and the medications they are taking. They can be used to measure a patient’s quality of life; their experience with physicians, hospitals, or other providers; or even home monitoring. Using GPS-enabled devices and smartphone apps, it is possible to directly report heart rate, blood pressure, arrhythmias, medication use or refill information, and blood glucose levels.

Data Mining and Correlations

A trend seen among less-experienced database users is the improper use of data mining. Individuals may search and re-sort the database until they find something that looks significant, even if it seems illogical. There are 3 problems with this approach:

As you data mine, you tend to shrink the size of the sample because fewer and fewer people have all the characteristics you add. If you search long enough, you can find results that look meaningful but may not be statistically significant.
It is easy to bias the data mining by first looking at which drugs had desirable outcomes and then choosing patients to match.
It is possible to generate spurious correlations, or things that correlate with each other but in fact have no relationship with each other. There is an entire website (www.tylervigen.com) devoted to such correlations between unrelated findings, such as the number of films a particular actor appeared in compared with the number of drowning deaths during a given time period.
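The third problem can be reproduced in a few lines. The following Python sketch is an illustration under stated assumptions, not a recommended analysis: it generates a random "outcome" and many random, unrelated "variables" for a small group of simulated patients, then reports the strongest correlation it finds. The function names, the sample size of 20, and the count of 500 variables are all hypothetical.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def dredge(n_patients, n_variables, seed=0):
    """Return the strongest |r| between one random outcome and n_variables
    random variables that are unrelated to it by construction."""
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_patients)]
    strongest = 0.0
    for _ in range(n_variables):
        candidate = [rng.gauss(0, 1) for _ in range(n_patients)]
        strongest = max(strongest, abs(pearson(outcome, candidate)))
    return strongest


one_look = dredge(n_patients=20, n_variables=1)
many_looks = dredge(n_patients=20, n_variables=500)
print("strongest |r| after 1 comparison:   ", round(one_look, 2))
print("strongest |r| after 500 comparisons:", round(many_looks, 2))
```

Every variable here is noise, so any correlation the search turns up is spurious. The more comparisons are tried on a small sample, the stronger the best "finding" tends to look.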

The biggest risk of error from these correlation conclusions is inferring cause and effect. When 2 things occur together (which is all that correlation confirms), the researcher has the chance to show bias by declaring which happened first, naming which is the cause and which is the effect.

Predictive Modelling and Confounders

Big data by itself has limited value. The usefulness lies in the ability of pharmacy stakeholders to determine the trends and relationships between data points for any single population member. There are always “hidden variables,” or confounders, that may not be seen in the data but could be important predictors of the outcome. These confounders include other concurrent therapy, severity of an illness, standard of care, concurrent diseases, and a patient’s genetic makeup.
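A short worked example shows how a confounder such as severity of illness can reverse a conclusion. The counts in this Python sketch are hypothetical and chosen only to illustrate the effect: drug A is given mostly to severe cases, drug B mostly to mild ones.

```python
# (drug, severity) -> (successes, patients). Hypothetical counts.
counts = {
    ("A", "mild"): (18, 20),
    ("A", "severe"): (40, 80),
    ("B", "mild"): (64, 80),
    ("B", "severe"): (8, 20),
}


def success_rate(drug, severity=None):
    """Success rate for a drug, overall or within one severity group."""
    rows = [
        value
        for (d, s), value in counts.items()
        if d == drug and (severity is None or s == severity)
    ]
    successes = sum(row[0] for row in rows)
    patients = sum(row[1] for row in rows)
    return successes / patients


for drug in ("A", "B"):
    print(
        drug,
        "overall:", success_rate(drug),
        "mild:", success_rate(drug, "mild"),
        "severe:", success_rate(drug, "severe"),
    )
```

With these counts, drug B has the higher overall success rate, yet drug A does better among mild cases and among severe cases. A database that records outcomes but not severity would point to the wrong drug.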

A field of science called “predictive analytics” is used to predict how a situation will play out in the future based on results from the past. A prediction can be used to treat an entire population similar to the one studied or can be used to tailor treatments for individual patients based on determining the provider, hospitals, or medications most likely to achieve a given outcome. In essence, big data serves to substitute the experience of thousands of similar patients who had a variety of outcomes for the clinical judgment of the treating physician.

Can Big Data Be Used to Tell Us Which Care Is Cost Effective?

To choose a cost-effective intervention (whether medications, clinical services, or devices), provider, or facility for a patient or group of patients, providers need to know the cost from the perspective of the user. Costs differ from the provider and payer perspectives, and a hospital's costs and its charges are not the same. We also need to know how effective an intervention is in achieving the primary clinical outcome, whatever that may be: a quicker cure, a longer life, a disability prevented, a successful surgery. Costs and efficacy may then be compared, and the following rules developed by the author of this article may be used to determine the most cost-effective treatment:

If 2 drugs have the same cost, choose the more effective drug.
If 2 drugs have equal efficacy, choose the less expensive drug.
If 1 drug costs less and is more effective, choose it because it is dominant (a no-brainer).
If 1 drug costs more and is more effective, the more expensive drug is considered cost effective if the extra benefits are worth the extra cost (ie, it has greater value).
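The 4 rules above can be sketched as a small Python function. This is one possible reading, not a definitive implementation: the `Drug` class, the `choose` function, and the example figures are hypothetical, and the judgment of whether "the extra benefits are worth the extra cost" is reduced here to a single assumed threshold, `max_cost_per_unit`, which a decision maker would have to supply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Drug:
    name: str
    cost: float    # cost per patient, measured from one consistent perspective
    effect: float  # primary outcome, in any unit where higher is better


def choose(a, b, max_cost_per_unit):
    """Apply the 4 rules and return (name of chosen drug, reason)."""
    if a.cost == b.cost and a.effect == b.effect:
        return a.name, "no difference in cost or effect"
    if a.cost == b.cost:
        better = a if a.effect > b.effect else b
        return better.name, "same cost, more effective"
    if a.effect == b.effect:
        cheaper = a if a.cost < b.cost else b
        return cheaper.name, "equal efficacy, less expensive"
    cheap, pricey = (a, b) if a.cost < b.cost else (b, a)
    if cheap.effect > pricey.effect:
        return cheap.name, "dominant"
    # The pricier drug is also more effective: weigh extra cost against extra effect.
    extra_cost_per_unit = (pricey.cost - cheap.cost) / (pricey.effect - cheap.effect)
    if extra_cost_per_unit <= max_cost_per_unit:
        return pricey.name, "extra benefit judged worth the extra cost"
    return cheap.name, "extra benefit not judged worth the extra cost"


# Hypothetical drugs: effect is the proportion of patients cured.
old_drug = Drug("old", cost=100.0, effect=0.50)
new_drug = Drug("new", cost=300.0, effect=0.75)
print(choose(old_drug, new_drug, max_cost_per_unit=1000.0))
print(choose(old_drug, new_drug, max_cost_per_unit=500.0))
```

The same pair of drugs yields a different choice as the threshold changes, which is the point of the fourth rule: the data supply costs and outcomes, but the value judgment is made outside the database.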

Potential errors in cost-effectiveness research include:

Mixing perspectives.
Failure to capture costs outside the area served by the database, such as the cost of care of the home-care provider or a physician paid directly by the patient rather than the insurer.
Insufficient clinical data or faulty assumptions for missing data.
Cost figures that are an ill-defined mix of direct and indirect costs, and fixed and variable costs.

Conclusion

Big data seems to offer some real potential to improve the quality of care and related outcomes by trying to determine which procedures and providers offer cost-effective treatment. Database users need to know the information they will be accessing is complete and accurate. Big data needs to provide enough detail so that when users need to “drill down” to a specific treatment, patient category, or provider, the data can be accessed and summarized. Providers will, assuming that all the relevant databases can be tied together, have more information from multiple places where care has been provided: pharmacists, labs, physicians, hospitals, nursing homes, emergency departments, and outpatient surgery centers.

Because significant database breaches are reported weekly among large companies, privacy and security will be concerns, and it will be important to ensure data do not fall into the wrong hands. Consumers will always be concerned that some of these data may fall into the hands of an employer, insurance company, or even an ex-spouse and be used in a prejudicial manner. How to collect and sort these data while keeping them away from the “wrong” people and getting them to the “right” people will be a challenge for years to come.

The potential reward of big data is tremendous, but it coexists with the possibility of serious problems resulting from its misuse. Unverified data from patients; databases that cannot communicate and share information; the risk of missing, incomplete, or inaccurate information; and a lack of rigor in research, including false conclusions of cause and effect based on incorrect association or correlation, are all issues that must be addressed.

Lorne Basskin, PharmD, is a consultant on outcomes research, formulary decision making and pharmacoeconomics, and teaches in the School of Public Health at Brown University.

References

1. Garrison LP Jr, Neumann PJ, Erickson P, Marshall D, Mullins CD. Using real-world data for coverage and payment decisions: the ISPOR Real-World Data Task Force report. Value Health. 2007;10(5):326-335.
2. Sackett DL, Rosenberg WM, Gray JA, Haynes RB, Richardson WS. Evidence based medicine: what it is and what it isn’t. BMJ. 1996;312(7023):71-72.
