April 25, 2023

Data-Driven Science: How AI and Open Data Can Revolutionize Scientific Discovery

While demand for professionals skilled in working with big data is at an all-time high, there remains a significant shortage of talent.

Scientists have long been perceived, and portrayed in film, as older individuals in white lab coats perched at a bench full of bubbling fluorescent liquids. The present-day reality is quite different. Scientists are increasingly data jockeys in hoodies sitting before monitors, analyzing enormous amounts of data. Modern-day labs are more likely to be composed of sterile rows of robots handling materials, and lab notebooks are now electronic, stored in massive data centers holding vast quantities of information. Today, scientific input comes from data pulled from the cloud, with algorithms fueling scientific discovery the way Bunsen burners once did.

Advances in technology, and especially instrumentation, enable scientists to collect and process data at an unprecedented scale. As a result, scientists now face massive datasets that require sophisticated analysis techniques and computational tools to extract meaningful insights. This scale also presents significant challenges: how do you store, manage, and share these large datasets while ensuring that the data is reliable and of high quality?

The Impact of Big Data on Science

This growth in data is transforming the way scientists conduct research and enabling new discoveries across many fields, especially genome and protein research. It has fostered the emergence of a new type of scientist, the bioinformatician or data scientist, who works hands-on with big data by developing and applying algorithms. In fact, “data scientist” has been at the top of the list of desirable jobs on career sites for the last few years. However, while demand for professionals skilled in working with big data is at an all-time high, there is a significant shortage of talent.

In medicine, as in other fields, it’s not just the volume and velocity of data generation that is increasing, but also the variety of data being collected to answer research questions. For instance, flow cytometry data is fundamentally different from DNA sequencing data, which is again totally different from 3D models of proteins. The tools and algorithms that work for one data type are not suited for another. Further, flexibility in data storage and modeling is crucial for repurposing data. This is especially true for predictive science where integration occurs between data and data types unrelated to the hypotheses of any of the original studies.

Turning to Machine Learning and Artificial Intelligence

Technology can act like a powerful flashlight, illuminating hidden patterns and insights in vast amounts of data and letting us understand things that were previously too dark to see. That’s why, even though the recent rise of generative AI (genAI) tools like ChatGPT is generating headlines and stoking fear about potential risks, drug discovery is one setting where artificial intelligence (AI) and machine learning (ML) are poised to make a significant, positive impact.

For example, during the pandemic, I had the opportunity to collaborate with the team behind the EVE Online video game to create Project Discovery - Flow Cytometry, a free mini-game that enabled tens of thousands of gamers to become citizen scientists. Using data from cell samples of patients with COVID-19 and other immune system diseases, players were trained to identify different cell patterns generated using a technology known as flow cytometry. The game used rewards and rankings to make it fun and challenging, but many players also expressed a desire to participate in scientific research and satisfaction in doing so, especially when it related to their own interests and experience.

To date, players have solved millions of puzzles, representing hundreds of years of effort. All data from the project will be freely available for open science. Companies like Dotmatics will be able to use the data to develop ML approaches to flow cytometry data analysis, leading to exponentially faster, less expensive, and more significant medical breakthroughs.

Today, both ML and AI are being used in research labs and universities around the world to expedite discoveries. The National Cancer Institute’s Center for Cancer Research has developed deep learning algorithms to improve cancer detection. For example, one model can function as “a virtual expert,” reviewing MRIs in hard-to-detect cancer types, guiding less-experienced radiologists, and minimizing error rates. Similarly, AI is used at the University of Toronto to predict Alzheimer disease risk, at Rutgers University to predict cardiovascular disease, and by hundreds of startups that use advanced technology to design cheaper, safer drugs with fewer adverse effects.

Complexities of Big Data

Despite these advances, the complexity of the data and the heterogeneity of the tools required to analyze it can make it difficult for researchers to collaborate effectively to generate the big datasets that AI requires. Efforts such as the FAIR Guiding Principles for scientific data management and stewardship provide guidelines to improve the Findability, Accessibility, Interoperability, and Reuse of digital assets. They are increasingly being adopted and are even being mandated by granting agencies. The threat of withheld funding is a powerful motivating force in academia, but it does not directly translate to pharmaceutical companies, which are perhaps even more burdened by the same underlying challenges when trying to find and share massive, complex datasets internally within global organizations.

While the old way of science using beakers and chemistry is still important, tomorrow’s scientists will be able to explore and understand the world around us and scale ambitious research into areas that are presently economically prohibitive. However, to truly harness the power of AI, we must invest in further improvements to the infrastructure supporting the integration, analysis, and reuse of data that have already become the new frontier of scientific discovery.

About the Author

Ryan Brinkman, PhD, is the vice president and research director for Dotmatics.
