The HSHSL is a part of the University of Maryland, Baltimore.


Finding Data: Types of Health Data

Finding and using data and datasets from various sources. Browse datasets by topic.

Common Data Types

Quantitative Data

  • Quantitative data is measurable, often used for comparisons, and involves counting of people, behaviors, conditions, or other discrete events.
  • Quantitative data uses numbers to determine the what, who, when, and where of health-related events.
  • Examples of quantitative data include: age, weight, temperature, or the number of people suffering from diabetes.

Qualitative Data

  • Qualitative data is a broad category of data that can include almost any non-numerical data.
  • Qualitative data uses words to describe a particular health-related event.
  • This data can be observed, but not measured.
  • Collecting qualitative data often involves observing people in selected places and listening to discover how they feel and why they might feel that way.
  • Examples of qualitative data include: male/female, smoker/non-smoker, or questionnaire response (agree, disagree, neutral).
  • Examples of qualitative data from a health care setting include: measuring organizational change, measures of clinical leadership in implementing evidence-based guidelines, or patient perceptions of quality of care.
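As a rough illustration of the two data types described above, the sketch below stores a single patient record and separates its numeric (quantitative) fields from its descriptive (qualitative) fields. The record layout, field names, and values are hypothetical and chosen only for this example; they are not drawn from any dataset in this guide.

```python
from dataclasses import dataclass


@dataclass
class PatientRecord:
    # Quantitative fields: numeric and measurable
    age_years: int
    weight_kg: float
    temperature_c: float
    # Qualitative fields: categories or words
    smoker: str
    survey_response: str


def split_by_type(record):
    """Separate a record's fields into quantitative and qualitative groups."""
    quantitative, qualitative = {}, {}
    for name, value in vars(record).items():
        # Treat numbers as quantitative; bool is excluded because a
        # yes/no flag is a label rather than a measurement.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            quantitative[name] = value
        else:
            qualitative[name] = value
    return quantitative, qualitative


# Hypothetical example record (invented values)
record = PatientRecord(54, 81.5, 36.8, "non-smoker", "agree")
quantitative, qualitative = split_by_type(record)
print("Quantitative:", quantitative)
print("Qualitative:", qualitative)
```

In practice the distinction depends on meaning rather than storage type: a numerically coded survey answer (for example, 1 = agree) is still qualitative, so a type check like this one is only a first approximation.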

Reference:

National Institutes of Health Office of Research Services. "Common Data Types in Public Health Research." 2024.

Research Data

Research data is collected during the research process, specifically for the purpose of data analysis.

The goal of clinical and scientific research is to answer a research question by generating data that support or refute a hypothesis. While research data types may overlap with real world data, they are collected for different purposes.

To prospectively collect data for clinical research, approvals from institutional review boards, governing bodies, and ethical committees must be sought.

Key characteristics of research data:

  • Research data is generated by the researcher, primarily for the researcher's own use.
  • Primary research data can become secondary research data once it is shared within a data repository.
  • Research data can be observational, experimental, simulation, derived, or reference.

Reference:

Kwok CS, Muntean EA, Mallen CD, Borovac JA. Data Collection Theory in Healthcare Research: The Minimum Dataset in Quantitative Studies. Clin Pract. 2022;12(6):832-844. doi:10.3390/clinpract12060088


Real World Data

Real world data is data relating to patient health status and/or the delivery of health care routinely collected from a variety of sources for purposes other than research.

The increased use of the internet, social media, wearable and mobile devices, claims and billing activities, electronic health records (EHRs), product and disease registries, e-health services, and other technology-driven services has led to the rapid generation and availability of real world data.

The main sources of real world evidence are surveys, disease registries, patient-generated health data, electronic health records, administrative records, vital records, and social media.

There are many advantages of real world evidence including:

  • The ability to track real-world patient behavior.
  • The possibility of undertaking research that cannot be done with randomized controlled trials (RCTs), such as research on high-risk groups like pregnant women and children.
  • Social media platforms can provide patient perspectives on various health topics (adverse events, reasons for treatment changes and non-adherence, and quality of life).
  • Rapid and more straightforward retrieval of and access to data.

Read the article "#Coronavirus on TikTok: user engagement with misinformation as a potential threat to public health behavior" from University of Maryland School of Medicine faculty using real world data.

Reference:

Dang A. Real-World Evidence: A Primer. Pharmaceut Med. 2023;37(1):25-36. doi:10.1007/s40290-022-00456-6

 

See below for a list of common sources for real world evidence.

Surveys

Surveys are an important means of collecting health and social science information from a sample of people in a standardized way to better understand a larger population. Surveys can also be used to collect research data; among real world data sources, patient-reported outcomes are a common form of survey you may encounter. There are many methods used to conduct surveys, including questionnaires and in-depth interviews via phone, mail, email, and in-person.

Two main types of surveys used to collect health-related data are population surveys and provider surveys.

Common issues with surveys:

  • It can be hard to get detailed information in a survey: respondents may choose not to answer difficult questions, or they may not remember important details correctly.
  • Surveys can have low response rates, and those who do not have access to the medium (mail, phone, email, etc.) through which the surveys are distributed are excluded. 

Reference:

National Library of Medicine. "Finding and Using Health Statistics." 2024.

Disease Registries

Disease registries are systems that allow people to collect, store, retrieve, analyze, and disseminate information about people with a specific disease or condition. Disease registries let researchers estimate how large a health problem is, determine the incidence of the disease, study trends over time, and evaluate the effects of certain environmental exposures. 
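As one example of what registry counts make possible, the sketch below computes an incidence rate using the standard definition: new cases during a period divided by the population at risk, scaled to a rate per 100,000 people. The function name and all numbers are hypothetical and are not taken from any real registry.

```python
def incidence_per_100k(new_cases, population_at_risk):
    """Return new cases per 100,000 people at risk over the study period."""
    if population_at_risk <= 0:
        raise ValueError("population_at_risk must be positive")
    return new_cases * 100_000 / population_at_risk


# Hypothetical registry counts for one year (invented values)
rate = incidence_per_100k(new_cases=150, population_at_risk=600_000)
print(f"Incidence: {rate:.1f} per 100,000 per year")
```

Scaling to a common base such as 100,000 makes rates comparable across populations of different sizes.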

Registries are kept by governments, hospitals, universities, non-profits, and private groups. They store data from hospital records, lab reports, and other sources.

Common issues with disease registries:

  • It can be difficult to accurately track trends because diseases sometimes change definitions.
  • Data can also be lacking if hospitals or doctors do not report it.

Reference:

National Library of Medicine. "Finding and Using Health Statistics." 2024.

Patient-Generated Health Data

Patient-generated health data (PGHD) are data generated from devices that provide information on a patient’s status (for example, internet-connected scales, pedometers, home blood pressure monitors). Patient-generated health data can include the raw sensor values and summary statistics calculated from the underlying data.
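To make the difference between raw sensor values and summary statistics concrete, here is a minimal sketch that reduces a week of home blood pressure readings to a few summary values. The readings and the `summarize` helper are hypothetical and for illustration only; real devices and platforms may calculate their summaries differently.

```python
from statistics import mean

# Hypothetical raw sensor values: one systolic reading (mmHg) per day
raw_systolic_mmhg = [128, 131, 125, 140, 122, 135, 129]


def summarize(readings):
    """Reduce raw readings to simple summary statistics."""
    if not readings:
        raise ValueError("readings must not be empty")
    return {
        "count": len(readings),
        "mean": round(mean(readings), 1),
        "min": min(readings),
        "max": max(readings),
    }


summary = summarize(raw_systolic_mmhg)
print(summary)
```

A dataset may contain only the summary values, only the raw readings, or both, so it is worth checking which form a source provides before planning an analysis.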

The creation and use of PGHD offers multiple benefits to patients, caregivers, health care systems, and researchers as it complements information captured in other health care data sources and offers communication channels for greater patient care, involvement in health practice, and research.

Common issues with patient-generated health data:

  • Wearable devices generate huge amounts of data. Advances in data storage, real-time processing capabilities, and efficient battery technology are essential for making full use of wearable data.

Reference:

Duke-Margolis Institute for Health Policy. "Regulatory Fit-for-Purpose Considerations for Patient-Generated Health Data." 2024.

Liu F, Panagiotakos D. Real-world data: a brief review of the methods, applications, challenges and opportunities. BMC Med Res Methodol. 2022;22(287). doi:10.1186/s12874-022-01768-6

Electronic Health Records

Electronic health records, or medical records, are used to track events and transactions between patients and health care providers. They offer information on diagnoses, procedures, lab tests, and other services. Medical records help us measure and analyze trends in health care use, patient characteristics, and quality of care.

Medical records are usually accurate and detailed because they come from health care providers. The data are automatically collected, including information that patients might not think to add or feel comfortable sharing through other data sources like surveys.

Common issues with electronic health records:

  • Because the information is written down in a specific context, it can be misinterpreted if taken out of that context.
  • EHRs are only available for people who are able to get medical care.
  • For researchers, it can be costly, both in time and in money, to obtain medical records in the United States, especially for large-scale studies. 

Reference:

National Library of Medicine. "Finding and Using Health Statistics." 2024.

Administrative Records

Administrative records, or claims records, are another sort of electronic record, but on a much bigger scale. Claims databases collect information on doctors’ appointments, bills, insurance information, and other patient-provider communications.

Administrative data come directly from notes made by the health care provider, and the information is recorded at the time the patient sees the doctor. Researchers can use these records to analyze groups of patients with rare illnesses and medical conditions.

Common issues with administrative or claims records:

  • There may be low validity due to certain illegal billing practices, like ordering unnecessary tests or billing for services that were not provided.

Reference:

National Library of Medicine. "Finding and Using Health Statistics." 2024.

Vital Records

Vital records are collected by the National Vital Statistics System, and are maintained by state and local governments. Vital records include births, deaths, marriages, divorces, and fetal deaths. They also record information about the cause of death, or details of the birth.

Vital records are useful because they offer very detailed information and include information about rare disorders that end in death.

Common issues with vital records:

  • Records can be inconsistent and vary state by state.
  • Vital records only provide information on diseases and illnesses that end in death.

Reference:

National Library of Medicine. "Finding and Using Health Statistics." 2024.

Social Media

Social media platforms such as Facebook, Twitter and patient networks have created abundant opportunities for patients and their carers to create and exchange health-related information.

Social media data can be used meaningfully to understand patient experiences with their disease or treatment more broadly.

Common issues with social media:

  • Exploration of topics can often be limited; Twitter, for example, only allows individuals to write 280 characters.
  • Many discussions also take place in private patient forums, largely inaccessible to researchers.
  • The demographics of individuals posting on social media are rarely known. 

Reference:

McDonald L, Malcolm B, Ramagopalan S, et al. Real-world data and the patient perspective: the PROmise of social media? BMC Med. 2019;17(11). doi:10.1186/s12916-018-1247-8