The Pandemic PACT Dataset
A Review
Introduction
This document reviews the Pandemic PACT dataset. Specifically, it:
Briefly describes the Pandemic PACT dataset; and
Reviews the fields available in the Pandemic PACT dataset and matches them to the fields available in EcoHealth Alliance’s (EHA) Research Information Gateway database.
Pandemic PACT Overview
The Pandemic PACT project is an initiative led by the Pandemic Sciences Institute at the University of Oxford. It aims to monitor and analyse global research efforts in pandemic preparedness, focusing on the development of diagnostics, therapeutics, vaccines, and other countermeasures for infectious diseases.
The project curates and tracks an extensive dataset, which includes research projects, funding, and outputs from international studies. The objective is to provide insights into how resources are allocated and to identify gaps in global health security efforts. It emphasises equitable access to the benefits of research and innovation, particularly for vulnerable populations in lower-income regions.
The dataset
The data and tools developed through the Pandemic PACT project are hosted on platforms such as Figshare[1] and on the project’s website[2], enabling open access for researchers to analyse trends and contribute to pandemic readiness globally.
The Pandemic PACT dataset is distributed under the Creative Commons Attribution 4.0 International licence (CC-BY-4.0).
Curation process
The curation process for the Pandemic PACT dataset is described in the project’s protocol and summarised in Figure 1.
Source: https://www.pandemicpact.org/about/our-data
Data governance
The published Pandemic PACT protocol states the project’s data governance procedures as follows:
For the purposes of the Pandemic PACT project, we have collected data on researchers and their research outputs using names and Open Researcher and Contributor IDs (ORCIDs) (“personal” data). The University of Oxford are the ‘data controller’ for these data, which means we decide how to use it and are responsible for looking after it in accordance with the UK General Data Protection Regulation and associated data protection legislation. We share data with anyone who wishes to download and re-use the information under a CC-BY licence. We will only retain data for as long as we need it to meet our purposes, including any relating to legal, accounting, or reporting requirements. Data will be held securely in accordance with the University’s policies and procedures. Further information is available on the University’s Information Security website where information on rights in relation to personal data are explained.
Sources
Pandemic PACT lists the following funders as the sources of the information in the dataset (see Figure 2).
Source: https://www.pandemicpact.org/about/our-data
Pandemic PACT dataset fields
As of 2024-11-20, the Pandemic PACT dataset contains 19,172 grant records covering research activities relevant to diseases with pandemic potential.
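A record count like the one above can be checked against a downloaded copy of the data. The following is a minimal sketch only: the inline three-row sample is hypothetical, the identifier values are made up, and the column names are assumed to match the variable names in the protocol, which may differ from the headers in the actual Figshare or website files.

```python
import csv
import io

# Hypothetical three-row extract standing in for a downloaded file.
# Column names are assumptions based on the protocol's variable names.
SAMPLE_CSV = """PACTID,Grant Title Original,Grant Start Year
C00001,Example grant A,2020
C00002,Example grant B,2021
C00002,Example grant B,2021
"""


def count_grant_records(handle):
    """Return (number of rows, number of distinct PACTID values)."""
    ids = [row["PACTID"] for row in csv.DictReader(handle)]
    return len(ids), len(set(ids))


rows, unique_ids = count_grant_records(io.StringIO(SAMPLE_CSV))
print(rows, unique_ids)  # 3 rows, 2 distinct identifiers in this sample
```

With a real download, the `io.StringIO` object would be replaced by an open file handle. Comparing the two numbers shows whether the file holds one row per grant or repeats identifiers, which matters when quoting a record count.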
Description of Pandemic PACT dataset fields compared with EHA database
Table 1 presents the fields in the Pandemic PACT dataset[3]. The column labelled “Similar/related field to EHA database” has been added to indicate whether each Pandemic PACT field has a same, similar, or related field in the EHA database.
Variable name | Data format | Data Standard | Values | Notes | Similar/related field to EHA database |
---|---|---|---|---|---|
PACTID | string | Non-standard, assigned internally | A combination of a letter character and numbers | | Pandemic PACT-specific but EHA database has its own unique identifier |
Grant in Scope | binary | Non-standard, assigned internally | | This field indicates whether grant is in scope as per Pandemic PACT protocol | No corresponding field in EHA database |
Grant Title Original | text | Non-standard | | | This is the same as the activity field in the activities table of the EHA database |
Grant Number | text | Non-standard | | As assigned by a funder | This is the same as the grant_id field in the activities table of the EHA database |
Grant Amount Original | string | Non-standard | | | Not found in the EHA database |
Grant Currency | string | ISO 4217 code | | | Not found in the EHA database |
Currency Exchange Rate USD | numeric | Non-standard | | Calculated using API and code | Not found in EHA database |
Grant Amount Converted | numeric | Non-standard | | | Not found in the EHA database |
Grant Type | text | Non-standard | | | Not found in the EHA database |
Abstract Original | text | Non-standard | | | This is the same as the abstract field in the activities table in the EHA database |
Abstract English | text | Non-standard | | | This is the same as the abstract field in the activities table in the EHA database |
Lay Summary | text | Non-standard | | | Not found in the EHA database |
ODA funding used | binary | Non-standard, assigned internally | | Official Development Assistance (ODA) | Not found in the EHA database |
Grant Start Month | numeric | MM, ISO standard | | | Not found as is in the EHA database; related to the activity_start_date field in the activities table of the EHA database |
Grant Start Year | numeric | YYYY, ISO standard | | | Not found as is in the EHA database; related to the activity_start_date field in the activities table of the EHA database |
Grant End Month | numeric | MM, ISO standard | | | Not found as is in the EHA database; related to the activity_end_date in the activities table of the EHA database |
Grant End Year | numeric | YYYY, ISO standard | | | Not found as is in the EHA database; related to the activity_end_date in the activities table of the EHA database |
Publication Month of Award | numeric | MM, ISO standard | | | Not found in the EHA database |
Publication Year of Award | numeric | YYYY, ISO standard | | | Not found in the EHA database |
Grant Type | text | Non-standard | New Grant, Grant Extension | | Not found in the EHA database |
Study Subject | Text, Boolean | MESH Terms | Animals, bacteria, human populations, disease vectors, viruses, environment, other, unspecified, not applicable | | Similar to the target_species field in the activities table of the EHA database |
Ethnicity | text, Boolean | Standard, UK Census | Asian, Black, White, Mixed, other, unspecified, not applicable | Optional field, populate if the grant is for research involving a specific ethnic group | Not found in the EHA database |
Age Groups | Text, Boolean | MESH Terms modified | Adolescent, 13–17 yrs; Adults, 18+; Children, 1–12 yrs; Infants, 1mth–1yr; Newborn (<1mth); Older adults, 65+; Unspecified, not applicable | Optional field, populate if the grant is for research involving a specific age group | Not found in the EHA database |
Rurality | text, Boolean | MESH terms, modified | Rural population/setting, suburban population/setting, urban population/setting, other, unspecified, not applicable | Optional field, populate if the grant is for research on urban or rural populations or settings | Not found in the EHA database |
Vulnerable Populations | Text, Boolean | MESH Terms, modified | Disabled persons, drug users, Internally Displaced and Migrants, Indigenous People, Sexual and gender minorities, Prisoners, Sex workers, Smokers, Women, Pregnant women, Individuals with multi-morbidity, Minority communities unspecified, vulnerable populations unspecified, other, unspecified, not applicable | Optional field, populate if the grant is for research involving a specific vulnerable population group | Not found in the EHA database |
Occupational Groups | Text, Boolean | MESH terms modified | Farmers, Emergency Responders, Military Personnel, Social workers, Caregivers, Health Personnel, Hospital personnel, Nurses and Nursing Staff, Physicians, Dentists and dental staff, Vets, Volunteers, other, unspecified, not applicable | Optional field, populate if the grant is for research involving a specific occupational group | Not found in the EHA database |
Study Type | Text, Boolean | Non-standard | Clinical, Non-clinical, other, unspecified, not applicable | If clinical is selected, then there is an option to select a clinical trial phase and design and record this information in a new field. If non-clinical is selected, then there is an option to choose a report or literature review in a new field | Similar to activity_type field in the activities table of the EHA database |
Disease | numeric | Standard, SNOMED code | See the list of diseases at https://termbrowser.nhs.uk/ | | Similar/related to the diseases field in the topics table of the EHA database |
Pathogen | numeric | Standard, SNOMED code | See the list of diseases at https://termbrowser.nhs.uk | | Similar/related to the topic_name field of the topics table of the EHA database |
Funder | text | Standard, CrossRef Open Funder Registry | https://www.crossref.org/services/funder-registry/ | | This is the same as the funder_name of the funders table of the EHA database |
Funder Region | text | Standard, WHO region | https://en.wikipedia.org/wiki/List_of_WHO_regions | The region was assigned automatically based on the country of the funding organisation as listed in the global standard list | Similar to the funder_region field of the funders table of the EHA database but instead of WHO regions, the EHA database shows AU regions. However, the funders table has a corresponding WHO region classification |
Funder Country | numeric | ISO 3166-1 numeric | https://www.crossref.org/services/funder-registry/ | Country information was pulled from the CrossRef Open Funder Registry | The same as the who_country_id of the funders table of the EHA database |
Funder Acronym | text | Standard, CrossRef Open Funder Registry | | Acronym was pulled from the CrossRef Open Funder Registry | The same as the funder_short_name field of the funders table of the EHA database |
Investigator Title | text | Non-standard | | | Not found in the EHA database |
Investigator First Name | text | Non-standard | | | Not found as is in the EHA database; related to the researcher_name field in the researchers table of the EHA database |
Investigator Last Name | text | Non-standard | | | Not found as is in the EHA database; related to the researcher_name field in the researchers table of the EHA database |
Investigator ORCID | string | Standard, ORCID ID number | | Optional field. Researchers manually searched and entered the ORCID using the first and last name of the awardee. | This is the same as the orcid field in the researchers table in the EHA database |
ROR ID | string | Standard, ROR ID | https://ror.org/ | Research Organisation Registry (ROR ID) for research institution | Not found in the EHA database |
Institution Name | text | Standard, ROR list of research institutions | https://ror.org/ | | This is the same as the institution_name field in the institutions table of the EHA database |
Institution Country | text | Standard, ROR list of research institutions | https://ror.org/ | | This is the same as the who_country_id field of the institutions table of the EHA database |
Institution Country ISO | numeric | ISO 3166-1 numeric | https://www.iso.org/iso-3166-country-codes.html | | This is the same as the country_code field of the countries table of the EHA database |
Research Institution Region | text | Standard, WHO region | | The region was assigned by a data manager using information from the ROR list | This is the same as the who_region_name field in the countries table of the EHA database |
Partner Organisation Name | text | Non-standard | | Information on the partner organisation is added if available in the grant abstract | This is similar to the institution_name of the institutions table of the EHA database |
Research Location Country | text | Non-standard | | Information on the location of research is added if available in the grant abstract. Otherwise, we used the country where the Research Institution is based | This is similar to the country_id field in the activities table of the EHA database |
Research Location Country ISO | numeric | Standard, ISO 3166-1 numeric code | | | This is similar to the country_id field in the activities table of the EHA database |
Research Location Region | text | Standard, WHO Region | | Assigned based on the location of research if such information is available in the grant. Otherwise, we used the region where the Research Institution is based | This is similar to the who_region_name field in the activities table of the EHA database |
Tags | Text, Boolean | Non-standard | Data Management and Data Sharing, Digital Health, Innovation, Gender | The tags were assigned by the researcher who reviewed the grants | Not found as is in the EHA database; related to the fields in the topics table of the EHA database |
Research and Policy Roadmaps | Text, Boolean | Non-standard | 100 Days Mission, WHO Surveillance, ESSENCE for Health | Mapping to selected roadmaps was done by the researcher reviewing the grants | Not found in the EHA database |
Primary Research Category | string | Non-standard | 12 broad research categories, each has a list of subcategories | Researchers reviewed each grant and assigned a broad research category and subcategory. Multiple values permitted | |
Secondary Research Category | string | Non-standard | 12 broad research categories, each has a list of subcategories | | Not found as is in the EHA database; similar to but not quite the same as the activity_outputs and activity_purpose fields of the activities table of the EHA database |
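Several rows in Table 1 are marked as related rather than identical because Pandemic PACT stores grant dates as separate month and year fields while the EHA database uses single date fields. One way to bridge the two is sketched below. The function name, the choice of the first day of the month, and the fallback to January when the month is missing are all assumptions made for illustration; the format of activity_start_date in the EHA database is not stated in this review.

```python
from datetime import date
from typing import Optional


def compose_grant_date(year: Optional[int], month: Optional[int]) -> Optional[date]:
    """Combine separate year and month values into a single date.

    Assumptions (not from the protocol): the day is set to 1, and a
    missing month falls back to January. A missing year yields None.
    """
    if year is None:
        return None
    return date(year, month if month is not None else 1, 1)


# Grant Start Year = 2020, Grant Start Month = 3
print(compose_grant_date(2020, 3).isoformat())  # 2020-03-01
```

The same helper would apply to the Grant End Month and Grant End Year pair when comparing against activity_end_date.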
Of the 50 fields in the Pandemic PACT dataset, 28 (56%) have a same, similar, or related field in the EHA database.
The following fields in the EHA database have no same, similar, or related field in the Pandemic PACT dataset (see Table 2).
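The field matching in Table 1 can be held as a simple lookup, which makes the coverage figure reproducible. The sketch below is illustrative only: it encodes a ten-field subset of Table 1 rather than all 50 fields, and the names CROSSWALK, matched_count and coverage_percent are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Subset of Table 1: Pandemic PACT field -> (EHA table, EHA field),
# or None where Table 1 reports no counterpart in the EHA database.
CROSSWALK: Dict[str, Optional[Tuple[str, str]]] = {
    "Grant Title Original": ("activities", "activity"),
    "Grant Number": ("activities", "grant_id"),
    "Abstract English": ("activities", "abstract"),
    "Funder": ("funders", "funder_name"),
    "Funder Acronym": ("funders", "funder_short_name"),
    "Investigator ORCID": ("researchers", "orcid"),
    "Institution Name": ("institutions", "institution_name"),
    "Grant Currency": None,
    "Lay Summary": None,
    "ROR ID": None,
}


def matched_count(crosswalk: Dict[str, Optional[Tuple[str, str]]]) -> int:
    """Count the fields that have an EHA counterpart."""
    return sum(1 for target in crosswalk.values() if target is not None)


def coverage_percent(matched: int, total: int) -> float:
    """Share of fields with a counterpart, as a percentage."""
    return 100 * matched / total


# The full-table figure quoted in the text: 28 of 50 fields.
print(coverage_percent(28, 50))  # 56.0
# The illustrative subset above: 7 of 10 fields.
print(matched_count(CROSSWALK), len(CROSSWALK))
```

Keeping the mapping as data rather than prose also makes it straightforward to rename Pandemic PACT columns to their EHA equivalents when combining the two sources.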
Field name | Table name | Data format | Data standard | Values |
---|---|---|---|---|
activity_status | activities | text | Non-standard | Active or completed |
funder_type | funders | text | Non-standard (ontology created by EHA) | Government, Academic, Pharmaceutical, Foundation, Research Institute, Charity, Inter-governmental, International Organisation, International Non-governmental Organisation, Public-Public Partnership |
funder_email | funders | text | Non-standard | |
funder_website | funders | text | Non-standard | |
institution_type | institutions | text | Non-standard (ontology created by EHA) | University, Industry, Government Agency, Hospital, Research Institute, Charity, Partnership of Institutes, Non-Government Institution |
institution_email | institutions | text | Non-standard | |
institution_website | institutions | text | Non-standard | |
researchers | text | Non-standard | ||
website | researchers | text | Non-standard | |
Table 3 summarises, for each table in the EHA database, how many of its fields are covered by the Pandemic PACT dataset.
Table | No. of Fields | No. of Fields Covered by Pandemic PACT | % |
---|---|---|---|
activities | 33 | 33 | 100% |
au_countries | 3 | 1 | 33% |
countries | 9 | 6 | 67% |
funders | 8 | 5 | 63% |
institutions | 8 | 5 | 63% |
researchers | 19 | 17 | 89% |
topics | 6 | 6 | 100% |
who_countries | 2 | 2 | 100% |
Overall | 88 | 75 | 85% |
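The percentages in Table 3 follow directly from the two count columns, and the overall row is the sum of the per-table counts. The short sketch below recomputes them from the counts in the table; the variable and function names are illustrative.

```python
# (number of fields, number covered by Pandemic PACT), copied from Table 3.
TABLE_3 = {
    "activities": (33, 33),
    "au_countries": (3, 1),
    "countries": (9, 6),
    "funders": (8, 5),
    "institutions": (8, 5),
    "researchers": (19, 17),
    "topics": (6, 6),
    "who_countries": (2, 2),
}


def percent(covered: int, total: int) -> float:
    """Coverage as a percentage of the table's fields."""
    return 100 * covered / total


total_fields = sum(fields for fields, _ in TABLE_3.values())
total_covered = sum(covered for _, covered in TABLE_3.values())

for name, (fields, covered) in TABLE_3.items():
    print(f"{name}: {covered}/{fields} = {percent(covered, fields):.1f}%")
print(f"Overall: {total_covered}/{total_fields} = "
      f"{percent(total_covered, total_fields):.1f}%")
```

The totals come to 75 of 88 fields, which rounds to the 85% shown in the overall row.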
Footnotes
1. The data available from Figshare are closer to raw data and are structured in a wide format.
2. The data available from the website are derived from the raw Figshare datasets but are already organised in a structure that is more accessible to general users.
3. As presented in the Pandemic PACT protocol: https://wellcomeopenresearch.org/articles/9-156