Developing Genomic Data Sharing Plans

Funding requests proposing to generate large-scale genomic data are expected to include a genomic data sharing plan detailing the approach for following the NIH GDS policy. This page describes the elements that a genomic data sharing plan should include, how to address these elements, and how to submit the completed plan.

The National Institutes of Health (NIH) Genomic Data Sharing (GDS) Policy expects applicants proposing research that will generate large-scale human or non-human genomic data to submit a plan in their funding application for how genomic data will be shared (a GDS plan). 

Applicants are expected to submit a more detailed genomic data sharing plan to the funding Institute or Center (IC) through the Just-in-Time process prior to award.

GDS plans for smaller-scale genomic research proposals may be expected if the funding IC determines the project is subject to the GDS Policy based on the state of the science, the IC’s programmatic priorities, and/or the utility of the data for the research community. Please review the funding announcement and GDS Policy Expectations by NIH Institute & Center to understand whether a GDS plan is required for a particular proposal. Further questions should be addressed to the funding IC’s program officer.

In addition to providing a GDS plan in the application, an investigator should include any resources that may be needed to support the genomic data sharing plan (such as fees for data storage, submission, etc.) in the project’s budget.

Investigators may also be expected to plan for sharing other types of data and resources. Be sure to consult your funding opportunity or contact a Program Officer (extramural) or Scientific Director (intramural).

Elements to Address in a GDS Plan

The elements below outline the types of information that a GDS plan should provide.

NIH strongly encourages investigators to contact the NIH program officer listed on the funding opportunity announcement as early as possible to ensure that all requirements are being addressed.

1. Data Type
  • Explain whether the proposed research involves human data, non-human data, or both. 
  • List the type(s) of genomic data that will be shared (e.g., sequence, transcriptomic, epigenomic, and/or gene expression data) and whether it is individual-level data, aggregate-level data, or both. 
  • List any other information such as relevant associated data (e.g., phenotype or exposure data) and information necessary to interpret the data (e.g., study protocols, data collection instruments, survey tools) that the investigator anticipates sharing.

2. Data Repository
  • Identify the repository or repositories where the investigator plans to submit genomic data.
    • Human data:  
      • Studies generating human genomic and any associated phenotypic data must use an NIH-designated data repository for submission.
      • Plans must include whether the data will be available through unrestricted or controlled-access repositories.
      • If data cannot be submitted to an NIH-designated repository, see Requesting an Alternative Data Sharing Plan.
    • Non-human data:  Studies generating non-human genomic data may use any widely available repository as appropriate for the data.

Need help identifying the most appropriate repository? See Where to Submit Genomic Data.


3. Data Submission and Release Timeline
  • Provide a timeline for how genomic data will be shared in a timely manner.
    • Human data: Generally, NIH will release data no later than six months after the data have been submitted to an NIH-designated repository and cleaned, or at the time of acceptance of the first publication, whichever occurs first, without restrictions on publication or other dissemination.
    • Non-human data: Generally, investigators should make non-human data publicly available no later than the date of initial publication. However, earlier availability may be expected for certain data, depending on the funding IC.

Refer to our Data Submission and Release Expectations page for detailed information on NIH expectations for different types of data.
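The release expectations above reduce to a simple date comparison. The following Python sketch is illustrative only and is not an NIH tool: the function names are hypothetical, "six months" is assumed to mean six calendar months, and a single date is assumed to mark when the data were both submitted and cleaned. The funding IC's actual expectations take precedence.

```python
import calendar
from datetime import date
from typing import Optional


def add_months(start: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of the target month."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def latest_human_release(
    submitted_and_cleaned: date,
    first_publication_accepted: Optional[date] = None,
) -> date:
    """Latest general release date for human data (assumed interpretation).

    Six calendar months after submission and cleaning, or acceptance of the
    first publication, whichever occurs first.
    """
    six_month_mark = add_months(submitted_and_cleaned, 6)
    if first_publication_accepted is None:
        return six_month_mark
    return min(six_month_mark, first_publication_accepted)


def latest_nonhuman_release(initial_publication: date) -> date:
    """Non-human data: generally no later than the date of initial publication."""
    return initial_publication
```

A timeline in a GDS plan could then state, for example, that data submitted and cleaned on a given date would be released by `latest_human_release(...)` at the latest, with earlier release if a publication is accepted sooner.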

 
4. Institutional Review Board (IRB) Review of Institutional Certification
  • Human data only: IRB review of the investigator’s proposal for data submission is an element of the Institutional Certification, which assures that the proposal for data submission and sharing is appropriate.

Note: An Institutional Certification is expected prior to funding along with other Just-in-Time information or finalization of a contract.

 
5. Appropriate Uses of the Data
  • Describe any limitations on the use of the data.
    • These limitations should be decided by the submitting investigator and their institution, in consultation with the IRB or equivalent body. They should be based on the language in the informed consent form or the recommendations of an IRB or equivalent body.

Need help developing informed consent documents for data sharing? See our new sample language and points to consider in the resource Informed Consent for Secondary Research with Data and Biospecimens.

 
6. Statement of Designation of Genomic Summary Results (GSR)

Submitting GDS Plans

Extramural (Grants & Contracts):
  • Include a GDS plan in the Resource Sharing Plan section of the application for funding, following the instructions in the Application Guide and the funding opportunity announcement. An NIH program officer will review the plan to ensure its acceptability. 
  • A final plan should be submitted and approved by the funding IC before the award begins.

NIH Intramural:
  • Include the GDS plan in the project proposal for IC scientific leadership approval. If additional guidance is needed, contact IC scientific leadership, an IC Genomic Program Administrator (GPA), or the Office of Intramural Research.

Requesting an Alternative Data Sharing Plan

NIH acknowledges that data sharing is not always possible.  If an element of the Institutional Certification cannot be met because the sharing of human genomic data for secondary research use would be inappropriate, then the investigator should request an exception to submission. 

A detailed explanation for the exception request and an alternative mechanism for data sharing should be included in the genomic data sharing plan in the Resource Sharing Plan section of the funding application or proposal. 

The alternative data sharing plan should describe an alternative mechanism for sharing as much genomic data as allowable (for example, sharing data in a summary format). Examples of factors that may preclude submission of data include international laws, limitations in the original informed consents, or concerns about harms to individuals or groups.

Exceptions to the data sharing expectation will be considered by the funding IC on a case-by-case basis.

Extramural (Grants & Contracts):
  • If the funding IC grants an exception to submission, the research will be registered in either the Database of Genotypes and Phenotypes (dbGaP) or NCBI BioProject.
  • The reason for the exception as well as the alternative sharing plan will be described in the registration record and a reference will be provided to an alternative data sharing plan or resource, if available.

NIH Intramural:
  • The NIH Deputy Director for Intramural Research will make the final decision on the request, after the IC has made its determination.

GDS Plan Templates & Examples

GDS plans should address all of the elements noted above. However, there is no universal template requirement. 

Individual ICs may have specific template requirements. Consult GDS Policy Expectations by NIH Institute & Center for any IC-specific sharing plan requirements.
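Because there is no universal template, one informal way to check a draft against the six elements listed earlier is a small checklist script. The sketch below is an assumption-laden illustration, not an NIH-provided tool: the element keys and the `missing_elements` function are hypothetical names, and the IRB element is skipped for non-human-only projects because the page marks it "Human data only". Whether any other element can be omitted is not stated on this page, so the sketch treats the rest as always required.

```python
# Hypothetical keys for the six elements described on this page.
ALWAYS_EXPECTED = [
    "data_type",
    "data_repository",
    "submission_and_release_timeline",
    "appropriate_uses",
    "gsr_designation",
]
HUMAN_ONLY = ["irb_review_of_institutional_certification"]


def missing_elements(draft: dict, involves_human_data: bool) -> list:
    """Return the element keys that the draft leaves empty or omits."""
    expected = list(ALWAYS_EXPECTED)
    if involves_human_data:
        # Assumption: only the IRB element is conditional on human data.
        expected += HUMAN_ONLY
    return [
        name for name in expected
        if not str(draft.get(name, "")).strip()
    ]


# Usage: a partial draft for a study that generates human data.
draft = {
    "data_type": "Human; individual-level whole exome sequence data",
    "data_repository": "dbGaP (controlled access)",
    "submission_and_release_timeline": "   ",  # blank text counts as missing
}
print(missing_elements(draft, involves_human_data=True))
```

A check like this only confirms that each element has some text; it does not judge whether the content meets the funding IC's expectations.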

Below are several examples of GDS plans.

Example 1

Data from human specimens not yet collected will be shared through NIH-designated data repositories. Data generated from 800 human samples will be shared through unrestricted-access NIH-designated data repositories; individuals who do not give consent for sharing data will be excluded from the study. Genomic data include individual- and aggregate-level data from whole exome sequencing and genome-wide expression arrays. The study will be registered in dbGaP and the following data and information will be shared through the Sequence Read Archive and Gene Expression Omnibus:

  • Study documents (e.g., study protocol, manual of operations, questionnaire, and data abstraction forms)
  • Individual-level sequence data produced as part of Specific Aim 1 (i.e., files for single nucleotide polymorphisms)
  • Individual-level expression data included in the analyses under Specific Aim 2 (i.e., array data and intensity peaks)
  • Associated phenotypic data

The sequence and expression data will be shared once the data have been cleaned and quality control procedures are completed, which is expected to be completed no more than two months after the data have been generated. Data will be generated in years 1 and 2 and submitted in years 2 and 3 of the proposed study. The draft consent form provides consent for the data to be used for future research purposes and to be shared broadly through unrestricted-access databases. The Institutional Certification signed by the Institutional Signing Official will be submitted prior to award, along with any other Just-in-Time information.

The IRB advised that the sequence data produced through this award may be shared through unrestricted-access NIH-designated data repositories, consistent with data sharing under the NIH GDS Policy. The IRB will review the protocol of this project and will assure, prior to funding, that:

  • The protocol for the collection of genomic and phenotypic data is consistent with 45 CFR Part 46;
  • Data submission and subsequent data sharing for research purposes are consistent with the informed consent of study participants from whom the data were obtained;
  • Consideration was given to risks to individual participants and their families associated with data submitted to NIH-designated data repositories and subsequent sharing;
  • To the extent relevant and possible, consideration was given to risks to groups or populations associated with submitting data to NIH-designated data repositories and subsequent sharing; and
  • The investigator’s plan for de-identifying datasets is consistent with the standards outlined in the GDS Policy.

Example 2

Data are generated from human specimens collected before the effective date of the GDS Policy, and the data will be shared through NIH-designated data repositories. Genomic data will be generated from specimens that were previously collected from 2,000 study participants. The genotype and relevant phenotype data for participants will be shared through dbGaP, a controlled-access database, once the genotyping data have been cleaned, which we expect to be completed no more than two months after genotyping is finished. Submission of individual-level genome-wide genotype data produced as part of Specific Aim 1 and individual-level phenotypic data related to mood disorders included in the analyses under Specific Aim 2 is anticipated in year 2 of the proposed study.

The consent for the collection of specimens did not directly address the broad sharing of participants’ data but did denote their desire to advance science. After careful review, the IRB determined that data submission was not inconsistent with the terms outlined in the consent. The Institutional Certification, which will be provided prior to award along with any other Just-in-Time information, will include the following data use limitation (DUL): “Use of these data is limited to health/medical/biomedical purposes, which does not include the study of population origins or ancestry.”

The Institutional Review Board (IRB) advised that the genotyping data generated from 2,000 specimens may be shared through NIH-designated data repositories, consistent with data sharing under the NIH GDS Policy. The IRB has reviewed the study protocol and assures that:

  • The protocol for the collection of genomic and phenotypic data is consistent with 45 CFR Part 46;
  • Data submission and subsequent data sharing for research purposes are consistent with the informed consent of study participants from whom the data were obtained;
  • Consideration was given to risks to individual participants and their families associated with data submitted to NIH-designated data repositories and subsequent sharing;
  • To the extent relevant and possible, consideration was given to risks to groups or populations associated with submitting data to NIH-designated data repositories and subsequent sharing; and
  • The investigator’s plan for de-identifying datasets is consistent with the standards outlined in the GDS Policy.

Example 3

Data are generated from human specimens collected before the effective date of the GDS Policy, and the data cannot be shared through NIH-designated data repositories. Genomic data from more than 100 genes in the genome will be generated from specimens previously collected from 700 study participants from a small population in Africa. The consent form did not directly address the broad sharing of participants’ data or the risks associated with broad sharing of these data. Because of the small population and the lack of information in the consent form, the IRB concluded that it is not appropriate to share these individual-level data collected from existing specimens through any NIH-designated repository, and an exception to data submission is therefore being requested. Pursuing a re-consent process for these participants is not a viable option due to the time lapse between acquiring the samples and generating the data. As an alternative data sharing plan, the University has agreed to share aggregate-level data that will be submitted to dbGaP and to provide a mechanism to facilitate data sharing through direct collaborations with other investigators under appropriate IRB oversight. The aggregate-level data will include aggregated minor allele frequencies and associated p-values. Other investigators may contact the principal investigator if interested in collaborating on a project that requires use of the individual-level data. All future research participants will be asked to sign an amended consent form that is consistent with the expectation of broad data sharing.

Example 4

Data from non-human specimens will be shared through NIH-designated data repositories. The University will share individual-level genotype data from 1,500 mice by depositing these data in the Sequence Read Archive, which is an NIH-funded repository. In addition, the study protocol, manual of operations, and phenotype data will be submitted. The genotype data will be made publicly available no later than the date of initial publication, which we anticipate during year 3 of the proposed research.
