Using Genomic Data Responsibly Under the NIH Genomic Data Sharing Policy

Controlled-access data users must protect the privacy of the human participants from whom the datasets were generated. Learn about the responsibilities that come with receiving access to human genomic data from NIH.

NIH expects users of both controlled and unrestricted/open-access human genomic data to manage and secure the data in a way that protects the privacy of human participants.

NIH has issued an Implementation Update for Data Management and Access Practices Under the Genomic Data Sharing Policy (NOT-OD-24-157) for Approved Users and developers accessing, storing, or providing access to human genomic data shared under the NIH Genomic Data Sharing (GDS) Policy. These changes are intended to ensure that GDS Policy implementation continues to evolve alongside changing practices for collecting, sharing, and using controlled-access human genomic data.

Read below to learn about the responsibilities that come with receiving access to human genomic data shared under the GDS Policy from NIH.

Expectations for Approved Users of Controlled Access Data

Investigators approved to access data (i.e., Approved Users) and their institution are responsible for maintaining the confidentiality, integrity, and security of the data accessed from an NIH-designated repository according to NIH Security Best Practices for Controlled-Access Data Subject to the NIH Genomic Data Sharing (GDS) Policy. On or after January 25, 2025, Approved Users and their institution will be expected to secure data according to NIH Security Best Practices for Users of Controlled-Access Data. This update is intended to ensure that Approved Users of NIH controlled-access data, including controlled-access data under the GDS Policy, maintain such data on institutional IT systems and third-party computing infrastructures that meet the standards in National Institute of Standards and Technology (NIST) SP 800-171, Protecting Controlled Unclassified Information in Nonfederal Information Systems and Organizations.

Investigators who download unrestricted/open-access data from NIH-designated data repositories should:

  • Not attempt to identify individual human research participants from whom the data were obtained
  • Acknowledge in all oral or written presentations, disclosures, or publications the specific dataset(s) or applicable accession number(s) and the NIH-designated data repositories through which the investigator accessed any data

Expectations for Controlled Access Data Users

On or after January 25, 2025, the NIH Security Best Practices for Users of Controlled-Access Data will be included in new or renewed Data Use Certifications or similar agreements stipulating terms of access to controlled-access human genomic data.

Users approved prior to January 25, 2025, should secure data according to the NIH Security Best Practices for Controlled-Access Data Subject to the NIH Genomic Data Sharing (GDS) Policy until project close-out or renewal.

Genomic Data User Code of Conduct

Investigators who are approved by NIH to download controlled-access data agree to:

  • Use datasets only for the research project described in the approved Data Access Request for each dataset;
  • Make no attempt to identify or contact individual participants or groups from whom data were collected, or generate information that could allow participants’ identities to be discovered, without appropriate approvals from the institution that submitted the dataset to dbGaP;
  • Maintain the confidentiality of the data and not distribute them to anyone outside of those specified in the approved Data Access Request;
  • Adhere to the NIH Security Best Practices for Controlled-Access Data Subject to the NIH Genomic Data Sharing Policy and ensure that only approved users can gain access to data files;
  • Acknowledge the Intellectual Property terms as specified in the Data Use Certification Agreement;
  • Provide appropriate acknowledgement in any dissemination of research findings including the investigator(s) who generated the data, the funding source, accession numbers of the dataset, and the data repository from which the data were accessed; and,
  • Report any inadvertent data release, breach of data security, or other data management incidents in accordance with the terms specified in the Data Use Certification Agreement.

Note: If approved before January 25, 2025, Approved Users may continue adhering to the NIH Security Best Practices for Controlled-Access Data Subject to the NIH Genomic Data Sharing Policy until project renewal, at which point they will adopt the updated NIH Security Best Practices for Users of Controlled-Access Data.

If an investigator plans to use cloud computing systems to store or analyze controlled-access data, NIH expects the cloud systems to meet the standards outlined in NIH Security Best Practices for Controlled-Access Data Subject to the NIH Genomic Data Sharing (GDS) Policy until project renewal, at which point the systems must meet the updated NIH Security Best Practices for Users of Controlled-Access Data. NIH will hold the institution, not the cloud service provider, responsible for any failure in the oversight of using cloud computing services for controlled-access data.

Users must agree to abide by the terms of access in the Data Use Certification Agreement, the NIH Genomic Data User Code of Conduct, and the NIH Security Best Practices for Users of Controlled-Access Data. Violating these terms is considered a data management incident and may result in loss of access privileges to controlled-access data.

Data Use Certification Agreement

The Data Use Certification Agreement is co-signed by the investigator requesting the data, their institution (as represented by the Institutional Signing Official), and NIH. It specifies the terms for appropriate secondary research use of controlled-access data, including:

  • Using the data only for the research use stated in the approved data access request; 
  • Protecting data confidentiality;
  • Following, as appropriate, all applicable national, tribal, and state laws and regulations, as well as relevant institutional policies and procedures for handling genomic data;
  • Not attempting to identify individual participants from whom the data were obtained;
  • Not selling any of the data obtained from NIH-designated data repositories;
  • Not sharing any of the data obtained from controlled-access NIH-designated data repositories with individuals other than those listed in the data access request;
  • Agreeing to the listing of a summary of approved research uses in dbGaP along with the investigator’s name and organizational affiliation;
  • Agreeing to report any violation of the GDS Policy to the appropriate data access committee(s) as soon as it is discovered;
  • Reporting research progress using controlled-access datasets through annual access renewal requests or project close-out reports;
  • Acknowledging in all oral or written presentations, disclosures, or publications the contributing investigator(s) who conducted the original study, the funding organization(s) that supported the work, the specific dataset(s) and applicable accession number(s), and the NIH-designated data repositories through which the investigator accessed any data. 

Patenting NIH-funded Genomic Data

NIH encourages the broad use of NIH-funded genomic data in ways that are consistent with responsible management of any intellectual property that resulted from the data. For that reason, NIH encourages patents that can lead to products that address public needs and that do not hinder research. NIH discourages the use of patents to block the use of, or access to, genomic or genotype/phenotype data developed with NIH support.

In addition, naturally occurring DNA sequences are not patentable in the United States. This means that basic DNA sequence data, and related information such as genotypes, are pre-competitive. When investigators use these types of data from NIH repositories, the data as well as any conclusions that came directly from the data should remain freely accessible, without any licensing requirements.

Minimum Standard Operating Procedures for Developer Oversight

NIH is establishing minimum expectations for developer access to controlled-access data shared under the GDS Policy. Developer activities include testing platforms, pipelines, analysis tools, and user interfaces that store, manage, and interact with human genomic data from NIH controlled-access data repositories, as well as providing infrastructure development and repository maintenance, but do not include research (e.g., methods development).

Lead Developer(s) seeking access should submit a request containing a Data Use Statement (DUS) to the NIH Developer Data Access Committee (NIH Developer DAC) at DeveloperAccessDAC@od.nih.gov. For extramural developers, the Lead Developer is, for example, the Principal Investigator (PI) who is listed as the Project Director (PD) or PI on the funding application; for intramural developers, it is the developer team lead at the managing NIH ICO repository. For additional information, see Implementation Update for Data Management and Access Practices Under the Genomic Data Sharing Policy (NOT-OD-24-157).

This framework, consisting of minimum standards for developer access, complements the NIST information security standards (e.g., NIST SP 800-53 and NIST SP 800-171). It does not supersede, replace, or otherwise negate developer responsibilities under these standards.
