Writing a Data Sharing Plan

Under its 2003 data sharing policy, NIH expects investigators to submit a data sharing plan with requests for funding or grants, cooperative agreements, intramural research, contracts, or other funding agreements of $500,000 or more per year.

Data sharing plans should describe how an applicant will share their final research data. The specifics of the plan will vary on a case-by-case basis, depending on the type of data to be shared and how the investigator plans to share the data.

Examples of information to cover in a data sharing plan include:

  • The expected schedule for data sharing
  • The format of the dataset
  • The documentation to be provided with the dataset
  • Whether any analytic tools also will be provided
  • Whether a data-sharing agreement will be required. If so, consider including:
    • A brief description of such an agreement
    • Criteria for deciding who can receive the data
    • Whether or not any conditions will be placed on their use
  • The mode of data sharing (e.g., the PI could handle data sharing by mailing a disk or posting data on their institutional or personal website, or the data sharing could be handled through a data archive or enclave).
    • Investigators choosing to handle their own data sharing may wish to enter into a data-sharing agreement.

Generating large-scale genomic data? NIH’s Genomic Data Sharing (GDS) policy may also apply to your research. See our GDS Policy Overview to learn more.

Examples of Data Sharing Plans

The exact content and level of detail to be included in a data sharing plan depend on the specifics of the project, such as how the investigator is planning to share data, or the size and complexity of the dataset. The examples below give a sense of what a data sharing plan can look like.

Example 1
This application requests support to collect public-use data from a survey of more than 22,000 Americans over the age of 50 every 2 years. Data products from this study will be made available without cost to researchers and analysts. User registration is required in order to access or download files. As part of the registration process, users must agree to the conditions of use governing access to the public release data, including restrictions against attempting to identify study participants, destruction of the data after analyses are completed, reporting responsibilities, restrictions on redistribution of the data to third parties, and proper acknowledgment of the data resource. Registered users will receive user support, as well as information related to errors in the data, future releases, workshops, and publication lists. The information provided to users will not be used for commercial purposes, and will not be redistributed to third parties.

Example 2
The proposed research will include data from approximately 500 subjects being screened for three bacterial sexually transmitted diseases (STDs) at an inner city STD clinic. The final dataset will include self-reported demographic and behavioral data from interviews with the subjects and laboratory data from urine specimens provided. Because the STDs being studied are reportable diseases, we will be collecting identifying information. Even though the final dataset will be stripped of identifiers prior to release for sharing, we believe that there remains the possibility of deductive disclosure of subjects with unusual characteristics. Thus, we will make the data and associated documentation available to users only under a data-sharing agreement that provides for: (1) a commitment to using the data only for research purposes and not to identify any individual participant; (2) a commitment to securing the data using appropriate computer technology; and (3) a commitment to destroying or returning the data after analyses are completed.

Example 3
The proposed research will involve a small sample (fewer than 20 participants) with Williams syndrome, recruited from clinical facilities in the New York City area. This rare craniofacial disorder is associated with distinguishing facial features. Even with the removal of all identifiers, we believe that it would be difficult if not impossible to protect the identities of subjects given the physical characteristics of subjects, the type of clinical data (including imaging) that we will be collecting, and the relatively restricted area from which we are recruiting subjects. Therefore, we are not planning to share the data.

Example 4

Example Data Sharing Plan for FOA-XX-XXXX

What data will be shared:

I will share phenotypic data associated with the collected samples by depositing these data at ________________ which is an NIH-funded repository.  Genotype data will be shared by depositing these data at ________________.  Additional data documentation and de-identified data will be deposited for sharing along with phenotypic data, which includes demographics, family history of XXXXXX disease, and diagnosis, consistent with applicable laws and regulations.  I will comply with the NIH GWAS Policy and the funding IC’s existing policies on sharing data on XXXXXX disease genetics to include secondary analysis of data resulting from a genome-wide association study through the repository.  Meta-analysis data and associated phenotypic data, along with data content, format, and organization, will be available at ____________.  Submitted data will conform to relevant data and terminology standards.

Who will have access to the data:

I agree that data will be deposited and made available through ________________ which is an NIH-funded repository, and that these data will be shared with investigators working under an institution with a Federal Wide Assurance (FWA) and could be used for secondary study purposes such as finding genes that contribute to the process of XXXXXX.  I agree that the names and institutions of persons either given or denied access to the data, and the bases for such decisions, will be summarized in the annual progress report.  Meta-analysis data and associated phenotypic data, along with data content, format, and organization, will be made available to investigators through ____________.

Where will the data be available:

I agree to deposit and maintain the phenotypic data, and secondary analysis of data (if any), at ________________, which is an NIH-funded repository with data access policies and procedures consistent with NIH data sharing policies.

When will the data be shared:

I agree to deposit genetic outcome data into ________________ repository as soon as possible but no later than one year after the completion of the funded project period for the parent award, upon acceptance of the data for publication, or upon public disclosure of a submitted patent application, whichever is earliest.

How will researchers locate and access the data:

I agree that I will identify where the data will be available and how to access the data in any publications and presentations that I author or co-author about these data, as well as acknowledge the repository and funding source in any publications and presentations.  As I will be using ________________, which is an NIH-funded repository, this repository has policies and procedures in place that will provide data access to qualified researchers, fully consistent with NIH data sharing policies and applicable laws and regulations.

How to Submit Data Sharing Plans

The plan should be included in the Resource Sharing section of the application. See the How to Apply – Application Guide for form instructions.

Writing a Data Management and Sharing Plan

Under the 2023 Data Management and Sharing (DMS) policy, NIH expects researchers to maximize the appropriate sharing of scientific data, taking into account factors such as legal, ethical, or technical issues that may limit the extent of data sharing and preservation.

NIH requires all applicants planning to generate scientific data to prepare a DMS plan that describes how the scientific data will be managed and shared.

The DMS plan should be submitted as follows:

  • Extramural (grants): as part of the Budget Justification section of the application 
  • Extramural (contracts): as part of the technical evaluation
  • Intramural: determined by the Intramural Research Program
  • Other funding agreements: prior to the release of funds

Although the plans are submitted before research begins, if any changes occur during the award or support period that affect how data is managed or shared, investigators should update the plan to reflect the changes. It may be helpful to discuss potential changes with the Program Officer. In addition, the funding institute or center will need to approve the updated plan.

Elements to Include in a Data Management and Sharing Plan

As outlined in NIH Guide Notice Supplemental Policy Information: Elements of an NIH Data Management and Sharing Plan, DMS plans should address the following recommended elements and should be two pages or less in length.

Data Type: Briefly describe the scientific data to be managed and shared:

  • Summarize the types (for example, 256-channel EEG data and fMRI images) and amount (for example, from 50 research participants) of scientific data to be generated and/or used in the research. Descriptions may include the data modality (e.g., imaging, genomic, mobile, survey), level of aggregation (e.g., individual, aggregated, summarized), and/or the degree of data processing.
  • Describe which scientific data from the project will be preserved and shared. NIH does not anticipate that researchers will preserve and share all scientific data generated in a study. Researchers should decide which scientific data to preserve and share based on ethical, legal, and technical factors. The plan should provide the reasoning for these decisions.
  • Briefly list the metadata, other relevant data, and any associated documentation (e.g., study protocols and data collection instruments) that will be made accessible to facilitate interpretation of the scientific data.

Related Tools, Software and/or Code: Indicate whether specialized tools are needed to access or manipulate shared scientific data to support replication or reuse, and provide the name(s) of the needed tool(s) and software. If applicable, specify how needed tools can be accessed.

Standards: Describe what standards, if any, will be applied to the scientific data and associated metadata (i.e., data formats, data dictionaries, data identifiers, definitions, unique identifiers, and other data documentation).

Data Preservation, Access, and Associated Timelines: Give plans and timelines for data preservation and access, including:

  • The name of the repository(ies) where scientific data and metadata arising from the project will be archived. See Selecting a Data Repository for information on selecting an appropriate repository. 
  • How the scientific data will be findable and identifiable, i.e., via a persistent unique identifier or other standard indexing tools.
  • When the scientific data will be made available to other users and for how long. Identify any differences in timelines for different subsets of scientific data to be shared.
    • Note that NIH encourages scientific data to be shared as soon as possible, and no later than the time of an associated publication or end of the performance period, whichever comes first. NIH also encourages researchers to make scientific data available for as long as they anticipate it being useful for the larger research community, institutions, and/or the broader public.
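
The "whichever comes first" timing described above can be sketched in a few lines of Python. This is only an illustration for planning purposes, not an NIH tool: the function name latest_sharing_date and its parameters are hypothetical, and it assumes that a project with no associated publication falls back to the end of the performance period.

```python
from datetime import date
from typing import Optional


def latest_sharing_date(performance_period_end: date,
                        publication_date: Optional[date] = None) -> date:
    """Return the latest date by which data would be shared.

    Hypothetical helper: it encodes only the "publication or end of the
    performance period, whichever comes first" rule. Sharing earlier than
    this date is encouraged.
    """
    if publication_date is None:
        # Assumption: with no associated publication, the end of the
        # performance period is the only applicable date.
        return performance_period_end
    return min(publication_date, performance_period_end)


# Example dates are invented for illustration.
print(latest_sharing_date(date(2027, 6, 30), date(2026, 3, 15)))
```

Different subsets of data may have different publication dates, so a plan with several subsets could apply the same check to each one.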

Access, Distribution, or Reuse Considerations: Describe any applicable factors affecting subsequent access, distribution, or reuse of scientific data related to:

  • Informed consent
  • Privacy and confidentiality protections consistent with applicable federal, Tribal, state, and local laws, regulations, and policies
  • Whether access to scientific data derived from humans will be controlled 
  • Any restrictions imposed by federal, Tribal, or state laws, regulations, or policies, or existing or anticipated agreements
  • Any other considerations that may limit the extent of data sharing

Any potential limitations on subsequent data use should be communicated to the individuals or entities (for example, data repository managers) that will preserve and share the scientific data. The NIH IC will assess whether an applicant’s DMS plan appropriately considers and describes these factors.

Need help developing informed consent documents for data sharing? See our new sample language and points to consider in the resource Informed Consent for Secondary Research with Data and Biospecimens.

Oversight of Data Management and Sharing: Indicate how compliance with the DMS plan will be monitored and managed.

Additional Considerations

Note that funding opportunities or ICs may have specific expectations (for example: scientific data to share, relevant standards, repository selection). View a list of NIH Institute or Center data sharing policies.

Generating large-scale genomic data? NIH’s Genomic Data Sharing (GDS) policy may also apply to your research. See our GDS Policy Overview to learn more.