Writing a Data Management & Sharing Plan

Learn what NIH expects Data Management & Sharing plans to address.

https://sharing.nih.gov/faqs#/data-sharing.htm
https://sharing.nih.gov/faqs#/data-management-and-sharing-policy.htm

Writing a Data Sharing Plan

Under its 2003 data sharing policy, NIH expects investigators to submit a data sharing plan with requests for funding (grants, cooperative agreements, intramural research, contracts, or other funding agreements) of $500,000 or more per year.

Data sharing plans should describe how an applicant will share their final research data. The specifics of the plan will vary on a case-by-case basis, depending on the type of data to be shared and how the investigator plans to share the data.

Examples of information to cover in a data sharing plan include:

  • The expected schedule for data sharing
  • The format of the dataset
  • The documentation to be provided with the dataset
  • Whether any analytic tools also will be provided
  • Whether a data-sharing agreement will be required. If so, consider including:
    • A brief description of such an agreement
    • Criteria for deciding who can receive the data
    • Whether or not any conditions will be placed on their use
    • The mode of data sharing (e.g., the PI could handle data sharing by mailing a disk or posting data on their institutional or personal website, or the data sharing could be handled through a data archive or enclave).
      • Investigators choosing to handle their own data sharing may wish to enter into a data-sharing agreement.
Generating large-scale genomic data? NIH’s Genomic Data Sharing (GDS) policy may also apply to your research. See our GDS Policy Overview to learn more.

Examples of Data Sharing Plans

The exact content and level of detail to be included in a data sharing plan depends on the specifics of the project, such as how the investigator is planning to share data, or the size and complexity of the dataset. The examples below give a sense of what a data sharing plan can look like. 

Example 1
This application requests support to collect public-use data from a survey of more than 22,000 Americans over the age of 50 every 2 years. Data products from this study will be made available without cost to researchers and analysts. User registration is required in order to access or download files. As part of the registration process, users must agree to the conditions of use governing access to the public release data, including restrictions against attempting to identify study participants, destruction of the data after analyses are completed, reporting responsibilities, restrictions on redistribution of the data to third parties, and proper acknowledgment of the data resource. Registered users will receive user support, as well as information related to errors in the data, future releases, workshops, and publication lists. The information provided to users will not be used for commercial purposes, and will not be redistributed to third parties.

Example 2
The proposed research will include data from approximately 500 subjects being screened for three bacterial sexually transmitted diseases (STDs) at an inner city STD clinic. The final dataset will include self-reported demographic and behavioral data from interviews with the subjects and laboratory data from urine specimens provided. Because the STDs being studied are reportable diseases, we will be collecting identifying information. Even though the final dataset will be stripped of identifiers prior to release for sharing, we believe that there remains the possibility of deductive disclosure of subjects with unusual characteristics. Thus, we will make the data and associated documentation available to users only under a data-sharing agreement that provides for: (1) a commitment to using the data only for research purposes and not to identify any individual participant; (2) a commitment to securing the data using appropriate computer technology; and (3) a commitment to destroying or returning the data after analyses are completed.

Example 3
The proposed research will involve a small sample (fewer than 20 participants) with Williams syndrome, recruited from clinical facilities in the New York City area. This rare craniofacial disorder is associated with distinguishing facial features. Even with the removal of all identifiers, we believe that it would be difficult if not impossible to protect the identities of subjects given the physical characteristics of subjects, the type of clinical data (including imaging) that we will be collecting, and the relatively restricted area from which we are recruiting subjects. Therefore, we are not planning to share the data.

Example 4

Example Data Sharing Plan for FOA-XX-XXXX

What data will be shared:

I will share phenotypic data associated with the collected samples by depositing these data at ________________, which is an NIH-funded repository. Genotype data will be shared by depositing these data at ________________. Additional data documentation and de-identified data will be deposited for sharing along with phenotypic data, which includes demographics, family history of XXXXXX disease, and diagnosis, consistent with applicable laws and regulations. I will comply with the NIH GWAS Policy and the funding IC’s existing policies on sharing data on XXXXXX disease genetics to include secondary analysis of data resulting from a genome-wide association study through the repository. Meta-analysis data and associated phenotypic data, along with data content, format, and organization, will be available at ____________. Submitted data will conform to relevant data and terminology standards.

Who will have access to the data:

I agree that data will be deposited and made available through ________________ which is an NIH-funded repository, and that these data will be shared with investigators working under an institution with a Federal Wide Assurance (FWA) and could be used for secondary study purposes such as finding genes that contribute to process of XXXXXX.  I agree that the names and Institutions of persons either given or denied access to the data, and the bases for such decisions, will be summarized in the annual progress report.  Meta-analysis data and associated phenotypic data, along with data content, format, and organization, will be made available to investigators through ____________.

Where will the data be available:

I agree to deposit and maintain the phenotypic data, and secondary analysis of data (if any) at ________________, which is an NIH-funded repository and that the repository has data access policies and procedures consistent with NIH data sharing policies.

When will the data be shared:

I agree to deposit genetic outcome data into ________________ repository as soon as possible, but no later than one year after the completion of the funded project period for the parent award, upon acceptance of the data for publication, or upon public disclosure of a submitted patent application, whichever is earlier.

How will researchers locate and access the data:

I agree that I will identify where the data will be available and how to access the data in any publications and presentations that I author or co-author about these data, as well as acknowledge the repository and funding source in any publications and presentations.  As I will be using ________________, which is an NIH-funded repository, this repository has policies and procedures in place that will provide data access to qualified researchers, fully consistent with NIH data sharing policies and applicable laws and regulations.

How to Submit Data Sharing Plans

The plan should be included in the Resource Sharing section of the application. See the How to Apply – Application Guide for form instructions.

Writing a Data Management and Sharing Plan

Under the 2023 Data Management and Sharing (DMS) Policy, NIH expects researchers to maximize the appropriate sharing of scientific data, taking into account factors such as legal, ethical, or technical issues that may limit the extent of data sharing and preservation.

NIH requires all applicants planning to generate scientific data to prepare a DMS Plan that describes how the scientific data will be managed and shared. For more on what constitutes scientific data, see Research Covered Under the Data Management & Sharing Policy.

Applications subject to NIH’s Genomic Data Sharing (GDS) Policy should also address GDS-specific considerations within the elements of a DMS Plan (see NOT-OD-22-189 and details below). 

Submitting Data Management and Sharing Plans

The DMS Plan should be submitted as follows:

  • Extramural (grants):
    • DMS Plans should be included within the “Other Plan(s)” field on the PHS 398 Research Plan or PHS 398 Career Development Award Supplemental Form as indicated in the Application Instructions. See below for details on developing and formatting Plans.
    • A brief summary and associated costs should be submitted as part of the budget and budget justification (see Budgeting for Data Management and Sharing and the Application Instructions for details).
  • Extramural (contracts): as part of the technical evaluation
  • Intramural: determined by the Intramural Research Program
  • Other funding agreements: prior to the release of funds

Data Management and Sharing Plan Format

DMS Plans are recommended to be two pages or less in length.  

NIH has developed an optional DMS Plan format page that aligns with the recommended elements of a DMS Plan.

A preview of this format page is available now, with a final fillable format version available by Fall 2022.

Data Management and Sharing Plan Format Page

Elements to Include in a Data Management and Sharing Plan

As outlined in NIH Guide Notice Supplemental Policy Information: Elements of an NIH Data Management and Sharing Plan, DMS Plans should address the following recommended elements and are recommended to be two pages or less in length. As described in the Application Guide, the DMS Plan should be attached to the application as a PDF file. See NIH’s Format Attachments page.

1.  Data Type

Briefly describe the scientific data to be managed and shared:

  • Summarize the types (for example, 256-channel EEG data and fMRI images) and amount (for example, from 50 research participants) of scientific data to be generated and/or used in the research. Descriptions may include the data modality (e.g., imaging, genomic, mobile, survey), level of aggregation (e.g., individual, aggregated, summarized), and/or the degree of data processing.
  • Describe which scientific data from the project will be preserved and shared. NIH does not anticipate that researchers will preserve and share all scientific data generated in a study. Researchers should decide which scientific data to preserve and share based on ethical, legal, and technical factors. The plan should provide the reasoning for these decisions.
  • Provide a brief listing of the metadata, other relevant data, and any associated documentation (e.g., study protocols and data collection instruments) that will be made accessible to facilitate interpretation of the scientific data.
    For data subject to the GDS Policy:
    • Data types expected to be shared under the GDS Policy should be described in this element. Note that the GDS Policy expects certain types of data to be shared that may not be covered by the DMS Policy’s definition of “scientific data”. For more information on the data types to be shared under the GDS Policy, consult Data Submission and Release Expectations.
2.  Related Tools, Software and/or Code

Indicate whether specialized tools are needed to access or manipulate shared scientific data to support replication or reuse, and name(s) of the needed tool(s) and software. If applicable, specify how needed tools can be accessed.

3.  Standards

Describe what standards, if any, will be applied to the scientific data and associated metadata (i.e., data formats, data dictionaries, data identifiers, definitions, unique identifiers, and other data documentation).

4.  Data Preservation, Access, and Associated Timelines

Give plans and timelines for data preservation and access, including:

  • The name of the repository(ies) where scientific data and metadata arising from the project will be archived. See Selecting a Data Repository for information on selecting an appropriate repository. 
  • How the scientific data will be findable and identifiable, i.e., via a persistent unique identifier or other standard indexing tools.
  • When the scientific data will be made available to other users and for how long. Identify any differences in timelines for different subsets of scientific data to be shared.
    • Note that NIH encourages scientific data to be shared as soon as possible, and no later than the time of an associated publication or end of the performance period, whichever comes first. NIH also encourages researchers to make scientific data available for as long as they anticipate it being useful for the larger research community, institutions, and/or the broader public.
    For data subject to the GDS Policy:
    • For human genomic data:
    • For non-human genomic data:
      • Investigators may submit data to any widely used repository.
      • Non-human genomic data is expected to be shared as soon as possible, but no later than the time of an associated publication, or end of the performance period, whichever is first.
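
The "whichever comes first" timing stated above (for scientific data generally and for non-human genomic data) amounts to taking the earlier of two dates. The following is a minimal sketch for planning purposes only. The function name `latest_sharing_date` and the example dates are hypothetical and are not part of any NIH tool, and the sketch does not cover human genomic data or any additional expectations set by a funding opportunity or ICO.

```python
from datetime import date
from typing import Optional


def latest_sharing_date(performance_period_end: date,
                        publication_date: Optional[date] = None) -> date:
    """Return the latest date by which scientific data should be shared.

    Encodes the "whichever comes first" rule: the earlier of the
    associated publication date (if there is one) and the end of the
    performance period. Sharing earlier than this date is encouraged.
    """
    if publication_date is None:
        # No associated publication: the performance period end applies.
        return performance_period_end
    return min(publication_date, performance_period_end)


# Hypothetical dates, for illustration only.
period_end = date(2027, 6, 30)
print(latest_sharing_date(period_end, date(2026, 3, 15)))  # 2026-03-15
print(latest_sharing_date(period_end))                     # 2027-06-30
```

A Plan can state different timelines for different subsets of data, so in practice this calculation would be applied per dataset rather than once per project.
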
5.  Access, Distribution, or Reuse Considerations

Describe any applicable factors affecting subsequent access, distribution, or reuse of scientific data related to:

  • Informed consent
  • Privacy and confidentiality protections consistent with applicable federal, Tribal, state, and local laws, regulations, and policies
  • Whether access to scientific data derived from humans will be controlled 
  • Any restrictions imposed by federal, Tribal, or state laws, regulations, or policies, or existing or anticipated agreements
  • Any other considerations that may limit the extent of data sharing. Any potential limitations on subsequent data use should be communicated to the individuals or entities (for example, data repository managers) that will preserve and share the scientific data. The NIH ICO will assess whether an applicant’s DMS Plan appropriately considers and describes these factors. See Frequently Asked Questions for examples of justifiable reasons for limiting sharing of data.
    Expectations for human genomic data subject to the GDS Policy:
    • Informed Consent Expectations: 
      • For research involving the generation of large-scale human genomic data from cell lines or clinical specimens that were created or collected AFTER the effective date of the GDS Policy (January 25, 2015):
        • NIH expects that informed consent for future research use and broad data sharing will have been obtained. This expectation applies to de-identified cell lines or clinical specimens regardless of whether the data meet technical and/or legal definitions of de-identified (i.e., the research does not meet the definition of “human subjects research” under the Common Rule).
      • For research involving the generation of large-scale human genomic data from cell lines or clinical specimens that were created or collected BEFORE the effective date of the GDS Policy:
        • There may or may not have been consent for research use and broad data sharing. NIH will accept data derived from de-identified cell lines or clinical specimens lacking consent for research use that were created or collected before the effective date of this Policy. 
    • Institutional Certifications and Data Sharing Limitation Expectations:
      • DMS Plans should address limitations on sharing by anticipating sharing according to the criteria of the Institutional Certification.
      • In cases where it is anticipated that Institutional Certification criteria cannot be met (i.e., data cannot be shared as expected by the GDS Policy), investigators should state the Institutional Certification criteria in their DMS Plan, explain why the criteria cannot be met, and indicate what data, if any, can be shared and how to enable sharing to the maximal extent possible (for example, sharing data in a summary format). In some instances, the funding NIH ICO may need to determine whether to grant an exception to the data submission expectation under the GDS Policy.
    • Genomic Summary Results: 
      • Investigators conducting research subject to the GDS Policy should indicate in their DMS Plan if a study should be designated as “sensitive” for the purposes of access to Genomic Summary Results (GSR), as described in NOT-OD-19-023.
6.  Oversight of Data Management and Sharing

Indicate how compliance with the DMS Plan will be monitored and managed, the frequency of oversight, and by whom (e.g., title, roles). This element refers to oversight by the funded institution, rather than by NIH. The DMS Policy does not create any expectations about who will be responsible for Plan oversight at the institution.
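
Before submitting, it can help to confirm that a draft addresses each of the six recommended elements listed above. The following is a minimal sketch of such a completeness check. The names `DMS_PLAN_ELEMENTS`, `missing_elements`, and `draft_plan` are hypothetical, and the check only detects blank or absent elements; it says nothing about whether a response is adequate, which NIH program staff assess.

```python
# The six recommended element titles, as listed in this section.
DMS_PLAN_ELEMENTS = [
    "Data Type",
    "Related Tools, Software and/or Code",
    "Standards",
    "Data Preservation, Access, and Associated Timelines",
    "Access, Distribution, or Reuse Considerations",
    "Oversight of Data Management and Sharing",
]


def missing_elements(draft: dict[str, str]) -> list[str]:
    """Return the element titles that are absent or blank in a draft plan."""
    return [title for title in DMS_PLAN_ELEMENTS
            if not draft.get(title, "").strip()]


# Hypothetical, partially completed draft (element title -> draft text).
draft_plan = {
    "Data Type": "fMRI images from 50 research participants.",
    "Standards": "",  # present but still blank
}

for title in missing_elements(draft_plan):
    print("Not yet addressed:", title)
```

Keeping the element titles in one list means the same check can be reused across drafts without retyping the headings.
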

Sample Plans

NIH has provided sample DMS Plans as examples of how a DMS Plan could be completed in different contexts, conforming to the elements described above. These sample DMS Plans are provided for educational purposes to assist applicants with developing Plans but are not intended to be used as templates and their use does not guarantee approval by NIH.

Note that the sample DMS Plans provided below may reflect additional expectations established by NIH or specific NIH Institutes, Centers, or Offices that go beyond the DMS Policy. Applicants will need to ensure that their Plan reflects any additional, applicable expectations (including from NIH policies, ICO policies, or as stated in the FOA).

Sample         Description                                                NIH Institute or Center
Sample Plan A  Clinical and/or MRI data from human research participants  NIMH
Sample Plan B  Genomic data from human research participants              NIMH
Sample Plan C  Genomic data from a non-human source                       NIMH
Sample Plan D  Secondary Data Analysis                                    NIMH

Assessment of Data Management and Sharing Plans 

Program staff at the proposed NIH Institute or Center (IC) will assess DMS Plans to ensure the elements of a DMS Plan have been adequately addressed and to assess the reasonableness of those responses. Applications selected for funding will only be funded if the DMS Plan is complete and acceptable.

During peer review, reviewers will not be asked to comment on the DMS Plan nor will they factor the DMS Plan into the Overall Impact score, unless sharing data is integral to the project design and specified in the Funding Opportunity Announcement (see NOT-OD-22-189). 

If data sharing is integral to the project and tied to a scored review criterion in the funding opportunity announcement, program staff will assess the adequacy of the DMS Plan per standard procedure, but peer reviewers will also be able to view the DMS Plan attachment and may factor that information into scores as outlined in the evaluation criteria. 

For information about budget assessment by peer reviewers, see Budgeting for Data Management and Sharing.

Revising Data Management and Sharing Plans

Pre-Award Plan Revisions: If the DMS Plan provided in the application cannot be approved based on the information provided, applicants will be notified that additional information is needed. This will occur through the Just-in-Time (JIT) process. Applicants will be expected to communicate with their Program Officer and/or Grants Management Specialist to resolve any issues that prevent the funding IC from approving the DMS Plan. If needed, applicants should submit a revised DMS Plan. Refer to NIH Grants Policy Statement Section 2.5.1 Just-in-Time Procedures for additional guidance.

Post-Award Plan Revisions: Although investigators submit plans before research begins, plans may need to be updated or revised over the course of a project for a variety of reasons, for example, if the type(s) of data generated change(s), a more appropriate data repository becomes available, or the sharing timeline shifts. If any changes occur during the award or support period that affect how data are managed or shared, investigators should update the Plan to reflect the changes. It may be helpful to discuss potential changes with the Program Officer. In addition, the funding NIH ICO will need to approve the updated Plan. NIH staff will also monitor compliance with approved DMS Plans during the annual RPPR process.

Additional Considerations

Note that funding opportunities or ICs may have specific expectations (for example: scientific data to share, relevant standards, repository selection). View a list of NIH Institute or Center data sharing policies. Investigators are encouraged to reach out to program officers with questions about specific ICO requirements.

Please note that a Plan is part of an application, and, as such, an institution takes responsibility for the Plan and the rest of the application’s contents when submitting an application. Although the Plan is part of the official submission, when it is not considered during peer review the attachment is maintained as a separate “Data Management and Sharing (DMS) Plan” document in the grant folder, viewable via the Status Information screen in eRA Commons. This document is viewable by authorized users and is not part of the assembled e-Application.