FAQ for the York University Institutional RDM Strategy

November 4, 2022

In March 2021, Canada’s federal granting agencies launched the Tri-Agency Research Data Management (RDM) Policy. The policy supports Canadian research excellence by fostering sound digital data management and data stewardship practices. It includes requirements related to institutional research data management (RDM) strategies, data management plans (DMPs), and data deposit. The Tri-Agencies plan to implement the policy incrementally in consultation with research community stakeholders and in step with continuing development of RDM practices and capacity in Canada and internationally. [1] 

[1] https://science.gc.ca/site/science/en/interagency-research-funding/policies-and-guidelines/research-data-management  

The Tri-Agencies recognize that research data may take different formats and that what counts as research data is contextual: 

“Research data are data that are used as primary sources to support technical or scientific enquiry, research, scholarship, or creative practice, and that are used as evidence in the research process and/or are commonly accepted in the research community as necessary to validate research findings and results. Research data may be experimental data, observational data, operational data, third party data, public sector data, monitoring data, processed data, or repurposed data. What is considered relevant research data is often highly contextual, and determining what counts as such should be guided by disciplinary norms.” [1]  

The agencies recognize diverse models of scientific inquiry and acknowledge that differences exist in RDM standards: 

“Research data management (RDM) refers to the processes applied through the lifecycle of a research project to guide the collection, documentation, storage, sharing and preservation of research data.” [2] 

To learn more about RDM, visit: https://www.library.yorku.ca/web/research-learn/research/rdm/#rdm 

[1] Tri-Agency RDM Policy FAQ  https://science.gc.ca/site/science/en/interagency-research-funding/policies-and-guidelines/research-data-management/tri-agency-research-data-management-policy-frequently-asked-questions#1b  

[2] https://science.gc.ca/site/science/en/interagency-research-funding/policies-and-guidelines/research-data-management/tri-agency-research-data-management-policy-frequently-asked-questions#1d  

The table below pairs each stage of the research lifecycle with the RDM lifecycle activities that occur concurrently as part of the research process. 

Stage 1 
  Research Lifecycle
  • Research planning
  • Development of research proposal and submission of grant applications
  • Obtain ethics approval for research involving human participants
  RDM Lifecycle
  • Creation of RDM Plan
  • Identify data for re-use
  • Obtain participants’ consent to share data
Stage 2 
  Research Lifecycle
  • Set up and conduct research
  RDM Lifecycle
  • Create, document, protect and store data
Stage 3 
  Research Lifecycle
  • Research publication
  • Tracking of impact
  RDM Lifecycle
  • Prepare data for storage and/or sharing
  • Long term storage of data
  • Share and publish data
  • Tracking of impact

Research funders around the world, especially those in the US [1], UK, and Europe [2], have started to require DMPs and data deposit. This is a general trend, although there is some variation pertaining to funder expectations. For example, funders vary in their requirements for timing of DMP submission: some require DMPs at the application stage while others ask for the DMP after a grant is awarded. Some funders, for example, the US NIH, have expanded the scope of data management and sharing requirements to include all grant programs [3], not just their larger grants [4]. 

An ever-increasing number of academic journals have adopted data-sharing policies [5] and require authors to deposit data in trusted repositories, obtain DOIs for data sets, provide data availability statements, and cite secondary data appropriately. 

Research societies, including the American Psychological Association [6], the American Chemical Society [7], and the American Geophysical Union [8] have developed educational programs and initiatives, open data requirements, and guidance for researchers in their disciplines. These initiatives vary between societies. International networks of disciplinary data archives, such as the Consortium of European Social Science Data Archives (CESSDA) [9] were formed to provide technology and services for domain researchers across countries and infrastructures.  

The international Registry of Research Data Repositories (re3data) [11] illustrates the global adoption of data deposit through trusted data repositories. It lists 2990 research data repositories from over 85 countries, many of them international in scope.  

[1] US funders: https://sparcopen.org/our-work/research-data-sharing-policy-initiative/funder-policies/
[2] UK and Europe funders: https://www.ucl.ac.uk/library/open-science-research-support/research-data-management/policies/research-funders-research-data
[3] US NIH 2020 Data Management and Sharing Policy: https://grants.nih.gov/grants/guide/notice-files/NOT-OD-21-013.html
[4] US NIH 2003 Data Sharing Policy: https://grants.nih.gov/grants/guide/notice-files/NOT-OD-03-032.html
[5] Publisher data availability policies: https://www.chorusaccess.org/resources/chorus-for-publishers/publisher-data-availability-policies-index/
[6] https://www.apa.org/pubs/journals/resources/data-sharing
[7] https://acsopenscience.org/acs-policies/#open-data-policy
[8] https://www.agu.org/Learn-About-AGU/About-AGU/Data-Leadership
[9] https://www.cessda.eu/
[10] https://wesharedata.org/
[11] https://www.re3data.org/

The Tri-Agency RDM Policy states three requirements, one of which is for the institutions to publish their “Institutional Strategy” by March 1, 2023. According to Section 3.1 of the policy: 

“Each postsecondary institution and research hospital eligible to administer CIHR, NSERC or SSHRC funds is required to create an institutional RDM strategy and notify the agencies when it has been completed. 

The strategy must be made publicly available on the institution’s website, with contact information to which inquiries about the strategy can be directed.” 

A list of details can be found in Section 3.1 of the policy: https://science.gc.ca/site/science/en/interagency-research-funding/policies-and-guidelines/research-data-management/tri-agency-research-data-management-policy 

From the Tri-Agency FAQ: https://www.science.gc.ca/eic/site/063.nsf/eng/h_97609.html#4c  

“Research institutions have a significant role to play in supporting RDM. Developing a RDM strategy provides institutions with an opportunity to think through where gaps exist and how to address them from an institutional perspective. RDM strategies will allow institutions to develop solutions that work for them, while encouraging alignment and collaboration with other institutions. The information in these institutional strategies will help research funders and the Canadian research community gain a better understanding of RDM capacity across the country. 

The agencies will not be evaluating the strategies.” 

Section 3.1 of the Tri-Agency RDM Policy recognizes each institution’s particular circumstances and encourages the use of the strategy as a starting point of further RDM conversation and planning. The Tri-Agencies are seeking to encourage discussion at this point, and are expecting the strategies to evolve over time.  

“Having the strategies publicly available will help the agencies and the broader research community to understand institutions’ current and planned RDM capacity, challenges and needs, and will facilitate ongoing dialogue and collaboration on the advancement of RDM in Canada. 

The agencies recognize that each strategy will reflect the institution’s particular circumstances—for example, institution size, research intensity, and existing RDM capacity—but in all cases, the agencies expect high quality strategies that outline how the institution will provide its researchers with an environment that enables and supports RDM.” 

The York University Institutional RDM Strategy is a plan to develop institutional support to help researchers meet current and future RDM requirements dictated by the policy, including the creation of Data Management Plans (DMPs) and data deposit.  

The York University Open Access and Open Data Committee is leading the development of the York U RDM Strategy through broad engagement with York researchers. All researchers are encouraged to participate in the process through multiple venues and express their needs, concerns, and areas of interest, individually and collectively.  

The Tri-Agency RDM Policy is distinct from the York University Institutional RDM Strategy and will affect researchers applying for Tri-Agency grants. 

The Tri-Agency RDM Policy requirements for researchers are: 

“Data management plans: By spring 2022, the agencies will identify the initial set of funding opportunities subject to the DMP requirement. The agencies will pilot the DMP requirement in targeted funding opportunities before this date.” 

“All grant proposals submitted to the agencies should include methodologies that reflect best practices in RDM. For certain funding opportunities, the agencies will require data management plans (DMPs) to be submitted to the appropriate agency at the time of application, as outlined in the call for proposals; in these cases, the DMPs will be considered in the adjudication process.” 

“Data deposit: After reviewing the institutional strategies and in line with the readiness of the Canadian research community, the agencies will phase in the deposit requirement.” 

“Grant recipients are required to deposit into a digital repository all digital research data, metadata and code that directly support the research conclusions in journal publications and pre-prints that arise from agency-supported research. Determining what counts as relevant research data, and which data should be preserved, is often highly contextual and should be guided by disciplinary norms.” 

The Tri-Agencies have identified initial funding opportunities that will require applicants to submit DMPs on their update page here: https://science.gc.ca/site/science/en/interagency-research-funding/policies-and-guidelines/research-data-management 

Specific requirements and information about how these requirements will be adjudicated are slowly being phased in as funding opportunities are launched. However, it is important to keep in mind that research societies, academic journals, and grant agencies worldwide are encouraging the adoption of data management planning and are articulating data deposit requirements.  

Research has become increasingly collaborative and data intensive. The Tri-Agency RDM Policy was informed by a thorough study of the international RDM landscape. Many grant agencies in Europe and the US have required data management plans (DMPs) and data deposit for some years. Academic journals are also adopting data policies to encourage and require data sharing and data deposits (where ethically and legally possible). Research societies are starting to integrate responsible RDM practices into disciplinary research ethics and research integrity documents. York University’s Institutional RDM Strategy will enhance the capacity of all its researchers to contribute to and participate in the broader research ecosystem and to publish robust scholarship for global public access.

The existing Tri-Agency RDM Policy does not require that all data be made openly accessible:  

“The agencies believe that research data collected through the use of public funds should be responsibly and securely managed and be, where ethical, legal and commercial obligations allow, available for reuse by others.” [1] 

“The objective of this policy is to support Canadian research excellence by promoting sound RDM and data stewardship practices. This policy is not an open data policy.” [1] 

“The Government of Canada encourages all members of the research ecosystem to remain vigilant and to ensure that they balance collaborative research with risk and science-appropriate safeguards. This includes employing strong cybersecurity and physical security protocols. Research should be as open as possible and as safeguarded as necessary, and should be practiced in full respect of privacy, security, ethical considerations and appropriate intellectual property protections.” [2] 

[1] Tri-Agency Research Data Management Policy https://www.science.gc.ca/eic/site/063.nsf/eng/h_97610.html 

[2] Tri-Agency Research Data Management Policy FAQ section on “Safeguarding Your Research” https://www.science.gc.ca/eic/site/063.nsf/eng/h_97609.html#5 

We do not anticipate this to be the case because the following supports are integrated into the ethics process: 

1. The Office of Research Ethics’ current research protocol form already includes questions concerning data retention, data security, and data reuse; 

2. The Office of Research Ethics’ informed consent template already includes sample language for obtaining permissions for data sharing and reuse; 

3. York University Libraries has developed a “Data Retention and Deposit Guidelines for Research Involving Human Participants” https://www.library.yorku.ca/web/research-learn/research/rdm/research-data-retention-and-deposit-guidelines/ to help researchers evaluate potential risks of data sharing. 

The Institutional RDM Strategy upholds Indigenous self-determination as articulated in the preamble and in Section 3.1 of the Tri-Agency RDM Policy: 

“In line with the concept of Indigenous self-determination and in an effort to support Indigenous communities to conduct research and partner with the broader research community, the agencies recognize that data related to research by and with the First Nations, Métis, or Inuit whose traditional and ancestral territories are in Canada must be managed in accordance with data management principles developed and approved by these communities, and on the basis of free, prior and informed consent. This includes, but is not limited to, considerations of Indigenous data sovereignty, as well as data collection, ownership, protection, use, and sharing. The principles of Ownership, Control, Access and Possession (OCAP®) are one model for First Nations data governance, but this model does not necessarily respond to the needs and values of distinct First Nations, Métis, and Inuit communities, collectives and organizations. The agencies recognize that a distinctions-based approach is needed to ensure that the unique rights, interests and circumstances of the First Nations, Métis and Inuit are acknowledged, affirmed, and implemented.”[1]

On the topic of Institutional Strategy, the Tri-Agencies indicate that strategies should include items such as: 

“recognizing that data created in the context of research by and with First Nations, Métis, and Inuit communities, collectives and organizations will be managed according to principles developed and approved by those communities, collectives and organizations, and in partnership with them; 

recognizing that a distinctions-based approach is needed to ensure that the unique rights, interests and circumstances of the First Nations, Métis, and Inuit are acknowledged, affirmed, and implemented.” [1] 

Within the Tri-Agency RDM Policy FAQ, a section on “Indigenous Research” points to additional alignment between the agencies’ RDM Policy and Indigenous research principles, protocols, and ethics guidelines: 

“The policy aligns with the CARE Principles for Indigenous Data Governance (Collective benefit, Authority to control, Responsibility, and Ethics), which reflect the crucial role of data in advancing Indigenous innovation and self-determination (see Global Indigenous Data Alliance below). 

With respect to Indigenous research, the agencies acknowledge the importance of ethical considerations and refer grant recipients to the framework for the ethical conduct of research involving First Nations, Inuit, and Métis Peoples outlined in Chapter 9 of the Tri-Council Policy Statement: Ethical Conduct for Research Involving Humans (TCPS 2). Decisions to deposit and/or share Indigenous research data and knowledge should be guided by principles of research with Indigenous Peoples. 

Moving forward, the agencies plan to support the development of Indigenous RDM protocols that aim to ensure community consent, access and ownership of Indigenous data, and protection of Indigenous intellectual property rights. This next phase in advancing Indigenous RDM in Canada is outlined in Setting New Directions to Support Indigenous Research and Research Training in Canada 2019-2022.” [2] 

[1] Tri-Agency Research Data Management Policy: Preamble and Section 3.1 https://www.science.gc.ca/eic/site/063.nsf/eng/h_97609.html#2 

[2] Tri-Agency Research Data Management Policy FAQ section on “Indigenous Research” https://science.gc.ca/site/science/en/interagency-research-funding/policies-and-guidelines/research-data-management/tri-agency-research-data-management-policy-frequently-asked-questions#2  

The Tri-Agency RDM Policy invokes a contextual definition of research data and is inclusive of research data in any format or type if it serves as a primary source to support research and/or creative practice. [1] The Policy also recognizes that different disciplines, areas of research, and modes of inquiry will have different RDM standards. [2] 

Qualitative data thus falls within the scope of the Tri-Agency RDM Policy. Developing a Data Management Plan (DMP) will enable researchers who collect and analyze qualitative data to apply RDM best practices suitable for qualitative data throughout the research data lifecycle. The DMP sub-sections on data ethics, data security, data storage and backup, and data sharing and preservation are especially useful in helping researchers gather resources and balance the needs of an open and collaborative research environment with the need to safeguard research, data, and vulnerable research participants at all stages of qualitative data management. 

Although qualitative data is within the scope of the Tri-Agency RDM Policy and the DMP tool will benefit qualitative data management, the Tri-Agency RDM Policy is not an open data policy, and it does not require open data sharing for all funded research. 

“Grant recipients are not required to share their data. However, the agencies expect researchers to provide appropriate access to the data where ethical, cultural, legal and commercial requirements allow, and in accordance with the FAIR principles and the standards of their disciplines. Whenever possible, these data, metadata and code should be linked to the publication with a persistent digital identifier.” [2] 

The Tri-Agency RDM Policy recognizes that not all data is sharable because of ethical, cultural, legal, and commercial restrictions. Therefore, as with other types of research data, qualitative data may need to be destroyed or retained under secure, restricted access. For example, it may not be feasible to de-identify data when there is a high risk of re-identifying or relinking data, and/or exposure of the data might cause harm to the participants or their communities, and/or the topic of the research is highly sensitive. When considering the conditions for sharing data involving human participants, researchers can consult the “Can I Share My Data” guide, which was developed based on the TCPS 2. [3] In addition, data use and reuse agreements, national security risks, commercial contracts, intellectual property law, Indigenous data governance, and community partnership protocols could all potentially prevent qualitative (and quantitative) data from being publicly sharable. 

On the other hand, qualitative data that is well managed and prepared can be shared ethically and legally if there are no ethical, legal, cultural, or commercial concerns. See the answers to the next few questions for details. 

It is important to note that ethical considerations should guide research design and methodological choices when working with data scraped from the internet. It is impossible to adopt a ‘one size fits all’ model because every data source (for example, every social media context) is unique, and ethical considerations are grounded in the specifics of the community, the methodology and the research questions. To this end, ethical decision making is necessarily a deliberative and iterative process. [1] 

The following resources may be of assistance to researchers working with data sources of this nature: 

[1] Andrea Zeffiro and Jay Brodeur, “Social Media Research Data Ethics and Management.” Workshop presented March 5, 2020, Sherman Centre for Digital Scholarship, McMaster University, http://hdl.handle.net/11375/25327 

[2] Ryerson University Research Ethics Board, “Guidelines for Research Involving Social Media,” Ryerson University, November, 2017, https://www.torontomu.ca/content/dam/research/documents/ethics/guidelines-for-research-involving-social-media.pdf 

[3] Alex Voss, Ilia Lvov, and Sara Day Thomson, 2017, “Data Storage, Curation and Preservation” in The SAGE Handbook of Social Media Research Methods. SAGE Publications. https://ocul-yor.primo.exlibrisgroup.com/permalink/01OCUL_YOR/1jocqcq/alma991009792649705164  

UIT publishes data storage rates here: https://www.yorku.ca/uit/faculty-staff-services/server-data-storage/. Note that long-term storage needs are often unique to a project (volume, length of time, frequency and speed of access required). UIT will work with partners and community members to develop and price custom solutions that provide the desired level of performance, scalability, security, and cost-effectiveness. Custom solutions can be requested through an IT service request (https://askus.yorku.ca/portal/) or by contacting the UIT SMS team directly. 

Ethics guidance and resources for securing research are available via the “More Information” section of the Tri-Agency RDM Policy FAQ document. [12] York UIT has also developed the “Data Security Guideline: Research Involving Human Participants” [13] and an “O365/OneDrive Security and Privacy” guide to help researchers choose appropriate storage media during the active stage of a research project. [14]  

For customized research computing services, researchers are encouraged to contact UIT’s Research Computing Services or Compute Canada’s Rapid Access Service. [15] It is recommended that consultations happen prior to grant submission so that accurate costing can be included with the grant application.

For relevant RDM services and contact information at York, check out the York University Libraries’ RDM page: https://www.library.yorku.ca/web/research-learn/research/rdm/

Guide on Creating a Research Data Management Plan (DMP): https://www.library.yorku.ca/web/research-learn/research/rdm/creating-a-research-data-management-plan/ 

Additional questions? Contact the Libraries at yul_rdm@yorku.ca

Balancing data sharing requirements with common directives for research ethics and confidentiality can be achieved through careful data management planning. To consider your options for sharing data, consult the following resources: 

Risk and Security Guidelines: 

Research Data Retention and Deposit Guidelines for Research Involving Human Participants (York University Libraries) 
Data Security Guideline: Research Involving Human Participants (York University Information Technology) 
Can I share my data (Portage Network) 

Qualitative data does not always need to be destroyed. With careful preparation, certain qualitative data can be retained and deposited into trusted repositories. This important decision can be made based on the sensitivity and identifiability of the data, and the levels of data disclosure risk. Researchers may refer to the “Human Participant Research Data Risk Matrix” [3], developed by the Alliance Sensitive Data Expert Group, or York University Libraries’ “Data Retention and Deposit Guidelines for Research Involving Human Participants” [4] for guidance.   

If qualitative data can potentially be shared, either as collected or after de-identification, researchers need to obtain participants’ informed consent prior to data collection. Make sure to use the updated research ethics application protocol and consent form template from the York University Office of Research Ethics [5] or consult the guide on “Research Data Management Language for Informed Consent”. [6] Researchers will also need to factor in additional procedures and resources for data de-identification and for preparing qualitative data and metadata for deposit. [7] For research data collected prior to the announcement of the Tri-Agency RDM Policy, the Canadian Panel on Research Ethics has developed the “Guidance on Depositing Existing Data in Public Repositories”, which highlights the consent issues for researchers and Research Ethics Boards to consider. [8] 

In Canada, our national data repositories, FRDR and Borealis, are both starting to investigate sensitive data deposit options and policies. Currently, according to their Terms of Use, FRDR and Borealis cannot accept identifiable or sensitive data deposits unless public release consent has been obtained from the research participants. [10] The Alliance RDM Team has a Sensitive Data Expert Group which is working on developing guidance and tools for managing sensitive data in the Canadian context. [11] The group is investigating the potential for developing a model for a secure and trusted Canadian data repository that will allow researchers to share information about qualitative data, research instruments, and coding schemas while providing controlled access to the data itself. Researchers wishing to deposit sensitive qualitative research data may want to investigate existing disciplinary repositories such as ICPSR.  

Additional resources for managing qualitative data are as follows: 

DMP Exemplars developed by the Alliance RDM Team, in three categories: 
(1) Digital Humanities 
(2) Mixed Methods
(3) Social Sciences

Qualitative Data Repository’s “Managing Data” guide


Once it is determined that research data can be shared, researchers can select a trusted repository for data sharing and preservation. A number of commercial and not-for-profit generalist repositories exist for research data. 

A disciplinary data repository can be the best option for data to be properly catalogued and discovered by users in your discipline. You can consult the re3data directory and browse by discipline to generate a potential data repository list for your project. 

Canadian researchers also have two national research data repositories they can use to deposit their data: 

Borealis, the Canadian Dataverse Repository, is a bilingual, multidisciplinary, secure, Canadian research data repository, supported by academic libraries and research institutions across Canada. Borealis supports open discovery, management, sharing, and preservation of Canadian research data. Users can create robust metadata, track changes across versions of their datasets, create Digital Object Identifiers (DOIs), and make data open or restricted. 

The Federated Research Data Repository (FRDR) is a platform for sharing open and large data sets as well as a national data discovery engine. 

Researchers can learn more about repository choices using the following guides: 
Repository Options in Canada: A Portage Guide 
Generalist Repository Comparison Chart 
Recommended Repositories for COVID-19 Research Data 

Have questions about your data repository options? Contact us at yul_rdm@yorku.ca

To deposit data with Borealis and the York University Dataverse, please follow the instructions listed in the following guides: 

Depositing Data in the York University Dataverse: A Quick Guide 

York University Dataverse Deposit Submission Guide 

To learn more about the York University Dataverse, please consult our data deposit guidelines and collections policy: 

York University Dataverse Deposit Guidelines 

York University Dataverse Collections Policy 

See the related Q&A: https://www.library.yorku.ca/web/research-learn/research/rdm/  

Contact the Libraries at yul_rdm@yorku.ca

[1] Tri-Agency Research Data Management Policy FAQ https://science.gc.ca/site/science/en/interagency-research-funding/policies-and-guidelines/research-data-management/tri-agency-research-data-management-policy-frequently-asked-questions  

[2] Tri-Agency Research Data Management Policy https://science.gc.ca/site/science/en/interagency-research-funding/policies-and-guidelines/research-data-management  

[3] Sensitive Data Expert Group. (2020). Sensitive Data Toolkit for Researchers Part 2: Human Participant Research Data Risk Matrix. Zenodo. https://doi.org/10.5281/zenodo.4088954  

[4] York University Libraries. “Data Retention and Deposit Guidelines for Research Involving Human Participants”. https://www.library.yorku.ca/web/research-learn/research/rdm/research-data-retention-and-deposit-guidelines/ 

[5] Forms from York University Office of Research Ethics. https://yulink-new.yorku.ca/group/yulink/research-documents-forms#ORE  

[6] Sensitive Data Expert Group. (2020). Sensitive Data Toolkit for Researchers Part 3: Research Data Management Language for Informed Consent. Zenodo. https://doi.org/10.5281/zenodo.4107178  

[7] Portage COVID-19 Working Group. (2020). De-identification Guidance (Version 2). Zenodo. https://doi.org/10.5281/zenodo.4270551 

[8] Canadian Panel on Research Ethics, (2021). Guidance on Depositing Existing Data in Public Repositories. https://ethics.gc.ca/eng/depositing_depots.html  

[9] Qualitative Data Repository. “Access Control”. https://qdr.syr.edu/guidance/human-participants/access-controls  

[10] Federated Research Data Repository. https://www.frdr-dfdr.ca/repo/. Borealis, The Canadian Dataverse Repository. https://borealisdata.ca/  

[11] Sensitive Data Expert Group. https://alliancecan.ca/en/services/research-data-management/network-experts. Sensitive Data Guide. https://alliancecan.ca/en/services/research-data-management/learning-and-training/training-resources#heading-sensitive-data-guidance   

[12] “More Information” section of the Tri-Agency RDM Policy FAQ document. https://science.gc.ca/site/science/en/interagency-research-funding/policies-and-guidelines/research-data-management/tri-agency-research-data-management-policy-frequently-asked-questions#5  

[13] Data Security Guideline: Research Involving Human Participants: https://www.yorku.ca/research/data-security-guideline-research-involving-human-participants/  

[14] O365/OneDrive Security and Privacy: https://yuoffice.yorku.ca/privacy-security/  

[15] UIT’s Research Computing Services: https://www.yorku.ca/uit/faculty-staff-services/teaching-research-computing/research-computing/. Compute Canada’s Rapid Access Service: https://alliancecan.ca/en/services/advanced-research-computing/accessing-resources/rapid-access-service