One of OpenDP's use cases is the integration of OpenDP tools with public data repositories to release differentially private statistics of sensitive data hosted in the repository. Public data repositories have become ubiquitous in research and are a key part of the research lifecycle. Funders increasingly require sharing the research data associated with a funded project in a data repository when the project ends; journals increasingly require publishing the data associated with a scholarly article in a supported data repository; researchers seek recognition and credit for the datasets they have collected, cleaned, and curated, and data repositories have become the publishers of those datasets, providing a persistent citation for each dataset to be used in bibliographic references; and research communities increasingly expect transparency and open science, with access to the data associated with a scientific study to facilitate reproducibility of its results and reuse in future studies.
However, sensitive datasets remain difficult to publish in these public data repositories, so the research community often learns nothing about them. OpenDP provides an opportunity to solve this problem. Sensitive datasets could be hosted securely and strictly protected in data repositories, with access to the original data limited to only a few approved researchers, while differentially private statistical releases and queries could be made more widely available. This is what Dataverse, the open-source data repository software platform, plans to offer to its community of more than 60 data repositories worldwide. You can learn more about it at the upcoming Dataverse Community Meeting, hosted via Zoom on June 17, 18, and 19.
You don't need to use Dataverse or even be familiar with it to join. The Community Meeting is open to anybody interested in research data management, data and code sharing, metadata and standards, open science, reproducibility, sensitive data, geospatial data, and more. This year's meeting will include a plenary session on COVID-19 data sharing with recommendations from the Research Data Alliance, the National Institutes of Health, and COVID researchers, and breakout sessions on a wide range of topics of interest to the community. The breakout session on sensitive data will cover support for data enclaves, tools to recommend DataTags sensitivity levels, and how OpenDP could be used in data repositories.
For more information and to register for free, go to https://projects.iq.harvard.edu/dcm2020.
See you soon!