QDR views data curation as an interactive process between repository staff and depositors. Curation involves a series of steps that make data easily findable and maximize their potential for use and re-use. QDR offers multiple types of services related to curation, including assistance to researchers before and after they deposit data. QDR also actively seeks out unarchived qualitative and multi-method data that it deems of particular value and works with data creators to curate those data.
Preparation and Training
QDR frequently teaches data management, with a particular focus on qualitative data, to researchers and data professionals through courses at academic conferences, webinars, and similar venues. We also offer individualized data management consultations both at academic conferences and remotely. These activities allow us to inform scholars of the requirements for data deposits and increase the quality and completeness of the data projects that we receive. Training on data management covers topics such as writing effective data management plans, organizing data during and after research, and data security.
Support for data management also includes consultations on human-participants research, which empower authors to maintain the confidentiality and protect the safety of human participants without unduly restricting their ability to share the data that result from their interactions with research subjects. We advise researchers on privacy protocols and informed consent statements to be included in IRB applications.
In conducting all such training and consultations, we emphasize how managing data effectively both enhances a scholar’s own research and allows them to make their data meaningfully accessible to others.
Initial Deposit and Appraisal
As part of our deposit process, we ask researchers to provide an initial description of the data they plan to deposit. If the data fall within the scope of the types of data QDR ingests, as described in our collection development policy, we carry out the following initial consultation and appraisal steps, frequently having one or multiple phone/Skype conversations with depositors:
- Curation interview: in a curation interview we help researchers decide which files to include in their deposit (and which files to exclude). As norms of qualitative data sharing are still crystalizing, these initial conversations often lead to scholars depositing more files than they had originally anticipated, thereby better representing the research conducted.
- Project organization: in collaboration with depositors, we determine which organization will best reflect their intention and facilitate reuse of the data they wish to deposit. For qualitative data, how the data are organized has implications for how well those who access the data project can understand it. Collaboration between data creators and knowledgeable curators assures data are represented correctly and in a way that facilitates reuse.
Deposit – Data Files
Most data are deposited, together with all documentation, through QDR’s “New Data Project” interface. For highly sensitive data, we use more secure channels for data transmission as specified in our policies for handling sensitive data.
On receipt of the data, QDR stores all files in their original form for preservation. We then perform the following tasks on data files. This often involves in-depth consultation with depositors to ensure that the final data publication reflects depositors’ thinking and supports secondary usage:
- File integrity checks: QDR assures that all files can be opened correctly.
- File format conversion: QDR converts files to data formats suitable for long-term preservation.
- File naming: QDR imposes a uniform naming structure on files.
- Copyright check: QDR assesses whether files are likely to violate applicable copyright terms or licenses; when they may, QDR works with depositors to develop a solution that will allow us to publish as much of the data as legally possible.
- Disclosure risk evaluation assistance: As noted in QDR’s deposit agreement, the responsibility for possible disclosure remains with the depositor. QDR can assist depositors whose materials are in languages spoken by a QDR staff member by reviewing all files with potentially sensitive material for possible disclosure risk, and helping depositors to identify and execute a strategy to address any such risk (e.g., de-identifying data and selecting appropriate access controls, which can be applied at the file level or the project level). For depositors whose materials are in other languages, we convey repository best practices.
Deposit – Documentation and Metadata
Complete documentation and metadata should accompany each data deposit. These materials are crucial for making data findable by other researchers and for facilitating the reuse of the data. During the deposit process, data project authors provide multiple types of information that contextualize their data, aiding their reuse and reducing the risk of misinterpretation. In addition, to assist depositors in providing the full context of their data, QDR helps them to identify additional files to include as documentation, such as data management plans, de-identification protocols, and informed consent scripts.
As part of the curation process, all fields in QDR’s data catalog are automatically mapped to two metadata formats: the DataCite metadata kernel (which is deposited with DataCite and can be searched via DataCite search) and the Data Documentation Initiative (DDI) Codebook, the most important metadata format for social science data. QDR makes sure data entry conforms to best cataloging practices (e.g., using controlled vocabularies). We also work with depositors to fill any gaps in documentation or metadata.
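To illustrate what such a field mapping can look like, the following is a minimal sketch in Python and not a description of QDR’s actual implementation. The catalog field names (`doi`, `authors`, `title`, `publisher`, `year`, `keywords`) are hypothetical, and the output is a simplified dictionary whose keys loosely follow DataCite property names rather than the full DataCite schema.

```python
def to_datacite(record: dict) -> dict:
    """Map a hypothetical catalog record to a simplified DataCite-style dict."""
    # Fields assumed to be required in the (hypothetical) catalog record.
    required = ("doi", "authors", "title", "publisher", "year")
    missing = [key for key in required if not record.get(key)]
    if missing:
        raise ValueError(f"missing catalog fields: {', '.join(missing)}")

    return {
        "identifier": {"identifier": record["doi"], "identifierType": "DOI"},
        "creators": [{"creatorName": name} for name in record["authors"]],
        "titles": [{"title": record["title"]}],
        "publisher": record["publisher"],
        "publicationYear": str(record["year"]),
        "resourceType": {"resourceTypeGeneral": "Dataset"},
        # Keywords are optional; an absent list maps to no subjects.
        "subjects": [{"subject": s} for s in record.get("keywords", [])],
    }
```

Rejecting records with missing fields at mapping time mirrors the curation step of filling gaps in metadata before publication.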
Publication and Dissemination
QDR publishes data and documentation at data.stage-aws-new.qdr.org. Most data are published under QDR’s standard deposit and download agreements. These agreements stipulate that data are accessible only to registered, logged-in users, while their documentation is available without login under a Creative Commons Attribution Share-Alike (CC-BY-SA) license. Where data require additional access controls, QDR applies such controls as agreed to with the depositor and reviews any access requests.
Published data projects are assigned digital object identifiers (DOIs) which serve as permanent links and allow for reliable citation to the data.
We track re-use of data published on QDR. When data are reused, a full citation to the publication that drew on them is added to the landing page of the relevant data project, and the metadata records for the project are updated. This ensures that such reuse is captured in citation metrics such as Making Data Count.
Following best practices on data publishing, we provide a sample citation, as well as export formats for reference managers (RIS, BibTeX, EndNote XML), which facilitate citation. We frequently promote data projects via Twitter, recommend them to our users, and write about them in our publications.
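As a rough illustration of what such citation exports contain, the sketch below renders one data citation as BibTeX and as RIS. It is an assumption-laden example rather than QDR’s export code: the function names, the citation key, and the choice of the `@misc` entry type are all hypothetical.

```python
def to_bibtex(key, authors, title, year, publisher, doi):
    """Render a data citation as a BibTeX @misc entry (illustrative only)."""
    author_field = " and ".join(authors)
    return (
        f"@misc{{{key},\n"
        f"  author = {{{author_field}}},\n"
        f"  title = {{{title}}},\n"
        f"  year = {{{year}}},\n"
        f"  publisher = {{{publisher}}},\n"
        f"  doi = {{{doi}}}\n"
        f"}}"
    )


def to_ris(authors, title, year, publisher, doi):
    """Render the same citation as an RIS record of type DATA."""
    lines = ["TY  - DATA"]
    lines += [f"AU  - {name}" for name in authors]  # one AU line per author
    lines += [
        f"TI  - {title}",
        f"PY  - {year}",
        f"PB  - {publisher}",
        f"DO  - {doi}",
        "ER  - ",
    ]
    return "\n".join(lines)
```

Both functions take the same fields, so a single catalog record can feed every export format.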
QDR assures that data remain accessible in the long term. In line with our preservation policy, we keep multiple copies of all data files and automatically check them for damage (“bitrot”). We also monitor all file formats stored in QDR to protect against file-format obsolescence (the inability to open a file with available software) and migrate files to a newer format as needed.
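Automated damage checks of this kind are commonly implemented by comparing each file against a checksum recorded at ingest. The following is a minimal sketch of that general technique, not QDR’s actual tooling: the use of SHA-256, the manifest structure (relative path mapped to expected digest), and the function names are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_damaged(manifest, root):
    """Return the sorted relative paths that are missing or whose digest changed.

    `manifest` maps relative file paths to the digests recorded at ingest
    (a hypothetical structure for this example).
    """
    damaged = []
    for rel_path, expected in manifest.items():
        path = Path(root) / rel_path
        if not path.is_file() or sha256_of(path) != expected:
            damaged.append(rel_path)
    return sorted(damaged)
```

A file flagged by `find_damaged` in one storage location could then be restored from another copy whose digest still matches the manifest.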