Making Qualitative Research More Transparent

QDR Blog

QDR is Offering Complimentary Curation and Storage for COVID-19 Related Data

skarcher

Tue, 05/19/2020 - 21:35

DOI For this blog post: https://doi.org/10.59350/5znft-x4j11

The scientific response to the global pandemic has shown, among other things, the value of open science, collaboration, and data sharing. In that spirit, QDR will support efforts to share qualitative and multi-method social science data underlying COVID-19 related research.

sign for COVID-19 testing center — Photo by Colin D on Unsplash

Social scientists are contributing important insights to understanding and addressing challenges caused by the pandemic. There is a growing public list of ongoing social science work related to COVID-19. Major funders such as NSF and SSRC are soliciting research proposals as part of dedicated projects, and journals such as Perspectives on Politics are inviting manuscripts for special issues.

Guided by our mission, we at QDR believe that the impact of these efforts will be enhanced if underlying data are shared ethically wherever possible. We will work with interested researchers to devise a plan for organizing and sharing data, and will curate those data, publish them on QDR (with appropriate access controls if warranted), and preserve them for the long run.

Please contact us at qdr@syr.edu if you are interested in a consultation for your project or simply want to learn more about the repository.

archivr - Loving Data from Web Research

skarcher

Tue, 05/19/2020 - 20:42

DOI: https://doi.org/10.59350/5znft-x4j11

Welcome to Love Data Week. Every year, research data professionals from libraries, data repositories, and other organizations celebrate great ways to use data and best practices in taking care of research data. You can find lots of us tweeting using #LoveData20 or #LoveDataWeek. At QDR, we’re celebrating this year by releasing our first software tool for researchers, the R package archivr (pronounced “archiver”).

The Problem

If you have done any research using web sources, you have probably run into this issue:

Firefox I can't find this page message

You used a web resource – a blogpost, a newspaper article, a statement from an organization – but when you want to come back to it, you can no longer find it. Even if you were smart enough to save the page at the time or you used a tool like Evernote or Zotero to take a snapshot, citing the now-gone webpage is of little help to other researchers who may want to follow up on your claims about it.

A Solution

When this happens to you as a reader (e.g., when you find a cited webpage that is no longer online), you may have used the WaybackMachine, an incredibly useful tool by the Internet Archive, a non-profit organization. The WaybackMachine allows you to look up a website and find an archived copy. Webpages associated with the Love Data Week event actually provide some great examples of this: the event used to be called “Love Your Data Week” but had to drop the name, and the associated website https://loveyourdata.wordpress.com/, due to an existing trademark. While the live pages have disappeared, the Internet Archive allows us to find many archived copies of this site.

Old Love Your Data Week banner — The banner for the first Love (Your) Data Week in 2016 from the WaybackMachine.

But this only works for sites that the Internet Archive has saved automatically. Sites that are only available for a short time and/or haven’t been linked to widely are often not archived in the Internet Archive. This is particularly true for non-English sources.

This is where archivr comes in. It allows you to automatically save all URLs in a spreadsheet or a Word file to the Internet Archive or perma.cc, a similar service run by a consortium of libraries led by the Harvard Law Library. So if you, for example, listed 100 URLs you consulted in an Excel sheet, you can make sure they’re archived. If you’ve written a chapter, or even an entire dissertation, archivr will find all URLs in the text and make sure they are archived.
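To make the idea concrete, here is a minimal sketch in Python of the two steps described above: finding the URLs in a piece of text and building the Wayback Machine addresses used to look up or request an archived copy. This is not archivr's own code or interface (archivr is an R package; see its readme for actual usage). The function names are hypothetical, and the sketch only constructs the request addresses without contacting the Internet Archive.

```python
import re
from urllib.parse import quote

# Rough pattern for http(s) URLs in running text (an assumption for this
# sketch; real documents may need a more careful parser).
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_urls(text):
    """Return the distinct URLs found in text, in order of first appearance."""
    urls = []
    for match in URL_PATTERN.findall(text):
        url = match.rstrip(".,;:")  # drop sentence punctuation stuck to the URL
        if url not in urls:
            urls.append(url)
    return urls


def wayback_lookup_url(url):
    """Address of the Wayback Machine availability check for url."""
    return "https://archive.org/wayback/available?url=" + quote(url, safe="")


def wayback_save_url(url):
    """Address that asks the Wayback Machine to save a fresh copy of url."""
    return "https://web.archive.org/save/" + url


if __name__ == "__main__":
    sample = "We relied on https://loveyourdata.wordpress.com/, among others."
    for found in extract_urls(sample):
        print(found, wayback_lookup_url(found), wayback_save_url(found))
```

A real workflow would send these requests, respect the services' rate limits, and record the archived address next to each original URL.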

By using archivr, you can be sure that scholars will always have access to the web pages you relied on.

At QDR, we have been using this tool for curation for several months now. We worked together with Agile Humanities’ Ryan Deschamps in building the original prototype and have since taken over maintenance of the tool.

An Example

In her masterful book Authoritarian Apprehensions, Lisa Wedeen draws, among other things, on hundreds of web sources, many of them from Syria and thus particularly prone to disappearance. When curating data accompanying Lisa Wedeen’s book, QDR used archivr to make archived copies of all those sources, ensuring they’ll remain available. If readers of the book find a URL that is no longer working, they can simply search for it in the spreadsheet of archived URLs on QDR and find an archived copy instead.

Excel sheet with archived URLs

Using archivr

Archivr is easy to use, even for R novices. Basic installation and usage instructions are included in the readme, and detailed instructions and examples are in the built-in documentation.

If you are using archivr, please let us know what you think. If you have any feature requests or bug reports, email us at qdr@syr.edu or create an issue on the project’s GitHub repository.

NSF Wants You to Use Effective Data Practices

skarcher

Fri, 10/11/2019 - 13:37

QDR Can Help

In a recent “Dear Colleague Letter”, the National Science Foundation (NSF) encourages researchers to adopt best practices in managing research data. NSF frequently uses “DCLs” to make researchers aware of funding priorities and preferred practices, so if you are thinking of applying for NSF funding, you should pay close attention to such pronouncements.

Endorsed by US Social Science Data Repositories

The Data Preservation Alliance for the Social Sciences (Data-PASS, http://www.data-pass.org/), in which many of the social science data repositories in the US – including QDR – are organized, has strongly endorsed the NSF's efforts on this. Data repositories can be instrumental in helping researchers achieve the "effective practices for data" that NSF is looking for.

Top of NSF DCL on Effective Practices for Data

The NSF points to two specific things it wants researchers to pay close attention to:

1. Persistent IDs for Data

The most important “persistent identifier” in scholarly publishing is the Digital Object Identifier (DOI). You have likely seen DOIs, which always start with “10.” and allow you to create stable links to resources. For example, the DOI of my article on DOIs (linked below) is 10.5281/zenodo.2563130; prefix it with https://doi.org/ and you have a permanent link to the paper: https://doi.org/10.5281/zenodo.2563130. NSF wants you to share your data with a persistent identifier not just because of the stable linking, but also because persistent identifiers help link different scholarly outputs together and make their metadata accessible (I have written a slightly longer text about DOIs and their usefulness for QMMR).
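The rule for turning a DOI into a permanent link is simple enough to write down as a short sketch. The function name doi_to_url is hypothetical, and the validity check is deliberately rough: it only tests for the “10.” prefix and a slash, not the full DOI syntax.

```python
DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi):
    """Turn a DOI such as '10.5281/zenodo.2563130' into a permanent link."""
    cleaned = doi.strip()
    # Accept common written forms: a full resolver link or a 'doi:' label.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    # Rough sanity check only: DOIs start with "10." and contain a slash.
    if not cleaned.startswith("10.") or "/" not in cleaned:
        raise ValueError(f"not a DOI: {doi!r}")
    return DOI_RESOLVER + cleaned


if __name__ == "__main__":
    print(doi_to_url("10.5281/zenodo.2563130"))
```

The same function works for DOIs assigned to data projects, since they follow the same pattern.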

How QDR can help you: When you deposit data with QDR, your data will automatically receive a DOI. We go even further and provide separate DOIs for every file you deposit. This is particularly important for qualitative data, where someone may want to cite a specific interview or document that is part of your data.

Suggested citation for a QDR data project with DOI

2. Machine-Readable Data Management Plans

NSF has been requiring data management plans (DMPs) to be submitted with every grant application since 2011. Feedback from researchers suggests that NSF program officers are paying increasingly close attention to the content of these plans. In addition to this requirement, NSF now also suggests you make your DMP “machine readable,” i.e. provide it in a format/structure that is easy for computers to parse. Thankfully you don’t have to do this by yourself. If you use the DMP Tool, a very useful online tool that helps you write data management plans, your DMP will automatically be machine readable.
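As a rough illustration of what “machine readable” buys you, the sketch below stores a plan as structured JSON and then answers a question about it in code. The field names (title, datasets, repository, access) are made up for this example; they are not the DMP Tool's export format or any official schema.

```python
import json

# Hypothetical, simplified plan; every field name here is an assumption.
plan = {
    "title": "Data management plan for an interview study",
    "datasets": [
        {
            "description": "De-identified interview transcripts",
            "format": "PDF",
            "repository": "Qualitative Data Repository",
            "access": "registered users",
        },
        {
            "description": "Interview guide and codebook",
            "format": "PDF",
            "repository": "Qualitative Data Repository",
            "access": "public",
        },
    ],
}

serialized = json.dumps(plan, indent=2)


def repositories(dmp_json):
    """List the repositories named in a plan, without a human reading it."""
    parsed = json.loads(dmp_json)
    return sorted({dataset["repository"] for dataset in parsed["datasets"]})


if __name__ == "__main__":
    print(repositories(serialized))
```

A funder or repository could run the same kind of query across many plans at once, which is not practical with free-text documents.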

How QDR can help you: QDR has extensive guidance on data management and data management planning, specifically for qualitative data. We will also consult with you on your data management (at no charge) and offer a template for inclusion in your DMP if you’re planning to deposit data from your project with QDR (but please make sure to get in touch with us beforehand to make sure your data are a good fit for QDR).

And there’s more

If you’ve recently visited QDR, you may have seen that we now charge a deposit fee unless your institution is a QDR member (though we do offer waivers and are currently able to do so quite generously).

NSF previously indicated that it would cover such fees for funded research, and reiterates this in its DCL:

In some cases, PIs may have to pay a "data deposit fee" to place data in repositories that then make the data more accessible to others. A "data deposit fee" is a one-time charge paid at the time a dataset is deposited into a data repository. In exchange for this fee, repositories commit to making the data available into the future. NSF has clarified its policies on data deposit fees: these fees are allowable expenses in proposal and award budgets.

We are happy to provide you with a deposit-fee quote for your NSF budget.

DOI: https://doi.org/10.59350/5znft-x4j11

Improving QDR's Dataverse for Qualitative Data

skarcher

Fri, 06/21/2019 - 11:11

When QDR adopted the Dataverse platform in early 2017, one of our goals was to improve the software, developed primarily with quantitative data in mind, for qualitative data and the researchers using it. A little more than one year into using Dataverse software at QDR, we have made significant strides in this direction. Here is a quick overview of some of our biggest additions. I also talk about some of these in the video below.

Full Text Search

Qualitative data come in many formats – text, audio, video, images, and more – yet textual data make up by far the largest component of QDR’s (and other qualitative data repositories’) holdings. We therefore care deeply that these data are easy to find. In order to do this, we need to go beyond the description of projects: we need to allow users to search what’s inside the data files. For tabular, quantitative data, Dataverse already allows for this by extracting variable-level metadata and including it in searches. The analog for qualitative data is straightforward: users need to be able to search for text in files.

Full text search of text and PDF files is now available in QDR and, based on our contributions, in other repositories using Dataverse software such as the Harvard Dataverse. Many of the data files in QDR, however, are restricted to be viewed only by registered users (and some carry further restrictions). How do we make these findable without exposing potentially sensitive contents? Any search on QDR will only show the results the searching user has access to: a guest user’s search will mostly encompass documentation, an authenticated user’s search most files, and an admin’s search all files, in both published and unpublished projects. Full text search is available both across data projects and within any project (the .gif below shows the latter).
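The access rule can be pictured with a small sketch. It is a simplified model for illustration only, not how Dataverse implements search: the IndexedFile fields, the three role names, and the exact visibility rules are assumptions based on the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedFile:
    name: str
    text: str
    is_documentation: bool  # e.g. a readme or data narrative
    restricted: bool        # needs permission beyond registration
    published: bool


def visible_to(file, role):
    """Decide whether a file may appear in this role's search results."""
    if role == "admin":
        return True                      # all files, published or not
    if not file.published:
        return False
    if role == "authenticated":
        return not file.restricted       # most files
    if role == "guest":
        return file.is_documentation     # mostly documentation
    raise ValueError(f"unknown role: {role!r}")


def search(files, query, role):
    """Full-text search that only returns files the role can access."""
    needle = query.lower()
    return [f.name for f in files
            if visible_to(f, role) and needle in f.text.lower()]


if __name__ == "__main__":
    files = [
        IndexedFile("readme.pdf", "Protest interviews overview", True, False, True),
        IndexedFile("interview01.pdf", "Protest organizer interview", False, False, True),
        IndexedFile("draft.pdf", "Protest notes", False, False, False),
    ]
    for role in ("guest", "authenticated", "admin"):
        print(role, search(files, "protest", role))
```

The key design point is that the permission check happens before matching, so restricted text never surfaces in results for users who could not open the file.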

Multimedia Viewers

Other common file types for qualitative data are images, audio recordings, and videos. In the standard Dataverse software, the only way to view such files is to download them and then open them in a viewer on your computer: that’s a lot of steps! Moreover, as you’re looking at larger video files, this entails downloading massive files before you even know if you’ll find them useful.

QDR therefore implemented a set of lightweight viewers, which allow you to open a large variety of files (text, html, pdf, image, audio, video) in a new tab for quick viewing. They’re easily accessible from a button next to the file (see .gif below). While QDR is currently the only Dataverse installation using these viewers, their code is open and easy to run and several other repositories have already indicated that they will use them.

Benefitting from a Strong Open Source Community

While we at QDR focus our development efforts on Dataverse features that are particularly important for qualitative data, many such features also are developed by an active open source community, coordinated by the wonderful team at Harvard’s Institute for Quantitative Social Science (IQSS). Here are some of the biggest gains for us, as a repository for qualitative data, from the last year:

Data projects with many files work much better now and it is easily possible to select and download all files in such projects. Many of our projects have hundreds, some thousands of files, so this is of particular importance for qualitative data.
Individual files in QDR now have Digital Object Identifiers (DOIs), making them clearly citable. Files in qualitative data projects often make sense by themselves (think an individual interview transcript or a historical document), and we’ve had requests for this in the past.
We are now able to display and make available for download data files organized in folder structures. If you upload a ZIP file containing folders to QDR, these are automatically preserved. Given the large number of files in projects, organization such as this is particularly important for our data. See e.g. the recently published deposits from Alisha Holland and Matt Hitt for two examples of using folders effectively to organize and display data.

Are there any other areas in which you think we should do better for qualitative data? We’re always looking to hear from you by email or on twitter.

doi:10.59350/5znft-x4j11

Webinar: A Tale of Two Data Projects

skarcher

Fri, 11/30/2018 - 08:39

Data Curation at the Qualitative Data Repository

On November 7, Sebastian Karcher, QDR's associate director, and Dessi Kirilova, our curation specialist, hosted a webinar providing deep insights into QDR's curation process based on two data projects. Around sixty participants attended live and many others inquired about a recording. The recording of the webinar is now available on YouTube.

Find the slides for the webinar here

The two projects discussed in depth were:

Clarke, Killian B. 2018. "Data for: When do the dispossessed protest? Informal leadership and mobilization in Syrian refugee camps". Qualitative Data Repository. https://doi.org/10.5064/F6CN723S. QDR Main Collection. and
Loyle, Cyanne E.; Davenport, Christian; Sullivan, Christopher. 2018. "Association for Legal Justice (ALJ) Human Rights Testimony, Northern Ireland". Qualitative Data Repository. https://doi.org/10.5064/F6LHMHJR. QDR Main Collection.

Sebastian also discussed QDR's recently launched institutional membership and what it offers to member institutions. Find out more about institutional membership here, and please be in touch with any questions.

(updated on 2018-12-04 to reflect publication of 2nd data project)

Announcing the Annotation for Transparent Inquiry (ATI) Initiative

We are excited to announce the first round of published projects from our Annotation for Transparent Inquiry (ATI) Initiative. ATI creates a digital overlay on top of articles generated through qualitative and multi-method research published on journal web pages. That overlay connects specific passages of text to author-generated annotations.

ATI’s annotations include ‘analytic notes’ discussing data generation and analysis, excerpts from data sources, and links to those sources stored in trusted digital repositories. Readers are able to view annotations immediately alongside the main text, removing the need to jump to footnotes or separate appendices. Sharing the data sources via a secure repository ensures that they are findable, accessible, interoperable, reusable, and preserved for the long term, and that human participants are protected.

In a partnership with Cambridge University Press and Hypothesis (the non-profit tech company on whose software ATI is built), and with generous funding from the Robert Wood Johnson Foundation (RWJF) and the National Science Foundation, QDR assembled an interdisciplinary group of scholars to annotate a journal article that each had recently published.

Participating scholars were drawn from diverse disciplines including anthropology, linguistics, political science, and public health. The annotated articles are published in some of the leading journals in these disciplines, including the American Political Science Review, International Organization, English Language & Linguistics and the Journal of Biosocial Science.

Screenshot of article text and annotation — Joey O’Mahoney annotated his article on the normative underpinnings of the UK’s position during the Bangladesh Liberation War. ATI enables him to bring the decision-making process within the British government to life by annotating his account with extensive excerpts and linking to digital copies of memos, telegrams, and similar materials that provide a contemporary, first-hand account and underpin O’Mahoney’s carefully constructed empirical argument.

In a workshop this February, the authors met with a group of reviewers--as well as representatives from QDR, Hypothesis, Cambridge University Press, and RWJF--to discuss their ATI data projects and the future of ATI. Cambridge University Press has generously agreed to make all participating papers freely available until March 2019. We encourage you to browse through all of these projects.

Article Citation	ATI Link
Arrey, Agnes Ebotabe, Johan Bilsen, Patrick Lacor, and Reginald Deschepper. 2017. “Perceptions of Stigma and Discrimination in Health Care Settings towards Sub-Saharan African Migrant Women Living with HIV/AIDS in Belgium: A Qualitative Study.” Journal of Biosocial Science 49 (5): 578–96. https://doi.org/10.1017/S0021932016000468	Click here
Gans-Morse, Jordan. 2017. “Demand for Law and the Security of Property Rights: The Case of Post-Soviet Russia.” American Political Science Review 111 (2): 338–59. https://doi.org/10.1017/S0003055416000691.	Click here
Moorlock, Greg, James Neuberger, Simon Bramhall, and Heather Draper. 2016. “An Empirically Informed Analysis of the Ethical Issues Surrounding Split Liver Transplantation in the United Kingdom.” Cambridge Quarterly of Healthcare Ethics 25 (3): 435–47. https://doi.org/10.1017/S0963180116000086	Click here
O’Mahoney, Joseph. 2017. “Making the Real: Rhetorical Adduction and the Bangladesh Liberation War.” International Organization 71 (2): 317–48. https://doi.org/10.1017/S0020818317000054	Click here
Pierskalla, Jan, Alexander De Juan, and Max Montgomery. 2017. “The Territorial Expansion of the Colonial State: Evidence from German East Africa 1890–1909.” British Journal of Political Science, FirstView, 1–27. https://doi.org/10.1017/S0007123416000648	Click here
Skarbek, David. 2016. “Covenants without the Sword? Comparing Prison Self-Governance Globally.” American Political Science Review 110 (4): 845–62. https://doi.org/10.1017/S0003055416000563	Click here
Smith, Jennifer, and Sophie Holmes-Elliott. 2017. “The Unstoppable Glottal: Tracking Rapid Change in an Iconic British Variable.” English Language & Linguistics, January, 1–33. https://doi.org/10.1017/S1360674316000459	Click here
Tidy, Joanna. 2017. “Visual Regimes and the Politics of War Experience: Rewriting War ‘from above’ in WikiLeaks’ ‘Collateral Murder.’” Review of International Studies 43 (1): 95–111. https://doi.org/10.1017/S0260210516000164	Click here

ATI projects connected to eight articles published in journals from other publishers were also discussed at the workshop, and will be made available in the near future.

Screenshot of article text and annotation for Smith/Holmes-Elliott — As sociolinguists, Jennifer Smith and Sophie Holmes-Elliott face particular challenges in representing the evidence they use when writing their articles: Their principal concern is the *spoken* language, yet the traditional article format requires them to reduce recorded spoken language to transcripts. Their article investigates the ways glottal replacement (the shift from [t] to [ʔ] as in “I was a bi[ʔ] annoyed a[t] i[ʔ].”) has spread in the Scottish fishing town of Buckie. Using ATI, the authors were able to link both recorded sound samples and extended transcripts to their text. These materials support their empirical claims and illustrate them more vividly to readers than short transcripts could.

As part of the ATI Initiative, we also solicited feedback from all participating authors and formal evaluations by the reviewers. Our goal is to understand where and how ATI can be employed most fruitfully, and how we can support researchers’ workflows to ease the additional workload of generating ATI annotations. We expect to present the insights we gained about ATI through the workshop and the generous input of participating scholars later this year.

ATI Challenge

We are now inviting faculty and advanced graduate students in the social and health sciences, humanities, law, and other disciplines that employ qualitative data and methods, to submit proposals to participate in the ATI Challenge.

ATI Challenge participants will receive an award of $2,000 (as either an honorarium or research support), subject to any relevant tax and visa status limitations. They will also attend a workshop in New York City in November 2018, where they will meet and network with other scholars who share a commitment to making their research accessible and evaluable. All travel and accommodation expenses will be covered by QDR.

The proposal deadline is May 31, 2018 (extended from the original May 11 deadline). Additional information can be found here. Please contact QDR (qdr@syr.edu) with questions.

By skarcher on Wed, 04/25/2018 - 17:05