AkLA 2017 Crowdsourcing Your Special Collection: Home

Harness the power of the Zooniverse to analyze photos/texts in your collections - classify, count subjects of interest, etc.

Introduction

Almost every library, regardless of type, has a backlog of photos, diaries, or other unique materials. Some libraries have digitized parts of these collections and placed them on the web, but even they struggle to provide adequate description or transcription. There is often no money or staff to enhance these unique offerings.

Help is at hand. The open source organization Zooniverse offers a free platform to turn your digitized objects into data for the world to use. No programming required. Read on for more. 

Examples of What Can be Done

The software behind Zooniverse powers a number of projects, most of them science related. Here are three projects in the humanities that were active as of February 2017: 

Sources of Help

Whether you are building or managing a Zooniverse project, there are places with help for you.

More Questions?

If you have questions or comments after exploring this guide, feel free to use the "Email Me" button in the Guide Author block below to contact me. 

Before You Begin

Some things to think about before you start building a Zooniverse project:

  • Is there interest in your material? You can only crowdsource if there is a crowd. Putting low-interest materials into a project may result in disappointment.
  • What is your end product? (A classified collection, transcribed journals, etc?) Where will that product live? Zooniverse isn't a place to display your finished product.
  • Is your digitized collection restricted or under someone else's copyright? Zooniverse's terms of use require that project data be available for public use and that you post only items that are under your own copyright, in the public domain, or under a Creative Commons license.
  • Do you have partners in your community or in your field that can help you publicize the project?
  • Can someone devote a few hours each month to promoting the project? Volunteers need to be appreciated and encouraged.

Zooniverse Limits

Understanding limits to the Zooniverse platform can help avoid frustration later. Limits I've been able to identify include:

  • File size of individual images should be limited to 300K or less. Larger than this and they won't display. Photoshop Elements and other photo software can be used to batch resize a group of photos.
  • The self-service version of Zooniverse allows you to upload 10,000 subjects - ever. Deleted subjects count toward your quota. So if you upload 5,000 photos, delete them, and then upload another 5,000, you have reached your quota. If you have more than 10,000 items to classify, talk to the Zooniverse team. For serious subjects they'll often give you more space.
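
Both limits above can be checked before you upload anything. The following is a minimal Python sketch, not a Zooniverse tool: the function names `find_oversized` and `remaining_quota` are made up for illustration, and reading "300K" as 300 × 1024 bytes is an assumption.

```python
from pathlib import Path

MAX_BYTES = 300 * 1024   # assumption: "300K" read as 300 KiB
SUBJECT_QUOTA = 10_000   # self-service upload limit described above


def find_oversized(folder, max_bytes=MAX_BYTES):
    """Return the sorted names of files in `folder` larger than max_bytes."""
    return sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_file() and p.stat().st_size > max_bytes
    )


def remaining_quota(total_ever_uploaded, quota=SUBJECT_QUOTA):
    """Subjects you can still upload; deleted subjects still count as uploaded."""
    return max(quota - total_ever_uploaded, 0)
```

Any file names returned by `find_oversized` are the ones to resize before uploading. Calling `remaining_quota(5000 + 5000)` reproduces the example above: two uploads of 5,000 leave nothing, even if the first batch was deleted.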

Building a Project

  1. Register for Zooniverse if you haven't already. Go to https://www.zooniverse.org and click on "Register" in the upper right hand corner. You'll be asked for a username, password, email address and real name. Once you're registered, you may add a profile picture if you wish, but it is not required. 
  2. Click on "build a project", then how-to. Read over the Zooniverse Project Building Guide before you start.
  3. Gather your images (each one should be less than 300K) and your file listing (a spreadsheet). Together they will be your "subject set".
  4. Once you start your project, the main sections are:
    1. Defining your project - here you're describing your project in terms that will encourage visitors to want to help you. 
    2. Build a workflow (or more) - this is a set of linked questions that you want visitors to answer about every image. Shorter workflows tend to get more volunteers. Zooniverse lets you test your workflow before it goes live.
    3. Link the workflows together
    4. Upload your subject sets (the images and spreadsheet you got together above)
  5. You should now be ready to let people start working on your project. Make sure that your visibility is set to "public" once you're ready. 
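
The file listing from step 3 can be typed by hand in a spreadsheet, but for a large folder it may be quicker to generate a starting point. The sketch below is one possible way to do that in Python. The column names "filename" and "description", the list of image extensions, and the function name `write_manifest` are all assumptions for illustration; check the Zooniverse Project Building Guide for what your project actually needs.

```python
import csv
from pathlib import Path

# Assumption: which file extensions count as images in your folder.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif"}


def write_manifest(image_folder, manifest_path):
    """Write a CSV with one row per image file; return the number of rows."""
    names = sorted(
        p.name
        for p in Path(image_folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    with open(manifest_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        # Hypothetical column names; add whatever metadata columns you need.
        writer.writerow(["filename", "description"])
        for name in names:
            writer.writerow([name, ""])  # description left blank to fill in by hand
    return len(names)
```

The resulting CSV opens in Excel, where the blank description column can be filled in before the images and spreadsheet are uploaded together.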

Getting Your Data

Once people have been classifying for a while, you'll have data to download. You get it by logging in to the Zooniverse project builder, selecting your project, and clicking on Data Exports. You'll get a screen like this:

Zooniverse Data Export options

Usually you'll want a new classification export. Zooniverse will send you a download link within a day, and the resulting file will look like the comma-separated values (CSV) file below:

When you open the file in Excel or another spreadsheet program, the data will look a bit strange, but you can figure some things out by looking at the "annotations" and "subject_data" columns. 

The "annotations" column will have entries like this:

[{"task":"T0","task_label":"How many cats?","value":"1-5"},{"task":"T1","task_label":"How Cute are these kittens?","value":"3"}]

This gives you the answers from one individual user. The next cell over, in the "subject_data" column, tells you what image the answers were for:

{"5722363":{"retired":null,"link":"https://www.flickr.com/photos/htakashi/5408115862","origin":"Flickr","license":"Creative Commons - share adapt attribute","subject_id":"28","attribution":"Takashi 
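
Both cells are JSON text, so they can be read by a program as well as by eye. The following Python sketch shows one way to pull out the answers and the subject. The "annotations" text is the example above; the "subject_data" text is a shortened stand-in built only from fields visible above, because the real cell is cut off here. Treating the outer key ("5722363") as the identifier of the image is an assumption, and the function names are made up for illustration.

```python
import json

# The "annotations" cell shown above, copied as-is.
annotations_cell = (
    '[{"task":"T0","task_label":"How many cats?","value":"1-5"},'
    '{"task":"T1","task_label":"How Cute are these kittens?","value":"3"}]'
)

# Shortened stand-in for a "subject_data" cell (not the full original cell).
subject_cell = '{"5722363":{"retired":null,"origin":"Flickr","subject_id":"28"}}'


def answers_by_task(cell):
    """Map each task ID (such as "T0") to the value the volunteer chose."""
    return {entry["task"]: entry["value"] for entry in json.loads(cell)}


def subject_key(cell):
    """Return the outer key of a subject_data cell (assumed to identify the image)."""
    return next(iter(json.loads(cell)))


print(answers_by_task(annotations_cell))
print(subject_key(subject_cell))
```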

To find all of the classifications for a given subject, you can use Excel's filter function to look at just the answers for that one subject. Usually a majority of answers will be the same. That's the classification/answer you use for that subject.

For large subject sets, you may wish to find someone with Python programming experience to do quick mass summaries of your gathered data. But it is possible to use this manual spreadsheet method.
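
As a rough idea of what such a Python summary might look like, here is a minimal sketch of the majority-answer approach described above. It assumes the export has "annotations" and "subject_data" columns laid out like the examples in this guide, and that the outer key of "subject_data" identifies the subject. The function name `majority_answers` and the default task "T0" are illustrative choices, not part of Zooniverse.

```python
import csv
import json
from collections import Counter, defaultdict


def majority_answers(csv_file, task="T0"):
    """Return {subject key: (most common answer, votes for it, total votes)}."""
    votes = defaultdict(Counter)
    for row in csv.DictReader(csv_file):
        # Assumption: the outer key of subject_data identifies the subject.
        subject = next(iter(json.loads(row["subject_data"])))
        for entry in json.loads(row["annotations"]):
            if entry["task"] == task:
                # str() keeps list-style answers countable as well.
                votes[subject][str(entry["value"])] += 1
    result = {}
    for subject, counter in votes.items():
        answer, count = counter.most_common(1)[0]
        result[subject] = (answer, count, sum(counter.values()))
    return result
```

To use it, open the downloaded export and pass the open file in, for example `with open("classifications.csv", newline="", encoding="utf-8") as f: summary = majority_answers(f)`, where the file name is hypothetical. Keeping the vote counts next to each answer makes it easy to spot subjects where volunteers disagreed and a human should take a second look.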