Amazon’s Mechanical Turk is a service that lets you hire hundreds of people from across the world to tag photos, complete surveys, or find websites. According to Wikipedia:
The Amazon Mechanical Turk (MTurk) is one of the suite of Amazon Web Services, a crowdsourcing marketplace that enables computer programs to co-ordinate the use of human intelligence to perform tasks which computers are unable to do. Requesters, the human beings that write these programs, are able to pose tasks known as HITs (Human Intelligence Tasks), such as choosing the best among several photographs of a storefront, writing product descriptions, or identifying performers on music CDs. Workers (called Providers in Mechanical Turk’s Terms of Service) can then browse among existing tasks and complete them for a monetary payment set by the Requester. To place HITs, the requesting programs use an open Application Programming Interface, or the somewhat limited Mturk Requester site.
Why would economists find this service useful? An example from my own work might help. I am collecting data on all the individuals employed by new VC firms founded between 1992 and 2007. I need demographic information and employment histories for each VC partner. After scraping the web for the VC firm websites and “team pages,” I have a set of web locations where a person can find the online biography of each of some 3,000 VC partners. I submit a job to Mechanical Turk that asks the Turk’er to go to the website, find the individual’s biography, and copy and paste the text. Further jobs could ask the Turk’er to read the bio and answer questions like: 1) Does this person have an MBA or PhD? 2) Male or female? 3) Founder of the firm? Unfortunately, submitting HITs to the Turk system is somewhat difficult. Enter Smartsheet.
Smartsheet as a Frontend to Mechanical Turk
Although the Mechanical Turk service has its own system for submitting HITs, it is a little cumbersome and requires some strange formatting steps. If you value your time even a little (say, $10/hour), I recommend SmartSourcing, a service from the project management webapp Smartsheet. Again, I refer to my research. I created an Excel file with columns like “Name”, “Firm Name”, “Website” and an empty column “Biography.” I upload this file to Smartsheet, select the rows for which I need the biography filled in, and walk through the SmartSourcing steps. In about 3 minutes I have submitted a HIT to thousands of workers, and it will be complete in 12 hours. I can approve or reject the responses while I watch them populate the online spreadsheet.
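If your list of partners already lives in a script rather than in Excel, the spreadsheet described above can be sketched in a few lines of Python. This is only an illustration: it writes CSV text instead of an Excel file (on the assumption that your import step accepts CSV), and the function name `build_sheet` and the sample row are hypothetical.

```python
import csv
import io

# Column headers taken from the post; "Biography" is left empty for workers to fill in.
COLUMNS = ["Name", "Firm Name", "Website", "Biography"]


def build_sheet(partners):
    """Return CSV text with one row per partner and an empty Biography cell.

    `partners` is a list of dicts with the (hypothetical) keys
    "name", "firm", and "website".
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for partner in partners:
        writer.writerow({
            "Name": partner["name"],
            "Firm Name": partner["firm"],
            "Website": partner["website"],
            "Biography": "",  # to be filled in by a worker
        })
    return buffer.getvalue()


# Made-up example row, for illustration only.
partners = [
    {"name": "Jane Doe", "firm": "Example Ventures", "website": "http://example.com/team"},
]
sheet_text = build_sheet(partners)
```

Writing `sheet_text` to a file gives you something to upload; the empty last column is what the workers end up filling.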
Such a service is not free. For a $9.95/month fee (or $99/year; ask them for a non-profit coupon), you get access to SmartSourcing. Then, on top of the standard Turk fees, you pay Smartsheet’s charges:
Any paid Smartsheet subscriber has access to the Crowdsourcing feature. Monthly charges include Amazon fees plus the cost for work performed (number of tasks completed * cents paid per task) and a low Smartsheet processing fee ($.01 + 10% per task completed – usually $10-$30 per 1,000 tasks).
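To get a feel for what a batch will cost, the fee structure quoted above can be turned into a rough estimator. This is a sketch under two assumptions of mine: that the 10% applies to the amount paid per task, and that Amazon’s fee is a simple percentage of the work cost. The quote does not give Amazon’s rate, so `amazon_fee_rate` is a placeholder you must supply, and the 10% used in the example is illustrative only.

```python
def estimate_cost(tasks, reward_per_task, amazon_fee_rate,
                  smartsheet_flat=0.01, smartsheet_rate=0.10):
    """Rough cost breakdown (in dollars) for a batch of completed tasks.

    Assumptions (not confirmed by the quoted fee schedule):
      * Smartsheet's 10% is taken on the reward paid per task.
      * Amazon's fee is proportional to the total paid to workers.
    """
    work = tasks * reward_per_task
    amazon = work * amazon_fee_rate
    smartsheet = tasks * (smartsheet_flat + smartsheet_rate * reward_per_task)
    total = work + amazon + smartsheet
    return {
        "work": round(work, 2),
        "amazon": round(amazon, 2),
        "smartsheet": round(smartsheet, 2),
        "total": round(total, 2),
    }


# Illustrative numbers only: 1,000 tasks at 5 cents each, with a placeholder Amazon rate.
example = estimate_cost(tasks=1000, reward_per_task=0.05, amazon_fee_rate=0.10)
```

With these inputs the Smartsheet portion comes to $15 per 1,000 tasks, which at least falls inside the “usually $10-$30” range they quote.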
For example, I paid about $7 for 116 biographies of VC partners. It sounds relatively expensive, but this service has:
- increased the potential sample size of my studies
- expanded the set of possible control variables
- made it possible to request multiple workers per task for error checking
- kept me sane by outsourcing mundane data tasks
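On the error-checking point: if you ask several workers the same question, one simple way to reconcile their answers is a majority vote. The sketch below is my own illustration, not something Smartsheet or Mechanical Turk provides; the function name `majority_answer` and the strict-majority threshold are assumptions.

```python
from collections import Counter


def majority_answer(responses, min_share=0.5):
    """Return the most common answer if more than `min_share` of workers gave it.

    Answers are compared after trimming whitespace and lowercasing.
    Returns None when there are no responses or no answer clears the threshold,
    which flags the task for manual review.
    """
    if not responses:
        return None
    normalized = [r.strip().lower() for r in responses]
    answer, count = Counter(normalized).most_common(1)[0]
    return answer if count / len(normalized) > min_share else None


# Three hypothetical workers answering "Does this person have an MBA or PhD?"
agreed = majority_answer(["Yes", "yes ", "no"])
# Two workers who disagree: no strict majority, so the task needs a second look.
disputed = majority_answer(["yes", "no"])
```

Tasks that come back as `None` are the ones worth checking by hand or resubmitting.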