
HIT Scraper WITH EXPORT

Snag HITs. mturk.


Author: feihtality
Daily installs: 0
Total installs: 55,056
Ratings: 50 5 4
Version: 4.1.4
Created: 2015-06-24
Updated: 2017-12-22
Size: 151 KB
License: N/A
Applies to

worker sub-domain compatible

To use the script once installed, append hit_scraper, hit-scraper, or hitscraper to the path of any mturk URL. Any of the following URLs will work, and all are equally valid for initializing the script:

https://worker.mturk.com/hitScraper
https://worker.mturk.com/hit_scraper
https://www.mturk.com/hit-scraper
https://www.mturk.com/mturk/findhits?match=true&hit_scraper
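
The URL forms above suggest that the trigger word is matched anywhere in the URL, regardless of letter case. The following Python sketch shows one way such a check could work. The pattern and the function name `activates_scraper` are assumptions made for illustration; the script's actual matching logic is not shown in this guide.

```python
import re

# Assumed pattern: "hit", an optional "_" or "-", then "scraper", in any letter case.
# This is an illustration only, not the script's real matching code.
TRIGGER = re.compile(r"hit[_-]?scraper", re.IGNORECASE)


def activates_scraper(url: str) -> bool:
    """Return True if the URL contains one of the documented trigger spellings."""
    return TRIGGER.search(url) is not None


if __name__ == "__main__":
    for url in (
        "https://worker.mturk.com/hitScraper",
        "https://worker.mturk.com/hit_scraper",
        "https://www.mturk.com/hit-scraper",
        "https://www.mturk.com/mturk/findhits?match=true&hit_scraper",
    ):
        print(url, activates_scraper(url))
```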

User Guide


Understanding the Interface


The top section, which holds the search settings and options, is internally called the Control Panel. It contains the options that users are likely to change often, on a per-search or per-scrape basis, in contrast to the items in the Settings Panel (accessed through the Settings button).

Control Panel options

Auto-refresh delay
This controls how often (in seconds) a scrape will automatically be run. Setting this to 0 will force the scraper into manual mode, turning off automatic scraping.


Pages to scrape
Sets the minimum threshold for number of pages to retrieve.


Correct for skips
If more than 66% of HITs are blocked by the blocklist, an additional page will be added until the number of blocked HITs is less than 66% of the total results.
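
The skip-correction rule can be sketched as a loop that keeps fetching pages while the blocked share stays above the threshold. This is a minimal illustration under stated assumptions: `fetch_page`, `is_blocked`, and the `max_pages` safety cap are hypothetical names that do not come from the script, and HITs are represented as plain strings.

```python
from typing import Callable, List


def scrape_with_skip_correction(
    fetch_page: Callable[[int], List[str]],
    is_blocked: Callable[[str], bool],
    min_pages: int,
    threshold: float = 0.66,
    max_pages: int = 20,  # assumed safety cap, not a documented setting
) -> List[str]:
    """Fetch at least min_pages pages, then keep adding pages while the
    blocked share of all collected HITs stays above the threshold."""
    hits: List[str] = []
    for page in range(1, max_pages + 1):
        page_hits = fetch_page(page)
        if not page_hits:
            break  # no more results to fetch
        hits.extend(page_hits)
        if page < min_pages:
            continue  # still below the requested page count
        blocked = sum(1 for hit in hits if is_blocked(hit))
        if blocked / len(hits) <= threshold:
            break  # enough unblocked results; stop adding pages
    return hits
```

With a first page that is entirely blocked and a second page that is entirely unblocked, the loop stops after the second page, when the blocked share drops to 50%.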


Results per page
Controls the number of results retrieved per page, up to a maximum of 100. It is typically better to increase the number of results per page than to increase the number of pages to scrape.


Minimum reward
Sets a minimum pay threshold.


Qualified
Limits results to only HITs for which you are qualified.


Masters Only
Limits results to only HITs that require the Masters qualification.


Hide Masters
Filters out HITs that require the Masters qualification while keeping all other HITs for which you may not be qualified. This setting is mutually exclusive with the Qualified setting. If both are selected, the Qualified setting will take precedence.


Hide Infeasible
Filters out HITs with qualifications you can neither request nor take a test to obtain. Useful for filtering out location-based qualifications.

Minimum batch size
Sets a threshold for number of HITs per HIT group. All HIT groups which contain fewer HITs than specified will be filtered out. This setting only applies when the Search by option is set to Most Available.
  • Global
Forces the Minimum batch size value to apply to all search options, not only Most Available.
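
The interaction between Minimum batch size, Search by, and Global can be expressed as a single predicate. The sketch below is illustrative only: the function name, the parameter names, and the `"most_available"` string are assumptions, not identifiers from the script.

```python
def passes_batch_filter(hits_available: int, min_batch: int,
                        search_by: str, apply_globally: bool) -> bool:
    """Return True if a HIT group should be kept under the batch-size rule."""
    # The threshold applies only under Most Available unless Global is checked.
    if search_by != "most_available" and not apply_globally:
        return True
    # Groups with fewer HITs than the threshold are filtered out.
    return hits_available >= min_batch
```

For example, a group of 5 HITs with a threshold of 10 is kept when searching by Latest, but is filtered out once Global is enabled.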


New HIT highlighting
Sets the amount of time (in seconds) for which new HITs will be highlighted. Highlighted HITs will be emboldened and appear in larger font. Their cells in the results table will also be outlined in a white, dotted line which is more prominent on some themes than others.


Sound on new HIT
When new HITs are found, play an audio alert. There are two options: Ding and Squee.


Disable TO
Skip directly to displaying the scrape results without retrieving Turkopticon data.


Search by
Controls the method by which to query HITs from mturk.
  • Latest - HIT creation date (newest first)
  • Most Available - number of HITs available (most first)
  • Reward - reward amount (most first)
  • Title - alphabetical by title (A-Z)
  • Invert - inverts the ordering of the above selection
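
The four orderings and the effect of Invert can be modelled as a local sort. Note that the guide describes Search by as a query option sent to mturk, so the script itself most likely does not sort this way. The `Hit` class, its field names, and the key strings below are hypothetical and serve only to illustrate the ordering rules.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class Hit:
    # Hypothetical record; field names are assumptions for this sketch.
    title: str
    reward: float
    available: int
    created: datetime


# Each entry maps an option to (sort key, whether the default order is descending).
SORT_KEYS = {
    "latest": (lambda h: h.created, True),            # newest first
    "most_available": (lambda h: h.available, True),  # most first
    "reward": (lambda h: h.reward, True),             # most first
    "title": (lambda h: h.title.lower(), False),      # A-Z
}


def order_hits(hits: List[Hit], search_by: str, invert: bool = False) -> List[Hit]:
    """Return hits in the order implied by search_by, flipped when invert is set."""
    key, descending = SORT_KEYS[search_by]
    return sorted(hits, key=key, reverse=(descending != invert))
```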


Min pay TO
Sets a threshold on requesters' Turkopticon pay rating and hides all results with requesters below the specified value. Their visibility can be toggled via the Toggle Ignored HITs button.
Note: Requesters that have not been rated will not be affected by this setting.


Hide no TO
Hides all results from requesters that have no reviews on Turkopticon. Their visibility can be toggled via the Toggle Ignored HITs button.
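
Min pay TO and Hide no TO together decide whether a result is ignored. The sketch below captures the rule that unrated requesters are exempt from the pay threshold. It assumes that a missing rating is represented as `None`; the function and parameter names are hypothetical.

```python
from typing import Optional


def is_ignored(pay_rating: Optional[float], min_pay_to: float, hide_no_to: bool) -> bool:
    """Return True if a result would be hidden by the two TO filters.

    pay_rating is None when the requester has no Turkopticon data
    (an assumed representation).
    """
    if pay_rating is None:
        # Unrated requesters are exempt from Min pay TO; only Hide no TO hides them.
        return hide_no_to
    return pay_rating < min_pay_to
```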


Sort by TO pay
Sorts the results by Turkopticon pay rating.


Sort by overall TO
Sorts the results by overall Turkopticon ratings.


Search Terms
Keywords used to search for specific HITs or requesters.


Hide blocklisted
Hides all results which trigger a match against the blocklist.


Restrict to includelist
Hides all results that do not trigger a match against the includelist. If the includelist is empty, all results will be blocked.


Highlight includelist
Results which trigger a match against the includelist will be enclosed in a thick, green, dashed outline.
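
The three list options can be combined into one decision per result. The sketch below makes two assumptions: an entry matches if it appears, ignoring case, inside any of the result's text fields, and a hidden result is never highlighted. The guide does not specify either behavior, and all names are hypothetical.

```python
from typing import Dict, List


def matches_any(fields: List[str], entries: List[str]) -> bool:
    """Assumed rule: an entry matches if it is a case-insensitive substring of any field."""
    lowered = [f.lower() for f in fields]
    # Empty entries are skipped so that a blank line in a list matches nothing.
    return any(e.lower() in f for e in entries if e for f in lowered)


def classify(
    fields: List[str],
    blocklist: List[str],
    includelist: List[str],
    hide_blocklisted: bool = False,
    restrict_to_includelist: bool = False,
    highlight_includelist: bool = False,
) -> Dict[str, bool]:
    """Decide whether one result is hidden and whether it is highlighted."""
    blocked = matches_any(fields, blocklist)
    included = matches_any(fields, includelist)
    hidden = (hide_blocklisted and blocked) or (restrict_to_includelist and not included)
    highlighted = highlight_includelist and included and not hidden
    return {"hidden": hidden, "highlighted": highlighted}
```

Because an empty includelist can never produce a match, enabling Restrict to includelist with no entries hides every result, which agrees with the behavior described above.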


Results Table

--

Additional Settings


Settings Panel options are already pretty well explained. This section is probably not necessary.