Standard Harvesting vs Premium Harvesting & Geolocation
Data harvesting is the process of automatically extracting a large amount of data from websites.
Mozenda offers two types of harvesting: Standard and Premium Harvesting.
With Standard Harvesting, the Agent Builder accesses the website from your geographical region using your network IP, and displays the data as it would appear in your browser.
In the Web Console, the agent runs directly on Mozenda's data center servers. These servers are located in the United States.
Premium Harvesting lets you access websites that serve region-specific data (such as Google, Amazon, and Yelp) through a different geographical location. This reduces the chances of your IP being blocked by some websites. In addition:
It allows you to see and gather specific content that the website displays for that given geographical region (pricing, store location, local currency, and language).
It allows you to send a higher volume of requests to the websites. This means an agent will run into fewer CAPTCHAs or other types of bans imposed by those websites.
Turn the Use geolocation feature On if you want to choose a different location.
Turn the Use geolocation feature Off if you do not want to use geolocation or remove it with the sign next to the .
Select the Use geolocation feature.
Find and select the preferred location.
Select USE GEOLOCATION.
Premium Harvesting Settings
Select the Premium Harvesting button to modify the list of regular expressions that Mozenda uses to decide whether to route requests through the geolocation you have selected.
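To illustrate the idea of a regular-expression list that decides which requests are routed, here is a minimal sketch in Python. It is not Mozenda's implementation: the patterns, the function name should_route_through_geolocation, and the "any pattern matches" rule are all assumptions made for illustration, and Mozenda's regular expression syntax may differ from Python's re module.

```python
import re

# Hypothetical example patterns; these are not Mozenda defaults.
ROUTE_PATTERNS = [
    r"^https?://(www\.)?amazon\.com/",
    r"^https?://(www\.)?yelp\.com/biz/",
]


def should_route_through_geolocation(url, patterns=ROUTE_PATTERNS):
    """Return True if the URL matches at least one pattern in the list.

    Assumption: a request is routed when any pattern matches. The actual
    matching rule used by Mozenda is not described in this article.
    """
    return any(re.search(pattern, url) for pattern in patterns)


if __name__ == "__main__":
    for url in (
        "https://www.amazon.com/dp/B000",
        "https://www.yelp.com/biz/some-cafe",
        "https://example.com/about",
    ):
        print(url, should_route_through_geolocation(url))
```

Testing patterns against a few sample URLs in this way, before saving them, can help confirm that only the intended requests would match.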
Using geolocation can cause a small delay when a web page loads. You might need to add or increase wait times between actions to prevent errors. See (wait seconds).
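The effect of a wait between actions can be sketched in generic Python. This is only an illustration of the concept, not how agents are configured in Mozenda: run_actions, wait_seconds, and the sleep parameter are hypothetical names, and the actions are placeholders for real page interactions.

```python
import time


def run_actions(actions, wait_seconds=2.0, sleep=time.sleep):
    """Run each action in order, pausing wait_seconds between actions.

    The pause gives a slower page (for example, one loaded through a
    geolocation) time to finish loading before the next action starts.
    """
    results = []
    for index, action in enumerate(actions):
        if index > 0:
            sleep(wait_seconds)  # no pause is needed before the first action
        results.append(action())
    return results


if __name__ == "__main__":
    # Placeholder actions standing in for page loads and clicks.
    steps = [lambda: "load page", lambda: "click next", lambda: "capture data"]
    print(run_actions(steps, wait_seconds=0.5))
```

If errors appear only when geolocation is on, increasing the wait value is the first adjustment to try.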
Agents configured to use anonymous processing will be automatically changed to Premium Harvesting, which offers the same functionality plus the features listed above. These agents default to the North America (Region) location, which can be changed as needed.