Standard Harvesting vs Premium Harvesting & Geolocation
  • 14 May 2024

Data harvesting is the process of automatically extracting a large amount of data from websites.

Mozenda offers two types of harvesting: Standard Harvesting and Premium Harvesting.

Standard Harvesting

With Standard Harvesting, the Agent Builder accesses the website from your geographical region using your network IP address, and displays the data as it would appear in your browser.

In the Web Console, the agent runs directly on Mozenda's data center servers. These servers are located in the United States.

Premium Harvesting

Premium Harvesting lets you access websites that serve region-specific data (such as Google, Amazon, and Yelp) through a different geographical location. This reduces the chance of your IP address being blocked by some websites, while also:

  • Allowing you to see and gather the content that the website displays for that geographical region (pricing, store locations, local currency, and language).

  • Allowing you to send a higher volume of requests to the websites. This means an agent will run into fewer CAPTCHAs and other blocks imposed by those websites.

Use geolocation

Turn the Use geolocation feature On if you want to choose a different location.

Turn the Use geolocation feature Off if you do not want to use geolocation, or remove it by selecting the Close icon next to the Action Properties icon.

Change geolocation

  1. Select the Action Properties icon on the Use geolocation feature.

  2. Find and select the preferred location.

  3. Select USE GEOLOCATION.

Premium Harvesting Settings

Select the Premium Harvesting button to modify the list of regular expressions that Mozenda uses to determine whether to route requests through the geolocation you have selected.

To avoid unnecessary slowdowns, the Agent Builder automatically determines which requests need to be rerouted. Requests that retrieve static content (such as images and JavaScript) rarely need to be rerouted for websites to operate correctly. By default, Mozenda won't reroute requests that don't alter the web page content or appearance.
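
To illustrate the idea of deciding per request with a list of regular expressions, here is a minimal Python sketch. It is not Mozenda's implementation: the pattern list, the name `SKIP_REROUTE_PATTERNS`, and the function `should_reroute` are hypothetical, and the sketch assumes the patterns describe requests to leave on the direct route (images and JavaScript, as mentioned above). The actual default list and its semantics are shown only in the Premium Harvesting settings dialog.

```python
import re

# Hypothetical patterns for illustration only; these are NOT Mozenda's defaults.
# Assumption: a URL matching any pattern is static content that can skip rerouting.
SKIP_REROUTE_PATTERNS = [
    r"\.(?:png|jpe?g|gif|svg|ico)(?:\?.*)?$",  # image files, with optional query string
    r"\.js(?:\?.*)?$",                         # JavaScript files, with optional query string
]

# Compile once so each request check is a cheap search.
_COMPILED = [re.compile(p, re.IGNORECASE) for p in SKIP_REROUTE_PATTERNS]


def should_reroute(url: str) -> bool:
    """Return True if the request should go through the selected geolocation."""
    return not any(pattern.search(url) for pattern in _COMPILED)


if __name__ == "__main__":
    for url in (
        "https://example.com/products?page=2",
        "https://example.com/static/logo.PNG",
        "https://example.com/app.js?v=3",
    ):
        route = "geolocation" if should_reroute(url) else "direct"
        print(f"{url} -> {route}")
```

Under these assumptions, a page request such as `https://example.com/products?page=2` would be rerouted, while the logo image and the script file would not. Narrow patterns keep more traffic on the slower geolocated route; broad patterns risk skipping a request that actually returns region-specific content.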

Note 1

Using geolocation can cause a small delay when a web page loads. You might need to add or increase wait times between actions to prevent errors.

Note 2

Agents configured to use anonymous processing will be automatically changed to Premium Harvesting, which offers the same functionality plus the features listed above. These agents default to the North America (Region) location, which can be changed as needed.

