Glossary
  • 14 May 2024

Action List

An ordered set of actions an agent will perform.

agent

A set of Mozenda processes that runs automated tasks against specified websites on the internet. Mozenda agents are primarily used for data gathering but are capable of automating other browser tasks as well.

Agent Builder

The Mozenda Windows application used to design and test Mozenda agents.

agent collection

The table where an agent's data is written and stored.

agent dashboard

An area in the Web Console where an individual agent can be configured, run, and scheduled. It is also where its data can be viewed, exported, and published.

agent definition

A Mozenda agent backed up in XML format.

agent group

A tool that encourages consistency among agents that collect the same or similar types of data. Agents assigned to an agent group can be made to share the agent group’s settings, including publishing, scheduling, notifications, and views.

agent list

The area in the Web Console where the user can see all of the agents in their account or department. Enterprise accounts can split their agents among multiple departments.

agent ID

A unique identifier for an agent.

agent run job

An instance of an agent running.

AJAX

An industry acronym that means Asynchronous JavaScript and XML. With AJAX, web applications can send and retrieve data from a server in the background without interfering with the display and behavior of the existing page.

Amazon S3

An Amazon Web Services (AWS) service that allows a user to store and retrieve data. Mozenda allows you to publish (push) data from a Mozenda collection to Amazon S3.

API

An acronym for Application Programming Interface. APIs provide a way for someone to write a program to interact with another program or service. Mozenda's REST API lets the user automate tasks within Mozenda and code custom solutions leveraging Mozenda's services.

autoblock

Preset request-blocking settings, offered at different levels of aggressiveness.

Azure Blob Storage

A Microsoft cloud service that allows you to store and retrieve data. Mozenda lets you publish (or save) data from a Mozenda collection to Azure Blob Storage.

CAPTCHA

An acronym for Completely Automated Public Turing test to tell Computers and Humans Apart. You often see them as a checkbox titled "I am not a robot." CAPTCHAs use different methods to tell whether a human or an automated bot is browsing a web resource.

Client Connector

The Mozenda background application that connects and passes communication between the Agent Builder and the Web Console.

collection

A data table in the Web Console. Every agent has a collection associated with it, but some collections do not have an associated agent.

combined collection

A collection that gets data from several source collections.

concurrent jobs

Several data-gathering jobs that can run at the same time.

cookie store

A repository for a set of browser cookies that can be loaded and used by an agent.

crawler (or web crawler)

An automated process that browses the internet methodically. It is generally less targeted than a scraping bot and visits URLs far more indiscriminately.

CSV

An acronym for comma-separated values. CSV is a minimalist and widely supported data format used for tabular data. It's supported by all major spreadsheet applications, such as Microsoft Excel and Google Sheets. Mozenda lets you publish and export data in CSV format.
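
As a rough illustration of the format itself (not of how Mozenda produces its exports), the sketch below writes two made-up rows to CSV with Python's standard csv module and reads them back. The field names and values are hypothetical. Note how a value containing a comma is wrapped in quotes so that it still occupies a single column.

```python
import csv
import io

# Hypothetical rows standing in for collection data.
rows = [
    {"name": "Widget", "price": "9.99"},
    {"name": "Gadget, large", "price": "19.50"},
]

# Write the rows as CSV into an in-memory text buffer.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["name", "price"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)

# Read the CSV text back into dictionaries keyed by the header row.
parsed = list(csv.DictReader(io.StringIO(csv_text)))
```

Passing delimiter="\t" to both the writer and the reader would yield TSV instead.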

data wrangling

The process of transforming and mapping data from one format to another to make it more appropriate for an intended use such as analytics.

department

An enterprise feature that lets you create multiple isolated Mozenda environments managed by the same organization/company. Although the term department is used by default, the naming convention can be changed according to an organization’s needs.

domain (or domain name)

A label that identifies a set of internet resources, such as a website (like mozenda.com).

Dropbox

A file-hosting service. Mozenda lets you publish (or save) data from a Mozenda collection to Dropbox.

dynamic date variable

A feature that lets you specify dates relative to the current date/time in the agent, rather than providing a specific date. For example, you might use it to filter 30 days from the current date.
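
The idea of a date expressed relative to "now" can be sketched in a few lines of Python. This is only an illustration of the concept, not Mozenda's syntax or implementation; the function name relative_date and its parameters are hypothetical.

```python
from datetime import date, timedelta

def relative_date(days_offset, today=None):
    """Return the date `days_offset` days away from `today`.

    `today` defaults to the current date; a negative offset looks backward.
    """
    base = today if today is not None else date.today()
    return base + timedelta(days=days_offset)

# A fixed reference date keeps the example reproducible.
cutoff = relative_date(-30, today=date(2024, 5, 14))
print(cutoff.isoformat())  # 2024-04-14
```

Because the offset is resolved each time the code runs, the same definition produces a different concrete date on every run, which is the point of a relative date.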

error handling

A set of rules that determines how an agent should respond to different errors.

field

A column in a collection.

field reference

A way to reference a field's value in an agent or in some places of the Web Console. A field reference is indicated by %field_name%.
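
To show what resolving a %field_name% placeholder might look like, here is a minimal Python sketch. It assumes field names consist only of letters, digits, and underscores, which may not match Mozenda's actual naming rules. The function name and the sample row are hypothetical.

```python
import re

# Assumption: a field name is one or more word characters between % signs.
FIELD_REF = re.compile(r"%(\w+)%")

def resolve_field_references(template, row):
    """Replace each %name% in `template` with row[name].

    References to fields missing from `row` are left untouched.
    """
    def substitute(match):
        name = match.group(1)
        return str(row[name]) if name in row else match.group(0)
    return FIELD_REF.sub(substitute, template)

row = {"ProductName": "Widget", "Price": "9.99"}
print(resolve_field_references("Item %ProductName% costs %Price%", row))
# Item Widget costs 9.99
```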

file package

A .zip file containing one or more files that have been downloaded by an agent.

folder

A way to organize agents and collections. The concept is similar to tagging in Google Gmail.

FTP

An acronym for File Transfer Protocol. Mozenda lets you publish (or save) data from a collection to another server using FTP.

geolocation

A feature of Premium Harvesting. Geolocation lets you configure your agent to simulate making requests from a particular location. For example, you might be writing the agent in the US, but you can tell the agent to act as if it is making the requests from Ireland.

Google Drive

A file-hosting service from Google. Mozenda lets you publish (or save) data from a Mozenda collection to Google Drive.

HTML

An acronym for Hypertext Markup Language, one of the foundational technologies of the web. Mozenda features make it easy to parse and target elements within an HTML page.

harvesting

The process of an agent running the instructions you specified to gather data from the targeted site.

harvesting results

The data gathered by an agent.

JavaScript

JavaScript is one of the core technologies of the web. JavaScript enables interactive web pages and is an essential part of web applications. Mozenda lets you inject custom JavaScript in your agents, giving you enormous flexibility in how your agent interacts with websites.

job

A unit of work that needs to be done. In Mozenda, a job may represent an agent run, the publishing of a collection's data, the packaging of downloaded files, or background maintenance work that keeps your account running smoothly.

JSON

An acronym for JavaScript Object Notation. A data storage and transfer format that is commonly used in web applications. Mozenda lets you publish and export data in JSON format.

Managed Services

A service in which you contract with Mozenda to gather the data you specify from certain websites and deliver it at the frequency you choose. This is an alternative to buying a license and using Mozenda software yourself.

premium harvesting

A type of harvesting that routes an agent's requests through one of Mozenda's third-party network partners to simulate particular geographic locations or to bypass CAPTCHAs.

processing credit

A unit of measurement for the utilization of Mozenda’s computing resources through various activities, including page navigation, loading dynamic content, downloading files, and using Premium Harvesting.

publishing

Moving and saving the data gathered by your Mozenda agent to an outside storage location. (See publish method below.)

publish method

An outside storage location where you can deliver your data from a Mozenda collection. Mozenda's publish methods include email, FTP, Amazon S3, Azure Storage, Dropbox, Google Cloud Storage, and Google Drive.

RegEx

Short for regular expression: a sequence of characters that represents a pattern of text. RegEx is used in Mozenda to define what text to capture from a targeted element in a Capture Text action.
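
As a generic illustration of capturing text with a pattern, the Python sketch below pulls a price out of a longer string. The sample text and the pattern are made up, and regular expression syntax varies between engines, so a pattern written for Python may need adjusting elsewhere.

```python
import re

# Hypothetical text taken from a targeted page element.
text = "Price: $1,299.00 (in stock)"

# Capture the digits, commas, and two decimal places that follow a dollar sign.
match = re.search(r"\$([\d,]+\.\d{2})", text)
price = match.group(1) if match else None
print(price)  # 1,299.00
```

The parentheses form a capture group, so only the amount is returned and the dollar sign is left out.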

REST API

Mozenda's REST API lets the user automate tasks within Mozenda and code custom solutions using Mozenda's services. See also API.

RSS

An acronym for Rich Site Summary (also expanded as Really Simple Syndication). RSS is an XML format used to feed content from a website to a client application. Mozenda allows you to create an RSS feed of a collection view.

schedule

A time and frequency at which Mozenda automatically runs a task. You can schedule agent runs, collection publishing, and sequence runs.

sequence

An automated workflow in the Web Console that provides deeper task automation in Mozenda projects. You can use sequences to run agents, clear and publish collection data, and run other sequences.

sequence step

A task to be completed within a sequence.

standard collection

A static data collection in Mozenda. It's usually used to provide inputs to an agent.

TSV

An acronym for tab-separated values. TSV is a minimalist and widely supported data format for tabular data. Mozenda can publish and export collection data in TSV format.

URL

An acronym for Uniform Resource Locator. The address of a web resource, such as https://www.mozenda.com. The host portion can also be written as a numeric IP address, as in http://192.168.0.1.
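
The two forms of address can be told apart programmatically. The following Python sketch uses only the standard library to check whether a URL's host is a domain name or a numeric IP address. The helper name host_is_ip is hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit

def host_is_ip(url):
    """Return True if the URL's host is a numeric IP address."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        # The host is not a valid IP address, so treat it as a domain name.
        return False

print(host_is_ip("https://www.mozenda.com"))  # False
print(host_is_ip("http://192.168.0.1"))       # True
```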

view

A configurable presentation of the data in a collection. Mozenda's views let you adjust which fields are visible and in what order. You can also filter and sort rows based on field values.

Web Console

The web portal where you run, manage, and schedule your agents. When the agents have finished, you can also directly access the data you gathered in the Web Console.

web scraping

The process of automatically extracting or gathering structured or unstructured data from websites.

XLSX

The file format used by Microsoft Excel and other spreadsheet applications to represent spreadsheets. Mozenda lets you publish and export collection data in XLSX format.

XML

An acronym for Extensible Markup Language. XML is a widely supported and robust data storage and transfer format. Mozenda lets you publish and export collection data in XML format. The Mozenda REST API also returns information in XML format by default.

XPath

A query language for selecting elements in an XML or HTML document. Mozenda uses XPath expressions to target the specific elements on a web page that an agent interacts with.
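
To give a feel for XPath-style targeting, the sketch below uses Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath and parses well-formed XML rather than arbitrary HTML. The sample markup and class names are hypothetical, and this is not how Mozenda evaluates expressions.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup standing in for a web page.
page = """<html><body>
  <div class="product"><span class="name">Widget</span></div>
  <div class="product"><span class="name">Gadget</span></div>
  <div class="ad"><span class="name">Sponsored</span></div>
</body></html>"""

root = ET.fromstring(page)

# Select the name span inside every div whose class is exactly "product".
path = ".//div[@class='product']/span[@class='name']"
names = [element.text for element in root.findall(path)]
print(names)  # ['Widget', 'Gadget']
```

The predicate in square brackets is what narrows the match, so the span inside the "ad" block is skipped even though it has the same class as the others.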

