Harvester

What is the Harvester?
The Sintelix Harvester can collect content from the web.
The content is added to a collection, normalized, and processed as a Sintelix document.
Features
- Sintelix Extension: You can access the Harvester functionality within the Sintelix application or by using the Sintelix Extension. For information on how to install and set up the Sintelix Extension, see Harvest via Sintelix Extension.
- Adblocker: Blocks ads to reduce unnecessary downloads.
- Dark Web: The Harvester can be used to extract text from .onion sites using Tor. For more information, see Harvest the Dark Web.
Requirements
The Harvester functionality requires a connection to a Sintelix Agent and internet connectivity.
To check the Harvester status, select the Harvester > Agent tab.
For more information, see Sintelix Agent Connections.
What you can do
Sintelix offers several methods for harvesting from web pages. You can:
- harvest manually, using the Sintelix Extension, where you can select elements individually or use a specific Rule Set to select elements to include and unselect elements to exclude. See Harvest via Sintelix Extension.
- harvest automatically, from the Harvester tab in Sintelix, where Sintelix uses defined Rule Sets to automatically select the elements on the page that you want to collect and analyse.
You can also:
- create a Login Persona to log in to restricted sites.