Manage Rule Sets
Background
Rule Sets are used to harvest web pages. See Concept: Harvesting and Harvester Rule Sets.
You can:
- All Rule Sets: copy, export, import and modify.
- Default Rule Sets: if a default Rule Set has been modified, you can revert it to the system default.
- Created Rule Sets: create, rename, and delete Rule Sets created for this Project.
Access
To manage Rule Sets, select Configurations > Harvester Rule Sets.
The rule set panes are displayed.
To make more room, click the collapse icon on the Configurations and Harvester Rule Sets panes.
Create a rule set
Creating a Rule Set is the first step in establishing an effective Rule Set. For the recommended process for creating a rule set, see Create a Rule Set: Sintelix Extension.
While you can create a new (empty) Harvester Rule Set within the Harvester Rule Sets configurations (below), you can also:
- use the Sintelix Extension (recommended), which provides a wizard to guide you through the process and start creating a Gold Standard Collection. See Create a Rule Set: Sintelix Extension.
- copy or import an existing Rule Set, which you can then modify, test and evaluate.
Create a new (empty) Harvester Rule Set
In the Create tab at the bottom of the Harvester Rule Sets pane, enter the name of the new rule set in the Name field, then select .
Result:
The new empty Rule Set is created and is opened.
A new Gold Standard Collection is automatically created with the same name and the suffix 'GS', to indicate that it's a gold standard collection.
Since the Rule Set is empty, you will need to:
- add documents to the Gold Standard Collection that was created (Harvest to a Gold Standard Collection), or link to a different Gold Standard Collection.
Refer to Concept: Harvester Gold Standard for information on the process of establishing a Rule Set.
- create, evaluate and modify the rules (Evaluate and Modify the Rule Set).
Import a rule set
- Select Import > Choose File.
- Select the name.harvesteruleset.xml file you want to import.
- If you want to rename the rule set, enter the new name, and then select .
The rule set is linked to the gold standard collection it was last evaluated against. To change the linked collection, select another gold standard collection from the Collection dropdown list on the Collection Evaluation tab, then select .
Copy to This Project or Another Project
- Select the Copy icon next to the Rule Set you want to copy.
- Choose to copy to the Same Project or Other Project.
- If copying to another project, select the project from the Project dropdown list.
- Enter a name for the Rule Set.
- Select .
If copied to the same project, the rule set is linked to the gold standard collection it was last evaluated against. To change the linked collection, select another gold standard collection from the Collection dropdown list on the Collection Evaluation tab, then click .
If copied to another project, the rule set will not be linked to any collection. You will need to either select a Collection or create a new Collection to use as the Gold Standard Collection.
Copy From Another Project
You can copy a Rule Set from the same project or from a different project.
- Select the Copy From tab.
- Select the project from the Project dropdown list.
- Select the rule set configuration and click .
The rule set is linked to the gold standard collection it was last evaluated against. To change the linked collection, select another gold standard collection from the Collection dropdown list on the Collection Evaluation tab, then click .
Export
To export a Rule Set, select the Export icon next to the Rule Set.
Result: The XML file is downloaded as name.harvesteruleset.xml.
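If you keep exported Rule Sets in a shared folder, it can help to check file names before importing them. The following is a minimal Python sketch, not part of Sintelix: the helper names (export_filename, rule_set_name_from_file, is_well_formed_xml) are hypothetical, and the suffix is taken from the file name shown on this page. The XML schema is not described here, so the last helper only checks that a file parses as XML.

```python
from pathlib import Path
from typing import Optional
import xml.etree.ElementTree as ET

# Suffix as written on this page; adjust it if your version exports a different one.
EXPORT_SUFFIX = ".harvesteruleset.xml"


def export_filename(rule_set_name: str) -> str:
    """Return the file name an export is expected to have for this rule set."""
    return f"{rule_set_name}{EXPORT_SUFFIX}"


def rule_set_name_from_file(path: str) -> Optional[str]:
    """Return the rule set name encoded in the file name, or None if it does not match."""
    name = Path(path).name
    # Reject files without the suffix, and files that are only the suffix.
    if not name.endswith(EXPORT_SUFFIX) or name == EXPORT_SUFFIX:
        return None
    return name[: -len(EXPORT_SUFFIX)]


def is_well_formed_xml(path: str) -> bool:
    """Return True if the file exists and parses as XML (no schema validation)."""
    try:
        ET.parse(path)
        return True
    except (ET.ParseError, OSError):
        return False
```

For example, rule_set_name_from_file("News Sites.harvesteruleset.xml") returns "News Sites", while a file named notes.xml returns None and can be skipped before import.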
Revert to system default
When a Rule Set is a Sintelix default, it will have a DEF symbol next to the Rule Set name.
If you modify a default Rule Set, a MOD symbol is shown next to the Rule Set.
To restore the default Rule Set (clearing all modifications):
- select the Revert icon
- select Close & Revert to confirm.
Result: The MOD symbol changes to a DEF symbol.
Rename
You cannot rename default rule sets that come with Sintelix.
To rename a Rule Set:
- select the Rename icon next to the Rule Set
- enter the new name
- select .
Delete
You cannot delete default rule sets that come with Sintelix.
To delete a Rule Set:
- select the Delete icon next to the Rule Set
- select Delete to confirm deletion.