Manage Rule Sets

Background

Rule Sets are used to harvest web pages. See Concept: Harvesting and Harvester Rule Sets.

The actions available depend on the type of Rule Set:

  • All Rule Sets: copy, export, import, and modify.

  • Default Rule Sets: revert a modified default Rule Set to the system default.

  • Created Rule Sets: create, rename, and delete Rule Sets created for this Project.

Access

To manage Rule Sets, select Configurations > Harvester Rule Sets.

The rule set panes are displayed.

To increase the working space, click the collapse icon on the Configurations and Harvester Rule Sets panes.

Create a rule set

Creating a Rule Set is the first step in establishing an effective Rule Set. For the recommended process for creating a rule set, see Create a Rule Set: Sintelix Extension.

You can create a new (empty) Harvester Rule Set within the Harvester Rule Sets configurations (below). You can also:

  • use the Sintelix Extension (recommended), which provides a wizard to guide you through the process and to start creating a Gold Standard Collection. See Create a Rule Set: Sintelix Extension.

  • copy or import an existing Rule Set, which you can then modify, test and evaluate.

Create a new (empty) Harvester Rule Set

In the Create tab at the bottom of the Harvester Rule Sets pane, enter the name of the new rule set in the Name field, then select Create.

Result:

The new empty Rule Set is created and opened.

A new Gold Standard Collection is automatically created with the same name and the suffix 'GS', to indicate that it's a gold standard collection.

Because the Rule Set is empty, you will need to add rules to it before you can test and evaluate it.

Import a rule set

  1. Select Import > Choose File.

  2. Select the name.harvesteruleset.xml file you want to import.

  3. If you want to rename the rule set, enter the new name, and then select Import.

    The rule set is linked to the gold standard collection it was last evaluated against. If you want to change the linked collection, select another gold standard collection from the Collection dropdown list on the Collection Evaluation tab, then select Refresh Documents & Evaluate.
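Before importing a file that has been edited by hand or transferred between machines, it can help to confirm that it is still well-formed XML. The following is a minimal sketch using only the Python standard library. The function name is hypothetical, and the check makes no assumptions about the Sintelix rule set schema: it confirms only that the file parses as XML, not that Sintelix will accept it.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(path):
    """Return True if the file at `path` parses as XML, False otherwise.

    This checks XML syntax only. It does not validate the content
    against any rule set schema.
    """
    try:
        ET.parse(path)
    except (ET.ParseError, OSError):
        # ParseError: the file is not well-formed XML.
        # OSError: the file is missing or cannot be read.
        return False
    return True
```

A file that fails this check is unlikely to import correctly and is worth exporting again from the source project.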

Copy to This Project or Another Project

  1. Select the Copy icon next to the Rule Set you want to copy.

  2. Choose to copy to the Same Project or Other Project.

  3. If copying to another project, select the project from the Project dropdown list.

  4. Enter a name for the Rule Set.

  5. Select Copy.

    If copied to the same project, the rule set is linked to the gold standard collection it was last evaluated against. If you want to change the linked collection, select another gold standard collection from the Collection dropdown list on the Collection Evaluation tab, then click Refresh Documents & Evaluate.

    If copied to another project, the rule set will not be linked to any collection. You will need to either select a Collection or create a new Collection to use as the Gold Standard Collection.

Copy From Another Project

You can copy a Rule Set from the same project or from a different project.

  1. Select the Copy From tab.

  2. Select the project from the Project dropdown list.

  3. Select the rule set configuration and click Copy.

    The rule set is linked to the gold standard collection it was last evaluated against. If you want to change the linked collection, select another gold standard collection from the Collection dropdown list on the Collection Evaluation tab, then click Refresh Documents & Evaluate.

Export

To export a Rule Set, select the Export icon next to the Rule Set.

Result: The XML file is downloaded as name.harvesteruleset.xml.
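When several exports accumulate in a downloads folder, the file name pattern can be used to work out which Rule Set each file belongs to. The sketch below assumes that the "name" part of the file name is the Rule Set name, which this guide suggests but does not state explicitly. The function and constant names are hypothetical.

```python
from pathlib import Path

# Suffix of exported files, as written in this guide.
EXPORT_SUFFIX = ".harvesteruleset.xml"


def rule_set_name_from_export(path):
    """Return the part of the file name before the export suffix.

    Returns None if the file name does not end with the export suffix
    or if nothing precedes the suffix.
    """
    filename = Path(path).name
    if filename.endswith(EXPORT_SUFFIX) and len(filename) > len(EXPORT_SUFFIX):
        return filename[: -len(EXPORT_SUFFIX)]
    return None
```

Note that a browser may rename a repeated download (for example by appending a counter), in which case the file name no longer matches the pattern and the function returns None.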

Revert to system default

When a Rule Set is a Sintelix default, it will have a DEF symbol next to the Rule Set name.

If you modify a default Rule Set, a MOD symbol is shown next to the Rule Set.

To restore the default Rule Set (clearing all modifications):

  • select the Revert icon

  • select Close & Revert to confirm.

Result: The MOD symbol changes to a DEF symbol.

Rename

You cannot rename default rule sets that come with Sintelix.

To rename a Rule Set:

  • select the Rename icon next to the Rule Set

  • enter the new name

  • select Rename.

Delete

You cannot delete default rule sets that come with Sintelix.

To delete a Rule Set:

  • select the Delete icon next to the Rule Set

  • select Delete to confirm deletion.