Collection Evaluation tab

When you open a Rule Set, the Collection most recently linked to that Rule Set is displayed in the Collection Evaluation tab.

On this pane, you can perform three key tasks:

Selecting the Gold Standard Collection

On the Collection Evaluation tab, confirm that the Gold Standard Collection associated with the Rule Set you are evaluating and modifying is selected.

If the Rule Set was originally copied or imported, you may need to:

  • create a new Gold Standard Collection, and then select the Collection

  • select and/or change the Collection linked to the Rule Set.

Updating the Evaluation Table

Select Refresh Documents & Evaluate when you:

  • change the Collection, or

  • add documents to or remove documents from the Collection.

After you change a rule, select Update Table to re-evaluate the updated rules against the documents in the Gold Standard Collection.

Selecting a Document to Evaluate

To select a document to evaluate, select the document name from the Evaluation Table.

Result: The Full Page document is displayed in the centre pane and the Gold Standard document is displayed in the right pane, under the Gold Standard tab.

Once you have selected the document you want to use to evaluate and refine the rules, you can select the collapse icon to collapse the Collection Evaluation pane. This gives you more screen space as you create and modify the rules.

Understanding the Evaluation Table

The Evaluation Table:

  • lists the documents in the selected Gold Standard Collection

  • evaluates the rules against each document and counts the correct, spurious and missed elements

  • calculates a summary score for each column and an overall score.

Element Status: Colour Coding

Elements are colour coded to indicate their status:

  • correct means that the elements are in the Gold Standard and have been selected by the Rule Set.

  • spurious means that the elements are not in the Gold Standard but have been selected by the Rule Set.

    Corrective Action: To correct this error, either:

    • add the element to the Gold Standard, or

    • change/remove the rule to avoid capturing the spurious element.

  • missed means that the elements are in the Gold Standard but have not been selected by the Rule Set.

    Corrective Action: To correct this error, either:

    • remove the element from the Gold Standard document, or

    • change/add a rule to capture the missed element.
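
The three statuses above correspond to simple set operations on the elements in the Gold Standard and the elements selected by the Rule Set. The following is a minimal sketch in Python, not the application's own implementation; the function name classify_elements and the element identifiers are hypothetical and are used only for illustration.

```python
def classify_elements(gold, selected):
    """Split elements into correct, spurious and missed sets.

    gold:     elements marked in the Gold Standard document
    selected: elements selected by the Rule Set
    """
    gold, selected = set(gold), set(selected)
    return {
        "correct": gold & selected,    # in the Gold Standard and selected
        "spurious": selected - gold,   # selected but not in the Gold Standard
        "missed": gold - selected,     # in the Gold Standard but not selected
    }


# Hypothetical element identifiers, for illustration only.
gold_standard = {"title", "author", "date", "body"}
rule_set_output = {"title", "author", "footer"}

result = classify_elements(gold_standard, rule_set_output)
```

In this example, "footer" is spurious, and "date" and "body" are missed. Each corrective action listed above moves an element out of the spurious or missed set, either by editing the Gold Standard or by editing the rules.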

The Errors tab lists each error summarised in the Evaluation Table and provides a quick way to clear errors. See Errors tab: Quickly Fix Errors.

Rule Set Scoring

The F1 score combines two measures into a single value: precision (how much of the text selected by the Rule Set is text you want) and recall (how much of the text you want is selected by the Rule Set, that is, whether it is missing a few or many elements). An F1 score of 1 indicates perfect precision and recall.
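
As an illustration, the standard F1 score can be computed from the correct, spurious and missed counts. The sketch below assumes the conventional definition (the harmonic mean of precision and recall); this page does not state exactly how the application calculates or aggregates its scores, and the function name f1_score is hypothetical.

```python
def f1_score(correct, spurious, missed):
    """Standard F1 from element counts (assumed definition, see above)."""
    selected = correct + spurious   # everything the Rule Set selected
    in_gold = correct + missed      # everything in the Gold Standard
    precision = correct / selected if selected else 0.0
    recall = correct / in_gold if in_gold else 0.0
    if precision + recall == 0:
        return 0.0                  # nothing correct, so the score is 0
    return 2 * precision * recall / (precision + recall)


# 2 correct, 1 spurious, 2 missed:
# precision = 2/3, recall = 2/4, F1 = 4/7 (about 0.57)
score = f1_score(correct=2, spurious=1, missed=2)
```

Under this definition, removing a spurious element raises precision and capturing a missed element raises recall, so either corrective action moves the score towards 1.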

Select a document in the Evaluation Table to display the Full Page document with correct, spurious and missed elements highlighted.