Dictionaries
Introduction
Dictionaries are lists of words or phrases that you want to identify in documents, such as types of weapons, names of illicit drugs, names of specific organisations and war crime indicators. Each word or phrase is referred to as an entry.
Dictionaries provide a simple way of creating links on the text graph, including text references which appear in Sintelix documents.
Entity Extraction Scripts (EESs) are Sintelix configurations for marking up and creating connections between document text using a highly configurable scripting syntax. EESs run much faster when Dictionaries are used to create the initial text graph. Doing so avoids generic matching pattern elements such as Token (for example, Token <string=XXX>), which make EES rules run slowly. A token is a segment of text; in general, the number of tokens in a document equates to the number of words and punctuation marks in it.
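The idea of tokens and dictionary lookup can be illustrated with a minimal Python sketch. It is not Sintelix's implementation or syntax. The entry list, the tokenizing rule and the function names are all assumptions made for illustration. It shows a text split into word and punctuation tokens, with each token checked against a set of entries in a single pass.

```python
import re

# Hypothetical entries for illustration; not taken from any Sintelix dictionary.
ENTRIES = {"rifle", "grenade", "mortar"}


def tokenize(text):
    """Split text into word tokens and single punctuation-mark tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


def find_entries(text, entries):
    """Return (token_index, token) pairs whose lowercase form is an entry."""
    return [
        (i, tok)
        for i, tok in enumerate(tokenize(text))
        if tok.lower() in entries  # set membership: one hash lookup per token
    ]


SAMPLE = "The rifle, a grenade and maps."
TOKENS = tokenize(SAMPLE)
MATCHES = find_entries(SAMPLE, ENTRIES)
```

Because the entries are held in a set, the cost per token does not grow with the number of entries, whereas testing every token against one rule per string does.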
Capabilities
Dictionaries offer the following capabilities:
- multi-word phrases
- features in extracted text references
- typographic filtering
- context-sensitivity
- automatic pluralization
- escaping, to allow complex characters within word list items
- integration with EESs, to make them briefer and faster
- direct generation of text references (without any need for EESs).
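Two of these capabilities, multi-word phrases and automatic pluralization, can be sketched in Python as follows. This is an illustrative sketch only, not how Sintelix implements them. The pluralization rule is a deliberately naive assumption, and all names and sample entries are hypothetical.

```python
import re


def tokenize(text):
    """Lowercased word and punctuation tokens."""
    return [t.lower() for t in re.findall(r"\w+|[^\w\s]", text)]


def pluralize(word):
    """Naive English plural; an illustrative rule only."""
    if word.endswith(("s", "x", "z", "ch", "sh")):
        return word + "es"
    if word.endswith("y") and len(word) > 1 and word[-2] not in "aeiou":
        return word[:-1] + "ies"
    return word + "s"


def build_index(entries):
    """Map each entry's token tuple, and its plural variant, to the entry."""
    index = {}
    for entry in entries:
        toks = tuple(tokenize(entry))
        index[toks] = entry
        index[toks[:-1] + (pluralize(toks[-1]),)] = entry
    return index


def match_phrases(text, index):
    """Greedy longest-match scan; returns (start_token_index, entry) pairs."""
    toks = tokenize(text)
    max_len = max(len(key) for key in index)
    found, i = [], 0
    while i < len(toks):
        for n in range(min(max_len, len(toks) - i), 0, -1):
            key = tuple(toks[i:i + n])
            if key in index:
                found.append((i, index[key]))
                i += n  # skip past the matched phrase
                break
        else:
            i += 1  # no entry starts at this token
    return found


# Hypothetical entries and sample sentence.
INDEX = build_index(["assault rifle", "militia"])
RESULT = match_phrases("Two assault rifles were seized from the militia.", INDEX)
```

Here the plural "assault rifles" is reported under the singular entry "assault rifle", so the word list only needs the base form.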
Tools
Tools for working with dictionaries are:
- Screen editor - with code highlighting
- Import capability
- Text Graph analyser for testing and review.
Configure Dictionaries
To configure Dictionaries:
- Open the project for which you want to configure Dictionaries.
- On the Main Navigation Bar, click Configurations. (A configuration is a group of settings that enable you to control the way in which a specific process or function in Sintelix works.)
- Click Dictionaries.
- Do one of the following:
  - Open a configuration - Select the configuration or type the configuration name in the Search field.
  - Create a new configuration - Enter a name on the Create tab and select Create.
  - Copy a configuration in the list - Select Create a copy, enter a name for the new configuration, then select Create & Open.
  - Copy a configuration from another project - Select Copy From, then select the project the configuration is in. Select the configuration you want to copy, then select Copy.
  - Import a configuration - Select Import > Choose file, then navigate to the file and select Open. Rename the file, if necessary, then select Import.
  To rename, export or delete a configuration, see Manage Configurations.
- Create or modify the word list.
- Click Save or Save & Test.