Text analytics

You can use Pega Platform to analyze unstructured text that comes in through different channels: emails, social networks, chat channels, and so on. You can structure and classify the analyzed data to derive useful business information to help you retain and grow your customer base.

Pega Platform provides the following methods of text analysis:

  • Sentiment analysis – Detects positive, neutral, or negative sentiment. Supports machine learning.
  • Topic detection – Detects the underlying topic of the document. Supports machine learning and rule-based classification that is based on taxonomy keywords. For example, topic detection can determine that the sentence "My uPlusTelco laptop is not working, need help!" belongs to the Customer Service > User Support category.
  • Intent detection – Assigns intents to text input. For example, intent detection can determine whether the analyzed text is a complaint or an inquiry.
  • Text extraction – Extracts named entities from text. Entity extraction can be configured by using Apache RUTA scripts or machine learning models.
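
As an illustration of the rule-based approach to topic detection, the following minimal Python sketch matches words in a text against taxonomy keywords. It is not Pega Platform code: the TAXONOMY dictionary, its keywords, the second category, and the detect_topic function are hypothetical names chosen for this example, and a real taxonomy is typically richer than a flat keyword set.

```python
import re

# Hypothetical taxonomy for illustration only: category path -> keywords.
# This is not the format that Pega Platform uses for taxonomies.
TAXONOMY = {
    "Customer Service > User Support": {"help", "working", "broken", "support"},
    "Sales > Pricing": {"price", "cost", "discount", "offer"},
}


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def detect_topic(text, taxonomy=TAXONOMY):
    """Return the category with the most keyword matches, or None if nothing matches."""
    tokens = tokenize(text)
    scores = {category: len(tokens & keywords) for category, keywords in taxonomy.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


topic = detect_topic("My uPlusTelco laptop is not working, need help!")
print(topic)
```

In this sketch, the sample sentence matches the keywords "working" and "help", so it is assigned to the Customer Service > User Support category. A text with no keyword matches returns no category.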

You can build machine learning models for sentiment analysis, topic detection, intent detection, and text extraction, and deploy those models by using Text Analyzer rules. A text analyzer parses text, automatically recognizes the language, and processes the models. A Text Analyzer rule can reference one or more models for the methods that are listed above.

To train a machine learning-based text analytics model, you must upload training data with sample texts and associated outcomes. For example, for sentiment analysis, each sample record must be associated with a positive, neutral, or negative outcome. For topic detection, the outcome must be one of the categories in the taxonomy, and so on. In the process of creating a model, the data is split into a training sample and a test sample. The training sample is used to train the model. The test sample is the hold-out sample that is used to validate the model. After the model is built, you can review its performance on the test sample.
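
The split into a training sample and a hold-out test sample can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not the mechanism that Pega Platform uses internally: the split_holdout and accuracy functions, the 20% test fraction, the fixed seed, and the sample records are all hypothetical.

```python
import random


def split_holdout(records, test_fraction=0.2, seed=42):
    """Shuffle the records and split them into (training sample, test sample).

    The 20% test fraction and the seed are assumptions made for this example.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predict, test_sample):
    """Return the fraction of hold-out records whose predicted outcome matches the label."""
    if not test_sample:
        raise ValueError("test sample must not be empty")
    correct = sum(1 for text, outcome in test_sample if predict(text) == outcome)
    return correct / len(test_sample)


# Hypothetical sentiment training data: (sample text, associated outcome).
records = [
    ("I love the new plan", "positive"),
    ("Great support, thank you", "positive"),
    ("The service is fine", "neutral"),
    ("My order arrived today", "neutral"),
    ("The laptop stopped working", "negative"),
    ("Terrible experience", "negative"),
    ("Very happy with the upgrade", "positive"),
    ("I want a refund now", "negative"),
    ("The invoice was sent", "neutral"),
    ("Worst purchase ever", "negative"),
]

train_sample, test_sample = split_holdout(records)
print(len(train_sample), len(test_sample))
```

A model is fitted on train_sample only, and a call such as accuracy(model_predict, test_sample), where model_predict is a placeholder for the trained model's prediction function, then estimates how well the model generalizes to texts that it did not see during training.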