Creating workflow template for Predictive Analytics in PEGA


I went through the tutorial on 'Building Predictive Models' and tried to create a predictive model in my Prediction/Decision Studio. However, some of the options/features provided in the default workflow are confusing to me. For example, I tried solving the standard 'Loan Prediction Problem', and at the end I saw a chart that looks like k-fold cross-validation, whereas I was expecting a confusion matrix.

Can somebody tell me, or point me to, how to create a customized template/workflow for prediction cases?




October 7, 2019 - 6:31am

Hi Sanjeev,

Thanks for your question. It sounds like you have a good data science background.

In a nutshell, we don't actually expose the underlying pipeline, as it is meant to be a 'model factory' rather than a 'model laboratory'. It therefore follows a best-practice pipeline, similar to other highly automated tools on the market.

The pipeline is roughly as follows:

  • Specify the goals (binary scoring, numerical regression), define the population and outcomes, and specify hold-out validation schemes (not n-fold)
  • Automatically prepare all predictors (numerical binning, symbolic grouping) and measure the performance of each predictor univariately
  • Perform subset feature selection: group the predictors based on cross-correlation, and sort them on performance within each group
  • Automatically build regression and decision tree models. You can also build additional models, including genetic programming
  • Evaluate the models, first at the score level (lift charts, discrimination/ROC). Then calibrate the scores by binning them into classes and calculating the 'true' probabilities.
  • Generate model documentation and export
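
To make the calibration step a little more concrete, here is a minimal Python sketch of binning scores into classes and computing the observed probability per class. It is only an illustration of the general idea, not the code the product runs: the function name `calibrate_by_binning`, the class boundaries, and the sample data are all made up for this example.

```python
from bisect import bisect_right


def calibrate_by_binning(scores, outcomes, boundaries):
    """Group scores into classes and return the observed positive rate per class.

    scores     -- raw model scores (floats)
    outcomes   -- 1 for a positive outcome, 0 for a negative one
    boundaries -- ascending cut points; len(boundaries) + 1 classes result
    (Hypothetical helper, for illustration only.)
    """
    n_classes = len(boundaries) + 1
    totals = [0] * n_classes
    positives = [0] * n_classes
    for score, outcome in zip(scores, outcomes):
        k = bisect_right(boundaries, score)  # index of the class this score falls in
        totals[k] += 1
        positives[k] += outcome
    # Observed probability per class; None when a class received no cases.
    return [p / t if t else None for p, t in zip(positives, totals)]


# Made-up hold-out data, not taken from any real model.
scores = [0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.80, 0.90]
outcomes = [0, 0, 0, 1, 0, 1, 1, 1]
rates = calibrate_by_binning(scores, outcomes, boundaries=[0.3, 0.7])
print(rates)
```

Each entry in `rates` is the fraction of positive outcomes observed in that score class on the hold-out data, which is what replaces the raw score as the calibrated probability.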



October 7, 2019 - 10:01am

Hi Sanjeev,

In addition to Peter's comments above, in the 'Score distribution' step of 'Model analysis' you can find the numbers of each score interval in the Tabular view.

For example:

and with the 'behaviour' in each of these intervals you can calculate the numbers in the confusion matrix.
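
As a rough illustration of that calculation, the Python sketch below derives a confusion matrix from a table of score intervals. It assumes that 'behaviour' is the fraction of positive cases in an interval and that every interval at or above a chosen cutoff is predicted positive. The function name `confusion_from_intervals` and the figures are invented for the example and are not taken from the Tabular view.

```python
def confusion_from_intervals(intervals, cutoff):
    """Build a confusion matrix from per-interval counts.

    intervals -- list of (cases, behaviour) tuples, lowest score interval first;
                 behaviour is assumed to be the fraction of positive cases
    cutoff    -- index of the first interval that is predicted positive
    (Hypothetical helper, for illustration only.)
    """
    counts = []
    for cases, behaviour in intervals:
        positives = round(cases * behaviour)  # actual positives in this interval
        counts.append((positives, cases - positives))

    predicted_negative = counts[:cutoff]
    predicted_positive = counts[cutoff:]
    return {
        "TP": sum(pos for pos, _ in predicted_positive),
        "FP": sum(neg for _, neg in predicted_positive),
        "FN": sum(pos for pos, _ in predicted_negative),
        "TN": sum(neg for _, neg in predicted_negative),
    }


# Invented numbers: four score intervals of 100 cases each.
intervals = [(100, 0.05), (100, 0.20), (100, 0.60), (100, 0.90)]
matrix = confusion_from_intervals(intervals, cutoff=2)
print(matrix)
```

Moving `cutoff` up or down trades false positives against false negatives, so the same interval table yields a different confusion matrix for each threshold you choose.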