Question

Rule Search Index Not Being Updated / pyFTSIncrementalIndexer Exception Pega 8.1.3

The Rule Search Index is not being updated.

Upon checking the PegaRULES log file, pyFTSIncrementalIndexer is throwing an exception.

The following are the first couple of lines from the PegaRULES log file:

2019-05-22 14:57:14,048 [CHEDULER_THREAD_POOL] [ STANDARD] [ ] [ ] (ueueProcessorFailedRunsManager) ERROR - Unable to re-start failed queue processor pyFTSIncrementalIndexer. Caught exception: Cannot resubmit the run [pyFTSIncrementalIndexer] because some of its partitions are not in end state
java.lang.IllegalStateException: Cannot resubmit the run [pyFTSIncrementalIndexer] because some of its partitions are not in end state
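
To spot this condition across a large log file, the error line above can be matched with a regular expression. The following is a minimal sketch, not a Pega utility: the pattern is derived only from the single log line quoted here, so the wording may differ in other versions, and the function name `find_failed_queue_processors` is hypothetical.

```python
import re

# Pattern built from the error line quoted in the question. Assumption: other
# Pega versions may word this message differently.
FAILED_RESTART = re.compile(
    r"Unable to re-start failed queue processor (?P<name>\S+?)\. "
    r"Caught exception: (?P<reason>.*)"
)


def find_failed_queue_processors(log_lines):
    """Return {queue processor name: last reported reason} for failed restarts."""
    failures = {}
    for line in log_lines:
        match = FAILED_RESTART.search(line)
        if match:
            failures[match.group("name")] = match.group("reason").strip()
    return failures


# The two lines quoted above, used as sample input.
sample = [
    "2019-05-22 14:57:14,048 [CHEDULER_THREAD_POOL] [ STANDARD] [ ] [ ] "
    "(ueueProcessorFailedRunsManager) ERROR - Unable to re-start failed queue "
    "processor pyFTSIncrementalIndexer. Caught exception: Cannot resubmit the run "
    "[pyFTSIncrementalIndexer] because some of its partitions are not in end state",
    "java.lang.IllegalStateException: Cannot resubmit the run "
    "[pyFTSIncrementalIndexer] because some of its partitions are not in end state",
]

failures = find_failed_queue_processors(sample)
for name, reason in failures.items():
    print(f"{name}: {reason}")
```

To scan a real file, pass an open file object instead of `sample`; only the ERROR line matches, and the stack trace line is ignored.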

PRPC Version: 8.1.3

***Edited by Moderator: Lochan to update SR details***

Correct Answer
September 25, 2019 - 9:39am

Hi,

As per the SR-D17907 notes, the root cause of this issue is that when a Queue Processor's Kafka partition ends up in an unknown state, the automatic restart does not clear the partition. This results in the exception, and the queue processor fails to restart.

It was resolved by calling the API function pxStartRunByID and passing it the failed queue processor's name.  This API function resets the state of the partition when it starts the run.
You can verify that the Queue Processor is running normally from the Admin Studio > Resources > Queue Processor page.

In 8.1.6, this is addressed by having the automatic restart functionality also reset the state of the Kafka partition when a failed run is restarted.

Thanks,
Abhinav

Comments

May 22, 2019 - 6:03pm

Can you please check the status of the Stream node (Pega landing page --> Decisioning --> Infrastructure)?

May 23, 2019 - 9:16am

Statuses of both the agent and the stream node appear to be fine.  Please see the two attached images.

June 3, 2019 - 10:45am

SR-D17907 was created to investigate this issue.

Will update this thread further when the issue is resolved.

June 21, 2019 - 8:06am

Hi, 

Apparently we have the same scenario: same error, same PRPC version. We would appreciate any information added to this post that could help us solve it.

Raquel

June 26, 2019 - 10:49am

Call the API function: Data-Decision-DDF-RunOptions pxStartRunByID

This API function resets the state of the partitions of a (failed, in this case) queue processor and starts the run. You will pass it the ID of the queue processor that is stuck in a failure state - in this case, pyFTSIncrementalIndexer.

Starting in 8.1.6 and 8.2.3, the engine also resets the state of the partitions when it automatically restarts failed queue processors, so the automatic restart handles this scenario where a partition is in an unexpected state.

You can check that the queue processor is running properly after doing this by going to Admin Studio > Resources > Queue Processors.

July 18, 2019 - 2:18pm
Response to SylSchinco_GCS

Hi Syl, I am unable to find an option to call this API function "Data-Decision-DDF-RunOptions pxStartRunByID". Can you provide the steps to do this?

September 24, 2019 - 5:43pm
Response to SateeshV1351

Hi - what version are you using?  The above recommendation relates to 8.1.x prior to 8.1.6, and 8.2.x prior to 8.2.3.  Perhaps make a new thread.  7.x indexing uses entirely different mechanisms, so your error would not be identical, nor would the API functions be available.

Also, if you're attempting to find this rule in your environment using the search bar, you need to check the enable diagnostic checkbox in your operator profile (click your initials/profile picture in the lower left of Dev Studio > Preferences).
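
The version ranges in this reply can be turned into a quick check. This is only a sketch of the ranges stated in this thread (8.1.x prior to 8.1.6, 8.2.x prior to 8.2.3), not an official compatibility matrix; the names `parse_version`, `FIXED_IN`, and `needs_manual_restart` are hypothetical.

```python
def parse_version(text):
    """Turn a version string such as '8.1.3' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# First release on each line where, per this thread, the automatic restart
# also resets the partition state. Other release lines are not covered here.
FIXED_IN = {(8, 1): (8, 1, 6), (8, 2): (8, 2, 3)}


def needs_manual_restart(version_text):
    """Return True if the manual pxStartRunByID workaround applies,
    False if the thread says the automatic restart handles it,
    and None if the thread does not cover that release line."""
    version = parse_version(version_text)
    fixed = FIXED_IN.get(version[:2])
    if fixed is None:
        return None
    return version < fixed


for v in ("8.1.3", "8.1.6", "8.2.2", "8.2.3", "7.4"):
    print(v, needs_manual_restart(v))
```

A result of `None` means this thread says nothing definite about that release line, for example 7.x, where indexing works differently.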

November 7, 2019 - 7:49pm
Response to SylSchinco_GCS

Hi,

We have upgraded from 8.1.3 to 8.1.6 and still have this issue. Do we need to reset?

Pega
November 8, 2019 - 1:31am
Response to RajaniKanth

It is addressed in 8.1.6. If it did not work, then please perform the same steps that you followed in the previous version.

Thanks,

Abhinav

Pega
September 25, 2019 - 5:33pm

Deleting the pyFTSIncrementalIndexer instance of Data-Decision-DDF-RunOptions-Queueprocessor will also restart the data flow if the Resume button is disabled, but it needs a restart.