Question

Fetch and Process more than 10000 records through Obj-Browse

Hi All,

We have a requirement to close (Resolve) all open/pending cases in the application that have not been worked on for more than 2 months (60 days). We have built an agent to do the processing and scheduled it to run once a week.

We are using Obj-Browse to fetch the records with the MaxRecords parameter left blank, so by default Obj-Browse returns 10,000 records. However, if there are more than 10,000 records (approx. 25,000), my question is: what would be the advisable, best-practice solution to handle such a scenario?

We are thinking of multiple approaches:

Should we schedule the agent multiple times to execute all the cases?

Or should we let it execute once and have the agent pick up the rest of the cases in next week's run?

Or should we run an activity once after deployment to take care of all such open cases, and then leave the agent to execute on its weekly schedule?

Or any other option?
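For context, the batch-draining pattern underlying several of these options can be sketched in plain Python. This is not Pega activity code; `fetch_batch` and `resolve_case` are hypothetical stand-ins for an Obj-Browse call with MaxRecords set and for the case-resolution logic:

```python
# Hedged sketch: drain a backlog in fixed-size batches rather than one
# huge fetch. fetch_batch/resolve_case are hypothetical stand-ins for
# Obj-Browse (with MaxRecords) and the case-resolution activity.

BATCH_SIZE = 500

backlog = list(range(1337))  # pretend these are stale case IDs

def fetch_batch(source, limit):
    """Return up to `limit` unprocessed items (like MaxRecords)."""
    return source[:limit]

def resolve_case(case_id):
    """Placeholder for status update / notifications / DB logging."""
    return case_id

processed = 0
while True:
    batch = fetch_batch(backlog, BATCH_SIZE)
    if not batch:
        break                      # backlog drained
    for case in batch:
        resolve_case(case)
    del backlog[:len(batch)]       # resolved cases drop out of the query
    processed += len(batch)

print(processed)  # 1337
```

Because resolved cases no longer match the "open and stale" query, each run simply picks up where the last left off, whether the runs happen in one session or across scheduled executions.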

Please share your thoughts and any similar scenarios you have encountered in your own projects.

Many Thanks!!

**Moderation Team has archived post**

This post has been archived for educational purposes. Contents and links will no longer be updated. If you have the same/similar question, please write a new post.

Correct Answer
September 15, 2015 - 11:19am

I think the key question is: do you expect to generate more than 10,000 items a week going forward? If so, then yes, you should spread the load or run more frequently (and really, the "best practice" would probably depend more on business needs than technical ones). If you expect to generate fewer items than that, then it really depends on how quickly you need that backlog cleaned up. You could, for instance, set the limit to something like 5,000 and run it daily during off hours to eat through the backlog in a week, or have multiple nodes run at the same time to spread the load (although you'll want to add code to deal with the inevitable collisions as each node works through a similar result set, or have each node pull a different subset of the items). If you generate significantly fewer items every week, you could possibly live with it and let it burn off over a couple of months. Again, it really has more to do with your business needs than technical limitations of the platform.
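The "each node pull a different subset" idea can be illustrated with modulo partitioning (plain Python, not Pega code; the case IDs and node count here are purely illustrative):

```python
# Sketch of modulo partitioning: each node claims only the cases whose
# ID maps to its own index, so the nodes' result sets never overlap
# and locking collisions between nodes are avoided.
NUM_NODES = 3

def partition_for(case_id: int, num_nodes: int) -> int:
    return case_id % num_nodes

cases = list(range(20))
by_node = {n: [c for c in cases if partition_for(c, NUM_NODES) == n]
           for n in range(NUM_NODES)}

# Every case lands on exactly one node.
assert sorted(c for subset in by_node.values() for c in subset) == cases
print({n: len(s) for n, s in by_node.items()})  # {0: 7, 1: 7, 2: 6}
```

In practice the partition key could be folded into the Obj-Browse filter so each node only ever fetches its own slice, rather than filtering after the fetch as shown here.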

I also want to take a moment to just ask why you have items that are going unworked for 60+ days. Is there some business process upstream that is failing which could be corrected to avoid having items end up in this state? If this is to clean up some failure, I'm a strong advocate of fixing the root problem so you don't end up with broken items.

Comments


September 15, 2015 - 9:13am

What kind of logic are you going to execute for those to-be-closed work objects?

If it is only a matter of changing the work status, I don't see any problem with dealing with 25,000 records in a couple of hours.

September 15, 2015 - 9:25am

Why not perform a benchmark test with 100 records to see how long it takes?

September 15, 2015 - 10:22am
Response to Chunzhi_Hong

Thanks Hong for your response.

As part of the processing we are updating the status, removing the flows, notifying the users, and updating/maintaining an external DB table with records of the cases updated.

We have already performed a benchmark test with 500 records, which were processed in 12 min 43 sec. So I estimate it should take approx. 4 h 15 min for 10,000 records and approx. 10 h 30 min for 25,000 records (by simple linear extrapolation; it may take less).
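As a sanity check, the linear extrapolation from that benchmark works out as follows (this assumes throughput stays constant over long runs, which it may not):

```python
# Linear extrapolation from the 500-record benchmark (12 min 43 s).
# Assumes constant throughput; real runs may slow down over time.
bench_records = 500
bench_seconds = 12 * 60 + 43          # 763 s

seconds_per_record = bench_seconds / bench_records

def estimate_hours(n):
    return n * seconds_per_record / 3600

print(round(estimate_hours(10_000), 1))  # ~4.2 hours
print(round(estimate_hours(25_000), 1))  # ~10.6 hours
```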

That is too long, hence I want to gather information from the community on how this kind of functionality/requirement is handled in other projects and what best practice is followed.

Pega
September 15, 2015 - 10:36am
Response to ParshantS

So only one node will handle all this load?

Did you consider queuing it up for a standard agent which can run on all the nodes so that the load is shared?
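The load-sharing idea behind queueing for a standard agent can be illustrated with a plain work queue (Python threads here stand in for agents running on different nodes; none of this is Pega API code):

```python
import queue
import threading

# Sketch of load sharing via a shared queue: case IDs are enqueued once,
# and worker threads (standing in for agents on different nodes) each
# pull whatever item is next, so the load spreads automatically.
work = queue.Queue()
for case_id in range(100):
    work.put(case_id)

done = []
lock = threading.Lock()

def worker():
    while True:
        try:
            case_id = work.get_nowait()
        except queue.Empty:
            return                 # queue drained, worker exits
        with lock:
            done.append(case_id)   # stand-in for resolving the case

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(done))  # 100 — each case processed exactly once
```

Because each item is dequeued exactly once, no two workers ever touch the same case, which is the same property a standard-agent queue provides across nodes.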


September 15, 2015 - 12:14pm
Response to MikeTownsend_GCS

The reason so many cases are left unworked is that users left the organization and new users started afresh with new cases. And as this is a legacy application that has been running 11-odd processes for the past 6 years, the number accumulated with no one noticing; quite ironic, isn't it?

However, we don't foresee this many cases being left going forward, as the business is becoming more knowledgeable and the support team more proactive. So we expect this to be a one-time activity to process the current backlog, after which the agent will run once a week to process any old unworked items, which should be rare and few.

Thanks for providing the information and sharing your thoughts.

So, we are planning to share the load between multiple nodes for the initial run, and then stop the agent on all but one node so that it runs on a single node only.

September 15, 2015 - 12:35pm

Prashant,

As it is an agent activity, you can choose to run it at a certain interval. If it is taking 12 mins for 500 cases, it is not recommended to let a single run go on that long. Instead, size each batch so that a run finishes in about 2 minutes, and set an agent interval of 10 seconds or so. The agent will then wake up again after 10 seconds and continue processing.

As you will be querying only for open cases, each run will naturally skip the cases processed earlier. You can schedule it to run at night and be pretty much done by morning.
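Sizing the batch so each wake-up finishes in roughly two minutes follows directly from the measured throughput (using the 500-records-in-12-min-43-s figure from the benchmark earlier in this thread):

```python
# Pick a batch size so that one agent run takes ~2 minutes, given the
# measured rate of 500 records in 763 s from the earlier benchmark.
bench_records, bench_seconds = 500, 12 * 60 + 43
target_seconds = 2 * 60

batch_size = int(target_seconds * bench_records / bench_seconds)
print(batch_size)  # ~78 records per 2-minute run
```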

August 18, 2016 - 3:50am

yes!