Question

Making optimistic locking safe

Continuing my questions on locking, from a question I'd asked on the PDN article on Pega7 case locking...

Here's the Pega7 design: “Agents and services such as SLAs and bulk actions can update a case that is being worked by an operator. When the user attempts to save work, the update will fail.”

That doesn't sound like a good design. Is there any way to use optimistic locking wherein a user's lock could take precedence over that of an agent process?

Moreover, this touches upon the question of whether the engine -- via WebSocket-like messaging -- can notify the client of a lost lock before the user clicks submit.

***Updated by moderator: Marissa to close post***

This post has been archived for educational purposes. Contents and links will no longer be updated. If you have the same/similar question, please write a new post.

Comments


March 13, 2015 - 10:49am

If I understand things correctly, the agent doesn't actually take the lock until it's committing its changes, since that's how optimistic locking works. Allowing the user's changes to take precedence would mean that the user's update overwrites the agent's update in the database. You would need a mechanism to requeue the agent's work, or it would be lost entirely. What you are proposing would have some serious data-integrity consequences. If you are using optimistic locking, presumably your application and user base are aware that there will be times when updates fail because another user/process got there first. If that is unacceptable for your business requirements, then you are probably better off using default locking.

March 16, 2015 - 8:09am
Response to MikeTownsend_GCS

Doesn't Optimistic Locking, by its nature, introduce potential data integrity issues?

Consider this common example of a workbasket.

If Alice opens a case from the workbasket, and then leaves her desk, Bob or Charlie can't work on it.

With optimistic locking -- Alice could contact Bob through another channel (email, phone, etc), "Hey Bob, I need to run, so go ahead and open CASE-1234 with details X,Y,Z"

If she doesn't consider that, Bob could update the case, and then she can't make updates.

So, data integrity concerns are mitigated through communications between updaters.

Now let's consider this agent. It doesn't presently communicate its updates to anybody else holding the lock.

Ideally, there's a common solution to both of the above scenarios--

When multiple parties have the (optimistic) lock, after one commit is made, the engine will need to communicate to all *other* parties about that update -- through the system.

Bob [or the SLA Agent] has just updated the following properties:

  • X - Read-only property; it will now update on your form automatically
  • Y - You have not updated this, so it will now update on your form automatically
  • Z - You have modified the text for this; your changes are in green and Bob's are in red, please reconcile
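The three cases above amount to a field-level three-way merge. Here is a minimal sketch in Python, assuming each party's view of the case is a plain dictionary and that the engine still has the values the user originally opened (`base`). The names `reconcile` and `MergeResult` are hypothetical and are not part of any Pega API.

```python
from dataclasses import dataclass


@dataclass
class MergeResult:
    merged: dict        # the form values after reconciliation
    auto_updated: list  # properties refreshed without asking the user
    conflicts: dict     # property -> (my value, their value), to reconcile by hand


def reconcile(base, mine, theirs, read_only=frozenset()):
    """Merge another party's committed changes into my unsaved form.

    base   -- values as they were when I opened the case
    mine   -- my current, unsaved form values
    theirs -- values just committed by the other party (Bob or an agent)
    """
    merged = dict(mine)
    auto_updated = []
    conflicts = {}
    for name, their_value in theirs.items():
        if their_value == base.get(name):
            continue  # the other party did not change this property
        my_value = mine.get(name)
        if name in read_only or my_value == base.get(name):
            # Cases X and Y: I cannot edit it, or I have not touched it.
            merged[name] = their_value
            auto_updated.append(name)
        elif my_value != their_value:
            # Case Z: both of us changed it to different values.
            conflicts[name] = (my_value, their_value)
    return MergeResult(merged, auto_updated, conflicts)


base = {"X": 1, "Y": "a", "Z": "draft"}
mine = {"X": 1, "Y": "a", "Z": "draft by Alice"}
theirs = {"X": 2, "Y": "b", "Z": "draft by Bob"}
result = reconcile(base, mine, theirs, read_only={"X"})
```

In this example `X` and `Y` are refreshed automatically and only `Z` is handed back to the user, which is the "minimal reconciliation" situation described for an agent that mostly touches read-only properties.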

In most cases, the SLA Agent will just be updating some read-only properties, so the reconciliation work would be minimal.

I assume this is the direction the product will be going, to match the multi-updater capabilities of Google Docs, OneNote, etc.

March 16, 2015 - 1:26pm
Response to JonnyGar

There are some customers who choose to have Alice get a message saying that her changes are lost because Bob got there first. They are OK with that and build their business processes around it. I'd suggest that isn't a data integrity issue, it's more of a design decision. The data in the database is correct and the message Alice gets is to ensure that. Alice and Bob talking outside of the system and agreeing not to change the same fields is an unreliable mechanism to build an enterprise application around and still not going to avoid problems since the work object is opened/saved as a whole object. You're right, agents can't have that kind of communication. Your assumption that agents only update read only properties may be true for your business cases, but I can assure you that there are customers who do significant work via agents. Their needs are likely different.

March 16, 2015 - 9:32pm
Response to MikeTownsend_GCS

Mike, work with me here.

I wrote: "In most cases, the SLA Agent will just be updating some read-only properties..."

You wrote: "Your assumption that agents only update read only properties..."

My assumption was about one agent -- the SLA Agent, and I said "in most cases". Most of the time it is used to update the inputs to pxUrgencyWork. So let me reword that: "In the common situation where an SLA agent updates values like the inputs to pxUrgencyWork, the user would be spared from having to manually reconcile the values." Certainly, the next most common use for an SLA is to dequeue an assignment and send it along in the flow. That's a more complex situation, and it wouldn't support safe updating.

But that's not the main point. I'm trying to get at how one can best work with this -- or as part of dealing with locking issues in general.

Some customers "are OK with that [Alice getting a LockLost error after submission] and build their business processes around it."

I can't really picture that.

The business processes around this involve communicating either outside the system or within it. The former is admittedly non-ideal ("an unreliable mechanism to build an enterprise application around", as you say), which is why I am making the case for an in-system messaging solution.

An in-system messaging solution would work with traditional locking as well.

March 18, 2015 - 2:26pm
Response to JonnyGar

Jon,

It seems that the thrust of your post is to suggest how locking could work better. I believe you mentioned in a previous locking post that you have an enhancement request in with Pega already. Is it for this particular bit of functionality? If not, I would suggest working with your account team to document how you would like to see locking work and having them work with product management to get those user stories assigned to backlogs, prioritized, etc. Unfortunately, I work in support so I can help with the way it works today, but have much less influence over the future direction of the product than your account executive.

- Mike

March 18, 2015 - 3:08pm
Response to MikeTownsend_GCS

I think this sounds like a good recommendation based on what we have in place today.

As far as this initiative goes, community ideation is certainly something we are discussing. While I have the ability to enable specific ideation features for use in this space, we've made a conscious decision not to do so.

We believe providing an ideation mechanism gives the impression that those ideas have a place to go once submitted and discussed. Right now there isn't one. We believe that trust and transparency are what will grow this initiative in these early days, so rather than give a false impression that we're doing something we really aren't by providing ideation, we are keeping the focus on discussion and content for now.

For the short-term, I think it's fine to discuss suggestions for improvement here and toss ideas around. But please appreciate that, until we define our processes and rules of engagement specific to ideation, there will probably be limits to what Pega employees can discuss about future improvements in this space.

Thanks.

B.

Brendan | Community Moderator | Pegasystems Inc.

March 19, 2015 - 5:00pm
Response to BrendanHoran_GCS

We should probably branch this onto a separate topic -- it's important to discuss.

Indeed, there's a natural progression from question to enhancement. This started from some questions of mine about whether things were possible, and why they were designed in a certain way. Along that path there will be some coalescing of ideas from various parties.

We did that to some extent on the PDN Forums, but there's a limit to doing this via discussions.

re: "I would suggest working with your account team to document how you would like to see X work..."

Where to document?

This is where I see the power of a community like this: collaborating on ideas that aren't really customer-specific.

From that point on, we could bring them to our AE through the conventional pipeline.

November 3, 2015 - 11:07am
Response to BrendanHoran_GCS

8 months ago... "I would suggest working with your account team to document how you would like to see X work..."

We can add documents now, so I'll create this in the Mesh Product Support area some time this week.

July 29, 2015 - 12:27am

I think optimistic locking is safe in terms of data integrity, because the system compares .pxLastUpdateTime of the case on the clipboard with that of the object in the DB to determine whether another user has committed an action since the current user opened the case.

It is the nature of optimistic locking that the first user to submit an update "wins", so users also need to be optimistic, since they chose optimistic locking.
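The comparison described above can be sketched as follows. This is only an illustration in Python, assuming an in-memory, single-threaded store. `CaseStore`, `CaseRecord`, and `StaleCaseError` are hypothetical names, and `last_update_time` is a stand-in for .pxLastUpdateTime rather than the engine's actual implementation. A real database would need to perform the compare and the write atomically.

```python
from dataclasses import dataclass


class StaleCaseError(Exception):
    """Raised when someone else committed since this copy was opened."""


@dataclass
class CaseRecord:
    case_id: str
    data: dict
    last_update_time: int  # stand-in for .pxLastUpdateTime


class CaseStore:
    def __init__(self):
        self._rows = {}
        self._clock = 0  # logical clock, so the example is deterministic

    def create(self, case_id, data):
        self._clock += 1
        self._rows[case_id] = CaseRecord(case_id, dict(data), self._clock)

    def open(self, case_id):
        # No lock is taken: the caller just gets a private copy.
        row = self._rows[case_id]
        return CaseRecord(row.case_id, dict(row.data), row.last_update_time)

    def commit(self, copy):
        row = self._rows[copy.case_id]
        if row.last_update_time != copy.last_update_time:
            raise StaleCaseError(f"{copy.case_id} was updated by someone else")
        self._clock += 1
        self._rows[copy.case_id] = CaseRecord(copy.case_id, dict(copy.data), self._clock)
        copy.last_update_time = self._clock


store = CaseStore()
store.create("CASE-1234", {"status": "New"})
alice = store.open("CASE-1234")
bob = store.open("CASE-1234")

bob.data["status"] = "Resolved"
store.commit(bob)  # Bob submits first and "wins"

alice.data["status"] = "Pending"
try:
    store.commit(alice)
    alice_rejected = False
except StaleCaseError:
    alice_rejected = True  # Alice must refresh and redo her change
```

Note that the stored data is never overwritten by the stale copy, which is the data-integrity point, while Alice only learns of the conflict at submit time, which is the UX point raised elsewhere in this thread.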

March 17, 2016 - 1:12pm
Response to Chunzhi_Hong

Hi,

A general question on the original thread. The SLA agent uses optimistic locking, as its context-setting activity (EstablishContext) calls OpenAndLockWork. How can we allow a custom Standard agent to use this feature as well? Do we need to specify this explicitly in the EstablishContext activity, as the SLA agent does? I thought there would be some default methods or Process APIs that we could use in Standard agents to utilize this feature.

Any help in this regard would be appreciated.

March 17, 2016 - 3:16pm

Optimistic locking doesn't introduce data integrity issues. But it does introduce frustration, although the flip side is that it relieves frustration.

The introduced frustration: Sometimes, after you spend a while filling in fields and you finally click submit, you get "another change has prevented you from doing this update", so you have to start over. (With default locking, this won't occur, because you will have been given the lock when you started filling in the fields.)

The relieved frustration: If you go to lunch while you're in the middle of filling in the screen, another user will be allowed to make updates to the work. (With default locking, the other user would get a message saying "you can't make your change because your colleague has it locked but they are at lunch".)

Personally, I prefer default locking, but to prevent the lunch problem there should be a timeout that kicks in if there is inactivity. That way, if you are taking a long time typing in a screen but are actively working on it, you would keep the lock, but if you work on something else or go to lunch, you'd lose the lock.
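A rough sketch of that inactivity timeout, in Python. `IdleTimeoutLock` and its methods are hypothetical names made up for illustration, not an existing Pega feature or API, and the clock is injected so the behaviour can be shown without actually waiting.

```python
import time


class IdleTimeoutLock:
    """Pessimistic lock that lapses after a period of holder inactivity."""

    def __init__(self, idle_timeout_s, clock=time.monotonic):
        self._timeout = idle_timeout_s
        self._clock = clock
        self._owner = None
        self._last_activity = 0.0

    def _expired(self):
        idle = self._clock() - self._last_activity
        return self._owner is not None and idle >= self._timeout

    def acquire(self, user):
        # Granted if the lock is free, already mine, or the holder went idle.
        if self._owner in (None, user) or self._expired():
            self._owner = user
            self._last_activity = self._clock()
            return True
        return False

    def touch(self, user):
        """Record activity (e.g. typing). Returns False if the lock was lost."""
        if self._owner != user or self._expired():
            return False
        self._last_activity = self._clock()
        return True


now = [0.0]  # fake clock, in seconds
lock = IdleTimeoutLock(idle_timeout_s=300, clock=lambda: now[0])

alice_got_lock = lock.acquire("alice")       # Alice opens the case at t=0
now[0] = 200.0
alice_still_typing = lock.touch("alice")     # activity resets her idle timer
now[0] = 450.0
bob_blocked = not lock.acquire("bob")        # only 250s idle: Bob must wait
now[0] = 600.0
bob_got_lock = lock.acquire("bob")           # 400s idle: Alice is "at lunch"
alice_lost_lock = not lock.touch("alice")    # her next keystroke reports the loss
```

The design choice here is that the timer measures time since the last activity rather than time since the lock was taken, so a user who is slow but actively typing keeps the lock.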

Oh, and to address the first topic, I don't think agents such as SLA need to use optimistic locking, since they tend to be in and out quickly so they don't keep the work locked for long anyway.

/Eric

March 17, 2016 - 4:49pm
Response to ericosman_GCS

Indeed. That's the problem I raised 12 months ago at the start of the thread. Optimistic locking by itself does not quite address the edge cases, or the need for messaging between clients about the state of locks.

We reached a stalemate here in awaiting an account executive who understood the ins-and-outs of lock management to come to the rescue. Or maybe there's someone on the Product Architecture team who could lend some insight on the product direction now.

May 10, 2016 - 6:55pm
Response to ericosman_GCS

A lock can go stale after 1 hour, which could solve your "lunch" problem.

Is that right?

/Don

July 13, 2016 - 4:38pm

From what I have learned, both approaches have their advantages and disadvantages, but neither causes any data integrity issues. Personally, I think it depends on what your business need is, but the default mechanism is mostly sufficient, for the following reasons.

If a company has an application that creates or updates work while its customers are online, then default locking is the way to go. If an operator has opened a case that uses the default locking mechanism and an agent then tries to work on the same item, the EstablishContext activity can't get the lock on that object, and the item is sent to the broken queue (if AQM is enabled). A user then needs to requeue it.

Even when the processing of a particular piece of work is less aggressive, it's still better to go with default locking, as there is no time frame (<5 mins) to complete the work. We can even make the system release the lock after some time by specifying a custom timeout value in the default mechanism.

I believe optimistic locking doesn't cause data issues, as it will inform the late user to refresh the form and submit it again to save his/her work. If the optimistic approach gives the client an edge in the way the work is done, or in some other respect, they can eliminate the frustration by using Pega Pulse to notify other users which case they are working on.

July 27, 2016 - 9:24am

"I believe optimistic locking doesn't cause data issues, as it will inform the late user to refresh the form and submit it again to save his/her work...."

Granted, that's not a data integrity issue. It's a UX issue. Does it inform the user before they've submitted the form, or after?

What I've been after since the start here is a holistic explanation of how locking can address edge cases -- today and in future versions.