Question

How to avoid application JVM restarts during a database failover

Hi,

Could someone please recommend how to avoid application JVM restarts (i.e., keep DB sessions alive) during a database failover?

Application details: PRPC731, MSSQL and Tomcat

Thank you.

Comments


April 9, 2019 - 10:03pm

Pega normally delegates the handling of DB failover to the respective vendor (in this case Microsoft) at the JDBC driver level. See the relevant link: https://docs.microsoft.com/en-us/sql/connect/jdbc/jdbc-driver-support-for-high-availability-disaster-recovery?view=sql-server-2017. If you configure your Tomcat datasource using the connection properties described in the link, failover should in theory be transparent to Pega. Other application servers (e.g., WebSphere) have similar mechanisms.
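As a rough illustration of what "connection properties" means here, the sketch below assembles a SQL Server JDBC URL that carries two of the driver's documented properties, multiSubnetFailover and loginTimeout. The class name, the listener host, the port, and the database name are placeholders of mine, not values from this thread, and the resulting string would still have to be placed in the url attribute of the Tomcat datasource and verified against your own environment.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: builds a SQL Server JDBC URL with high-availability properties. */
final class SqlServerFailoverUrl {

    private SqlServerFailoverUrl() {
    }

    /**
     * @param listenerHost availability group listener name (placeholder, supplied by the caller)
     * @param port         listener port (placeholder)
     * @param database     database name (placeholder)
     */
    static String build(String listenerHost, int port, String database) {
        // LinkedHashMap keeps the properties in insertion order for a predictable URL.
        Map<String, String> props = new LinkedHashMap<>();
        props.put("databaseName", database);
        // Driver property intended for connections made through an availability group listener.
        props.put("multiSubnetFailover", "true");
        // Login timeout in seconds; 30 is an assumed value, tune it for your setup.
        props.put("loginTimeout", "30");

        StringBuilder url = new StringBuilder("jdbc:sqlserver://")
                .append(listenerHost).append(':').append(port);
        for (Map.Entry<String, String> entry : props.entrySet()) {
            url.append(';').append(entry.getKey()).append('=').append(entry.getValue());
        }
        return url.toString();
    }
}
```

For a database mirroring setup, the same Microsoft page describes a failoverPartner property instead; which property applies depends on how the MSSQL high-availability layer is built.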

September 11, 2019 - 10:11am
Response to KevinZheng_GCS

Hi Kevin,

Could you also please advise on the requirement that the JVMs re-establish their connection to the database, without a restart, whenever the connection is lost? (Application details: PRPC731, MSSQL and Tomcat.)

Please let me know if you need additional details.

Thank you.

September 11, 2019 - 11:50am
Response to PradeepChowdaryP

As far as I know, Pega does not have a universal solution for automatically reconnecting after a DB outage, not even in the latest release. Have you tried the approach in the Microsoft link and found that it does not work? I am aware of some clients who have implemented a similar solution completely independently of Pega, using the WebSphere fail-over mechanism.

September 12, 2019 - 12:28pm

Hi Kevin,

Could you please confirm whether the testOnBorrow and testOnReturn parameters, together with a validation query, can help discard old connections and re-acquire connections to the DB without restarting the application JVMs? Please also confirm whether these parameters are compatible with MSSQL and Tomcat.

In our previous performance tests we observed a performance overhead from the validation query.

September 12, 2019 - 1:44pm

I would always suggest setting testOnBorrow to true, even for scenarios where there is no DB outage, such as a short-lived network outage. However, to be clear, these settings will NOT guarantee that a Pega restart is unnecessary after a DB outage. What is worse, the problems sometimes do not show up right away. I can see two options going forward (assuming Pega does not relax this limitation); these are just my own thoughts:
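To make the testOnBorrow idea concrete, here is a minimal sketch of validate-on-borrow in plain Java. It is not Tomcat's pool implementation; ValidatingPool and its method names are hypothetical. The point it shows is that a stale resource is discarded at borrow time and replaced with a fresh one, which is also why every borrow pays for one validation call.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical, simplified pool that validates each idle resource before handing it out. */
final class ValidatingPool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;     // creates a fresh resource (e.g., opens a connection)
    private final Predicate<T> validator;  // stands in for the validation query

    ValidatingPool(Supplier<T> factory, Predicate<T> validator) {
        this.factory = factory;
        this.validator = validator;
    }

    /** Returns a validated idle resource, or a new one if every idle resource is stale. */
    synchronized T borrow() {
        T candidate;
        while ((candidate = idle.poll()) != null) {
            if (validator.test(candidate)) {
                return candidate;
            }
            // Validation failed: drop the stale resource and try the next idle one.
        }
        return factory.get();
    }

    /** Puts a resource back into the idle set for later reuse. */
    synchronized void giveBack(T resource) {
        idle.push(resource);
    }

    synchronized int idleCount() {
        return idle.size();
    }
}
```

With real JDBC connections the validator would wrap something like Connection.isValid(int) or run the configured validation query. As noted above, this only replaces broken pooled connections; it does not by itself make a Pega restart unnecessary.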

1. Moving to a cloud-native architecture, e.g., a Kubernetes deployment, in line with wider industry adoption. Pega could provide a health check (still a significant enhancement) that detects scenarios like a DB outage, and Kubernetes could automatically restart the pods until they succeed (e.g., once the DB service is restored). This way, at the very least, no admin intervention is needed. However, your service can still be disrupted if you do not have a fail-over DB. Unfortunately, this is not applicable to your current traditional deployment.

2. Relying on a third-party solution, such as the one provided by WebSphere, which hides all DB fail-over from the Pega layer. I thought the Microsoft link I sent you earlier could potentially address that, but you will have to verify it should you go that route.
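For option 1, the kind of health check meant here could look roughly like the sketch below. This is not an existing Pega feature; DbHealthProbe and its dbCheck callback are hypothetical. It maps a database check to an HTTP status code that an HTTP liveness probe could consume, treating any failure, including an exception from the check, as unhealthy.

```java
import java.util.concurrent.Callable;

/** Hypothetical health probe: translates a database check into an HTTP status code. */
final class DbHealthProbe {
    private final Callable<Boolean> dbCheck; // e.g., a wrapper around a validation query

    DbHealthProbe(Callable<Boolean> dbCheck) {
        this.dbCheck = dbCheck;
    }

    /** Returns 200 when the check reports true, 503 when it reports false or throws. */
    int statusCode() {
        try {
            return Boolean.TRUE.equals(dbCheck.call()) ? 200 : 503;
        } catch (Exception e) {
            // A failing check (timeout, broken connection) counts as unhealthy.
            return 503;
        }
    }
}
```

Kubernetes treats HTTP probe responses from 200 to 399 as success, so returning 503 during a DB outage would lead to the restart behaviour described in option 1.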