Question

How to read an HTML email body (MS Outlook) and loop over a dynamic table in the body

How can I read the HTML body of a Microsoft Office (Outlook) email and loop over the rows of a dynamic table in that body, in order to create a folder in a specific location for each Error Code? Refer to the attached email example.

Comments

Pega
July 9, 2018 - 12:52pm

You'll need to write some custom code here.

If you are on a version of PRPC that ships an HTML 'tag soup' processing library (such as TagSoup or jsoup), you can configure your Email Service / Listener to use 'prefer HTML', which should populate the '.pyBody' property with the entire HTML content of your email.

Then you can reference a post-processing Activity to examine the contents of the '.pyBody' property to extract the information you need.

This will most likely be a 'messy' exercise, with lots of exceptions to deal with, but it's not impossible.

Here's an example (for PRPC 7.3) of some Java code using TagSoup to process HTML emails. This example just strips ALL elements from the HTML body and extracts the text, which is a destructive process (it loses tables, images, etc.) but can be useful.

You should be able to adapt this to deal with TABLE / TR / [TH|TD] blocks in your HTML.

It needs three 'local' variables (on the Parameters tab of the Activity) set up; the names are case-sensitive and must match the Java code:

  1. pyBody (String)
  2. inBody (Boolean)
  3. bodyTextSB (StringBuffer)

And the 'Step Page' should be '.pyInboundEmail'.

You need to reference this Activity from your Email Service Rule.

It's one giant Java step, I'm afraid, which is very much outside of the guardrails; so please understand the possible risks (to code maintainability, etc.) that you are introducing by following an approach like this.

    // The inner class is declared first; the call that uses it is at the end of this Java step.
    // Wrap the local String pyBody in an InputStream for the parser.
    java.io.InputStream is = new org.apache.commons.io.input.ReaderInputStream(
        new java.io.StringReader(pyBody), java.nio.charset.StandardCharsets.UTF_8);
    inBody = false;
    bodyTextSB = new StringBuffer();

    class myHandler extends org.xml.sax.helpers.DefaultHandler {
        // For each tracked tag, records whether the parser is currently inside it.
        String[] tags = { "body", "script", "style" };
        java.util.Map<String, Boolean> tagsOfInterest = new java.util.HashMap<>();
        {
            for (String tag : tags) {
                tagsOfInterest.put(tag, false);
            }
        }

        @Override
        public void characters(char ch[], int start, int length) {
            // Only text inside <body> is kept.
            if (tagsOfInterest.get("body")) {
                boolean dontWrite = false;
                StringBuilder debugTags = new StringBuilder();
                for (String tag : tagsOfInterest.keySet()) {
                    // Text inside any other tracked tag (script, style) is skipped.
                    if (!tag.equals("body")) {
                        dontWrite = dontWrite || tagsOfInterest.get(tag);
                    }
                    if (oLog.isDebugEnabled()) {
                        debugTags.append(tag + "=" + tagsOfInterest.get(tag) + " ");
                    }
                }
                if (oLog.isDebugEnabled()) {
                    oLog.debug(debugTags.toString());
                    oLog.debug("dontwrite=" + dontWrite);
                }
                if (!dontWrite) {
                    // Each chunk is trimmed, so adjacent chunks are joined without a separator.
                    bodyTextSB.append(new String(ch, start, length).trim());
                }
            }
        }

        @Override
        public void startElement(String uri, String localName,
                                 String name, org.xml.sax.Attributes a) {
            String lc = localName.toLowerCase();
            if (tagsOfInterest.containsKey(lc)) {
                tagsOfInterest.put(lc, true);
                if (oLog.isDebugEnabled()) {
                    oLog.debug("Start Tag Detected for:" + lc);
                }
            }
        }

        @Override
        public void endElement(String namespaceURI, String localName, String qName) {
            String lc = localName.toLowerCase();
            if (tagsOfInterest.containsKey(lc)) {
                tagsOfInterest.put(lc, false);
                if (oLog.isDebugEnabled()) {
                    oLog.debug("End Tag Detected for:" + lc);
                }
            }
        }
    }

    try {
        // TagSoup's SAX parser tolerates malformed HTML and drives the handler above.
        org.ccil.cowan.tagsoup.jaxp.SAXParserImpl.newInstance(null).parse(is, new myHandler());
        String tempString = bodyTextSB.toString();
        if (oLog.isDebugEnabled()) {
            oLog.debug("Original .pyBody: " + pyBody);
            oLog.debug("Modified .pyBody:" + tempString);
        }
        if (tempString.length() > 0) {
            pyBody = tempString;
        } else {
            oLog.info("Zero Length String returned when parsing input; returning original .pyBody");
        }
    } catch (Exception e) {
        // The exception itself is only logged when debug is enabled; pyBody is left untouched.
        if (oLog.isDebugEnabled()) {
            oLog.error(e);
        }
        oLog.info("Error Parsing Input: pyBody is unaltered.");
    }

There's a lot of 'cruft' in here, but the important thing to note is that the TagSoup library looks like a standard SAX parser from the outside: it can take a 'messy' input document and try to present it as a well-formed HTML document, which you can then react to using the SAX handler methods.
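For the table part of the question, here is a minimal sketch of what a row-collecting handler might look like. It is not tested inside PRPC: `TableRowExtractor`, `RowHandler` and `extractRows` are made-up names, the sample table in `main` is invented (the attached email is not available here), and the JDK's own SAX parser stands in for TagSoup so the snippet runs on its own. The JDK parser only accepts well-formed markup, so in a real Java step you would presumably pass the same handler to the TagSoup parser instead. Fully qualified names are used, as in the Java step above.

```java
// Sketch only: collects the text of every table cell, grouped by row.
final class TableRowExtractor {

    /** SAX handler that records the text of each td/th, grouped by tr. */
    static final class RowHandler extends org.xml.sax.helpers.DefaultHandler {
        final java.util.List<java.util.List<String>> rows = new java.util.ArrayList<>();
        private java.util.List<String> currentRow; // non-null while inside a tr
        private StringBuilder currentCell;         // non-null while inside a td/th

        private static boolean isCell(String tag) {
            return tag.equals("td") || tag.equals("th");
        }

        @Override
        public void startElement(String uri, String localName, String qName,
                                 org.xml.sax.Attributes attrs) {
            String tag = qName.toLowerCase(java.util.Locale.ROOT);
            if (tag.equals("tr")) {
                currentRow = new java.util.ArrayList<>();
            } else if (isCell(tag) && currentRow != null) {
                currentCell = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Text may arrive in several chunks, so accumulate rather than assign.
            if (currentCell != null) {
                currentCell.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            String tag = qName.toLowerCase(java.util.Locale.ROOT);
            if (isCell(tag) && currentCell != null) {
                // Collapse runs of whitespace so wrapped cell text becomes one line.
                currentRow.add(currentCell.toString().trim().replaceAll("\\s+", " "));
                currentCell = null;
            } else if (tag.equals("tr") && currentRow != null) {
                rows.add(currentRow);
                currentRow = null;
            }
        }
    }

    /** Parses well-formed (X)HTML and returns one list of cell texts per row. */
    static java.util.List<java.util.List<String>> extractRows(String xhtml) {
        RowHandler handler = new RowHandler();
        try {
            // Stand-in parser for this demo; swap in the TagSoup parser for real emails.
            javax.xml.parsers.SAXParser parser =
                javax.xml.parsers.SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new org.xml.sax.InputSource(new java.io.StringReader(xhtml)), handler);
        } catch (javax.xml.parsers.ParserConfigurationException
                 | org.xml.sax.SAXException | java.io.IOException e) {
            throw new IllegalStateException("Could not parse table markup", e);
        }
        return handler.rows;
    }

    public static void main(String[] args) {
        // Invented sample data, purely for illustration.
        String sample = "<html><body><table>"
            + "<tr><th>Error Code</th><th>Description</th></tr>"
            + "<tr><td>E100</td><td>Timeout</td></tr>"
            + "<tr><td>E200</td><td>Not found</td></tr>"
            + "</table></body></html>";
        for (java.util.List<String> row : extractRows(sample)) {
            System.out.println(row);
        }
    }
}
```

The first row returned is the header row if the table has one, so you would probably skip it before using the values. Nested tables are not handled by this sketch, and real Outlook HTML may well contain them, so expect to add some state tracking.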

 

Pega
July 9, 2018 - 12:54pm
Response to JOHNPW_GCS

Oh, and BTW: the Java step above requires a 'Property-Set' of .pyBody -> Local.pyBody before it, and another 'Property-Set' of Local.pyBody -> .pyBody after it.

(This transfers the page value .pyBody into a local variable, does the work, then transfers it back out again.)
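As for the folder-creation half of the original question: once the post-processing step has pulled the Error Codes out of the email, creating one folder per code is plain Java. A rough sketch follows, with the caveat that `ErrorCodeFolders` and `createFolders` are hypothetical names, the base directory is whatever location you choose, and the sanitising rule (replace anything outside letters, digits, dot, underscore and hyphen) is just an assumption about what a safe folder name looks like. Whether writing to the server file system from an Activity is acceptable in your environment is a separate question.

```java
// Sketch only: creates one sub-folder per error code under a base directory.
final class ErrorCodeFolders {

    static java.util.List<java.nio.file.Path> createFolders(
            java.nio.file.Path baseDir, java.util.List<String> errorCodes) {
        java.util.List<java.nio.file.Path> created = new java.util.ArrayList<>();
        java.nio.file.Path base = baseDir.toAbsolutePath().normalize();
        for (String code : errorCodes) {
            // Email content is untrusted: keep only characters that are safe in a folder name.
            String safe = code.trim().replaceAll("[^A-Za-z0-9._-]", "_");
            if (safe.isEmpty() || safe.equals(".") || safe.equals("..")) {
                continue; // nothing usable, or a name that would escape the base directory
            }
            java.nio.file.Path target = base.resolve(safe);
            try {
                // createDirectories also creates missing parents and does not fail
                // if the directory already exists.
                created.add(java.nio.file.Files.createDirectories(target));
            } catch (java.io.IOException e) {
                throw new java.io.UncheckedIOException("Could not create " + target, e);
            }
        }
        return created;
    }
}
```

You would call something like `ErrorCodeFolders.createFolders(java.nio.file.Paths.get(yourBaseLocation), codes)`, where `yourBaseLocation` and `codes` are placeholders for your target location and the extracted Error Code values.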

 

September 21, 2018 - 4:23am

Are you trying this with Pega Robotics Studio?