Wednesday, October 24, 2007

Challenges in implementing BAM - drawing a relationship between business processes and operational system exceptions

I have a question to pose to those of you that have end to end business processes that are composed of multiple legacy business applications/ business services that may be deployed on disparate platforms and technologies. I am curious about what architecture patterns have been deployed by you to report on the operational state of business processes. Also, I would love to hear about how you provide business executives visibility to the business process execution related issues specifically to those metrics that impact business SLAs and KPIs governing the core value streams.

Product vendors have touted Business Activity Monitoring (BAM) to be the magic bullet that allows the creation of intuitive executive dashboards. Most BAM tools that monitor KPIs expect to subscribe to business events/ business alerts being generated from the business process. There in lies the challenge. The reality of where to gather the relevant data for creating the executive dashboard is what complicates most BAM implementations!!! The going in assumption is that these business applications are not really conducive to refactoring and may be mission critical systems that cannot be re-architected to follow an Event Driven Architecture EDA model that expresses it's internal state via a series of events.

In my experience most of the legacy business applications are created with both the business logic and the process logic embedded in them. In addition, most of these business applications do not really communicate or interact using business events/ business alerts. Furthermore, business applications that encapsulate parts of a business process may leverage messaging to communicate states to the next step in the business process. These business applications may have custom message listeners and handlers that transform the message and execute pertinent APIs that represent the next activity of the business process. Given this, I am wondering how folks are dealing with abstracting and discerning between the business exceptions, system exceptions and business process exceptions. All these categories of exceptions and the internal state of the business information may need to be monitored in order to aggregate information and accurately report the health of the business process on an executive dashboard.

We have resorted to the following architecture pattern. The various form of exceptions and business entity information (whether they be business or system or process related) are extracted from the logs, business application operational tables or system monitoring data structures. These data points or the heart beats across the process are then persisted into a business process state-monitoring data-mart. The extraction process has transformation and reconciliation logic to correlate the exceptions with the state of the business information. The extraction process also has the logic build to deal with the after effects of data discrepancies and business information inaccuracies caused by communications disruptions and other system exceptions. The baseline threshold latency between the steps in the process is used to then report back on how the system alerts or data integrity issues may have led to missing the SLA for a particular process. It must be noted that the ETL tool being used is an industry strength ETL tool and not some data and log scrapping scripts.

Finally , vendor BAM tools could be used but the key challenge lies in the ability to reconcile between which system issue or data integrity issues caused the business alert. I would love to hear about other common patterns that are being used to address this challenge of gathering data to compute BAM metrics!!

No comments:

Key Learnings - Using EDA to implement the core SOA principle of "loose-coupling"!!!

A lot has been said about how SOA and EDA are unique "architecture styles". It seems like only one or the other architectural prin...