Self-Adaptive Performance Monitoring for Component-Based Software Systems
Also published as: Dissertation, University of Kiel, 2011
Effective monitoring of a software system’s runtime behavior is necessary to evaluate compliance with performance objectives.
In addition to studying the construction and evolution of software systems, the software engineering discipline needs to place greater emphasis on robust and flexible software system operation, including means for continuous monitoring. Although performance is a critical characteristic of software systems, tools for application performance monitoring, i.e., monitoring the operation of software systems at the application level, are rarely used in practice. This prevalent neglect shows in the following symptoms: (1) a posteriori failure analysis, i.e., appropriate monitoring data is seldom collected and evaluated systematically before a failure or performance anomaly occurs; (2) inflexible instrumentation, i.e., probes are injected only at a limited number of fixed measuring points, so that later modifications require component recompilation and redeployment; and (3) inability to trace in distributed systems, i.e., tracing of user requests or transactions across the boundaries of components or their execution containers is not supported or not applied. This thesis has emerged in the context of the Kieker monitoring framework, which targets the above shortcomings. A finding of our experimental overhead evaluation is that it is feasible to inject probes at a multitude of possibly relevant measuring points, as long as not all of them are active at the same time during operation. Therefore, this thesis proposes a self-adaptive performance monitoring approach that allows probes and measuring points to be activated dynamically. As is typical for autonomic systems, a control feedback cycle manages the adaptation of the monitoring coverage at runtime. The solution relies on OCL-based monitoring rules that refine the monitoring granularity for those components that show anomalous responsiveness.
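The feedback cycle described above can be illustrated with a minimal sketch. It is not the Kieker implementation: the thesis expresses its rules in OCL, whereas this sketch substitutes a plain threshold check, and the names `Component`, `MonitoringController`, `adapt`, and the threshold value are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """A component with internal operations that can carry probes (hypothetical model)."""
    name: str
    operations: list


@dataclass
class MonitoringController:
    """Activates fine-grained probes only for components that look anomalous."""
    components: list
    threshold: float = 0.5  # assumed cut-off for "anomalous responsiveness"
    active_probes: set = field(default_factory=set)

    def adapt(self, anomaly_scores):
        """Run one iteration of the feedback cycle and return the active probes."""
        for component in self.components:
            score = anomaly_scores.get(component.name, 0.0)
            probes = {f"{component.name}.{op}" for op in component.operations}
            if score > self.threshold:
                # Refine the monitoring granularity for the anomalous component.
                self.active_probes |= probes
            else:
                # Coarsen again to keep the monitoring overhead low.
                self.active_probes -= probes
        return set(self.active_probes)


controller = MonitoringController(
    [Component("Catalog", ["search", "load"]), Component("Cart", ["add"])]
)
print(controller.adapt({"Catalog": 0.9, "Cart": 0.1}))
```

In this sketch only the probes inside `Catalog` become active, and they are switched off again once its score falls back below the threshold.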
The monitoring data includes performance measures such as throughput and response time statistics, the utilization of system resources, and the inter- and intra-component control flow. Based on this data, performance anomaly scores are computed for each provided service and each component-internal operation. The presented anomaly scoring algorithms are based on time series analysis and distribution clustering, respectively. This self-adaptive performance monitoring approach for component-based software systems shortens the business-critical failure diagnosis time because it avoids time-consuming manual debugging. The approach and its underlying anomaly scores are extensively evaluated in lab experiments, e.g., using the SPECjEnterprise2010 industry benchmark. The evaluation results, in combination with the implementation of the Kieker tool, demonstrate the feasibility and practicability of the approach.
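As a rough illustration of an anomaly score based on time series analysis, the following sketch forecasts each response time as the mean of the preceding observations and scores the normalized deviation from that forecast. The moving-average forecast, the normalization, the window size, and the function name `anomaly_scores` are assumptions made for this example and are not taken from the thesis.

```python
from statistics import mean


def anomaly_scores(response_times, window=3):
    """Score each observation by its normalized deviation from a moving-average forecast.

    Scores lie in [0, 1] for non-negative response times: 0 means the
    observation matches the forecast, values near 1 mean a strong deviation.
    """
    scores = []
    for i in range(window, len(response_times)):
        forecast = mean(response_times[i - window:i])
        observed = response_times[i]
        denominator = observed + forecast
        # Guard against division by zero when both values are 0.
        scores.append(abs(observed - forecast) / denominator if denominator > 0 else 0.0)
    return scores


# Response times in milliseconds; the last request is three times slower.
print(anomaly_scores([100, 100, 100, 100, 300]))  # prints [0.0, 0.5]
```

A score like this could serve as input to the threshold-based adaptation of monitoring coverage; the clustering-based scoring mentioned in the abstract is not covered by this sketch.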