Adaptive Quality of Service for Availability (AQuA)

The AQuA project

The goal of the AQuA project was to develop an architecture for building dependable distributed systems, along with methods for validating such systems. The AQuA architecture allows distributed applications to request and obtain a desired level of availability using the Quality Objects (QuO) framework. It includes a dependability manager that attempts to meet the requested availability levels by configuring the system in response to outside requests and to changes in system resources caused by faults. Validation is performed using the Loki fault injector, which injects faults into a distributed system based on a measure-driven partial view of global state and provides statistically sound estimates of system dependability.

The AQuA software

The AQuA architecture allows distributed applications to request and obtain a desired level of dependability using the Quality Objects (QuO) framework. It includes a dependability manager that attempts to meet the requested availability levels by configuring the system in response to outside requests and to changes in system resources caused by faults.

The tool is not yet available. It will be free if used for educational and research purposes by academic institutions. Non-academic users can also arrange to use the tool. Please contact Prof. Bill Sanders at whs at illinois.edu for more information.


Participants in the AQuA project

The AQuA project was supported by the Quorum Program in the Defense Advanced Research Projects Agency Information Technology Office (DARPA/ITO). It was funded under contracts F30602-96-C-0315, F30602-97-C-0276, and F30602-98-C-0187. The project was a joint effort between the University of Illinois and BBN Systems and Technologies, and it used the Ensemble group communication system developed at Cornell University. BBN also has a home page for the AQuA project.

The researchers involved at the University of Illinois are:

The researchers involved at BBN Systems and Technologies are:

At Washington State University:


Papers generated by the AQuA project

An Adaptive Quality of Service Aware Middleware for Replicated Services.
S. Krishnamurthy, W. H. Sanders, and M. Cukier. (02KRI05)
IEEE Transactions on Parallel and Distributed Systems, vol. 14, no. 11, November 2003, pp. 1112-1125. [IEEE Xplore entry]

Evaluating Unavailability Caused by Group Membership Using Global-State-Based Fault Injection.
K. R. Joshi. (03JOS01)
Master’s Thesis, University of Illinois, 2003.

An Experimental Evaluation of the Coda Distributed File System Using the Loki State-Driven Fault Injector.
R. M. Lefever. (03LEF02)
Master’s Thesis, University of Illinois, 2003.

An Experimental Evaluation of Correlated Network Partitions in the Coda Distributed File System.
R. M. Lefever, M. Cukier, and W. H. Sanders. (03LEF01)
Proceedings of the 22nd International Symposium on Reliable Distributed Systems (SRDS’03), Florence, Italy, October 6-8, 2003, pp. 273-282. [IEEE Xplore entry]

AQuA: An Adaptive Architecture that Provides Dependable Distributed Objects.
Y. (J.) Ren, D. E. Bakken, T. Courtney, M. Cukier, D. A. Karr, P. Rubel, C. Sabnis, W. H. Sanders, R. E. Schantz, and M. Seri. (99CUK04)
IEEE Transactions on Computers, vol. 52, no. 1, January 2003, pp. 31-50. [IEEE Xplore entry]

Passive Replication Schemes in AQuA.
Y. (J.) Ren, P. Rubel, M. Seri, M. Cukier, W. H. Sanders, and T. Courtney. (02REN01)
Proceedings of the 2002 Pacific Rim International Symposium on Dependable Computing (PRDC 2002), Tsukuba, Japan, December 16-18, 2002, pp. 125-130. [IEEE Xplore entry]

An Adaptive Quality of Service Aware Middleware for Replicated Services.
S. Krishnamurthy. (02KRI04)
Ph.D. Thesis, University of Illinois, 2002.

A Configurable CORBA Gateway for Providing Adaptable System Properties.
M. Seri, T. Courtney, M. Cukier, V. Gupta, S. Krishnamurthy, J. Lyons, H. Ramasamy, J. Ren, and W. H. Sanders. (02SER01)
Supplemental Volume of the 2002 International Conference on Dependable Systems & Networks (DSN-2002), Washington, DC, June 23-26, 2002, pp. G-26 to G-30.

An Adaptive Framework for Tunable Consistency and Timeliness Using Replication.
S. Krishnamurthy, W. H. Sanders, and M. Cukier. (01KRI03)
Proceedings of the 2002 International Conference on Dependable Systems and Networks (DSN-2002), Washington, DC, June 23-26, 2002, pp. 17-26. [IEEE Xplore entry]

Performance Evaluation of a QoS-Aware Framework for Providing Tunable Consistency and Timeliness.
S. Krishnamurthy, W. H. Sanders, and M. Cukier. (02KRI02)
Proceedings of the 2002 10th IEEE International Workshop on Quality of Service (IWQoS 2002), Miami Beach, Florida, May 15-17, 2002, pp. 214-223. [IEEE Xplore entry]

Performance Evaluation of a Probabilistic Replica Selection Algorithm.
S. Krishnamurthy, W. H. Sanders, and M. Cukier. (01KRI02)
Proceedings of the 7th IEEE International Workshop on Object-oriented Real-time Dependable Systems (WORDS 2002), San Diego, California, January 7-9, 2002, pp. 119-127. [IEEE Xplore entry]

An Overview of the AQuA Gateway.
M. Seri, T. Courtney, M. Cukier, and W. H. Sanders. (01SER01)
1st Workshop on The ACE ORB (TAO), St. Louis, MO, August 5-6, 2001. (Presented but not published in a proceedings.)

A Dynamic Replica Selection Algorithm for Tolerating Timing Faults.
S. Krishnamurthy, W. H. Sanders, and M. Cukier. (00KRI01)
Proceedings of the International Conference on Dependable Systems and Networks (DSN-2001), Göteborg, Sweden, July 1-4, 2001, pp. 107-116. [IEEE Xplore entry]

AQuA: A Framework for Providing Adaptive Fault Tolerance to Distributed Applications.
Y. Ren. (01REN01)
Ph.D. Thesis, University of Illinois at Urbana-Champaign, 2001.

Building Dependable Distributed Systems Using the AQuA Architecture.
W. H. Sanders. (01SAN01)
Proceedings of SCTF’2001 – IX Brazilian Symposium on Fault-Tolerant Computing, Florianópolis, Santa Catarina, Brazil, March 5-7, 2001, p. 1.

An Adaptive Algorithm for Tolerating Value Faults and Crash Failures.
Y. Ren, M. Cukier, and W. H. Sanders. (00REN02)
IEEE Transactions on Parallel and Distributed Systems, vol. 12, no. 2, February 2001, pp. 173-192. [IEEE Xplore entry]

Passive Replication in the AQuA System.
P. G. Rubel. (00RUB01)
Master’s Thesis, University of Illinois, 2000.

Building Dependable Distributed Applications Using AQuA.
J. Ren, M. Cukier, P. Rubel, W. H. Sanders, D. E. Bakken, and D. A. Karr. (99CUK03)
Proceedings of the 4th IEEE Symposium on High Assurance Systems Engineering (HASE’99), Washington, DC, November 17-19, 1999, pp. 189-196. [IEEE Xplore entry]

Building Dependable Distributed Objects with the AQuA Architecture.
M. Cukier, J. Ren, P. Rubel, D. E. Bakken, and D. A. Karr. (99CUK02)
Digest of FastAbstracts presented at the 29th Annual International Symposium on Fault-Tolerant Computing (FTCS-29), Madison, Wisconsin, USA, June 15-18, 1999, pp. 17-18.

Proteus: A Flexible Infrastructure to Implement Adaptive Fault Tolerance in AQuA.
C. Sabnis, M. Cukier, J. Ren, P. Rubel, W. H. Sanders, D. E. Bakken, and D. A. Karr. (98SAB02)
In C. B. Weinstock and J. Rushby (Eds.), Dependable Computing for Critical Applications 7, vol. 12 in series Dependable Computing and Fault-Tolerant Systems (A. Avizienis, H. Kopetz, and J. C. Laprie, Eds.), pp. 149-168. Los Alamitos, CA: IEEE Computer Society, 1999. [IEEE Xplore entry]

AQuA: An Adaptive Architecture That Provides Dependable Distributed Objects.
M. Cukier, J. Ren, C. Sabnis, D. Henke, J. Pistole, W. H. Sanders, D. E. Bakken, M. E. Berman, D. A. Karr, and R. E. Schantz. (98CUK01)
Proceedings of the 17th IEEE Symposium on Reliable Distributed Systems (SRDS’98), West Lafayette, Indiana, USA, October 20-23, 1998, pp. 245-253. [IEEE Xplore entry]

Specifying and Measuring Quality of Service in Distributed Object Systems.
J. P. Loyall, R. E. Schantz, J. A. Zinky, and D. E. Bakken.

Proteus: A Software Infrastructure Providing Dependability for CORBA Applications.
B. S. Sabnis. (98SAB01)
Master’s Thesis, University of Illinois, 1998.

Probabilistic Verification of a Synchronous Round-Based Consensus Protocol.
H. S. Duggal, M. Cukier, and W. H. Sanders. (97DUG01)
Proceedings of the Sixteenth IEEE Symposium on Reliable Distributed Systems (SRDS-97), Durham, NC, October 22-24, 1997, pp. 165-174. [IEEE Xplore entry]

Architectural Support for Quality of Service for CORBA Objects.
J. A. Zinky, D. E. Bakken, and R. E. Schantz.
Theory and Practice of Object Systems, vol. 3, no. 1, April 1997, pp. 55-73.

Overview of Quality of Service for Distributed Objects.
J. A. Zinky, D. E. Bakken, and R. E. Schantz.
Proceedings of the Fifth IEEE Dual Use Applications and Technologies Conference, Utica, NY, May 22-25, 1995.