Annotated Bibliography

(Generated from XML)

More recent papers first


Complete info on individual entries


[Littlewood01]

Littlewood, B., Popov, P. and Strigini, L. Design Diversity: an Update from Research on Reliability Modelling, 2001.

online: home pdf

abstract:

Diversity between redundant subsystems is, in various forms, a common design approach for improving system dependability. Its value in the case of software-based systems is still controversial. This paper gives an overview of reliability modelling work we carried out in recent projects on design diversity, presented in the context of previous knowledge and practice. These results provide additional insight for decisions in applying diversity and in assessing diverse-redundant systems. A general observation is that, just as diversity is a very general design approach, the models of diversity can help conceptual understanding of a range of different situations. We summarise results in the general modelling of common-mode failure, in inference from observed failure data, and in decision-making for diversity in development.


[Littlewood01]

Bev Littlewood, Peter Popov and Lorenzo Strigini. Modelling software design diversity: a review, ACM Computing Surveys 33(2):177-208, June 2001.

online: citeseer acm ps.gz

abstract:

Design diversity has been used for many years now as a means of achieving a degree of fault tolerance in software-based systems. While there is clear evidence that the approach can be expected to deliver some increase in reliability compared to a single version, there is no agreement about the extent of this. More importantly, it remains difficult to evaluate exactly how reliable a particular diverse fault-tolerant system is. This difficulty arises because assumptions of independence of failures between different versions have been shown to be untenable: assessment of the actual level of dependence present is therefore needed, and this is difficult. In this tutorial, we survey the modeling issues here, with an emphasis upon the impact these have upon the problem of assessing the reliability of fault-tolerant systems. The intended audience is one of designers, assessors, and project managers with only a basic knowledge of probabilities, as well as reliability experts without detailed knowledge of software, who seek an introduction to the probabilistic issues in decisions about design diversity.


[Partridge97]

Derek Partridge and Wojtek Krzanowski. Distinct Failure Diversity in Multiversion Software, 1997.

online: citeseer ps.gz

abstract:

In earlier studies of multiversion programming, both empirical and analytical, emphasis switched from notions of independence to one of minimization of coincident failure. We show that neither independence of failure, nor lack of coincident failure are the single important properties. Indeed, an N-version system may deliver an optimal performance (under some voting strategy) even when the incidence of coincident failure is arbitrarily high. The key notion that this study contributes is one of distinct different failure, and hence distinct-failure diversity. The important property is not whether versions fail on the same input so much as whether they fail in the same way. If the failures of an N-version system (on some input) are dispersed over a set of distinct alternative outcomes, then this (hitherto unacknowledged) aspect of diversity may be exploited to substantially enhance system reliability. (...)
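
note:

The abstract's central point can be illustrated with a minimal sketch. It assumes plurality voting as the voting strategy (the abstract says only "some voting strategy"); the function names and the sample outputs are hypothetical, chosen for illustration and not taken from the paper.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the output given by more than half of the versions, else None."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None


def plurality_vote(outputs):
    """Return the single most frequent output, or None when the top spot is tied."""
    ranked = Counter(outputs).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Hypothetical run of five versions on one input: three versions fail,
# but each fails with a different wrong answer (distinct failures).
distinct_failures = ["ok", "ok", "e1", "e2", "e3"]

# Same number of failing versions, but all three fail in the same way.
identical_failures = ["ok", "ok", "e1", "e1", "e1"]
```

In this sketch, three of five versions fail in both cases. When the failures are distinct, plurality voting still selects "ok" even though no strict majority exists; when the failures are identical, the wrong answer wins under either rule.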


[Knight86]

John C. Knight and Nancy G. Leveson. An Experimental Evaluation Of The Assumption Of Independence In Multi-Version Programming, 1986.

online: citeseer ps.gz

abstract:

N-version programming has been proposed as a method of incorporating fault tolerance into software. Multiple versions of a program (i.e. "N") are prepared and executed in parallel. Their outputs are collected and examined by a voter, and, if they are not identical, it is assumed that the majority is correct. This method depends for its reliability improvement on the assumption that programs that have been developed independently will fail independently. In this paper an experiment is described in which the fundamental axiom is tested. A total of twenty-seven versions of a program were prepared independently from the same specification at two universities and then subjected to one million tests. The results of the tests revealed that the programs were individually extremely reliable but that the number of tests in which more than one program failed was substantially more than expected. The results of these tests are presented along with an analysis of some of the faults that were found in the programs. Background information on the programmers used is also summarized. The conclusion from this experiment is that N-version programming must be used with care and that analysis of its reliability must include the effect of dependent errors.
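
note:

The voter and the independence assumption described in the abstract can be sketched as follows. This is a minimal illustration, not the experiment's actual procedure: the function names are hypothetical, and any failure probabilities passed in are assumed values, not figures from the paper.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the output given by more than half of the N versions, else None."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None


def expected_coincident_failures(p_a, p_b, n_tests):
    """Expected number of tests on which two versions both fail,
    ASSUMING their failures are statistically independent.

    p_a, p_b: per-test failure probabilities of the two versions (assumed inputs).
    n_tests:  number of test cases executed.
    """
    return p_a * p_b * n_tests
```

For example, two versions that each fail on one test in ten thousand would, under independence, be expected to fail together on about 0.01 of one million tests. An observed count well above such a figure is the kind of evidence the abstract describes as contradicting the independence assumption.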


[Vouk90]

Mladen A. Vouk, Alper K. Caglayan, David E. Eckhardt, David F. McAllister, James L. Walker Jr., John J.P. Kelly and John Knight. Analysis of Faults Detected in a Large-Scale Multi-Version Software Development Experiment, 1990.

online: citeseer ps.gz pdf
John C. Knight

abstract:

Twenty programs were built to the same specification of an inertial navigation problem. The programs were then subjected to a three phase testing and debugging process: an acceptance test, a certification test, and an operational test. Less than 20% of the faults discovered during the certification and operational testing were non-unique, i.e. the same or very similar faults would be found in more than one program. However, some of these "common" faults spanned as many as half of the versions (...)


[web]

Web links.

online: Ballista: COTS Software Robustness Testing
SEI Home Page
The Berkeley/Stanford Recovery-Oriented Computing (ROC) Project
Safety-Critical systems links


Generated from XML by JPM