Approaches by which software failures may be explained:
Knowing how the critical software on which our lives often depend is categorized, it is also important to understand the approaches by which software failures may be explained. After a thorough study, we agree that there are mainly two:
- Software-centric approach – views failures as a property of the software itself. The software is considered in isolation, not within the context of the system in which it operates.
- System-centric approach – views failures in relation to the system. It is similar to the modeling of human performance, where an unsafe human act is considered harmful only in the context of the system in which it occurs.
To err is human and at each stage of the development, errors may be introduced into the software.
Of course, there will always be errors, and they can be introduced at every stage of development. For instance, the requirements analysis may be incomplete, or the design may be inadequate or omitted entirely. Unfortunately, that happens quite often and is one of the main reasons why software projects fail. So, it is human to err, but when human lives depend on a small-sized or mid-sized team, erring becomes unforgivable.
It is also widely held that the earlier an error is introduced into software (and not fixed), the more severe and costly its impact is likely to be, as the error tends to grow in the subsequent stages of the development process.
Large-scale software (which is usually critical) tends to fail three to five times more often than small-scale software. The larger the software, the more complex it is to build, test and maintain.
What is a failure and what are the most commonly met types of failure?
We keep talking about failures, but what exactly is one? A failure is the inability of a system or component to perform its required function within the specified performance requirements.
Types of failures:
- Process failure
- Real – time anomalies
- Faulty code
- Operational error
The following classification of the failures is presented by Microsoft Corporation and Jalote:
- Unplanned events – these are traditional failures like crash, hang, incorrect output or no output at all. These are caused by software failure. Other forms are disasters, system errors, employee error, application error, operations overruns, utility failure, hardware failures, etc.
- Planned events – these events often cause systems to shut down in a planned manner to perform some housekeeping tasks. Examples include updates requiring restarts, etc.
- Configuration failure – these failures occur as a result of configuration settings. In many systems, configuration failures account for a large percentage of failures, for example installation/setup failures and application/system incompatibility.
Before a system fails and the failure can be classified, there has to be a cause of failure in the first place:
- Lack of logic – poor or no design at all. This is one of the major reasons why software fails, as developers often do not have a good design before they start coding. It resembles an architect building a house without a detailed plan. Soon the architect discovers errors in parts of the house, demolishes the faulty parts and starts again. Building a house this way takes more time and costs more, and still the architect cannot find and demolish every erroneous part. In fact, the house may collapse just after it is finished.
- Inadequate software testing – to be reliable and free of defects, the software must be properly debugged and rigorously tested. Defects must be detected and corrected, and the software should be retested to ensure that the changes were made correctly and do not break other functionality. The software should also be tested with all available data, both positive and negative, both imaginary and real, to ensure that the system can handle them without crashing.
- Attitudinal changes
- Software changes introduce incompatibilities – large-scale software must evolve during its lifespan to stay useful. As it evolves and new features are added, incompatibilities and errors that were not initially in the software may be introduced and result in a software error.
- Software is attacked by a hostile agent.
- Failure resulting from unanticipated application or use – for example, in embedded systems or applications, the number of ways the environment can change becomes so large that not every possible failure can realistically be anticipated.
- Lack of customer or user involvement – without it, the project is doomed to fail.
- Unclear goals and objectives – the goal of the project may be only partially clear due to the lack of good requirements.
- Unclear requirements
- Lack of resources – every manager tries to minimize the resources involved in a project and yet, increase the productivity. That is a paradox.
- Failure to communicate and act as a team
- Project planning and scheduling
- Cost estimation – not only the cost to create the software product, but the cost of the software and hardware needed, training of the employees, travelling, communication cost, etc.
- Inappropriate estimation methodology – every methodology has strong and weak points, and they should all be considered. A good suggestion is to use more than one estimation methodology.
- Cost estimation tools – need to be customized for the specific need of the organization.
- Risk management – if risk is not managed in a timely and effective manner, it becomes an important factor in failure. Risk management means dealing with a concern before it becomes a crisis.
- Unrealistic expectations
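The point above about testing with both positive and negative data can be illustrated with a small sketch. The function `parse_percentage` and its accepted range of 0 to 100 are hypothetical, chosen only for illustration; the tests use Python's standard `unittest` module.

```python
import unittest


def parse_percentage(text):
    """Hypothetical helper: convert text such as '42' to an int in 0..100."""
    value = int(text.strip())  # raises ValueError for non-numeric text
    if not 0 <= value <= 100:
        raise ValueError(f"percentage out of range: {value}")
    return value


class ParsePercentageTests(unittest.TestCase):
    def test_positive_data(self):
        # Valid inputs, including both boundaries of the range.
        for text, expected in [("0", 0), (" 42 ", 42), ("100", 100)]:
            self.assertEqual(parse_percentage(text), expected)

    def test_negative_data(self):
        # Invalid inputs must be rejected with a clear error,
        # not accepted silently or allowed to crash the caller.
        for text in ["", "abc", "-1", "101", "4.5"]:
            with self.assertRaises(ValueError):
                parse_percentage(text)
```

Saved as a file, the tests can be run with `python -m unittest <filename>`. Rerunning them after every change is one simple way to check that a fix did not break other functionality.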
External causes of software failures:
- Human error – using software in an inappropriate way, entering incorrect data, dividing by zero, etc.
- Management laxity – when there is a warning that a failure may happen but, for various reasons, the management does not take corrective action.
- Support systems
- Cyber security
- Environment – natural catastrophes can affect the computers in which the software is embedded.
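Two of the human errors listed above, incorrect input data and division by zero, can be guarded against in code. The following is a minimal sketch rather than a complete validation strategy; the function name `average_speed` and its units are assumptions made for illustration.

```python
def average_speed(distance_km, hours):
    """Hypothetical example: return km/h, rejecting bad input explicitly."""
    # Reject incorrect data types early. bool is excluded on purpose,
    # because in Python it is a subclass of int.
    for name, value in (("distance_km", distance_km), ("hours", hours)):
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"{name} must be a number, got {type(value).__name__}")
    if distance_km < 0:
        raise ValueError("distance_km must not be negative")
    # Guard the division instead of letting ZeroDivisionError escape.
    if hours <= 0:
        raise ValueError("hours must be greater than zero")
    return distance_km / hours


print(average_speed(150, 2))  # prints 75.0
try:
    average_speed(150, 0)
except ValueError as error:
    print(f"rejected: {error}")  # prints "rejected: hours must be greater than zero"
```

Raising a specific, descriptive error at the boundary lets the caller decide how to recover, instead of a crash appearing deep inside the program.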
So, now that we've covered the reasons why failures happen, we can move on to the light at the end of the tunnel in the next chapter of our Software Failures story, where we'll talk about how failures can be avoided and share some very interesting bug instances from history.