Defensive programming. Defuse time bombs!


Link to this posting

Postby Ursego » 21 Feb 2013, 11:52

Handle errors, don't ignore them. Our code should be as foolproof as is reasonably possible. If the code is used incorrectly, if something bad or unexpected occurred, don't keep that sad fact secret dark - signal, throw exceptions, shout, scream at the top of your lungs!!!

Resilient code is code that is designed to last, and to be safely reused by others. Code that is not susceptible to being broken by unexpected use cases. Defensive programming is producing of resilient code that robustly and gracefully handles cases of unintended use, and is resilient to future changes.

Many developers stop work as soon as their code passes a few basic tests to confirm that it produces the "right result" in a given use case. They do not consider the other possible ways in which the code might be used in the future. How their code will respond to common change, such as a table column switched from required to nullable, or introduction of a new customer status in the business? In the short-term, this approach is attractive; those developers get things done faster. However, if the code is designed to be used for more than just a few months, then it is very likely that such changes can and will occur, and the inevitable result is broken code or, even worse, code that silently starts to behave differently, or produce different results. When this happens, the data integrity is threatened, as is the validity of the reports on which critical business decisions are often based. At this point, years later (long after the original developer has forgotten what she/he was doing, or even left the company), begins the painstaking process of troubleshooting and fixing the problem.

Would it not be easier to prevent all this troubleshooting from happening? Would it not be better to spend a little more time and effort during original development, to save considerably more time on bugs investigation, fixing, retesting, and redeploying? After all, many of the problems that cause our code to break are very common; they repeat over and over again in different teams and on different projects. This is what defensive programming is all about:

We learn what can go wrong with our code, and we proactively identify and eliminate potential vulnerabilities during development rather than after the event when the problems have already occurred.

When your program starts running in production, the conditions and cases you thought impossible will happen, period. In reality, it's impossible to ensure that your code will have optimal running conditions all the time. As a developer, you'd like to ensure your projects will be able to cope with the imperfections of the real world. That is why techniques like defensive programming will always have a place in your toolbox.

This approach is the programming equivalent of defensive driving. In defensive driving, you take the responsibility of anticipating risky conditions and mistakes made by other drivers. In other words, you take responsibility for protecting yourself from other driver's mistakes and bad driving conditions.

Defensive programming, by comparison, requires you to recognize that bad things will happen to your program and design accordingly. You must anticipate and prepare for cases such as data corruption, connection timeouts and passing wrong data into a function. By deliberately handling cases that "will never happen" your code will be prepared to fare better under those circumstances.

https://www.brainstobytes.com/defensive-programming-fundamentals/

Why am I writing about defensive programming in a site, dedicated to elegant programming? Because code, which is vulnerable, is not looking elegant to me. In most cases, code elegance has nothing to do with cosmetic look.

1. Validate methods' parameters

If there is a limitation on what is allowed to be passed to a method, throw an exception if an invalid value is supplied in run-time.

Classical examples of such constraints: greater than zero; future or past date; expire date not later than effective date; range of values; list of allowed values. It's especially important to validate parameters in public methods, called by other developers!

Data errors are simpler to debug if an exception is thrown close to the source of the bad data. If the method doesn't validate its parameters, it can produce an incorrect result which is silently returned to the calling method, which will do the same and so on and so on. Who knows where and when the problem will be detected? Don't worry - maybe, it will never be detected!

Pre-conditions are one of the most widely spread forms of defensive programming. They guarantee that a method can be executed only when some requirements are met. Here's a typical example:

Code: Select all
public void CreateAppointment(DateTime dateTime)
{
    if (dateTime.Date < DateTime.Now.AddDays(1).Date)
        throw new ArgumentException("Date is too early");
 
    if (dateTime.Date > DateTime.Now.AddMonths(1).Date)
        throw new ArgumentException("Date is too late");
 
    /* Create an appointment */
}

Writing code like this is a good practice as it allows you to quickly react to any unexpected situations, therefore adhering to the fail fast principle.
Defensive programming: the good, the bad and the ugly

Don't worry that these validations make the code longer (while we try to keep functions shorter): the parameters validation section should be visually separated from the beginning of the "interesting part" by an empty line or a comment, so it is not taken into account when the function length is measured.

But...

Don't validate parameters in the beginning of the method if their incorrectness can be caught later within the method.

Here are three typical situations:

• When an SQL statement retrieves / affects no row by parameter(s) by which a row must be retrieved / affected. Example: if your stored procedure SELECTs, UPDATEs or DELETEs an employee record by the emp_id parameter, analyze that parameter for null (or an invalid value, BTW!), and describe the the problem in an error message, only if no record found.

• When the system will throw an exception for you. For example, in Java you don't need to check if the passed pointer is referencing an object. If it will be null, Java will throw the NullPointerException:

Code: Select all
if (controller == null)
    throw new IllegalArgumentException("Arg controller is null"); // unnecessary!

controller.doSomething(); // it will throw NullPointerException, so the problem will be revealed - why to write extra code?

But we can use exceptions, thrown for us by the system, only when the automatic error message shows the object and the method names - like in Java and C#. Unfortunately, it's not the case in many databases: errors, raised in stored procedures, reveal no clue where the error happened :twisted: . So, it's a programmer's responsibility to build a message with maximum useful information (including the proc name). Since we try to trap errors as early as possible, the parameters should be validated in the beginning of the procedure (anyway, before the line which can fail).

• By processing impossible (illegal) options, described in the next section.

2. Process impossible (illegal) options

In code branching constructions, process all existing options. Throw an exception if an unexpected option is supplied in run-time.

We are talking about such decision making constructions as if, switch, case, choose case etc.

Defend against the impossible, because the impossible will happen. :lol:

When you are processing a number of pre-defined options (like order statuses or vehicle types) then keep in mind, that the list of existing options can grow in the future to reflect changes in the business requirements. You (or other developers) can forget to process the newborn option(s). But they may require a special treatment, or should be treated in the same way as one of the existing options. That can produce a hidden bug. We can prevent it by forcing the code to complain: "Developer, THINK how to process the new option - maybe, it should not be ignored!".

To process impossible options, two simple steps should be done:

STEP 1: In the code branching construction, create a case for EACH option which currently exists in the system. If a few options are treated in a same way, simply process them ALL in one case. For all the options which are NOT processed by the business logic, create a case anyway, and write a comment like "do nothing", "irrelevant" or "not applicable". That comment will let developers (and future yourself) know that you did not process these options intentionally rather than forgot about them or didn't know they existed.

STEP 2: Add the default / else section which throws an exception. This section will never be executed if everything goes fine. Its only purpose is to trap an unexpected option and signal about that by raising an error. Important: the error message should display the invalid value! That can immediately give the developer the direction to investigate the bug.

If you have an if construction, you can add additional else if sections, or change the whole construction to switch / case (i.e. switch to switch :lol: ).

For example, 4 customer statuses exist in the system today: ACTIVE, PENDING, INACTIVE and DELETED. But the given piece of code is processing only 2 of them - ACTIVE and PENDING:

*** BAD code: ***

Code: Select all
switch (custStatus) {
   case CustStatus.ACTIVE:
      [process active customers]
      break;
   case CustStatus.PENDING:
      [process pending customers]
      break;
}

As you see, INACTIVE and DELETED statuses are not even mentioned. We can save the situation - now, INACTIVE and DELETED statuses are listed explicitly (even though not processed in the business logic):

*** GOOD code: ***

Code: Select all
switch (custStatus) {
   case CustStatus.ACTIVE:
      [process active customers]
      break;
   case CustStatus.PENDING:
      [process pending customers]
      break;
   case CustStatus.INACTIVE:
   case CustStatus.DELETED:
      // do nothing
      break;
   default:
      throw new Exception ("Invalid customer status " + custStatus + ".");
}

So, if a new status (for example, SUSPENDED) will be added to the business after many years, the code fragment itself will force the then developers to update the logic if needed. And that will usually happen in a pretty early stage of development or unit test! But even if the exception will be thrown in production - it's better than long living with a hidden bug (which can cause a lot of headache, and even be very expensive). Maybe, the new status must be treated in a special way, or in a same way as one of the existing statuses. Anyway, you have to decide (or discuss with business analysts) how to deal with it. If the new status requires no special treatment, then simply add it to the "do nothing" section. I remember getting a chain of such errors (six or seven!) on unit test, when I was building an application for a football association - after adding a new code value, which didn't exist in the initial design documentation, to an almost created module. The messages chain was similar to a penalty shootouts - and I won with no goal conceded!

In addition to defensive programming, processing of impossible options brings two more important benefits:

1. Looking at a switch / case construction, developers see the whole picture (not only its fragment!), so they don't have to guess (or investigate) which options are processed in the else / default. A half-truth is often a falsehood...

2. Global Text Search will find ALL the spots in the application where the searched code (represented by a constant, of course) could be dealt with. If some business logic is processed in else/default sections (very bad practice! :evil: ), then you are in trouble - you can find only the spots where the searched code is explicitly mentioned! So, you will need to waste extra time to investigate, having an excellent chance not to find EVERYTHING you need.

Since the else / default section is used for throwing exceptions, we cannot use it for business code. That naturally brings us to the following idea:

Don't use Boolean flags for non-Boolean entities, even if only 2 options exist.

We are talking only about entities, for which more than two options are theoretically possible (like CustomerStatus) - even if only two are used. We are NOT talking about states which are Boolean by their nature, so no 3rd option can even be imagined (like orderShipped).

Let's use a Transport Details screen as an example. It can display data in 2 modes - for cars and for motorcycles. The mode is defined by a Boolean instance variable isCarMode (false = motorcycle mode :cry: ):

*** BAD code: ***

Code: Select all
if (isCarMode)
   [process cars]
else
   [process motorcycles]

The best solution in this situation is to create a set of constants (enum if your language allows) even though there are only 2 options:

*** GOOD code: ***

Code: Select all
switch (currEntity) {
   case TransportType.CAR:
      [process cars]
      break;
   case TransportType.MOTORCYCLE:
      [process motorcycles]
      break;
   default:
      throw new Exception ("Invalid entity " + currEntity + ".");
}

If, after long time, you decide to use that screen for one more transport type (let's say, boat), simply add the corresponding constant (BOAT) to the TransportType enum, pass that constant to the fragment, run the application and... read the time capsule message, written many years ago! Probably, that will happen in multiple methods of the object being enhanced, so you will get that exception many times, leaving the QA testers bored... :lol:

You will enjoy an additional benefit of using constants instead of Boolean: you will see what is the meaning of false: isCarMode tells you what true is, but gives no clue what false is - who knows, maybe it's airplane, train or Harry Potter broom?

Of course, processing of impossible options makes sense only if you have a small number of pre-defined options (like only a few statuses). The advice is not applicable if there are tens or hundreds of options, or their number is unknown. For example, if there are 200 insurance coverage types in the application and you are processing only one of them, nobody expects you to list 199 others in the "do nothing" section! :lol:

3. Don't suppress error messages

If an error (a technical exception) is theoretically possible, let it happen and then fix the root cause.

I repeat one more time: if something bad occurred - shout! Don't create hidden bugs which are hard to detect and investigate: the incorrect results can appear in absolutely other part of the application! It's one of the most important principles of defensive programming.

For example, if an object is mandatory, don't check if the pointer is not null:

*** BAD code: ***

Code: Select all
if (controller != null)
   controller.doSomething();

If the pointer is null, then we DO want the error (like NullPointerException in Java) to happen!

*** GOOD code: ***

Code: Select all
controller.doSomething();

I recently asked a co-worker why a certain check was being done, and he answered with a shrug and said, "Just to be safe." Over my career, I've seen a lot of code written just to be safe. I've written a lot of that code myself! If I wasn't sure if I could rely on something, I'd add a safety check to prevent it from throwing an exception. At a glance, there doesn't appear to be a problem. But you haven't patched the hole and you haven't fixed the bug. Instead of an easy-to-trace exception, you have unusable values - bad data - infiltrating your program.
Overly defensive programming

The next example is a real life story.

Years ago I participated in a PowerBuilder project. The technical lead asked me NOT to use asterisk (SELECT *) to load all the fields of a table to a set of host variables (when we wanted ALL the fields). Instead, she asked me to explicitly list all the fields in the SELECT statement, so each listed field is loaded to its corresponding variable. When I asked why, she answered:
- We don't want error messages in production if a new field is added to the table.
Amazing! Fantastic! An error message (which helps us to easily find and investigate a bug) is the end of the world, but a potential hidden bug is absolutely OK! :twisted:

4. In SQL statements, make sure the correct number of records is affected.
User avatar
Ursego
Site Admin
 
Posts: 143
Joined: 19 Feb 2013, 20:33

Link to this posting

Postby Ursego » 28 Aug 2019, 09:56

Fail Fast principle

The fail fast principle stands for stopping the current operation as soon as any unexpected error occurs. Adhering to this principle generally results in a more stable solution. It might appear counter-intuitive at first. How is it so that failing at any error leads to stability? Isn't it backward?

Indeed, when you start applying this practice, application crashes might seem overwhelming, especially if you have a working software whose developers didn't stick to fail-fast before. But if you don't give up, grit your teeth and fix all the code causing those failures, you will benefit from it greatly.

The trick here is that adhering to the fail fast principle, we improve the feedback loop. We quickly reveal all problems in our software making it easier to spot and fix them.

Fail-silently

The opposite to fail-fast is fail-silently. How often did you encounter such code?

Code: Select all
try
{
    DoSomething();
}
catch (Exception ex)
{
    Logger.Log(ex);
}

A common justification for wrapping everything in a generic try-catch block is that it makes a software feel more stable by not letting end users know about errors in it.

I bet you already know swallowing exceptions in a generic try-catch block generally is a bad practice. But why is it so, exactly?

The problem with such approach is that instead of revealing issues in the software, we mask them and thus extend the feedback loop. If something goes wrong with the application, it wouldn't be obvious. The incorrect behavior is now hidden from the eyes of developers and end users and might stay unnoticed for a long time.

Moreover, the application's persistence state may get corrupted if the code continues executing after an error took place.

The fix here is simple, we just need to add a "throw" statement to the catch block:

Code: Select all
try
{
    DoSomething();
}
catch (Exception ex)
{
    Logger.Log(ex);
    throw;
}

Benefits of the fail fast practice

So, what are the benefits the fail fast principle provides?

@ Shortening the feedback loop. The costs of fixing a bug found while software is under development is an order of magnitude less than when it's in production. It is remarkable how quickly bugs are revealed when you stick to the fail fast principle. Even if the application is in production, you are still better off letting the end users know if something went wrong with it.

@ The confidence the software works as intended. You might have heard about some strongly-typed functional languages. Their type system is so strict that if you manage to compile a program written in such language, then it most likely will work right. We can say the same about a program written with the fail fast principle in mind. If such program is still running, it most likely does its job correctly.

@ The protection of the persistence state. I mentioned this issue earlier. If we allow the software to continue working after an error occurs, it may come into an invalid state, and, more importantly, save that state to the database. This, in turn, leads to a bigger problem – data corruption – which can't be solved just by restarting the application.

Failing silently is like burying your head in the sand. It doesn't solve the underlying problem, only pretends the problem isn't there.

The fail fast principle in practice

In most cases, adhering to the fail fast principle means we should shut down the application (probably, with a polite apology) in case of any unexpected situation.

There are cases where we don't have to kill the whole process, though. If the software is inherently stateless meaning that it doesn't store any state in memory between the operations, it might be just fine to halt only the operation the error took place in and let the application continue working.

An example here is a web server. Requests it processes don't leave any marks in the server's memory so we don't have to shut it down. Another good example would be a background job.

The fail fast principle and design by contract

The principle is highly related to the design by contract concept.

The main purpose of code contracts is to give the ability to quickly determine issues in our code. Design-by-contract programming takes the fail fast principle to its extreme. It prescribes that software developers should define a formal set of rules the code itself and its clients should live by and crash the application if any violation of those rules takes place.

The great strictness makes it even possible to create static analyzers on top of the code contracts so that the problems can be revealed on the compilation stage, without ever launching the app.

Summary

Following the fail fast principle can become a great source of confidence your application works correctly. The main point here is that we shouldn't let any unexpected situation escape our attention.

Let your software fail fast. That, in turn, would allow you to quickly fix any problem in it.

Source
User avatar
Ursego
Site Admin
 
Posts: 143
Joined: 19 Feb 2013, 20:33




Ketones are a more high-octane fuel for your brain than glucose. Become a biohacker and upgrade yourself to version 2.0!



cron
Traffic Counter

eXTReMe Tracker