Wednesday, October 24, 2018

Debug It! Find, Repair, and Prevent Bugs in Your Code - by Paul Butcher

Notes from the book:

The core of the debugging process consists of four steps:

1. Reproduce: Find a way to reliably and conveniently reproduce the problem on demand.

2. Diagnose: Construct hypotheses, and test them by performing experiments until you are confident that you have identified the underlying cause of the bug.
3. Fix: Design and implement changes that fix the problem, avoid introducing regressions, and maintain or improve the overall quality of the software.
4. Reflect: Learn the lessons of the bug. Where did things go wrong? Are there any other examples of the same problem that will also need fixing? What can you do to ensure that the same problem doesn’t happen again?

The things you need to control break down into three areas:

- The software itself: If the bug is in an area that has changed recently, then ensuring that you’re running the same version of the software as it was reported against is a good first step.

- The environment it’s running within: If interaction with an external system (some particular piece of hardware or a remote server, perhaps) is involved, then you probably want to ensure that you’re using the same external system.

- The inputs you provide to it: If the bug is related to an area that behaves very differently depending upon how the software is configured, then start by replicating the user’s configuration.

Ensure that your reproduction is both reliable and convenient through iterative refinement:

- Reduce the number of steps, amount of data, or time required.
- Remove nondeterminism.
- Automate.
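Removing nondeterminism often comes down to pinning the sources of randomness. A minimal sketch in Java, assuming (hypothetically) that the flaky behavior depends on randomly generated input: fixing the seed makes every run produce identical input, so the failure can be replayed on demand.

```java
import java.util.Arrays;
import java.util.Random;

public class ReproducibleRun {
    // Assumption: the sporadic failure depends on random input. A fixed seed
    // turns "sometimes fails" into "fails every time", which is what we want
    // for a reliable reproduction.
    static int[] generateInput(long seed, int size) {
        Random random = new Random(seed); // fixed seed => same "random" data every run
        int[] data = new int[size];
        for (int i = 0; i < size; i++) {
            data[i] = random.nextInt(100);
        }
        return data;
    }

    public static void main(String[] args) {
        int[] first = generateInput(42L, 5);
        int[] second = generateInput(42L, 5);
        // Two runs with the same seed produce identical input.
        System.out.println(Arrays.equals(first, second)); // prints "true"
    }
}
```

The same idea applies to other nondeterminism sources: fake the clock, serialize thread interleavings, or record and replay external inputs.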

The scientific method can work in two different directions. In one case, we start with a hypothesis and attempt to create experiments, the results of which will either support or refute it. In the other, we start with an observation that doesn’t fit with our current theory and as a result modify that theory or possibly even replace it with something completely different.
In debugging, we almost always start from the latter. Our theory (that the software behaves as we think it does) is disproved by an observation (the bug) that demonstrates that we are mistaken.

1.    Examine what you know about the software’s behavior, and construct a hypothesis about what might cause it.
2.    Design an experiment that will allow you to test its truth (or otherwise).
3.    If the experiment disproves your hypothesis, come up with a new one, and start again.
4.    If it supports your hypothesis, keep coming up with experiments until you have either disproved it or reached a high enough level of certainty to consider it proven.

Instrumentation is code that doesn’t affect how the software behaves but instead provides insight into why it behaves as it does, e.g., logging.
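For example, a sketch of logging as instrumentation using java.util.logging; the parseOrDefault method and its messages are hypothetical:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class InstrumentedParser {
    private static final Logger LOG = Logger.getLogger(InstrumentedParser.class.getName());

    // The logging calls expose *why* this method behaves as it does
    // without changing *what* it does -- that is the defining property
    // of instrumentation.
    static int parseOrDefault(String text, int fallback) {
        LOG.log(Level.FINE, "parseOrDefault called with: {0}", text);
        try {
            return Integer.parseInt(text.trim());
        } catch (NumberFormatException e) {
            LOG.log(Level.WARNING, "Falling back to default for input: {0}", text);
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrDefault(" 17 ", 0)); // prints "17"
        System.out.println(parseOrDefault("oops", 0)); // prints "0", with a warning logged
    }
}
```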

Once you have found the source of the bug, there might be changes you made during the diagnosis phase that you want to retain. Check out a fresh copy of your code, and then follow this sequence:

1. Run the existing tests, and demonstrate that they pass.
2. Add one or more new tests, or fix the existing tests, to demonstrate the bug (in other words, to fail).
3. Fix the bug.
4. Demonstrate that your fix works (the failing tests no longer fail).
5. Demonstrate that you haven’t introduced any regressions (none of the tests that previously passed now fail).
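The sequence above can be sketched without any test framework. The allUpper() method and its null-handling bug here are illustrative assumptions (the name echoes the assertion example later in these notes):

```java
// A minimal, framework-free sketch of the fix sequence. Assumption: the bug
// was that allUpper() threw on null input; the fix makes it return false.
public class AllUpper {
    static boolean allUpper(String s) {
        if (s == null) {          // step 3: the fix
            return false;
        }
        return s.equals(s.toUpperCase());
    }

    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        // Step 2: this test demonstrated the bug (it failed before the fix);
        // step 4: it now passes.
        check(!allUpper(null), "null input should return false");
        // Step 5: tests of existing behavior still pass -- no regressions.
        check(allUpper("ABC"), "ABC is all upper case");
        check(!allUpper("abc"), "abc is not all upper case");
        System.out.println("all tests pass");
    }
}
```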

Bug fixing involves three goals:

1. Fix the problem.
2. Avoid introducing regressions.
3. Maintain or improve the overall quality (readability, architecture, test coverage, and so on) of the code.

Two golden rules:

1. Refactor, but never at the same time as modifying functionality.
2. One logical change, one check-in.

Make it obvious how to report a bug: Place instructions (or better yet, a direct link) to how to report a bug in your software’s About dialog box, online help, website, and anywhere else you think appropriate.
Automate: Install a top-level exception handler, and give the user the option to file a bug report that automatically contains all the relevant details.
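In Java, such a top-level handler can be installed with Thread.setDefaultUncaughtExceptionHandler. A sketch, with the actual report submission left as a hypothetical hook:

```java
public class CrashReporter {
    static volatile String lastReport = null;

    public static void main(String[] args) {
        // Install a top-level handler: any uncaught exception on any thread
        // lands here, where a real application could collect the relevant
        // details and offer to file a bug report automatically.
        Thread.setDefaultUncaughtExceptionHandler((thread, error) -> {
            lastReport = "Unexpected error in " + thread.getName() + ": " + error.getMessage();
            // Hypothetical hook: gather version, OS, and stack trace, then submit.
        });

        Thread worker = new Thread(() -> {
            throw new IllegalStateException("simulated crash");
        }, "worker");
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }

        System.out.println(lastReport);
        // prints "Unexpected error in worker: simulated crash"
    }
}
```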

Keep it simple: Each action you ask your users to perform will reduce the number who complete a transaction by half. In other words, ask them to click three times, and only 12.5 percent of them will complete. Five times, and you’ve reduced that figure to a little more than 3 percent.
Don’t have too rigid a template: It can be a good idea to have a standard template for bug reports, but beware of making that template too strict. Make sure that you have sensible options for each field including “none of the above.”

Automate environment and configuration reporting to ensure accurate reports.
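A sketch of automated environment reporting in Java, reading standard system properties; which properties matter for a given application is an assumption:

```java
public class EnvironmentReport {
    // Collect the environment details a bug report should carry automatically,
    // so users never have to transcribe them by hand (and get them wrong).
    static String report() {
        return "OS: " + System.getProperty("os.name")
             + " " + System.getProperty("os.version")
             + ", Java: " + System.getProperty("java.version")
             + ", User language: " + System.getProperty("user.language");
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```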
Aim for bug reports that are - specific, unambiguous, detailed, minimal and unique.

To deal with a poor quality codebase:
1. have the following in place - source code control system, automatic build process, continuous integration, automated testing
2. separate clean code from unclean and keep it clean
3. prioritise bugs
4. incrementally clean up code by putting tests in place and refactoring

Add “identify compatibility issues” to your bug-fixing checklist.

Addressing Compatibility Issues

Provide a Migration Path
Give your users some way to modify their existing data, code, or other artifacts to fit in with the new order, such as a utility that converts existing files so they work correctly with the new software.
It might be possible to automate this so that data is automatically upgraded during installation. Make sure that you both test this carefully and save a backup, though—your users will not thank you if the upgrade fails and destroys all their data in the process.
Implement a Compatibility Mode
Alternatively, you can provide a release that contains both the old and new code, together with some means of switching between them. Users can start by using the compatibility mode, which runs the old code, and switch to the new after they’ve migrated. Ideally this switch is automatic—when the software detects an old file, for example.

Microsoft Word is a good example of this approach. When it opens an old file (with a .doc extension), it does so in a compatibility mode (see Figure 8.1). Save that file in the new format (.docx), and Word’s behavior, and possibly your document’s layout, changes.

This is not a solution to be adopted lightly. It’s very high cost, both for you and for your users.

From your point of view, it does nothing for the quality of the code. From the user’s point of view, it’s confusing—they need to understand that the software supports two different behaviors, what the differences are, and when each is appropriate. Turn to it only if this cost is truly justified.

Provide Forewarning
If you know that you’re going to have to make a significant change but don’t have to make it immediately, you can provide users with forewarning that they will eventually need to migrate.

Of course, this works only if you can afford to delay your fix long enough for your users to migrate—and only if your users actually do migrate.

It is an excellent idea to incorporate performance tests into your regression test suite. They might run representative operations on large data sets and report if the time taken falls outside of acceptable bounds, for example.

It can even be worth having tests that fail when things become unexpectedly faster. If a test suddenly runs twice as fast after a change that shouldn’t have affected performance noticeably, that can also indicate a problem. Perhaps some code you were expecting to be executed isn’t any longer?
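A rough sketch of such a performance test in plain Java; the data size and the time bound are illustrative, not measured thresholds:

```java
import java.util.Arrays;
import java.util.Random;

public class PerformanceCheck {
    // Run a representative operation on a large data set and time it.
    static long timeSortMillis(int size) {
        int[] data = new int[size];
        Random random = new Random(7); // fixed seed for repeatable input
        for (int i = 0; i < size; i++) {
            data[i] = random.nextInt();
        }
        long start = System.nanoTime();
        Arrays.sort(data);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        long elapsed = timeSortMillis(1_000_000);
        // Upper bound catches slowdowns; a lower bound could additionally
        // catch the "suspiciously fast" case where expected work no longer
        // runs at all. The 5-second limit here is an arbitrary example.
        if (elapsed > 5_000) {
            throw new AssertionError("sort regression: took " + elapsed + " ms");
        }
        System.out.println("sort completed within bounds");
    }
}
```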

When patching an existing release, concentrate on reducing risk. Consider compatibility implications when fixing bugs, and fix performance bugs only after accurate profiling.

There’s more to effective automated testing than simply automating your tests. To achieve maximum benefit, your tests need to satisfy the following goals:

1. Unambiguous pass/fail: Each test outputs a single bit—pass or fail. No shades of gray, no qualitative output, no interpretation required. Just a simple yes or no.

2. Self-contained: No setup required before running a test. Before it runs, it sets up whatever environment it needs automatically, and just as important, it undoes any changes to the environment afterward, leaving everything as it found it.
3. Single-click to run all the tests: All tests can be run in one step without interfering with each other. As with a single test, the output of the complete test suite is a simple pass or fail—pass if every test passes, fail otherwise.
4. Comprehensive coverage: It’s easy to prove that achieving complete coverage for any nontrivial body of code is prohibitively expensive. But don’t allow that theoretical limitation to put you off—it is possible to achieve close enough to complete coverage as to make no practical difference.
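Goal 2 (self-contained) is worth illustrating: the test below creates its own temporary file and deletes it afterward, leaving the environment as it found it. The file name and contents are arbitrary:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class SelfContainedTest {
    public static void main(String[] args) {
        Path tempFile = null;
        try {
            // Setup: the test creates everything it needs itself...
            tempFile = Files.createTempFile("debugit-test", ".txt");
            Files.writeString(tempFile, "hello");
            if (!"hello".equals(Files.readString(tempFile))) {
                throw new AssertionError("round-trip failed");
            }
            System.out.println("pass");
        } catch (IOException e) {
            throw new AssertionError("test environment failure", e);
        } finally {
            // ...and teardown leaves the environment exactly as it found it.
            if (tempFile != null) {
                try {
                    Files.deleteIfExists(tempFile);
                } catch (IOException ignored) {
                }
            }
        }
    }
}
```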

Mocks and stubs are often confused. Stubs are passive, simply responding with canned data when called, whereas mocks are active, validating expectations about how and when they are called.
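A hand-rolled illustration of the distinction, assuming a hypothetical Mailer dependency; real projects would typically use a mocking library instead:

```java
interface Mailer {
    void send(String to, String body);
}

// A stub is passive: it just supplies canned behavior when called.
class StubMailer implements Mailer {
    public void send(String to, String body) {
        // do nothing; a test using this stub doesn't care whether it was called
    }
}

// A mock is active: it records calls and validates expectations about them.
class MockMailer implements Mailer {
    int sendCount = 0;
    String lastRecipient = null;

    public void send(String to, String body) {
        sendCount++;
        lastRecipient = to;
    }

    void verifySentOnceTo(String expected) {
        if (sendCount != 1 || !expected.equals(lastRecipient)) {
            throw new AssertionError("expected exactly one mail to " + expected);
        }
    }
}

public class MocksVsStubs {
    static void notifyUser(Mailer mailer, String user) {
        mailer.send(user, "Your report is ready");
    }

    public static void main(String[] args) {
        MockMailer mock = new MockMailer();
        notifyUser(mock, "alice@example.com");
        mock.verifySentOnceTo("alice@example.com"); // the mock checks how it was used
        System.out.println("mock expectations met");
    }
}
```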

⦁    Branch as late as possible. It may be tempting to create your stabilization branch well in advance (after all, if some stabilization is good, more must be better?), but the chances are that the productivity you lose by doing so isn’t worth it.
⦁    Stick to a single level of branching. If you find yourself branching your branches, you know that you’re in trouble.

⦁    Set up your continuous integration server to build all the branches that are actively being worked on.

⦁    Check in small changes often. Small changes are easier to understand, merge, and roll back if necessary.

⦁    In the branch, make only those changes that really need to be in the branch.
⦁    Merge from the branch to the trunk, not the other way around. The branch represents released software, so a problem in the branch is likely to have more severe consequences than a problem in the trunk.
⦁    Merge changes from branch to trunk immediately, while the change is fresh in your mind.
⦁    Keep an audit trail so you know which changes were merged and when.

So, it’s a good idea to have a build machine that is used to make release builds (possibly several build machines if you’re working on cross-platform software). It should always be kept pristine and not be used for anything else so that you can trust that it’s in the right state.
Whenever you make a release, you need to make sure that you keep a record of what source was used to create that release.
If you do have problems with tests that take too long to run, consider creating a suite of short tests that you can run for every check-in, as well as running the full suite overnight.
So, the first rule is to use static analysis. Switch on all of the warnings supported by your compiler and get hold of any other tools that might prove useful in your environment.

The second rule is to integrate your chosen tool or tools tightly into your development process. Don’t run them only occasionally—when you’re looking for a bug, for example. Run them every single time you compile your source. Treat the warnings they generate as errors, and fix them immediately. Integrate static analysis into every build.

Contracts, Pre-conditions, Post-conditions, and Invariants
One way of thinking about the interface between one piece of code and another is as a contract. The calling code promises to provide the called code with an environment and arguments that conform to its expectations. In return, the called code promises to carry out certain actions or return certain values that the calling code can then use.
It’s helpful to consider three types of condition that, taken together, make up a contract:
Pre-conditions: The pre-conditions for a method are those things that must hold before it’s called in order for it to behave as expected. The pre-conditions for our addHeader() method are that its arguments are nonempty, don’t contain invalid characters, and so on.
Post-conditions: The post-conditions for a method are those things that it guarantees will hold after it’s called (as long as its pre-conditions were met). A post-condition for our addHeader() method is that the size of the headers map is one greater than it was before.
Invariants: The invariants of an object are those things that (as long as its methods’ pre-conditions are met before they’re called) it guarantees will always be true—that the cached length of a linked list is always equal to the length of the list, for example.
If you make a point of writing assertions that capture each of these three things whenever you implement a class, you will naturally end up with software that automatically detects a wide range of possible bugs.
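A sketch of the addHeader() example with its contract expressed as assertions; the surrounding Message class is an assumption reconstructed from the conditions described above:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class built around the book's addHeader() example.
public class Message {
    private final Map<String, String> headers = new HashMap<>();

    public void addHeader(String name, String value) {
        // Pre-conditions: arguments are nonempty, and the header is new
        // (otherwise put() would replace, not add).
        assert name != null && !name.isEmpty() : "header name must be non-empty";
        assert value != null && !value.isEmpty() : "header value must be non-empty";
        assert !headers.containsKey(name) : "header already present: " + name;
        int sizeBefore = headers.size();

        headers.put(name, value);

        // Post-condition: the headers map grew by exactly one entry.
        assert headers.size() == sizeBefore + 1 : "header was not added";
    }

    public int headerCount() {
        return headers.size();
    }

    public static void main(String[] args) {
        Message message = new Message();
        message.addHeader("Content-Type", "text/plain");
        System.out.println(message.headerCount()); // prints "1"
    }
}
```

Run with `java -ea` during development so the assertions actually fire.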

Evaluating assertions takes time and doesn’t contribute anything to the functionality of the software (after all, if the software is functioning correctly, none of the assertions should ever do anything). If an assertion is in the heart of a performance-critical loop or its condition takes a while to evaluate, it can have a detrimental effect on performance.

A more pertinent reason for disabling assertions, however, is robustness. If an assertion fails, the software unceremoniously exits with a terse and (to an end user) unhelpful message.

Have the best of both worlds—robust production software (i.e., software that will work even in the presence of bugs) and fragile development/debugging software (i.e., with assert statements enabled):

assert s != null : "Null string passed to allUpper";
if (s == null)
    return false;

As with many tools, assertions can be abused. There are two common mistakes you need to avoid—assertions with side effects and using them to detect errors instead of bugs.
An assertion’s task is to check that the code is working as it should, not to affect how it works. For this reason, it’s important that you test with assertions disabled as well as with assertions enabled. If any side effects have crept in, you want to find them before the user does.
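A contrived illustration of the side-effect pitfall: the bad assertion below consumes an iterator element, so the method behaves differently when assertions are disabled.

```java
import java.util.Iterator;
import java.util.List;

public class AssertionSideEffects {
    // BAD: the assertion advances the iterator. With assertions disabled the
    // element is *not* consumed, so the method's result changes -- exactly
    // the kind of divergence to avoid.
    static boolean badHasTwo(Iterator<Integer> it) {
        assert it.next() != null; // side effect: consumes an element!
        return it.hasNext();
    }

    // GOOD: the assertion only observes state; behavior is identical
    // whether assertions are enabled or not.
    static boolean goodNonEmpty(List<Integer> items) {
        assert items != null : "items must not be null";
        return !items.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(goodNonEmpty(List.of(1, 2, 3))); // prints "true"
    }
}
```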

Errors may be undesirable, but they can happen in bug-free code. Bugs, on the other hand, are impossible if the code is operating as intended.

Here are some examples of conditions that almost certainly should not be handled with an assertion:
⦁    Trying to open a file and discovering that it doesn’t exist
⦁    Detecting and handling invalid data received over a network connection
⦁    Running out of space while writing to a file
⦁    Network failure

Error-handling mechanisms such as exceptions or error codes are the right way to handle these situations.
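For instance, a missing file handled as an error rather than asserted as a bug (the file name and fallback here are arbitrary):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ErrorsNotBugs {
    // A missing file is an error, not a bug: it can happen in bug-free code,
    // so it gets normal error handling rather than an assertion.
    static String readOrDefault(Path path, String fallback) {
        try {
            return Files.readString(path);
        } catch (IOException e) {
            // Expected, recoverable condition: report it or fall back.
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(readOrDefault(Path.of("no-such-file.txt"), "default contents"));
        // prints "default contents"
    }
}
```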

Be very suspicious of any proposal to rewrite. Perform a very careful cost/benefit analysis. Sometimes the old code really is so terrible that it’s not worth persevering with it, but take the time to prove this to yourself.

If you do decide to go down this road,minimize your exposure as much as possible. Try to find a way to rewrite the code incrementally instead of in a “big bang.”

Test against the existing code, and verify that you get the same results.Be particularly careful to find the corner cases that the existing code handles correctly and that you need to replicate.
