We have been using OverOps for a while now on the software that we develop at our sister company WebTuna. I am the main developer for WebTuna and I can see how it has subtly changed the way I code.
Before OverOps, the fact that we had an exception mattered more than its specific type, and as a consequence we had few exception types. In OverOps you can rank the top exceptions by type, so it makes much more sense to have fine-grained exceptions, and I have started to introduce them. You can also do this in log-aggregation tools, which we have used in the past, but OverOps has another important advantage: its ability to zoom into the source code and see the variable values. Having specific examples of the values that caused an exception helps to name the exceptions appropriately.
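As a rough illustration of what fine-grained exceptions can look like in Java, here is a minimal sketch. The class names, fields and validation rules are hypothetical and are not taken from the WebTuna code base; the point is only that each failure mode gets its own type, so a ranking by type tells you something.

```java
// Hypothetical hierarchy: one base type plus one subtype per failure mode.
class ApiRequestException extends RuntimeException {
    ApiRequestException(String message) {
        super(message);
    }
}

// Raised when a required value is absent (illustrative rule only).
class MissingTimestampException extends ApiRequestException {
    MissingTimestampException() {
        super("request has no timestamp");
    }
}

// Raised when a duration is below zero; keeps the offending value for diagnosis.
class NegativeDurationException extends ApiRequestException {
    final long durationMillis;

    NegativeDurationException(long durationMillis) {
        super("negative duration: " + durationMillis + " ms");
        this.durationMillis = durationMillis;
    }
}

class RequestValidator {
    // Throws a specific subtype instead of a generic exception.
    static void validate(Long timestamp, long durationMillis) {
        if (timestamp == null) {
            throw new MissingTimestampException();
        }
        if (durationMillis < 0) {
            throw new NegativeDurationException(durationMillis);
        }
    }
}
```

Callers that do not care about the distinction can still catch the base type, while a tool that ranks by type sees two separate entries.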
We use package names to distinguish between code developed by different teams. We also include some third-party code whose exceptions we do not want to ignore but do want to categorise separately (you can ignore exceptions raised by third-party code entirely if you want to). For example, we generate alerts for com.webtuna packages but not for com.thirdparty packages, which we prefer to inspect periodically. As a consequence we have tidied up our package names so that they can be grouped more easily.
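The grouping itself is done in the monitoring tool's configuration, not in our code, but the idea behind it can be sketched in a few lines of Java. This is only an illustration of why tidy package prefixes help: the class name, method and the "alert"/"review"/"other" labels are hypothetical, and the sketch simply looks at the class of the top stack frame.

```java
class ExceptionOrigin {
    // Classifies an exception by the package of the frame that threw it.
    // The labels are illustrative: "alert" for our own code, "review" for
    // third-party code we inspect periodically, "other" for everything else.
    static String classify(Throwable t) {
        StackTraceElement[] frames = t.getStackTrace();
        if (frames.length == 0) {
            return "other";
        }
        String className = frames[0].getClassName();
        if (className.startsWith("com.webtuna.")) {
            return "alert";
        }
        if (className.startsWith("com.thirdparty.")) {
            return "review";
        }
        return "other";
    }
}
```

A simple prefix test like this only works if every team's code lives under a consistent root package, which is the reason for the tidy-up.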
We use deployment names that include the product version number so that we can see when problems were introduced, disappeared or were even reintroduced.
There is nothing like seeing the same exception at the top of the charts to make you want to do something about it (well, it works on me anyway). You can obviously just hide it, but that is not playing the game or really addressing the issue. Behind OverOps there is a philosophy of aiming for zero exceptions (in a similar way to aiming for a zero e-mail inbox, something else I haven’t achieved). This is discussed in far more detail in the OverOps e-book, and I have blogged about the book in the past. So how do you reduce the number of exceptions? The first step is to recognise that some exceptions cannot be eliminated because they effectively form part of the flow control: for example, you cannot easily know whether a string can be converted to a number unless you try it. You can safely hide these. Even here you can do some simple things to avoid the exception in the common cases, such as testing first for a null or empty string, or for zero (the most common value in our application).
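To make the number-parsing case concrete, here is a minimal Java sketch. The class and method names are hypothetical, and I am assuming that "zero" arrives as the literal string "0". The cheap checks run first, so the exception path is only taken for genuinely odd input.

```java
import java.util.OptionalLong;

class LenientNumbers {
    // Returns the parsed value, or an empty result if the text is not a number.
    static OptionalLong parse(String text) {
        // Cheap checks: no exception is raised for these common cases.
        if (text == null || text.isEmpty()) {
            return OptionalLong.empty();
        }
        if (text.equals("0")) {
            return OptionalLong.of(0L);
        }
        // Everything else still has to be tried; the exception here is
        // effectively flow control and cannot be avoided entirely.
        try {
            return OptionalLong.of(Long.parseLong(text));
        } catch (NumberFormatException e) {
            return OptionalLong.empty();
        }
    }
}
```

The remaining NumberFormatException is the kind that can reasonably be hidden, because it no longer fires for null, empty or zero values.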
Exceptions arise in three basic situations: invalid parameters passed to your method, processing within your method and invalid values returned by your method. The first is the responsibility of the caller; the second and third are your responsibility. By far the most common cause of exceptions within WebTuna is the first case: bad data received by the API. In the past, I simply rejected these calls with an exception, programming by contract if you will. But it is an unfortunate fact of life that these bad calls are not going to go away any time soon; it will be years before they disappear entirely, and in the meantime new ones will probably be introduced. Therefore, I have started to accommodate these calls with a few simple fix-ups so that they make sense. Doing this has removed 85% of the exceptions caused by bad calls. Obviously this is a particular scenario that works for WebTuna and may not apply to you.
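The kind of fix-up I mean can be sketched as follows. The specific rules here (defaulting a missing name to "unknown", trimming whitespace, clamping a negative duration to zero) are hypothetical examples chosen for illustration and are not the actual WebTuna rules; which repairs are safe depends entirely on your own data.

```java
class RequestFixups {
    // Hypothetical fix-up: treat a missing or blank name as "unknown"
    // rather than rejecting the whole call.
    static String normaliseName(String name) {
        if (name == null) {
            return "unknown";
        }
        String trimmed = name.trim();
        return trimmed.isEmpty() ? "unknown" : trimmed;
    }

    // Hypothetical fix-up: a negative duration makes no sense, so clamp it
    // to zero instead of throwing.
    static long normaliseDuration(long durationMillis) {
        return Math.max(0L, durationMillis);
    }
}
```

The trade-off is that a repaired value hides the caller's bug, so it is worth counting or logging how often each fix-up fires.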
At each release we aim to reduce the number of exceptions and, in turn, create more reliable software. I am sure the way that I write code will continue to evolve as I use OverOps.
If you would like to try OverOps and see how to improve your own coding, then click below, or go to http://www.applicationperformance.com/overops/ to find out more. If you'd like to talk to one of the team here at AP then please contact us.