Before joining Application Performance, the previous decade of my career was spent mainly delivering e-commerce platforms, especially within the online food delivery business. The majority of the platforms were completely customised solutions, developed from the ground up. These systems processed high volumes of orders during peak trading times and needed to cope with ever-increasing transaction loads.
One of the solutions I worked on at the time was old enough to have gone through several development life cycles, developers and approaches. Certainly, within older parts of the solution, we had a mixture of logging approaches, the main one being to write verbose details to a log file. This approach was not efficient, and so by default it would be switched off in a production environment.
While the solution worked great, trying to track down a specific issue was, at times, a challenge.
Due to the volume and value of orders going through, combined with our now-historic logging approach, tracking down a specific issue during a customer’s peak trading hour was no easy task. First, we would need to enable verbose logging and then work out which load-balanced server the issue had occurred on. Once we found the appropriate log file, we often had to trawl through hundreds of megabytes, if not gigabytes, of log data, searching for a specific exception code or captured message.
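To give a feel for that manual hunt, here is a minimal sketch in Python of searching every server's log files for one exception code. It is an illustration only, not the tooling we used at the time: the function name `find_in_logs`, the directory layout (one folder per server, each holding `.log` files) and the example code `ORD-5012` are all assumptions made for the example.

```python
from pathlib import Path
from typing import Iterator, Tuple


def find_in_logs(log_root: Path, needle: str) -> Iterator[Tuple[str, int, str]]:
    """Yield (server, line_number, line) for every log line containing needle.

    Assumes a hypothetical layout of log_root/<server-name>/*.log.
    """
    for log_file in sorted(log_root.glob("*/*.log")):
        server = log_file.parent.name
        # Stream line by line so a multi-gigabyte file never sits in memory.
        with log_file.open("r", encoding="utf-8", errors="replace") as handle:
            for line_number, line in enumerate(handle, start=1):
                if needle in line:
                    yield server, line_number, line.rstrip("\n")
```

Calling `find_in_logs(Path("/var/log/shop"), "ORD-5012")` (both values hypothetical) would yield the server name, line number and text of each matching line. Even with a helper like this, the search only works if verbose logging was already switched on when the issue happened.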
Despite best efforts, issues can and will happen.
Looking back now, the pressure that I, the development teams, the customer, and their suppliers dealt with during those times was extreme. Naturally, the customer wanted a quick, effective resolution followed by a prompt follow-up explanation of what went wrong and how to prevent it from recurring.
With high-profile brands and customers, an issue or outage could cause reputational damage for both ourselves and our customers. In addition, we could be liable for financial compensation, something that could easily sit with you over the weekend, waiting to welcome you back to your desk first thing on a Monday morning!
The Ultimate Production Debug Tool
OverOps is the perfect tool to help developers see what is going on under the hood of their solution in a production environment, without the overhead of enabling verbose logging and the time-consuming search through gigabytes of log data to understand an issue.
If OverOps had existed back then, I would most certainly have acquired it for my development teams, for several key reasons:
- OverOps is a low-overhead production debugger. There is no manual switching on or off. It simply sits quietly in the background, capturing debug data before, during and after an issue occurs.
- OverOps captures both the variable states and the exception source code. With the variable values and the source code at the point of the exception, we can quickly identify the affected code and replicate the issue, which means a faster diagnosis, a faster fix and faster testing of that fix.
- An easy-to-use UI providing direct access to errors within your solution. When you first log in to OverOps, you are presented with a dashboard view of the errors along with key values such as the total times thrown, the percentage error rate, when the error was introduced and more. This view is great for understanding which errors to focus on, helping drive your product backlog forwards.
- Simple integrations with third-party tools such as JIRA, Slack, HipChat, Splunk, ServiceNow, AppDynamics, New Relic and more.
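To show why variable state at the point of failure is so valuable, here is a toy sketch in Python. It is not a description of how OverOps is implemented; it only demonstrates the general idea of snapshotting the local variables of the function that threw. The names `snapshot_failure` and `apply_discount` are hypothetical and exist purely for this example.

```python
from typing import Any, Dict


def snapshot_failure(exc: BaseException) -> Dict[str, Any]:
    """Return the failing function, line number and local variables for exc."""
    tb = exc.__traceback__
    if tb is None:
        raise ValueError("exception has no traceback; was it ever raised?")
    # Walk to the innermost frame, which is where the exception was raised.
    while tb.tb_next is not None:
        tb = tb.tb_next
    frame = tb.tb_frame
    return {
        "error": f"{type(exc).__name__}: {exc}",
        "function": frame.f_code.co_name,
        "line": tb.tb_lineno,
        # repr() keeps the snapshot loggable even for awkward objects.
        "locals": {name: repr(value) for name, value in frame.f_locals.items()},
    }


def apply_discount(order_total: float, voucher_count: int) -> float:
    """Hypothetical order calculation that fails when voucher_count is zero."""
    per_voucher = order_total / voucher_count
    return order_total - per_voucher
```

Catching the error from `apply_discount(25.0, 0)` and passing it to `snapshot_failure` returns the function name together with `order_total` and `voucher_count` as they were when the division failed, which is exactly the detail that a log file rarely contains.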
These days I'm working on the other side of the fence, helping customers to improve their code quality and the performance of their applications. In that new role, I'm happy to recommend OverOps in the hope that it saves more people from the stress of troubleshooting production application issues with only log files.
If you would like to try OverOps on your own application, go to http://www.applicationperformance.com/overops/ to find out more. If you'd like to talk to one of the team here at AP, please contact us.