In the rush to troubleshoot, it’s easy to skip past the basic checks that should come first.
I have always regarded debugging circuit hardware and software as the most challenging engineering disciplines. Yes, new design can be difficult, with its many constraints and tradeoffs, but when something doesn’t work as planned — or works and then fails — that’s when the real “I have to think this through carefully and figure it out” clock starts running. Sometimes, the problem and its root causes are hard to discern, but other times, they are right in front of you, and you either can’t see them — or you are not looking for them in the right way.
It’s easy to lose track of basic realities when debugging. Many years ago, I spent (well, more like wasted) an entire morning debugging a circuit board that had worked the day before. Now, however, it wasn’t working at all, and the test-point voltages and waveforms were inconsistent and erratic.
Our project team leader and mentor kept asking what was going on, and I admitted I was mystified. I took careful notes and even captured key waveforms on the oscilloscope.
So what was the problem? Long story short: I neglected to follow the first two steps when debugging: 1) check the power supply nominal value and its quality at the load, and 2) check the connections. In this case, the problem bridged both rules. The external power supply output was fine, but that output wasn’t reaching the circuit board due to a bad connection between the supply and the board. If I had simply checked the voltage rail at the non-working board, I would have quickly seen the real problem.
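Those first two checks can be sketched as a small decision routine. This is only an illustrative sketch, not a procedure from the original story: the function name `check_rail`, the 5-V rail in the usage note, the 5% tolerance, and the 0.1-V allowable drop are all assumptions chosen for the example. It also looks only at the DC value, so it says nothing about ripple or noise on the rail.

```python
from dataclasses import dataclass


@dataclass
class RailCheck:
    supply_ok: bool   # voltage at the supply terminals is within tolerance
    load_ok: bool     # voltage at the board (the load) is within tolerance
    diagnosis: str    # where to look next


def check_rail(nominal_v, supply_v, load_v, tolerance=0.05, max_drop_v=0.1):
    """Compare the rail at the supply and at the load against nominal.

    All thresholds are hypothetical defaults for illustration only.
    """
    low = nominal_v * (1 - tolerance)
    high = nominal_v * (1 + tolerance)

    supply_ok = low <= supply_v <= high
    load_ok = low <= load_v <= high
    drop_v = supply_v - load_v  # voltage lost between supply and board

    if not supply_ok:
        diagnosis = "supply out of tolerance: suspect the power supply"
    elif not load_ok or drop_v > max_drop_v:
        diagnosis = "supply fine but rail bad at the load: suspect wiring or connectors"
    else:
        diagnosis = "rail within tolerance at the load: look elsewhere"

    return RailCheck(supply_ok, load_ok, diagnosis)
```

For a situation like the one described, a call such as `check_rail(5.0, 5.02, 0.3)` reports a healthy supply but a bad rail at the board, which points at the connection between them. The point of measuring at the load rather than at the supply is that a reading taken only at the supply terminals would have looked fine.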
Fortunately, my mentor was understanding about my overlooking the obvious and the subsequent waste of time. He noted that it was a lesson I would never forget (and he was so right). He also gave me good debugging advice: “When things don’t make sense, just stop, walk away for a while, and then start over from the beginning.”
Recently, though, I made another “hey, you’re not thinking here” mistake in a situation that was just as simple as the previous one. I had an electric blanket whose heat seemed less than it used to be. This blanket has two parts: the blanket itself and the external controller (Figure 1).
Fortunately, I had an identical controller from a second blanket of the same type that worked fine, so I figured I could swap controllers to see if the controller was the problem. To make the A/B comparison easier, I would simply plug that second controller into an unused outlet on the same extension cord that powered the first controller (Figure 2).
I could then test each controller by swapping between them at the blanket’s special connector but could also avoid the hassle of swapping their AC plugs.
One minor add-on: I had previously installed a simple AC on/off switch between the extension cord and the original controller’s line cord to provide a peace-of-mind “hard” shutoff when the blanket was not in use (Figure 3). So now I had one blanket controller plugged directly into the AC line and the other connected through an on/off switch.
I swapped the connectors at the blanket as planned and observed the results, which made no sense. The blanket worked with controller #2 but not with controller #1. I was puzzled about why controller #1, which had worked just a few minutes before, no longer worked. I even hypothesized possibilities: perhaps controller #2 had induced a fatal transient spike into controller #1 via the common extension cord outlet block? I was about to trash that first controller when I stopped, took a deep breath, stepped away, and then went back over the set-up.
The problem was immediately apparent. My clever idea to share the extension cord triple outlet to eliminate the need to plug/unplug and swap the cords had worked against me. Mystery solved: Original controller #1 was connected to the AC line via the on/off switch, while controller #2 was plugged in directly, and the switch was in the off position! Once again, the source of the mystery was right in front of me, but it took me a while to remove the mental blinders.
So what can you do to improve your debugging besides “practice, practice, practice” and taking a break when things don’t make sense?
One answer is to read a book. No, not an unrelated book about life in the tropics to calm you down or a book telling of heroic debugging efforts and how they eventually succeeded, but a single book that walks and talks you through a systematic approach to debugging, what to do, what not to do, and more. Read “Debugging: The 9 Indispensable Rules for Finding Even the Most Elusive Software and Hardware Problems” by David J. Agans (Figure 4), and you’ll get good lessons about a strategy and reality-based tactics for facing the debugging battle.
Admittedly, my blanket problem was a very small-time debugging problem. But if you can get blinded by this sort of simple oversight, it’s not a good omen when encountering the bigger ones. My advice: read the book, keep calm, and carry on — but with a methodical process, careful steps, and attention to detail.