Ramblings of an aging IT geek
hardware

a cheap logic analyser and finally seeing the bus

How a sub-ten-pound logic analyser and sigrok turned a misbehaving I2C sensor from guesswork into a problem I could actually read off the wire.

A logic analyser clipped onto a breadboard

For about a fortnight I had a temperature sensor that worked perfectly, except when it didn't. An I2C job, a little BME280 hanging off a microcontroller, reading humidity and pressure and temperature into a logger. Most of the time it was flawless. Then every so often a read would come back as nonsense, or the whole bus would lock solid and need a power cycle to come back.

I'd done the usual things you do when you can only see one end of a conversation. Added retries. Added a bus-recovery routine that clocks out nine pulses to unstick a wedged slave. Stared at the datasheet timing diagrams until they stopped meaning anything. All of it was guesswork, because I was reading the symptoms off the microcontroller, which is exactly the side that's confused. I had no idea what was actually happening on the two wires between the chips.
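For anyone who hasn't met the nine-pulse trick, here is a minimal sketch of the idea rather than my actual routine. The `sim_bus_t` struct and the `sim_*` helpers are hypothetical stand-ins for real GPIO calls, modelling a wedged slave that lets go of SDA once it has been clocked enough times. On real hardware you would also need delays between the edges and a STOP condition afterwards.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for real pins: a wedged slave that holds SDA low
 * until it has seen `clocks_needed` rising edges on SCL. */
typedef struct {
    int  clocks_needed;  /* rising edges still required to free SDA */
    bool scl;            /* current SCL level */
} sim_bus_t;

static void sim_set_scl(sim_bus_t *bus, bool level) {
    if (level && !bus->scl && bus->clocks_needed > 0) {
        bus->clocks_needed--;         /* slave shifts out one more bit */
    }
    bus->scl = level;
}

static bool sim_read_sda(const sim_bus_t *bus) {
    return bus->clocks_needed == 0;   /* high once the slave lets go */
}

/* Pulse SCL up to nine times, stopping as soon as SDA is released.
 * Returns the number of pulses used, or -1 if SDA is still held low. */
static int i2c_bus_recover(sim_bus_t *bus) {
    for (int pulse = 0; pulse < 9; pulse++) {
        if (sim_read_sda(bus)) {
            return pulse;             /* bus is free again */
        }
        sim_set_scl(bus, false);
        sim_set_scl(bus, true);       /* one full clock pulse */
    }
    return sim_read_sda(bus) ? 9 : -1;
}
```

Nine is the magic number because a slave stuck halfway through sending a byte needs at most eight data bits plus the acknowledge slot before it gives the data line back.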

So I finally bought the thing I'd been putting off for years: an eight-channel USB logic analyser. One of the little blue ones, a clone of the Saleae, about eight quid delivered. I'd avoided it on the grounds that I "didn't really need one", which turned out to mean "didn't realise how much I needed one".

the software is the actual product

The hardware is almost beside the point. What you're really buying access to is sigrok, and specifically PulseView, its GUI. You install it, plug the analyser in, and it shows up as an fx2lafw device. The clones are all built around the same Cypress FX2 chip and run the same open-source fx2lafw firmware, so the open-source stack treats them as first-class.

Eight channels, sampling up to a few megahertz on the cheap ones, which sounds slow until you remember standard-mode I2C runs at 100kHz and fast-mode at 400kHz. You need to sample several times faster than your signal to see it cleanly, and even at 1MS/s you get ten samples per clock period on a 100kHz bus, which is comfortable headroom. A 400kHz bus would want a few megasamples per second for the same margin.
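The headroom sum is simple enough to write down. This is just the arithmetic as a sketch; `samples_per_clock` is a name I've made up for illustration.

```c
#include <stdint.h>

/* Whole samples captured per SCL period at a given sample rate.
 * Integer division, so a fractional sample is rounded down. */
static uint32_t samples_per_clock(uint32_t sample_rate_hz,
                                  uint32_t bus_clock_hz) {
    return sample_rate_hz / bus_clock_hz;
}
```

At 1MS/s that gives 10 samples per clock on a 100kHz bus but only 2 on a 400kHz one, which is why a fast-mode bus wants a higher capture rate.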

You clip the probes on, two for I2C plus a ground, set a trigger, and capture. And then the magic bit: PulseView has protocol decoders. You don't sit there counting square waves and translating them into bytes in your head like it's 1985. You add the I2C decoder, point it at your clock and data channels, and it annotates the trace with the actual transaction. Start condition, address, read or write bit, ACK, data bytes, stop.
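To take some of the magic out of it, here is a toy version of what an I2C decoder does with the raw samples. It is a sketch of the general technique, not sigrok's actual decoder, and the type and function names are my own. The rules are the standard ones: SDA falling while SCL is high is a START, SDA rising while SCL is high is a STOP, and a data bit is whatever SDA held across a full clock pulse.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { EV_START, EV_STOP, EV_BIT0, EV_BIT1 } i2c_event_t;

/* Walk paired SCL/SDA samples (each 0 or 1) and record bus events.
 * Returns the number of events written to `out` (at most `max_out`). */
static size_t i2c_decode(const uint8_t *scl, const uint8_t *sda, size_t n,
                         i2c_event_t *out, size_t max_out) {
    size_t count = 0;
    int pending = -1;  /* bit latched on the last rising SCL edge, -1 if none */

    for (size_t i = 1; i < n && count < max_out; i++) {
        if (!scl[i - 1] && scl[i]) {
            pending = sda[i];                   /* rising edge: latch SDA */
        } else if (scl[i - 1] && scl[i]) {
            if (sda[i - 1] != sda[i]) {         /* SDA moved while SCL high */
                out[count++] = sda[i] ? EV_STOP : EV_START;
                pending = -1;                   /* that pulse was not data */
            }
        } else if (scl[i - 1] && !scl[i]) {
            if (pending >= 0) {                 /* falling edge: commit bit */
                out[count++] = pending ? EV_BIT1 : EV_BIT0;
            }
            pending = -1;
        }
    }
    return count;
}
```

A bit is only committed on the falling edge because the clock pulse that carries a STOP or a repeated START also begins with a rising edge, and that one must not be counted as data. Grouping the bits into bytes and ACKs is then just counting to nine.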

A close-up of a circuit board and probe clips

The first time you see your own bus traffic decoded into "START, write 0x76, ACK, 0xF7, ACK, repeated START, read 0x76, ACK, 0x84..." it is genuinely a small revelation. The thing you'd been imagining is just there, in front of you, true.
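One thing worth knowing when you read a decode like that: the "write 0x76" and "read 0x76" are the same 7-bit address with a different direction bit tacked on the end, so the raw byte on the wire is not 0x76 at all. A small sketch, with `i2c_address_byte` being a hypothetical helper name:

```c
#include <stdbool.h>
#include <stdint.h>

/* Build the first byte of an I2C transfer: the 7-bit address in the top
 * seven bits, and the direction in bit 0 (0 = write, 1 = read). */
static uint8_t i2c_address_byte(uint8_t addr7, bool read) {
    return (uint8_t)((addr7 << 1) | (read ? 1u : 0u));
}
```

For address 0x76 that works out as 0xEC for a write and 0xED for a read, which is what you would see if you decoded the bits by hand.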

what the trace actually showed

I set a trigger on a falling edge of SDA while SCL was high, which is the I2C start condition, and let it capture a few hundred transactions overnight into the logger. Then I went hunting for the bad ones.

The good reads looked exactly like the datasheet. Address, register pointer, repeated start, three bytes back, clean stop. Textbook. I lined a few of them up side by side in PulseView and they were near enough identical, which is its own kind of reassurance: when the happy path is boringly consistent, the failures have to be doing something different, and now I could go and find the difference rather than imagine it.

Hunting through a few hundred captures by eye would have been tedious, so I leaned on the other thing the open stack gives you. sigrok-cli will run the same decoders from the command line and dump the annotations as text, which you can then pipe through grep like any other log. A capture of every transaction, decoded to lines, and a quick search for the ones that didn't end in a clean stop. That narrowed a few hundred frames down to the dozen or so that had gone wrong, and they all shared a feature.
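The filtering itself is nothing clever. Here is the same idea as a sketch in C rather than grep. It assumes one decoded annotation per line, and that the START, repeated START and STOP annotations contain the text "Start", "Start repeat" and "Stop"; check the exact strings against your own decoder output before trusting it. `count_unterminated` is a made-up name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Count transactions that open with a START but never reach a STOP.
 * `lines` holds one decoded annotation per entry (format assumed). */
static size_t count_unterminated(const char *const *lines, size_t n) {
    size_t bad = 0;
    bool open = false;                    /* currently inside a transaction? */

    for (size_t i = 0; i < n; i++) {
        bool is_repeat = strstr(lines[i], "Start repeat") != NULL;
        bool is_start  = !is_repeat && strstr(lines[i], "Start") != NULL;
        bool is_stop   = strstr(lines[i], "Stop") != NULL;

        if (is_start) {
            if (open) {
                bad++;                    /* previous one never stopped */
            }
            open = true;
        } else if (is_stop) {
            open = false;
        }
    }
    if (open) {
        bad++;                            /* capture ended mid-transaction */
    }
    return bad;
}
```

A repeated START is deliberately not treated as a new transaction, because a normal register read has one in the middle.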

The bad ones had a tell I'd never have guessed at from the microcontroller side. Right before a failed read, there was a stretched clock low that went on far longer than the rest. The slave was clock-stretching, holding SCL down to say "wait, I'm not ready", which is a perfectly legal part of the I2C spec. And my bit-banged I2C implementation on the microcontroller wasn't honouring it. It was driving the clock on a fixed delay regardless of whether the slave had released the line.

Most of the time the sensor was quick enough that the stretch never collided with my next clock edge, and everything worked. Occasionally, under some internal-conversion timing, the stretch ran long, my code clocked anyway, the slave and master disagreed about which bit they were on, and the whole frame derailed. That's your locked bus. That's your nonsense byte.

You cannot deduce clock-stretching from the confused master. A master that ignores stretching is, by definition, not watching the line the slave is holding down. You can only see it from the outside, on the wire, which is precisely what the analyser gave me.

the fix, and the wider point

The fix was small once I knew what I was fixing. In the SCL-release part of the bit-bang, instead of just delaying and moving on, wait for the line to actually go high before continuing:

// release the clock, then wait for the slave to let it rise
gpio_set_input(SCL);          // let the pull-up bring it high
uint32_t timeout = 10000;     // poll budget, so a dead slave can't hang us
while (gpio_read(SCL) == 0) { // slave is clock-stretching
    if (--timeout == 0) break; // give up; SCL is still low at this point
}

That's the whole bug. A missing wait-for-high, a hardware handshake I'd been ignoring because I never knew it was happening. Reads have been clean for several days now, no lockups, no recovery routine triggering. I've even pulled the nine-pulse unstick code back out, because it was only ever a plaster over a wound I couldn't see.
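If you want to poke at the wait-for-high logic away from the hardware, here is a sketch that takes the pin read as a callback, so it can be driven by a pretend slave on a desktop. The callback type, `wait_scl_high` and `sim_stretching_slave` are all hypothetical names, and unlike my snippet this version tells the caller whether the line actually rose.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pin-read callback so the logic can run off-target. */
typedef bool (*read_pin_fn)(void *ctx);

/* Poll SCL until it reads high or `max_polls` attempts are used up.
 * Returns true if the line rose, false if the stretch outlasted the budget. */
static bool wait_scl_high(read_pin_fn read_scl, void *ctx, uint32_t max_polls) {
    while (max_polls-- > 0) {
        if (read_scl(ctx)) {
            return true;              /* slave released the clock */
        }
    }
    return false;                     /* still low: treat as a bus error */
}

/* Pretend slave: holds SCL low for `*remaining` polls, then releases it. */
static bool sim_stretching_slave(void *ctx) {
    uint32_t *remaining = ctx;
    if (*remaining == 0) {
        return true;
    }
    (*remaining)--;
    return false;
}
```

Returning a status matters because a timed-out wait that carries on clocking is the original bug again, only rarer.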

What I keep thinking about is how long I'd have carried on guessing. I'd written retries and recovery and defensive nonsense, all of it treating the symptom, none of it touching the cause, because I was debugging blind. The analyser cost less than lunch and turned an invisible two-wire argument into something I could just read.

If you do any embedded or hardware work and you've been telling yourself you don't need one, you're me a fortnight ago. Buy the cheap one. Install PulseView. The first decoded trace pays for it.