Digging myself into (and out of) a rabbit hole^W pipe

May 29, 2025

I've been hacking on aerc lately, and I'd like to talk about a very recent issue I've been fighting with. It's been super frustrating, but I also learnt a lot, and eventually came out "victorious"... so in the end, it was fun.

If you talk too much, I'll shut you off!

You can do almost anything with aerc: it provides generic mechanisms and is highly configurable, so the sky is the limit. If you want to see an email's raw data, you can just pipe its content into less using :pipe -s less and inspect the data in the tab that this opens. And if you have selected several emails, it also works: the messages are combined into an mbox.

Excited about this, and because it seems I like to f... around with software, I selected a bunch of messages, :pipe -s less-ed them and... nothing. Not nothing like "it does not do anything", but rather like "I am frozen, and your only option is to kill -9 me".

A big believer that "a problem reproduced is half solved", I started to read the code in charge of the pipe command, thinking I'd quickly get to the bottom of the problem. How naive! :-)

Pipes for dummies

Needless to say, everything looked fine in the code, so I started debugging (aerc is written in Go, and I'm still early in my journey with this language, so "debugging" means "printf debugging" here), and was going nowhere... so I decided to invest a bit of time and teach myself how to debug Go programs [1] using delve.

It turns out that aerc was blocking on a call to io.Copy, and a quick search taught me that (Unix) pipes don't have an infinite buffer (duh!), and that if you try to write into a pipe with a full buffer, you'll block until something reads data from the other end.

Since aerc's :pipe is structured such that one goroutine writes data into the pipe and then signals another goroutine to start reading, the idea of the fix is trivial: (1) figure out what the pipe buffer limit is, and (2) make sure that you don't write more than that.

How much is too much?

The first point is already not that straightforward: as with most low-level APIs, the answer is that it depends :-) Fortunately, I found a great answer on Stack Exchange.

I naively hoped that Go would expose an API to find out the exact limit, so that we can KNOW and not guess, but it's unfortunately not the case... So I convinced myself that I would not be too wrong in working with a limit of 65535 bytes [2], wrote some code, and could quickly start testing.

All was good when piping a single message, but when piping several, I noticed that the warning I added to notify the user that data was truncated due to pipe limitations did not always show.

I tried a few changes, but I was going nowhere, and got frustrated by the fact that to test every change, I had to compile, open aerc, load my mailbox, select N messages and :pipe -s less them... and see that the problem was still there.

Interlude on testing

aerc has a lot of unit tests, but only a few for UI aspects. I decided to bite the bullet and write some for the :pipe command. As expected, it was not fun, but it proved crucial for the rest of this adventure and my sanity.

So yes, some tests are a pain to write, but don't chicken out, and write them. It will pay (lots of) dividends down the line.

Counting (bytes) when someone is lying

So now I had four things: a test demonstrating that indeed, there was a bug with "Mbox mode", my two eyes (to cry :-)), and my mouth (to curse).

After (too much) trial and error, a few runs to think about something else, and lots of looking at the code trying to understand why it'd spit things like invalid write result or short write, I asked myself the question that I should have asked from the start [3]: "what if something is lying?".

From there things got easy: I did not look only at aerc code, but also at that of its dependencies, in particular go-mbox. I quickly spotted suspicious code, and an old ticket reporting a pretty similar issue.

After submitting a fix for that issue, verifying that everything [4] is fine with that fix, and adapting the aerc code and tests to live until a new version of go-mbox is released and integrated, I was finally able to submit a patch fixing the freeze \o/

Concluding remarks

This investigation took place over 3 days, and while it was definitely not my longest [5], I'll remember it for a few reasons:

  • It got me to dive into lowish level stuff, something I love but had not done in years,
  • It forced me to look into the debugger landscape for Go,
  • In the end, I solved the issue, improving both aerc and go-mbox.

June 3rd, 2025 update

The patch I proposed was eventually discarded, since ~rjarry found a better one that does not involve any truncation. The majority of this post remains valid.


[1] It was literally an investment of a few minutes; the Go ecosystem is rich and well documented.

[2] Spoiler alert: I was naive, again :-)

[3] It's always like that, isn't it?

[4] go-mbox's Writer still lies to its users if/when it replaces "\r\n" with "\n", and that is not as easy to fix; I might or might not try to fix it in the future.

[5] The longest was clearly at Amadeus, when a new version of our home-made service bus would crash within 2 minutes to 1 hour, in Production only. It took more than two months (not full time, thankfully) to find an (old) race condition due to a mishandled double (or was it triple? ;-)) negation.
