This is a transcript of episode 361 of the Troubleshooting Agile podcast with Jeffrey Fredrick and Douglas Squirrel.

What do you do when you inherit a 20 year old codebase?

Listen to the episode on SoundCloud or Apple Podcasts.

10 Tips for Software Archeologists

Listen to this section at 00:14

Squirrel: Welcome back to Troubleshooting Agile. Hi there, Jeffrey.

Jeffrey: Hi, Squirrel.

Squirrel: So we have a letter in the mailbag! And boy, is it a long and detailed and interesting letter, and it’s all about software archeology! Which is exactly what we were talking about last week.

Jeffrey: Yeah!

Squirrel: So, who wrote us, and what did he or she have to say?

Jeffrey: Well, this is a listener named Brian, and people will be able to learn a bit more about Brian because he also gave us a LinkedIn link, which we’ll put in the show notes. But he told this great story, and I want to start with this story. I think that our talk brought back memories for him, because he said it resonated with his last five years of work experience. He talked about when he joined a company. We were talking about legacy code, and how only valuable things survive to become legacy.

Jeffrey: Well, he talked about the scenario where he had joined a company that had only had a single solo co-founder developer for almost 20 years. So he came in and inherited a 20 year old code base that had been a single person’s creation. What he says fits perfectly with our thesis, ‘the code survived because it was useful to customers.’ Also, ‘it survived in spite of its quality, clarity, and stability.’ So very low on quality, clarity, stability, very high on usefulness. And that’s the kind of stuff that becomes legacy. If you’re going to choose one or the other, choose ‘useful to customers’.

Jeffrey: So he was in there, and it’s just what you’d expect from someone working alone for 20 years. There’s no need to have documentation. And also, if you think about what things were like 20 years ago… Very different standards, very different attitudes about testing, in particular, no tests. Any documentation was outdated and, um, no historical records, you know, no git history, if you have no git so…

Squirrel: Hadn’t been invented yet!

Jeffrey: Exactly.

Squirrel: Linus Torvalds hadn’t made it up yet.

Jeffrey: So what you have is: the code as written. And so this is real archeology, right? You have working software, and you have the artifact, and that’s it. So what do you do?

Squirrel: Very similar to what real archeologists do. They dig up something, or they found a mechanism on the bottom of the ocean, and they say, ‘we think this has something to do with astronomy, but it’s some complicated machine.’ We’ll put a link to it in the show notes, I think it’s the Antikythera. I’m sure I’m saying its name wrong, and nobody knows what this thing is for, but it’s great archeology because it tells you that people in those times had the capacity to actually do something like what we do with computers, but with machines. And similarly, our friend Brian here, is looking at artifacts that he digs up and says, ‘this does something. I don’t know what, but I do have it in front of me here, and I wonder what I can do with it.’ So what did he do with it?

Jeffrey: Well, the story has a happy ending. He spent his first year getting things in order and then was able to grow the team and be productive, and it all got to a good space. Even better, he wrote up a set of tips and this is what he has in his LinkedIn article (link in the show notes), and he talks about 11 different tips for when you’re going to go into software archeology. I thought we might each highlight some of the ones that jumped out to us.

Let Them Blow Off Steam

Listen to this section at 03:53

Jeffrey: I want to highlight number 11, because it fits with an article, one of my favorite articles, which I’ve often brought up in the context of legacy code from many years ago. So Brian’s version of it, he says, ‘your developers are going to need to blow off steam, and you should let them’. And the equivalent was a post, again link in the show notes, from J.B. Rainsberger from very long ago, where he wrote an article that said, ‘ask why but never answer.’

Jeffrey: He wrote that in 2010 and this always stuck with me because what he describes is, ‘you’re going to be in the code, you’re going to be digging around, and you’re going to find stuff and just be like, why? Why do they do this? What were they thinking?’ Brian gives the same kind of input when he says that one of his developers said that it sometimes feels more like janitorial work than software engineering. So, J.B.’s version of it, he says, ‘we permit each other to ask why, and even to do so in a loud, obnoxious, and pained tone of voice. We simply don’t answer the question. And if someone tries, we remind them, we don’t go there,’ because the idea is they don’t want the attempt to answer why to descend into wallowing or blame or anything else. We can acknowledge this is a bad situation, and now we need to figure out what we’re going to do to get out of it. That’s something I have found very useful. So that was my top one of Brian’s tips, anything stand out to you?

Just Submit the Dang PR & Don’t Assume the Code Makes Sense “Today”

Listen to this section at 05:33

Squirrel: Well there’s one that kind of has an opposite effect, so it gives us a good contrast. You’re on the side there of just work with it. He has another one: ‘just submit the dang PR’. You know, just go ahead and write the code and make the change. And don’t worry about how old it is. And he’s right. There are some really good and interesting examples, I’ve got one in mind, that match up with another of his points, which is ‘don’t assume the code makes sense “today”’. Because how something worked before doesn’t mean that it will work the same.

Squirrel: You know, we have this idea of drift in large language models and machine learning models, that something can have a meaning that it didn’t have before. Here in Britain, for example, the word ‘trump’ has long been a synonym for a particular bodily function… It has a different meaning today! So when people refer to that, they typically mean the President of the United States about to take office as we record this, not the bodily function that it has meant for a long time in Britain. That’s okay, the word, the meaning of the word changed. But guess what? That means your models that might be assuming certain types of meanings in words might not be relevant! Not because you changed any of the code, but because the world changed around you.

Squirrel: And a perfect example of that is another article I found, we got lots of links today, where someone determined that there was a reason that lots of hobbyists who were building old Macintosh computers, or repairing them, or working with them, a lot of them were having their Macintoshes catch fire. And the person said, ‘boy, I really wonder why, when they plug them in, the smoke comes out, because that never happened with the actual Macintosh of this model.’ And he investigated and looked very carefully at it and did a whole bunch of archeology on the physical machine. What he found out was, the folks who put the Mac together specified it wrong, and they put a capacitor in backwards—

Jeffrey: Oh!

Squirrel: —And when you put a capacitor in backwards, it gets loaded the wrong way, and bad things happen that involve fire.

Squirrel: Then he asked the question, ‘why is this happening to us today? What has drifted? Because I’m putting the capacitor in the same place according to the specification, just like the old board that I’ve got here. And when I put the modern one in, it catches fire. And the old one from the 1990s does not.’ And the answer is capacitors were different! They were manufactured slightly differently, I’m no electrical engineer, so the details escape me, but the point is that there was drift! So there was a specification, there was code essentially, that said ‘here’s how you build a Macintosh’ and the way that it was implemented, the way that it worked, had drifted, had changed.

Squirrel: So that can be an important source of bugs and problems in your software archeology. Don’t assume that what you dig up today is going to have the same meaning to us today in our world as it did many, many years ago. I’ve got clients who have 40 or 45 year old code that’s still working today. It doesn’t work the same way anymore! Often you have to do a lot to update it, make it work, to adjust it to the modern world. But when you do, boy, does it give you a lot of value!

Don’t Assume the Code Made Sense in a Different Context

Listen to this section at 08:57

Jeffrey: That’s a great one. And I’ll just say that it’s a nice contrast, again, with what I think will probably be the last one I’d want to highlight. People can read Brian’s full set, but the last one I would highlight is— you were just saying don’t assume the code makes sense today —and this was: ‘don’t assume the code made logical sense in a different context’, which is to say, just because you have the software working as a whole doesn’t mean every bit of that code actually makes sense, because people make mistakes and they are fallible.

Jeffrey: They would find items where maybe the original design wasn’t good, or maybe they’re struggling to understand some logic of code that turned out to be dead, and it was never actually called, or it was copy-pasted from another place, and didn’t actually work the way the author thought it did. So don’t assume that because it’s there that it actually makes sense. You need to check, to know that it’s a living, functioning piece of code. It might be doing something and working, but not in the way that the author intended, so you can be misled by variable names, by method names. And I’m saying this part not from what Brian said, but from my own experience of dealing with legacy code, because how often have I been with someone, and they say, ‘well, let’s copy and paste this, and we’ll just make this quick change, and we’ll fix it later.’ Well…

Squirrel: Later never comes.

Jeffrey: Maybe later came for 90% of those examples! But, over 20 years, that 10% ends up as quite a lot of junk DNA that you have littering the genome of your software. So there is an element here of needing to not overly trust what’s there, which is why a lot of this is about trying to get in touch with what the code is actually doing today. There’s a limit to how much sense you can make just from examining it.
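
One common way to get in touch with what inherited code is actually doing today is a characterization test: record what the code returns right now for a set of inputs, then compare against that recording after every change. The Python sketch below is only an illustration of that idea, not something from Brian's article or from the episode. The function `legacy_price`, its odd discount, and the helpers `record_behavior` and `diff_behavior` are all hypothetical names invented for the example.

```python
import json


def legacy_price(quantity, unit_price):
    # Hypothetical stand-in for an inherited function. The name suggests a
    # plain multiplication, but the body does something slightly different.
    total = quantity * unit_price
    if quantity > 10:
        total = total - 5  # undocumented flat discount; nobody remembers why
    return total


def record_behavior(func, cases):
    """Call func on each tuple of arguments and record what it returns."""
    return {json.dumps(args): func(*args) for args in cases}


def diff_behavior(recorded, func, cases):
    """Return the inputs whose current output differs from the recording,
    mapped to (recorded_output, current_output)."""
    current = record_behavior(func, cases)
    return {
        key: (recorded[key], current[key])
        for key in recorded
        if recorded[key] != current[key]
    }


# Hypothetical inputs chosen to straddle the suspicious quantity threshold.
CASES = [(1, 10), (10, 10), (11, 10), (0, 10)]

# Pin down today's behavior before touching anything.
golden = record_behavior(legacy_price, CASES)
```

If a later cleanup replaces the body with a plain `quantity * unit_price`, `diff_behavior(golden, new_func, CASES)` would flag the `(11, 10)` case, which tells you the code was doing something the method name never mentioned. Whether that behavior is a bug or something customers depend on is still a question for a person to answer.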

Squirrel: Fantastic, okay. That gives our listeners lots of things to be trying out and to be improving, most notably not ignoring the context of the code that they’re investigating, not ignoring or discounting the difficulties that might come about when you’re using software that is 20, 30, 40 years old. It may drift. It may not work the way you expect. If you’re not in technology, if you’re not writing code, spare a thought for those of us who are, because just like you may have to deal with ancient documentation for customer service or complex CRM systems and other things like that, we have our own complexities that make it very difficult to work with. And do what you can to help those in engineering understand how things used to be, because they’re dealing with a painful version of archeology, without a lot of artifacts, without a lot of explanation. So that’s our advice to listeners.

Squirrel: Just like Brian, we’d sure like to get a letter from you, perhaps one that disagrees with us. And you say, ‘wait a minute, legacy code should be thrown away. That’s what worked for me! Why didn’t Brian do that?’ We’d really be interested in a contrary view. Or questions about how you could apply this, what software archeology means for you, how you can use maybe some of Brian’s tips that we didn’t get to in this particular episode. All of those are things that we like to hear, and the way to get in touch with us, just like Brian did, is to go to agileconversations.com. The other way to keep in touch with us is to come back again next week when there’ll be another episode of Troubleshooting Agile. Jeffrey, I’ll see you then.

Jeffrey: All right. Thanks, Squirrel.