Lies, predictions, experiments, and bets - Conversational Transformation

This is a transcript of episode 248 of the Troubleshooting Agile podcast with Jeffrey Fredrick and Douglas Squirrel.

Squirrel and Jeffrey discuss various ways to make statements about the future (“this feature will take a week” or “users will buy 20% more if we cut prices in half”) and disagree about exactly how valuable they are and how to use them.

Show links:

Listen to the episode on SoundCloud or Apple Podcasts.

Introduction

Listen to this section at 00:11

Squirrel: Welcome back to Troubleshooting Agile. Hi there, Jeffrey.

Jeffrey: Hi, Squirrel. I was just sharing with you an interesting conversation I’d been having about the difference between experiments and bets, and how those relate to predictions. As we were discussing that, you mentioned that you recently had a discussion about how estimates are lies, and I thought “this is perfect. Now we have four things to discuss! We have bets, experiments, lies and predictions, and the differences between them.” What I like about this is that in all cases here, we’re really talking about uncertainty and embracing uncertainty to a certain extent. The question of whether estimates are helpful or harmful often comes down to how people deal with or fail to deal with the uncertainty inherent in them.

Squirrel: Well, I know how to deal with it, Jeffrey. It’s very simple. What we’re going to do is just remove all uncertainty, because uncertainty is bad all the time. So let’s get rid of all uncertainty and then we won’t be uncertain about anything.

Jeffrey: Right. And we can do that by just spending more time on the estimates, maybe if we’re more careful.

Squirrel: We have three backlog grooming sessions instead of just two. Then we have accurate predictions. Why don’t we just do that?

Jeffrey: Exactly. But neither you nor I believe that that’s the right way to go.

Squirrel: Aw, you’re unmasking me.

Predictions are Stories

Listen to this section at 01:40

Jeffrey: I think we’ve covered that before in our podcast episode on the Tilted Slider. You and I both take it as a truth that there’s some inherent uncertainty in what we’re doing, different types of uncertainty. In that conversation of mine I was talking to someone and basically saying that when I talk about planning with people, I always try to put the language in terms of predictions, because when we make plans we’re actually making predictions about what we expect to happen. I like to make this distinction for similar reasons to what you’re describing with estimates, which is that a lot of people say, “well, we’ve made the plan, so naturally everything is going to happen the way we’ve said.”

Squirrel: Nothing could be farther from the truth. Remember that everything that’s written on this paper is a lie. None of it has happened yet. Some of it might become true if we’re really lucky, but we should expect that most of these things, almost all of them, are not going to come true.

Jeffrey: Yes! That’s why I bring the language of predictions to make it clear that even the things that we’re most certain about, that just means we have a high degree of confidence in our prediction. It does not mean that the prediction is truth. And here I’m influenced in this discussion about predictions by my exposure to the rationalist community and the work of Philip Tetlock. He talks about superforecasting and the use of betting markets, the idea that when we make predictions we want to have a level of confidence that we express. So it’s not just that I make a prediction something’s going to happen, but I also have a level of confidence and that allows me to establish a track record over time of how well calibrated I am. If I estimate a 70% chance something is going to happen and I’m 80% confident, does that thing happen seven out of ten times? Do I need to update the chance or my confidence?

Squirrel: The appropriate level of confidence is unrelated to the odds it will happen, because it might happen one out of ten times and you could be really confident that that’s what it’s going to be.

Jeffrey: That’s right. This is a really useful framework for me, because very often when people are doing planning and making predictions, they tend to lose sight of their own level of confidence. When they predict a thing is more than 50% likely to happen, they write it down as though it’s a certainty. To remind ourselves that we’re dealing with uncertainty, I use this question of “what’s your level of confidence” a lot, and I also use it with people I coach. I was recently talking to a product manager working with multiple development teams, and I asked them, when they got a prediction like “Oh, this will be done in two weeks,” to ask: “What’s your level of confidence? How certain are you you’ll be done in two weeks? Are you 95% certain? Are you 50%?” That led to very productive conversations with the development teams to make the distinction between something that was like 55% versus “no, no, 95%, we really feel like we have it under control.” That’s the kind of thing that came out with this prediction framework.
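
Editor’s sketch: the track record Jeffrey describes can be kept with very little tooling. The Python below is only an illustration and does not come from the episode. The function name `calibration_table`, the data layout, and the sample history are all assumptions. It groups past predictions by the confidence that was stated and reports how often those predictions actually came true, so a “70%” forecaster can see whether roughly seven out of ten of those calls worked out.

```python
from collections import defaultdict


def calibration_table(predictions):
    """Compare stated confidence with what actually happened.

    `predictions` is a list of (stated_confidence, came_true) pairs, where
    stated_confidence is a probability such as 0.7 and came_true is a bool.
    Returns {stated_confidence: (observed_frequency, number_of_predictions)}.
    """
    buckets = defaultdict(list)
    for confidence, came_true in predictions:
        buckets[confidence].append(came_true)
    return {
        confidence: (sum(outcomes) / len(outcomes), len(outcomes))
        for confidence, outcomes in sorted(buckets.items())
    }


# Hypothetical track record (made-up data): ten predictions stated at 70%,
# of which five came true, and four stated at 95%, all of which came true.
history = [(0.7, True)] * 5 + [(0.7, False)] * 5 + [(0.95, True)] * 4

for stated, (observed, count) in calibration_table(history).items():
    print(f"stated {stated:.0%}, observed {observed:.0%} over {count} predictions")
```

In this made-up history the “70%” predictions came true only half the time, which is the kind of gap that would prompt someone to lower their stated confidence. With so few data points per bucket, the observed frequencies are themselves very uncertain.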

Why Daydream?

Listen to this section at 05:09

Squirrel: See, I have a simpler but more radical view: why don’t we just not estimate? Why don’t we just stop pretending that these things are true? Because I find so often with my clients that they set themselves at the wrong place on the tilted slider. They’re too close to predictability when they need to be closer to productivity. What I’d prefer to see is at the top of every Jira planning board or whatever tool people are using, the word “lies” should appear, just so everybody’s always reminded subliminally: everything on this page is false. None of this has happened. We shouldn’t plan on any of this occurring. We might have various levels of confidence, but any large number we might happen to put down…do people use the Fibonacci method so they can try to separate their large estimates? The small estimates are one, two, three, five and then the large ones are eight and 13? I don’t know if it goes up to 21, but any of those large ones have almost no value. They should have like a red “lie” right on top of them, because by definition you’re going to have a very low confidence level, or you should. If not, then you’re not very well calibrated. So I would just like to label these things, make sure that we remember that all of them are false.

Jeffrey: I think you and I are going for a very similar thing here between the lies and predictions. Lies being a bit more attention catching, so maybe there’s a good reason to use that language. One thing, though: I still find value in the process of estimating, not the estimates themselves. “Planning is essential, plans are useless.” When people say, “oh, we just shouldn’t do estimates” I kind of push back and say, “Well, I think there’s value in it,” if the goal is to get aligned with other people, not necessarily on what we think will happen, but rather on “are we seeing the tasks the same way?” So it’s a way of surfacing differences of opinion.

Squirrel: I know exceptions, clients of mine who work in biotech and other more highly planned, more constrained environments where some of that is valuable. But otherwise I don’t see why we don’t just do the alignment without doing the estimation. If the estimation isn’t that valuable, let’s skip it.

Jeffrey: But I’m saying it’s the estimation that is valuable, not the estimates. I don’t know a better way. How would you prompt people to gain alignment on what they think is going to happen? I’ll give you an example, which is from the old, old days when I used to be in a team that used cards.

Squirrel: I remember cards. They’re so great. Please use cards. Don’t use Jira.

Jeffrey: Well, we had cards and we would do kind of a planning poker thing. When we would go over a card, we’d end up writing what we thought the size was. We were using small, medium and large, but we’d have people essentially bid what they thought it was. What was valuable was that in the group, when people would have different results, when someone would say it’s small and someone else would say it’s large, you’d ask “Why is that? Why do you think that it’s small? Why do you think it’s large?” You explain yourself. If you both say it’s “two”, then we probably don’t have a conversation.

Some Productive Disagreement

Listen to this section at 09:01

Squirrel: But that’s unfortunate, because your “two” might mean something different from my “two,” and we don’t get a chance to explore that disagreement. See, I’d prefer that we actually write down very small steps for the tasks, far enough out that we’re clear on delivering some significant value at the end of those steps, but each one valuable in itself, and then we can see what those are. So if you have ten of them and I have two of them, we get the same value: you notice that you think it’s got a lot more to it than I do, or that I have a shorter route to value. And if I write down two and you write down two, and they’re very different, we can see that rather than just having it hidden by the number two. I just don’t find the numbers very helpful.

Jeffrey: Good point. I would also write things out, but at a later stage. I consider that the breakdown process. So when it comes time to actually do the work, then I would expect people to do what you’re describing, to have a discussion about “what are the steps involved here” and that kind of breakdown process.

Squirrel: But why not do the alignment sooner? Why wait to find out that you’re misaligned and you think there are two database changes to do and I think there are two user interface changes to do?

Jeffrey: I think this question is the trade-off between predictability and productivity. I want to surface the bigger disagreements earlier and the finer disagreements later, as we get closer to doing the work. So if we have, say, 50 cards that we’re going to end up doing, there’s a question here of how we’re going to order them, what way we want to tackle them. Part of that will be questions of risk, which we uncover through that process of “what are these things and how differently do we feel about them, where do we think the uncertainty is?” That comes out in that kind of estimating process and that helps us order the work. I wouldn’t do the breakdown for all the steps ahead of time, I find it introduces a lot of waste because we end up doing breakdowns for something that we may end up actually never doing.

Squirrel: Sure. I think of a breakdown as a very brief activity that doesn’t go very far. So you’re not going to break down the entire new log-in experience. But you might say, “add the new authorization code,” “add the new password validation,” then something else, enough to give you a sketch or an understanding of what the feature is.

Jeffrey: I have a feeling that we’re each describing a sort of progressive discovery that happens throughout the process.

Squirrel: Yes, certainly that. I’m emphasizing the productivity end. I think you’re erring a little more on the predictability end; neither is bad.

Jeffrey: I’m curious how that would come out in practice. Certainly the people who work with me would put me more on the productivity side.

Squirrel: We’re way up the slider compared to most people, but you’re a little further down than me.

The Other Half of the Quartet

Listen to this section at 12:13

Jeffrey: That’s possible. So we’re saying here even in the planning stage we have the understanding estimates are lies, and we should understand we’re making predictions, not certainties. But that was only two of our four, let’s make this distinction about experiments and bets. In the conversation I was having, I was saying that I’ve often tried to introduce the language of experiments. “We’re going to go do this thing and we’re making a prediction of what the outcome will be.” “We expect this to drop the load on the database,” or “we expect our conversion rate to go up,” or “we are going to send this survey and we expect people to answer.” But whatever we’re doing, it may not work out. There’s just limits to what we know. This was covered very well in the Art of Action book, where they talked about the gaps between knowledge of what we would like to know versus what we actually know, between the predicted effect and the actual effect, and so on. So our actions in that way are always experiments. There’s a funny distinction here between that and bets, and the person I was talking with was referencing Annie Duke in her book Thinking in Bets, and the idea that there’s things that we’re going to try and it might be a good bet, but it might not work out. We had this kind of strange unresolved tension in the moment between the distinction between experiments and a bet. Do you have an immediate response to that, how you would look at these things differently? Because they’re both just language describing that there’s unpredictable or unknown results of your actions.

Squirrel: Sure. I don’t think it would be too surprising to hear that I’m interested in both. I guess I’d say that a bet involves a very high level, raw, undeveloped prediction, and an experiment involves a more nuanced, processed, better justified prediction: a hypothesis. That’s the distinction I would see between them. And they’re both lies. So I’m perfectly comfortable saying that they’re both completely false until you actually run the experiment, then you’ve proven something. Then you know something. But until then, you should remind yourself that you are taking a bet. You are running an experiment because you don’t know. I think that’s what we far too often forget when we start to believe what’s written on the paper and not take it with the appropriate grain of salt. I think they’re both very useful. I’d say that if you’re more at the productivity end and more experimental and risk friendly in general, then you’d be more likely to take a bet by saying, “Hey, I wonder if people would like to buy this at half price. I’ll try slashing the price to 50% and see what happens,” rather than someone who might be more disciplined about it, more predictable, who might say, “Well, we’ve done some user studies, and that suggests price sensitivity means we can gain a 20% uplift in conversion rate if the headline price is 50% lower.” They’re both going to do the same thing. One has just thought about it a lot more than the other.

Jeffrey: What’s interesting when you describe that is, one of the objections I’ve heard from a longtime coworker is we often talk about experiments, but we really haven’t done the work to make it a good experiment. What he would say is that we’re more on the bet side of things. I like the word experiments, and I think I’m very influenced here by Toyota Kata and the idea of you’re doing this PDCA cycle and I’m-

Squirrel: “Plan, decide…”

Jeffrey: “Plan, do, check, act.”

Squirrel: Right.

Jeffrey: That’s described as an application of the scientific method, where as you say, you have a hypothesis, you have a result you’re trying to get, and you’re doing this PDCA cycle trying to surface the knowledge, the learning that would let you unlock the problem and get the result you want. I like that experiment language in particular because it has the end of the cycle, where you’ve done the thing and you look back to say, “Well, what happened versus what we expected? What did we learn?” Really this is the key thing. What did we learn from doing this? For me, the language of bets doesn’t inherently have that. It’s sort of like, “well, I can put the money on that horse. If I win, great. If I don’t? Oh, well,” and I move on.

Not Very Learning Loop

Listen to this section at 17:24

Squirrel: It doesn’t tell you how to find the next horse.

Jeffrey: Yeah. There’s nothing inherently that says, “Hey, you should look at your pattern of betting and decide if it’s actually a good one.” Now, I know from hearing Annie Duke in an interview that was not what she means. She would say, “no, you definitely should be analyzing what you’re doing and be learning from it and getting better. That’s a key part.” But the natural everyday language of bets doesn’t seem to have that. So that’s why in my coaching I recommend people adopt the language of experiments, because with an experiment you could do it, and then you go look at the result deliberately trying to learn from it.

Squirrel: Really good poker players like Annie do in fact study very carefully what they’re doing, and they’re very intentional about what they’re betting, contrary to movies and things. They don’t just say, “Oh, I think I’ll bluff and suddenly win a zillion dollars.” They study probability tables very closely and analyze their thinking and try to improve. That’s what her book is all about, if I understand it right.

Jeffrey: Exactly. Now we’ve talked about this, and still I want to go further at some point, not in this episode, but in life I’m going to continue to consider the difference between bets and experiments, because there is a part of me that has been thinking about how to bring these kinds of things more into the development lifecycle, and I’ve often thought it’d be really interesting to have a software development tool or platform that included betting markets. The goal here is to try to bring information out from the team. So you have the hypothesis of the product and you have the estimates of how long things are going to take. You have hoped-for milestones or delivery, and you let the team bet on them as a way of surfacing what they believe the results are going to be. For me it would be very interesting to have that, and I can imagine built-in calibration that would happen. I’ll give an example. People probably are not familiar with betting markets, but I’ll give you a link to one in the show notes. It allows you to really bet on all kinds of different things. And the idea of a betting platform like this is it allows-

Squirrel: -you to bet on whether the rocket ship will go to the moon or whether the other one would hit the asteroid and things like that.

Jeffrey: Right. And they’re live, and the advantage here is they’re changing, because it’s a market, meaning I may have made one bet at one time, but I could change later. So as the project goes along, my level of confidence might change and I could be changing my bets in the market. That would change the signal being sent to the person looking at the overall project. They would have a good sense of the team’s confidence by what was happening with that betting market. So I’m very fascinated by this idea of lightweight, cheap ways to surface the beliefs and knowledge that are in the team. That’s why I like the estimates we were talking about earlier and the breakdowns in your approach. “Write down the number of steps. Oh, you have three steps. I have 50. We clearly are thinking different things. Let’s compare them.” What are the ways that we can surface the differences of viewpoints within the team? I think a betting market might be a fun way to bring that into the development process. So just sharing that idea that’s been sitting in the back of my head for a while.

Squirrel: Excellent. Well, the back of your head is a very interesting place. Thanks, Jeffrey.

Jeffrey: Thanks, Squirrel.