202204262054 Prefer duplication over the wrong abstraction

Notes from Dan Abramov's talk The Wet Codebase¹

Start with two modules

  flowchart BT
    a((a))
    b((b))

Abstract

  flowchart BT
    a((a))
    b((b))
    c((c))
    a-->c
    b-->c

Awesome, we’re reusing it and everything. But then something comes along and there’s a slight difference.

  flowchart BT
    a((a))
    b((b))
    c((c+))
    d((d))
    a-->c
    b-->c
    d-->c

But then there’s a bug in c+ that requires a special case for b’s use case.

  flowchart BT
    a((a))
    b((b))
    c((c+*))
    d((d))
    a-->c
    b-->c
    d-->c

And then another, slightly different special case for a.

  flowchart BT
    a((a))
    b((b))
    c((c+**))
    d((d))
    a-->c
    b-->c
    d-->c

So we pull those cases out and parameterize the call sites in a, b, and d.

  flowchart BT
    a_((a_))
    a*((a*))
    b_((b_))
    b*((b*))
    c((c+__))
    d_((d_))
    d*((d*))
    a_-->a*
    a*-->c
    b_-->b*
    b*-->c
    d_-->d*
    d*-->c
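
In code, the shared module at this stage might look something like the sketch below. The order-total domain, the `lines` format, and the flag names (`discount_pct`, `ignore_returns`, `min_total`) are invented for illustration and are not from the talk; only the module names a, b, c, and d come from the diagrams.

```python
def c(lines, *, discount_pct=0, ignore_returns=False, min_total=None):
    """Hypothetical shared helper after several rounds of small changes.

    `lines` is a list of (unit_price_cents, quantity) pairs.
    """
    total = 0
    for price, qty in lines:
        if ignore_returns and qty < 0:  # special case added for b's bug
            continue
        total += price * qty
    # The "+" that was added when d arrived.
    total = total * (100 - discount_pct) // 100
    if min_total is not None:  # special case added for a
        total = max(total, min_total)
    return total


# Each call site now has to know which switches apply to it.
def a(lines):
    return c(lines, min_total=0)


def b(lines):
    return c(lines, ignore_returns=True)


def d(lines):
    return c(lines, discount_pct=10)
```

Every flag is harmless on its own, but a reader of `c` now has to hold all three callers in mind to know which combinations can actually occur.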

Later, after a stream of small fixes, added logging, and minor changes, you end up somewhere like this.

  flowchart BT
    a_((a_))
    a*((a*))
    b_((b_))
    b*((b*))
    c((c+__))
    d_((d_))
    d*((d*))
    x#((x#))
    x_((x_))
    y#((y#))
    a_-->c
    b_-->b*
    b*-->c
    d_-->d*
    d*-->c
    y#-->c
    x#-->y#
    x#-->x_
    x_-->d_
    c-->b_

But the key piece is that each of these individual steps made sense at the time. You just can’t really see the whole picture and understand what the original intention was.

Let’s go back in time to the point where we added d and updated c to c+. What we should have done is duplicate the abstraction: re-inline those pieces in a and b, and build just the needed thing in d. Something like

  flowchart BT
    a((ac))
    b((bc))
    d((d+))
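
A minimal sketch of that re-inlined state, assuming a hypothetical order-total example (the `lines` format and the discount are illustrative, not from the talk):

```python
def a(lines):
    # "ac": a with its own copy of the formerly shared logic.
    return sum(price * qty for price, qty in lines)


def b(lines):
    # "bc": identical to a today, but free to diverge tomorrow.
    return sum(price * qty for price, qty in lines)


def d(lines, discount_pct=10):
    # "d+": only d knows about discounts (an assumed requirement).
    subtotal = sum(price * qty for price, qty in lines)
    return subtotal * (100 - discount_pct) // 100
```

Here a and b are deliberately duplicated, so a fix that only b needs can land in b without anyone checking what it does to a or d.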

Then the bug evolution might look like this

  flowchart BT
    a((a%))
    b((bc*))
    d((d*))

And we don’t spend the time evolving these things in lock-step. And maybe later you end up pulling out something different than you originally thought after they stabilize.

  flowchart BT
    a((a%))
    b((bc))
    d((d))
    *((*))
    b-->*
    d-->*
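
As a rough illustration of that later extraction, suppose b and d both ended up needing to skip returned items while a went its own way. The helper name `drop_returns` and the order-total domain are assumptions made for this sketch, not something from the talk.

```python
def drop_returns(lines):
    """The '*' node: a small helper that b and d turned out to share."""
    return [(price, qty) for price, qty in lines if qty >= 0]


def a(lines):
    # "a%": evolved on its own; it clamps negative totals to zero.
    return max(sum(price * qty for price, qty in lines), 0)


def b(lines):
    return sum(price * qty for price, qty in drop_returns(lines))


def d(lines, discount_pct=10):
    subtotal = sum(price * qty for price, qty in drop_returns(lines))
    return subtotal * (100 - discount_pct) // 100
```

The shared piece is smaller and differently shaped than the original c, which would have been hard to guess before the three functions had settled.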

So, when we educate people, we should talk about both the benefits and the costs of things like abstractions. Don’t teach the next generation “don’t repeat yourself”; teach them “it depends”.

Benefits of Abstraction
- Focusing on intent
- Code reuse
- Avoiding some types of bugs

Costs of Abstraction
- Accidental coupling (could accidentally cause bugs, need to evolve in lock-step)
- Extra indirection and complexity (instead of spaghetti code, we create lasagna code, where there are too many layers for us to understand what’s going on)
- Inertia (the difficulty, ability, willingness, and perception around changing all this tangled code; the cost of unwinding an abstraction is much higher than just going along with it)

Easy-to-replace systems tend to get replaced with hard-to-replace systems. The second law of thermodynamics in action in our system designs. (Malte Ubl on Twitter)

Avoiding the mess

- Test concrete code. If you only test the abstraction, you can’t safely re-inline it: those tests fail once the abstraction is gone, so you can’t tell whether the inlining worked. Test the concrete code at least.
- Delay adding layers. Restrain yourself from asking for abstractions and DRYing things out in PR reviews. Just because two pieces of code look structurally similar doesn’t mean they are the same thing.
- Be ready to inline things! The team should have a culture of deleting abstractions and decreasing complexity. There’s a technical component too: bad abstractions create dependencies that make them hard to remove.
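
The first point can be sketched as follows. The functions and numbers are hypothetical; the idea is only that the tests call the concrete entry points a and b, never the shared helper directly.

```python
def shared_total(lines):
    """The abstraction: an implementation detail the tests never call."""
    return sum(price * qty for price, qty in lines)


def a(lines):
    return shared_total(lines)


def b(lines):
    return shared_total(lines)


# Tests go through the concrete functions, so they keep passing
# unchanged if shared_total is later inlined into a and b and deleted.
def test_a_totals_lines():
    assert a([(1000, 2), (500, 1)]) == 2500


def test_b_totals_lines():
    assert b([(250, 4)]) == 1000
```

If the only tests were for `shared_total`, deleting it would delete the tests too, and there would be nothing left to confirm that a and b still behave the same.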

Duplication is far cheaper than the wrong abstraction.²

Prefer duplication over the wrong abstraction.²

Existing code exerts a powerful influence. Its very presence argues that it is both correct and necessary.³

You most often arrive at a codebase after it's been running, working, and changing for a while. You're likely to be subconsciously pushed by this power of existence towards working within a messy abstraction and adding to the mess, instead of breaking it down and changing things.

¹ Abramov, D. (2019, July 12). The Wet Codebase. https://www.deconstructconf.com/2019/dan-abramov-the-wet-codebase
² Confreaks. (2014, May 21). RailsConf 2014 - All the Little Things by Sandi Metz. https://www.youtube.com/watch?v=8bZh5LMaSmE
³ Metz, S. (2016, January 20). The Wrong Abstraction. Sandi Metz. https://sandimetz.com/blog/2016/1/20/the-wrong-abstraction