by Mike Kalvas

The Only Correct Way to Organize Code

A Joking Rant About Keeping Code Neat

Why we need a system for code organization and what that system should look like.

#blog #tech

Top Gun buzz the tower gif

Highway to the Danger Zone

Though the pattern is full and it will definitely get me in trouble, I'm going to metaphorically buzz the tower by putting some pretty strong opinions out there today about file naming and code organization. So buckle up Goose, and don't hit that eject button.

First off, file naming. The correct answer here is kebab-aka-hyphen-case, which is all lowercase and hyphen delimited. No exceptions, and no, your community standards aren't a good enough reason to change this one, as we'll see in a minute. Looking at you, ComponentName.jsx 👀.

Second, code organization. The only correct answer is folder-by-feature as opposed to folder-by-random-devs-logic or folder-by-tech. We'll take an organic trip through all three of these systems and see where we end up.

Either of these alone is enough to spark a holy war among devs, but unlike tabs vs spaces, I think there are good enough objective reasons to pick one over the others. Let's take a look.

File Naming

I don't want to spend too much time on this topic because it's much less important to a team of developers than having a consistent system for creating, finding, and maintaining the actual functionality of your product through its source code. Still, you can see how it's in the same vein and deserves an honorable mention.

There are two key reasons that I think a symbol-delimited file naming structure is optimal. There are plenty more reasons I have for this one, some subjective and some objective, but I won't get into them here.

Firstly, acronyms and words. Symbol-delimited names are the only way to universally disambiguate words. There's a reason that CSV stands for Comma-Separated Values and not Capital-letter-Separated Values. It only takes one example to see this.

What do we do when we want to name something like "URL Parser"? Our options are urlParser, uRLParser, URLParser, and UrlParser. The option urlParser isn't so bad, but we also need to solve for when the acronym isn't the first word in the name, like "parse URL": parseUrl or parseURL. It loses even more appeal if the acronym is in the middle: parseUrlParams and parseURLParams. And (though the example is getting contrived now, I've seen similar and worse in real-world code) the final nail in the coffin is "a URL parser": aURLParser, AURLParser, AUrlParser, or aUrlParser.

Ok, well we could just make a bunch of rules about acronyms and words and things (which I've seen done) for people to adhere to, but consider that for every instance like this, some dev is wasting time doing one or many of the following:

  1. Trying to massage names into a form that isn't ambiguous to avoid dealing with these issues in the first place, often resulting in less than ideal naming.
  2. Just picking one that seems right to them at the time.
  3. Having to remember or worse, spend time looking up, the rules that we decided would be our standard.

Now think about how if we just go with kebab-case, all of this is solved instantly. It is unambiguously clear that we name these url-parser, parse-url, and a-url-parser. We choose where the breaks are, casing is always all lowercase, and we never have to pick anything besides the clearest name possible.
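If you wanted a machine to hold the line on this instead of a code reviewer, the check is about as small as checks get. Here's a minimal sketch. The helper names are made up for illustration, and it assumes plain ASCII names where everything after the first dot is the extension:

```javascript
// Hypothetical helper: true when a file's base name is kebab-case.
// Assumption: everything after the first dot is the extension
// (so "use-login.spec.jsx" is checked as "use-login").
function isKebabCase(fileName) {
  const baseName = fileName.split('.')[0];
  return /^[a-z0-9]+(-[a-z0-9]+)*$/.test(baseName);
}

// Hypothetical helper: build a kebab-case name from words.
// The caller chooses where the word breaks are, so acronyms need no special rule.
function toKebabCase(words) {
  return words.map((word) => word.toLowerCase()).join('-');
}

console.log(toKebabCase(['a', 'URL', 'Parser'])); // a-url-parser
console.log(isKebabCase('url-parser.js')); // true
console.log(isKebabCase('URLParser.js')); // false
```

Notice there's no branch anywhere for acronyms. That's the whole point.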

The second key reason is systems, history, behavior, safety, etc. The choice of a hyphen as the symbol for our naming structure is important because it's one of the best symbols that just works™ cross-platform and cross-tooling. This includes things like what happens when you hit ⌥ + ⌫ while typing the name, whether the symbol causes parsing errors, whether it's ASCII vs UTF-8, and more. Think about how annoying it is to use files that have spaces in their name from the terminal.

Addendum: snake_case is a close runner-up, but I have anecdotally experienced better behavior in this second category with hyphens over underscores. Also, the shockingly minor yet impactful difference of typing - vs ⇧+- really adds up over time.

Alright that's enough on that, let's move on to the crux of this post.

Code Organization Systems

Folder by... Something?

This is the "option" to not have an organizational scheme or standard at all. This usually comes about through accident, because no one ever established a standard, or just based on whims of one or a few developers. Few would argue that this option is scalable or efficient, but this is often (and probably correctly from a quick-prototyping perspective, but that's another post) the way we start new projects.

Imagine a natural situation where I have my project and you have yours. I like to organize my code one way and you like a different way. Neither of us has put much thought into why we do it like we do; we just put files where they make sense at the time.

In isolation, this doesn't seem too bad really, but now let's imagine that we think each other's code is cool and we agree to team up and make a company that uses both of them. At first, we're the only employees and nothing changes. But then, the business starts taking off and we need to hire another developer. She comes on board and has to learn both of our systems, which is inefficient but not terrible, and she manages fine. Turns out, though, that she has a third different opinion of how to organize code. So she makes a new project and uses her opinions to organize things.

You can see how there's obviously an issue when we add five, ten, or a hundred other developers. It just isn't scalable. No one will know where to find the thing they're looking for. More importantly, code quality will suffer because without a well-defined way to understand where all the code related to a feature is, we might not understand what side effects new code or a change in existing code might cause. Think about how difficult it is to remove a feature from a massive spaghetti monolith while being 100% sure you didn't break something or forget to delete some of the feature code somewhere.

If this simple illustration isn't enough for you, I'd recommend thinking about your motivations here. Is it that you think your way is the best? Maybe it is, but if that's the case then you still need to get everyone else to buy into that so it's worth objectively evaluating it against other ideas like the one presented here.

What if it's that you don't think you'll ever need to scale your project? Maybe that's true and that's OK, but I still think that a well organized project will lead to better quality and ease of maintenance. After all "cleanliness is next to godliness".

Moving to Something Better

At this point, no matter how we got here, we're looking to have a more logical system. Ideally, this system would achieve a few objectives:

  1. Anyone can easily find any piece of code they're looking for.
  2. No one ever has to even think about the questions, "where should I put this new file?" or "how do I name it?"
  3. It's easy to understand where all code related to a feature is, which is equivalent to saying we're optimizing for change.

There are lots of systems that satisfy #1 and #2, but I think the most important objective is #3. This makes adding a new feature trivial, enables new features to take full advantage of the features that already exist, and allows refactoring, maintenance, and even removal of features with unprecedented certainty.

Let's continue along the path from folder-by-feelings to something more systematic.

Folder by Tech

If you're like me (and maybe I'm showing my age here), one of the first places you experienced folder-by-tech was the Model, View, Controller (MVC) paradigm. Part of what made this a great choice was the standardization it created around a pattern. Most frameworks standardized it into a folder structure that looks like this:

app
├─ controllers
├─ models
├─ tests
└─ views

You can see that folder-by-tech puts similar types of files together. You could just as easily imagine a styles folder that contains all the .css files for the app. The idea here is that just like you would want to separate the concerns of your application, you would likewise organize your code. Now is a good time to bring up how you slice up your application: vertically or horizontally.

Think of an application like a stack of pancakes. One pancake is a layer or concern. The bottom pancake could be the models, the middle the controllers, and the top the views. Now think about how you eat pancakes. You cut through the layers vertically to get a bite. You can think of this bite as the idea of a feature: to get any feature to work, you need a bit of each layer.

Now we're getting to the heart of what separates folder-by-tech from folder-by-feature. The core idea is that folder-by-tech slices your app horizontally and folder-by-feature slices it vertically.
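To make the horizontal vs. vertical thing concrete, here's a rough sketch that re-slices a folder-by-tech file list into feature folders. It leans on a big simplifying assumption that real projects won't satisfy so neatly: every path looks like "<tech-folder>/<feature>.<ext>". The function names and file names are invented for the example.

```javascript
// Hypothetical: turn "controllers/login.js" into "login/controllers/login.js".
// Assumption: every path has the shape "<tech-folder>/<feature>.<ext>".
function toFeaturePath(techPath) {
  const [techFolder, fileName] = techPath.split('/');
  const feature = fileName.split('.')[0];
  return `${feature}/${techFolder}/${fileName}`;
}

// Group the re-sliced paths by feature: one pancake bite per key.
function groupByFeature(techPaths) {
  const features = {};
  for (const techPath of techPaths) {
    const featurePath = toFeaturePath(techPath);
    const feature = featurePath.split('/')[0];
    if (!features[feature]) features[feature] = [];
    features[feature].push(featurePath);
  }
  return features;
}

console.log(groupByFeature([
  'controllers/login.js',
  'models/login.js',
  'views/profile.js',
]));
```

Same files, same layers. The only thing that changed is which axis you see first when you open the project.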

So why would we want to slice our app vertically vs. horizontally? The answer is in that third point above. We want our app to be easily changeable. We want to be able to follow feedback and experimental data to add, change, and remove features rapidly. We want to have certainty that our changes are limited in scope to the features we're modifying, unless we're making API or interface changes, in which case we want to know that we're causing side effects.

But what about separation of concerns? Isn't the whole point of folder-by-tech to nudge us in the right direction of keeping things like database queries out of our view files? I have a few thoughts on this.

First, I think we've reached a level of maturity in the tech world to know that we need to make good security decisions and follow best practices. There are people who are still learning these things every day, which is totally fine, but we as a group should be past using unparameterized queries for instance. A corollary here is that much of modern architecture uses decoupled clients and backing services anyway, reducing much of the vertical height of any given project to begin with.

Second, are we actually gaining anything by doing this? It's possible but unlikely that the organization of your code could include things like making the import of a database connection unavailable in your view file.

Third, I think the horizontal separation of concerns is a flawed goal. Why would we want to think about all our controllers at once rather than thinking about how a controller works together with the models and services of that feature? I'm sure there are times or reasons that we might want to, but day in and day out on the job, devs are reasoning about the vertical slice, not the horizontal one.

Lastly, I think the idea of having a component or feature of your app be a self-contained, vertical scope of deliverable business value is instrumental to how we operate in today's tech industry. It allows us to follow the data and build products that serve the customer, one incremental slice at a time.

If this brief discussion hasn't already persuaded you, let's see how folder-by-tech stacks up against our agreed upon goals.

  1. Halfway. It makes finding code easier, but not obvious in all cases. You have to know that the feature is split across different folders and you also have to know which of the folders this feature actually uses. Do I even have controllers, models, services, views, helpers, mailers, etc. for this feature?
  2. Yes. Although, I could certainly contrive some situation where I'd have to ask where to put a file (service vs model comes to mind), but really, that's just being unfair. If you're working on a normal app, this system solves that question.
  3. No. It doesn't optimize for change. Knowing where all code related to a feature lives requires looking through many folders, possibly sub-folders or namespaces. I would argue that you can never truly know in this system, especially if your framework has auto-loading or your language isn't statically typed, referenced, and compiled. You could easily have unknown side-effects or orphaned dead code.

We're not doing so hot here meeting only 50% of our goals. Fear not though, for I have a route to the promised land.

Folder by Feature

Before I just throw the final product at you, I'm going to pull a switcheroo and up the stakes by adding another goal to our list of demands for the new system. The fourth goal is this: there are no hidden levels of organization. All features can be seen from the app level and all code for a feature can be seen from the feature level. I only add it here because no other system could possibly compare to this one in this regard, but it's a valuable facet to gain.

What does this one mean in practice and why would we want it? It means that we should never nest folders inside of a feature. We can have folders in the feature, but no folders in those folders. This gives us powerful transparency where the second we open the feature folder, we understand the type of feature we're working with. For instance, if I open a feature folder and only see a folder for services and one for factories, I know at a glance that this feature is different in interesting ways from one that has views.
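The nice thing about a rule this blunt is that checking it is just counting. A minimal sketch, assuming paths are written relative to the app's code root with forward slashes (the function name and the example paths are hypothetical):

```javascript
// Hypothetical check for "no folders inside a feature's folders".
// Allowed shapes, relative to the code root:
//   "index.js"                (1 segment:  root interface)
//   "feature/file"            (2 segments: file directly in a feature)
//   "feature/folder/file"     (3 segments: file in one of the feature's folders)
// Anything deeper is a hidden level.
function hasHiddenLevels(relativePath) {
  return relativePath.split('/').length > 3;
}

console.log(hasHiddenLevels('login/components/form.jsx')); // false
console.log(hasHiddenLevels('login/components/fields/password.jsx')); // true
```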

So here is folder-by-feature broken down into a few simple, unambiguous rules that meet all four of our goals:

  1. Each feature folder has one main entry point that serves as the feature's API or public interface (e.g. index.js). All other features that reference this feature only use the interface, never a direct import or reference to the source code. This reference rule gives us goal #3.
  2. Each feature folder can contain any folders that organize the code for that feature, but these sub-folders cannot have further folder nesting. This gives us goal #4.
  3. The root app code folder also has an interface, but other than that, there are only feature folders at this level. The combination of this rule and the others gives us #1 and #2. The only point where #1 could fall down is if people don't name features well, and if your devs aren't willing to name things reasonably well, no system will help you.
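Rules 1 and 3 can be sketched as a check over a plain list of file paths. This is an illustration under assumptions, not a real tool: the function name is invented, paths are relative to the code root, and the entry point is assumed to be called index.js as in the example from rule 1.

```javascript
// Hypothetical structure check for rules 1 and 3.
// `paths` are file paths relative to the code root, using forward slashes.
function findStructureProblems(paths, entryPoint = 'index.js') {
  const features = new Map(); // feature name -> has an entry point?
  const problems = [];

  for (const path of paths) {
    const segments = path.split('/');
    if (segments.length === 1) {
      // Rule 3: the root holds only its own interface plus feature folders.
      if (path !== entryPoint) problems.push(`stray root file: ${path}`);
      continue;
    }
    const feature = segments[0];
    const isEntry = segments.length === 2 && segments[1] === entryPoint;
    features.set(feature, features.get(feature) || isEntry);
  }

  // Rule 1: every feature folder needs its public interface.
  for (const [feature, hasEntry] of features) {
    if (!hasEntry) problems.push(`missing entry point: ${feature}/${entryPoint}`);
  }
  return problems;
}

console.log(findStructureProblems([
  'index.js',
  'login/index.js',
  'login/utils/use-login.js',
  'profile/components/profile.jsx',
  'helpers.js',
]));
```

An empty list back means the layout follows both rules. Whether the features are named well is still on you.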

Finally, we get to the actual folder-by-feature example. This has been a lot of words to get here, but I think it's a great example of how simplicity is not the same as easiness. It can take a great deal of thought to write less code just as it can take a lot of words to describe a simple system.

This example is within the context of a React app, but don't worry if you don't see the kinds of things that you write in your code: there are two generic examples at the end of the article. Let's imagine adding the idea of login and a profile to our app. A simple example of the login feature would include a page with a login form and a login hook that would serve two purposes: sending the login and being used in the profile feature to make sure the user is logged in and get their data.

src
├─ login
│  ├─ components
│  │  ├─ form.jsx
│  │  ├─ layout.jsx
│  │  └─ login.jsx
│  ├─ tests
│  │  ├─ form.spec.jsx
│  │  ├─ login.spec.jsx
│  │  └─ use-login.spec.jsx
│  ├─ utils
│  │  └─ use-login.js
│  └─ index.js
├─ profile
└─ index.js

Let's now look at how we'd build the login feature interface and use that feature in the profile feature.

// in login/index.js
export { default as Login } from './components/login';
export { default as useLogin } from './utils/use-login';

// in profile/components/some-component
import { useLogin } from '../../login';
// Or better yet, use aliasing or root paths if the tech allows.
// import { useLogin } from 'src/login';

This allows profile to use the "public" parts of the login feature, and at the same time lets us refactor the internals like form.jsx without any concern for side effects.
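The "only through the interface" rule is simple enough to state as a function. Here's a minimal sketch, assuming both arguments are paths relative to the code root and that the import target has already been resolved and stripped of its file extension. The function name is hypothetical and this is not how any particular linter does it.

```javascript
// Hypothetical boundary check.
// fromFile: the importing file, e.g. "profile/components/some-component.jsx"
// target:   the resolved import target without extension,
//           e.g. "login" (the interface) or "login/utils/use-login" (internals)
function isAllowedImport(fromFile, target) {
  const fromFeature = fromFile.split('/')[0];
  const targetSegments = target.split('/');
  const targetFeature = targetSegments[0];

  // Inside a single feature, files may import each other freely.
  if (fromFeature === targetFeature) return true;

  // Across features, only the interface is fair game: "login" or "login/index".
  return (
    targetSegments.length === 1 ||
    (targetSegments.length === 2 && targetSegments[1] === 'index')
  );
}

console.log(isAllowedImport('profile/components/some-component.jsx', 'login')); // true
console.log(isAllowedImport('profile/components/some-component.jsx', 'login/utils/use-login')); // false
```

The second call is exactly the import that would let profile break when login's internals get refactored.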

For example, maybe use-login.js is complex and we want to pull out a post-login.js file that's concerned with the network fetching and also convert to using something like axios or graphql. See how this is much easier to do in this system? Maintenance and upgrades are a breeze too because we get very clear seams to work with. I can update or refactor one feature at a time and never have to worry about chicken-and-egg problems.


So there you have it. folder-by-feature is clearly superior and those who say otherwise are obviously wrong.

Top Gun high five gif


Before you go though, I thought I would include two generic examples of the system and a whole section of FAQs that I've worked through with people who have all, without exception, come around to the folder-by-feature side.

Examples

Generically for [INSERT YOUR JS LIB HERE],

src
├─ feature-a
│  ├─ components
│  ├─ utils
│  ├─ tests
│  ├─ ...
│  └─ index.js
│
└─ feature-b

Generically for MVC if we were working without autoloading etc. I think this one really shows the overhead of how many different places you could have to look for related code if it wasn't all in the same feature folder. By not nesting folders, we also alleviate the common problem of weird namespacing overwrites that (IMO) cause way more harm than good. This doesn't mean we can't use namespacing and inheritance, we just keep the file organization flat.

src
├─ feature-a
│  ├─ concerns
│  ├─ controllers
│  ├─ helpers
│  ├─ mailers
│  ├─ models
│  ├─ presenters
│  ├─ services
│  ├─ specs
│  ├─ validators
│  ├─ views
│  ├─ workers
│  ├─ ...
│  └─ module.rb/php/py
│
└─ feature-b

FAQs

"What about cross-cutting concerns?"

There are 2½ answers to this question:

  1. Your "cross-cutting concern" is actually part of a feature. I picked the login example because that's a common one that's used outside of the feature itself. This is just a matter of semantics and I think that you'll end up with better feature/problem/business domain definitions if you get used to thinking about everything as being a part of a feature.
  2. The concern is actually its own feature (and this isn't just a punt). An example I can think of off the top of my head is end-to-end tests for an entire app (because integration and E2E inside a feature and even across boundaries should then just be a folder or part of tests). The feature that you want to be able to put in and take out and be independent and valuable as a unit is the testing itself. Go for it and make an end-to-end feature.
  3. (or 2½) Occasionally (especially when still converting everyone to the promised land of folder-by-feature) people will want an app feature or something similar like root or shared. This can make sense as a place for some things. For instance, if your whole app has one layout, this could work. But I recommend against putting things here and instead using your brain a bit more. Other things that sometimes fall in here are things that should be their own libraries (like maybe all of the common UI components for your app). The solution here is either to extract it, because it's actually an independently evolvable part of your wider tech system, or to put it in a feature (ui for instance).

"What about IDE help?" and "What about statically typed, compiled languages?"

Yes, certain IDEs and languages make it easier to deal with organization and visibility. Things like reference tracing and fuzzy search make the sprawl/side-effect problem more manageable. But that's the thing: you're just working with something that's more manageable when there's a way that doesn't need management to begin with.

There's also the problem of allowing others who use different tools to have a good experience and the fact that you may not have all of your development machines or environments set up identically. When you're working on a server or in a container and there's nothing but vi and ls to use, you'll be glad that you can know what references what.

"What about duplicated folders?"

"So you're just moving the controllers, models, views etc. folders into a feature. What about DRY? Why would I want to have to put a bunch of these folders around?"

This is somewhat fair, but I believe it's misguided. First off, the folder-by-feature I described here doesn't preclude you from not having those organizational folders at all. You could just have files in the feature folder if you wanted. I don't recommend that, but it's not a problem in the system since those folders are just there for convenience anyway. Second, who cares if it's long? That indicates that the feature contains those types of objects or functions. I'll reiterate that devs much more often reason about the feature vertically than horizontally.

"I don't like how long the feature list can be."

Not really a question but a concern I often hear and one that's no better in other systems. If you have a lot of code, your app in folder-by-tech will include a large list of controllers or models. Having to locate a feature in a giant list like that across folder-by-tech folders is exactly the waste of time we discussed above. Worst case here is that folder-by-feature is a lateral move and best case it's a massive upgrade.

Also, I think that it's valuable to see all the features of a project in one glance. It often indicates great opportunities to have better domain design or split out independent services. I believe that this flatter structure is a great asset to understanding our codebases.

"What about [THING] that you didn't mention?"

If it's something really unique and worth dealing with, I'd be willing to bet it's specific to a feature, and if that's the case then cool, do whatever weirdness you want inside of a feature folder. This is one of the biggest advantages: it doesn't matter at all so long as you can make a good interface!

"What about language/framework X?"

I know that some frameworks (Rails 👀) require files to be in certain places with certain names for them to work right. I personally don't like this. I don't think it saves much time, and it actively hides things from people and confuses them when they're trying to understand the code.

Look at how JS lets you just import things. Is that overly cumbersome and difficult to work with? Are you really saving that much time by having things auto-loaded instead? What about the confusion that auto-loading can cause by happening invisibly? What about the lack of traceability that you have when there aren't explicit imports? For all these reasons and more, I'm against it, but if you're using something that requires it, just go with it. This is an opinion after all, which leads me to the last FAQ.

"Are you actually this big of a jerk about all this stuff in real life?"

No. I use different systems all the time. I just vastly prefer these to others having used all of them. The most important thing to do is just do what's working for your team. This post is obviously written tongue-in-cheek, but, with this being the internet, I felt the need to disclaim all that here.