Monolithic repository vs a monolith

In software, a monolithic architecture is one in which all parts of an application are packaged into a single component that offers many services. A monolith makes sense from a convenience and developer productivity standpoint.

In a monolith, all code is in one place, and it is easy to add features and reuse components. All developers can contribute to all parts of the code as needed. Importantly, all code in a monolith is tested and deployed together as a single unit in which everything is compatible.

It is easy to get bogged down in the dogmatic aspects of software architecture and build flaws into the application that will be difficult to overcome later. Strict adherence to domain-driven architecture, for example, leads to the opposite problem from that of the monolith: both the code and the teams working on it become so decoupled that they can’t work together.

As an architect, I am not opposed to monolithic architecture per se. At the outset of a brand-new application, it is not always obvious what boundaries are necessary. I don’t believe that time spent in meetings trying to boil the architectural ocean is conducive to productivity. A well-designed monolith with firm logical boundaries (i.e., modules) between distinct layers of functionality is good enough to get an application out the door.
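To make "firm logical boundaries" a little more concrete, here is a minimal sketch of how such boundaries could be checked inside a monolith. The layer names (`web`, `services`, `persistence`, `domain`) and the allowed-dependency table are assumptions made up for this illustration; a real project would derive the actual imports from its own code or build tooling.

```python
# Hypothetical layers of a monolith and the layers each one may depend on.
# These names are illustrative assumptions, not taken from a real project.
ALLOWED_DEPENDENCIES = {
    "web": {"services"},
    "services": {"domain", "persistence"},
    "persistence": {"domain"},
    "domain": set(),
}


def boundary_violations(actual_imports, allowed):
    """Return sorted (importer, imported) pairs that cross a boundary illegally."""
    violations = []
    for importer, imported_modules in actual_imports.items():
        for imported in imported_modules:
            # A module may always reference itself; anything else must be allowed.
            if imported != importer and imported not in allowed.get(importer, set()):
                violations.append((importer, imported))
    return sorted(violations)


# Example: "web" reaches past the service layer, and "domain" depends on "web".
observed = {"web": {"services", "persistence"}, "domain": {"web"}}
print(boundary_violations(observed, ALLOWED_DEPENDENCIES))
```

A check like this could run as part of the test suite, so the boundaries stay firm even though all the code lives in one deployable unit.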

The advantages of the monolith are therefore obvious:

  1. All code in one place is conducive to developer productivity and agility. All developers can see all code. They can contribute to all parts of the application and transfer their skills from one area to another;
  2. Code reuse and refactoring are easy because all code is in one place;
  3. Builds and deployments are simple.

Over time, however, services offered by the monolith develop a life of their own. Here are the main areas where a monolith begins to get in the way of a well-designed and functional architecture:

  1. Different security profiles: Some APIs in a monolith should be open to the public Internet, while others should not. Some services should live in the application-tier subnet, and others should live in the database-tier subnet. In a hybrid cloud model, some services should have access to the company’s internal on-premises infrastructure, while others should not, etc.;
  2. Different performance characteristics: Different parts of the monolith have different performance characteristics and need their own auto-scaling rules;
  3. Different release cycles: Some parts of the monolith are project hotspots that require a fast release cycle. It should be possible to deploy hotfixes to some parts of the application without having to regression test the entire code base;
  4. Code base too large for the tooling: The code base has become so large that the toolchain can’t handle it. Unit tests run too long, the compiler crashes with out-of-memory errors, etc. Some programming languages reach this point earlier than others, and JavaScript-based projects are particularly notorious for not scaling well with the size of the code base;
  5. Programming language inappropriate for some tasks: The language chosen for the monolith does not suit every job. For example, imposing Node.js on machine learning services will result in neither good use of Node.js nor good machine learning.

A monorepo can address all of the above problems without sacrificing the main advantages of a monolith. Using a monorepo, you can:

  1. Keep all code in one place;
  2. Facilitate code reuse and refactoring across the entire project;
  3. Separate services based on security, scalability, and performance profiles while still having all of their code at your fingertips;
  4. Incrementally build and deploy only those services that have been modified for a particular release;
  5. Use different programming languages as needed, choosing the right tool for each task.
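The incremental build and deploy point can be sketched in a few lines. The example below assumes a hypothetical monorepo layout (`libs/common`, `services/public-api`, and so on) with a hand-written dependency table; real monorepo build tools compute this graph for you, so treat this only as an illustration of the idea: find the packages touched by a change, then add everything that depends on them.

```python
from pathlib import PurePosixPath

# Hypothetical monorepo layout: each package lives in its own directory and
# lists the other packages it depends on. All names are illustrative.
DEPENDENCIES = {
    "libs/common": [],
    "services/public-api": ["libs/common"],
    "services/billing": ["libs/common"],
    "services/ml-scoring": [],
}


def owning_package(changed_file, packages):
    """Return the package whose directory contains the file, or None."""
    path = PurePosixPath(changed_file)
    for package in packages:
        if PurePosixPath(package) in path.parents:
            return package
    return None


def affected_packages(changed_files, dependencies):
    """Return every package that changed or depends on a changed package."""
    affected = set()
    for changed_file in changed_files:
        package = owning_package(changed_file, dependencies)
        if package is not None:
            affected.add(package)

    # Propagate to dependents until no new package is added.
    grew = True
    while grew:
        grew = False
        for package, deps in dependencies.items():
            if package not in affected and affected.intersection(deps):
                affected.add(package)
                grew = True
    return affected


# A change to the shared library triggers both services that use it,
# while the unrelated ml-scoring service is left alone.
print(sorted(affected_packages(["libs/common/src/util.py"], DEPENDENCIES)))
```

Only the packages in the returned set would need to be rebuilt, tested, and deployed for that release; everything else in the repository stays untouched.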

Now, I am not advocating for all components and all projects in a company to be in a monorepo. A monorepo makes sense under some circumstances and makes no sense under others. A set of related features with related code and similar security, performance, and scalability profiles belongs in a single deployable service. Services that are functionally related and have a closely aligned release cycle belong in the same monorepo.
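As a rough illustration of that grouping rule, the sketch below places features that share a profile into the same deployable service. The `Profile` fields and the feature names are assumptions invented for this example, and matching on identical profiles is a deliberate simplification of "similar"; in practice this is a judgment call, not a lookup.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical operational profile of a feature (illustrative fields)."""
    exposure: str       # e.g. "public" or "internal"
    scaling: str        # e.g. "cpu-bound" or "io-bound"
    release_cycle: str  # e.g. "weekly" or "quarterly"


def group_into_services(feature_profiles):
    """Group feature names that share an identical profile."""
    groups = defaultdict(list)
    for feature, profile in feature_profiles.items():
        groups[profile].append(feature)
    return [sorted(features) for features in groups.values()]


features = {
    "checkout": Profile("public", "io-bound", "weekly"),
    "cart": Profile("public", "io-bound", "weekly"),
    "reporting": Profile("internal", "cpu-bound", "quarterly"),
}
print(group_into_services(features))
```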

Generally speaking, I am also not advocating for the approach taken by Google, which has some 90% of its code in a single monorepo. Standardization of tooling is good to an extent, until it inhibits innovation and agility. Developers should own the proverbial sausage-making.