Who needs distributed systems anyway?
TL;DR: monoliths can (a) be flexible, and (b) possibly handle big-ish data.
As a software engineer you are in the business of solving problems. Sometimes your problems are “big” in terms of bytes of data, sometimes they’re big in terms of number of humans involved each second, and sometimes they’re just big in terms of the raw real-world complexity that you are trying to model.
A modern attitude to solving such problems (even if they’re not necessarily that “big” in any sense) is to split the problem up and put different parts of it into different containers, services, databases and so on. But solutions that rely heavily on complexity at the infrastructure layer are not easy to build and manage. Don’t kid yourself: there’s a lot of tooling to master, configurations to configure, protocols to design, failure modes to monitor, moving pieces to test, dependencies to keep up to date, APIs to synchronise, performance bottlenecks to widen, security concerns to take seriously, and more. Of course, if you have the time and resources, you can solve many of these things in a somewhat scalable way, but ultimately these problems never go away completely.
What if we tried solving problems purely in code, building as little infrastructure as possible? That’s not to say you should run your entire company off a single monolith, but maybe you can run some large chunks of the company off a monolithic application running on a single server. It certainly sounds simple, and old-fashioned, but is it in any way manageable? In particular, is it flexible enough for an Agile world full of unknown unknowns, and will it fall over when it comes to handling “big data”?
Monoliths & Big Data…
We need to ask some questions before we can resolve that “possibly handle big-ish data” claim in the TLDR at the top of the page:
- Does the system have to operate in real-time, or is it allowed a few seconds of latency?
- How big is your data, really? Are we talking hundreds, thousands, millions or billions of operations per second?
- And of those hundreds/thousands of operations, is there significant variety in terms of what has to be done?
If your system doesn’t have to operate in true real time and isn’t doing millions of varied things per second, then [1] you can probably get away with a simple monolith. Really? Yes…
Suggestion - you could handle hundreds or thousands of operations per second within a single program if you batch them up. If the operations don’t arrive in batches, then just wait for half a second or so, and then process the latest batch in one go. It’ll be fast, I promise. This kind of optimisation might feel familiar if you’ve ever looked into the difference between the CPU and GPU paradigms, or worked with data loaders.
If your problem lends itself to being batched up, then congratulations, you can probably avoid building lots of infrastructure and still get decent performance.
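To make the "wait for half a second, then process the batch" idea concrete, here is a minimal sketch of a fixed-window batcher. It is an illustration under assumptions, not the code we run: `WindowedBatcher`, `handler` and `waitMs` are hypothetical names, and a real version would need to think about error handling and back-pressure.

```typescript
// Hypothetical sketch: collect operations for a short window, then handle them together.
class WindowedBatcher<T> {
  private pending: T[] = [];
  private timer: ReturnType<typeof setTimeout> | undefined = undefined;
  private readonly handler: (batch: T[]) => void;
  private readonly waitMs: number;

  constructor(handler: (batch: T[]) => void, waitMs: number = 500) {
    this.handler = handler; // called once per batch, not once per operation
    this.waitMs = waitMs;   // "half a second or so" by default
  }

  // Queue one operation; open a wait window if one is not already running.
  add(item: T): void {
    this.pending.push(item);
    if (this.timer === undefined) {
      this.timer = setTimeout(() => this.flush(), this.waitMs);
    }
  }

  // Process everything collected so far in one go and close the window.
  flush(): void {
    if (this.timer !== undefined) {
      clearTimeout(this.timer);
      this.timer = undefined;
    }
    if (this.pending.length === 0) return;
    const batch = this.pending;
    this.pending = [];
    this.handler(batch);
  }
}

// Usage: two operations arrive separately but are handled as one batch.
const batcher = new WindowedBatcher<string>((batch) => {
  console.log(`processing ${batch.length} operations in one go`);
});
batcher.add("update-1");
batcher.add("update-2");
batcher.flush(); // flushed early for the demo instead of waiting for the timer
```

The window opens on the first operation and is not extended by later ones, so no operation waits longer than `waitMs`. A trailing debounce (restarting the timer on every `add`) is a small variation, at the cost of unbounded waits under constant load.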
Monoliths and Flexibility…
Can a single program be as flexible as an entire distributed system?
Claim - there are several highly successful multi-million-line programs that are under active development in 2020. Think operating systems, browsers, game engines etc. Modern browsers manage to add real features at a meaningful pace, while maintaining performance and security. So don’t say it can’t be done!
Claim - one of the greatest benefits of microservices is that they require physical lines to be drawn around parts of a complex domain. However, the physical aspect of these lines doesn’t necessarily add much in itself - rather the value comes from the fact that someone has worked out where the logical boundaries are and set up a clean API for communicating across the boundaries.
Suggestion - work out how to logically separate parts of a system, but still compile the whole thing into a single program. Using a strongly typed language (or TypeScript), we could define a clear API between these different parts, and we could put each of them in a different directory (or hierarchy of directories), each with its own conventions, design patterns, utilities, data architecture, and tests. Communication between these logical units would take the form of a basic function call (which is super fast), and so it ends up being much harder to build an inefficient system, or a system with complex failure modes. You also get much better static analysis, simpler testing, and the option to refactor across the whole system relatively easily. That all sounds like a good recipe for flexibility.
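As a rough sketch of what such a boundary could look like in TypeScript: each domain exposes a small typed interface, and other domains depend only on that interface. The domains (`accounts`, `invoicing`), the file names in the comments, and the pricing rule are all invented for illustration, and everything is squeezed into one file here for brevity.

```typescript
// --- accounts/api.ts (hypothetical): the only surface other domains may use ---
interface Account {
  id: string;
  seats: number;
}
interface AccountsApi {
  getAccount(id: string): Account | undefined;
}

// --- accounts/impl.ts (hypothetical): storage details stay behind the boundary ---
function createAccountsDomain(seed: Account[]): AccountsApi {
  const byId = new Map<string, Account>();
  for (const account of seed) byId.set(account.id, account);
  return { getAccount: (id) => byId.get(id) };
}

// --- invoicing/impl.ts (hypothetical): depends on AccountsApi, not its internals ---
interface InvoicingApi {
  monthlyTotal(accountId: string): number;
}
function createInvoicingDomain(accountsApi: AccountsApi, pricePerSeat: number): InvoicingApi {
  return {
    monthlyTotal(accountId) {
      // Crossing the boundary is a plain function call: no network hop, no serialisation.
      const account = accountsApi.getAccount(accountId);
      if (account === undefined) throw new Error(`Unknown account: ${accountId}`);
      return account.seats * pricePerSeat;
    },
  };
}

// --- main.ts (hypothetical): wire the domains together inside one process ---
const accounts = createAccountsDomain([{ id: "a1", seats: 4 }]);
const invoicing = createInvoicingDomain(accounts, 25);
console.log(invoicing.monthlyTotal("a1")); // 4 seats * 25 per seat = 100
```

If `AccountsApi` changes shape, the compiler flags every caller in the same build, which is the cross-system refactoring benefit in practice. A failed lookup surfaces as an ordinary exception with a stack trace rather than a timeout.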
Claim - events are a handy way of communicating between parts of a complex domain. There are several ways of using events, but roughly speaking the power of events is that they decouple a publisher from one or more downstream consumers.
Suggestion - build an application around an in-memory event loop. You may want a “micro” event loop in memory and a “macro” event queue with some persistence. But regardless of the details, even though everything is happening within one application, you still get decoupling between producers and consumers. In other words, you’ve accomplished much the same flexibility as is available in event-ish microservices. Of course, most frontend applications (on the web and otherwise) are very much event driven within a single application, so there is nothing especially surprising about this.
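Here is one way the "micro" in-memory part might look: a small typed event bus where publishers know nothing about their consumers. This is a sketch under assumptions rather than our production code. The event names, payloads and the `EventBus` class are hypothetical, and the persistent "macro" queue is left out entirely.

```typescript
// Hypothetical event catalogue: event name -> payload shape.
type Events = {
  "customer.created": { customerId: string };
  "invoice.paid": { invoiceId: string; amount: number };
};

class EventBus {
  private handlers = new Map<keyof Events, Array<(payload: unknown) => void>>();
  private queue: Array<() => void> = [];
  private draining = false;

  // Consumers register interest without the publisher knowing about them.
  subscribe<K extends keyof Events>(type: K, handler: (payload: Events[K]) => void): void {
    const list = this.handlers.get(type) ?? [];
    list.push(handler as (payload: unknown) => void);
    this.handlers.set(type, list);
  }

  // Publishers enqueue one job per subscriber, then let the loop run.
  publish<K extends keyof Events>(type: K, payload: Events[K]): void {
    for (const handler of this.handlers.get(type) ?? []) {
      this.queue.push(() => handler(payload));
    }
    this.drain();
  }

  // Run queued jobs in FIFO order; events published by handlers join the back of the queue.
  private drain(): void {
    if (this.draining) return; // already inside the loop, which will pick the new jobs up
    this.draining = true;
    try {
      let job = this.queue.shift();
      while (job !== undefined) {
        job();
        job = this.queue.shift();
      }
    } finally {
      this.draining = false;
    }
  }
}

// Usage: one publisher, two independent consumers.
const bus = new EventBus();
const log: string[] = [];
bus.subscribe("customer.created", (e) => { log.push(`welcome ${e.customerId}`); });
bus.subscribe("customer.created", (e) => { log.push(`sync ${e.customerId}`); });
bus.publish("customer.created", { customerId: "c1" });
console.log(log); // ["welcome c1", "sync c1"]
```

Because the payload type is looked up from `Events`, publishing a misspelt event name or a malformed payload fails at compile time. If a handler throws, the exception propagates to the publisher and the remaining jobs stay queued until the next `publish`; whether that is the right policy is a design decision this sketch does not settle.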
So, again, when it comes to flexibility there are actually a lot of serious advantages to going with a monolith over a distributed system, and you can even steal some of the best bits of distributed systems for your monolith. We haven’t touched on monitoring, debugging and error handling here - they do need to be thought through - but suffice to say that they tend to be easier in a monolith than in a distributed system (think basic stack traces, throw statements, and try-catch blocks).
In what context was this blog written?
This blog was written in the context of solving BizOps data problems at LandTech. We have a few thousand (highly valuable) customers at LandTech, but they don’t produce that much “account-management-related” data per second, nor do we need this data to reach our account management systems instantly - we can wait a few seconds. However, there is substantial complexity in the number and scope of the systems we have - pretty much each business unit has a dedicated SaaS product (Salesforce, Hubspot, etc. [2]) that they heavily rely on, and we wanted to get data flowing between them all smoothly (a long-overdue project that became that much more relevant during the crises of 2020).
We decided that flexibility was paramount when solving this problem, as it’s impossible to plan the whole thing up front and discover all the limitations of all the third-party APIs in a finite amount of time. So we took the approach described above: we built a monolithic application containing clearly separated domains and using an in-memory event loop built around the idea of debounced batching. This design came fairly naturally, as most of the third-party APIs support or prefer batched requests.
So Majestic Monoliths FTW, eh?
In this case, our choice of monolith turned out to be a really successful decision (at least in the short to medium term) as we have been able to build incredibly quickly. In the longer term, now that we have much of our business operations codified in a single repository, we might like to review things and decide if a more complex architecture would help to address the challenges awaiting us in 2021 and beyond!
In general at LandTech, we really like to Keep It (Stupid) Simple, but sometimes the only sensible approach is to invest more time/energy in developing larger systems - this is something we’ve been getting better at over the last few years. If you’re interested in joining us on that journey and solving the UK’s Housing Crisis (and more!) then check out our current job postings.
[1] You could possibly even make this work with millions of requests per second. However, you would need some kind of proxy sitting in front of your monolith to pre-batch the requests every few milliseconds and fan out the results (in which case your “monolith” has basically become a glorified single-instance database, and you’re back to having a fairly distributed system, plus you might run into bandwidth issues unless your data is very compact!).
[2] While it’s somewhat of a pain to have so many third-party systems, it actually means we’ve accomplished DDD for free, and each business unit is free to pick the best tool for the job!
We are the engineers behind LandInsight and LandEnhance. We’re helping property professionals build more houses, one line of code at a time. We're based in London, and yes, we're hiring!