jarias (3)
Hi, thank you for the course! I've been reading your blog for a while and this is like having a long-wanted conversation with you :)

I would like to know your personal point of view on how to implement the many-repositories approach that you mention in unit 6. We've been using monorepos for our microservices for a while now, and we have tried to replicate this for our serverless applications.

However, I feel that the number of subprojects explodes when it comes to serverless, and that the different subprojects tracked in the monorepo are quite heterogeneous in nature. We have three kinds of repos in our projects:

- Serverless services: which are a collection of lambdas and a serverless.yml file.
- Shared libraries: which are node modules.
- Resources: which are serverless.yml files defining only shared resources (we find this format more convenient than CloudFormation or Terraform, as it is easier to reference them from other templates).

We track the changes on each subproject in our CI platform and run tests and deployments (via sls) as needed. We are using just a collection of scripts for this and it is working fine for now, but I can't stop thinking that we are building something extremely fragile.

I haven't seen a single example of monorepos from the pros... Can you share with me some of your thoughts or experiences with this kind of model?
Yan Cui (22)
Hi ya,

Glad you're enjoying my posts on serverless, it's been a nice creative outlet for me to get lots of ideas out of my head and organise them in some coherent way.

My personal feeling towards monorepos, in general (not limited to serverless), is that they are really hard to pull off as the number of projects, and of people working on those projects, grows:

  • the amount of knowledge a new joiner needs to acquire grows with the overall complexity of the system, rather than with the single project they need to touch (as it would if that project lived in its own repo). Michael Nygard's post on the coherence penalty offers a really good explanation for this

  • there's an increased chance for concepts and abstractions to leak across project boundaries. Sharing code inside the same repo has far less friction than sharing through a library that has to be published to NPM first, so accidental sharing (and therefore accidental coupling between services) becomes easy

  • release tracking and labelling become more difficult, because every release for every service is tagged and they all show up in the same GitHub repo

  • similarly, the trail of changes for a particular service becomes harder to follow if you're sharing code in the same repo as opposed to via NPM packages. E.g. a service's behaviour might change because one of its dependencies changed, but if that change happened in a shared module in the same repo (as opposed to via an explicit package update) then it might not be reflected in the commit history for that service's project folder

Of course, it's possible to do monorepos well, but it relies on strong discipline across the team to avoid the many pitfalls of this approach. In my experience, when you're heavily reliant on discipline, and you mix in staff turnover and new joiners, it becomes a failure waiting to happen - any moment of ill-discipline or corner-cutting (or taking on some tech debt in exchange for temporary velocity to meet a deadline) can have a lasting effect.

I have heard that some game companies use this approach, but they tend to have small, skillful, and stable teams, so it was probably OK for them to enforce and rely on discipline. I still think it's a tightrope to walk.

I think the reason you don't see any examples of monorepos from the pros is that pretty much all of them have come from the microservices world, where having an individual repo for each service and shared library is the norm. A serverless architecture is almost always a microservice architecture too - microservice is an architectural style, and shouldn't be conflated with implementation technologies like EC2 or Docker - so the same principles apply.
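
To make the point about change trails a bit more concrete, here is a rough sketch of the kind of logic a monorepo CI script ends up needing. It is only an illustration: the folder layout, the project names and the `dependsOn` lists are all made up, and a real setup would get the changed files from git rather than a hard-coded array.

```typescript
// Hypothetical monorepo layout: every subproject lives in its own folder.
// All names and paths below are invented for illustration.
type Project = { name: string; path: string; dependsOn: string[] };

const projects: Project[] = [
  { name: "orders-service", path: "services/orders", dependsOn: ["shared-logging"] },
  { name: "users-service", path: "services/users", dependsOn: [] },
  { name: "shared-logging", path: "libs/logging", dependsOn: [] },
];

// Projects whose own folder contains at least one changed file.
function directlyChanged(changedFiles: string[], all: Project[]): string[] {
  return all
    .filter((p) => changedFiles.some((f) => f.indexOf(p.path + "/") === 0))
    .map((p) => p.name);
}

// Also include every project that depends, directly or transitively,
// on a changed project. Repeats until no new project is added.
function affected(changedFiles: string[], all: Project[]): string[] {
  const result = directlyChanged(changedFiles, all);
  let grew = true;
  while (grew) {
    grew = false;
    for (const p of all) {
      const alreadyIn = result.indexOf(p.name) !== -1;
      if (!alreadyIn && p.dependsOn.some((d) => result.indexOf(d) !== -1)) {
        result.push(p.name);
        grew = true;
      }
    }
  }
  return result;
}

// A commit that only touches the shared library...
const changedFiles = ["libs/logging/index.js"];
const direct = directlyChanged(changedFiles, projects).sort();
const toRedeploy = affected(changedFiles, projects).sort();
console.log(direct, toRedeploy);
```

With this input, the orders service needs to be redeployed even though nothing under its own folder changed, which is exactly why its folder history won't tell you the whole story. With separate repos and published packages, the same change would show up as an explicit version bump in the service's own repo.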