
Launch HN: Moonrepo (YC W23) – Open-source build system

source link: https://news.ycombinator.com/item?id=34885077

162 points by mileswjohnson 12 hours ago | 130 comments
Hey HN, Miles and James here from Moonrepo (https://moonrepo.dev). Are you struggling with large codebases? Well, look no further! We built Moonrepo to simplify repository management, project ownership, task running, and everyday developer productivity workflows.

If you’ve used Bazel (or another “enterprise” build system) in the past, you’re probably aware of how complex they can be to set up, configure, and use, let alone the cognitive overhead they demand from developers on a day-to-day basis. After more than a decade in the industry, with many of those years working on infrastructure and developer-tooling products, we set out to build Moon, a language-agnostic build system.

Existing systems focus solely on runtime logistics (faster builds, concurrency), while we also want to focus on the developer experience. We do this by automating workflows as much as possible, in an effort to reduce manual work. We constantly sync and verify configuration so that the repository stays in a healthy state. We also infer/detect as much as we can from the environment/repository/codebase, so pieces "just work".

We wanted our system to be enjoyable to use and easy to understand, but also solve the same problems as existing systems. For example, configuration is in YAML, not a proprietary syntax. Tasks are defined and run as if you were running them in the terminal; no more abstractions like BUILD files. Unlike Bazel, we don’t hide or heavily rewrite terminal output, so the feedback loop is what you expect. We manage a toolchain, ensuring the correct version of languages is used (no more “works on my machine”). And lastly, our foundation is built on Rust and Tokio, so performance is first-class, the runtime is reliable, and memory safety is guaranteed.

We follow the open core model. Moon is open source, but we’re also working on a few subscription-based services for monitoring and improving your continuous integration pipelines, a registry of project and code ownership, a continuous deployment/delivery board, auxiliary application systems, and more. We haven't finalized the subscription model yet, so there's no pricing information on the website. However, we do have a starter/free tier that everyone can use by registering on https://moonrepo.app. In the future, we will offer on-prem as well.

Although Moonrepo is relatively new, we’re already feature-packed, stable, and used in production. We’re big fans of honest feedback, and look forward to your comments!

Configuration is code: the YAML written for your tool is still code, code that developers are consuming and versioning. Making it YAML doesn’t make it not code. Your YAML configuration files are a proprietary syntax, a proprietary code syntax written in a language (YAML) that lacks basic abilities for logic or control flow. In a system of any worthwhile complexity, custom build logic will inevitably have to be expressed, and I can’t imagine it being good when it is cooked up in a proprietary configuration language based on YAML.

How easy will it be for layers of abstractions to emerge in this system? That’s the ultimate limitation of markup-language-based / configuration-based build systems.

Look up the difference between external DSLs and internal DSLs. The YAML config file is an external DSL. I prefer to express configuration using internal DSLs.

Why? Why YAML? Why? YAML is absolutely terrible!

Aside from the classic `country_code: NO`, the other day I ran into issues with scientific notation. Now, guess which of the following are strings and which are numbers (see the sketch after the list):

- 1e+10

- 1.e+10

- 1.0e10
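
For anyone who wants to check, here is a minimal Python sketch, assuming PyYAML (which resolves plain scalars using YAML 1.1 rules; a YAML 1.2 parser may answer differently):

    import yaml  # PyYAML; resolves plain scalars using YAML 1.1 rules

    doc = "a: 1e+10\nb: 1.e+10\nc: 1.0e10\n"

    for key, value in yaml.safe_load(doc).items():
        # YAML 1.1's float pattern requires a dot in the mantissa and a
        # signed exponent, so not all three resolve to numbers here.
        print(key, repr(value), type(value).__name__)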

> Tasks are defined and run as if you were running them in the terminal; no more abstractions like BUILD files.

But abstractions are a good thing! They let you separate a system into layers, so that things programmed at a higher layer don't need to know all the details of the lower layers. It's the way we separate interface from implementation.

Let me give an example from Bazel. Suppose you are compiling some C or C++, and you need to define a preprocessor symbol. There are two ways of doing this, "defines" and "local_defines":

    cc_library(
        name = "my_lib",
        srcs = ["my_lib.cc"],
        hdrs = ["my_lib.h"],
        # This will pass -DFOO when compiling this library
        # *or* any library that depends on it.
        defines = ["FOO"],
    )

    cc_library(
        name = "my_lib",
        srcs = ["my_lib.cc"],
        hdrs = ["my_lib.h"],
        # This will pass -DFOO when compiling this library
        # only.  Libraries that depend on us will *not* get
        # the symbol.
        local_defines = ["FOO"],
    )
You can use "defines" when you have #ifdef statements in headers and "local_defines" when you only test the symbols in your .cc files.

I did not have to think about how defines will propagate to the libraries that depend on me, I just specified what I need and the build system handles the rest. I did not have to define my own CXXFLAGS variable and handle concatenating all of the right things together. The lower layer takes care of that.

What Bazel lets you do is create abstraction boundaries within the build system. People can write rules that define a self-contained set of build logic, then other people can use those abstractions without knowing all the details of how they are implemented.
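
As a rough sketch of what such an abstraction can look like (the file and macro names here are hypothetical, not a real ruleset), a Starlark macro can bundle the lower-level rules and their flag handling behind a single call:

    # my_rules.bzl (hypothetical) -- wraps lower-level rules behind one call.
    def cc_module(name, srcs, hdrs, deps = []):
        # A library plus its test; callers never touch defines or test wiring.
        native.cc_library(
            name = name,
            srcs = srcs,
            hdrs = hdrs,
            deps = deps,
            local_defines = ["BUILDING_" + name.upper()],
        )
        native.cc_test(
            name = name + "_test",
            srcs = [name + "_test.cc"],
            deps = [":" + name],
        )

A BUILD file then just loads `cc_module` and calls it with a name, srcs, and hdrs, without knowing what happens underneath.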

Bazel is not perfect, and I have found myself banging my head against the wall many times. But overall the ability to define abstractions in a build system is, I think, one of the biggest things it gets right.

Disclosure: I work for Google, but not on Bazel.

Look at https://moonrepo.dev/docs#supported-languages

They only fully support JavaScript. The complex stuff, like defines, C++ toolchains, dynamic libs, etc., is all out of scope.

Wow true. Claiming to be a build system yet being limited to JS... Apparently, the legions of JS developers think the web is the only thing that exists today.
Also, MIGRATIONS!

Interfaces are VERY GOOD for migrations.

If you decide that you want to stop using some compiler flag, or maybe use a different compiler, or change your python version, or...

You can write a regex to go over all your shell invocations, change them, then test, or you can do something like:

    def build_thing(name, srcs, hdrs, migration = False):
        if migration:
            ...

    build_thing(
        name = "my_lib",
        srcs = ["my_lib.cc"],
        hdrs = ["my_lib.h"],
    )

You can then manually flip that flag to test stuff, write a script to flip it for every team and send out PRs (because your targets can have oncalls) that they can land, and then at the end you can flip the default and manually add a False for the holdouts.

All this stuff gives you the ability to do hard stuff at scale.

Maybe that’s more relevant for the non-js ecosystem? I never find myself wanting to modify my source while it is being built.
Bazel has a heavy focus on correctness (pretty much to the exclusion of everything else). Where does Moon fall on the correctness gradient? Does it enforce hermeticity or deterministic builds, or give me tools to accomplish that?

In the same vein as those questions, how does caching work? Is it content-based like Bazel or mtime-based like Nx et al.? If there is no sandboxing, does it do input tracking, or is there manual "cache key" shenanigans?

If the configuration language is YAML how am I expected to implement new behavior? Is that in Rust? Is there a plugin architecture? Do I need to rebuild the tool itself and redistribute it to my build machines and developers? The main appeal of Starlark/Python in build systems is ability to create macros or in many cases define entirely new rulesets to support new tools and languages without needing to modify the complex and performance sensitive core.

Sorry for the skepticism, but build systems are very complex beasts, and new entrants like Nx don't measure up to tools like Bazel very well.

I think it's hilarious to say bazel focuses on correctness when it doesn't ship with any hermetic toolchains by default. Properly setting up hermetic toolchains is poorly documented and left as an exercise to the reader.

I say this as someone who wants to love bazel... I just can't understand why it picks up impure toolchains from the system at all.

Yeah it can be done but it's painful. In general the Bazel rules are lower quality than Bazel itself - and rules_python is one of the better ones!
You have to remember that it was the very first system to even attempt hermetic & deterministic builds.

Combine that with the fact that C++ toolchains at least make assumptions all over the place about being spewed over `/usr`... I think they just bowed out to practicality at that point.

You're definitely right it should use hermetic toolchains. I'm curious which other Bazel-like build system does that by default?

To be clear, Bazel focuses on correctness because it's essential to achieving performance at scale.

If you don't have correct caching of intermediate build artifacts, a system can't handle the compile and test requirements of large codebases.

> Does it enforce hermeticity or deterministic builds or give me tools to accomplish it?

I wouldn't say moon is hermetic, nor are we trying to be. We don't use the sandbox approach for tasks, and run against the original files. For the languages we support, this works best.

As for deterministic, we try to be. We have a "toolchain" where we download languages/tools in the background, and run tasks using these tools. This ensures, at minimum, that the same version/variant of a tool is used across machines.

> In the same vein as those questions how does caching work?

It's content based. We also hash other kinds of inputs depending on the language that is running.

> If the configuration language is YAML how am I expected to implement new behavior? (and other questions)

Our focus right now is on non-compiled languages, primarily web languages, where custom behavior (like Starlark) is not necessary. In the future, this may change.

How do you target large codebases then? The whole point of hermeticity is to allow you to reliably avoid rebuilding and testing everything on every commit, which is the issue most large codebases have.

Also a bit disingenuous to describe it as language agnostic if it only really works with non-compiled languages.

Bazel as a taskrunner/build system is to most monorepos as a Pile Driver [1] is to a floorboard nail. It will do everything but it's massive overkill. We are agreed that Nx doesn't measure up, but I wouldn't compare it to Bazel.

[1] https://en.wikipedia.org/wiki/Pile_driver

When you start having 100 applications (a typical game may have that amount), even if only 10 of them are thoroughly tested, you end up needing that pile driver, or you end up compiling all 100 of these on every presubmit. Just sayin'.
Fair point. I treated the product as a build system and it just doesn't look right. However it makes perfect sense if it advertised itself as a "task runner".
> We wanted our system to be enjoyable to use and easy to understand, but also solve the same problems as existing systems. For example, configuration is in YAML, not a proprietary syntax

I'm incredibly skeptical of this.

I'm ex-Meta and have worked a lot with the enterprise solutions you're talking about, and the choice of Starlark (originally Python) as the build definition language is one of the killer features of those systems.

People want to create macros, target generators, etc. It's a common use case for a lot of engineers and IMO is a pretty killer feature.

Being able to say "This is an X86 Python Binary", "This is an M1 python binary" and then bundle those into "this is how you build either of those binaries based on inputs" without ever touching the internals or anything other than (more or less) a blob of python is why those tools scale organizationally.

It allows the teams that need to do weird stuff to unblock themselves without drowning the tools org. Sure, it has drawbacks. Super deep macro layers are kinda a crime against humanity and debugging/evolving them can be quite expensive, but I think that's just the cost of software.

If that logic isn't in the build definitions it'll expand into a meta layer that generates configuration (I've seen giant "Translate this definition into 30 configs to run stuff" systems time and time again).

I may just be super biased from past mistakes and wins, but I think what you're doing is just moving the complexity out of your tool into neighboring tools, and selling that as a win isn't really true; it's shuffling complexity around, not removing it.

Based on everyone's feedback about YAML (we didn't expect this much), we'll probably reconsider this!
Glad to hear it!

To synthesize my comment (because it's easier once I've written it once, poorly):

The complexity you see in Bazel/Buck/Pants build files may seem like a result of their decision to use a programming language. That's a red herring. That complexity is fundamental to the problems people need to solve to build software. If you remove the ability to solve it in your build system the complexity will move to other systems.

Wrappers, CI, whatever. The complexity is just the problem space rather than a result of bad tooling.

There is also a Rust implementation of Starlark as a starting point https://github.com/facebookexperimental/starlark-rust

To add to everyone else, please don't use YAML. Starlark is great _precisely_ because it is a readable, well known (nearly Python) language that is limited at the same time (no unbounded for loops, no way to do non-deterministic things like get the current time or `random()`).

Take a look at Apache Aurora [1] if you want some inspiration on how to mold python into a config language. I used this system for a few years and mainly agree with the person you're replying to – having a proper programming language to define config is a very nice feature that I miss.

[1]: https://aurora.apache.org/

one of the benefits of Starlark (unlike Python): "Starlark is suitable for use in highly parallel applications. An application may invoke the Starlark interpreter concurrently from many threads, without the possibility of a data race, because shared data structures become immutable due to freezing." from https://github.com/bazelbuild/starlark/blob/master/spec.md - it's not Python: you can't do recursion (!) and it's more limited (you can't read a file in Bazel and parse it; you have to make that operation part of the build graph somehow)
Sounds great but why on earth YAML? It's too ambiguous and lacks expressive power. Starlark is actually one of the good things about Bazel. Alternatively, why not Dhall?
Came in here to say that Miles is an absolute fucking unit of an engineer / thinker. We just adopted Moonrepo at Gallery and it's been excellent. He's felt the pain of other tools (bazel, nx, turborepo). So happy to see this launch on the front page
My advice to anyone making a new build system:

Most likely you did this because you felt all the other ones are too complicated.

But the reason the “enterprise” ones are so complicated is to serve their enterprise customers, who need “just this one feature” so they can use it. But those customers pay the bills.

So basically you have to choose complexity and profit or simplicity and less (or no) profit.

Good luck! But make sure you’re ready to make that choice.

Are you sure you can't have your cake and eat it too? You can have many configuration options, but give each one a sane default.
In my experience (Bazel, sample size of 2 projects), the complexity doesn't come from configuration options that have defaults, but from how well the "mental model" of the build system fits the preconceived notions of how to structure, organize, and depend on code in an existing project.

Almost none of the complexity comes from what configuration options I've registered ahead of time. It comes almost entirely from, "Well darn, this code depends on this completely unrelated part of the project. I wish it didn't, but now the build tool either needs to sometimes fail to rebuild something correctly, or it needs to build way too much to run quickly."

IMHO, if you're targeting the JavaScript ecosystem, this area is already fairly crowded, with Turborepo, Nx, and various open-source tools (Bazel, Pants, Lerna, etc.) providing varying degrees of functionality and competing in the space.

I'm a tech lead for the web monorepo at Uber. We talked to the Turborepo guy a few years ago, and he admitted that he wasn't sure if it could handle our scale in terms of all the bells and whistles that a repo of our size leverages - and his is one of the more feature packed commercial offerings in this space.

As a random example: we see thousands of commits a week, so invalidating the whole build graph when the lockfile changes is a complete non-starter, yet most turn-key solutions target small teams and are not equipped to cope with this problem. Phantom dependencies[0] are a problem with disjointed build graphs. Etc.

As someone who's evaluated a ton of these systems, I would love to see where your experience in this space is coming from. A new kid in the block will have a lot to prove.

[0] https://rushjs.io/pages/advanced/phantom_deps/

We agree. We weren't happy with all of the current solutions, at least in the JavaScript space.

In regards to build graph invalidation, we attempt to hash pieces at a granular level. This includes per-file content hashing, and for dependencies (those in `package.json`), we parse the lockfile and extract the resolved version/integrity hashes. We can probably improve this further, but it has been working great so far.
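
As a rough sketch of that lockfile idea (illustrative only, and assuming an npm `package-lock.json` v2+ layout; pnpm and Yarn lockfiles differ), only the resolved version/integrity of the declared dependencies feeds the hash, so unrelated lockfile churn doesn't invalidate everything:

    import hashlib
    import json

    def dependency_fingerprint(package_json="package.json",
                               lockfile="package-lock.json"):
        # Hash only the resolved version/integrity of declared dependencies.
        deps = json.load(open(package_json)).get("dependencies", {})
        packages = json.load(open(lockfile)).get("packages", {})
        digest = hashlib.sha256()
        for name in sorted(deps):
            entry = packages.get("node_modules/" + name, {})
            digest.update(name.encode())
            digest.update(entry.get("version", "").encode())
            digest.update(entry.get("integrity", "").encode())
        return digest.hexdigest()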

As for phantom dependencies, this isn't something moon solves directly. We encourage users to use pnpm or yarn 3, where these phantom problems are less likely. We don't believe moon should solve this, and instead, the package managers should.

If you have any other questions or concerns, would love to hear them.

I guess I'll echo some of the other comments re: extensibility. For us, Bazel's Starlark language is useful because it allows us to integrate w/ non-JS things like protobuf schema definition files. I suspect most companies are going to have hybrid stacks instead of Node.js-only ones. FWIW, Bazel vs Buck have some history wrt adoption wars, and Bazel largely pulled ahead due to a better ability to cultivate an ecosystem (hell, the entire JS ecosystem around it is done via the extensibility system). It very much reminds me of how TypeScript won against Flow.
> We don't believe moon should solve this, and instead, the package managers should.

My IDE solves this in that I see red underlines, but I also want my build to fail as well. Not everyone uses an IDE that would show this error.

Turborepo author here…

We do not invalidate the whole graph anymore for a lockfile change. We now have a sophisticated parser that can calculate if a change within the lockfile should actually alter the hash of a given target. In addition to higher cache hit rates, this is what powers our `turbo prune` command which allows teams to create slices of their monorepo for a target and its dependencies…useful for those building in Docker.

Prune docs: https://turbo.build/repo/docs/reference/command-line-referen...

Turborepo is much more scalable now than when we spoke pre-Vercel acquisition. It now powers the core web codebases at Netflix, Snap, Disney Streaming, Hearst, Plex and thousands of other high-performance teams. You can see a full list of users here: https://turbo.build/showcase

Would be happy to reconnect about Uber’s web monorepo sometime.

Can Prune be used to build a bundle (as in a zip) for, say, AWS Lambda, which includes only the dependencies (and not dev dependencies)? I've played around with pnpm's deploy but it felt a bit lackluster. Especially talking about a situation where one has a backend package and some shared package. The bundle should contain all dependencies (but not dev dependencies) of the backend package and the shared package, and of course the built shared package should also be included in the bundle's node_modules.
Kind of. Poster asked if you can prune only deps and exclude dev deps. That's currently unsupported: https://github.com/vercel/turbo/issues/1100
Nx is hot garbage imo. Buggy, overly verbose, inflexible, poorly documented, and I'm dubious about their peer review process.

If I don't need build caching I'm not using any tool but PNPM and its workspace toolset - that's literally all most people need for a monorepo. I've looked into Turborepo, and its simplicity versus Nx is a strength. However, it's not the taskrunner that I want.

I now work in a monorepo where build caching is required, so I'm excited about moon and keeping a watchful eye on the project's progress. From my evaluation so far, it fixes all of my gripes about Nx, and I'm keen on it not trying to do too much while allowing me to make it as flexible and extensible as I need. Extending configs is chef's kiss.

That is really interesting.

Nx has been on my "to learn" list for a long time, and high enough that I have given it a few stabs with personal projects, but something has failed every single time. I chalked it up to my setup or my inexperience, but maybe it just isn't as good as I thought.

If you end up using moon, and there's anything abrasive/convoluted, let us know! We're always looking for ways to streamline.
Nx is a pita (imo). I have had amazing success with rush and pnpm.
You're building something like Buck/Blaze and replacing Starlark with YAML? I want a full programming language when defining complex rules. Build systems do a lot more than execute commands.

You lost me there. Buck/Blaze aren't perfect but configuration was never an issue.

As it currently stands, Moon is a glorified task runner.

We started using https://nuke.build/. Early days so can’t comment too much but it seems good so far.
We've started using Rush [1] at Buffer. Teams have been slowly migrating, and it has been great so far. We had (too many) repos with their own workflows and way too many different ways to build services, which has been annoying to maintain. I know teams use Rush at scale: TikTok has ~450 projects, and Microsoft said they have ~700 projects.

To deploy the services locally, we use Tilt[2] (K8s for local). We want to be able to reproduce production as much as possible and remove developer overhead on how things work locally and in production.

Then come the issues with Docker and large node code bases:

One challenge with large monorepos is the huge node_modules folder (Rush, among other tools, puts packages into a single large node_modules folder and symlinks every dependency there; it can contain millions of files and GBs of data, depending on how many 3rd-party libraries you use). On Linux, you can mount it without issues, but Mac has performance issues[3] with large folders.

We pre-build that huge node_modules into a "base" image, and each service in the monorepo pulls that base image and only mounts what's necessary (a few MB). So we can save that npm build time and don't need to copy all those files inside the containers. This is fine because package.json does not change that often. You need to do this pre-build locally, then in your CI/CD -> Docker Hub, so everyone can get it.

Another challenge is that you need to "watch" files to rebuild them. Watching all your files inside the monorepo isn't really viable. We use a dependency graph to know what services to watch and then copy the built files inside the Tilt containers.

Hope this can help people.

[1] https://rushjs.io

[2] https://tilt.dev/

[3] https://github.com/docker/roadmap/issues/7

What's the motivation for using YAML instead of Starlark (Bazel, Buck) or something closer to Python (Pants, please.build)? Seems as though much of the other monorepo tools have (kinda) standardized on this.

As I understand it, the primary reason these build systems leverage these Python-variants is so that the build rules, toolchains, constraints, and build definitions can all be written in the same language (since build rules often require some programmatic behavior). Perhaps with a future vision of them being totally interoperable across build systems.

Not to mention using an actual language aids readability, extensibility, and static analysis, unlike a data exchange format. Starlark is a benefit, not a con, so YAML feels like a major step backwards.

I'm generally happy with Blaze/Bazel, so I'm not necessarily in the target market for Moonrepo, I guess.

EDIT: This isn't really competing with Blaze/Bazel either when I look at the execution model. It goes back to imperatively defined tasks instead of declaratively defined dependencies, which feels more spiritually aligned with Make than Blaze/Bazel.

> Not to mention using an actual language aids readability, extensibility, and static analysis, unlike a data exchange format.

Clearly, you've never used Gradle.

Yeah that's fair feedback. At this time, we consider ourselves more of a Bazel-lite than an actual Bazel replacement. Once we support more languages and features, this may change in the future.
I'm not trying to be overly critical here, in fact, I want to share the type of empathetic feedback I'd hope to receive as a founder.

Have you actually talked to Blaze/Bazel users to understand what frustrations (if any) they have with their current build system? Have these users asked for a Bazel-lite? If so, and you still want to position yourself as Bazel-lite, then you should include some of their direct feedback and write your messaging accordingly.

As a daily Blaze/Bazel user, I don't have a desire for a Bazel-lite. I've worked at a midsize company that used Bazel, and I'm working at the company that created Blaze/Bazel.

Disclaimer: Opinions expressed are my own; not representative of my employer.

Let me give them feedback in the vein of what you're asking for. I want the following as a user:

- Distributed cache of built artifacts. Work with my company's network topology and security requirements. Make this easy to setup, operate, and seamless for developers to opt into

- Seamless monorepo support (multiple languages)

That could be considered Bazel-lite. And yes, I'd take it over Bazel currently.

If you don't work at Google, convincing your engineering organization into learning Bazel is almost always a non-starter. Who uses Bazel in the wild? Xooglers primarily.

Their value proposition is sound.

No one has, to my knowledge, argued that Bazel isn't useful or worthwhile. My point was adopting Bazel is costly for an engineering organization, and you may not need all of its features. Bazel-lite may be enough.

Large companies, with established platform / ops teams, who can support Bazel and push for its adoption within an engineering organization definitely exist. There's probably a few Xooglers working there who'd like to see it happen too.

> If you don't work at Google, convincing your engineering organization into learning Bazel is almost always a non-starter. Who uses Bazel in the wild? Xooglers primarily.

My response shows these hypotheses to be false. You’re correct, no one’s arguing whether Bazel is useful or worthwhile, not even I.

> Large companies […] who can support Bazel within an engineering organization definitely exist.

Ok great, we agree. You’ve revised your point from earlier.

> There’s probably a few Xooglers […]

A baseless hypothesis that’s confirmed false by all the references shared with you. Feel free to plug all the lead engineers’ names into LinkedIn to verify that the majority are not ex-Google.

> Bazel-lite may be enough.

Sure, it may be. That’s why I’m offering the constructive feedback to go talk to the thousands of active Bazel users to see whether and what they want from a Bazel-lite, if that’s the positioning Moonrepo wants to choose.

Usually, there’s a strong rationale and need behind an organization’s adoption of Bazel. Some of that rationale is captured in the references shared earlier. These organizations need Bazel; they’re generally happier with Bazel than their previous systems. These baseline needs represent table stakes for any build system that wants to compete with Bazel.

If Moonrepo wants to compete where Bazel is weak, then I’m suggesting that they need to sharpen their communication such that an engineer well-versed in Bazel has an “aha!” moment within 2 minutes. The proverbial elevator pitch.

Once again, I’m not dissuading Moonrepo from pursuing their vision. I’m offering constructive feedback on their messaging/positioning — the type of feedback I’d like to hear as a founder.

... you seem highly invested in this so I'll humor you.

Even in my original reply:

> If you don't work at Google, convincing your engineering organization into learning Bazel is almost always a non-starter. Who uses Bazel in the wild? Xooglers primarily.

Emphasis on almost since you've missed it three replies now.

What are you even getting at here? I freely admit that some companies adopt Bazel. Many do not. You doing a Bing search to find the ones that do doesn't change that fact.

> Ok great, we agree. You’ve revised your point from earlier.

I haven't revised anything. You misread, but feel free to be needlessly hostile defending a product your current employer makes against a newly launched competitor. I use the word competitor loosely. It's not a good look either way.

> That’s why I’m offering the constructive feedback to go talk to the thousands of active Bazel users to see whether and what they want from a Bazel-lite, if that’s the positioning Moonrepo wants to choose.

Why would they need to talk to people who are actively using Bazel to see if they want Bazel-lite? It's a small population that's already, presumably, well served.

There's a whole other, much, much larger segment of customers who'd love to use Bazel-lite that aren't using Bazel, for many reasons, including the high cost of adopting and supporting it. I'm one of those customers. Unsure why this point escapes you.

Your advice, by the way, comes off as attacking their idea and trying to tear it down, regardless of how many times you preface it as constructive criticism.

A lot of our decisions were based on our past experience with Bazel. We both have worked at companies that tried to use Bazel and failed miserably. There's stuff we like about Bazel (and copied in some capacity, like file groups), and definitely parts we hate about Bazel.

Bazel doesn't work well for every language, but for the languages where it does, it definitely makes sense to use Bazel. For languages that work better with something more lightweight, that's where moon comes in. I'm assuming you work at Google, so your experience with Bazel is probably much better than that of those who don't work at Google.

I appreciate all the feedback and comments though, very much appreciated.

I see that you are targeting the Web ecosystem. IMO using Bazel in this case was pretty tough, and may still be. The way Bazel does things may not make sense for 95% of JS projects (and 99.9% of FOSS projects), so I wonder whether it's better to simply drop the mention of Bazel. Just don't compare to Bazel.

Better “monorepo-ish” (whatever that means in the frontend world) tools are still valuable, but as GP said, it's pretty crowded here.

I don't know if you have talked with people working on C++ or Java projects (I see that these are not among your supported languages), but if you do, you get to see why people are talking about Bazel.

Yeah, we kind of regret mentioning Bazel in the original post, but we can't change it now.
I think it's more illuminating to talk to people who tried to use Bazel and failed, often wasting months in the process.
We chose YAML for a few reasons. The first is that we wanted a format that is basically universally supported. This filtered the options down to JSON, TOML, and YAML. JSON is an awful configuration format, so that was a no-go. TOML is pretty great, but it's also not very ubiquitous. That left us with YAML.

The second reason is that we wanted something language-agnostic and not proprietary. It also helps that many other tools, like GitHub Actions, are configured in YAML.

And lastly, we wanted a format that developers are familiar with, and won't need to spend time learning. Requiring developers to understand a special syntax simply to define a task to run is something we want to avoid.

IDK. I've been using buck for a bit, and having python within grasp is pretty useful. I don't like yaml for a variety of reasons, but the biggest of which is I like my config being able to be generated... maybe that's an anti pattern? IDK. Would love to know more about your take on yaml vs anything more pythoney.
I'm not very familiar with Buck. But I'm assuming it works the same way as Bazel's Starlark?

I can see the benefits of "generating code" within the BUILD file, but honestly, we haven't required it yet, and have been able to do most things with explicit configuration. Our token syntax helps with some dynamic aspects of this.

Maybe in the future we'll revisit this.

Sorry, I meant more dynamic than generatable... like doing stuff in the build file to build something complex in a more simple way.

It's really 'do you make config an artifact, or do you make it code' sort of argument.

Gotcha, understood. Mirroring @Denzel's edit, I think that rationale makes sense for configuring some of the intermediate glue which is project-specific. For instance, running test suites, orchestrating deployment, and other more custom actions which aren't terribly reusable across projects (think Bazel's 'genrule'). Caching test results in a distributed way, and environment isolation (i.e. without forcing the user to define a Dockerfile or similar, leaving environment configuration to the build system), also seem like great use cases.

However, I think for describing actual build toolchains, this way of implementing things might end up being significantly more complicated (and impose even more arcane end-user cognitive load) than the Starlark/Python-based build rule / toolchain / constraint approach for actually assembling libraries, binaries, and other compiled artifacts. There are a combinatorial number of backends, conditional option settings, etc. which will be hard to capture with a purely declarative system. For instance, a C++ binary might need platform-specific compiler flags, or some #ifdef nonsense. YAML doesn't have a clean way of implementing conditionals based on some constraint. So for heterogeneous ecosystems (C/C++, Python, container assembly, GPU development, microcontrollers, etc.) this pushes a ton of complexity into the build rules themselves, which may be opaque to end users (and thus introduce a ton of cognitive load, steep learning curves, hard-to-debug errors, etc.).
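
For reference, the conditional case above is the kind of thing Bazel expresses with `select()`; the condition labels below are illustrative:

    cc_binary(
        name = "my_bin",
        srcs = ["main.cc"],
        # Pick compiler flags based on the target platform at analysis time.
        copts = select({
            "@platforms//os:windows": ["/DUSE_WIN_API"],
            "//conditions:default": ["-DUSE_POSIX_API"],
        }),
    )

A purely declarative YAML config has no direct equivalent, so the same decision tends to end up in a templating layer or in per-platform config files.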

I don't disagree here. At the moment, moon leans closer to a task runner, with hashing, incremental caching, and other features tacked on. None of our currently supported languages are compiled languages; they all run with pre-built binaries.

Once we start supporting more languages, especially compiled ones (probably Rust first), our decision around YAML will probably change. Off the top of my head, a non-YAML format would probably be used as a secondary format, kind of like how a Dockerfile works.

> The second reason is we wanted something language agnostic and not proprietary.

You will end up developing your own yaml-based DSL that is incompatible with everything else, has weird limitations and poorly documented constraints... just like any other YAML-based system.

And by the time you'll have written a 1000-line YAML (where half of the file is nothing more than awkwardly quoted and escaped bash scripts) and realise that YAML cannot be split into reusable files, it will already be too late.

Oh. I just saw that you already use a custom non-standard extension of YAML with `extends` so there you are

I much prefer YAML rather than having to write more code for a build system.
A few months ago I tried moonrepo and couldn't get it to work. IIRC, it tries to bring its own Node.js and has no option for using the system-provided one - this breaks entirely on NixOS, as Nix's Node.js is patched to account for the lack of FHS.
How do you compare with BuildBuddy? BuildBuddy embraces Bazel, but makes it easier to setup and operate Bazel when you don't have Google infrastructure.

- https://www.buildbuddy.io/

- https://github.com/buildbuddy-io/buildbuddy

We have a long ways to go, but our end goal for moonbase is basically buildbuddy, but for non-bazel users. Right now moonbase requires moon, but we're looking to decouple it.
Congrats on the launch! I've been following Moon for a few months; it seems like an interesting project.

Could you explain why any existing project using Turborepo/Nx should switch to Moonrepo? What are the advantages and disadvantages? The support for multiple languages seems like a big advantage.

I can speak to both of these.

I'll start with Turborepo. Turbo is primarily a task runner for `package.json` scripts with some caching... and that's basically it. If that's all you need, then great, but if you're looking for more functionality, that's where moon comes in. moon is more than just a task runner, we're aiming to be a repository management tool as a whole. This includes project/code ownership, direct CI support, future CD support, code generation, hooks management, constraints, release workflows, and much more. With that being said, we do have a comparison article against Turbo: https://moonrepo.dev/docs/comparison#turborepo

As for Nx, they're more of a competitor than Turborepo. Nx and moon are aiming to solve the same problems, but go about it in different ways. Nx is Node.js based and requires heavy adoption of their ecosystem (@nrwl packages) and their executors pattern. In the long run, this becomes a heavy source of tech debt, as your dependencies are now tightly coupled to their packages and release timelines. With moon, we wanted to avoid this altogether. There are no coupled dependencies, and tasks are run as if you ran them yourself on the command line. No abstraction layer necessary. We also want to embrace a language's ecosystem as much as possible, so moon adoption should be rather simple and transparent (at most each project has a moon.yml file).

But to your last point, we agree, multi-language support is a massive advantage. Having both backend and frontend code in the same repository, powered by the same build system, is a massive win in maintenance costs and developer time saved.

Thanks for your detailed answer.

> release workflows

Looking forward for this, especially if that also means auto-publishing of NPM packages, Rust crates, etc.

Yup, exactly that! We want a single tool to handle version bumping, changelog generation, publishing to a registry, etc., for _all_ languages that we support.
It's built in Rust and does not have full Rust language support? ... how?
A home page that literally just has the company name, "log in with GitHub", and "learn more about company name" doesn't seem like a very convincing pitch.

I'm not going to click "learn more" because you've given me absolutely no reason to. The page may as well just be "log in with github" and nothing else. That'd serve the same functionality, and have an equally convincing pitch to prospective customers.

Awesome, congrats. I've been an early trier of moon repo and I really fell for the slickness of the website and the name, when evaluating build tools.

I've primarily worked in typescript codebases and have used raw yarn workspaces, lerna, nx and recently evaluated moon and turbo.

The funky part is I eventually simply went with Nx, not just because I've used it before, but also because I felt like the configuration is just simpler and more lightweight. Esp. since you can pretty much roll with it without defining much more than a single simple config file - while moon required some 2-3 separate config files, plus config files in each project (I understand the per-project config is not required; I don't remember why I needed it - something to do with my build tasks).

In any case I intend to give it another shot soon.

As for the whole config in yaml vs json vs toml or whatever, not a deal breaker since most of the time these sorts of configs are perhaps not something you end up interacting with programmatically - I know a lot of people enjoy editing yaml more than json files

I burned a few months evaluating and trying out some monorepo options for a cross-platform TypeScript project. Eventually I picked Nx, the main reason being having a single package.json.

Besides, yes, nx comes with a single configuration file for each project, but alongside of it are jest, eslint, babel and app/lib/spec/ide tsconfig configs - that’s a lot!

All of this shouldn’t be visible to the developer. Initially, when trying to find myself in this mess, I thought that the solution lies in autogenerated ide workspaces - for vscode and sublime. But in practice it wasn’t that helpful, because there is always something which is not handled by autogenerate multiroot structure but is needed, so one needs to have multiple windows open anyways.

Hopefully typescript 5.0 is going to help reduce some of this boilerplate with multiple tsconfig base classes, so lib/app/spec/ide tsconfigs will be able to extend from common base, which currently is not possible.

The worst part of Nx is the lack of SSR support, so I had to patch Nx with patch-package to generate obfuscated CSS classes in prod, because currently it's not possible with their webpack executor! Recently they tried to simplify it a little bit, reducing this webpack boilerplate, which introduced a few bugs, but it's better than not doing anything!

However, having said all of that, and as someone mentioned in previous comments, DX is not primarily improved by a lack of config boilerplate, but by having a semantic structure for the actual code: clearly knowing what depends on what, and where one can find it and put it. Right now we are on our own, because there is a lack of information on how to structure a cross-platform codebase properly. I strongly believe that will change though. Best of luck to the people at moon!

Ps. Nx has a nice blog post on why they don't use Bazel under the hood: https://blog.nrwl.io/on-bazel-support-6be3b3ceba29

Thanks for the feedback. Too much configuration is something we keep thinking about, and are working to streamline. It's a bit involved since we support multiple languages.
I'm supremely disappointed to see another service using YAML to configure task running. I do in fact need a real programming language to do this, and copying others in this vein is inheriting mistakes and not picking a battle-tested solution.

What you will find is the vast majority of your configurations will invoke a make.sh script that does everything that you want to support in your system.

Yeah, I'm kind of aligned with this. GitHub workflow definitions would be much more pleasant to write if they were plain TypeScript.

Feel like any data definition language eventually bolts on a shitty form of templates.

And doesn't any imperative configuration atrophy toward spaghetti code? I've tried both, and usually they're as bad as each other on complex projects.
Conversely, if moon required code instead of config, I wouldn't give it a second look. Ecosystem needs vary greatly.
Can you speak to what kind of "functionality" you need a language for? Are you referring to Starlark-like files?
Turing completeness. No, I'm referring to using Python (waf, conan, etc), Javascript (nodejs scripts, esbuild, etc), and other "real" programming languages.

It's incredibly frustrating to have to configure a build with something as shitty as YAML. There's a tangible amount of money I have wasted in organizations on it.

A build is not actually a static configuration of another system. It's a program and deserves everything we need from programming languages.

You actually don't want a real programming language, because unbounded loops and standard libraries are sources of non-determinism that can introduce incorrectness into your build. Bazel's correctness is highly contingent on the same set of inputs producing the same outputs, and granularly tracking dependencies, running commands in sandboxes, AND having a restricted build language are key parts of that.

That said, Starlark is way closer to a real language (Python) than YAML.
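
To make the non-determinism point concrete, here is a trivial Python illustration (general-purpose, not tied to any particular build tool) of how an unrestricted build step defeats content-based caching:

    import datetime
    import hashlib

    def emit_version_header():
        # Embedding "now" makes the output differ on every run, so a
        # content-addressed cache never gets a hit for this step, and
        # everything downstream of the generated header rebuilds too.
        return '#define BUILD_TIME "%s"\n' % datetime.datetime.now().isoformat()

    print(hashlib.sha256(emit_version_header().encode()).hexdigest())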

Unbounded loops have nothing to do with determinism. You can be deterministic and still Turing-complete.
(for context - I'm not interested in first class node support)

This seems pretty cool. I particularly like how 'gradual' it seems to be relative to things like Bazel, i.e. you can take some shell scripts and migrate things over. I did have a play and hit an initial problem around project caching I think, which I raised at [0].

One comment, from the paranoid point of view of someone who has built distributed caching build systems before is that your caching is very pessimistic! I understand why you hash outputs by default (as well as inputs), but I think that will massively reduce hit rate a lot of the time when it may not be necessary? I raised [1].

Edit: for any future readers, I spotted an additional issue around the cache not being pessimistic enough [3]

As an aside, I do wish build systems moved beyond the 'file-based' approach to inputs/outputs to something more abstract/extensible. For example, when creating docker images I'd prefer to define an extension that informs the build system of the docker image hash, rather than create marker files on disk (the same is true of initiating rebuilds on environment variable change, which I see moon has some limited support for). It just feels like language agnostic build systems saw the file-based nature of Make and said 'good enough for us' (honorable mention to Shake, which is an exception [2]).

[0] https://github.com/moonrepo/moon/issues/637

[1] https://github.com/moonrepo/moon/issues/638

[2] https://shakebuild.com/why#expresses-many-types-of-build-rul...

[3] https://github.com/moonrepo/moon/issues/640

Thanks for the feedback and prototyping with it immediately! Always appreciated to get hands on feedback.
Instead of YAML, have you looked into CUE?
Would this be overkill for a personal project that is currently two repos, but I'm having to copy-paste code from one to the other whenever I make a change to a particular section?
Can you talk a bit more about what kind of code is being copied?
The main repository is a single page app that's mostly Typescript compiled via Svelte down to a JavaScript bundle.

Most of that repository is a series of modules in their own directories (e.g., "src/modules/character").

The second repository is a Node.js API. The majority of its functionality comes from one of the modules from the first repository that I copy in its entirety to the other repository whenever I change it.

You should check out git submodules. Basically, you could have your api-repo check out a certain commit from the spa-repo into a folder in the project root and reference the module relative to that path.
Gotcha. While moon doesn't solve this directly, you do have a few options here:

- Use git submodules. Have the node repo have a submodule on the app repo.

- Publish the shared code to a private registry (like github's package registry), and pull it in as an npm package.

Could you add the character module as a new package, then import it in the server with something like “char-package”: “file:../../client/modules/character”?
Apologies if this sounds reductive but can I think of it as an open source CircleCI?

The thing that annoys me most about CircleCI, Travis CI, GitHub Actions, and AppVeyor is that there is no simple way to run the same thing locally to debug or test workflows without creating either a git history mess or temporary hacks like taking off branch restrictions in the YAML.

One of the reasons Bazel needs BUILD files with explicit inputs/outputs defined per file is to do fine-grained incremental builds and test runs. So if I change, say, foo.c, I only need to recompile foo.obj and run ‘foo-tests’. Moon seems to take globs as input. Thus modifying even a single file inside the ‘src’ dir will trigger a rebuild/retest of the entire ‘project’.
Our configuration uses globs, but under the hood we content-hash all the files that match the glob, and only run/cache if the aggregated hash has changed.
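
A rough sketch of that aggregated hashing approach (illustrative, not moon's actual code):

    import hashlib
    from pathlib import Path

    def aggregated_hash(root, patterns):
        # Content-hash every file matched by the task's input globs. If any
        # matched file changes, the digest changes and the task reruns;
        # otherwise cached outputs can be restored.
        digest = hashlib.sha256()
        files = sorted(p for pattern in patterns
                       for p in Path(root).glob(pattern) if p.is_file())
        for path in files:
            digest.update(str(path).encode())
            digest.update(path.read_bytes())
        return digest.hexdigest()

    print(aggregated_hash("src", ["**/*.ts", "**/*.tsx"]))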

For the languages we currently support, this is more than enough. Once we dive deeper into compiled languages (probably starting with Rust), we'll look into more granular reactivity and possibly use something like sccache.

Bazel's glob does the same thing. I'm not sure what the OP means; Bazel's incrementality is aided by fine-grained build input specification, but its incrementality is at its core a combination of deterministic build rules, storage of build rule execution history, and early stopping.
I don't see how that solves the problem mentioned by OP. If the build rule mentions foo.c, we only need to recompile that one object and relink. When you are using globs, changing one file changes the aggregated hash and then would necessitate recompiling every object file.
I reviewed the market for JS monorepo tools a few months back and found Nx to be a strong choice. Looking at Moon, it seems like the syntax is quite nice, but I don't see the graph feature of Nx? Can moon only run on affected packages?
Yes it can! That's pretty much the only way it works. The `moon run` command will only run if affected by changed files, and `moon ci` will only run affected tasks/projects in CI pipelines.
Congrats on launch! Do you have examples of monorepos, for example React with TypeScript and Django with Python?
Thank you! We have an example monorepo (https://github.com/moonrepo/examples) but at this time it's only JavaScript. We're working on adding Go to this repo, but we personally don't have enough Python experience to add Python. Always open to contributions!
Congrats on the launch! Indeed interesting space. I wonder if it has some overlap with https://skaffold.dev/. I would be curious to hear the differences.
At a quick glance, I don't believe there's any overlap.
Why would I want to use this instead of, say, GitLab's CI/CD Pipelines or Azure Pipelines?

It's a fair bit of effort to change one's already established code repository and build system, and I don't really understand from your pitch what you are offering that is sufficiently advantageous for a decision maker to choose to switch.

So moon wouldn't replace your actual CI pipelines. It would be a tool that runs _in your pipeline_ to effectively run tasks as fast as possible.

For example, in CI, tasks are only run if they are affected by files changed in the pull request. No more running everything unnecessarily. We also support remote caching of artifacts, which helps speed up CI even further by avoiding long build times.
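
The general "affected" technique can be sketched like this (illustrative only, not moon's implementation, and the project layout is hypothetical): diff against a base ref, map changed files onto project roots, and only queue tasks for those projects.

    import subprocess

    # Hypothetical project layout: project name -> source root.
    PROJECTS = {"web": "apps/web/", "api": "apps/api/", "ui": "packages/ui/"}

    def changed_files(base="origin/main"):
        out = subprocess.run(
            ["git", "diff", "--name-only", base + "...HEAD"],
            capture_output=True, text=True, check=True,
        )
        return out.stdout.splitlines()

    def affected_projects(files):
        return {name for f in files
                for name, root in PROJECTS.items() if f.startswith(root)}

    print(affected_projects(changed_files()))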

Interesting, so does it keep all the intermediates from previous builds and then does an incremental build on top of this? Like doing a local dev build but in the cloud?
Yeah at a high-level, that's how it works. However, the incremental caching part does require remote caching of artifacts, so that they can be shared across CI jobs/runs.
Thanks for explaining, that does sound compelling, and useful that it can be slotted into an existing system.
We've had a lot of success with Brisk (https://brisktest.com/) doing this exact thing. It keeps our environment running so we don't have to rebuild on every run, kind of like a dev build. It's really fast.
Congrats on the launch! Can you share any information on how the system is currently deployed? Or how it will be deployed for on-premises solutions?

Also: are you looking for any other founders?

We haven't built the on-prem solution yet, but we're leaning towards using Helm charts + Kubernetes. At minimum, it will all be Dockerfile based.
That sounds like a "there are too many competing standards, let's make a standard to unify them!" situation.
Gonna plug our setup which is Justfiles[1] and turborepo.

Just is a task runner (with full shell integration) that calls our turborepo tasks.

We define all of our tasks in one justfile (things like repo setup, syncing env vars, and compiling dependencies) and then link them to turbo processes which cache the result.

Massively reduced our cognitive load working with our monorepo, and is lightning fast.

If we ever want to change it will be simple to remove both, so we're not tied to the ecosystem.

[1]https://github.com/casey/just

How are you going to deal with the mass diaspora when your runway runs short and VCs want to make money?

This seems to be a cyclical thing with build tools. Things are great until you expect customers to pay.

I'm not a lawyer, but given that monorepo was in use in the versioning/build space long before launch, I would have been hesitant to launch with a name where it's going to be an uphill battle to enforce trademark.

(I also worked for Google a while back, and they were very conscious about not using Google as a verb internally, losing a slow battle against the tide of trademark dilution.)

Did you notice that this product is called `moonrepo`, and not `monorepo`?
It took me three readthroughs to notice as well, but the name is actually `moonrepo` not `monorepo`.