Brutalist Builds - Anti-patterns
Uber Builds and In-House Toolkits
Here is my irreverent take on two of the biggest barriers to well-designed, performant, maintainable builds.
- Uber Builds
- In-House Build Utilities
Uber Builds are bloated, hard-to-understand monolithic builds that try to do way too much in one build plan.
In-House Build Utilities are collections of poorly coded, home-grown, opaque build tools, most often compiled executables, that builds (often Uber Builds) depend on.
Uber Builds
Uber builds are long-running builds, often beginning with a huge source checkout. They build many libraries and applications from this single checkout, then run tests with complex setup and teardown, finishing up by creating installation packages such as MSIs and RPMs. They run in an all-or-nothing manner, taking hours, often failing near the end, and requiring the entire build to be restarted.
Understand that some project builds are going to be large and complex to varying degrees. Consider open source projects like OpenSSL or LibreOffice: these require some setup to be built locally, but that setup tends to be well documented, and the builds tend to have well-defined compilation, testing, and packaging stages.
Chances are that your company's applications do not fall into any of these categories.
Here are the main Uber Build characteristics:
- Many projects and dependencies are lumped together within one large solution.
- Dependent libraries that rarely change are often repeatedly built from source as part of a large project source tree.
- Build machines require complex and often unrepeatable setup. Developers get stuck on stale old crap versions of frameworks and libraries because upgrading is so difficult and error-prone.
- Developers cannot build and test the code locally easily, if at all. They must hack their way into a local development setup for testing POCs and new features.
- Long-running builds often fail midway or at the end and need to be re-run, wasting a lot of time.
- In-house utilities are often fused into the build process. These often depend on outdated versions of Java, .NET, or C/C++ runtimes. The originator, who wanted, let's say, Visual C++ on their resume, has long since departed.
Refactoring Large Solutions With Several Projects & Dependencies
This addresses characteristics #1 and #2. Given a large solution that builds many different executables and dependent libraries in one large build, to plan the refactored solution we need to distinguish between two terms:
- Software Packages
- Installation and Deployment Packages
Software packages are language and framework specific. They are consumed by builds as dependencies and generated by builds as dependencies for downstream compilation and deployment-packaging builds.
Examples of software packages are npm packages for NodeJS, gems for Ruby, and NuGet packages for .NET.
Installation packages deal with installing the software on a target system and are often OS-specific. For example, RPMs are used for Red Hat-based Linux distributions, and MSI and NSIS installers for Windows. Deployment packages have a more general scope, including installation packages and also things like container images.
Beware, sometimes the lines blur. Some software packages are also used as deployment packages, such as npm packages for Node applications.
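To make the distinction concrete, here is an illustrative pair of commands; the package names and staging directory are made up:

```bash
# Software package: a versioned, language-specific artifact that other builds
# consume as a dependency (here, a Node library packed into my-lib-1.2.3.tgz).
npm pack

# Installation package: an OS-specific artifact for installing on a target
# system (here, a pre-staged directory tree wrapped into a .deb).
dpkg-deb --build ./pkgroot ./my-app_1.2.3_amd64.deb
```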
Steps to Uber Build Refactorization
Conquering these monster builds includes the following steps, written out below and visually described in Figure 1:
- Break up the solution, refactoring shared libraries into their own builds, and build those first. The input for these builds is the source code from version control and any external dependencies fetched from public and/or private artifact managers. The output is versioned software packages that are published to an artifact manager. (A rough sketch of these decoupled builds follows this list.)
- The applications are built from their source, but the common libraries are no longer rebuilt from source. Instead, the packages from step 1 are fetched from the artifact manager. The input for these builds includes versioned source code for the application executables, any external package dependencies, and the packages created in the upstream builds. Similar to step 1, the outputs of this step are software packages that get uploaded to an artifact manager. These packages include the applications and their runtime dependency package references.
- Functional and end-to-end testing builds now only need to retrieve the application packages and their runtime dependencies to run tests. Test results can be packaged in whatever format makes sense for archival and later review.
- The builds that create installer packages for installing the apps on target systems download the application packages and their runtime dependencies just like the test builds. No application source code is compiled; the apps are downloaded as software packages, then extracted and bundled into something like a .deb or .msi. The inputs for these builds are the application software packages, their runtime dependencies, plus any scripts and project files for building the package. The outputs are the installation packages themselves.
No longer do we have an uber build that downloads all the application and common library source code, builds the applications and common libraries, stages them for functional and end-to-end tests, and then packages everything into installers in one grand build.
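Here is a rough sketch of what steps 1, 2, and 4 might look like as separate small builds. It assumes a .NET shop with a private NuGet feed; the project names, versions, feed URL, and staging paths are all made up, so treat it as an outline rather than a drop-in pipeline:

```bash
#!/usr/bin/env bash
set -euo pipefail

FEED="https://nuget.example.com/v3/index.json"   # hypothetical private feed

# --- Build 1: shared library -------------------------------------------------
# Input: library source + external dependencies. Output: a versioned software
# package pushed to the artifact manager, so downstream builds never compile it.
dotnet restore CommonLib/CommonLib.csproj
dotnet pack CommonLib/CommonLib.csproj -c Release -p:PackageVersion=2.4.1 -o ./artifacts
dotnet nuget push ./artifacts/CommonLib.2.4.1.nupkg --source "$FEED" --api-key "$NUGET_API_KEY"

# --- Build 2: application ----------------------------------------------------
# CommonLib is now an ordinary PackageReference restored from the feed, not a
# project rebuilt from source. The published output is archived and uploaded
# as the application's software package.
dotnet restore MyApp/MyApp.csproj
dotnet publish MyApp/MyApp.csproj -c Release -o ./publish
tar -czf MyApp-5.0.0.tar.gz -C ./publish .
# upload MyApp-5.0.0.tar.gz to the artifact manager (generic repo, feed, etc.)

# --- Build 3: tests ----------------------------------------------------------
# Downloads MyApp-5.0.0.tar.gz plus runtime dependencies, deploys to a test
# environment, and runs the functional / end-to-end suites. No source compile.

# --- Build 4: installer ------------------------------------------------------
# Downloads the same application package, stages the files under ./staging
# with the metadata an OS package needs, and wraps it in an installer (.deb here).
dpkg-deb --build ./staging ./myapp_5.0.0_amd64.deb
```

The point is that each build is small, independently re-runnable, and fails on its own stage instead of taking hours of upstream work down with it.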
Figure 1 – Example Build Steps
In-House Utilities
These do things like:
- Manage and define complex build and release workflows requiring non-intuitive human interaction. The utility was added to the build pipeline years ago, but nobody knows for sure what it does, and the code is either lost or too complex to untangle and figure out in a reasonable amount of time.
- Generate complex configuration and other types of files in builds. Often these utilities are written in compiled languages such as C++ or Java, or are big-ball-of-mud scripts too large and spaghetti-coded to understand.
- Manage task scheduling and triggering.
- Use complex imperative logic to create installation packages such as Windows Installer MSIs and RHEL RPMs programmatically.
In the build and devops infrastructure they are often hybrids of the "Lava Flow", "God Object", and "Big Ball of Mud" anti-patterns fused together into steampunk and stovepipe build pipeline madness.
Motivational Factors
Here are some factors that nudge developers into creating in-house build utilities:
- A build developer wants something like C#, C++, etc. on their resume. Rather than using something like GNU Make, Bash, or PowerShell, they bake build logic into an opaque executable that gets called from a build script.
- Similar to 1), people abuse plugin-friendly build systems like Ant, MSBuild, and Gradle, creating plugins just to get personal experience in stuff like Java and C#.
- Arcane, complex source control tools like ClearCase, older versions of TFS, PVCS, etc. are in use. Nasty logic and tedious CMS tool interaction steps get encapsulated in an in-house utility just to do a checkout on the build server.
- People don't want to learn PowerShell, Bash, GNU Make, sed, or awk well enough to implement moderately complex logic. They end up using something they know, like C#, to build XML files for a Python project they are building.
- People don't learn their build ecosystem well enough. .NET? OK, you've got some decent templating engines you can use along with C# scripting, so no more MSBuild task DLLs please. Node? Good grief, plenty of stuff there.
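As a minimal sketch of the alternative to a compiled "config generator" utility, here is the kind of file generation that plain shell handles fine; the file name, variables, and values are made up for illustration:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Generate an environment-specific config file from shell variables instead of
# calling an opaque in-house executable. A here-document (or envsubst against a
# template file) covers most of what these "config generator" utilities do.
APP_ENV="${APP_ENV:-staging}"
DB_HOST="${DB_HOST:-db.staging.example.com}"

cat > appsettings.xml <<EOF
<configuration>
  <environment>${APP_ENV}</environment>
  <database host="${DB_HOST}" port="5432" />
</configuration>
EOF
```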