Devops

Apr 7, 2017

Your CI build process should be in your code repository

It's always been clear to developers that a project's source code and how to build that source code are inextricably linked. After all, we've been including Makefiles (and, more recently, declarative build specifications like pom.xml for Maven and stack.yaml for Haskell Stack) with our source code since time immemorial (well, 1976).


What has been less clear is that the build process and environment are also important, especially when using Continuous Integration. First-generation CI systems such as Jenkins CI and Bamboo have you use a web-based UI in order to set up the build variables, jobs, stages, and triggers. You set up an agent/slave running on a machine that has tools, system libraries, and other prerequisites installed (or just run the builds on the same machine as the CI system). When a new version of your software needs a change to its build pipeline, you log into the web UI and make that change. If you need a newer version of a system library or tool for your build, you SSH into the build agent and make the changes on the command line (or maybe you use a configuration management tool such as Chef or Ansible to help).


New-generation CI systems such as Travis CI, AppVeyor, and GitLab's integrated CI instead have you put as much of this information as possible in a file inside the repository (e.g. named .travis.yml or .gitlab-ci.yml). With the advent of convenient cloud VMs and containerization using Docker, each build job can specify a full OS image with the required prerequisites, and the services it needs to have running. Jobs are fully isolated from each other. When a new version of your software needs a change to its build pipeline, that change is made right alongside the source code. If you need a newer version of a system library or tool for your build, you just push a new Docker image and instruct the build to use it.
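
As a concrete illustration, here is a minimal sketch of what such an in-repo configuration can look like in a .gitlab-ci.yml (the image name is hypothetical; any image with your build prerequisites would do):

    # .gitlab-ci.yml -- a minimal sketch; the image name is hypothetical
    image: registry.example.com/myproject/build-env:1.0  # OS image with the build prerequisites

    services:
      - postgres:9.6  # a service the jobs need running

    stages:
      - build
      - test

    build:
      stage: build
      script:
        - make

    test:
      stage: test
      script:
        - make test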


Tracking the build pipeline and environment alongside the source code rather than through a web UI has a number of advantages:

  • If a new version of your software needs a different build pipeline, or needs a newer system tool installed in order to build, that change is right alongside the new version of the source code.


  • You can make changes to the build pipeline and environment in a feature branch, test them there without risk of interfering with other branches, and then a simple merge applies those changes to the master branch. No need to carefully coordinate merge timing with making manual changes to the build pipeline and environment.


  • Building old versions of your project "just works" without needing to worry that the changes you made to support a newer version break building the old version, since the old version will build using the pipeline and environment that was in place at the time it was tagged.


  • Developers can manage the build pipeline and environment themselves, rather than having to ask a CI administrator to make changes for them.


  • There is no need to manage what is installed on build agent machines except for the bare minimum to support the agent itself, since the vast majority of build requirements are included in an image instead of directly on the machine.


  • Developers can more easily build and test locally in an environment identical to that used by CI, by reusing images and build scripts (although "inlined" scripts in a .gitlab-ci.yml or .travis.yml are not usable directly on a developer workstation, so the CI metadata should reference scripts stored elsewhere in the repo; a sketch of this pattern follows this list). However, for thorough testing (rather than rapid iteration) in a real environment, developers will often prefer to just push to a branch and let the CI system take care of building and deploying instead (especially using review apps).


  • A history of changes to the CI configuration is retained, and it's easy to revert bad changes. Web-based UIs may keep an audit log, but this is harder to deal with than a Git commit history of a text file.


  • You can include comments with the CI configuration, which web-based UIs usually don't have room for aside from perhaps a "description" textarea which is nowhere near the actual bit of configuration that the comment applies to.


  • Making changes to machine-readable text files is less error-prone than clicking around a web UI.

  • It's easier to move your project to a different CI server, since most of the CI configuration is in the code repo and does not need to be re-created on the new server. With a web UI, you end up spending a lot of time clicking around trying to make the new server's job configuration look the same as the old server's, copying-and-pasting various bits of text, and it's easy to miss something.
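
To make the local-reuse point above concrete, here is a sketch of a thin CI job that delegates all of the real build logic to a script stored in the repo (the image name and the ci/build.sh path are hypothetical):

    # .gitlab-ci.yml -- sketch of a job that delegates to a script in the repo
    # (the image name and ci/build.sh are hypothetical)
    image: registry.example.com/myproject/build-env:1.0

    build:
      script:
        # Because the logic lives in ci/build.sh rather than being inlined
        # here, a developer can reproduce this job locally with something like:
        #   docker run --rm -v "$PWD":/src -w /src \
        #     registry.example.com/myproject/build-env:1.0 ./ci/build.sh
        - ./ci/build.sh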


There are also potential pitfalls to be aware of:

  • You need to consider whether any aspects of the build pipeline may change "externally" and should therefore not be inside the repo itself. For example, you might decide to migrate your Docker images to a different registry, in which case having the Docker registry's hostname hard-coded in your repo would make it complicated to build older versions. It is best to use your CI system's facility for passing variables to a build for any such external values (a sketch follows this list).


  • Similarly, credentials should never be hard-coded in your repo, and should always be passed in as variables from your CI system.
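
As a sketch of both pitfalls, the registry hostname and any credentials can be supplied as variables configured on the CI server rather than committed to the repo (DOCKER_REGISTRY and DEPLOY_TOKEN are hypothetical variable names, and ci/deploy.sh is a hypothetical script):

    # .gitlab-ci.yml -- sketch of reading external values from CI variables
    # (DOCKER_REGISTRY and DEPLOY_TOKEN are hypothetical names, set in the
    # CI server's settings, not in the repository)
    image: ${DOCKER_REGISTRY}/myproject/build-env:1.0

    deploy:
      script:
        # The credential comes from the CI system's variable facility,
        # never from a file checked into the repo.
        - ./ci/deploy.sh --registry "$DOCKER_REGISTRY" --token "$DEPLOY_TOKEN"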


Of course, nothing described here is entirely new. You could be judicious about having a first-generation CI system only make very simple call-outs to scripts in your repo, and those scripts could use VMs or chrooted environments themselves. In fact, these have long been considered best practices. Jenkins has plug-ins to integrate with any VM or containerization environment you can think of, as well as plug-ins to support in-repo pipelines. The difference is that the newer generation of CI systems make this way of operating the default rather than something you have to do extra work to achieve (albeit with a loss of some of the flexibility of the first-generation tools).


CI has always been an important part of the FP Complete development and DevOps arsenal, and these principles are at the core of our approach regardless of which CI system is being used. We have considerable experience converting existing CI pipelines to these principles in both first-generation and newer-generation CI systems, and we offer consulting and training.