iOS Continuous Integration: Uploading your Pipeline

Modern software development involves quite a bit more than classes and functions. To provide quality assurance, code should be tested, which is often a repetitive task and well suited to automation. Given a test suite, it should be run every time before code is merged into a development branch, to ensure that integrating the new code is likely to be safe. This is known as continuous integration, or CI, and it requires tooling to be effective, especially with large teams. In this post, I am going to discuss some aspects of that tooling which are particular to iOS development, and I will gloss over the more fundamental aspects of CI.

Using a CI service

For CI to work, something has to monitor the source code repository. In the old days, this would be a Mac running Jenkins, which would check the repository periodically and run a script if there were changes. Obviously, the ability to run the test suite on the command line is a prerequisite.
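As a rough illustration of that prerequisite, here is a minimal sketch of a shell function wrapping xcodebuild. The project name, scheme and simulator in the usage note are placeholders rather than anything from a real project, and the DRY_RUN switch is just a convenience I have assumed so the command can be inspected on a machine without Xcode.

```shell
#!/bin/bash

# Run (or, with DRY_RUN=1, just print) an xcodebuild test invocation.
# All three arguments are supplied by the caller; nothing here is
# specific to any one project.
run_tests() {
  local project="$1" scheme="$2" destination="$3"
  local -a cmd
  cmd=(xcodebuild test -project "$project" -scheme "$scheme" -destination "$destination")

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of executing it.
    printf '%s\n' "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}
```

A call might look like: run_tests MyApp.xcodeproj MyApp 'platform=iOS Simulator,name=iPhone 6'. The function returns xcodebuild's exit status, which is what a CI system uses to mark the step as passed or failed.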

If the source code is hosted by a service which supports hooks, such as GitHub, then the service can signal when source code changes are pushed. These can be new commits on existing branches, or whole branches being published. This avoids the inefficiency of polling for changes. Jenkins can also work in this manner.

However, using Jenkins requires maintenance (time, money, etc.) and a spare Mac. If your time and sanity are precious to you, there are hosted CI services, such as Travis CI, to use instead. These services will execute the test script on a machine in the cloud, automatically for you! Typically these machines are called agents, and a key feature of "the cloud" is scaling to your needs. Therefore, if you have a big team with multiple branches in development, or multiple projects, it helps to have more agents available to run builds. It is surprising how much frustration and stress waiting for CI can cause, and how much it disrupts progress - especially if your team performs manual QA on each branch (which requires binaries to be built and distributed to team members).

In some cases using a hosted CI service can be a problem. Obviously, security might be an issue. Typically, the hosted machines will be virtualized and this can be slow. Lastly, waiting for hosted CI services to install the correct tooling can be a problem, especially if it is necessary to run builds on beta SDKs. For these reasons, I prefer to manage my own build machines.

Buildkite is an automation service which excels at automatically distributing jobs, triggered via source code changes, across agents. When a change is detected, a build is started, and it runs a build pipeline, which is a sequence of build steps. Buildkite can and will distribute the execution of these steps across the available agents as appropriate.

iOS Build Steps

For iOS development, depending on your projects, typically the pipeline would have steps such as Run Logic Tests, possibly Run UI Tests, maybe even Deploy to iTunes Connect. In addition, after the recent XcodeGhost malware, it is prudent to verify Xcode (especially before deploying apps). To ensure consistent code quality, recording the test coverage of the code base should certainly be considered.
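To make the Xcode verification step a little more concrete, here is a minimal sketch of what such a script could look like. It is not the verify-xcode.sh used in my pipeline. It assumes the check Apple published after XcodeGhost, running spctl --assess --verbose against the app and accepting the sources "Mac App Store", "Apple" or "Apple System". The function names are made up for this example.

```shell
#!/bin/bash

# Read spctl output on stdin and print the value of its "source=" line.
xcode_source() {
  sed -n 's/^source=//p'
}

# Succeed only for the sources Apple listed as trustworthy.
is_trusted_source() {
  case "$1" in
    "Mac App Store" | "Apple" | "Apple System") return 0 ;;
    *) return 1 ;;
  esac
}

# Assess an Xcode bundle (default path assumed) and check its source.
verify_xcode() {
  local app="${1:-/Applications/Xcode.app}"
  local origin
  # spctl reports on stderr, so merge it into stdout before parsing.
  origin="$(spctl --assess --verbose "$app" 2>&1 | xcode_source)"
  is_trusted_source "$origin"
}
```

Calling verify_xcode as the first build step makes the build fail early if the assessment does not name a trusted source. The assessment can take several minutes on a full Xcode bundle.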

With this in mind, the build pipeline of my open source Operations framework looks a little bit like this:

[Image: Example of running unit tests in Buildkite]

For all of these steps, it's important that the three test suites run on the same agent that verified Xcode. Additionally, code coverage must be sent from the same agent that ran the tests. Therefore, we really want to run all the steps on the same agent. With a fixed pipeline, the only way to guarantee this is to limit the project to one agent. If there are multiple agents, Buildkite may distribute the steps across whichever agents are available, and we would not be able to trust the code coverage results.

Of course, this is quite a severe restriction. Although for my CI setup there is only one developer (me), I have multiple branches and multiple projects. I have previously worked in a team of 8 developers using the same repository, and we needed at least four build machines. We want to have many agents available so that multiple branches or projects can be built simultaneously, yet restrict all the steps of each build to the same agent.

Why not just group everything into one step?

One option would be to lump all the steps into one, but this is not a good idea. It reduces transparency, which makes it harder to find where a build has failed. It's also much harder to develop the pipeline. It's the same reason we don't typically write apps using one class anymore.

Automating the pipeline

The solution to the problem is to automate the pipeline, which in Buildkite terms is done by uploading pipelines. It's incredibly awesome, and essentially works like this:

The pipeline can be described in a text file and stored inside source control. The project's pipeline in Buildkite's web interface is then replaced with a single step which invokes the pipeline upload command of the Buildkite agent. This command will read the pipeline from the source code repository and upload it to Buildkite to run.

This is incredibly powerful, because we can process that text document on the fly before it gets uploaded.
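To give a feel for what "on the fly" can mean, here is a small hypothetical sketch (not part of my actual setup) which prints a pipeline to standard output and only includes a deploy step on the master branch. The .scripts/deploy.sh path and the choice of master are assumptions made purely for illustration. BUILDKITE_BRANCH is an environment variable that Buildkite sets for each build.

```shell
#!/bin/bash

# Print a pipeline on standard output. The deploy step is only emitted
# for the master branch; .scripts/deploy.sh is a hypothetical script.
emit_pipeline() {
  local branch="$1"

  printf '%s\n' \
    'steps:' \
    '  -' \
    '    name: "Test iOS"' \
    '    command: .scripts/test-ios.sh'

  if [ "$branch" = "master" ]; then
    printf '%s\n' \
      '  -' \
      '    name: "Deploy"' \
      '    command: .scripts/deploy.sh'
  fi
}
```

The output of emit_pipeline "$BUILDKITE_BRANCH" would then be piped to the agent's pipeline upload command, so the steps a build runs depend on the state of the build itself.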

Filtering agents

Buildkite supports agent metadata, which is a list of key/value pairs included in the agent's configuration. For example, until Xcode 7 and Swift 2 were released, I had one agent which supported Xcode 6 and Swift 1.2, and another running the Xcode betas and Swift 2.0. Therefore, using a branch naming scheme (swift_2/*), I could ensure that Swift 2.0 code was built by the Xcode 7 agent. This is configured in Buildkite by adding the key and required value to the build step.

When a build runs, these agent metadata values are exported as shell environment variables.

BUILDKITE_AGENT_META_DATA_QUEUE=default
BUILDKITE_AGENT_META_DATA_SWIFT=2
BUILDKITE_AGENT_META_DATA_XCODE=7
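Because these are ordinary environment variables, a build script can also check them for itself. The following is a minimal sketch of such a guard. The function name is made up, and the mapping from a metadata key to an upper-cased variable name is inferred from the three examples above rather than taken from Buildkite's documentation.

```shell
#!/bin/bash

# Succeed only if the agent's metadata value for KEY equals EXPECTED.
# Assumes key "xcode" maps to BUILDKITE_AGENT_META_DATA_XCODE, as in
# the examples above.
require_agent_meta() {
  local key="$1" expected="$2"
  local var actual
  var="BUILDKITE_AGENT_META_DATA_$(printf '%s' "$key" | tr '[:lower:]' '[:upper:]')"
  # Indirect expansion: read the variable whose name is stored in $var.
  actual="${!var:-}"

  if [ "$actual" != "$expected" ]; then
    echo "agent metadata '$key' is '$actual', expected '$expected'" >&2
    return 1
  fi
}
```

For example, a test script could start with require_agent_meta xcode 7 so that it fails fast, with a readable message, if it ever lands on the wrong agent.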

Using this mechanism we can filter all the build steps to one agent by selecting, say, the agent name to be the current agent. By adding the name of the agent as part of its metadata, we can write the pipeline template like this, which we store inside the repository at .buildkite/pipeline.template.yml.

steps:
  -
    name: ":fastlane: Verify Xcode"
    command: .scripts/verify-xcode.sh
    agents:
      name: "$BUILDKITE_AGENT_META_DATA_NAME"
  -
    name: ":fastlane: Test iOS Extension Only"
    command: .scripts/test-extension.sh
    agents:
      name: "$BUILDKITE_AGENT_META_DATA_NAME"
  -
    name: ":fastlane: Test Mac OS X"
    command: .scripts/test-osx.sh
    agents:
      name: "$BUILDKITE_AGENT_META_DATA_NAME"
  -
    name: ":fastlane: Test iOS"
    command: .scripts/test-ios.sh
    agents:
      name: "$BUILDKITE_AGENT_META_DATA_NAME"
  -
    type: "waiter"
  -
    name: "Send Coverage"
    command: .scripts/send-coverage.sh
    agents:
      name: "$BUILDKITE_AGENT_META_DATA_NAME"

All that is left is to write a script to replace $BUILDKITE_AGENT_META_DATA_NAME with the value of this environment variable when the build is running. The script below will do exactly this, and we save it at .buildkite/pipeline.sh ensuring that it is executable.

#!/bin/bash

set -eu

# Makes sure all the steps run on this same agent
sed "s/\$BUILDKITE_AGENT_META_DATA_NAME/$BUILDKITE_AGENT_META_DATA_NAME/" .buildkite/pipeline.template.yml
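One caveat with the sed one-liner is that an agent name containing a slash, an ampersand or a backslash would be misread as part of the sed expression. My agent names don't contain any of these, so the script above is enough for me. If yours might, a slightly more defensive variant could look like the sketch below. The function names are my own, and the template is read from standard input so the function can be tried without a repository.

```shell
#!/bin/bash

# Escape the characters which are special in a sed replacement
# (backslash, slash and ampersand) so the value is inserted literally.
escape_sed_replacement() {
  printf '%s' "$1" | sed -e 's|[\\/&]|\\&|g'
}

# Read a pipeline template on stdin and replace every occurrence of the
# placeholder with the given agent name.
render_pipeline() {
  local escaped
  escaped="$(escape_sed_replacement "$1")"
  # [\$] matches a literal dollar sign regardless of its position.
  sed "s/[\$]BUILDKITE_AGENT_META_DATA_NAME/${escaped}/g"
}
```

It would be used in place of the plain sed line, for example: render_pipeline "$BUILDKITE_AGENT_META_DATA_NAME" < .buildkite/pipeline.template.yml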

The last element to set up is the build step in Buildkite's web UI. All that is required is to invoke the above script and run the pipeline upload command.

[Image: Adding a run script to Buildkite's pipeline settings screen]

The key point above is that the step's command is:

.buildkite/pipeline.sh | buildkite-agent pipeline upload
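If the substitution ever goes wrong, the uploaded steps would target an agent which doesn't exist and simply sit waiting. As an optional safeguard, here is a hypothetical sketch of a filter which refuses to pass on a pipeline that still contains the placeholder or an empty name. The function name and the exact checks are my own assumptions, not a Buildkite feature.

```shell
#!/bin/bash

# Read a rendered pipeline on stdin. Fail if the placeholder was not
# substituted, or if a name ended up empty; otherwise echo it unchanged.
check_rendered() {
  local rendered
  rendered="$(cat)"

  if printf '%s\n' "$rendered" |
      grep -q -e 'BUILDKITE_AGENT_META_DATA_NAME' -e 'name: ""'; then
    echo "pipeline has an unsubstituted or empty agent name" >&2
    return 1
  fi

  printf '%s\n' "$rendered"
}
```

One way to use it is to render into a variable first, for example rendered="$(.buildkite/pipeline.sh | check_rendered)", and only pipe "$rendered" into the upload command when that succeeds, so a broken pipeline never reaches Buildkite.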

The settings screen also shows that each agent has name=<NAME> in its metadata.

Running a build

When changes are pushed, the Upload Pipeline step runs on the next available agent which has Xcode 7 (as it is required metadata). Initially, this is the only step in the pipeline.

[Image: Running the initial upload pipeline step.]

However, almost immediately, the pipeline is uploaded, and the steps appear.

[Image: Once the Upload Pipeline step is green, the full pipeline steps are uploaded.]

At this point, the pipeline steps execute as normal, with the key detail being that every step runs on the same agent, in this case Tyrion.

[Image: Pipeline steps are all running on the same agent.]

In Summary

I'm a huge fan of Buildkite - all of this I pretty much figured out after chatting to @toolmantim today. But beyond that, I guess, it just really helps to automate as much as possible.