Datacenter build and management service
The Conch API is developed using GitHub.
Requests and bugs are tracked using GitHub Issues.
The master branch is protected and cannot be modified directly:
master, containing no merge conflicts (with the exception of hotfixes intended for production, which are made against a release/v<X> branch and also merged into master)
The build world is an odd duck in that we are also the customers of our own work. The Conch Shell (kosh) and Conch Web UI are the main consumers of the API and are also owned by the Build Team. Each of those projects has its own customers and needs, but the features and optimizations for the API can largely be determined in-house.
When it is time to release a new version of the API, the release manager creates a new tag in git, named like v2.45.0, with a commit message containing the changelog. The changelog is typically autogenerated using the misc/ghch script in the repository.
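The tag naming convention above can be checked mechanically. The following is a minimal sketch, assuming tags always have exactly three numeric components and that release branches are named release/v<major>.<minor> as in the examples in this document. The names ReleaseTag, parse_tag, and release_branch are hypothetical and are not part of the Conch repository.

```python
import re
from typing import NamedTuple

# Assumed tag shape: "v", then three dot-separated integers (e.g. v2.45.0).
TAG_PATTERN = re.compile(r"v(\d+)\.(\d+)\.(\d+)")


class ReleaseTag(NamedTuple):
    """Hypothetical helper type for a parsed release tag."""
    major: int
    minor: int
    patch: int

    def __str__(self) -> str:
        return f"v{self.major}.{self.minor}.{self.patch}"


def parse_tag(tag: str) -> ReleaseTag:
    """Parse a tag such as 'v2.45.0', rejecting anything else."""
    match = TAG_PATTERN.fullmatch(tag)
    if match is None:
        raise ValueError(f"not a release tag: {tag!r}")
    return ReleaseTag(*(int(part) for part in match.groups()))


def release_branch(tag: ReleaseTag) -> str:
    """Name of the release branch that a tag belongs to."""
    return f"release/v{tag.major}.{tag.minor}"


if __name__ == "__main__":
    tag = parse_tag("v2.45.0")
    print(tag, "belongs on", release_branch(tag))
```

A check like this could run before a tag is pushed, so that a malformed tag name is caught before it triggers a release build.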
When the tag is pushed to GitHub, Buildbot executes a test run and, if successful, creates a new GitHub release. The text of that release is the changelog that was posted in the relevant tag. Buildbot results can be viewed here.
This new release is pushed into staging by a member of the Conch team using Ansible (see the private buildops-infra GitHub repo).
The release manager sends out an email or Mattermost notification announcing the deployment and summarizing the changes.
The code stays in staging for two weeks, during which time automated testing is performed. The user base is encouraged to test as well.
Any bug fixes are applied both to master and the release/v2.45 branch. They are then tagged as a patch version like v2.45.1 and redeployed into staging.
After the two-week period, staging is deployed to production. A go/no-go call is made, usually during a BuildOps staff meeting.
If bugs are found in production, the fixes are applied both to master and the release/v2.45 branch, tagged as a patch version, and redeployed into production. If applicable, these bug fixes are also applied to the staging branch (release/v2.46 in this example) and redeployed into staging.
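The bug-fix rules above can be summarized as a small sketch: given the tag currently in production (and, optionally, the one in staging), list the branches a fix should land on and the tag the fixed release would carry. This assumes the vMAJOR.MINOR.PATCH tag shape and release/vMAJOR.MINOR branch names used in the examples; next_patch_tag and fix_targets are hypothetical names, not existing tooling.

```python
from typing import List, Optional


def _split(tag: str) -> List[int]:
    """Split a tag like 'v2.45.0' into [2, 45, 0]."""
    return [int(part) for part in tag.lstrip("v").split(".")]


def next_patch_tag(tag: str) -> str:
    """Tag for the next bug-fix release on the same release branch."""
    major, minor, patch = _split(tag)
    return f"v{major}.{minor}.{patch + 1}"


def fix_targets(production_tag: str, staging_tag: Optional[str] = None) -> List[str]:
    """Branches that should receive a fix for a bug found in production."""
    major, minor, _ = _split(production_tag)
    targets = ["master", f"release/v{major}.{minor}"]
    if staging_tag is not None:
        # "If applicable": the fix also goes to the branch currently in staging.
        s_major, s_minor, _ = _split(staging_tag)
        targets.append(f"release/v{s_major}.{s_minor}")
    return targets


if __name__ == "__main__":
    print(fix_targets("v2.45.0", "v2.46.0"))
    print(next_patch_tag("v2.45.0"))
```

Whether the staging branch actually needs the fix is a judgment call for the release manager; the optional argument only models the "if applicable" case.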
Currently, deploys are scheduled for Monday afternoons (US/Eastern) and are visible in the BuildOps Google Calendar.
In general, we prefer a two-week deploy cadence. v2.45.0 goes into staging on a Monday and we begin accepting PRs for v2.46. Two weeks later, v2.45 goes to production, v2.46 is deployed to staging, and we begin to accept PRs for v2.47.
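To make the cadence concrete, the sketch below computes staging and production dates for consecutive releases, assuming each release spends exactly two weeks in staging and deploys always fall on a Monday. The function name release_schedule and the starting date used in the example are hypothetical; the real dates live in the BuildOps calendar.

```python
from datetime import date, timedelta
from typing import Dict, List

# Two weeks in staging before promotion to production.
CADENCE = timedelta(weeks=2)


def release_schedule(first_minor: int, staging_date: date,
                     count: int, major: int = 2) -> List[Dict[str, object]]:
    """Staging and production dates for `count` consecutive releases."""
    if staging_date.weekday() != 0:  # Monday is 0
        raise ValueError("deploys are scheduled for Mondays")
    schedule = []
    for i in range(count):
        staged = staging_date + i * CADENCE
        schedule.append({
            "version": f"v{major}.{first_minor + i}",
            "staging": staged,
            # Promotion coincides with the next release entering staging.
            "production": staged + CADENCE,
        })
    return schedule


if __name__ == "__main__":
    # Hypothetical start date, chosen only because it is a Monday.
    for entry in release_schedule(45, date(2024, 1, 1), 3):
        print(entry["version"], entry["staging"], entry["production"])
```

Each release's production date matches the next release's staging date, which is the overlap described in the paragraph above.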
Every change pushed to GitHub (unless a branch rule has been configured on a subbranch) will result in a webhook event that triggers Buildbot to execute a test run. These test results are reported to the ~conch-devel chat channel, as well as emailed to the user who triggered the build.
The tests are executed via make test in the Makefile at the root of the repository. The tests cover all aspects of the application, from low-level functionality such as database access, logging, and JSON Schema evaluation, to higher-level integration testing of individual API endpoints. All the tests live in the t/ directory in the repository.
All pull requests must pass tests before they are considered for merging, whether to the master branch or a topic/feature branch. If you receive a failure notification, follow the links in Buildbot to get more information about how your tests failed.
When written out like this, the development and release processes seem complicated. In practice, however, they are pretty lightweight and place minimal requirements on the developer. The release process requires more work from the release manager but, again, in practice it is pretty lightweight, particularly on a two-week release cadence.