A few months ago we contributed a project, ContainerJS, to the Symphony Software Foundation, an organization that fosters open source and collaboration within financial services. The Foundation has various legal requirements that must be adhered to, including that all contributors (i.e. committers) sign a Contributor License Agreement (CLA).

Unfortunately as a project maintainer the requirement to ensure contributors have a signed CLA is quite cumbersome, for each committer within a pull request you must email ‘legal counsel’ to verify whether they have a CLA (either individual or company) on record!

Having contributed to Facebook projects in the past, where a ‘bot’ verifies the status of pull requests, I knew it was possible to streamline this process. This is why my ‘pet project’ for the past few weeks has been the creation of a cla-bot.

Funnily enough, I’m not the first to have had this idea. I checked with the Symphony Foundation DevOps Director, and they’d looked at a number of options, including clabot, CLAHub and cla-assistant. Some were flexible, but required hosting, while others were provided as a service, but you have to use their CLA signing workflow. None were ideal.

For the cla-bot I wanted to provide it as a hosted service, in this case a GitHub App for ease of integration. I also wanted to leave the specifics of how CLAs are managed and signed to the individual projects.

Just what is a CLA?

If you maintain an open source project you’ll likely have selected an open source licence that governs your project. This provides an implicit agreement for contributors to your project, whereas a Contributor Licence Agreement (CLA) makes these terms explicit and provides a record of these agreements.

As described via Wikipedia

The purpose of a CLA is to ensure that the guardian of a project’s outputs has the necessary ownership or grants of rights over all contributions to allow them to distribute under the chosen license.

You can find example CLAs from GitHub, Facebook, Microsoft, JSFoundation and many other projects.

Do all projects need a CLA?

No!!!

I don’t intend adding a CLA to any of my personal open source projects. There is a good question on Stack Exchange that details some of the ambiguities that a CLA resolves. However, I personally think that for any project that lacks commercial value, this is simply overkill.

What does cla-bot do?

Adding a CLA to your project workflow can add quite a bit of overhead. For each pull request you need to ensure that each committer has signed your CLA before merging. For small projects this can probably be done manually, but for larger projects you probably want to automate this process.

As you can see from the above examples CLAs differ between projects, and the mechanics for signing them also differ. For that reason this bot doesn’t automate that part of the process. However, the process of checking pull requests against a list of (approved) contributors is what this bot automates.

For projects that use cla-bot, when a pull request is opened, the commits are checked to determine the CLA status for each committer. If any fail the check, the bot posts a (customisable) comment, prompting them to get in touch and arrange a CLA:

If the checks pass, a (configurable) label is added to the pull request:

In each case the pull request status is set to a pass or fail. It’s a good idea to make the verification/cla-signed status a required status check so that it must pass before pull requests are merged.

If a committer signs the CLA and you wish to have a pull request re-check, you can request this posting a comment that contains the text @cla-bot check:

How does the cla-bot work?

The bot is installed as a GitHub App, which means it receives webhooks when various events occur on a project (comment created, pull request opened, etc …)

The project-specific configuration for the bot is stored in a .clabot file. When a pull request is opened, the bot fetches the configuration for the project. This file details how to determine contributors for a project, their usernames can be stored directly in this configuration file, or it can point to a URL, or webhook that supplies this information.

You can read more about the configuration on the cla-bot homepage.

Symphony Foundation are now using this bot on a trial basis across a number of their projects. Hopefully it will be of use to others too!

Colin E.