Let’s assume you have a single GitHub repository where you store the code for all your products and/or projects. You also have a CI server, e.g. Jenkins, with a job for each of your products and projects: dozens of jobs that run many times on a typical day. The jobs are configured to build the different products based on Maven modules, so each job builds only its part of the code.
If you want to follow continuous integration guidelines, every Jenkins job polls the GitHub repository every minute or so to detect changes in the code. Whenever, say, a new comment is added to the code or a typo is fixed, ALL the Jenkins builds are started as a result of polling. Is this good? Absolutely not. When you change the code of a project, only that specific project should be built. Building all of them is a waste of time and energy, and it costs you waiting time too: when you want a fresh build NOW but the build is already running because of a previous code change in some other project, it can be very annoying to wait… Yes, you can have a coffee in the meantime if your build is short enough, but after the third one you will be really upset.
One solution to the problem above is to create a controller job in Jenkins that decides what to build based on the information in the last commit. GitHub has a feature called webhooks. Webhooks can notify Jenkins whenever there is a new commit in your huge repository. If you have installed the GitHub plugin on Jenkins (I’m sure you have), you can set up the URL where Jenkins listens for the notifications, so you can integrate GitHub and Jenkins easily.
On the new controller job’s configuration page you have to set up the “Source Code Management” section and select “GitHub hook trigger for GITScm polling” under the “Build triggers” section. This means GitHub will push a notification to Jenkins whenever there is a new commit, and the job will be started immediately. Test it: add an “Execute shell” build step containing a simple echo 'Hello World' command, then change something (e.g. a comment) in the GitHub code repository. The build should start within a couple of seconds.
The next step is to find out what was changed in the commit. The controller job clones the GitHub repo, so the last commit is available in its workspace. Create an “Execute shell” build step and put this command into it:
changes="$(git diff-tree --no-commit-id --name-only -r "$(git log --format="%H" -n 1)")"
This command returns the names of all the files changed in the last commit, together with their paths relative to the repo root. This is valuable information: based on it you can decide which project was changed and which job to start. You can start a job from the command line with curl; there are different ways to set it up. This bash code is just an example of how to decide which project to run:
# $changes holds one path per line, so check every line, not only the first
if printf '%s\n' "$changes" | grep -q '^projectA/'; then
  echo "Project A was modified"
  curl -I -X POST --user testuser:testpw http://jenkins.mycompany.com/job/mybuild/build
fi
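One caveat about the diff step: a single push can carry several commits, and looking only at the last one may miss files touched by the earlier ones. A minimal sketch of a wider comparison follows, assuming the Jenkins Git plugin has exported GIT_PREVIOUS_SUCCESSFUL_COMMIT and GIT_COMMIT into the build environment. The helper name changed_files_since is made up for this example, and it falls back to the last commit alone when no earlier commit is known.

```shell
# changed_files_since is a hypothetical helper: list the files changed between
# commit $1 and commit $2 (default: HEAD). If $1 is empty or not a known
# commit, fall back to the files touched by $2 alone.
changed_files_since() {
  from="$1"
  to="${2:-HEAD}"
  if [ -n "$from" ] && git cat-file -e "${from}^{commit}" 2>/dev/null; then
    # Everything that differs between the two commits
    git diff --name-only "$from" "$to"
  else
    # No usable starting point: inspect the single commit only
    git diff-tree --no-commit-id --name-only -r "$to"
  fi
}

# In an "Execute shell" build step this could be called as (assumption:
# both variables are set by the Git plugin for this job):
#   changes="$(changed_files_since "${GIT_PREVIOUS_SUCCESSFUL_COMMIT:-}" "${GIT_COMMIT:-HEAD}")"
```

Comparing against the last successful build rather than the last build is a design choice: if the controller job itself failed once, the next run still sees the changes it skipped.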
And here we are: you have triggered one single job with one commit, not all of them. And the proper one. Don’t forget to switch off polling in each of your jobs, as from now on they will be started by the controller job.
With this process you can start multiple jobs if multiple projects were changed in the same commit (which is rare but might happen).
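The multi-project case can be sketched without writing one if block per project. The snippet below is only an illustration under a few assumptions: each project lives in its own top-level directory, and the helper names (changed_projects, job_for_project, trigger_jobs), the projectB-build job name and the DRY_RUN switch are all invented for this example. It derives the distinct top-level directories from the list of changed files and looks up a job for each.

```shell
# changed_projects (hypothetical helper): read changed file paths on stdin and
# print each distinct top-level directory once. Files in the repo root
# (paths without a "/") are ignored.
changed_projects() {
  grep '/' | cut -d/ -f1 | sort -u
}

# job_for_project (hypothetical mapping): directory name -> Jenkins job name.
# The entries are placeholders; returns non-zero for unknown directories.
job_for_project() {
  case "$1" in
    projectA) echo "mybuild" ;;
    projectB) echo "projectB-build" ;;
    *) return 1 ;;
  esac
}

# trigger_jobs: take the newline-separated file list as $1 and start one job
# per affected project. With DRY_RUN=1 (the default here) it only reports
# what it would start; set DRY_RUN=0 to actually call Jenkins.
trigger_jobs() {
  printf '%s\n' "$1" | changed_projects | while read -r project; do
    if job="$(job_for_project "$project")"; then
      echo "Triggering $job for $project"
      [ "${DRY_RUN:-1}" = "1" ] || curl -I -X POST --user testuser:testpw "http://jenkins.mycompany.com/job/$job/build"
    else
      echo "No job mapped for $project, skipping"
    fi
  done
}
```

In the controller job you would call trigger_jobs "$changes" after computing the list of changed files. Keeping the directory-to-job mapping in one case statement means that adding a project is a one-line change.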
Now the load on your Jenkins node is reduced significantly, and if you are using Jenkins agent (slave) nodes, maybe some of them can be switched off, saving money for your company. So it is time to go to your boss and tell him how efficient you are. 😉