This page summarizes the git-based development process we are using for Zeek and its associated subprojects.
Our git repositories can be cloned from https://github.com/zeek/<repo>. If you are a developer with SSH-based write access, use git@github.com:zeek/<repo> instead. See the full list of repos at GitHub.
Some of the repositories, in particular zeek, include others as submodules. To clone the repository, including the submodules, use --recursive. For example, to clone zeek with all its submodules:
git clone --recursive https://github.com/zeek/zeek
The (summarized) submodule directory tree that results from the recursive clone looks like:
zeek/
    cmake/
    aux/bifcl/
    aux/binpac/
    aux/zeek-aux/
    aux/broker/
    aux/broctl/
    aux/capstats/
    aux/pysubnettree/
    aux/trace-summary/
    aux/btest
If the public URLs are used in the original cloning of repositories, but later write access is needed, the following command can be used to reset what URL is used for fetch/push operations (e.g. for the zeek repository):
git remote set-url origin git@github.com:zeek/zeek
As an alternative, a global git config option can be set to automatically rewrite any of the public URLs into the private/write URLs:
git config --global url.git@github.com:zeek.pushInsteadOf https://github.com/zeek
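To see what such a rewrite rule does before relying on it, the following sketch sets the same rule in a throwaway repository and asks git for the effective URLs. It stores the rule in the repository's own config rather than with --global, so nothing outside the temporary directory changes; the directory and variable names are illustrative assumptions.

```shell
# Throwaway repository; nothing is fetched or pushed.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git remote add origin https://github.com/zeek/zeek

# Same rule as above, but stored in this repository's config only.
git config url."git@github.com:zeek".pushInsteadOf https://github.com/zeek

fetch_url=$(git remote get-url origin)          # URL used for fetching
push_url=$(git remote get-url --push origin)    # URL used for pushing
echo "fetch: $fetch_url"
echo "push:  $push_url"
```

The fetch URL stays on the public HTTPS address while the push URL is rewritten to the SSH form, which is the point of pushInsteadOf as opposed to insteadOf.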
The following diagram illustrates how we structure the development process:
Regression tests are provided by Travis CI
Generally, we have three groups of people involved with the development: maintainers have full write access to the central repositories with authority to commit to their master branches; developers have write access to the central repositories for working on topic branches (but not the master branches); and external contributors contribute functionality via github pull requests. We discuss roles and branch structure in more detail in the following.
We use the following standard branches:
Topic branches are created by developers for working on a specific feature/bugfix/etc. They must be branched off of master. If there’s a developer primarily in charge of the branch, include the name (e.g., topic/robin/exciting-new-feature); otherwise just give it a descriptive name (e.g., topic/new-documentation). All development relevant to the particular topic must be committed to the branch.
All developers can generally create/modify all topic branches, but please coordinate before working on somebody else’s topic branch.
fastpath is a special, long-lived topic branch that allows for fast approval of small changes. All developers may commit into fastpath. However, each commit must be self-contained (i.e., not depend on earlier or later changes, and not be changed/amended afterwards), and it must be immediately ready for integration into master. fastpath should be merged with master regularly; any developer can do that.
Note that in case there is later a change necessary to a commit that has already been pushed into this branch, manual coordination with the maintainers will be necessary to make sure the right thing happens.
We have a designated set of maintainers with authority to commit to master in all *.bro.org repositories; these folks are in charge of merging in topic branches, fastpath, and external contributions. We select them by consensus among the existing group of maintainers. Individual repositories may have further maintainers in addition to the global group.
Generally, all maintainers may merge any pull requests and fastpath commits. There’s no obligation to do so, however: everybody is free to skip changes where they don’t feel sufficiently familiar with the corresponding code.
For changes authored by maintainers themselves, we generally stick to the "two people rule": maintainers do not merge their own patches; another maintainer has to do that. The exception is small, straightforward stuff, like simple bug fixes and cleanup. A good rule of thumb: if it’s a fastpath-suitable patch, a direct commit into master without review is fine; if it’s a topic branch, consider filing a pull request for your fellow maintainers. However, the final decision is left to the discretion of the maintainer authoring the change.
See Merging a Topic Branch for the steps involved in doing a merge.
See our separate page for contributing patches and functionality.
The following is primarily for developers with write access to the central repositories.
One note ahead of everything else: Never use git rebase if you’re not sure what you’re doing. In particular, never use rebase with anything that has already been pushed out to the origin repository. You would be "changing history" this way, create a huge mess, and have to pay for lots of beer at Jupiter.
Create a new topic branch locally and remotely, and make sure the local one tracks the remote one.
> git checkout master
> git checkout -b topic/robin/foo
> git push -u origin HEAD
Edit and commit locally:
> vi foo.c
[...]
> git commit -a   # Commits everything changed.
Often it’s helpful to stage things individually before committing everything:
> vi foo.txt        # Edit.
> git add foo.txt   # Stage.
> vi bar.txt        # Edit.
> git add bar.txt   # Stage.
> git commit        # Commit everything staged.
See Using the Index For Checkpoints for more.
See Writing Commit Messages for some thoughts on what to put into a commit message.
Push local changes upstream:
> git push origin HEAD
This will generate commit notifications to the mailing list. Note that you can do just git push as well, but that will push your changes from all branches upstream (Update: the default for this has changed in newer git versions to stick to the current branch only).
Regularly pull in changes from upstream in case somebody is working on the same branch:
> git pull # This combines "git fetch && git merge"
If this needs to merge upstream changes with local ones, the pull will automatically commit the merge.
Likewise, regularly merge with master to incorporate any changes:
> git fetch                 # Ensure local repository is current.
> git merge origin/master   # Merge in master.
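The branch workflow described so far can be tried end to end without touching the real repositories. The sketch below is only an illustration under stated assumptions: a local bare repository stands in for origin, and the identity, file name, and commit messages are made up.

```shell
# Sandbox: a local bare repository plays the role of "origin".
work=$(mktemp -d)
git init -q --bare "$work/origin.git"
git init -q "$work/clone"
cd "$work/clone"
git symbolic-ref HEAD refs/heads/master       # name the initial branch master
git config user.name "Example Developer"      # placeholder identity
git config user.email "dev@example.invalid"
git remote add origin "$work/origin.git"
git commit -q --allow-empty -m "Initial commit on master"
git push -q -u origin master

# Create the topic branch locally and remotely, with tracking.
git checkout -q -b topic/robin/foo
git push -q -u origin HEAD
upstream=$(git rev-parse --abbrev-ref '@{upstream}')

# Edit, stage, commit, and push only the current branch.
echo "hello" > foo.txt
git add foo.txt
git commit -q -m "Add foo.txt"
git push -q origin HEAD

# Bring the topic branch up to date with master.
git fetch -q
git merge -q origin/master
ahead=$(git rev-list --count origin/master..HEAD)
echo "tracking $upstream, $ahead commit(s) ahead of origin/master"
```

Checking the upstream with rev-parse is a quick way to confirm that the push with -u really set up tracking for the new branch.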
Run the test suite to make sure none of the tests fail.
To run the entire test suite for Zeek and all of its components, first chdir to the top-level Zeek source directory, and then run:
./configure && make && make test-all
If a test fails, you can find information about why it failed in the "diag.log" file in the test directory. Occasionally, this information might not be sufficient, and you may find it helpful to look for more clues in the test’s tmp directory (look in directory ".tmp" for a subdirectory matching the name of the failing test).
If you want to update or create a test baseline for one test, then run btest with the "-u" option and the name of the test (e.g., btest -u core.icmp.icmp-events). Alternatively, to do this for all tests that failed, then run "btest -r -u". The "-u" option will make btest pause for each failing test and ask if you want to record a new baseline. Before updating a baseline, make sure that the new baseline is actually correct (i.e., don’t just blindly overwrite baselines for every failing test).
Once a topic branch is ready for integration into master, you can create a pull request on GitHub, as follows:
Make sure you have merged the current master into your topic branch.
Create a GitHub pull request. If your branch spans multiple git repositories, you may choose to file a single pull request within the top-level git repo, or else one for each repository. Always be sure to explain which branches need to be merged in which repositories for any pull request, so that the person performing the merge is aware of the dependencies.
Note that your commit messages should already have sufficient information for the maintainer to create a CHANGES entry. If that’s not the case (in particular if it’s distributed over multiple commits) include appropriate text in the pull request.
If the maintainers find that further work is needed before the branch can be merged, they’ll add comments to the pull request and reassign it to you. If so, address the comments by committing further to the topic branch and update the pull request once done. Assign the ticket back to the maintainer you communicated with when you’re ready for another attempt.
If the maintainers find the topic branch suitable for merging, they will do so, close the ticket, and delete the topic branch.
Topic branches should be deleted once they have been merged into master (and, naturally, also whenever else you don’t need them anymore). Note that you need to delete a branch separately from both your local and the remote central repository:
Delete the branch from the remote repository:
> git push origin :topic/robin/foo
Remove the branch from the local repository by first switching to master and then deleting it:
> git checkout master
> git branch -d topic/robin/foo
Deleting a branch will warn you if it has not been fully merged into the current branch (i.e., here, if there are changes in topic/robin/foo that aren’t in master). To override, you can do:
> git branch -D topic/robin/foo
Note that branches are just symbolic names pointing to a particular commit. Git does not keep a history of previous branch names.
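The difference between -d and -D is easy to observe in a scratch repository. The following is a small sketch, assuming a made-up identity and an empty commit as the "unmerged work"; it does not touch any remote.

```shell
# Sandbox repository with one unmerged topic branch.
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Developer"      # placeholder identity
git config user.email "dev@example.invalid"
git commit -q --allow-empty -m "Initial commit"
git checkout -q -b topic/robin/foo
git commit -q --allow-empty -m "Unmerged work"
git checkout -q master

# -d refuses because the branch has commits that master lacks.
if git branch -d topic/robin/foo 2>/dev/null; then
    safe_delete="deleted"
else
    safe_delete="refused"
fi

# -D deletes it regardless.
git branch -q -D topic/robin/foo
remaining=$(git branch --list 'topic/*')
echo "-d: $safe_delete; topic branches left: ${remaining:-none}"
```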
For maintainers: there’s a script zeek-aux/devel-tools/git-delete-old-branches that will delete all branches that have been fully merged into master; that’s good for cleanup when old topic branches still stick around.
If something goes wrong, you can usually revert to the previous state, even after a commit. Every "large" git operation sets ORIG_HEAD to the point before it started:
> git reset --hard ORIG_HEAD
This will revert all changes in the working tree and set the current HEAD accordingly. Instead of ORIG_HEAD you can also directly specify a revision you want to revert to.
Otherwise git status will generally give you hints on how to revert specific changes in the working copy or the index.
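As an illustration of ORIG_HEAD, the sketch below performs a merge in a scratch repository and then undoes it. The repository layout, identity, and commit messages are assumptions made for the example.

```shell
# Sandbox repository with a topic branch that adds one file.
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Developer"      # placeholder identity
git config user.email "dev@example.invalid"
git commit -q --allow-empty -m "Initial commit"
git checkout -q -b topic/robin/foo
echo "change" > foo.txt
git add foo.txt
git commit -q -m "Add foo.txt"
git checkout -q master

before=$(git rev-parse HEAD)

# The merge records the pre-merge position of master in ORIG_HEAD.
git merge -q --no-ff -m "Merge topic/robin/foo" topic/robin/foo
after_merge=$(git rev-parse HEAD)

# Undo the merge: HEAD and the working tree return to the old state.
git reset -q --hard ORIG_HEAD
after_reset=$(git rev-parse HEAD)
echo "before=$before after_merge=$after_merge after_reset=$after_reset"
```

Keep in mind that --hard also discards uncommitted changes in the working tree, so this is best done from a clean state.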
This section summarizes typical tasks for the repository maintainers.
Once a developer/contributor files a pull request, the branch can be merged into master. At a high level, doing a merge consists of these steps:
Assign the issue / pull request to yourself to claim it and avoid conflicts.
Locally merge the branch into master and perform any additional tweaking as required, potentially coordinating with the author of the change where further work is needed.
Update VERSION and CHANGES (there’s a script for that: zeek-aux/devel-tools/update-changes).
Update NEWS for more significant / user visible changes and extensions. This is going to become the release notes for the next Zeek release, so keeping notes there is important.
Ensure all test-suites pass, including external public and private.
Push everything upstream to GitHub.
Note that the Travis test suite is triggered from commits to the zeek repository’s master branch. If there’s associated changes to merge in external test suites or submodules, those need to be merged/pushed into their own repository first.
Monitor Travis’ emails for trouble.
At a lower level, merging involves the following steps (or variations of these; there are generally several ways to do some of them):
Assign the ticket to yourself.
Make sure your repository has everything needed:
> git fetch
You should be able to see the topic branch (skip this step if you’re merging in a github pull request):
> git branch -a
Switch to master, and make sure it’s up to date and the submodules are set right:
> git checkout master
> git merge origin/master
> git submodule update --recursive --init
The fix-submodules macro in useful macros provides an alias for the submodule command.
Check what revisions the topic branch has that will be included by the merge:
> git log --no-merges origin/topic/cmake-port ^master
If you also want to see diffs, add -p.
If you’re merging in a github pull request, then review the commits in the pull request by following the link to the pull request as specified in the nightly "Merge Status" mail.
If that all looks fine (or mostly fine), merge the branch into master, but don’t commit yet:
> git merge --no-ff --no-commit --log origin/topic/cmake-port
The merge-branch macro in useful macros provides a handy alias for this git command.
If you’re merging in a github pull request, then the command to use is specified in the nightly "Merge Status" mail.
Explaining the options:
--no-ff: Always records the merge as a separate commit, even if it could be fast-forwarded.
--no-commit: Does not commit the merge immediately, but just stages it.
--log: Includes the first sentence of each topic-branch commit in the commit log.
Resolve conflicts, if any.
Check the staged changes:
> git diff --cached
Edit anything you need to change and stage the changes:
> vi foo.c
> git add foo.c
Verify that all the test-suites pass.
Commit the merge:
> git commit
Update VERSION and CHANGES by running zeek-aux/devel-tools/update-changes. Note, do this after the git commit; the script will amend the previous commit so that no new one is created.
Update NEWS if the change is sufficiently major to be mentioned specifically in the release notes for the next Zeek release. Best to also amend this to the merge commit:
> vi NEWS
> git add NEWS
> git commit --amend
If any submodules have changed, the parents need to be updated to reference the new versions. There’s a script zeek-aux/devel-tools/git-move-submodules that moves all submodules recursively to the HEAD of a given branch and then updates all parent modules to reference that. In short: running git-move-submodules master in the top-level Zeek repository after a merge makes sure things are in order. The script creates a new commit but marks it so that no email notifications are going to be sent for it.
Delete the topic branch (if you merged a github pull request, then skip this step):
> git branch -d topic/cmake-port
> git push origin :topic/cmake-port   # Delete remote.
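The core of the merge steps above (inspect, stage the merge, review, commit) can be rehearsed in a scratch repository. This is a sketch under assumptions: the topic branch is local rather than origin/topic/cmake-port, and the file and commit message are invented for the example.

```shell
# Sandbox repository; the branch name mirrors the steps above.
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Maintainer"     # placeholder identity
git config user.email "maintainer@example.invalid"
git commit -q --allow-empty -m "Initial commit"

# A topic branch with one commit (local here, origin/... in practice).
git checkout -q -b topic/cmake-port
echo "project(example)" > CMakeLists.txt
git add CMakeLists.txt
git commit -q -m "Add CMake build file"
git checkout -q master

# Which commits would the merge bring in?
incoming=$(git log --no-merges --format=%s topic/cmake-port ^master)

# Stage the merge without committing, review it, then commit.
git merge -q --no-ff --no-commit --log topic/cmake-port
staged=$(git diff --cached --name-only)
git commit -q --no-edit

# A second parent shows that a real merge commit was recorded.
second_parent=$(git rev-parse -q --verify 'HEAD^2')
echo "incoming: $incoming"
echo "staged:   $staged"
echo "parent 2: $second_parent"
```

Because of --no-ff, master gets a merge commit whose second parent is the tip of the topic branch, even though a fast-forward would have been possible here.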
If you have tried a merge that resulted in complex conflicts and want to start over, you can recover with:
> git reset --merge
See the section on reverting changes above for other options.
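To get a feel for git reset --merge, the sketch below provokes a conflict in a scratch repository and then backs out of the merge. File contents, identity, and branch setup are assumptions made for the illustration.

```shell
# Sandbox repository where master and the topic branch edit the same line.
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Maintainer"     # placeholder identity
git config user.email "maintainer@example.invalid"
echo "base" > foo.c
git add foo.c
git commit -q -m "Add foo.c"
git checkout -q -b topic/cmake-port
echo "topic" > foo.c
git commit -q -a -m "Topic change"
git checkout -q master
echo "master" > foo.c
git commit -q -a -m "Master change"

# The merge stops with a conflict in foo.c.
if git merge --no-ff --no-commit topic/cmake-port >/dev/null 2>&1; then
    merge_result="clean"
else
    merge_result="conflict"
fi

# Start over: drop the half-done merge and restore master's version.
git reset --merge
status=$(git status --porcelain)
content=$(cat foo.c)
echo "merge: $merge_result; foo.c now contains: $content"
```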
Generally, this works similar to merging in a topic branch, except for the assumption that each commit on fastpath is an independent change. There are two ways to handle this:
We don’t further coordinate fastpath merges among maintainers, as the chance for conflicts is hopefully small. Just try to finish fastpath merges quickly once started. In case it takes longer for some reason, send mail to zeek-dev that you’re working on it.
The important thing to remember is that submodules reference a specific commit (i.e. repository snapshot) in the foreign repository. Because of this, the foreign repository can be developed completely independently of whatever includes it as a submodule. However, the maintainer may wish to update submodules to point to newer versions (presumably "stable" releases) from time to time. That is done as follows:
Checkout the master branch of your repository and make sure it is up-to-date:
> git checkout master
> git pull
Checkout the version you want to set the submodule to, like a specific branch:
> cd path/to/submodule/foo
> git pull
> git checkout release/v1.1
Double-check that the parent repository shows submodule changes:
> cd path/to/parent/repo
> git status
Stage the updated submodule (the lack of a trailing slash is important!):
> git add path/to/submodule/foo
Commit and push the updated submodule:
> git commit -m "Updated submodule foo to release 1.1"
> git push origin master
There’s a script that does all this automagically. Need to clean up the script and add to the repository, then document here. -Robin
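The manual submodule steps above can be tried against local stand-in repositories. The sketch below is not the script mentioned in the note; it is an illustration under assumptions: "foo" is a made-up upstream, the identity is a placeholder, the final push is left out because the sandbox has no remote, and protocol.file.allow=always is passed only because recent git versions refuse to clone submodules from local paths otherwise.

```shell
work=$(mktemp -d)
# Placeholder identity for all sandbox repositories.
export GIT_AUTHOR_NAME="Example Maintainer" GIT_AUTHOR_EMAIL="maintainer@example.invalid"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# Stand-in for the submodule's upstream repository.
git init -q "$work/foo"
git -C "$work/foo" symbolic-ref HEAD refs/heads/master
git -C "$work/foo" commit -q --allow-empty -m "foo 1.0"

# Parent repository that includes it as a submodule.
git init -q "$work/parent"
cd "$work/parent"
git symbolic-ref HEAD refs/heads/master
git -c protocol.file.allow=always submodule --quiet add "$work/foo" path/to/submodule/foo
git commit -q -m "Add submodule foo"

# Upstream publishes a newer version on a release branch.
git -C "$work/foo" checkout -q -b release/v1.1
git -C "$work/foo" commit -q --allow-empty -m "foo 1.1"

# Move the submodule checkout to that version.
git -C path/to/submodule/foo fetch -q origin
git -C path/to/submodule/foo checkout -q origin/release/v1.1

# The parent now sees a changed submodule; stage and commit it.
changed=$(git status --porcelain)
git add path/to/submodule/foo
git commit -q -m "Updated submodule foo to release 1.1"

# Commit recorded in the parent versus the upstream release tip.
recorded=$(git ls-tree HEAD path/to/submodule/foo | awk '{print $3}')
expected=$(git -C "$work/foo" rev-parse release/v1.1)
echo "recorded=$recorded expected=$expected"
```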
Note that current release process is now maintained here: https://docs.zeek.org/en/latest/devel/maintainers/release-process.html
Update Zeek’s NEWS with release notes. The file should contain the most important information about the new version, including a summary of major changes, especially incompatible ones. It should also point out the major changes in submodules.
Also, if local.bro has changed, point out how to adapt a previous one accordingly.
Double-check these:
- Ensure tests succeed in testing/btest, testing/external/bro-testing, and testing/external/bro-testing-private.
- Make sure to also run all these tests on a Zeek compiled with perftools support. That activates additional leak checking.
Finalize submodules before parent modules. For each module you want to release:
Make sure your checkout is up to date:
> git pull
If the module has submodules, make sure to reference their current versions:
> git add <module>
> git commit -m "Updating submodule." <module>
Make sure everything else is committed as well:
> git status
Update README as necessary and commit.
Do the final CHANGES update, set the new version, and tag as release:
> cd <into/module>
> update-changes -R v0.31
Note: If you rerun the update-changes script for a version already set in an earlier run, it will do the right thing: delete the old tags and create new ones. But you should do that only if you haven’t published that version yet.
git describe should now show the new version.
Push all modules:
> git submodule foreach --recursive git push
> git push
Run check-release to check if everything is in the right state. It should output something like this:
                    Branch  CHANGES  Pending  Modif  Sub  VERSION  Tags
+ bro               master  ok       0        ok     ok   2.0      v2.0,release
+ binpac            master  ok       0        ok     ok   0.31     v0.31,release
+ bro-aux           master  ok       0        ok     ok   0.22     v0.22,release
+ broccoli          master  ok       0        ok     ok   1.8      v1.8,release
+ broccoli-python   master  ok       0        ok     ok   0.52     v0.52,release
+ broccoli-ruby     master  ok       0        ok     ok   1.52     v1.52,release
+ broctl            master  ok       0        ok     ok   1.0      v1.0,release
+ capstats          master  ok       0        ok     ok   0.16     v0.16,release
+ pysubnettree      master  ok       0        ok     ok   0.17     v0.17,release
+ trace-summary     master  ok       0        ok     ok   0.73     v0.73,release
+ btest             master  ok       0        ok     ok   0.31     v0.31,release
If there’s a line starting with a minus instead of a plus, there may be something wrong (not necessarily a hard stop if you know what you’re doing, but something to check out). Here’s what the columns mean:
- Branch
The branch that the working tree for the repository is currently on. When doing a release, this must be master for all repos.
- CHANGES
This runs update-changes -c to check that there’s an entry in CHANGES referring to the repo’s most recent commit.
- Pending
Number of local commits not yet pushed. This should be zero before the release is made publicly available, to ensure everything is synced with GitHub. However, while preparing a release, it will often still show some not-yet-pushed commits; that’s fine.
- Modif
This shows Mod! (modified) if there are any uncommitted changes in the working tree. That might be ok, for example if some files are still lying around that shouldn’t go in; but it’s worth double-checking that nothing was left uncommitted by mistake.
- Sub
This checks that all of a repo’s submodules are pointing to their most current version. More precisely, it checks that git submodule status doesn’t report -/+/U for a submodule. In the words of the man page: "Each SHA-1 will be prefixed with - if the submodule is not initialized, + if the currently checked out submodule commit does not match the SHA-1 found in the index of the containing repository and U if the submodule has merge conflicts."
- VERSION
The contents of the VERSION file. The file is updated by update-changes and should show the version matching the targeted release number for the repo. The output will add an ! if there’s a dash in the version, as an indicator that it has the format of an internal development number (as opposed to the release format X.XX). Note that for a beta version (see below) that is fine, as we add a -beta postfix; just make sure it looks right.
- Tags
Any tags that refer to the current HEAD. update-changes takes care of setting these when called with -R or -B. For each repo, there should be (at least) two tags: one matching VERSION and one that says release to mark HEAD as a release (beta for a beta). The release/beta tags are what the web pages key on to refer to the most current versions. If there’s no release or beta among the tags, the output here will add a !.
Time to build tar-balls. Make sure you have a gpg-agent running. This will then build tgz for all modules and sign them:
> make-release --recursive
The output will show the new files with their signatures:
--- All distributions in /home/robin/bro/master/build/dist:
-rw-r--r-- 1 robin robin 16619 Jan 10 18:39 /home/robin/bro/master/build/dist/release/trace-summary-0.73.tar.gz.asc
-rw-r--r-- 1 robin robin 11601 Jan 10 18:37 /home/robin/bro/master/build/dist/release/trace-summary-0.73.tar.gz
-rw-r--r-- 1 robin robin 53628 Jan 10 18:39 /home/robin/bro/master/build/dist/release/pysubnettree-0.17.tar.gz.asc
[...]
Copy them over to the web server:
> scp build/dist/release/* www.zeek.org:~www/public_html/downloads
Push the tags now. This will reflect the new versions on the web server:
> git submodule foreach --recursive git push --tags
> git push --tags
Tag the external test suite repositories with release tags, too:
> git tag -m "Baselines matching Bro 2.5 release" -a release/2.5
> git push --tags
Create a new GitHub release at https://github.com/zeek/zeek/releases by uploading the .tar.gz and .tar.gz.asc files. Note that this only helps because GitHub’s automatic process for creating zip/tarfiles of release tags doesn’t include submodules (which tends to confuse users). Hopefully GitHub changes this in the future.
Check/monitor that Read the Docs builds and updates docs.zeek.org.
Create a maintenance branch:
> git checkout master
> git checkout -b release/2.0
> git push -u origin release/2.0
Do that for all modules you want to maintain separate release versions for.
Create a release tag in the zeek-docs repository (it’s also a submodule of zeek in the doc/ dir). The commands to do that look something like:
> git tag -a v$(cat ../VERSION) -m "Docs for Zeek $(cat ../VERSION)"
> git push --tags
This follows mostly the same process as when doing a release, with the following tweaks:
update-changes has a separate option -B <version> to make a beta version; using that creates a corresponding beta tag instead of release (the latter remains untouched). A beta version number must be of the form vX.Y[.Z]-beta* (the script will enforce that).
When doing a Zeek beta, it’s usually best to simply go ahead and make releases of all the submodules, except BroControl, first. Often the submodules won’t change anymore between beta and release, so that saves some time later. If a submodule changes, just do another release for it eventually; their version numbers don’t matter much anyway. Once all submodules are tagged as releases, prepare betas for Zeek and BroControl.
Copy the tar balls into the downloads/beta/ directory, not downloads/.
Edit the web pages in the www repository:
- In scripts/make-docs add a line beta -beta to VERSIONS.
- In root/download/index.rst enable the (raw HTML) block that shows the link to the beta tar ball.
- In root/documentation/index.rst add a new link to the beta edition of the "Zeek Manual" (the docs generated by sphinx in the "bro" git repo).
- Update root/documentation/beta/index.rst as necessary.
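The beta version format mentioned above, vX.Y[.Z]-beta*, can be checked by hand before calling update-changes -B. The helper below is hypothetical and not part of update-changes; it is a sketch that assumes X, Y, and Z are plain decimal numbers, which the text above does not spell out.

```shell
# Hypothetical helper: succeeds if $1 looks like vX.Y[.Z]-beta*.
# Assumes X, Y and Z are decimal numbers.
is_beta_version() {
    printf '%s\n' "$1" | grep -Eq '^v[0-9]+\.[0-9]+(\.[0-9]+)?-beta'
}

for v in v2.0-beta v2.0.1-beta2 v2.0 2.0-beta; do
    if is_beta_version "$v"; then
        echo "$v: looks like a beta version"
    else
        echo "$v: not a beta version"
    fi
done
```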
While there is a git for SVN users crash course on the git homepage, it can be more confusing than helpful, especially since it doesn’t deal much with the local vs. remote relationships.
However, two very good and quick to read introductions to git are here:
A more comprehensive book on Git is available online:
A great comprehensive book / guide to git is:
Since Zeek uses an optional Git "superproject" organization for the repositories, developers and maintainers may find it helpful to review Git Submodules.
Generally, it’s worth reading (and understanding :) the principles behind git (e.g., by reading the first few chapters in the O’Reilly book) as this will greatly help understanding what’s going on.
Merging upstream changes into the current branch:
> git pull
Actually, this command is shorthand for fetching all changes from remote and then merging the remote of the current branch into the current branch (i.e., a pull fetches all remote but only merges/applies changes to the current local branch).
List all available local branches (the one you’re currently on is marked with an asterisk):
> git branch
List all branches, remote and local with remote tracking and latest commit information:
> git branch -avv
To switch the working copy over to a specific branch, say fastpath:
> git checkout fastpath
Push (only) the current local branch to its remote counterpart:
> git push origin HEAD
Using only git push will push all local branches to their remote counterparts in older git versions; newer versions push only the current branch by default.
In order to commit changes to git, one has to git add FILE to stage the file and then git commit to actually commit the file. (Or use the shorthand git commit -a.) However, the staging area can also be used for checkpoints.
Whenever a change is staged with git add FILE git remembers the state of the file at this time. If you then further edit the FILE, you can either stage the new version by using git add FILE again, or revert to the previously staged version by using git checkout -- FILE. In order to unstage the file, you can use git reset HEAD FILE. (This will not change the file’s contents in the working copy. It only removes the staged content. In order to revert the FILE to the last commit version, you have to do another git checkout -- FILE.) If in doubt, git status is your friend.
Thus, if you want checkpoints, every git add can be considered a checkpoint and a git checkout gets you back to the last checkpoint.
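The checkpoint behavior described above can be followed step by step in a scratch repository. This is a minimal sketch; the file contents (v1, v2, v3) and the identity are made up for the illustration.

```shell
# Sandbox repository with FILE committed as "v1".
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Developer"      # placeholder identity
git config user.email "dev@example.invalid"
echo "v1" > FILE
git add FILE
git commit -q -m "Add FILE"

echo "v2" > FILE
git add FILE                    # checkpoint: the index now holds "v2"
echo "v3" > FILE                # further edit that turns out badly
git checkout -- FILE            # back to the checkpoint
after_checkpoint=$(cat FILE)

git reset -q HEAD FILE          # unstage; the working copy is untouched
after_unstage=$(cat FILE)

git checkout -- FILE            # index matches HEAD now, so this restores "v1"
after_revert=$(cat FILE)
echo "$after_checkpoint $after_unstage $after_revert"
```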
If a branch is pushed upstream, each local commit will result in a commit to the upstream branch, thus triggering a commit message. However, sometimes it may be useful to squash several local commits (e.g., a bunch of checkpoint commits) into one upstream commit using git rebase -i. See here for more information.
But keep the Golden Rule in mind: when rebasing, never change anything that has already been pushed. One way to rebase only what has not yet been merged with origin/master is:
git rebase -i origin/master
Alternately, you might consider using the index instead of many local commits, see Using the Index for Checkpoints.
Normally, a fast-forward merge occurs by default when the current branch HEAD is directly upstream from the HEAD of the branch being merged. The --no-ff flag will, in this case, force the generation of a merge commit instead of silently moving the branch HEAD forward (a fast-forward). This avoids losing historical information about the existence of a topic branch because, although it doesn’t contain any content changes itself, the "merge commit" will have parent pointers corresponding to each of the old branch HEADs, thus allowing one to see exactly what commits were involved in a topic branch.
Commit messages should follow the following style as some of the git tools rely on that structure for picking out the right information:
Short (50 chars or less) summary of changes.

More detailed explanatory text, if necessary. Wrap it to about 72
characters or so. In some contexts, the first line is treated as the
subject of an email and the rest of the text as the body. The blank
line separating the summary from the body is critical (unless you omit
the body entirely); tools like rebase can get confused if you run the
two together.

Further paragraphs come after blank lines.

- Bullet points are okay, too

- Typically a hyphen or asterisk is used for the bullet, preceded by a
  single space, with blank lines in between, but conventions vary here

- Use a hanging indent
(This is stolen and adapted from here.)
In addition, make sure to include sufficient context with your changes that the repository maintainer can easily create a CHANGES entry for your work.
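To check that a message is split into subject and body the way the tools expect, one can commit it in a scratch repository and ask git for the two parts. The sketch below assumes an invented message and a placeholder identity.

```shell
# Sandbox repository for trying out a commit message.
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Developer"      # placeholder identity
git config user.email "dev@example.invalid"

# Summary line, blank line, then the body.
cat > "$work/msg.txt" <<'EOF'
Fix off-by-one in example parser

Explain what changed and why, wrapped at about 72 characters, so that
a maintainer can turn it into a CHANGES entry.

- Bullet points are okay, too
EOF
git commit -q --allow-empty -F "$work/msg.txt"

subject=$(git log -1 --format=%s)   # what git treats as the subject
body=$(git log -1 --format=%b)      # everything after the blank line
echo "subject (${#subject} chars): $subject"
```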
When committing something in response to working on a GitHub issue, you may include "close", "fix", or "resolve" keywords in the commit message so that the issue is automatically closed upon merge. For example, write "Fixes GH-42" to close issue 42 upon merging. Or you can omit the verb to just generate a reference to the commit within the issue/PR without taking any action.
See the full GitHub keywords documentation
This section only applies to those who have checked out the zeek repository recursively in order to get a "superproject" containing all (or most) of the Repositories:
git clone --recursive https://github.com/zeek/zeek
This command will checkout the Zeek master branch along with whatever versions of all submodules that repository maintainers currently deem stable.
If one wants to make changes to anything in the "superproject" source tree, it’s the same process as outlined in either the For Developers or For External Contributors sections. However, developers with write access to repositories will notice that all the submodule repositories are initialized using the public URLs, so they may want to add a global git option to automatically rewrite them into the private URLs for write access during push commands:
git config --global url.git@github.com:zeek.pushInsteadOf https://github.com/zeek
Developers that only plan to make changes to the top-level zeek repository may not need to worry about the submodule repositories at all — the local copy will remain at whatever version was originally cloned even after performing git pull, however if the pulled version of the parent repository changes what commit the submodule points to, the opportunity can be taken to update the local copy. See the fix-submodules macro below for how to update the local copies of submodules after the parent repository’s maintainer has changed them.
Git lets you define custom commands (aliases) in ~/.gitconfig that expand into more complex ones. Put them into an [alias] section. Here are some that might be helpful for the workflow above:
Like git merge but always add the options --log --no-commit --no-ff (see merging).
merge-branch = merge --log --no-commit --no-ff
Like git push, but pushes only the current branch upstream.
push-current = push origin HEAD
Executes git <git cmd> recursively for all submodules.
recursive = "!sh -c 'for i in . `git submodule foreach -q --recursive pwd`; do cd $i && git $@; done' -"
Executes the given shell command recursively for all submodules.
recursive-sh = "!sh -c 'for i in . `git submodule foreach -q --recursive pwd`; do cd $i && $@; done' -"
Makes sure all submodules are checked out at the revisions specified by their parent module.
fix-submodules = submodule update --recursive --init
Like git log, but compresses the output format into one-line summaries.
log1 = log --format=oneline
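An alias can be tried out in a scratch repository before it goes into ~/.gitconfig. The sketch below sets two of the aliases above with git config in the repository's own config, which is an assumption made to keep the example self-contained; the commits are placeholders.

```shell
# Sandbox repository with two commits.
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Developer"      # placeholder identity
git config user.email "dev@example.invalid"
git commit -q --allow-empty -m "First commit"
git commit -q --allow-empty -m "Second commit"

# Repository-local equivalents of the [alias] entries above.
git config alias.log1 "log --format=oneline"
git config alias.merge-branch "merge --log --no-commit --no-ff"

# The alias is now usable like a built-in command.
lines=$(git log1 | grep -c .)
expansion=$(git config --get alias.merge-branch)
echo "log1 printed $lines line(s); merge-branch expands to: $expansion"
```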
© 2014 The Bro Project.