<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://brokenco.de//feed/by_tag/git.xml" rel="self" type="application/atom+xml" /><link href="https://brokenco.de//" rel="alternate" type="text/html" /><updated>2026-05-03T00:12:50+00:00</updated><id>https://brokenco.de//feed/by_tag/git.xml</id><title type="html">rtyler</title><subtitle>a moderately technical blog</subtitle><author><name>R. Tyler Croy</name></author><entry><title type="html">Taking control of Git</title><link href="https://brokenco.de//2018/11/25/taking-control-of-git.html" rel="alternate" type="text/html" title="Taking control of Git" /><published>2018-11-25T00:00:00+00:00</published><updated>2018-11-25T00:00:00+00:00</updated><id>https://brokenco.de//2018/11/25/taking-control-of-git</id><content type="html" xml:base="https://brokenco.de//2018/11/25/taking-control-of-git.html"><![CDATA[<p>In the development of service-oriented applications we often will use the
phrase “source of truth” when referring to data and its ownership. The
expectation being that there is generally a <em>single</em> source of truth in the
system. Take DNS for example, we generally trust that a nameserver somewhere
out there is acting as the single source of truth for a single domain, such as
<code class="language-plaintext highlighter-rouge">brokenco.de</code>. Without this guarantee, much of our experience on the internet
would break down. For the software we write, increasingly GitHub has become the
source of truth for the source code itself. So much so that systems have been
built on top of GitHub which further wed the software ecosystem to a single
source of truth, such as Golang’s dependency definition conventions.</p>

<p>I have no fear of the GitHub acquisition by Microsoft, but I do concern myself
with increasingly large single points of failure. A single entity owning too
much of my interactions or data makes me feel uneasy. A little timer starts in
my brain: how long until the good vibes run out, and this ends up screwing me?</p>

<p>With our source code living in Git, switching up the source of truth has never
been easier.  I set out recently to take back <em>control</em> for the source of truth
for my own free and open source work. Using a server I have at my disposal, I
deployed <a href="https://gitea.io/">Gitea</a>. Based originally on Gogs, I have found
Gitea rather pleasant and simple to work with.</p>

<p>Fortunately, somebody else has written a tool: the <a href="https://git.jonasfranz.software/JonasFranzDEV/gitea-github-migrator">gitea-github-migrator</a>
which made initializing the Gitea instance with my repositories quite simple.
Due to some GitHub rate limits and other weird transient network errors, I
ended up running the migrator over and over again until everything was
synchronized properly to my server.</p>

<p>A quick look at my <a href="https://github.com/rtyler">GitHub profile</a> and you may
notice that nothing has been <em>deleted</em>. My objective is to own the source of
truth, not to reduce the redundancy for my source code. Unfortunately as of
today, Gitea cannot automatically push to another Git remote
(<a href="https://github.com/go-gitea/gitea/issues/3480">issue #3480</a>), but creating a script
which can be configured as a <code class="language-plaintext highlighter-rouge">post-receive</code> hook is easy enough:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/bin/sh</span>

<span class="nb">echo
echo</span> <span class="s2">"Mirroring changes to GitHub under </span><span class="k">${</span><span class="nv">GITEA_REPO_USER_NAME</span><span class="k">}</span><span class="s2">/</span><span class="k">${</span><span class="nv">GITEA_REPO_NAME</span><span class="k">}</span><span class="s2">"</span>
<span class="nb">echo
</span>git push <span class="nt">--mirror</span> git@github.com:<span class="k">${</span><span class="nv">GITEA_REPO_USER_NAME</span><span class="k">}</span>/<span class="k">${</span><span class="nv">GITEA_REPO_NAME</span><span class="k">}</span>.git
<span class="nb">echo</span>
</code></pre></div></div>

<p>To support this script I needed to set up a few things:</p>

<ol>
  <li>A newly generated SSH public/private key pair for Gitea to use.</li>
  <li>The new SSH public key needed to be added to my GitHub account</li>
  <li>The above script <code class="language-plaintext highlighter-rouge">gitea-github-mirror</code> installed on the server’s filesystem</li>
  <li>The repositories I wished to mirror needed to have a <code class="language-plaintext highlighter-rouge">post-receive</code> hook
configured which executes <code class="language-plaintext highlighter-rouge">gitea-github-mirror</code></li>
</ol>
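
<p>Before wiring the hook into a real Gitea repository, the mechanism can be rehearsed entirely on local disk. The following is only a sketch: two throwaway bare repositories stand in for the Gitea-hosted repository and the GitHub mirror, and the temporary paths and the <code class="language-plaintext highlighter-rouge">demo</code> identity are made up for illustration.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh
set -eu

# Scratch area holding a stand-in "primary" (Gitea) and "backup" (GitHub) repository.
work=$(mktemp -d)
git init --quiet --bare "$work/primary.git"
git init --quiet --bare "$work/backup.git"

# Install a post-receive hook which mirrors every push to the backup repository.
# (tee also echoes the generated hook to the terminal.)
mkdir -p "$work/primary.git/hooks"
printf '%s\n' '#!/bin/sh' "git push --quiet --mirror '$work/backup.git'" | tee "$work/primary.git/hooks/post-receive"
chmod +x "$work/primary.git/hooks/post-receive"

# Push a commit to the primary; the hook forwards it to the backup.
git init --quiet "$work/clone"
cd "$work/clone"
git -c user.name=demo -c user.email=demo@example.invalid commit --quiet --allow-empty -m "first commit"
git push --quiet "$work/primary.git" HEAD:refs/heads/master

# The commit should now be visible in the backup as well.
git --git-dir="$work/backup.git" log --oneline master
</code></pre></div></div>

<p>Once this behaves as expected locally, the only differences in the real setup are the SSH remote URL and the key that Gitea uses to authenticate against GitHub.</p>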

<p>Once the desired repositories have been set up, I only needed to change my
local repositories to point somewhere else for their <code class="language-plaintext highlighter-rouge">origin</code> remote.
Not-too-coincidentally, this is where my previous blog post about <a href="/2018/11/05/transparent-ssh-over-tor.html">transparently
switching SSH between Tor and the LAN</a> comes in.</p>
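
<p>Repointing a local clone is a one-line change. As a minimal sketch, assuming a hypothetical self-hosted hostname (<code class="language-plaintext highlighter-rouge">git.example.invalid</code>) and placeholder user and project names, it looks roughly like this on a scratch repository:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh
set -eu

# Throwaway repository with an "origin" that points at GitHub.
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git remote add origin git@github.com:example-user/example-project.git

# Repoint "origin" at the self-hosted server (hypothetical hostname).
git remote set-url origin git@git.example.invalid:example-user/example-project.git

# Show where "origin" points now.
git remote get-url origin
</code></pre></div></div>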

<p>I can now treat GitHub like a public backup for these repositories, and
maintain control over the source of truth for each repository I own and
maintain.</p>

<h3 id="mirroring-other-repositories">Mirroring other repositories</h3>

<p>Gitea has another feature worth mentioning in this same vein, one which I am
only now starting to use: (pull-based) repository mirroring. Inevitably I find
myself relying on third-party repositories either as Git submodules, or for
source-builds of some piece of software. Rather than trust that those
repositories will exist in perpetuity in somebody else’s GitHub organization or
user account, Gitea mirroring allows me to create an automatically-updated
mirror of the upstream repository. I’ve since found myself creating new
organizations in Gitea to house different collections of libraries and tools I
depend on, all automatically synchronized by Gitea.</p>
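
<p>Outside of Gitea, a comparable pull-based mirror can be approximated with plain Git. This sketch makes no claim about how Gitea implements the feature; it only illustrates the general idea, using a local stand-in for the upstream repository and a made-up <code class="language-plaintext highlighter-rouge">demo</code> identity.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh
set -eu

work=$(mktemp -d)

# Stand-in for a third-party upstream repository with one commit.
git init --quiet "$work/upstream"
git -C "$work/upstream" -c user.name=demo -c user.email=demo@example.invalid commit --quiet --allow-empty -m "upstream commit"

# Create the mirror once; --mirror copies every ref and sets the clone up to track them all.
git clone --quiet --mirror "$work/upstream" "$work/mirror.git"

# Later, after upstream has moved on, refresh the mirror (this is the step to schedule).
git -C "$work/upstream" -c user.name=demo -c user.email=demo@example.invalid commit --quiet --allow-empty -m "another upstream commit"
git -C "$work/mirror.git" remote update --prune

# Count the commits the mirror now knows about.
git -C "$work/mirror.git" rev-list --count HEAD
</code></pre></div></div>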

<hr />

<p>Data provenance is an important subject to me and while not everything is as
easily decentralized as Git, I believe it’s worth the effort to try to <strong>own
your data</strong> as much as possible. For those things which are easily added into
source control, Gitea and a modicum of extra disk space does the job nicely!</p>

<p>(Of course, this blog post was published to GitHub pages, after being mirrored
from my Gitea instance.)</p>]]></content><author><name>R. Tyler Croy</name></author><category term="software" /><category term="git" /><category term="github" /><category term="opinion" /><summary type="html"><![CDATA[In the development of service-oriented applications we often will use the phrase “source of truth” when referring to data and its ownership. The expectation being that there is generally a single source of truth in the system. Take DNS for example, we generally trust that a nameserver somewhere out there is acting as the single source of truth for a single domain, such as brokenco.de. Without this guarantee, much of our experience on the internet would break down. For the software we write, increasingly GitHub has become the source of truth for the source code itself. So much so that systems have been built on top of GitHub which further wed the software ecosystem to a single source of truth, such as Golang’s dependency definition conventions.]]></summary></entry><entry><title type="html">A rebase-based workflow</title><link href="https://brokenco.de//2010/04/02/a-rebase-based-workflow.html" rel="alternate" type="text/html" title="A rebase-based workflow" /><published>2010-04-02T00:00:00+00:00</published><updated>2010-04-02T00:00:00+00:00</updated><id>https://brokenco.de//2010/04/02/a-rebase-based-workflow</id><content type="html" xml:base="https://brokenco.de//2010/04/02/a-rebase-based-workflow.html"><![CDATA[<p><a href="http://agentdero.cachefly.net/unethicalblogger.com/images/branch_madness.jpeg" target="_blank">
<img src="http://agentdero.cachefly.net/unethicalblogger.com/images/branch_madness.jpeg" width="200" align="right" /></a></p>

<p>When I first started working with Git in <a href="http://unethicalblogger.com/posts/2008/07/experimenting_with_git_slide_part_13">mid
2008</a>
I was blissfully oblivious to the concept of a “rebase” and why somebody might
ever use it. While at Slide we were <strong>crazy</strong> for merging (<em>see diagram to the
right</em>), everything pretty much revolved around merges between branches. To add
insult to injury, development revolved around a single central repository which
<em>everyone</em> had the ability to push to. Merges compounded upon merges led to a
frustratingly complex merge history.</p>

<p>When I first arrived at Apture, we were still using Subversion, similar to Slide when I arrived (I have a Git-effect on companies). In order to work effectively, I <em>had</em> to use git-svn(1) in order to commit changes that weren’t quite finished on a day-to-day basis. Rebasing is fundamental to the git-svn(1) workflow, as Subversion requires a linear revision history; I would typically work in the <code class="language-plaintext highlighter-rouge">master</code> branch and execute <code class="language-plaintext highlighter-rouge">git svn rebase</code> prior to <code class="language-plaintext highlighter-rouge">git svn dcommit</code> to ensure that my changes could be properly committed at the head of trunk.</p>

<p>When we finally switched from Subversion to Git we adopted an “integration-manager workflow” which is far more conducive to rebase being useful than the purely centralized repository workflow I had previously used at Slide.</p>

<center><img src="http://agentdero.cachefly.net/unethicalblogger.com/images/integration_manager_workflow.png" /></center>
<center><small>From the <a href="http://progit.org/book/ch5-1.html">Pro Git</a> site</small></center>

<p>In addition to the publicly readable repositories for each developer, we use Gerrit religiously which I’ll cover in a later post.</p>

<p>We use rebase heavily in this workflow to accomplish three main goals:</p>

<ul>
  <li>Linear revision history</li>
  <li>Concise commits covering a logical change</li>
  <li>Reduction of merge conflicts</li>
</ul>

<p>Creating a solid linear revision history, while not immediately important, is nicer in the longer term allowing developers (or new hires) to walk the history of a particular file or module and see a clear progression of changes.</p>

<p><img src="http://agentdero.cachefly.net/unethicalblogger.com/images/qgit_apture_graph.png" align="right" hspace="4" vspace="4" />Creating concise commits is probably the <strong>most</strong> important reason to use rebase. When working in a topic branch I will typically commit every 20-40 minutes. In order to not break my flow, the commit messages will typically be brief and cover only a few lines of changes. Atomic commits are great when writing code, but they’re lousy at informing other developers about the changes. To tidy them up before sharing, an “interactive rebase” can be used. For example, collapsing the commits in a topic branch <code class="language-plaintext highlighter-rouge">ticket-1234</code> would look like:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">git checkout ticket-1234</code></li>
  <li><code class="language-plaintext highlighter-rouge">git rebase -i master</code></li>
</ul>

<p>This will bring up an editor with a list of commits, where you can “squash” commits together and re-write the final commit message to be more informative.</p>
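
<p>To see the squash happen without an editor session, the interactive rebase can be scripted on a scratch repository. This is a sketch under a few assumptions: GNU <code class="language-plaintext highlighter-rouge">sed</code> (for <code class="language-plaintext highlighter-rouge">-i</code>) is available, the <code class="language-plaintext highlighter-rouge">demo</code> identity and file names are made up, and <code class="language-plaintext highlighter-rouge">GIT_SEQUENCE_EDITOR</code> and <code class="language-plaintext highlighter-rouge">GIT_EDITOR</code> stand in for the edits you would normally make by hand.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name demo
git config user.email demo@example.invalid

# A baseline commit on master, then three small commits on the topic branch.
git commit --quiet --allow-empty -m "baseline"
git branch -M master
git checkout --quiet -b ticket-1234
for n in 1 2 3; do
    touch "file$n"
    git add "file$n"
    git commit --quiet -m "wip $n"
done

# Rewrite the todo list so that every commit after the first is squashed into it.
# GIT_EDITOR=true accepts the combined commit message without opening an editor.
GIT_SEQUENCE_EDITOR="sed -i -e '2,\$s/^pick/squash/'" GIT_EDITOR=true git rebase -i master

# List what is left on the topic branch.
git log --oneline master..ticket-1234
</code></pre></div></div>

<p>In day-to-day use you would leave both variables unset and write a proper commit message when the editor opens.</p>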

<h3 id="the-workflow">The Workflow</h3>

<p>For the purposes of the example, let’s use the topic branch from above (<code class="language-plaintext highlighter-rouge">ticket-1234</code>) which we’ll assume has 3 commits unique to it.</p>

<ol>
  <li>Fetch the latest changes from the upstream “master” branch
    <ul>
      <li><code class="language-plaintext highlighter-rouge">git fetch origin</code></li>
    </ul>
  </li>
  <li>Rebase the topic branch, effectively piling the 3 commits on top of the latest tip of the upstream “master” branch
    <ul>
      <li><code class="language-plaintext highlighter-rouge">git rebase origin/master</code></li>
    </ul>
  </li>
  <li>Collapse the 3 commits in the topic branch down into one commit
    <ul>
      <li><code class="language-plaintext highlighter-rouge">git rebase -i origin/master</code></li>
    </ul>
  </li>
  <li>(<em>Later</em>) Bringing those commits down into the “master” branch
    <ul>
      <li><code class="language-plaintext highlighter-rouge">git checkout master &amp;&amp; git rebase ticket-1234</code></li>
    </ul>
  </li>
</ol>
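
<p>The steps above can be tried end to end on scratch repositories. This is a sketch rather than a recipe: a local repository stands in for the upstream, the <code class="language-plaintext highlighter-rouge">demo</code> identity and file names are made up, and the topic branch carries a single commit, so the squash in step 3 is skipped.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh
set -eu

work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# A stand-in upstream with one commit on master, and a developer clone of it.
git init --quiet "$work/upstream"
cd "$work/upstream"
git commit --quiet --allow-empty -m "baseline"
git branch -M master
git clone --quiet "$work/upstream" "$work/dev"

# Upstream moves ahead while the developer works on a topic branch.
touch upstream-change
git add upstream-change
git commit --quiet -m "upstream change"

cd "$work/dev"
git checkout --quiet -b ticket-1234
touch topic-change
git add topic-change
git commit --quiet -m "topic change"

# Steps 1 and 2: fetch, then replay the topic branch on top of origin/master.
git fetch --quiet origin
git rebase origin/master

# Step 4: bring the rebased commits into the local master branch.
git checkout --quiet master
git rebase ticket-1234

git log --oneline master
</code></pre></div></div>

<p>Because the topic branch was rebased first, the final rebase of <code class="language-plaintext highlighter-rouge">master</code> is a plain fast-forward and the history stays linear.</p>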

<p>With an interactive rebase, you can chop commits up, re-order them, squash them, and so on. With the non-interactive rebase, you can pile your commits on top of an upstream head, making your changes apply cleanly to the latest code in the upstream repository.</p>

<p><a href="http://www.gitready.com/">git ready</a> has a few nice articles on the subject as well, such as an <a href="http://www.gitready.com/intermediate/2009/01/31/intro-to-rebase.html">intro to rebase</a> and an article on <a href="http://www.gitready.com/advanced/2009/02/10/squashing-commits-with-rebase.html">squashing commits with rebase</a></p>]]></content><author><name>R. Tyler Croy</name></author><category term="software development" /><category term="git" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">Pre-tested commits with Hudson and Git</title><link href="https://brokenco.de//2009/12/31/pre-tested-commits-with-hudson-and-git.html" rel="alternate" type="text/html" title="Pre-tested commits with Hudson and Git" /><published>2009-12-31T00:00:00+00:00</published><updated>2009-12-31T00:00:00+00:00</updated><id>https://brokenco.de//2009/12/31/pre-tested-commits-with-hudson-and-git</id><content type="html" xml:base="https://brokenco.de//2009/12/31/pre-tested-commits-with-hudson-and-git.html"><![CDATA[<p>A few months ago <a id="aptureLink_yMRaEAQt6P" href="http://twitter.com/kohsukekawa">Kohsuke</a>, author of the <a id="aptureLink_gay9zt4yuf" href="http://twitter.com/hudsonci">Hudson continuous integration server</a>, 
introduced me to the concept of the “pre-tested commit”, a feature of the <a id="aptureLink_h8ICO1PttT" href="http://en.wikipedia.org/wiki/TeamCity">TeamCity</a>
build management and continuous integration system. The concept is simple: the build
system stands as a roadblock between your commit entering trunk and only after the 
build system determines that your commit doesn’t break things does it allow the commit
to be introduced into version control, where other developers will sync and integrate 
that change into their local working copies. The reasoning and workflow put forth by 
TeamCity for “pre-tested commits” is very dependent on a centralized version control
system, it is solving an issue <a id="aptureLink_IXcu5r11no" href="http://en.wikipedia.org/wiki/Git%20%28software%29">Git</a> or <a id="aptureLink_cPtvZ5XxiP" href="http://en.wikipedia.org/wiki/Mercurial%20%28software%29">Mercurial</a> users don’t really run into. Those using 
Git can commit their hearts out all day long and it won’t affect their colleagues until they
<strong>merge</strong> their commits with others.</p>

<p>In some cases, allowing buggy or broken code to be <em>merged</em> in from another developer’s Git
repository can be worse than in a central version control system, since the recipient of the 
broken code might perform a knee-jerk <a id="aptureLink_N7GE0Q9soz" href="http://www.kernel.org/pub/software/scm/git/docs/git-revert.html">git-revert(1)</a> command on the merge! When you revert 
a merge commit in Git, what happens is you not only revert the merge, you revert the commits 
associated with that merge commit; in essence, you’re reverting <em>everything</em> you just merged in 
when you likely just wanted to get the broken code out of your local tree so you could continue
working without interruption. To solve for this problem-case, I utilize a “pre-tested commit” or 
“pre-tested merge” workflow with Hudson.</p>

<p>My workflow with Hudson for pre-tested commits involves three separate Git repositories: my local
repo (local), the canonical/central repo (origin) and my “world-readable” (inside the firewall) repo (public). 
For pre-tested commits, I utilize a constantly changing branch called “pu” (potential updates) on the 
world-readable repo. Inside of Hudson I created a job that polls the world-readable repo (public) 
for changes in the “pu” branch and will kick off builds when updates are pushed. Since the content of 
<code class="language-plaintext highlighter-rouge">public/pu</code> is constantly changing, the <a id="aptureLink_O9LMHblU7c" href="http://www.kernel.org/pub/software/scm/git/docs/git-push.html">git-push(1)</a> commands to it must be “forced-updates” since I am 
effectively rewriting history every time I push to <code class="language-plaintext highlighter-rouge">public/pu</code>.</p>

<p>To help with forcefully pushing updates from my current local branch to <code class="language-plaintext highlighter-rouge">public/pu</code> I use the following <a id="aptureLink_jO9JAsy1Sm" href="http://git.or.cz/gitwiki/Aliases">git alias</a>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% git config alias.pup "\!f() { branch=\$(git symbolic-ref HEAD | sed 's/refs\\/heads\\///g');\
      git push -f \$1 +\${branch}:pu;}; f"
</code></pre></div></div>

<p>While a little obfuscated, this <code class="language-plaintext highlighter-rouge">pup</code> alias forcefully pushes the contents of the current branch to the specified 
remote repository’s <code class="language-plaintext highlighter-rouge">pu</code> branch. I find this is easier than constantly typing out: <code class="language-plaintext highlighter-rouge">git push -f public +topic:pu</code></p>
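
<p>For readability, the same logic can be written out as an ordinary shell function. This is a sketch rather than a drop-in replacement for the alias: the function reuses the name <code class="language-plaintext highlighter-rouge">pup</code> purely for illustration, and it relies on <code class="language-plaintext highlighter-rouge">git symbolic-ref --short</code>, which needs a reasonably recent Git, in place of the <code class="language-plaintext highlighter-rouge">sed</code> expression.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Force-push the current branch to the "pu" branch of the given remote.
pup() {
    remote=$1
    # Resolve HEAD to a short branch name, e.g. "topic" instead of "refs/heads/topic".
    branch=$(git symbolic-ref --short HEAD)
    # The leading "+" in the refspec allows the non-fast-forward update of "pu".
    git push -f "$remote" "+${branch}:pu"
}
</code></pre></div></div>

<p>With the function defined in your shell, <code class="language-plaintext highlighter-rouge">pup public</code> behaves like <code class="language-plaintext highlighter-rouge">git pup public</code>.</p>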

<p>In list form, my workflow for taking a change from inception to <code class="language-plaintext highlighter-rouge">origin</code> is:</p>

<ul>
  <li><em>hack, hack, hack</em></li>
  <li>commit to <code class="language-plaintext highlighter-rouge">local/topic</code></li>
  <li><code class="language-plaintext highlighter-rouge">git pup public</code></li>
  <li>Hudson polls <code class="language-plaintext highlighter-rouge">public/pu</code></li>
  <li>Hudson runs potential-updates job</li>
  <li>Tests fail?
    <ul>
      <li><strong>Yes</strong>: Rework commit, try again</li>
      <li><strong>No</strong>: Continue</li>
    </ul>
  </li>
  <li>Rebase onto <code class="language-plaintext highlighter-rouge">local/master</code></li>
  <li>Push to <code class="language-plaintext highlighter-rouge">origin/master</code></li>
</ul>

<p>Using this pre-tested commit workflow I can offload the majority of my testing requirements to the build system’s cluster of machines instead of running them locally, meaning I can spend the <strong>majority</strong> of my time writing code instead of waiting for tests to complete on my own machine in between coding iterations.</p>]]></content><author><name>R. Tyler Croy</name></author><category term="software development" /><category term="git" /><category term="hudson" /><summary type="html"><![CDATA[A few months ago Kohsuke, author of the Hudson continuous integration server, introduced me to the concept of the “pre-tested commit”, a feature of the TeamCity build management and continuous integration system. The concept is simple, the build system stands as a roadblock between your commit entering trunk and only after the build system determines that your commit doesn’t break things does it allow the commit to be introduced into version control, where other developers will sync and integrate that change into their local working copies. The reasoning and workflow put forth by TeamCity for “pre-tested commits” is very dependent on a centralized version control system, it is solving an issue Git or Mercurial users don’t really run into. Those using Git can commit their hearts out all day long and it won’t affect their colleagues until they merge their commits with others. In some cases, allowing buggy or broken code to be merged in from another developer’s Git repository can be worse than in a central version control system, since the recipient of the broken code might perform a knee-jerk git-revert(1) command on the merge! When you revert a merge commit in Git, what happens is you not only revert the merge, you revert the commits associated with that merge commit; in essence, you’re reverting everything you just merged in when you likely just wanted to get the broken code out of your local tree so you could continue working without interruption. 
To solve for this problem-case, I utilize a “pre-tested commit” or “pre-tested merge” workflow with Hudson. My workflow with Hudson for pre-tested commits involves three separate Git repositories: my local repo (local), the canonical/central repo (origin) and my “world-readable” (inside the firewall) repo (public). For pre-tested commits, I utilize a constantly changing branch called “pu” (potential updates) on the world-readable repo. Inside of Hudson I created a job that polls the world-readable repo (public) for changes in the “pu” branch and will kick off builds when updates are pushed. Since the content of public/pu is constantly changing, the git-push(1) commands to it must be “forced-updates” since I am effectively rewriting history every time I push to public/pu. To help forcefully pushing updates from my current local branch to public/pu I use the following git alias: % git config alias.pup "\!f() { branch=\$(git symbolic-ref HEAD | sed 's/refs\\/heads\\///g');\ git push -f \$1 +\${branch}:pu;}; f" While a little obfuscated, this pup alias forcefully pushes the contents of the current branch to the specified remote repository’s pu branch. I find this is easier than constantly typing out: git push -f public +topic:pu In list form, my workflow for taking a change from inception to origin is: hack, hack, hack commit to local/topic git pup public Hudson polls public/pu Hudson runs potential-updates job Tests fail? 
Yes: Rework commit, try again No: Continue Rebase onto local/master Push to origin/master Using this pre-tested commit workflow I can offload the majority of my testing requirements to the build system’s cluster of machines instead of running them locally, meaning I can spend the majority of my time writing code instead of waiting for tests to complete on my own machine in between coding iterations.]]></summary></entry><entry><title type="html">Code Review with Gerrit, a mostly visual guide</title><link href="https://brokenco.de//2009/12/07/code-review-with-gerrit-a-mostly-visual-guide.html" rel="alternate" type="text/html" title="Code Review with Gerrit, a mostly visual guide" /><published>2009-12-07T00:00:00+00:00</published><updated>2009-12-07T00:00:00+00:00</updated><id>https://brokenco.de//2009/12/07/code-review-with-gerrit-a-mostly-visual-guide</id><content type="html" xml:base="https://brokenco.de//2009/12/07/code-review-with-gerrit-a-mostly-visual-guide.html"><![CDATA[<p><strong>Update:</strong> Some of this information is out of date. Instead of pushing to the
<code class="language-plaintext highlighter-rouge">gerrit</code> master branch I recommend setting up
“<a href="http://gerrit.googlecode.com/svn/documentation/2.0/config-replication.html">replication</a>” and using the
“Submit” button inside of the “Review” page.</p>

<hr />

<p>A while ago, when <a id="aptureLink_DCQGFvVLOq" href="http://twitter.com/pjthiel">Paul</a>, <a id="aptureLink_BbwdfFjMPz" href="http://twitter.com/jasonrubenstein">Jason</a> and I worked together, I became a big fan of code reviews before merging code. It was no surprise really, we were the first to adopt <a id="aptureLink_ySC1aL45rF" href="http://en.wikipedia.org/wiki/Git%20%28software%29">Git</a> at the company and our workflow was quite ad-hoc, the need to federate knowledge within the group meant code reviews were a pretty big deal. At the time, we mostly did code reviews in person by way of “hey, what’s this you’re doing here?” or by literally sending patch emails with <a id="aptureLink_NlYWR6qaQY" href="http://www.kernel.org/pub/software/scm/git/docs/git-format-patch.html">git-format-patch(1)</a> to the team mailing list so all could participate in the discussion about what merits “good code” exhibited versus “less good code.” Now that I’ve left that company and joined another one, I’ve found myself in another small-team situation, where my teammates place high value on code review. Fortunately this time around better tools exist, namely: <a id="aptureLink_suzQh0OgeJ" href="http://code.google.com/p/gerrit/">Gerrit</a>.</p>

<p>I’m a bit hazy on the history behind Gerrit. What I do know is that its primary developer, Shawn Pearce (<a id="aptureLink_ZO1gp7ghRJ" href="http://www.linkedin.com/pub/shawn-pearce/0/a93/61">spearce</a>), is one of the Git “inner circle” who contributes heavily to Git itself as well as <a id="aptureLink_ORrreTOiql" href="http://www.jgit.org/">JGit</a>, a Git implementation in Java which sits underneath Gerrit’s internals. What makes Gerrit unique in the land of code review systems is how tightly coupled Gerrit is with Git itself, so much so that you submit changes by <strong>pushing</strong> as if the Gerrit server were “just another Git repo.”</p>

<p>I recommend building Gerrit from source for now, spearce is planning a proper release of the recent Gerrit developments shortly before Christmas, but who has that kind of patience! To build Gerrit you will need <a id="aptureLink_za0iMCBpFC" href="http://en.wikipedia.org/wiki/Apache%20Maven">Maven</a> and the Sun <a id="aptureLink_V99Bh9QLC8" href="http://en.wikipedia.org/wiki/Java%20Development%20Kit">JDK</a> 1.6.</p>

<h2 id="setting-up-the-gerrit-daemon">Setting up the Gerrit daemon</h2>

<p>First you should clone one of Gerrit’s dependencies, followed by Gerrit itself:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>banana% git clone git://android.git.kernel.org/tools/gwtexpui.git
banana% git clone git://android.git.kernel.org/tools/gerrit.git
</code></pre></div></div>

<p>Once both clones are complete, you can start by building one and then the other (which might take a while, go grab yourself a coffee, you’ve earned it):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>banana% (cd gwtexpui &amp;&amp; mvn install)
banana% cd gerrit &amp;&amp; mvn clean package
</code></pre></div></div>

<p>After Gerrit has finished building, you’ll have a <code class="language-plaintext highlighter-rouge">.war</code> file ready to run Gerrit with (<em>note:</em> depending on when you read this article, your path to gerrit.war might have changed). First we’ll initialize the directory “/srv/gerrit” as the location where the executing Gerrit daemon will store its logs, data, etc:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>banana% java -jar gerrit-war/target/gerrit-2.0.25-SNAPSHOT.war init -d /srv/gerrit
*** Gerrit Code Review v2.0.24.2-72-g4c37167
***

Initialize '/srv/gerrit' [y/n]? y

*** Git Repositories
***

Location of Git repositories   [git]:

*** SQL Database
***

Database server type           [H2/?]:

*** User Authentication
***

Authentication method          [OPENID/?]:

*** Email Delivery
***

SMTP server hostname           [localhost]:
SMTP server port               [(default)]:
SMTP encryption                [NONE/?]:
SMTP username                  :

*** SSH Daemon
***

Gerrit SSH listens on address  [*]:
Gerrit SSH listens on port     [29418]:

Gerrit Code Review is not shipped with Bouncy Castle Crypto v144
  If available, Gerrit can take advantage of features
  in the library, but will also function without it.
Download and install it now [y/n]? y
Downloading http://www.bouncycastle.org/download/bcprov-jdk16-144.jar ... OK
Checksum bcprov-jdk16-144.jar OK
Generating SSH host key ... rsa... dsa... done

*** HTTP Daemon
***

Behind reverse HTTP proxy (e.g. Apache mod_proxy) [y/n]? n
Use https:// (SSL)             [y/n]? n
Gerrit HTTP listens on address [*]:
Gerrit HTTP listens on port    [8080]: 

Initialized /srv/gerrit
</code></pre></div></div>

<p>After running through Gerrit’s brief wizard, you’ll be ready to start Gerrit itself (<em>note:</em> this command will not detach from the terminal, so you might want to start it within screen for now):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>banana% java -jar gerrit-war/target/gerrit-2.0.25-SNAPSHOT.war daemon -d /srv/gerrit
</code></pre></div></div>

<p>Now that you’ve reached this point you’ll have Gerrit running a web application on port 8080, and listening for SSH connections on port 29418, congratulations! You’re most of the way there :)</p>

<h2 id="creating-users-and-groups">Creating users and groups</h2>
<p>Welcome to Gerrit</p>
<center><a href="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_start.png" rel="lightbox"><img src="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_start.png" width="550" /></a></center>
<p>The first thing you should do after starting Gerrit up is log in, to make sure your user is the administrator. You can do so by clicking the “Register” link in the top right corner, which should present you with an OpenID login dialog:</p>
<center><a href="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_openid.png" rel="lightbox"><img width="550" src="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_openid.png" /></a></center>
<p>After logging in with your favorite OpenID provider, Gerrit will allow you to enter information about yourself (SSH key, email address, etc). It’s worth noting that the email address is <strong>very</strong> important, as Gerrit uses the email address to match your commits to your Gerrit account.</p>
<center><a href="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_account_create.png" rel="lightbox"><img width="550" src="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_account_create.png" /></a></center>
<p>When you create your SSH key for Gerrit, it’s recommended that you give it a custom entry in <code class="language-plaintext highlighter-rouge">~/.ssh/config</code> along the lines of:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Host gerrithost
    User &lt;you&gt;
    Port 29418
    Hostname &lt;gerrithost&gt;
    IdentityFile &lt;path/to/private/key&gt;
</code></pre></div></div>

<p>After you click “Continue” at the bottom of the user information page, you will be taken to your dashboard, which lists your changes waiting to be reviewed as well as changes waiting to be reviewed <em>by</em> you:</p>
<center><a href="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_mydashboard.png" rel="lightbox"><img width="550" src="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_mydashboard.png" /></a></center>

<p>Now that your account is all set up, let’s create a group for “integrators”. In Git parlance, integrators are those responsible for reviewing code and integrating it into the “official” repository (typically integrators are project maintainers or core developers). Be sure to add yourself to the “Integrators” group; we’ll use this group later to create more granular permissions on a particular project:</p>
<center><a href="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_creategroup.png" rel="lightbox"><img width="550" src="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_creategroup.png" /></a></center>

<h2 id="projects-in-gerrit">Projects in Gerrit</h2>
<p>Creating a new project in Gerrit is fairly easy but a little <em>different</em>, insofar as there isn’t a web UI for doing so, only a command line one:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>banana% ssh gerrithost gerrit create-project -n &lt;project-name&gt;
</code></pre></div></div>

<p>For the purposes of my examples moving forward, we’ll use a project created in Gerrit for one of the Python modules I maintain, <a id="aptureLink_B0WQyZCJVK" href="http://search.twitter.com/search?q=py-yajl">py-yajl</a>. After creating the “py-yajl” project with the command line, I can visit Admin &gt; Projects, select “py-yajl” and edit some of its permissions. Here we’ll give “Integrators” the ability to <strong>Verify</strong> changes as well as <strong>Push Branch</strong>.</p>
<center><a href="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_integratoraccess.png" rel="lightbox"><img width="550" src="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_integratoraccess.png" /></a></center>

<p>With the py-yajl project all set up in Gerrit, I can return to my Git repository, add a “remote” for Gerrit, and push my master branch to it:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>banana% git checkout master
banana% git remote add gerrithost ssh://gerrithost/py-yajl.git
banana% git push gerrithost master
</code></pre></div></div>

<p>This will give Gerrit a baseline for reviewing changes against and allow it to determine when a change has been merged down. Before getting down to business and starting to commit changes, it’s recommended that you install the <a href="http://gerrit.googlecode.com/svn/documentation/2.0/user-changeid.html#creation" target="_blank"><strong>Gerrit Change-Id commit-msg hook documented here</strong></a> which will help Gerrit track changes through rebasing; once that’s taken care of, have at it!</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>banana% git checkout -b topic-branch
banana% &lt;work&gt;
banana% git commit 
banana% git push gerrithost HEAD:refs/for/master
</code></pre></div></div>

<p>The last command will push my commit to Gerrit. The command is kind of weird looking, so feel free to put it behind a <a id="aptureLink_4QD4sdoRxy" href="http://git.or.cz/gitwiki/Aliases">git-alias(1)</a>. Once the push is complete, my changes will be awaiting review in Gerrit:</p>
<center><a href="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_openchanges.png" rel="lightbox"><img width="550" src="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_openchanges.png" /></a></center>

<center><a href="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_changeoverview.png" rel="lightbox"><img width="550" src="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_changeoverview.png" /></a></center>
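<p>As a minimal sketch of the alias idea, assuming the remote is named <code>gerrithost</code> as in the push command above, and using a made-up alias name <code>for-review</code> (it is not something Gerrit or Git ships), the push could be wrapped like so. The scratch repository is only there so the snippet can be tried without touching a real project:</p>

```shell
# Work in a throwaway repository so nothing real is modified.
repo="$(mktemp -d)"
cd "$repo"
git init -q

# "for-review" is a hypothetical alias name; "gerrithost" is the remote
# name assumed from the earlier "git remote add" step.
git config alias.for-review 'push gerrithost HEAD:refs/for/master'

# Show what the alias expands to.
git config --get alias.for-review
```

<p>In a real checkout only the <code>git config</code> line is needed; after that, <code>git for-review</code> stands in for the longer push command.</p>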

<p>At this point, you’d likely wait for another reviewer to come along and either comment on your code inline in the side-by-side viewer or otherwise approve the commit by clicking “Publish Comments”:</p>
<center><a href="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_publishcomments.png" rel="lightbox"><img width="550" src="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_publishcomments.png" /></a></center>

<p>After comments have been published, the view in My Dashboard has changed to indicate that the change has not only been reviewed but also verified:</p>
<center><a href="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_mydashboard_changesreviewed.png" rel="lightbox"><img width="550" src="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_mydashboard_changesreviewed.png" /></a></center>

<p>Upon seeing this, I can return to my Git repository and feel comfortable merging my code to the master branch:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>banana% git checkout master
banana% git merge topic-branch
banana% git push origin master
banana% git push gerrithost master
</code></pre></div></div>

<p>The last command is significant again: by pushing the updated master branch to Gerrit, we indicate that the change has been merged, which is also reflected in My Dashboard:</p>
<center><a href="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_mydashboard_changesmerged.png" rel="lightbox"><img width="550" src="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_mydashboard_changesmerged.png" /></a></center>

<p>Tada! You’ve just had your code reviewed and subsequently integrated into the upstream tree, so pat yourself on the back. It’s worth noting that while Gerrit is under steady development it <em>is</em> being used by the likes of the Android team, JGit/EGit team and countless others. Gerrit contains a number of nice subtle features, like double-clicking a line inside the side-by-side diff to add a comment to that line specifically, the ability to “star” changes (similar to bookmarking) and too many others to go into detail in this post.</p>

<p>While it may seem like this was a fair amount of set-up to get code reviews going, the payoff can be tremendous, Gerrit facilitates a solid Git-oriented code review process that scales very well with the number of committers and changes. I hope you enjoy it :)</p>]]></content><author><name>R. Tyler Croy</name></author><category term="software development" /><category term="git" /><summary type="html"><![CDATA[Update: Some of this information is out of date. Instead of pushing to the gerrit master branch I recommend setting up “replication” and using the “Submit” button inside of the “Review” page.]]></summary></entry><entry><title type="html">On GitHub and how I came to write the fastest Python JSON module in town</title><link href="https://brokenco.de//2009/12/04/on-github-and-how-i-came-to-write-the-fastest-python-json-module-in-town.html" rel="alternate" type="text/html" title="On GitHub and how I came to write the fastest Python JSON module in town" /><published>2009-12-04T00:00:00+00:00</published><updated>2009-12-04T00:00:00+00:00</updated><id>https://brokenco.de//2009/12/04/on-github-and-how-i-came-to-write-the-fastest-python-json-module-in-town</id><content type="html" xml:base="https://brokenco.de//2009/12/04/on-github-and-how-i-came-to-write-the-fastest-python-json-module-in-town.html"><![CDATA[<p>Perhaps the title is a bit too much ego stroking, yes, I did write the fastest Python module for decoding JSON strings and encoding Python objects to JSON. I didn’t however write the parser behind the scenes.</p>

<p>Over the summer I discovered “<a id="aptureLink_n24z7kSMi1" href="http://lloyd.github.com/yajl/">Yet Another JSON Library</a>” on <a id="aptureLink_u0eQz9GMNI" href="http://www.crunchbase.com/company/github">GitHub</a>, written by <a id="aptureLink_YqaYOvz7FP" href="http://twitter.com/lloydhilaiel">Lloyd Hilaiel</a>, jonesing for a Saturday afternoon project I started the “<a id="aptureLink_iih8O9gONv" href="http://search.twitter.com/search?q=py-yajl">py-yajl</a>” project to see if I could implement a Python C module atop Lloyd’s marvelous parsing library. After tinkering with the project for a while I got a working prototype building (learning how to define custom types in Python along the way) and let the project stagnate as my weekend ended and the workweek resumed.</p>

<p>A little over a week ago “<a id="aptureLink_S2nwrzEgQp" href="http://github.com/autodata">autodata</a>”, another GitHub user, sent me a “Pull Request” with some minor changes to make py-yajl build cleaner on amd64; my interest in the project was suddenly reignited, amazing what a little interest can do for motivation. Over the 10  days following autodata’s pull request I discovered that a former colleague of mine and fellow GitHub user “<a id="aptureLink_mY3NgqZfrq" href="http://twitter.com/teepark">teepark</a>” had forked the project as well, working on Python 3 support. Going from zero to <strong>two</strong> people interested in the project, I quickly converted the code from a stagnant, borderline embarrassing, dump of C code into a leak-free, swift JSON library
for Python. Not one to miss out on the fun, I pinged Lloyd, who quickly became just as enamored with making py-yajl the best Python JSON module available; he forked the project and almost immediately sent a number of pull requests my way with further optimizations to py-yajl such as:</p>

<ul>
  <li>Swapping out the use of Python lists for a custom pointer stack for maintaining internal state</li>
  <li>Accelerating parsing and handling of Number objects</li>
  <li>Pruning a few memory leaks here and there</li>
</ul>

<p>Thanks to <a id="aptureLink_CZHm3Z4vyV" href="http://twitter.com/mikeal">mikeal</a>’s <a id="aptureLink_2E75jRgjq1" href="http://www.mikealrogers.com/archives/695">JSON post</a> and <a href="http://gist.github.com/239887">jsonperf.py</a> script, Lloyd and I could both see how py-yajl was stacking up against <a id="aptureLink_kofLpe0ikl" href="http://pypi.python.org/pypi/python-cjson">cjson</a>, jsonlib, <a id="aptureLink_V0T79aEWbu" href="http://code.google.com/p/jsonlib2/">jsonlib2</a> and <a id="aptureLink_bZhlC8WgRE" href="http://code.google.com/p/simplejson/">simplejson</a>; things got competitive. Below are the most recent <code class="language-plaintext highlighter-rouge">jsonperf.py</code> results with py-yajl v0.1.1:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>json.loads:         6470.22037ms
simplejson.loads:   202.21063ms  
yajl.loads:         145.32621ms
cjson.decode:       102.44788ms

json.dumps:         2309.15286ms
cjson.encode:       276.49586ms   
simplejson.dumps:   201.59785ms
yajl.dumps:         161.00153ms
</code></pre></div></div>

<p>Over the coming days or weeks (as time permits) I’m planning on adding JSON stream parsing support, i.e. parsing a stream of data as it’s coming in off a socket or file object, as well as a few other miscellaneous tasks.</p>

<p>Given the nature of GitHub’s social coding dynamic, py-yajl got off the ground as a project but Yajl itself gained an IRC channel (#yajl on Freenode) and a mailing list (yajl@librelist.com). To date I have over 20 unique repositories on GitHub (i.e. authored by me) but the experience around Yajl has been the most exciting and finally proved the “social coding” concept beneficial to me.</p>]]></content><author><name>R. Tyler Croy</name></author><category term="software development" /><category term="git" /><category term="python" /><summary type="html"><![CDATA[Perhaps the title is a bit too much ego stroking, yes, I did write the fastest Python module for decoding JSON strings and encoding Python objects to JSON. I didn’t however write the parser behind the scenes. Over the summer I discovered “Yet Another JSON Library” on GitHub, written by Lloyd Hilaiel, jonesing for a Saturday afternoon project I started the “py-yajl” project to see if I could implement a Python C module atop Lloyd’s marvelous parsing library. After tinkering with the project for a while I got a working prototype building (learning how to define custom types in Python along the way) and let the project stagnate as my weekend ended and the workweek resumed. A little over a week ago “autodata”, another GitHub user, sent me a “Pull Request” with some minor changes to make py-yajl build cleaner on amd64; my interest in the project was suddenly reignited, amazing what a little interest can do for motivation. Over the 10 days following autodata’s pull request I discovered that a former colleague of mine and fellow GitHub user “teepark” had forked the project as well, working on Python 3 support. Going from zero to two people interested in the project, I quickly converted the code from a stagnant, borderline embarrassing, dump of C code into a leak-free, swift JSON library for Python. 
Not one to miss out on the fun, I pinged Lloyd who quickly became as enamored with making py-yajl the best Python JSON module available, he forked the project and almost immediately sent a number of pull requests my way with further optimizations to py-yajl such as: Swapping out the use of Python lists to a custom pointer stack for maintaining internal state Accelerating parsing and handling of Number objects Pruning a few memory leaks here and there Thanks to mikeal’s JSON post and jsonperf.py script, Lloyd and I could both see how py-yajl was stacking up against cjson, jsonlib, jsonlib2 and simplejson; things got competitive. Below are the most recent jsonperf.py results with py-yajl v0.1.1: json.loads: 6470.22037ms simplejson.loads: 202.21063ms yajl.loads: 145.32621ms cjson.decode: 102.44788ms json.dumps: 2309.15286ms cjson.encode: 276.49586ms simplejson.dumps: 201.59785ms yajl.dumps: 161.00153ms Over the coming days or weeks (as time permits) I’m planning on adding JSON stream parsing support, i.e. parsing a stream of data as it’s coming in off a socket or file object, as well as a few other miscellaneous tasks. Given the nature of GitHub’s social coding dynamic, py-yajl got off the ground as a project but Yajl itself gained an IRC channel (#yajl on Freenode) and a mailing list (yajl@librelist.com). To date I have over 20 unique repositories on GitHub (i.e. authored by me) but the experience around Yajl has been the most exciting and finally proved the “social coding” concept beneficial to me.]]></summary></entry><entry><title type="html">Do you love Git too?</title><link href="https://brokenco.de//2009/11/03/do-you-love-git-too.html" rel="alternate" type="text/html" title="Do you love Git too?" 
/><published>2009-11-03T00:00:00+00:00</published><updated>2009-11-03T00:00:00+00:00</updated><id>https://brokenco.de//2009/11/03/do-you-love-git-too</id><content type="html" xml:base="https://brokenco.de//2009/11/03/do-you-love-git-too.html"><![CDATA[<p>In addition to RSS feeds, one of my favorite sources of reading material is the <a id="aptureLink_kVnWAHwnNd" href="http://git-scm.org">Git mailing list</a>; I’m not really active, I simply enjoy reading the discussions around code and the best solutions for certain problems. If you read the list long enough, you’ll start to appreciate the time and attention the Git core developers (<a id="aptureLink_xnlR489xfT" href="http://www.linkedin.com/pub/shawn-pearce/0/a93/61">spearce</a>, <a id="aptureLink_m0cWtPFy7a" href="http://peff.net/peff/">peff</a> and <a id="aptureLink_GNv5qRpV4O" href="http://gitster.livejournal.com">junio</a> (a.k.a. gitster)) put into cultivating the code and in cultivating new contributors. Of all the open source projects I watch to one extent or another, Git is very effective at bringing in new contributors and getting their contributions vetted for inclusion.</p>

<p>If you’re a heavy Git user (like me) you can certainly see the results of their tireless efforts, Junio’s (git.git’s maintainer) in particular. I highly recommend checking out his Amazon <a href="http://www.amazon.com/gp/registry/wishlist/1513KNZE30W63">wishlist</a> to thank him for his efforts.</p>]]></content><author><name>R. Tyler Croy</name></author><category term="git" /><summary type="html"><![CDATA[In addition to RSS feeds, one of my favorite sources of reading material is the Git mailing list; I’m not really active, I simply enjoy reading the discussions around code and the best solutions for certain problems. If you read the list long enough, you’ll start to appreciate the time and attention the Git core developers (spearce, peff and junio (a.k.a. gitster)) put into cultivating the code and in cultivating new contributors. Of all the open source projects I watch to one extent or another, Git is very effective at bringing in new contributors and getting their contributions vetted for inclusion. If you’re a heavy Git user (like me) you can certainly see the results of their tireless efforts, Junio’s (git.git’s maintainer) in particular. I highly recommend checking out his Amazon wishlist to thank him for his efforts.]]></summary></entry><entry><title type="html">Jython, JGit and co. in Hudson</title><link href="https://brokenco.de//2009/07/21/jython-jgit-and-co-in-hudson.html" rel="alternate" type="text/html" title="Jython, JGit and co. in Hudson" /><published>2009-07-21T00:00:00+00:00</published><updated>2009-07-21T00:00:00+00:00</updated><id>https://brokenco.de//2009/07/21/jython-jgit-and-co-in-hudson</id><content type="html" xml:base="https://brokenco.de//2009/07/21/jython-jgit-and-co-in-hudson.html"><![CDATA[<p>At the <a href="http://wiki.hudson-ci.org/display/HUDSON/BayAreaMeetup">Hudson Bay Area Meetup/Hackathon</a> 
that <a href="http://slideinc.github.com">Slide, Inc.</a> hosted last weekend, I worked on the 
<a href="http://wiki.hudson-ci.org/display/HUDSON/Jython+Plugin">Jython plugin</a> 
and released it just days after releasing a strikingly similar plugin, the 
<a href="http://wiki.hudson-ci.org/display/HUDSON/Python+Plugin">Python plugin</a>. I felt 
that an explanation might be warranted as to why I would do such a thing.</p>

<p>For those that don’t know, <a href="http://hudson-ci.org">Hudson</a> is a Java-based continuous 
integration server, one of the best CI servers developed (in my humblest of opinions). 
What makes Hudson so great is a <strong>very</strong> solid <a href="http://wiki.hudson-ci.org/display/HUDSON/Extend+Hudson">plugin architecture</a> 
allowing developers to extend Hudson to support a wide variety of scripting languages 
as well as notifiers, source control systems, and so on (<a href="http://weblogs.java.net/blog/kohsuke/archive/2009/06/growth_of_hudso.html">related post</a> 
on the growth of Hudson’s plugin ecosystem). Additionally, Hudson supports <em>slaves</em> on
any operating system that Java supports, allowing you to have a central manager (the 
“master” Hudson server/node) and a vast network of different machines performing tasks 
and executing jobs. Now that you’re up to speed, back to the topic at hand.</p>

<p><strong>Jython</strong> versus <strong>Python</strong> plugin. Why bother with either, as <a href="http://twitter.com/gboissinot">@gboissinot</a> 
pointed out in <a href="http://twitter.com/gboissinot/status/2619505521">this tweet</a>? The 
<em>interesting</em> thing about the Jython plugin, particularly when you use a large number
of slaves is that with the installation of the Jython plugin, suddenly you have the 
ability to execute Python script on <strong>every</strong> single slave, regardless of whether or 
not they actually have Python installed. The more “third party” tooling that can be moved into 
Hudson by way of the plugin system, the fewer dependencies there are and the less difficult 
it is to set up slaves to help handle load.</p>

<p>Take the “git” versus the “git2” plugin, the git plugin was recently criticized on the 
<a href="irc://irc.freenode.net/hudson">#hudson channel</a> because of its use of the <a href="http://www.jgit.org/">JGit</a> 
library, versus “git2” which invokes <a href="http://git-scm.org">git(1)</a> on the command line. 
The latter approach is flawed for a number of reasons, particularly the reliance on the git 
command line executables and scripts to return consistent formatting is specious at best 
even if you aren’t relying on “porcelain” (git community terminology for front-end-ish 
script and code sitting on top of the “plumbing”, the breakdown is detailed <a href="http://www.kernel.org/pub/software/scm/git/docs/">here</a>). 
The command-line approach also means you now have to ensure every one of your slaves 
that are likely to be executing builds have the appropriate packages installed. 
On the flip side, however, with the JGit-based approach, the Hudson slave 
agent can transfer the appropriate bytecode to the machine in question and 
execute that without relying on system dependencies.</p>

<p>The Hudson Subversion plugin takes a similar approach, being based on <a href="http://svnkit.com/">SVNKit</a>.</p>

<p>Being a Python developer by trade, I am certainly not in the “Java Fanboy” camp, but 
the efficiencies gained by incorporating Java-based libraries in Hudson plugins and 
extensions are a no-brainer; the reduction of dependencies on the systems incorporated 
in your build farm will save you plenty of time in maintenance and version woes alone. 
In my opinion, the benefits of JGit, Jython, SVNKit, and the other Java-based libraries 
that are running some of the most highly used plugins in the Hudson ecosystem continue 
to outweigh the costs, especially as <a href="http://slideinc.github.com">we</a> find ourselves bringing more and more slaves 
online.
<!--break--></p>]]></content><author><name>R. Tyler Croy</name></author><category term="opinion" /><category term="miscellaneous" /><category term="software development" /><category term="git" /><category term="hudson" /><summary type="html"><![CDATA[At the Hudson Bay Area Meetup/Hackathon that Slide, Inc. hosted last weekend, I worked on the Jython plugin and released it just days after releasing a strikingly similar plugin, the Python plugin. I felt that an explanation might be warranted as to why I would do such a thing. For those that don’t know, Hudson is a Java-based continuous integration server, one of the best CI servers developed (in my humblest of opinions). What makes Hudson so great is a very solid plugin architecture allowing developers to extend Hudson to support a wide variety of scripting languages as well as notifiers, source control systems, and so on (related post on the growth of Hudson’s plugin ecosystem). Additionally, Hudson supports slaves on any operating system that Java supports, allowing you to have a central manager (the “master” Hudson server/node) and a vast network of different machines performing tasks and executing jobs. Now that you’re up to speed, back to the topic at hand. Jython versus Python plugin. Why bother with either, as @gboissinot pointed out in this tweet? The interesting thing about the Jython plugin, particularly when you use a large number of slaves is that with the installation of the Jython plugin, suddenly you have the ability to execute Python script on every single slave, regardless of whether or not they actually have Python installed. The more “third party” that can be moved into Hudson by way of the plugin system means reduced dependencies and difficulty setting up slaves to help handle load. Take the “git” versus the “git2” plugin, the git plugin was recently criticized on the #hudson channel because of it’s use of the JGit library, versus “git2” which invokes git(1) on the command line. 
The latter approach is flawed for a number of reasons, particularly the reliance on the git command line executables and scripts to return consistent formatting is specious at best even if you aren’t relying on “porcelain” (git community terminology for front-end-ish script and code sitting on top of the “plumbing”, the breakdown is detailed here). The command-line approach also means you now have to ensure every one of your slaves that are likely to be executing builds have the appropriate packages installed. One the flipside however, with the JGit-based approach, the Hudson slave agent can transfer the appropriate bytecode to the machine in question and execute that without relying on system-dependencies. The Hudson Subversion plugin takes a similar approach, being based on SVNKit. Being a Python developer by trade, I am certainly not in the “Java Fanboy” camp, but the efficiencies gained by incorporating Java-based libraries in Hudson plugins and extensions is a no brainer, the reduction of dependencies on the systems incorporated in your build farm will save you plenty of time in maintenance and version woes alone. 
In my opinion, the benefits of JGit, Jython, SVNKit, and the other Java-based libraries that are running some of the most highly used plugins in the Hudson ecosystem continue to outweigh the costs, especially as we find ourselves bringing more and more slaves online.]]></summary></entry><entry><title type="html">Git Protip: Split it in half, understanding the anatomy of a bug (git bisect)</title><link href="https://brokenco.de//2009/03/06/git-protip-split-it-in-half-understanding-the-anatomy-of-a-bug-git-bisect.html" rel="alternate" type="text/html" title="Git Protip: Split it in half, understanding the anatomy of a bug (git bisect)" /><published>2009-03-06T00:00:00+00:00</published><updated>2009-03-06T00:00:00+00:00</updated><id>https://brokenco.de//2009/03/06/git-protip-split-it-in-half-understanding-the-anatomy-of-a-bug-git-bisect</id><content type="html" xml:base="https://brokenco.de//2009/03/06/git-protip-split-it-in-half-understanding-the-anatomy-of-a-bug-git-bisect.html"><![CDATA[I've been sending "Protip" emails about Git to the rest of engineering here at <a href="http://slide.com">Slide</a> for a while now, using the "Protips" as a means of introducing more interesting and complex features Git offers.<br>
<hr/><br>
There are those among us who can look at a reproduction case for a bug and <strong><em>just know</em></strong> what the bug is. For the rest of us mere mortals, finding out what change or set of changes actually introduced a bug is extremely useful for figuring out why a particular bug exists. This is even more true for the more elusive bugs or the cases where code "looks" correct and you're stumped as to why the bug exists now, when it didn't yesterday/last week/last month. In most classical version control systems, your options are to sift through diffs or wade through log message after log message, trying to spot the particular change that introduced the regression you're now tasked with resolving.<br/><br>
Fortunately (of course) Git offers a handy feature to assist you in tracking down regressions as they're introduced, <strong>git bisect</strong>. Take the following scenario:<br /><blockquote>Roger has been working on some lower level changes in a project branch lately. When he left work last night, he ran his unit tests (everything passed), committed his code and went home for the day. When he came in the next morning, per his typical routine, he synchronized his project branch with the master branch to ensure his code wasn't stomping on released changes. For some reason however, after synchronizing his branch, his unit tests started to fail indicating that a bug was introduced in one of the changes that was integrated into Roger's project branch. </blockquote><br>
Before switching to Git, Roger might have spent an hour looking over changes trying to pinpoint what went wrong, but now Roger can use <strong>git bisect</strong> to figure out exactly where the issue is. Taking the commit hash from his last good commit, Roger can walk through changes and pinpoint the issue as follows:<br>
<pre><br>
## Format for use is: git bisect start &#91;&#60;bad&#62; &#91;&#60;good&#62;...]] &#91;--] &#91;&#60;paths&#62;...]<br>
xdev4&#37; git bisect start HEAD 324d2f2235c93769dd97680d80173388dc5c8253<br>
Bisecting: 10 revisions left to test after this<br>
<br>
&#91;064443d3164112554600f6da39a36ffb639787d7] Changed the name of an a/b test.<br>
xdev4&#37;<br>
</pre><br>
This will start the bisect process, which is interactive, and start you halfway between the two revisions specified above (see the image below). Following the scenario above, Roger would then run his unit tests. Upon their success, he'd execute "git bisect good", which would move the tree halfway between that "good" revision and the "bad" revision; upon failure, he'd execute "git bisect bad" to narrow the search in the other direction. Roger will continue doing this until he lands on the commit that is responsible for the regression. Knowing this, Roger can either revert that change, or make a subsequent revision that corrects the regression introduced.<br>
<center><img src="http://agentdero.cachefly.net/unethicalblogger.com/images/git_bisect.png"/></center><br>
A sample of what this sort of transcript might look like is below:<pre><br>
xdev4&#37; git bisect good                              <br>
Bisecting: -1 revisions left to test after this<br>
&#91;bcf020a6c4ac7cc5df064c66b182b2500470000a] Merge branch &#39;cjssp&#39; into master<br>
xdev4&#37; git bisect bad<br>
bcf020a6c4ac7cc5df064c66b182b2500470000a is first bad commit<br>
xdev4&#37; git show bcf020a6c4ac7cc5df064c66b182b2500470000a<br>
commit bcf020a6c4ac7cc5df064c66b182b2500470000a<br>
Merge: 62153e2... 064443d...<br>
Author: Chris &#60;chris&#64;foo&#62;<br>
<br>
Date:   Tue Jan 27 12:57:45 2009 -0800<br>
<br>
    Merge branch &#39;cjssp&#39; into master<br>
<br>
xdev4&#37; git bisect log<br>
# bad: &#91;7a5d4f3c90b022cb66fd8ea1635c5de6768882d7] Merge branch &#39;foo&#39; into master<br>
# good: &#91;d1014fd52bebd3c56db37362548e588165b7f299] Merge branch &#39;bar&#39;<br>
git bisect start &#39;HEAD&#39; &#39;d1014fd52bebd3c56db37362548e588165b7f299&#39; &#39;--&#39; &#39;apps&#39;<br>
<br>
# good: &#91;064443d3164112554600f6da39a36ffb639787d7] Changed the name of an a/b test.  PLEASE PICK ME UP WITH NEXT PUSH.  thx<br>
git bisect good 064443d3164112554600f6da39a36ffb639787d7<br>
# bad: &#91;bcf020a6c4ac7cc5df064c66b182b2500470000a] Merge branch &#39;cjssp&#39; into master<br>
git bisect bad bcf020a6c4ac7cc5df064c66b182b2500470000a<br>
xdev4&#37; git bisect reset <br>
xdev4&#37;<br>
</pre><br>
Instead of spending an hour looking at changes, Roger was able to quickly walk a few revisions and run the unit tests he has to figure out which commit was the one causing trouble, and then get back to work squashing those bugs.<br>
<br>
Roger is, like most developers, inherently lazy, and running through a series of revisions running unit tests sounds like "work" that doesn't need to be done. Fortunately for Roger, git-bisect(1) supports the subcommand "<strong>run</strong>" which goes hand in hand with unit tests or other tests. In the example above, let's pretend that Roger had a test case exhibiting the bug he was noticing. What he could actually do is let <strong>git bisect run</strong> invoke a test script that runs his unit tests at each step to find the offending revision, i.e.:<br>
<pre><br>
xdev4&#37; git bisect start HEAD 324d2f2235c93769dd97680d80173388dc5c8253<br>
Bisecting: 10 revisions left to test after this<br>
<br>
&#91;064443d3164112554600f6da39a36ffb639787d7] Changed the name of an a/b test.<br>
xdev4&#37; git bisect run ./mytest.sh<br>
</pre><br>
After executing the <strong>run</strong> command, git-bisect(1) will binary search the revisions between GOOD and BAD, checking at each step whether "mytest.sh" returns a zero (success) or non-zero (failure) return code, until it finds the commit that causes the test to fail. The end result should be the exact commit in which the regression was introduced into the tree. After finding it, Roger can either grab his rubber chicken and go slap his fellow developer around or fix the issue and get back to playing Nethack.<br/><br>
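As a rough, self-contained sketch of the <strong>run</strong> workflow (the throwaway repository, the file names and the contents of "mytest.sh" are all made up for illustration and are not Roger's actual setup), the following builds six commits, breaks things in the fourth, and lets git-bisect(1) track it down:<br>

```shell
#!/bin/sh
set -e

# Build a throwaway repository; nothing outside it is touched.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.email "roger@example.com"   # placeholder identity
git config user.name "Roger"

# Commits 1-3 are good; commit 4 introduces the "bug" and 5-6 keep it.
for i in 1 2 3 4 5 6; do
    if [ "$i" -ge 4 ]; then
        echo "broken" > status.txt
    else
        echo "ok" > status.txt
    fi
    echo "$i" > counter.txt
    git add status.txt counter.txt
    git commit -q -m "change $i"
done

# Hypothetical stand-in for mytest.sh: exits 0 (good) while status.txt
# says "ok" and non-zero (bad) otherwise. It is left untracked so it
# survives the checkouts that bisect performs.
cat > mytest.sh <<'EOF'
#!/bin/sh
grep -q '^ok$' status.txt
EOF
chmod +x mytest.sh

# Bisect between the root commit (known good) and HEAD (known bad).
good="$(git rev-list --max-parents=0 HEAD)"
git bisect start HEAD "$good" >/dev/null
git bisect run ./mytest.sh >/dev/null

# refs/bisect/bad now points at the first bad commit.
first_bad="$(git log -1 --format=%s refs/bisect/bad)"
git bisect reset >/dev/null 2>&1
echo "first bad commit: $first_bad"
```

Per the git-bisect(1) man page, the script should exit 0 for a good revision and with a code between 1 and 127 (other than 125) for a bad one; 125 tells bisect to skip a revision that can't be tested.<br>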
All in all git-bisect(1) is extraordinarily useful for pinning down bugs and diagnosing issues as they're introduced into the code base.<br/><br>
<br>
For more specific usage of `git bisect` refer to its man page here:  <a href="http://www.kernel.org/pub/software/scm/git/docs/git-bisect.html" target="_top">git-bisect(1) man page</a><br>
<br>
<hr/><br>
<em>Did you know!</em> <a href="http://www.slide.com/static/jobs">Slide is hiring</a>! Looking for talented engineers to write some good Python and/or JavaScript, feel free to contact me at tyler[at]<a href="http://slide.com">slide</a><br>]]></content><author><name>R. Tyler Croy</name></author><category term="slide" /><category term="software development" /><category term="git" /><summary type="html"><![CDATA[I've been sending "Protip" emails about Git to the rest of engineering here at Slide for a while now, using the "Protips" as a means of introducing more interesting and complex features Git offers. There are those among us who can look at a reproduction case for a bug and just know what the bug is. For the rest of us mere mortals, finding out what change or set of changes actually introduced a bug is extremely useful for figuring out why a particular bug exists. This is even more true for the more elusive bugs or the cases where code "looks" correct and you're stumped as to why the bug exists now, when it didn't yesterday/last week/last month. The options in most classical version control systems you have available to you are to sift through diffs or wade through log message after log message trying to spot the particular change that introduced the regression you're now tasked with resolving. Fortunately (of course) Git offers a handy feature to assist you in tracking down regressions as they're introduced, git bisect. Take the following scenario:Roger has been working on some lower level changes in a project branch lately. When he left work last night, he ran his unit tests (everything passed), committed his code and went home for the day. When he came in the next morning, per his typical routine, he synchronized his project branch with the master branch to ensure his code wasn't stomping on released changes. 
For some reason however, after synchronizing his branch, his unit tests started to fail indicating that a bug was introduced in one of the changes that was integrated into Roger's project branch. Before switching to Git, Roger might have spent an hour looking over changes trying to pinpoint what went wrong, but now Roger can use git bisect to figure out exactly where the issue is. Taking the commit hash from his last good commit, Roger can walk through changes and pinpoint the issue as follows: ## Format for use is: git bisect start &#91;&#60;bad&#62; &#91;&#60;good&#62;...]] &#91;--] &#91;&#60;paths&#62;...] xdev4&#37; git bisect start HEAD 324d2f2235c93769dd97680d80173388dc5c8253 Bisecting: 10 revisions left to test after this &#91;064443d3164112554600f6da39a36ffb639787d7] Changed the name of an a/b test. xdev4&#37; This will start the bisect process, which is interactive, and start you halfway between the two revisions specified above (see the image below). Following the scenario above, Roger would then run his unit tests. Upon their success, he'd execute "git bisect good" which would move the tree halfway between that "good" revision and the "bad" revision. Roger will continue doing this until he lands on the commit that is responsible for the regression. Knowing this, Roger can either revert that change, or make a subsequent revision that corrects the regression introduced. A sample of what this sort of transcript might look like is below: xdev4&#37; git bisect good Bisecting: -1 revisions left to test after this &#91;bcf020a6c4ac7cc5df064c66b182b2500470000a] Merge branch &#39;cjssp&#39; into master xdev4&#37; git bisect bad bcf020a6c4ac7cc5df064c66b182b2500470000a is first bad commit xdev4&#37; git show bcf020a6c4ac7cc5df064c66b182b2500470000a commit bcf020a6c4ac7cc5df064c66b182b2500470000a Merge: 62153e2... 064443d... 
Author: Chris &#60;chris&#64;foo&#62; Date: Tue Jan 27 12:57:45 2009 -0800 Merge branch &#39;cjssp&#39; into master xdev4&#37; git bisect log # bad: &#91;7a5d4f3c90b022cb66fd8ea1635c5de6768882d7] Merge branch &#39;foo&#39; into master # good: &#91;d1014fd52bebd3c56db37362548e588165b7f299] Merge branch &#39;bar&#39; git bisect start &#39;HEAD&#39; &#39;d1014fd52bebd3c56db37362548e588165b7f299&#39; &#39;--&#39; &#39;apps&#39; # good: &#91;064443d3164112554600f6da39a36ffb639787d7] Changed the name of an a/b test. PLEASE PICK ME UP WITH NEXT PUSH. thx git bisect good 064443d3164112554600f6da39a36ffb639787d7 # bad: &#91;bcf020a6c4ac7cc5df064c66b182b2500470000a] Merge branch &#39;cjssp&#39; into master git bisect bad bcf020a6c4ac7cc5df064c66b182b2500470000a xdev4&#37; git bisect reset xdev4&#37; Instead of spending an hour looking at changes, Roger was able to quickly walk a few revisions and run the unit tests he has to figure out which commit was the one causing trouble, and then get back to work squashing those bugs. Roger is, like most developers, inherently lazy, and running through a series of revisions running unit tests sounds like "work" that doesn't need to be done. Fortunately for Roger, git-bisect(1) supports the subcommand "run" which goes hand in hand with unit tests or other tests. In the example above, let's pretend that Roger had a test case exhibiting the bug he was noticing. What he could actually do is let git bisect run automatically run a test script to run his unit tests to find the offending revision i.e.: xdev4&#37; git bisect start HEAD 324d2f2235c93769dd97680d80173388dc5c8253 Bisecting: 10 revisions left to test after this &#91;064443d3164112554600f6da39a36ffb639787d7] Changed the name of an a/b test. 
xdev4&#37; git bisect run ./mytest.sh After executing the run command, git-bisect(1) will binary search the revisions between GOOD and BAD testing whether or not "mytest.sh" returns a zero (success) or non-zero (failure) return code until it finds the commit that causes the test to fail. The end result should be the exact commit the regression was introduced into the tree, after finding this Roger can either grab his rubber chicken and go slap his fellow developer around or fix the issue and get back to playing Nethack. All in all git-bisect(1) is extraordinarily useful for pinning down bugs and diagnosing issues as they're introduced into the code base. For more specific usage of `git bisect` refer to it's man page here: git-bisect(1) man page Did you know! Slide is hiring! Looking for talented engineers to write some good Python and/or JavaScript, feel free to contact me at tyler[at]slide]]></summary></entry><entry><title type="html">Git Protip: A picture is worth a thousand words (git tag)</title><link href="https://brokenco.de//2009/01/15/git-protip-a-picture-is-worth-a-thousand-words-git-tag.html" rel="alternate" type="text/html" title="Git Protip: A picture is worth a thousand words (git tag)" /><published>2009-01-15T00:00:00+00:00</published><updated>2009-01-15T00:00:00+00:00</updated><id>https://brokenco.de//2009/01/15/git-protip-a-picture-is-worth-a-thousand-words-git-tag</id><content type="html" xml:base="https://brokenco.de//2009/01/15/git-protip-a-picture-is-worth-a-thousand-words-git-tag.html"><![CDATA[I've been sending weekly "Protip" emails about Git to the rest of engineering here at <a href="http://slide.com">Slide</a> for a while now, using the "Protips" as a means of introducing more interesting and complex features Git offers. Below is the fourth Protip written to date.<br>
<hr/><br>
<br/><br>
While "tagging" or "labeling" code is not a new or original idea introduced by Git, our use of tags in a regular workflow began only with the migration to Git. At its most basic level, a "tag" in any version control system takes a "picture" of how the tree looks at a certain point in time so that it can be re-created later. This can be extremely helpful for both local and team development. Take the following scenario for local development using tags:<br>
<div style="margin: 10px; padding: 7px; border: 1px solid #cecece;"><p>Tim is extremely busy; most of his days working at an exciting, fast-paced start-up seem to fly by. On one particular project Tim is working on, a lot of code is changing at a very fast pace, and the branch he's currently working in is stable one minute and destabilized the next. Tim has two basic options for leaving himself "bread-crumbs" to step back in time to a stable or an unstable state. The first, more complicated option is to mark his commit messages with something like "STABLE" so he can <code>git diff</code> or <code>git reset --hard</code> from the current HEAD to the last stable point of the branch. </p><p><br>
The second option is to make use of tags. Whenever Tim reaches a stable point in his tumultuous development, he can simply run:<br>
<code>git tag wip-protips_`date "+%s"`</code> <br>
(or something similar, `date` added to ensure the tag is unique). If Tim finds himself too far down the wrong path, he can roll back his branch to the latest tag (<code>git reset --hard protiptag</code>), create a new stable branch based on that tag (<code>git checkout -b wip-protip-2 protiptag</code>), or diff his current HEAD against the tag to see what he's changed since his branch was stable (<code>git diff protiptag...HEAD</code>).</p></div><br>
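<br>
Tim's workflow can be sketched end to end in a throwaway repository. This is only an illustration: the file name, commit messages and identity below are made up, and the tag name simply follows the <code>wip-protips_</code> pattern from the scenario.<br>

```shell
#!/bin/sh
# Sketch of the tag-as-breadcrumb workflow in a scratch repository.
# app.txt, the identity and the commit messages are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Tim"
git config user.email "tim@example.com"

# A stable point in the branch: commit it and leave a uniquely named tag.
echo "stable" > app.txt
git add app.txt
git commit -q -m "Working state"
tag="wip-protips_$(date +%s)"
git tag "$tag"

# Development wanders off in the wrong direction.
echo "broken" > app.txt
git commit -q -a -m "Experiment that did not pan out"

# Review what changed since the branch was last stable, then roll back.
git diff --stat "$tag" HEAD
git reset -q --hard "$tag"

# The working tree is back at the tagged state.
cat app.txt
```

Keeping the tag name in a variable is just a convenience for the sketch; day to day, <code>git tag -l</code> lists the available breadcrumbs.<br>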
<br>
This local development scenario can become a team development scenario if, for example, Tim needs QA to start testing portions of his branch (his changes are just that important). Since the current HEAD of Tim's branch is incredibly unstable, he can push his tag to the central repository so that QA can push a stage using the tag, which points to the last stable point in the branch's history, with the command: <code>git push origin tag protiptag</code><br>
<br>
Tags are similar to most other "refs" in Git insofar as they are distributable: if I execute <code>git fetch your-repo --tags</code>, I can pull the tags you've set in "your-repo" and use them locally to aid development. This distributed nature is the primary way tags in Git differ from tags in Subversion; the rest of the concept is nearly the same.<br>
<br>
Currently at Slide, tag usage is dominated by the <code>post-receive</code> hook in the central repository, where <strong>every</strong> push to the release branch of the central repository ("origin") is tagged. This allows us to quickly "revert" bad live pushes temporarily, by simply pushing the last "good" tagged release, to ensure minimal site destabilization (while we correct live issues outside of the release branch). <br>
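<br>
To give a feel for how such a hook might work, here is a minimal sketch. It is not the hook we actually run: the branch name <code>release</code>, the <code>release-</code> tag prefix and the timestamp suffix are all assumptions made for the example. The only thing it relies on from Git itself is that <code>post-receive</code> receives one "old-revision new-revision ref-name" line on standard input for each ref that was updated.<br>

```shell
#!/bin/sh
# Demo: tag every push to a release branch from a post-receive hook.
# The branch name "release" and the "release-<timestamp>" tag scheme are
# assumptions for illustration, not the hook described in the post.
set -e

work=$(mktemp -d)
git init -q --bare "$work/origin.git"
mkdir -p "$work/origin.git/hooks"

# post-receive reads one "oldrev newrev refname" line per updated ref.
cat > "$work/origin.git/hooks/post-receive" <<'EOF'
#!/bin/sh
while read oldrev newrev refname; do
    if [ "$refname" = "refs/heads/release" ]; then
        git tag "release-$(date +%s)" "$newrev"
    fi
done
EOF
chmod +x "$work/origin.git/hooks/post-receive"

# A developer pushes a commit to the release branch of the central repository.
git init -q "$work/dev"
cd "$work/dev"
git config user.name "Tim"
git config user.email "tim@example.com"
echo "v1" > app.txt
git add app.txt
git commit -q -m "First release"
git remote add origin "$work/origin.git"
git push -q origin HEAD:refs/heads/release

# The central repository now carries a tag for that push.
git --git-dir="$work/origin.git" tag -l "release-*"
```

A hook meant for real use would also need to cope with branch deletions (where the new revision is all zeroes) and with two pushes landing in the same second, both of which this sketch ignores.<br>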
<br>
For more specific usage of `git tag` refer to the <a href="http://www.kernel.org/pub/software/scm/git/docs/git-tag.html" target="_top">git-tag(1) man page</a><br>
<br>
<hr/><br>
<em>Did you know!</em> <a href="http://www.slide.com/static/jobs">Slide is hiring</a>! Looking for talented engineers to write some good Python and/or JavaScript, feel free to contact me at tyler[at]<a href="http://slide.com">slide</a><br>
<!--break--><br>]]></content><author><name>R. Tyler Croy</name></author><category term="slide" /><category term="software development" /><category term="git" /><summary type="html"><![CDATA[I've been sending weekly "Protip" emails about Git to the rest of engineering here at Slide for a while now, using the "Protips" as a means of introducing more interesting and complex features Git offers. Below is the fourth Protip written to date. While the concept of "tagging" or "labeling" code is not a new, or original idea that was introduced with Git, our use of tags in a regular workflow does not predate the migration to Git however. At it's most basic level, a "tag" in any version control system is to take a "picture" of how the tree looks at a certain point in time such that it can be re-created later. This can be extremely helpful for both local and team development, take the following scenario for local development using tags: Tim is extremely busy, most of his days working at an exciting, fast-paced start-up seem to fly by. With one particular project Tim is working on, a lot of code is changing at a very fast pace and the branch he's currently working in is stable one minute and destabilized the next. Tim has two basic options for leaving himself "bread-crumbs" to step back in time to a stable or an unstable state. The first, complicated option, is to mark his commit messages with something like "STABLE", etc so he can git diff or git reset --hard from the current HEAD to the last stable point of the branch. The second option is to make use of tags. Whenever Tim reaches a stable point in his turmultuous development, he can simply run: git tag wip-protips_`date "+%s" (or something similar, `date` added to ensure the tag is unique). 
If Tim finds himself too far down the wrong path, he can rollback his branch to the latest tag (git reset --hard protiptag), create a new stable branch based on that tag (git checkout -b wip-protip-2 protiptag), or diff his current HEAD to the tag to see what all he's changed since his branch was stable (git diff protiptag...HEAD) This local development scenario can become a team development scenario involving tags, if for example, Tim needed QA to start testing portions of his branch (his changes are just that important). Since the current HEAD of Tim's branch is incredibly unstable, he can push his tag to the central repository so QA can push a stage using the tag to the last stable point in the branch's history with the command: git push origin tag protiptag Tags are similar to most other "refs" in Git insofar that they are distributable, if I execute git fetch your-repo --tags, I can pull the tags you've set in "your-repo" and apply them locally aid development. The distributed nature is primarily how tags differ in Git from Subversion, nearly the rest of the concept is the exact same. Currently at Slide, tag usage is dominated by the post-receive hook in the central repository, where every push into the central repository ("origin") in the branch release branch is tagged. This allows us to quickly "revert" bad live pushes temporarily, by simply pushing the last "good" tagged release, to ensure minimal site destabilization (while we correct live issues outside of the release branch). For more specific usage of `git tag` refer to the git-tag(1) man page Did you know! Slide is hiring! 
Looking for talented engineers to write some good Python and/or JavaScript, feel free to contact me at tyler[at]slide]]></summary></entry><entry><title type="html">Find me on github (rtyler)</title><link href="https://brokenco.de//2009/01/05/find-me-on-github-rtyler.html" rel="alternate" type="text/html" title="Find me on github (rtyler)" /><published>2009-01-05T00:00:00+00:00</published><updated>2009-01-05T00:00:00+00:00</updated><id>https://brokenco.de//2009/01/05/find-me-on-github-rtyler</id><content type="html" xml:base="https://brokenco.de//2009/01/05/find-me-on-github-rtyler.html"><![CDATA[Rod reminded me with <a href="http://unethicalblogger.com/posts/2009/01/im_using_git_because_it_makes_me_feel_cool#comment-715" target="_blank">his comment</a> in one of <a href="http://unethicalblogger.com/posts/2009/01/im_using_git_because_it_makes_me_feel_cool" target="_blank">my other posts</a> that I've not yet mentioned <a href="http://github.com" target="_blank">github</a>.<br>
<br>
I've got a bunch of my nonsense thrown up on <a href="https://github.com/rtyler" target="_blank">github.com/rtyler</a>, it's awesome (no really, github rocks my socks, those guys are good people).]]></content><author><name>R. Tyler Croy</name></author><category term="slide" /><category term="miscellaneous" /><category term="software development" /><category term="git" /><category term="github" /><summary type="html"><![CDATA[Rod reminded me with his comment in one of my other posts that I've not yet mentioned github. I've got a bunch of my nonsense thrown up on github.com/rtyler, it's awesome (no really, github rocks my socks, those guys are good people).]]></summary></entry></feed>