Using Git to fork and contribute to a RubyForge project

Posted by Brian in Howto, News, Rails, tips (January 13th, 2009)

If you’ve never contributed to an open-source project before, there’s no better time to start. This article walks you through forking an open-source project on RubyForge and making changes to it using Git. At the end, you’ll create a unified diff that can be sent back to the original author.

My current project, FeelMySkills, helps creative professionals promote themselves by creating an online portfolio that shows what they can do rather than who they know. I wanted the site to let users export their profile page as a PDF, so I turned to the amazing HTMLDoc gem, which takes simple HTML pages and converts them to PDF documents.

While developing the site, I discovered a bug in PDF::HTMLDoc that pops up when you embed images into the PDF. It turns out that extra whitespace gets into the content, which causes HTMLDoc to choke. The fix is really simple, and after I fixed it I found that a patch for this had already been posted to RubyForge but hadn’t been applied. I assume it wasn’t applied because no tests were supplied with the patch.

Let’s get the code, write the test, and submit the patch!

GITing the code

The HTMLDoc project (http://htmldoc.rubyforge.org) has a Subversion repository that we can use as our master branch. Now, if you’ve never used Git before, don’t worry about it because we’re only going to use it here as a really easy way to create patches.

Installing Git

Mac users with Xcode and MacPorts installed can do it with

sudo port install git-core +svn

Windows users can install msysGit.

Linux users should install Git using their package manager or from source.

Forking the code from Subversion

The git svn command lets you pull from and push to a Subversion repository, and is perfect for following projects that don’t use GitHub yet.

Visit the project page at http://rubyforge.org/projects/htmldoc/ and click the link on the bottom for SCM Repository. That page lists the repo as http://htmldoc.rubyforge.org/svn.

Grab the trunk for HTMLDoc from RubyForge.

git svn clone http://htmldoc.rubyforge.org/svn/trunk htmldoc

Test before you start

If you want to start things off on the wrong foot, just start hacking away at the code. A much better approach is to see what tests are broken before you start working. Good Ruby projects should have a test suite that completely passes.

In the htmldoc folder, run rake, which runs the project’s test suite.
Unfortunately, the test suite shows an error:

Loaded suite -e
Started
..E........
Finished in 3.500649 seconds.

  1) Error:
test_get_command_pages(BasicTest):
NoMethodError: undefined method `path' for nil:NilClass
    ./test/basic_test.rb:91:in `test_get_command_pages'

11 tests, 361 assertions, 0 failures, 1 errors

However, on further inspection, the error is because of a path issue. The tests pass when you run them individually:

cd test
ruby basic_test.rb && ruby generation_test.rb
cd ..

See? Everything works!

Loaded suite basic_test
Started
......
Finished in 0.010653 seconds.

6 tests, 324 assertions, 0 failures, 0 errors
Loaded suite generation_test
Started
.....
Finished in 3.403335 seconds.

5 tests, 38 assertions, 0 failures, 0 errors
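
The exact cause of the path issue isn't shown here, but one common way for a suite to pass from inside the test folder and fail from the project root is a file looked up relative to the current working directory. The following is a minimal sketch of that effect, not a diagnosis of HTMLDoc's tests. The file name sample.html and both helper methods are hypothetical.

```ruby
require "tmpdir"
require "fileutils"

# Looks the file up relative to the current working directory.
def find_from_cwd(name)
  File.exist?(name)
end

# Looks the file up relative to a fixed base directory, whatever the cwd is.
def find_from_base(base_dir, name)
  File.exist?(File.join(base_dir, name))
end

results = {}
Dir.mktmpdir do |project|
  # Hypothetical layout: <project>/test/sample.html
  test_dir = File.join(project, "test")
  FileUtils.mkdir_p(test_dir)
  FileUtils.touch(File.join(test_dir, "sample.html"))

  # Run from the project root (as rake does) and from the test folder.
  Dir.chdir(project)  { results[:root_cwd]  = find_from_cwd("sample.html") }
  Dir.chdir(test_dir) { results[:test_cwd]  = find_from_cwd("sample.html") }
  Dir.chdir(project)  { results[:root_base] = find_from_base(test_dir, "sample.html") }
end

p results
```

The lookup relative to the working directory only succeeds from inside the test folder, while the lookup anchored to a fixed directory succeeds from either place.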

Creating a new branch for your changes

We’ll want to keep our work in a new branch. This makes it easy to create the patch later, because Git can give us the difference between the master branch, which we forked from RubyForge, and our new branch, which will contain our fixes.

git checkout -b fix_images

Writing a new test

Before you start hacking away on a fix, you should write a test to prove that things are really broken. In this case, a PDF that contains a reference to an image causes things to break, so a test that puts an image reference into the PDF data should fail. Add this test to test/generation_test.rb:

  def test_generation_results_with_image
    pdf = PDF::HTMLDoc.new
    pdf.set_option :webpage, true
    pdf.set_option :toc, false
    pdf << "

Random title

something you have to follow either.

Run this test

cd test
ruby generation_test.rb -n test_generation_results_with_image
cd ..

and you'll see that things don't work as expected:

Loaded suite generation_test
Started
F
Finished in 0.349939 seconds.

  1) Failure:
test_generation_results_with_image(GenerationTest) [generation_test.rb:43]:

expected to be kind_of?
 but was
.

So we can reproduce the bug, and now we just have to fix the problem.

A simple fix (this time)

The error occurs because HTMLDoc hates the extra whitespace created by the inclusion of the image. Open up lib/htmldoc.rb and change line 186 from

          case line

to

          case line.strip
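
To see why a single strip can matter, here is a minimal sketch of how a Ruby case statement treats stray whitespace. The two classify methods and their patterns are hypothetical, made up for illustration. They are not the actual code in lib/htmldoc.rb.

```ruby
# Hypothetical classifier for illustration only; not the code in lib/htmldoc.rb.
def classify(line)
  case line
  when ""        then :blank
  when /\AERROR/ then :error
  else                :content
  end
end

# Same classifier, but surrounding whitespace is stripped before matching.
def classify_stripped(line)
  case line.strip
  when ""        then :blank
  when /\AERROR/ then :error
  else                :content
  end
end

padded = "  ERROR: something went wrong\n"
blank  = "   \n"

p classify(padded)           # :content, the leading spaces defeat the \A anchor
p classify_stripped(padded)  # :error
p classify(blank)            # :content, whitespace is not an empty string
p classify_stripped(blank)   # :blank
```

Without the strip, padded lines fall through to the else branch, which is the kind of misclassification a one-word fix like this one can prevent.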

Save the file. Now run the entire test suite again to make sure that nothing else broke.

Loaded suite basic_test
Started
......
Finished in 0.007855 seconds.

6 tests, 324 assertions, 0 failures, 0 errors
Loaded suite generation_test
Started
......
Finished in 3.434714 seconds.

6 tests, 43 assertions, 0 failures, 0 errors

Hurray! We fixed it! Now we just have to make a patch!

Commit your changes!

You should commit your changes to your branch so you don't lose them.

  git commit -a -m "Fixed problem when embedding images in PDFs"

Making a patch

If you're quick, you can probably just create the patch right now, but let's assume the project is a fast-moving one, like Rails. You'll want to pull down the latest version of the project and fix any conflicts.

git checkout master
git svn rebase
git checkout fix_images
git rebase master

There won't be any conflicts to fix this time, so you can just make the patch file.

git format-patch master --stdout > htmldoc_fix_images.diff

This creates the file htmldoc_fix_images.diff, which holds your commit as a unified diff containing both the fix and the test. You can now send the patch file to the maintainer, who will happily apply it because it has tests!
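
Before sending the patch, it can be worth confirming that it really touches both the library and the tests. The following is a minimal sketch that lists the files named in a patch's "diff --git" header lines. The files_in_patch method is a hypothetical helper, and the sample patch text is abbreviated and made up for illustration.

```ruby
# Returns the paths touched by a patch, based on its "diff --git" header lines.
def files_in_patch(patch_text)
  patch_text.scan(%r{^diff --git a/(\S+) b/\S+$}).flatten.uniq
end

# Abbreviated, made-up patch text for illustration; real output has more lines.
sample = <<~PATCH
  From: Example Author <author@example.com>
  Subject: [PATCH] Fixed problem when embedding images in PDFs

  diff --git a/lib/htmldoc.rb b/lib/htmldoc.rb
  --- a/lib/htmldoc.rb
  +++ b/lib/htmldoc.rb
  diff --git a/test/generation_test.rb b/test/generation_test.rb
  --- a/test/generation_test.rb
  +++ b/test/generation_test.rb
PATCH

files = files_in_patch(sample)
puts files
puts "includes a test" if files.any? { |f| f.start_with?("test/") }
```

To check the real patch, pass in File.read("htmldoc_fix_images.diff") instead of the sample text.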

Wrapping up

Working with open-source projects gets easier every day, and you can really take ownership of the tools you use if you get involved. Hopefully this article gets you started on the path to contributing to projects. Good luck, and leave comments if something needs more clarification!