If you’ve never contributed to an open-source project before, there’s no better time to start. This article will walk you through forking an open-source project on RubyForge and making changes to it using Git. At the end, you’ll create a unified diff that can be sent back to the original author.
My current project FeelMySkills helps creative professionals promote themselves by creating an online portfolio that shows what they can do rather than who they know. I wanted the site to let a user export their profile page as a PDF, and so I found the amazing HTMLDoc gem which takes simple HTML pages and converts them to PDF documents.
While developing the site, I discovered a bug in PDF::HTMLDoc that pops up when you embed images into the PDF. It turns out that extra whitespace gets into the content, which causes HTMLDoc to choke. The fix is really simple, and after I fixed it I found that a patch for this had already been posted to RubyForge but never applied. I assume it hasn’t been applied because no tests were supplied with the patch.
Let’s get the code, write the test, and submit the patch!
GITing the code
The HTMLDoc project (http://htmldoc.rubyforge.org) has a Subversion repository that we can use as our master branch. Now, if you’ve never used Git before, don’t worry about it because we’re only going to use it here as a really easy way to create patches.
Mac users with Xcode and MacPorts installed can install it with
sudo port install git-core +svn
Windows users can install msysGit.
Linux users should install Git using their package manager or from source.
Forking the code from Subversion
The git svn command lets you pull from and push to a Subversion repository and is perfect for watching projects that don’t use GitHub yet.
Grab the trunk for HTMLDoc from RubyForge.
git svn clone http://htmldoc.rubyforge.org/svn/trunk htmldoc
Test before you start
If you want to start things off on the wrong foot, just start hacking away at the code. A much better approach is to see what tests are broken before you start working. Good Ruby projects should have a test suite that completely passes.
In the htmldoc folder, run rake, which runs the test suite for this app.
Unfortunately the test suite shows an error:
Loaded suite -e
Started
..E........
Finished in 3.500649 seconds.
1) Error:
test_get_command_pages(BasicTest):
NoMethodError: undefined method `path' for nil:NilClass
./test/basic_test.rb:91:in `test_get_command_pages'
11 tests, 361 assertions, 0 failures, 1 errors
However, on further inspection, the error is caused by a path issue. The tests pass when you run them individually from inside the test folder:
cd test
ruby basic_test.rb && ruby generation_test.rb
cd ..
See? Everything works!
Loaded suite basic_test
Started
......
Finished in 0.010653 seconds.
6 tests, 324 assertions, 0 failures, 0 errors
Loaded suite generation_test
Started
.....
Finished in 3.403335 seconds.
5 tests, 38 assertions, 0 failures, 0 errors
Creating a new branch for your changes
We’ll want to keep our work in a new branch. This makes it easy to create the patch later, because Git can give us the difference between the master branch, which we forked from RubyForge, and our new branch, which will contain our fixes.
$ git checkout -b fix_images
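If you'd like to try the branching step somewhere safe first, here's a minimal sketch that uses a throwaway repository instead of the real htmldoc checkout. The file example.rb, its contents, and the Example identity are placeholders I made up for illustration.

```shell
# Placeholder identity so the commit works on a machine without Git configured.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

# Throwaway repository standing in for the htmldoc checkout.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is named master

# One commit on master, standing in for the code pulled from Subversion.
printf 'library code\n' > example.rb
git add example.rb
git commit -q -m "Import trunk"

# Create the topic branch and switch to it in one step.
git checkout -q -b fix_images

# Confirm which branch is checked out and which branches exist.
current_branch=$(git rev-parse --abbrev-ref HEAD)
echo "Now on branch: $current_branch"
git branch
```

Everything you commit from here on lands on fix_images, and master stays exactly as it was when you pulled it.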
Writing a new test
Before you start hacking away on a new feature, you should write a test to prove that things are really broken. In this case, a PDF that contains a reference to an image causes things to break, so a test that puts an image reference into the PDF data should fail. Add this test to test/generation_test.rb:
def test_generation_results_with_image
  pdf = PDF::HTMLDoc.new
  pdf.set_option :webpage, true
  pdf.set_option :toc, false
  pdf << "
Random titlesomething you have to follow either.
Run this test
cd test
ruby generation_test.rb -n test_generation_results_with_image
cd ..
and you'll see that things don't work as expected:
Loaded suite generation_test
Started
F
Finished in 0.349939 seconds.
1) Failure:
test_generation_results_with_image(GenerationTest) [generation_test.rb:43]:
expected to be kind_of? but was .
So we can reproduce the bug, and now we just have to fix the problem.
A simple fix (this time)
The error occurs because HTMLDoc hates the extra whitespace created by the inclusion of the image. Open up lib/htmldoc.rb and change line 186 from
case line
Save the file. Now run the entire test suite again to make sure that nothing else broke.
Loaded suite basic_test
Started
......
Finished in 0.007855 seconds.
6 tests, 324 assertions, 0 failures, 0 errors
Loaded suite generation_test
Started
......
Finished in 3.434714 seconds.
6 tests, 43 assertions, 0 failures, 0 errors
Hurray! We fixed it! Now we just have to make a patch!
Commit your changes!
You should commit your changes to your branch so you don't lose them.
git commit -a -m "Fixed problem when embedding images in PDFs"
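After committing, it can be worth checking exactly what your branch adds on top of master. Here's a rough sketch of that check in a throwaway repository. The file example.rb and its "before fix" / "after fix" contents are stand-ins of mine, not the real htmldoc change.

```shell
# Placeholder identity so the commits work on a machine without Git configured.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

# Throwaway repository standing in for the htmldoc checkout.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is named master
printf 'before fix\n' > example.rb
git add example.rb
git commit -q -m "Import trunk"

# Make a change on the topic branch and commit it, as in the article.
git checkout -q -b fix_images
printf 'after fix\n' > example.rb
git commit -q -a -m "Fixed problem when embedding images in PDFs"

# Commits that exist on fix_images but not on master.
branch_commits=$(git log --oneline master..fix_images | wc -l | tr -d ' ')
echo "Commits ahead of master: $branch_commits"

# Files that differ between the two branches.
changed_files=$(git diff --name-only master fix_images)
echo "Changed files: $changed_files"
```

If the list of changed files contains anything you didn't mean to touch, now is the time to clean it up, before it ends up in the patch.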
Making a patch
If you're quick, you can probably just create the patch right now, but let's assume the project is a fast-moving one, like Rails. You'll want to pull down the latest version of the project and fix any conflicts.
git checkout master
git svn rebase
git checkout fix_images
git rebase master
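To see what the rebase does to your branch, here's a small sketch you can run without touching a real Subversion server. It assumes a throwaway repository, and a plain commit on master stands in for whatever git svn rebase would pull down. All file names and contents are made up for illustration.

```shell
# Placeholder identity so commits and the rebase work without Git being configured.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

# Throwaway repository standing in for the htmldoc checkout.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is named master
printf 'library code\n' > example.rb
git add example.rb
git commit -q -m "Import trunk"

# Our work on the topic branch.
git checkout -q -b fix_images
printf 'a new test\n' > example_test.rb
git add example_test.rb
git commit -q -m "Fixed problem when embedding images in PDFs"

# Stand-in for "git svn rebase": a new upstream commit appears on master.
git checkout -q master
printf 'upstream notes\n' > CHANGES
git add CHANGES
git commit -q -m "Upstream change"

# Replay our commit on top of the updated master.
git checkout -q fix_images
git rebase -q master

# After the rebase, the branch starts from the tip of master.
merge_base=$(git merge-base master fix_images)
master_tip=$(git rev-parse master)
ahead=$(git log --oneline master..fix_images | wc -l | tr -d ' ')
echo "Branch is $ahead commit(s) ahead of master"
```

Because the two commits touch different files in this sketch, the rebase finishes without stopping. When the same lines have changed upstream, Git pauses and asks you to resolve the conflict before continuing.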
There won't be any conflicts you have to fix now, so you can just make the patch file.
git format-patch master --stdout > htmldoc_fix_images.diff
This creates the file htmldoc_fix_images.diff, which holds your commit message along with a unified diff containing both the fix and the test. You can now send the patch file to the maintainer, who will happily apply it because it has tests!
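Before sending a patch off, you may want to confirm that it applies cleanly to an untouched master. The sketch below does that in a throwaway repository. The file example.rb, its contents, and the temporary patch file are placeholders of mine, so treat it as an illustration rather than the exact htmldoc patch.

```shell
# Placeholder identity so the commits work on a machine without Git configured.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

# Throwaway repository standing in for the htmldoc checkout.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is named master
printf 'before fix\n' > example.rb
git add example.rb
git commit -q -m "Import trunk"

# The fix lives on the topic branch.
git checkout -q -b fix_images
printf 'after fix\n' > example.rb
git commit -q -a -m "Fixed problem when embedding images in PDFs"

# Write every commit that is on fix_images but not on master into one file.
patch_file=$(mktemp)
git format-patch master --stdout > "$patch_file"

# Switch back to the untouched master and do a dry run of applying the patch.
git checkout -q master
apply_status="does not apply"
git apply --check "$patch_file" && apply_status="applies cleanly"
echo "Patch $apply_status"
```

The --check flag only tests whether the patch would apply and leaves the working tree alone, so it's safe to run as often as you like.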
Working with open-source projects gets easier every day, and you can really take ownership of the tools you use if you get involved. Hopefully this article gets you started on the path to contributing to projects. Good luck, and leave comments if something needs more clarification!