The other day, I added experimental support for checking in directly from git-tfs to TFS. (Nate added an interactive version of checkin, too.) It doesn’t feel quite complete yet, and I haven’t decided which way to take it.

The main thing that’s missing is a way to tie the TFS checkin to the git branch. There are a few options that I’ve come up with for how to do this: dcommit, merge in TFS, or merge back to the git branch.

Dcommit would be similar to dcommit in git-svn, where the git commits are checked in to TFS one at a time, effectively rebasing the git branch onto the end of the TFS branch.

Merging in TFS doesn’t mean letting TFS do merges; rather, it means that git tfs checkin would fetch up to the new TFS changeset and give it two parents.

T1 --- T2 --- T3 --- X
   \                /
     G1 --- G2 -- G3

T1 is the base TFS changeset. G* are commits in git. T2 and T3 are commits made in TFS before the git branch is checked in to TFS. X is the TFS changeset created by git-tfs, with parents T3 and G3.

Merging back to the git branch is similar to merging in TFS, but with the merge commit in a different place:

T1 --- T2 --- T3 --- T4
   \                    \
     G1 --- G2 --- G3 -- X

Here, T4 is the new changeset created by git-tfs, and a merge commit is created with G3’s tree and parents G3 and T4.
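Described in git plumbing terms, creating X is a one-liner. This is just a sketch of the command shape (git-tfs would go through its own object-writing code, and a real commit-tree call also needs a commit message on stdin):

```ruby
# Hypothetical sketch: X is a commit whose tree is G3's tree and whose
# parents are G3 and T4. In plumbing terms, that's one commit-tree call
# (the real invocation would also take a commit message on stdin).
def merge_back_command(g3_tree, g3_sha, t4_sha)
  "git commit-tree #{g3_tree} -p #{g3_sha} -p #{t4_sha}"
end
```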

The thing I like about dcommit is that it captures everything, if you want. It seems potentially problematic, though: it would be pretty slow and more error-prone. Like rebase, it removes commits from history, which I’m a little wary of. Merging on the TFS branch is conceptually very nice, but it breaks the ability to refetch the exact same TFS history (given the same clone configuration). It also might not be as convenient for managing a git master that parallels the TFS mainline, because the merge won’t be available on the git branch. I’m not very well-versed in the mechanics of git’s merging awesomeness, so this might be a moot point, but it seems like merging back to the git branch would provide the best workflow.

If you have any thoughts, please leave a comment. I’m open to suggestions.


Super quick-start, based on another getting started with puppet guide:

Install Ubuntu 10.04 LTS server on two servers.
[server1] sudo apt-get install puppetmaster
[server2] sudo apt-get install puppet
[server2] sudo vi /etc/puppet/puppet.conf
add this line:
server=<fqdn of [server1]>
[server2] sudo puppetd --test
[server1] sudo puppetca --list
[server1] sudo puppetca --sign <fqdn of [server2]>
[server2] sudo puppetd --no-daemonize --verbose
[server2] sudo /etc/init.d/puppet start

If server1 is named “puppet”, the config change shouldn’t be necessary.

I just pushed out noodle, a new gem that we’re using to manage our .NET dependencies with ruby’s bundler.

Because .NET projects usually have to reference dependencies at a specific path, simple rubygems don’t quite cut it. With noodle, you use bundler to do the dependency analysis, and noodle copies the resolved dependencies into a local directory in the project. This way, .NET projects can reference the assemblies at a predictable path without having to check them all in.
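Roughly, the copy step looks like this. This is a sketch with hypothetical names; the real gem gets its specs from Bundler’s resolver rather than a plain list:

```ruby
require 'fileutils'

# Hypothetical sketch of noodle's copy step. Stand-in structs take the
# place of the spec objects Bundler's resolver would produce: each gem's
# assemblies get copied into lib/<name>-<version> in the project.
Spec = Struct.new(:name, :version, :path)

def copy_assemblies(specs, dest_root = 'lib')
  specs.each do |spec|
    dest = File.join(dest_root, "#{spec.name}-#{spec.version}")
    FileUtils.mkdir_p(dest)
    # Copy every assembly shipped inside the resolved gem's directory.
    Dir.glob(File.join(spec.path, '**', '*.dll')).each do |dll|
      FileUtils.cp(dll, dest)
    end
  end
end
```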

Install with

gem install noodle

For example, say you have a project that uses StructureMap. Your Gemfile might look like this:

source :rubygems
gem 'structuremap'

If you create a Rakefile like this:

require 'noodle'

and then run

bundle install
rake noodle

Then you’ll have a copy of StructureMap.dll in lib/structuremap-<version>.

(Noodle 0.1.0 had an error, which added an extra ‘lib’ in the destination path.)

This post is about how to run your favorite rack application on IIS 7 using IronRuby. I’ve been unsatisfied with most other windows ruby app hosting I’ve tried, and IronRuby-Rack looks like it will fix that. (I haven’t tried deploying to JRuby on Windows, but I assume that experience would be pretty good.)

Surely I’m not the first to the punch on this, but there were some things I had to figure out that I thought I’d share.

I’m doing this in the context of a sinatra application I’m writing. More on the specific app later, but it wasn’t worth writing if it wasn’t going to run on IIS, or at least on Windows.

Also, I tried the ironruby-rack gem, but it’s pretty rough at this point. The best thing about it is that it included IronRuby.Rack.dll. My major complaint is that it put web.config in the root of the app, which meant that all the .rb files were in the web root. It seemed much classier to make public/ the web root, with web.config in there.

It wasn’t too hard to get the app running.

A rackup file seemed like a sensible first step, and it was. You can’t get very far these days without a rackup file.

I snagged IronRuby.Rack.dll from the ironruby-rack gem, and checked it in to public/bin. I did this because I’m lazy and didn’t want to build it myself. It’d be really nice if IronRuby.Rack were a stand-alone github project so I could fork it and patch it. Cloning all of ironruby just for a version of IronRuby.Rack that probably isn’t current wasn’t very interesting to me.

My rake tasks build the rest of the aspnet application. The tasks are aspnet:copybin, aspnet:logdir, aspnet:webconfig, and aspnet. The last just invokes the others.

aspnet:copybin finds IronRuby.Rack’s dependencies in the current ironruby environment and copies them into public/bin.

aspnet:logdir creates a directory for IronRuby.Rack to put its logs into. IronRuby.Rack is fussy about this directory existing, and about its ability to write to said directory.

aspnet:webconfig is more interesting. The web.config file it generates sets up the ASP.NET handler for ironruby.rack and tells it where everything is. I do bindingRedirects so that IronRuby.Rack can find the IronRuby version that I grabbed in aspnet:copybin. I started with the templates in the ironruby-rack gem and trimmed it down to what my app needed.
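At its core, a task like that is just template rendering. Here’s a minimal sketch with ERB; the XML below is illustrative, not the actual elements IronRuby.Rack expects:

```ruby
require 'erb'

# Illustrative sketch of generating public/web.config from a template.
# The element and key names here are placeholders, not the real
# IronRuby.Rack configuration schema.
TEMPLATE = <<-XML
<configuration>
  <appSettings>
    <add key="AppRoot" value="<%= app_root %>" />
  </appSettings>
</configuration>
XML

def render_web_config(app_root)
  # ERB picks up app_root from the method's binding.
  ERB.new(TEMPLATE).result(binding)
end
```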

Here’s what I learned while crafting the web.config file:

IronRuby.Rack includes two hooks for ASP.NET: a module and a handler. The module seemed like the way to go, so I tried it first. I was a bit disappointed that it grabbed each request at the beginning of the application pipeline, and called EndRequest. It would have been fine if I didn’t care about anything that IIS was doing for me, but I did. I needed other modules to run (particularly the WindowsAuthentication module), and having IronRuby short-circuit the process broke that. I switched to the handler, and was much happier.

Also, IronRuby.Rack doesn’t mess with Environment.CurrentDirectory at all, so if your app needs to know about the directory it lives in, you need to tell it about that. Rails is pretty tolerant about this, with its Rails.root stuff, but bundler isn’t. Bundler was looking in c:\windows for my Gemfile. My first impulse was to set environment variables in web.config, but IronRuby.Rack doesn’t have hooks for that. So my app.rb has another bit of bundler bootstrapping that most apps can leave out: ENV['BUNDLE_GEMFILE'] ||= File.expand_path(__FILE__ + '/../Gemfile')
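The expand_path idiom works because the trailing '..' climbs out of the file name itself, leaving a sibling of the script:

```ruby
# File.expand_path treats the script's own file name as a path segment,
# so appending '/../Gemfile' resolves to a file next to the script,
# regardless of the process's current directory.
def gemfile_for(script_path)
  File.expand_path(script_path + '/../Gemfile')
end
```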

As a nice side-effect of using ASP.NET, to restart the application I just need to “rake aspnet:webconfig”. ASP.NET reloads the application whenever web.config changes.

Github is where to go to see the complete Rakefile.

I must admit that, in my career so far, character encodings have been a pretty insignificant concern. Most of the software I write is focused on small, domestic audiences. So character sets mean 1-byte vs. 2-byte or a couple of garbage-ish characters at the top of some files. But, after reading Joel’s guidance on character sets, I’ve been more alert to them. I have a better understanding of how character sets work, and I’m paranoid about them causing trouble for me, though how exactly they work is still a bit of smoke and mirrors.

All that said, I helped Dave solve a character set problem yesterday.

In the footer of a site we work on is this:

Terms and Conditions © 2009 SEP

… except on some pages, where it looks like this:

Terms and Conditions Â© 2009 SEP

Dave knew that normal aspx pages showed the symbol correctly, while CGI pages showed Â before ©. The CGI pages are handled by an ASP.NET handler that I wrote, which is why he came to ask me.

My spidey-sense whispered “character encoding,” so I started trying to figure out what the charsets were. I popped open Chrome’s developer tools and checked the headers on a plain ASP.NET page and an ASP.NET/CGI page.

ASP.NET: Content-type: text/html; charset=utf-8

CGI: Content-type: text/html; charset=ISO-8859-1

Ahha! It is a charset thing!

“But it says &copy; in the source files,” Dave said. So why does charset matter? Does the browser really interpret ‘&copy;’ differently based on which charset it’s using? Or is ASP.NET being “helpful” again?

I poked through the skin files, finding this:

   Text="Terms and Conditions &copy; 2009 SEP"
   runat="server" />

It looks OK, but because ASP.NET is the consumer of the skin file, ASP.NET is interpreting the &copy; entity and storing it in a string as the character with code point A9. When it writes out the page, it doesn’t bother figuring out whether to make it an entity again (I wouldn’t either), so it outputs the UTF-8 encoding for A9, which is the byte pair C2 A9. To complete our comedy, and in an effort to avoid garbling the CGI output (which is, in fact, more important to get right than the copyright symbol in the footer of the page), the CGI handler is changing the Content-type header to match what the CGI program says it is (ISO-8859-1). In ISO-8859-1, the bytes C2 A9 read as Â©.
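The whole mixup is easy to reproduce. Ruby shown here, but any language with explicit string encodings will do:

```ruby
# '©' is code point U+00A9; UTF-8 encodes it as the two bytes C2 A9.
copyright = "\u00A9"
copyright.bytes.map { |b| b.to_s(16) }  # => ["c2", "a9"]

# A consumer told those bytes are ISO-8859-1 decodes them as two
# separate one-byte characters: Â (C2) followed by © (A9).
misread = copyright.dup.force_encoding('ISO-8859-1').encode('UTF-8')
# misread == "Â©"
```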

The quick fix was to change the &copy; to &amp;copy; in the skin file so that ASP.NET actually renders &copy;. The complete fix will be either to align the encoding used by ASP.NET and CGI, or to modify the CGI handler to translate the CGI output from ISO-8859-1 (or whatever encoding it’s using) to UTF-8.

I built version 0.9.1 of git-tfs today. The notable changes are

  1. It should work seamlessly with the TFS client libs that come with VS2008 and VS2010.
  2. It has a new “quick-clone” command.

The quick-clone command is used exactly like clone. The difference is that, while clone will chug for hours trying to get an exact replica of all of the changesets in the TFS repository, quick-clone will just grab a snapshot from TFS.

Look for it on the downloads page at github.

I’m working on a prototyping project that has been going on for a few years. This year, its source got migrated to git. For the last month or two, all of the interesting action has been in one subdirectory of the repository. We wanted to split the work off into another repository that didn’t have all the old cruft. It wasn’t too hard.

To do it, I took advantage of git’s internal structures. Conceptually, I did the opposite of a subtree merge… so, it was a subtree extract. Our subdirectory has always been in the same place, so the combination of git log HEAD -- [subtree] and git ls-tree [commit] [subtree] got me a list of commits and the tree IDs for the subtree I was extracting. From there, I used commit-tree to build up the new history for the tree.

That description makes it sound like I should have had about a 5-line shell script. But there are obviously some details left out. If you want everything, check out extract_subtree.rb.
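Minus those details, the skeleton might look something like this. The shape is hypothetical, with command execution injected so the plumbing calls are visible; the real extract_subtree.rb handles commit messages, authors, and more:

```ruby
# Sketch of the extraction loop. `run` is a callable that executes a git
# command and returns its output; injecting it keeps the sketch testable.
# Note: a real commit-tree call also needs a commit message on stdin.
def extract_subtree(subtree, run)
  # Walk the subtree's commits oldest-first.
  commits = run.call("git log --reverse --format=%H HEAD -- #{subtree}").split
  parent = nil
  commits.each do |commit|
    # ls-tree output: "<mode> tree <sha>\t<path>"; the sha is field 3.
    tree = run.call("git ls-tree #{commit} #{subtree}").split[2]
    parent_flag = parent ? " -p #{parent}" : ""
    parent = run.call("git commit-tree #{tree}#{parent_flag}").strip
  end
  parent # head of the newly built, subtree-only history
end
```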

If you decide to use this script, please be careful with it. It shouldn’t destroy anything, but it might mess up your repo if something isn’t set up right. Also, this won’t deal with the .gitmodules file, so if you use submodules, you’ll need to manually build your .gitmodules file again.

If you want to know more about git’s internals, check out Scott Chacon’s ProGit book.

Visual Studio 2010 was recently released, and it brought a new version of the TFS client libraries. One of my goals for git-tfs was to have it detect and use the newest available TFS client libraries at runtime. I don’t want to have to build a separate version of git-tfs for each TFS version.

I looked at duck typing, .NET 4’s dynamic features, and the solution I ended up with.

To prepare, I pulled all of the TFS calls into wrapper classes. This isolated the problem so that I could then try to eliminate it. This was the common starting point for all of the solutions I tried.

Duck typing

My first try was to use deft flux’s Duck Typing library. My plan was to use a small amount of reflection to load the right TFS assemblies and instantiate the root objects, and then hand those objects off to dynamic wrappers created at runtime by the duck-typing library. The problem I ran into was that the library had a hard time figuring out how to unpack duck-typed objects that were passed as parameters to other duck-typed interface methods. For example, let’s say that the original type was

public class Workspace
{
    public void Shelve(Shelveset s, PendingChange[] p, ShelvingOptions o) { /*...*/ }
}

and my duck-type was

public interface IWorkspace
{
    void Shelve(IShelveset s, IPendingChange[] p, TfsShelvingOptions o);
}

The duck-typing library wasn’t able to figure out how to un-duck-type s and p. If I had more uses for the duck-typing library, I would definitely be interested in making that work.


The dynamic keyword

Instead, I decided to try the new, shiny “dynamic” keyword. This worked out OK, except for enums. I was also annoyed at the complete lack of intellisense in dynamic expressions, and I was surprised by the IL that was generated for them.

In general, “dynamic” in C# is quite a bit less shiny to me now. I had expected the CLR to explicitly handle dynamic invocations at runtime, but it turned out that the dynamic keyword is mostly a compiler trick. It makes more sense to me now why the DLR could be built to run on .NET 2, but it’s a little less magical, too. Also, the dynamic keyword feels clunky. It’s nice to start using it when types get in the way, but you can tell it’s shoe-horned in.

Polymorphic plugins FTW!

The approach I ended up using was polymorphism and plugins. Because the TFS libs all have the same names, I can’t reference them statically from a single assembly. So I have one assembly per TFS version. I moved the isolated wrapper classes into the plugin assembly for VS2008, then cloned and tweaked that assembly to make a VS2010 assembly. And I added a little bit of plumbing to find and load the correct plugin. I haven’t extensively tested this, but it does what I expect when I have all of the client libs or none of the client libs installed.
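In Ruby terms, the lookup plumbing amounts to “try the newest first, keep the first one that loads.” This is a model of the C# code with made-up names, where the loaders stand in for Assembly.Load calls:

```ruby
# Rough model of the plugin lookup: try candidate TFS wrapper assemblies
# newest-first and keep the first that loads. Each loader is a stand-in
# for an Assembly.Load call in the real C# plumbing.
def find_tfs_plugin(loaders)
  loaders.each do |name, loader|
    begin
      return [name, loader.call]
    rescue LoadError
      next # that TFS version isn't installed; try the next one
    end
  end
  raise "No supported TFS client libraries found"
end
```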

Other ideas

In the course of working on this, I came up with a couple other ideas that I didn’t try.

One idea was to write my own dynamic invoker using reflection. This would have been something like a home-made duck-typing library or dynamic evaluation. It seemed like a lot of work and not very elegant. And it would have involved a lot of strings.

The other idea is to reimplement git-tfs in ruby. I’d thought about using ruby when I started working on git-tfs, but I figured the convenience of having the TFS client libraries was worth the heavyweight cost of C#. Some recent developments have made ruby a much more viable option: MS acquired the assets of Teamprise and made them available to TFS users, so jruby would be able to use solid TFS client libraries; ironruby hit 1.0, so I would feel more comfortable targeting it; and MS released VS2010, which precipitated my desire to make git-tfs work with multiple TFS versions out of the box.

I created a tool called git-tfs. I alluded to it the other day. I’ve been using it for about a month on a solo project, and a few other people are picking it up.

As of a few days ago, I released its source code. (I plan to move the ‘official’ mainline to my company’s fork, eventually.) As of today, there’s a pre-built binary that I’ve called v0.9.

To install, you first need git (e.g. msysgit). Download git-tfs and add the extracted directory to your path (or somewhere else that git can find it).

To use, you need to clone your TFS repository. Then you use git like normal, and occasionally fetch or shelve to synchronize with TFS. I’m pretty comfortable with fetching from TFS. I plan to add more ways to push data into TFS.

For cloning: git tfs clone http://tfs:8080 $/MyProject/MyDir my_git_clone

Clone is the same as init + fetch. git-tfs’s clone and init are happy to work with an existing git repository or to create a new one for you.

To shelve: git tfs shelve “Shelveset name”

I opted for shelving first for a few reasons. First of all, it’s pretty safe: there are no permanent changes made to the TFS repository. Also, it sidesteps a couple of tricky issues, including: work item association and source control policy validation on the TFS side; and history preservation vs. rewriting on the git side. I would like to have a way to replicate git commits back into TFS, but that’s not done yet.

First, a disclaimer: some of the things I describe in this post involve directly changing data in TFS’s databases. This isn’t usually recommended. Also, I was using TFS 2005. I’m not sure how much of this works on other versions.

A couple of weeks ago, while a coworker and I were deploying some new work item types, we created a field as an Integer that we realized needed to be a String. Changing the field’s type in the work item type definition results in the error “TF26038: Field type for My.Custom.Field does not match the existing type. It was Integer, but now is String.”

A quick look on Google made it look like there wasn’t a good way to change the field type. I tried updating the Fields table in the database. This looked like it worked, but the actual fields on the work item type didn’t get updated, so we could still only insert integer data. Non-integers failed with a strange error; for me, it said that the TFS server was offline. So the quick fix in the database was obviously a bad fix.

So now the question was, how do we preserve the existing data in a new field?

Step 1 was to create a new field. The first attempt used the same friendly name, but a different RefName. Team Foundation objected: “Error: Field name ‘Custom Field’ is used by the field ‘My.Custom.Field’, so it cannot be used by the field ‘My.Custom.Field2’.” At that point I thought about changing the type of the column in the database. Theoretically, SQL Server can do this, and if I’d had some planned downtime, I would probably have gone this route. But I wasn’t sure that changing the database column type would be the end of my problems, so I wanted to use TFS’s tools to do most of the work.

Also, around this time, I (re)discovered the witfields tool.

Here’s what I did.

  1. Created a new field, My.Custom.Field.Temp, as a String, and added it to the work item type with witimport.
  2. Looked in the Fields database table for the FldId of the old and new fields. For example, the old field had FldId 10138 and the new one was 10146.
  3. In each of WorkItemsAre, WorkItemsWere, and WorkItemsLatest, I copied everything from Fld10138 to Fld10146.
  4. I used witimport to remove the old My.Custom.Field field from the work item type.
  5. I used witfields to delete the old field from the database.
  6. I updated the work item type with a new field, My.Custom.Field.
  7. I repeated steps 2 and 3 to get the data from the new field to the fixed old field.
  8. I used witimport to remove the temp field from the work item.
  9. I used witfields to remove the temp field from the database.
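Steps 2 and 3 above amount to one UPDATE per table. A sketch of generating those statements, using the example FldIds from step 2 (hedged: this matches what I ran against TFS 2005, but verify against your own schema before touching the database):

```ruby
# Steps 2-3: copy the old field's column into the new field's column in
# each of the three work item tables. The column names follow the
# Fld<FldId> pattern observed in the Fields table.
def field_copy_sql(old_fld_id, new_fld_id)
  %w[WorkItemsAre WorkItemsWere WorkItemsLatest].map do |table|
    "UPDATE #{table} SET Fld#{new_fld_id} = Fld#{old_fld_id}"
  end
end
```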

It looks like this worked. Each step was pretty quick, so I wasn’t too worried about being interrupted between steps. That said, I make no claim that this will work for you. If you decide to try this, definitely read up on the TFS commands you’re using, and quadruple-check the SQL updates you run.