The great git.drupal.org migration

After thorough discussions where we weighed the pros and cons of taking part in the Great Git Migration, I think it's pretty clear that we have a great opportunity to migrate our source repositories back to Drupal.org. We will get:

  • more visibility - people will find us through drupal.org easily
  • more consistency - versions in the issue queue matching the real releases
  • even more consistency - people will find the source better from drupal.org
  • and then more consistency - usage stats will work again!

We will be able to:

  • host our module and theme within the install profile - the major reason why we migrated away back then
  • use whatever naming convention we want for tags and branches - although only some will be used for releases for now, see this issue for follow-up
  • keep our history - the git team was nice enough to clone directly from our repositories (although see below, we're taking the opportunity of the migration to clean up our history)

What it means for you

This does have an impact on your existing repositories. Since we migrated to git.drupal.org, the repositories have changed location. The commit IDs have also changed, so you will need to make fresh clones before you can work with them effectively.

So in short:

  • repos have moved to git.drupal.org
  • you need to clone again
  • make sure your identity is set correctly

URL changes

So the repositories have moved. Everyone has to fetch the source code from git.drupal.org now. At some point in the future, Koumbit will close git.aegirproject.org. Concretely, Koumbit will keep the mirror running readonly for a while just to be on the safe side, but you shouldn't expect git.aegirproject.org to work and should switch URLs, as detailed below.

This documentation is taken from the Drupal.org git workflows documentation.

Readonly repositories URL change

The readonly access will change from:

git://git.aegirproject.org/provision.git
git://git.aegirproject.org/hostmaster.git

to:

http://git.drupal.org/project/provision.git
http://git.drupal.org/project/hostmaster.git

See also the git patch maintainer guide.

Core committers repositories URL change

The read/write access will change from:

ssh://gitosis@git.aegirproject.org/provision.git
ssh://gitosis@git.aegirproject.org/hostmaster.git

to:

ssh://[username]@git.drupal.org/project/provision.git
ssh://[username]@git.drupal.org/project/hostmaster.git

... notice how you will need to set your Drupal.org username in the URL. Drupal.org has good documentation on how authentication works on git.drupal.org. See also the project maintainer git guide.
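
To make the mapping between old and new URLs concrete, here is a small shell sketch. The new_url helper and the jdoe username are made up for illustration; only the URL patterns come from the lists above.

```shell
# Print the git.drupal.org URL matching an old git.aegirproject.org URL.
# $1: old URL
# $2: optional Drupal.org username; when given, the read/write form is printed
new_url() {
    project=$(basename "$1" .git)
    if [ -n "${2:-}" ]; then
        echo "ssh://$2@git.drupal.org/project/$project.git"
    else
        echo "http://git.drupal.org/project/$project.git"
    fi
}

# Read-only form
new_url git://git.aegirproject.org/provision.git
# Read/write form for a hypothetical Drupal.org user "jdoe"
new_url ssh://gitosis@git.aegirproject.org/hostmaster.git jdoe
```

The printed URL is what you would hand to git clone when making your fresh checkout.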

Sandboxes

The migration to Drupal.org also introduces sandboxes, something we didn't have before, and which allows you to host your own branches of our modules (even if you're not a core committer) or contrib/test modules, directly on Drupal.org. See the sandbox collaboration guide for more information on that.

What if I pull/push from/to the wrong one?

It could happen before/during/after the migration that you pull or push from or to the wrong repository. git can rebase with the new repository, but unfortunately, the old commits will be mingled with the new ones.

To be able to push to the new repositories, you will need to follow the instructions below to clean up your repositories.
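
Before anything else, it can help to check which server an existing checkout points at. This is only a sketch: check_origin is a hypothetical helper, and it assumes your remote is named origin, which is the default after a clone.

```shell
# Report which server the "origin" remote of a checkout points at.
# $1: path to a git checkout (defaults to the current directory)
check_origin() {
    url=$(git -C "${1:-.}" config --get remote.origin.url) || {
        echo "no origin remote"
        return 0
    }
    case "$url" in
        *git.aegirproject.org*) echo "old: $url" ;;
        *git.drupal.org*)       echo "new: $url" ;;
        *)                      echo "other: $url" ;;
    esac
}

# Demo on a throwaway repository that still uses the old URL.
demo=$(mktemp -d)
git init -q "$demo"
git -C "$demo" remote add origin git://git.aegirproject.org/provision.git
check_origin "$demo"
```

A checkout reported as "old" is one to replace with a fresh clone rather than pull into.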

Rewriting history

I have taken the liberty of rewriting our collective history. There were a lot of commits with inconsistent authors: some had invalid emails, some had no real names, some had multiple different email addresses for the same person. I narrowed the list down, through extensive detective work, to a complete list of 14 distinct authors.

I have rewritten all commit authors that didn't seem right to follow a consistent pattern:

Firstname Lastname <email>

The firstname/lastname is the combination you have used in certain commits. It's not the most popular way (i.e. most commits had the nickname instead of first/last), but it seems the most logical one.

The email was your most often used email or, if you didn't have an email address specified (in case you committed your code back in the days of CVS), the one you have registered at drupal.org. (There are 6 commits from someone with a box at ovh.net that I could track, and that I can only guess are Miguel's, but I'm not sure, so I have let those fly. :)

You should make sure you have your identity set straight in your git configuration, especially your email address, if you want your contributions to be properly credited in the future on Drupal.org. In short, this means:

git config --global user.name "User Name"
git config --global user.email user@example.com

Again, see the manual for more information, especially how this maps to your Drupal.org account.
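
If you want to see what identity git will actually record, one way is to make a test commit in a throwaway repository and read it back. The sketch below sets the identity per repository (no --global) so that it doesn't touch your real configuration; the name and email are the same placeholders as above.

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Per-repository settings, so this demo leaves ~/.gitconfig alone.
git config user.name "User Name"
git config user.email user@example.com

# Make an empty test commit and print the author git recorded for it.
git commit -q --allow-empty -m "identity check"
git log -1 --format='%an <%ae>'
```

The same git log command, run in a real checkout, shows the author of your latest commit there.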

A fresh clone is necessary!!!!

Note that, because we rewrote history, you will need to clone your repository from the new remote. So yes, contrary to what was documented here before, you will need to clone from scratch.

We understand this may be annoying if you work on a bunch of branches or have local branches you want to keep. Unfortunately, our original tests didn't cover all use cases and misled us into thinking we could just rebase existing repositories.
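
One possible way to carry local commits over to a fresh clone is to export them as patches with git format-patch and replay them with git am, since patches don't depend on commit IDs. This is not part of the migration procedure itself, only a sketch on two throwaway repositories: "old" stands in for your existing checkout, "new" for the fresh clone, and the branch name my-feature is made up. Expect conflicts if upstream changed the same lines.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Stand-in for the old checkout: one upstream commit plus one local commit.
git init -q old
cd old
echo base > README
git add README
git -c user.name=nick -c user.email=nick@example.com commit -q -m "upstream commit"
upstream=$(git symbolic-ref --short HEAD)   # default branch name of this git
git checkout -q -b my-feature
echo change >> README
git -c user.name="User Name" -c user.email=user@example.com commit -q -a -m "local work"

# Export the local commits as patch files.
git format-patch -q -o "$work/patches" "$upstream..my-feature"

# Stand-in for the fresh clone: same content, rewritten author, new commit IDs.
cd "$work"
git init -q new
cd new
echo base > README
git add README
git -c user.name="Firstname Lastname" -c user.email=nick@example.com commit -q -m "upstream commit"

# Replay the local work on top of the new history.
git checkout -q -b my-feature
git -c user.name="User Name" -c user.email=user@example.com am -q "$work"/patches/*.patch
git log --format='%an: %s'
```

In real use, the "new" repository would be your clone from git.drupal.org, and the format-patch range would start at whichever upstream branch your local branch was based on.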

Other documentation

People unfamiliar with git or specifically git on Drupal.org should read the growing git handbook on Drupal.org.

People that are familiar with git should contribute to that manual. :)

Under the hood

The history rewrite was performed through git-fast-export and git-fast-import magic, with a fairly simple perl script of my own. I have cloned the current repo, filtered it and published a new version for git.drupal.org to pull. You can see the results here:

http://git.aegirproject.org/?p=export/provision.git
http://git.aegirproject.org/?p=export/hostmaster.git

The exact calling sequence is this:

mkdir export orig
cd export
git init --bare hostmaster.git
cd ../orig
git clone --bare git://git.aegirproject.org/hostmaster
cd hostmaster.git
# look at the authors in the original repo
git fast-export --all |  grep -a '^author [^<]* <[^>]*> [0-9][0-9]* [+-][0-9][0-9][0-9][0-9]$' | sed 's/[0-9][0-9]* [+-][0-9][0-9][0-9][0-9]$//' | sort | uniq -c
git fast-export --all | perl /home/anarcat/bin/rewrite_authors.pl | ( cd ../../export/hostmaster.git ; git fast-import )
cd ../../export/hostmaster.git
# look at the authors in the new repo
git fast-export --all |  grep -a '^author [^<]* <[^>]*> [0-9][0-9]* [+-][0-9][0-9][0-9][0-9]$' | sed 's/[0-9][0-9]* [+-][0-9][0-9][0-9][0-9]$//' | sort | uniq -c
# rinse, repeat until the mapping is right
# push repositories online
git remote add origin ssh://gitosis@git.aegirproject.org/export/hostmaster.git
git push --all
git push --tags
# repeat with provision
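
To show what the "look at the authors" pipeline produces, here it is wrapped in a function and fed a few made-up lines in the format git fast-export emits. The list_authors name and the sample identities are assumptions for the demo; the grep and sed expressions are the ones from the sequence above.

```shell
# Count distinct "author" identities in a fast-export stream read from stdin.
list_authors() {
    grep -a '^author [^<]* <[^>]*> [0-9][0-9]* [+-][0-9][0-9][0-9][0-9]$' |
        sed 's/[0-9][0-9]* [+-][0-9][0-9][0-9][0-9]$//' |
        sort | uniq -c
}

# Made-up sample: three author lines (two identical identities) and one
# committer line, which the grep pattern ignores.
sample='author nick <nick@example.com> 1298000000 -0500
author nick <nick@example.com> 1298000100 -0500
author Firstname Lastname <nick@example.com> 1298000200 +0000
committer nick <nick@example.com> 1298000200 +0000'

printf '%s\n' "$sample" | list_authors
```

Each output line is a count followed by one identity, which makes duplicates and odd spellings easy to spot.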

This is the script source, without all the mappings:

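#! /usr/bin/perl -w
use strict;

# Map of old identities to new ones. Keys and values must be full
# "Name <email>" strings to match what the regex below captures.
my %authors = (
    'anarcat ' => 'Antoine Beaupré ',
   # ...
);

# Return the mapped identity, or the original one if there is no mapping.
sub remap {
    my $author = shift;
    return exists $authors{$author} ? $authors{$author} : $author;
}

# Rewrite the author lines of a git fast-export stream read from stdin.
while (<>) {
    s#^author ([^<]+ <[^>]+>) ([0-9]+ [+-][0-9][0-9][0-9][0-9])$#'author ' . remap($1) . ' ' . $2#e;
    print;
}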
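
To see the whole round trip on something small, and why the commit IDs change, here is a sketch on a throwaway repository. Note that sed stands in for rewrite_authors.pl here, with a single made-up mapping; like the script, it only touches author lines, not committer lines.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# A throwaway "original" repository with one commit by a nickname author.
git init -q orig
cd orig
echo hello > README
git add README
git -c user.name=nick -c user.email=nick@example.com commit -q -m "first commit"
old_id=$(git rev-parse HEAD)

# An empty bare repository to receive the rewritten history.
cd "$work"
git init -q --bare export.git

# Export, rewrite the author line, import. sed replaces the Perl script.
( cd orig && git fast-export --all ) |
    sed 's/^author nick </author Firstname Lastname </' |
    ( cd export.git && git fast-import --quiet )

new_id=$(git --git-dir=export.git log --all -1 --format=%H)
echo "old: $old_id"
echo "new: $new_id"
git --git-dir=export.git log --all -1 --format='%an <%ae>'
```

The author is part of what git hashes to produce a commit ID, so changing it gives a different ID for that commit and for every commit built on top of it.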

Migration checklist

  1. serve exported repositories to migration team - done!
  2. update the repositories with changed authors - done!
  3. the great git migration - done! git.aegirproject.org is now readonly, commits are pushed to git.drupal.org and releases are performed on drupal.org
  4. update documentation in handbook - in progress!
  5. update the home page links - done: I have changed the link in the body from git.aegirproject.org to the drupal.org provision/hostmaster project pages, and pointed the "Get the source" link at the bottom to the install manual instead
  6. update makefiles and files in docs/* in provision to fetch from git.drupal.org - done!
  7. make at least one other release - done: 0.4-rc2 release coordination
  8. turn off git.aegirproject.org - done