The great git.drupal.org migration
After thorough discussions where we weighed the pros and cons of profiting from the Great Git Migration, I think it's pretty clear that we have a great opportunity to migrate our source repositories back to Drupal.org. We will get:
- more visibility - people will find us through drupal.org easily
- more consistency - versions in the issue queue matching the real releases
- even more consistency - people will find the source better from drupal.org
- and then more consistency - usage stats will work again!
We will be able to:
- host our module and theme within the install profile - the major reason why we migrated away back then
- use whatever naming convention we want for tags and branches - although only some will be used for releases for now, see this issue for followup
- keep our history - the git team was nice enough to clone directly from our repositories (although see below, we're taking the opportunity of the migration to clean up our history)
What it means for you
This does have an impact on your existing repositories. When we migrate to git.drupal.org (which should happen by the end of the week, according to my precious sources), the repositories will change location. The other thing is that the commit IDs have changed, and you will need to refresh your local checkouts before you can work with them effectively.
So in short:
- repos will move to git.drupal.org
- you need to rebase or clone again
- make sure your identity is set correctly
URL changes
So the repositories will move. At some point, Koumbit will close git.aegirproject.org and everyone will have to fetch the source code from git.drupal.org instead. Concretely, Koumbit will keep the mirror running for a while just to be on the safe side, but as soon as the Drupal git migration is officially done, you shouldn't expect git.aegirproject.org to work and should switch URLs, as detailed below.
This documentation is taken from the Drupal.org git workflows documentation.
Readonly repositories URL change
The readonly access will change from:
git://git.aegirproject.org/provision.git
git://git.aegirproject.org/hostmaster.git
to:
http://git.drupal.org/project/provision.git
http://git.drupal.org/project/hostmaster.git
See also the git patch maintainer guide.
Core committers repositories URL change
The read/write access will change from:
ssh://gitosis@git.aegirproject.org/provision.git
ssh://gitosis@git.aegirproject.org/hostmaster.git
to:
ssh://[username]@git.drupal.org:project/provision.git
ssh://[username]@git.drupal.org:project/hostmaster.git
... notice how you will need to set your Drupal.org username in the URL. Drupal.org has good documentation on how authentication works on git.drupal.org. See also the project maintainer git guide.
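For an existing checkout, the URL mapping above could be scripted along these lines. This is only a sketch: the function name aegir_new_url is made up for this example, it assumes the project name is the last path component of the old URL, and it reproduces the two URL forms exactly as listed above.

```shell
#!/bin/sh
# Map an old git.aegirproject.org URL to its git.drupal.org equivalent.
# Usage: aegir_new_url OLD_URL [DRUPAL_USERNAME]
# With a username, the read/write form listed above is printed;
# without one, the read-only form is printed.
# (aegir_new_url is a hypothetical helper, not part of git or Drupal.org.)
aegir_new_url() {
    old=$1
    user=${2:-}
    # Keep only the last path component, then drop a trailing ".git".
    project=${old##*/}
    project=${project%.git}
    if [ -n "$user" ]; then
        printf 'ssh://%s@git.drupal.org:project/%s.git\n' "$user" "$project"
    else
        printf 'http://git.drupal.org/project/%s.git\n' "$project"
    fi
}

# Example, to be run inside an existing checkout (not executed here):
#   git remote set-url origin "$(aegir_new_url "$(git config --get remote.origin.url)")"
#   git remote -v
```

Changing the URL this way only repoints the remote; the checkout still needs the rebase or fresh clone described further down.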
Sandboxes
The migration to Drupal.org also introduces sandboxes, something we didn't have before, and which allows you to host your own branches of our modules (even if you're not a core committer) or contrib/test modules, directly on Drupal.org. See the sandbox collaboration guide for more information on that.
What if I pull/push from/to the wrong one?
It could happen before/during/after the migration that you pull or push from or to the wrong repository: don't worry. Because git handles those things very well, you will always be able to push back to the right one.
To be able to push to the new repositories, however, you will need to follow the instructions below to clean up your repositories.
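If you are not sure which host a checkout currently talks to, something like the following could help. It is a rough sketch: remote_host_kind and check_remotes are hypothetical helper names, and the classification simply looks for the two host names mentioned in this post.

```shell
#!/bin/sh
# Classify a remote URL as pointing at the old or the new host.
# (Hypothetical helper; it only matches on the host names used in this post.)
remote_host_kind() {
    case "$1" in
        *git.aegirproject.org*) echo old ;;
        *git.drupal.org*)       echo new ;;
        *)                      echo unknown ;;
    esac
}

# Print name, classification and URL for every remote of the current repository.
check_remotes() {
    for name in $(git remote); do
        url=$(git config --get "remote.$name.url")
        printf '%s\t%s\t%s\n' "$name" "$(remote_host_kind "$url")" "$url"
    done
}
```

Running check_remotes inside a checkout prints one line per remote, so an "old" entry is a hint that the URL still needs switching.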
Rewriting history
I have taken the liberty of rewriting our collective history. There were a lot of commits with inconsistent authors: some had invalid emails, some had no real names, some had multiple different email addresses for the same person. I narrowed the list down, through extensive detective work, to a complete list of 14 distinct authors.
I have rewritten all commit authors that didn't seem right to follow a consistent pattern:
Firstname Lastname <email>
The firstname/lastname is the combination you have used in certain commits. It's not the most popular way (i.e. most commits had the nickname instead of first/last), but it seems the most logical one.
The email is the one you used most often or, if your commits carried no email address (for example because you committed your code back in the days of CVS), the one registered on your drupal.org account. (There are 6 commits from someone with a box at ovh.net that I could track, and that I can only guess are Miguel's, but I'm not sure, so I have let those fly. :)
You should make sure you have your identity set straight in your git configuration, especially your email address, if you want your contributions to be properly credited in the future on Drupal.org. In short, this means:
git config --global user.name "User Name"
git config --global user.email user@example.com
Again, see the manual for more information, especially how this maps to your Drupal.org account.
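To double-check what git will record for your next commit, a small helper like this could be used. It is a sketch under the assumption that you run it inside a checkout; show_identity is a made-up name. Note that git config without --global sets the value for the current repository only, which overrides the global setting there.

```shell
#!/bin/sh
# Print "name <email>" as configured for the current repository,
# or fail with a message if either value is missing.
# (show_identity is a hypothetical helper, not a git command.)
show_identity() {
    name=$(git config --get user.name) || {
        echo "user.name is not set" >&2
        return 1
    }
    email=$(git config --get user.email) || {
        echo "user.email is not set" >&2
        return 1
    }
    printf '%s <%s>\n' "$name" "$email"
}

# Example (not executed here): use a specific identity for one checkout only.
#   cd provision
#   git config user.email user@example.com
#   show_identity
```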
Rebase (or fresh clone) is necessary!!!!
Note that, because we rewrote history, you will need to update your repository to the new remote. That should work fine, without duplicate commits or similar nightmares - normally (I tested this!) git is smart enough to see that only the metadata has changed and will ditch the old metadata.
But you do need to fix up your repository to be able to push to git.drupal.org. This can be done by cloning from scratch or by rebasing your branch to the upstream.
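Before rebasing, it can be reassuring to see which of your commits git considers already present upstream. The sketch below uses git cherry, which compares commits by the change they introduce rather than by commit ID, so a commit whose author was rewritten still counts as equivalent. The helper name local_only_commits is an invention for this example, and origin/master is assumed to be the rewritten upstream branch after a fetch.

```shell
#!/bin/sh
# List commits on the current branch that have no equivalent change
# in the given upstream ref (default: origin/master).
# "git cherry -v" prefixes such commits with "+", and commits whose
# change already exists upstream with "-".
# (local_only_commits is a hypothetical helper name.)
local_only_commits() {
    upstream=${1:-origin/master}
    git cherry -v "$upstream" | sed -n 's/^+ //p'
}

# Example (not executed here):
#   git fetch origin
#   local_only_commits origin/master
```

An empty result means the branch holds nothing the new upstream lacks, which matches the "Nothing to do." outcome of the rebase shown further down.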
Cloning again
Recloning the new repos is the easiest and most obvious way, but may be annoying if you work on a bunch of branches or have local branches you want to keep. To do this, just trash or move away the old repository and clone from the above URLs.
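As a sketch of the "move away and clone" approach, assuming you prefer to keep the old checkout around rather than delete it (reclone is a hypothetical helper, and the ".old" suffix is an arbitrary choice):

```shell
#!/bin/sh
# Move an existing checkout aside and clone it again from a new URL.
# Usage: reclone DIR NEW_URL
# (reclone is a made-up helper; the ".old" suffix is just a convention
# chosen for this example.)
reclone() {
    dir=$1
    url=$2
    if [ -e "$dir.old" ]; then
        echo "$dir.old already exists, not touching anything" >&2
        return 1
    fi
    if [ -e "$dir" ]; then
        mv "$dir" "$dir.old" || return 1
    fi
    git clone "$url" "$dir"
}

# Example (not executed here):
#   reclone provision http://git.drupal.org/project/provision.git
```

The old directory stays available as provision.old, so uncommitted files or local branches can still be recovered from it.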
Rebasing
To rebase, it's a bit more complicated, but it will allow you to keep your local changes transparently. I have successfully done the following steps:
git clone git://git.aegirproject.org/provision
cd provision/
git remote rm origin
git remote add origin ssh://anarcat@git.drupal.org:project/provision.git
git fetch origin
git rebase origin/master
(well, I'm cheating here: I used a test URL with the repo as it will be after the migration)
This looks something like this:
anarcat@angela:test$ git clone git://git.aegirproject.org/provision
Cloning into provision...
remote: Counting objects: 8230, done.
remote: Compressing objects: 100% (5492/5492), done.
remote: Total 8230 (delta 5966), reused 3629 (delta 2660)
Receiving objects: 100% (8230/8230), 1.26 MiB | 628 KiB/s, done.
Resolving deltas: 100% (5966/5966), done.
anarcat@angela:test$ cd provision/
anarcat@angela:provision$ git remote rm origin
anarcat@angela:provision$ git remote add origin gitosis@git.aegirproject.org:/export/provision
anarcat@angela:provision$ git fetch origin
warning: no common commits
remote: Counting objects: 7842, done.
remote: Compressing objects: 100% (2985/2985), done.
remote: Total 7842 (delta 4886), reused 7790 (delta 4834)
Receiving objects: 100% (7842/7842), 6.37 MiB | 625 KiB/s, done.
Resolving deltas: 100% (4886/4886), done.
From git.aegirproject.org:/export/origin
* [new branch] debian -> origin/debian
[...]
* [new branch] provision-0.4 -> origin/provision-0.4
* [new branch] ssl -> origin/ssl
anarcat@angela:provision$ git rebase origin/master
First, rewinding head to replay your work on top of it...
Nothing to do.
anarcat@angela:provision$
Other documentation
People unfamiliar with git or specifically git on Drupal.org should read the growing git handbook on Drupal.org.
People that are familiar with git should contribute to that manual. :)
Under the hood
The history rewrite was performed through git-fast-export and git-fast-import magic, with a fairly simple perl script of my own. I have cloned the current repo, filtered it and published a new version for git.drupal.org to pull. You can see the results here:
http://git.aegirproject.org/?p=export/provision.git
http://git.aegirproject.org/?p=export/hostmaster.git
The exact calling sequence is this:
mkdir export orig
cd export
git init --bare hostmaster.git
cd ../orig
git clone --bare git://git.aegirproject.org/hostmaster
cd hostmaster.git
# look at the authors in the original repo
git fast-export --all | grep -a '^author [^<]* <[^>]*> [0-9][0-9]* [+-][0-9][0-9][0-9][0-9]$' | sed 's/[0-9][0-9]* [+-][0-9][0-9][0-9][0-9]$//' | sort | uniq -c
git fast-export --all | perl /home/anarcat/bin/rewrite_authors.pl | ( cd ../../export/hostmaster.git ; git fast-import )
cd ../../export/hostmaster.git
# look at the authors in the new repo
git fast-export --all | grep -a '^author [^<]* <[^>]*> [0-9][0-9]* [+-][0-9][0-9][0-9][0-9]$' | sed 's/[0-9][0-9]* [+-][0-9][0-9][0-9][0-9]$//' | sort | uniq -c
# rinse, repeat until the mapping is right
# push repositories online
git remote add origin ssh://gitosis@git.aegirproject.org/export/hostmaster.git
git push --all
git push --tags
# repeat with provision
This is the script source, without all the mappings:
#! /usr/bin/perl -w

%authors = (
    'anarcat ' => 'Antoine Beaupré ',
    # ...
);

sub remap {
    my $a = shift;
    if ($authors{$a}) {
        return $authors{$a};
    } else {
        return $a;
    }
}

while (<>) {
    s#^author ([^<]+ <[^>]+>) ([0-9]+ [+-][0-9][0-9][0-9][0-9])$#'author ' . remap($1) . ' ' . $2#e;
    print;
}
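If you just want a quick look at the author list of your own checkout, a shorter route than the fast-export pipeline is git log. This is not the method used for the rewrite above, only a rough equivalent for checking the result; count_authors is a made-up helper name.

```shell
#!/bin/sh
# Count commits per "name <email>" author across all refs of the
# current repository, most frequent first.
# (count_authors is a hypothetical helper; it reads author fields via
# git log instead of parsing the fast-export stream.)
count_authors() {
    git log --all --format='%an <%ae>' | sort | uniq -c | sort -rn
}

# Example (not executed here):
#   cd provision
#   count_authors
```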
Migration checklist
- serve exported repositories to migration team: done!
- update the repositories with changed authors: done!
- the great git migration: in progress (scheduled downtime)
- update documentation in handbook
- update makefiles and files in docs/* in provision to fetch from git.drupal.org
- make at least one other release
- turn off git.aegirproject.org
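For the makefile and docs item in the checklist above, the URL replacement could be done with a small sed filter. This is only a sketch: rewrite_urls is a hypothetical name, it handles the read-only git:// form only, and the sample line in the comment is invented for illustration rather than copied from provision.

```shell
#!/bin/sh
# Read text on stdin and replace read-only git.aegirproject.org URLs
# with their git.drupal.org equivalents, adding ".git" when missing.
# (rewrite_urls is a made-up helper; it assumes project names consist
# of letters, digits, "_" and "-".)
rewrite_urls() {
    sed 's#git://git\.aegirproject\.org/\([A-Za-z0-9_-]*\)\(\.git\)\{0,1\}#http://git.drupal.org/project/\1.git#g'
}

# Example with an invented sample line (not executed here):
#   echo 'url = git://git.aegirproject.org/hostmaster.git' | rewrite_urls
# To rewrite a file, write to a temporary copy first:
#   rewrite_urls < some.make > some.make.new && mv some.make.new some.make
```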