An introduction to the Aegir Hosting system

This article first appeared on mig5's blog on 13th October, 2009

Tonight at 18:30 GMT+11 I'll be giving a talk/presentation on 'An Introduction to the Aegir Hosting system' at the regular monthly Melbourne Drupal meet at the Emspace offices.

My notes during the last week kind of took on a life of their own, and so in anticipation of the talk, I thought I'd publish them here, despite the somewhat casual tone, for those who can't make it and want a somewhat (perhaps overly, for an Introduction!) detailed overview of the Aegir system and what it does.

I believe the session will also be recorded, so we should see the video pop up on blip.tv or somewhere in days to come. Update 20/10/09: The video is available here http://blip.tv/file/2741212

A variation of these notes may end up being part of the documentation on g.d.o too.

Update: Here's a PDF of these notes http://ln-s.net/4Nbk

---------------------------------------------------------------------------

An introduction to the Aegir Hosting system

Contents

  1. What is Ægir?
  2. Brief history of Ægir
  3. Dissecting gods: the anatomy of Ægir
  4. Features and terminology
  5. What Aegir doesn't do
  6. Roadmap - where are we going
  7. Questions
  8. Links

What is Ægir?

Ægir is a set of Drupal components that, when used together, help you manage other Drupal sites.
It does this by providing you with a simple Drupal-based hosting frontend, or control panel, that allows you to perform actions against your network of Drupal sites. Some of these actions or 'tasks' include installing new sites, verifying the integrity of sites, backing up and restoring them, managing 'Platforms' (which are copies of Core or distributions such as OpenAtrium), and migrating your sites between these platforms. We'll cover the relationship between sites and platforms in a moment.

To Ægir, a site is just a node like any other node in a Drupal site. The node type, be it Site or Platform etc, determines which of these tasks can be performed on it.

Taken from the page on groups.drupal.org: In Norse mythology, Ægir was the god of the oceans and if Drupal is a drop of water, Ægir is the deity of large bodies of water.

Brief history of Ægir

Ægir used to be called Hostmaster, or Hostmaster2. It was authored by Adrian Rossouw, who remains the lead developer of the project, back when he worked at Bryght. I believe Bryght pioneered the idea of a system like this to manage large deployments of sites for their clients. Hostmaster was the result of that work, and it went through a lot of growth and enhancement over a period of 4 years before it ever became 'Aegir' as it is now. Adrian's also known for other Drupal accomplishments such as the Forms API, the PHPTemplate theme engine, and the install profile system.

Since then, Bryght got acquired by Raincity Studios, and Adrian's now working for Development Seed. So a large influence on the development and direction of Ægir is naturally coming from that camp, as they provide Adrian with the time and the means to work on improving the product.

Joining Adrian on the core development team are Antoine Beaupré, aka 'anarcat', who's from the Koumbit worker's collective in Montreal and has been working on this for a long time too, and myself.

For this introductory talk, I'm going to focus on the anatomy of Ægir and how the components fit together and with Drupal. Then I'll provide a brief overview of the main features of Ægir and what you'd use it for.

After the talk, to avoid taking up too much time, I'll be happy to run through some more in depth, under-the-hood areas of Ægir such as the Provision backend, the database schema etc, and answer any of the more technical questions you might have.

Dissecting gods: the anatomy of Ægir

Ægir is not a 'module' nor yet a 'distribution' for Drupal, but currently a collection of 'components' as I like to call them, that work together to do the whole job.

In order to best understand how it all works, we can define Ægir as made up of two parts: the frontend, and the backend.

Frontend

The frontend is actually a Drupal site that makes use of a module called 'Hosting', which does all the work of the frontend: basically it provides the mechanisms to add / remove nodes like any other Drupal site, allowing you to point and click on tasks like Install, Backup and so forth. It also provides a nice UI to manage your sites in general, and to review tasks that have completed, are in progress, or have failed for whatever reason. Most screenshots you've probably seen of Ægir are using the Eldir theme developed by Young Hahn at Development Seed, which is made specifically for Ægir, though it is optional and Ægir will function without it.

Backend

The backend is made up of a 'drush module' or extension called Provision, which does pretty much what its name says: it works very heavily with Drush to handle all of the tasks such as installing new Drupal sites, importing existing ones, installing platforms, verifying everything, instigating backups and restores, and so on. So it's doing the heavy lifting here, actually executing the tasks you assign to the system.

Provision is invoked through the use of drush commands, and thus these commands can also generally be executed from the command line to perform the functions that make Ægir so important. In fact it's very much by design that all the magic happens in the backend, because it allows us to have one Ægir frontend that can manage multiple backends, or in other words, multiple webservers and database servers, though these are experimental features at the moment.

How Hosting and Provision work together

The glue that binds the two together is a Task queue in the database, which is viewable in the frontend. Hosting lets you kick off tasks by adding them to this queue, and the queue is how Provision learns of those tasks and executes them.

On your server, we then have a dedicated 'aegir' user, whose crontab has an entry to regularly look for and 'dispatch' any queues containing tasks that are pending, which are then forked off into drush provision commands in the background.

So the idea is that all this happens behind the scenes for you as the manager of the aegir system: the frontend exists for you to queue up all these sorts of actions, and let Ægir worry about actually doing them in the backend.
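To make the queue-and-dispatch idea a bit more concrete, here's a rough sketch in Python. It is not Ægir's actual code (which is PHP and works on nodes in the frontend's database): the Task class, its fields and the dispatch_pending function are names I've made up purely for illustration, and the 'run' callable stands in for forking a drush provision command.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Task:
    """A queued task against a site or platform (hypothetical model)."""
    task_type: str
    target: str
    pending: bool = True


def dispatch_pending(queue: List[Task], run: Callable[[Task], None],
                     limit: int = 5) -> List[Task]:
    """Hand up to `limit` pending tasks to the backend, oldest first.

    `run` stands in for forking a backend command; here it is just a callable.
    """
    picked = [t for t in queue if t.pending][:limit]
    for task in picked:
        task.pending = False
        run(task)
    return picked


# The frontend only ever adds tasks to the queue...
queue = [Task("verify", "platform-6.14"), Task("install", "www.example.com")]

# ...and a later cron run picks them up and hands them to the backend.
executed: List[str] = []
dispatch_pending(queue, run=lambda t: executed.append(f"{t.task_type} {t.target}"))
print(executed)
```

The point of the split is that the frontend never executes anything itself; a second run of dispatch_pending on the same queue finds nothing left to do.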

Along with Provision, Hosting, and Eldir, there is also another component which is an Install Profile called 'Hostmaster'. The Hostmaster Install profile is, like all install profiles, really only executed once, when you are setting up Ægir for the first time. And it does the work of getting information from you, or telling you to set things up on a system level (like directories, permissions etc), asking what Features you'd like to enable, and kicking off the initial verification of the system, setting up the crontab entry etc.

[Screenshots: the home directory of Aegir; the backend Provision module; the Hostmaster profile and the Hosting frontend module contained therein]

The Features and terminology of Ægir

So before I start running a few examples of what Ægir can do, I thought I'd bore you some more with the terminology and features that Ægir has, as there's often some confusion over what some of the terms mean.

Some features in Ægir are enabled by default, some cannot be turned off, some are optional and can be enabled at any time, and some are experimental and may have unexpected results in production environments.

Everything is a node

In keeping with strong Drupal tradition, and out of pure sense, everything is a node in Ægir. This includes Ægir's information on your web server, database server, platforms, sites, right down to tasks, and information about modules, themes, and install profiles.

Platforms and Sites

First of all, Ægir's very much designed to work within Drupal's multisite structure. Often one of the biggest hurdles users experience when trying to learn Ægir can be traced back to the fact that few of them are actually making use of Drupal's multisite design, and instead tend to use one core per site. While Ægir will work that way just fine, it's designed to make use of the proper multisite logic introduced way back in I think Drupal 4.5 or maybe d5. In fact, that's what makes it so powerful and useful: it handles multisite for you, and it handles it properly.

For whatever reason, multisite is often not used, and so users are often confused about what constitutes a 'Platform' in Ægir, so we'll cover that now.

Basically the first thing you do after installing Ægir is go and set up what we call a 'Platform', which is the code base on which you'll create a Site.

'Platform' translates directly to a copy of Drupal core, like Drupal 6.14 or 5.20, but it isn't limited to Drupal core. It could very well be a 'distribution' or custom Drupal core like OpenAtrium, Pressflow, OpenPublish, and so on. So it's important to recognise that a Platform is actually very simple to understand conceptually once you learn to exchange the term for a 'Drupal core' or thereabouts, and I suspect it's often expected to be more complicated than it really is.

So, just as in a standard multisite system in Drupal, sites are installed into the sites/ subdirectory of that Platform.

So a Platform and a Site are two examples of what are simply node types in the database, just like any regular Drupal site. This is what makes Ægir so easy, because to create a new Site, you can literally node/add/site or Create Content > Site and fill out the required fields, submit the form, and Ægir does the rest.

So you start to form a picture in your mind of a pyramid structure: in Drupal multisite, a Core or a Platform manages many sites. Ægir is a level above that, in that it can manage many Platforms. So Ægir becomes very powerful, because by managing many Cores or Platforms it can in turn manage a great many sites, be they Drupal 5, Drupal 6, Pressflow, or OpenAtrium sites. And so on.
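If it helps to see the pyramid written down, here's a tiny sketch in Python. The Platform class and the example platform and site names are made up for illustration (in Ægir itself these are nodes, not Python objects); the only thing taken from Drupal is that a site lives under the sites/ subdirectory of its platform.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath
from typing import List


@dataclass
class Platform:
    """A code base (Drupal core or a distribution) holding many sites."""
    root: PurePosixPath
    sites: List[str] = field(default_factory=list)

    def site_path(self, site: str) -> PurePosixPath:
        # Standard multisite layout: <platform root>/sites/<site name>
        return self.root / "sites" / site


# One system, many platforms, each with many sites (example names only).
platforms = [
    Platform(PurePosixPath("/var/aegir/drupal-5.20"), ["legacy.example.com"]),
    Platform(PurePosixPath("/var/aegir/drupal-6.14"),
             ["www.example.com", "blog.example.com"]),
]

all_site_paths = [str(p.site_path(s)) for p in platforms for s in p.sites]
print(all_site_paths)
```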

The task queue

When you add a new Platform or site, you can see that these items are immediately dropped into the task queue which you can see in this block in the sidebar. As the cron comes around once per minute, it'll pick these tasks up and dispatch them to the backend to be executed.

Depending on the return from Drush and Provision as they execute these tasks, the status of this task in the queue will change from Queued to either Failed or Success. In both cases, the full drush log output is given in the task node.

On a failed task, you can review the problems and if you feel you've fixed the issue, re-schedule the task to re-enter the queue and be executed again.
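As a sketch of that life cycle (Queued, then Failed or Success, with the log kept either way, and a retry for failures), something like the following Python captures the idea. TaskNode and its methods are hypothetical names, and I'm assuming for the sake of the example that a zero exit code means success and that only failed tasks get rescheduled.

```python
from dataclasses import dataclass, field
from typing import List

QUEUED, SUCCESS, FAILED = "Queued", "Success", "Failed"


@dataclass
class TaskNode:
    """A task in the queue together with the log of its last runs."""
    title: str
    status: str = QUEUED
    log: List[str] = field(default_factory=list)

    def record_result(self, exit_code: int, output: List[str]) -> None:
        """Keep the backend output and set the final status either way."""
        self.log.extend(output)
        # Assumption for this sketch: exit code 0 means the task succeeded.
        self.status = SUCCESS if exit_code == 0 else FAILED

    def reschedule(self) -> None:
        """Put a failed task back in the queue once the problem is fixed."""
        if self.status != FAILED:
            raise ValueError("only failed tasks can be rescheduled")
        self.status = QUEUED


task = TaskNode("Install www.example.com")
task.record_result(1, ["could not create database"])
print(task.status, task.log)

# After fixing the problem, the task re-enters the queue and runs again.
task.reschedule()
task.record_result(0, ["site installed"])
print(task.status, task.log)
```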

Drush and rolling back changes

Because Drush has a rich API that provides validate_, pre_, post_ and rollback hooks, we can safely revert any changes that were made if a task fails, through a rollback, or perform other functionality before or after the fact. This means that although Ægir is taking a lot of the heavy lifting out of your hands with this stuff, it's also very careful with your data. It doesn't want to destroy your sites, it wants to nurture them, and it'll undo or refuse to perform any actions it thinks might be unsafe.
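The general pattern behind this is 'do the steps in order, and if one fails, undo the completed ones in reverse'. Here's a minimal sketch of that pattern in Python. It is not Drush's hook API: run_with_rollback and the step names are invented for the example, and real rollbacks obviously touch databases and config files rather than a dictionary.

```python
from typing import Callable, Dict, List, Tuple

# A step is (name, action, undo); all three are supplied by the caller.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_with_rollback(steps: List[Step]) -> List[str]:
    """Run each step in order; on failure undo completed steps in reverse."""
    log: List[str] = []
    done: List[Step] = []
    for step in steps:
        name, action, _undo = step
        try:
            action()
            log.append(f"ok: {name}")
            done.append(step)
        except Exception as exc:
            log.append(f"failed: {name} ({exc})")
            for prev_name, _action, prev_undo in reversed(done):
                prev_undo()
                log.append(f"rolled back: {prev_name}")
            break
    return log


state: Dict[str, bool] = {"database": False, "vhost": False}


def create_db() -> None:
    state["database"] = True


def drop_db() -> None:
    state["database"] = False


def write_vhost() -> None:
    raise RuntimeError("cannot write config")


def remove_vhost() -> None:
    state["vhost"] = False


log = run_with_rollback([
    ("create database", create_db, drop_db),
    ("write vhost", write_vhost, remove_vhost),
])
print(log)
print(state)
```

Because the second step fails, the database created by the first step is dropped again and the system ends up where it started.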

Installing Sites

In the case of installing a site, support is written for Drupal 5 and Drupal 6 platforms. There is Drupal 7 support in there that once worked, but the shifting nature of Drupal 7's development meant that support for d7 fell out of Drush, and thus it currently doesn't work in Ægir, though I think a Drupal 7 platform *will* verify ok.

Installing sites works because Provision basically knows how to create a MySQL database and credentials for the site, and feeds these settings, along with locale, timezone etc, and any other settings you normally enter manually during a Drupal install, automatically into the installer. It doesn't matter what installer it is: it can be either the standard Drupal default installer, or a custom install profile you've written, or other install profiles such as the atrium_installer profile that makes OpenAtrium what it is.

Install also goes through and generates the Apache vhost configuration file for the site and restarts Apache, so you don't need to do any of this.

On successful installation of a site, the one-time login link will be e-mailed to the 'Client' (we'll go through Clients later), and the login URL will also be logged to the task output for quick reference. It's normally the task output that I'll go to in order to retrieve the login URL. So immediately you can log in to your brand new Drupal site, and you only clicked a couple of buttons.

Site Aliases

Ægir supports site 'aliases', which translate directly to the ServerAlias parameter in Apache vhost files. One can add URL aliases that will serve a site called by another name. A classic case is installing a site with a URL of 'www.example.com' with an alias of 'example.com' so that example.com also serves the site (provided a resource record is set in DNS, of course).

Redirection is also supported as an optional feature to Aliases, allowing permanent redirects of the aliases to the main site URL by way of mod_rewrite. All this is configurable in the site node form, though Site Aliasing is an optional feature that must be turned on first.
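To give a feel for what ends up in the vhost, here's a small Python sketch that renders one with a ServerName, optional ServerAlias entries and an optional mod_rewrite redirect. This is my own simplified template, not the one Provision actually writes: render_vhost, the document root path and the exact rewrite rules are assumptions for illustration, though the Apache directives themselves are the standard ones.

```python
from typing import Iterable


def render_vhost(site: str, docroot: str, aliases: Iterable[str] = (),
                 redirect: bool = False, port: int = 80) -> str:
    """Render a minimal Apache vhost for a site (illustrative template only)."""
    alias_list = list(aliases)
    lines = [
        f"<VirtualHost *:{port}>",
        f"    ServerName {site}",
        f"    DocumentRoot {docroot}",
    ]
    if alias_list:
        lines.append(f"    ServerAlias {' '.join(alias_list)}")
    if alias_list and redirect:
        # Permanently redirect any request not made to the main URL.
        escaped = site.replace(".", r"\.")
        lines += [
            "    RewriteEngine On",
            f"    RewriteCond %{{HTTP_HOST}} !^{escaped}$ [NC]",
            f"    RewriteRule ^/(.*)$ http://{site}/$1 [R=301,L]",
        ]
    lines.append("</VirtualHost>")
    return "\n".join(lines) + "\n"


conf = render_vhost(
    "www.example.com",
    "/var/aegir/drupal-6.14",  # example platform path
    aliases=["example.com"],
    redirect=True,
)
print(conf)
```

Without the redirect option, the alias simply serves the same site under the other name; with it, visitors to example.com are sent on to www.example.com.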

Importing sites

It is possible to import existing sites that you're hosting on a server, into your new Ægir system.

Since most people seem to be using one core per site and not using multisite, it's just a matter of moving the copy of core into /var/aegir/, renaming sites/greenbeedigital.com.au to sites/$yoursite, and then adding a Platform in the Ægir frontend. Upon a successful Verify of the new Platform, any sites detected in the sites/ folder will have an Import task spawned for them, and Ægir will hence 'learn' of these sites and their packages.
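The detection step can be pictured as a scan of the platform's sites/ folder. Here's a rough Python sketch of that idea. It's an assumption on my part that a site is recognised by a directory containing a settings.php, and that 'all' and 'default' are skipped; find_sites is a made-up name, and the real Verify task does a good deal more than this.

```python
import tempfile
from pathlib import Path
from typing import List


def find_sites(platform_root: Path) -> List[str]:
    """Return names of directories under sites/ that contain a settings.php."""
    sites_dir = platform_root / "sites"
    return sorted(
        d.name
        for d in sites_dir.iterdir()
        if d.is_dir()
        and d.name not in ("all", "default")  # assumed not to be importable sites
        and (d / "settings.php").is_file()
    )


# Build a throwaway platform layout to try it on.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "drupal-6.14"
    for name in ("all", "default", "www.example.com", "blog.example.com"):
        (root / "sites" / name).mkdir(parents=True)
    (root / "sites" / "www.example.com" / "settings.php").write_text("<?php\n")
    (root / "sites" / "blog.example.com" / "settings.php").write_text("<?php\n")
    found = find_sites(root)

# Each detected site would then get an Import task queued for it.
print(found)
```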

Obviously it is not ideal to have one platform per site, and the whole point of Ægir is to support multiple sites on one Platform. So once this site is imported, you will likely want to Migrate the site to another platform, which we'll cover in a moment.

Verify, and Packages

Along with Install, other tasks exist in Ægir. One that is often run is Verify, which just checks that the filesystem of the site or platform looks sane, that permissions are all correctly set, that the aegir superuser can create new databases, and so on. It also refreshes the database with knowledge of what modules, themes or profiles (altogether referred to as Packages in Ægir) exist on a platform or site. It keeps a running state of what packages are installed, so that if you try to Migrate a site, it knows whether the target platform can support these modules or not.

If you edit a site or platform's node - for instance if you are adding a URL alias for an existing site - on node_save, the item will be re-added to the task queue as a Verify task. This handles cases like an alias addition, where the site's vhost needs to be edited and Apache restarted for the change to take effect.

Site Disable / Enable

If a site is installed and enabled, one can Disable the site, which effectively puts a redirect in the Apache vhost for that site, pointing to a page saying the site's been disabled. It also goes and backs up the site.

If a site is Disabled, it can then also be re-enabled, which essentially unsets the redirection in the vhost so that the site starts working again.

Site Delete

Once a site is in a Disabled state, you can Delete it, which will go through and (again) take a backup first, then remove the actual site data from the server, remove the knowledge of the site from the aegir database, drop the site's own database, and remove the vhost config. As mentioned earlier, if any of these steps go downhill, the procedure will be reversed as best as it can.

Site Migrate

I mentioned Migrate earlier, and this is really in my opinion, the biggest feature of Ægir and what makes it so worth using. Migrate essentially is another task that tells Provision to *move* the site from one platform to another. The most obvious case of where you'd want to do this is when a new Drupal core release comes out, like recently with 6.14.

All I had to do is download Drupal 6.14 to the server, add it as a platform in Ægir, and then use Migrate to move my Drupal 6.13 sites to Drupal 6.14.

In the process of this, it runs the Drupal updates and hence performs the hook_updates, schema changes etc that one would normally do by hand in an upgrade.

What makes this quite significant is that it's by design that you can take a Drupal 5 site, and use Migrate to upgrade it to Drupal 6. It's built right into Ægir to handle that upgrade automatically for you.

Migrate basically takes a backup of the site, and then it uses a special 'deploy' command to 'deploy' this instance of the site to the target platform. It then removes the site from the existing platform.

I mentioned earlier the concept of a 'Package' in Ægir, which is any module or theme or install profile that Ægir finds on a platform or a site. With its knowledge of the package, its schema version and what 'instance' in which this package is being used, it can analyse a site that you want to migrate, compare the packages in use with what can be used on the target platform, and present a report on whether the site can be migrated or not.

In other words, it will intelligently calculate that you cannot migrate a Drupal 6 site to a Drupal 5 platform, because the schema versions of the Drupal 5 packages are incompatible with those installed on the Drupal 6 site. Obviously, however, the reverse ought to work.
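A simplified version of that comparison might look like the Python below. The rule I've used (every package in use on the site must exist on the target, with a schema version that isn't older) is my own reduction of the idea, not Ægir's exact logic; migration_report is a made-up name and the schema numbers are purely illustrative.

```python
from typing import Dict, List


def migration_report(site_packages: Dict[str, int],
                     platform_packages: Dict[str, int]) -> List[str]:
    """List the reasons a site cannot move to a platform (empty means OK)."""
    problems: List[str] = []
    for name, site_schema in sorted(site_packages.items()):
        if name not in platform_packages:
            problems.append(f"{name}: missing on target platform")
        elif platform_packages[name] < site_schema:
            problems.append(
                f"{name}: target schema {platform_packages[name]} "
                f"is older than site schema {site_schema}"
            )
    return problems


# Illustrative schema versions only.
d6_site = {"system": 6051, "views": 6005}
d5_platform = {"system": 1022, "views": 12}
d6_platform = {"system": 6053, "views": 6005}

downgrade = migration_report(d6_site, d5_platform)
upgrade = migration_report(d6_site, d6_platform)
print(downgrade)
print(upgrade)
```

An empty report means the migration can go ahead; anything else is shown to the user before any task is queued.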

And yes, it backs up and will rollback if unsuccessful. :)

Site Backup

So I've mentioned Backup a few times, and I should cover it: essentially the Backup task can be executed against a Site. It does a mysqldump of the site into a file in the site's directory, then tars up the whole site's directory and drops the timestamped tarball into a backups directory that lives outside the document root of any platform.
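Here's a small Python sketch of the 'tar up the site directory with a timestamp' half of that. The mysqldump step is replaced by a placeholder file, and backup_site, the file names and the timestamp format are all my own choices for the example rather than what Provision actually uses.

```python
import tarfile
import tempfile
import time
from pathlib import Path


def backup_site(site_dir: Path, backups_dir: Path) -> Path:
    """Tar up a site's directory into backups_dir with a timestamped name."""
    backups_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d.%H%M%S")
    tarball = backups_dir / f"{site_dir.name}-{stamp}.tar.gz"
    with tarfile.open(tarball, "w:gz") as tar:
        # arcname="." keeps paths inside the archive relative to the site dir.
        tar.add(site_dir, arcname=".")
    return tarball


with tempfile.TemporaryDirectory() as tmp:
    site = Path(tmp) / "platform" / "sites" / "www.example.com"
    site.mkdir(parents=True)
    (site / "settings.php").write_text("<?php\n")
    # Stand-in for the mysqldump output the real task writes first.
    (site / "database.sql").write_text("-- dump would go here\n")

    # The backups directory sits outside the platform's document root.
    tarball = backup_site(site, Path(tmp) / "backups")
    with tarfile.open(tarball) as tar:
        members = sorted(tar.getnames())

print(tarball.name)
print(members)
```

Because the database dump sits inside the site's directory before the tarball is made, a single archive holds both the files and the data, which is what makes a one-click Restore possible.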

Site Restore

Using the Restore task, you can take any of these backups, and with a single click, restore your entire site, codebase and database all together, to a previous state from one of the tarballs.

Deploy

The 'Deploy' task I mentioned earlier isn't shown in the frontend. It's a backend command that is used in Migrate and Clone tasks. Clone is a new feature in our early 0.4 alpha releases that does what you probably are already thinking it does: it generates a snapshot of a site (actually using the Backup task above, we reuse as much code as possible) and uses Deploy to deploy that copy of a site with a new URL to any platform, be it the current platform or a separate platform.

Site Clone

So you can use Clone to clone entire copies of your site to a new URL, which could be useful for testing, or if you have a sort of 'template' site with common modules and themes, structures or common nodes etc. You could consider it a poor man's install profile.

Batch Migrate

Until recently, you could only Migrate one site at a time. We've now introduced Batch Migrate, which means you can migrate *every* site that currently exists on a platform, and move them all to a new platform in one hit. Again, this is especially useful when you have a lot of sites on a Drupal core that has suddenly become vulnerable with a new security release, and you need to move them all across to the new core as soon as you can.

We know of Ægir being used to deploy and manage more than 2100 sites on one system, actually by another Australian, Dave Hall, who's up in Bendigo, and goes by the alias skwashd on Drupal.org and IRC. So features like Batch Migrate are especially significant.

Cron Queue

Since Ægir already has a task queue scheduler, it makes sense that it should be able to queue up enabled sites and run cron regularly on the sites themselves. So a Feature exists to do exactly that. Both the regularity of the Cron and the Task queues are configurable.

Clients, roles, permissions

Another Feature and node type is that of the Client. A Client in Ægir is not a specific user in the typical Drupal sense, but more so a conceptual wrapper node that also translates directly to a role type. One then can create users and assign them to a 'Client' just like a role. Such users have the ability to login to the Ægir site and manage their own sites. In this way you can expose Aegir functionality to your clients and let them create their own sites or manage them (to a point).

Another role that Ægir sets up is the Ægir Account Manager: this is designed as a non-technical role, likely a member of your own staff, who has the ability to set up and manage Clients in Ægir, but cannot perform any technical tasks or install sites etc.

Ports and SSL

A lot of work has been done in recent releases to implement multiple webserver Port support, along with SSL support which is not quite complete.

Currently one can create a site URL on one port only, and cannot create the same site URL on another port. There are cases where this will be required (such as port 80 and 443 for sites that need SSL as well), which is something I'm working on at the moment in my development.

What Aegir doesn't do

  • Aegir knows about your modules and their release schema etc, in order to handle migrations and that sort of thing. But it isn't a plugin manager, and you can't disable/enable modules of sites within Ægir.
  • To the dismay of at least 50% of users who install it, Ægir is not in fact a build management tool, it's a site management tool. But one can build Platforms using drush_make, which can pull from drupal.org repos as well as your own git/svn/cvs repository, and migrate your sites onto new 'release' builds. I'll spend some time on this after the talk for those who are interested.
  • It doesn't provision non-Drupal sites. But it may support static sites, or perhaps Joomla, in the future.

Roadmap - where are we going?

  • Lots of UI refactoring will be done before 1.0 - modal dialog, more UI feedback to the user
  • Refactoring 'Servers' from being separate node types, i.e. webserver, dbserver etc, to a single 'Server' type that acts as a container for pluggable Services (http, mysql, postgresql, dns)
  • Develop a File service, one reason being to deal with moving backups around, using a 'spoke' model or 'mesh' model (see roadmap)
  • Just in this week gone, in the alpha2 release, we leverage 'Drush make', allowing Hostmaster to provision itself and migrate itself to new platforms (0.4 alpha2 and onwards likely)
  • Third-party application hook-ins, i.e DNS, LDAP, Mail, Jabber.. with the idea of a control panel done right
  • Stronger quota management and the potential for e-commerce hook-ins, selling sites as a product
  • Move to PDO to support PostgreSQL, SQLite etc
  • The 1.0 release will see a 'frozen' Hostmaster API

What else?

Questions?

Links
