Migrating at scale? These are the tools you need to use
Migrating a website from one platform to another can be daunting. But what if you’re migrating at scale? We recently completed a 50-site migration project, and in this post we outline some of the tools that helped things run smoothly.
Our biggest migration project to date is one we undertook for the University of Limerick (UL). The project will ultimately result in the migration of more than 200 sites to the Drupal 9 platform.
Phase one of the massive migration was completed last year, and involved 50 initial sites. These sites were on the Drupal 7 platform, which is nearing end-of-life. Each site had different log-ins, content types, configuration, modules and plugins.
Migrating to one, single D9 platform would allow the University of Limerick to improve governance, ensure regulatory compliance and brand consistency, and reduce technical debt. Plus there would be only one platform to maintain.
Rebuild vs upgrade
We knew that Drupal 9 was the websites’ destination. But before the project could commence we needed to map out the best way to get there. This is an important step in any project – do you rebuild the old site or would it be better to keep the site and upgrade it?
What’s the difference?
An upgrade is when you update your website to the latest version of your CMS and its related modules. It means keeping the content in place and simply updating the code around it. Upgrades are generally more technical in nature and don’t involve redesigning.
A rebuild means that you essentially start from scratch. The reasons for rebuilding are usually technical – so you might be changing from one CMS to another, or the technology behind your website is obsolete. Rebuilding may involve migrating information from the old version of the CMS to the new one.
It will probably also include a redesign – websites need refreshing every couple of years because they need to stay on-trend, and also there may be redundant sections of the website that need to be replaced with new ones. This will be the case if the business has grown in certain areas, or added new products and services. The information architecture may also need tweaking as user journeys evolve and their needs change.
We decided that a rebuild was the best option for the UL project. There were many reasons for this. Drupal 7 was released more than 11 years ago and technology had moved on quite substantially. Many of the decisions made in the original builds were now outdated, and the different sites, having grown organically over time, were no longer fit for purpose.
Sometimes it’s better to rebuild than to carry a heavy legacy over to a new website. There were also many abandoned modules and, probably the biggest reason to rebuild, there was no clear upgrade path.
Preparing for migrations
Before ever starting a migration, we recommend you undertake the following process:
- Get to know your data. There’s no such thing as too much planning. Look at the legacy data structures, every single field and entity, until you understand your data. What are you going to keep and what are you going to ditch? Maybe some content is completely out of date…
- Map your data structures. Create a spreadsheet for every single entity. Map it from where it is to where it’s going to live on the new site. This step can take weeks or months, depending on the size of your migration and how many sites you’re doing. We created hundreds of mapping spreadsheets for the UL project.
- Define your control data. This is the data that you know covers the different aspects of the site, the different combinations of fields. This is the data that you’re going to be testing with constantly. Run a test migration. If it doesn’t work then you’ll need to tweak it, roll it back and run it again. When the tests pass with your control data then you can move on to running it on the full set of data.
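The mapping and control-data steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not the process used on the UL project: the field names, `FIELD_MAP`, `CONTROL_DATA`, `migrate_record` and `run_control_tests` are all hypothetical, and a real Drupal migration would express the mapping in migration YAML rather than Python.

```python
# Hypothetical mapping for one entity type: legacy field -> destination field.
# A value of None marks a field that is deliberately dropped.
FIELD_MAP = {
    "title": "title",
    "body": "field_body",
    "field_old_banner": None,
}


def migrate_record(legacy, field_map):
    """Return the destination record; fail loudly on any unmapped legacy field."""
    unmapped = set(legacy) - set(field_map)
    if unmapped:
        raise KeyError(f"unmapped legacy fields: {sorted(unmapped)}")
    return {
        dest: legacy[src]
        for src, dest in field_map.items()
        if dest is not None and src in legacy
    }


# Control data: a small, hand-picked set of records covering the different
# field combinations, each paired with the result we expect after migration.
CONTROL_DATA = [
    ({"title": "About", "body": "<p>Hi</p>"},
     {"title": "About", "field_body": "<p>Hi</p>"}),
    ({"title": "News", "field_old_banner": "x.png"},
     {"title": "News"}),
]


def run_control_tests(field_map, control_data):
    """Return a list of (legacy, expected, actual) tuples for failing records."""
    failures = []
    for legacy, expected in control_data:
        actual = migrate_record(legacy, field_map)
        if actual != expected:
            failures.append((legacy, expected, actual))
    return failures


if __name__ == "__main__":
    # An empty list means the control data passes and the full run can go ahead.
    print(run_control_tests(FIELD_MAP, CONTROL_DATA))
```

Raising an error on an unmapped field is a design choice in this sketch: it forces every legacy field to be either mapped or explicitly dropped, which mirrors the "every single field" advice above.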
Tools to help in the migration
Once a decision has been made on the platform you’re going to be moving to (in this case it was Drupal 9), and how to get there (rebuild or upgrade), there are many tools available to help you on the migration journey.
Here are some of the tools that were utilised in the University of Limerick migration:
1. Spreadsheets
Spreadsheets are extremely useful for collaborating, and for this project they were created by hand, which was laborious. To complicate matters, some fields existed on only some of the sites, while others existed on every site but were configured differently.
2. Drupal 7 field analysis
Spreadsheet hell was the catalyst for Annertech senior developer Erik Erskine to come up with a nifty module for the next migration. Erik looked at ways of automating the process of creating spreadsheets, and wrote a command line tool that creates a Google sheet for a single entity type.
The tool extracts all the fields in paragraph types, records where each field has been used and in what context, and links them back to the legacy site. The module gives a comprehensive view of what’s in each entity, and will save hours of work in future.
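To give a feel for what automating the spreadsheet step can look like, here is a rough Python sketch. It is not Erik’s module and does not use its code or Google Sheets: the `FIELD_INSTANCES` data and the `field_sheet` function are hypothetical, and the output is plain CSV with one row per field for a single entity type.

```python
import csv
import io

# Hypothetical field instances gathered from several legacy sites:
# (site, entity_type, bundle, field_name, field_type)
FIELD_INSTANCES = [
    ("site1", "node", "page", "body", "text_with_summary"),
    ("site2", "node", "page", "body", "text_long"),
    ("site1", "node", "page", "field_banner", "image"),
    ("site1", "taxonomy_term", "tags", "field_colour", "text"),
]


def field_sheet(instances, entity_type):
    """Build CSV text for one entity type, with one row per field."""
    usage = {}
    for site, etype, bundle, name, ftype in instances:
        if etype != entity_type:
            continue
        entry = usage.setdefault(name, {"used_on": set(), "types": set()})
        entry["used_on"].add(f"{site}:{bundle}")
        entry["types"].add(ftype)

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["field", "used_on", "types", "consistent"])
    for name in sorted(usage):
        entry = usage[name]
        writer.writerow([
            name,
            " ".join(sorted(entry["used_on"])),
            " ".join(sorted(entry["types"])),
            # Flag fields whose type differs between sites.
            "yes" if len(entry["types"]) == 1 else "no",
        ])
    return out.getvalue()


if __name__ == "__main__":
    print(field_sheet(FIELD_INSTANCES, "node"))
```

The "consistent" column is there to surface the problem described earlier: a field that exists on several sites but is configured differently on each.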
You can find the Drupal 7 field analysis module in the drupal.org sandbox.
3. Derivatives
Scaling your configurations can be made simple by using derivatives.
When you have a large number of migrations, with variations and discrepancies between sites that have slowly diverged over time, one approach is to define a base configuration that covers the commonalities and then create a separate migration file for each variation that extends it. This lets you run an individual migration for every variation. But as the project scales and the number of variations increases, so does the number of YAML files, and it can all get quite tricky to manage. This is where derivatives are extremely useful.
Derivatives provide a simple way to expand a single plugin so that it can represent itself as multiple plugins, and are used in other parts of Drupal, not just in migrations. To implement derivatives, you just need to add in an extra line in your YAML file to define the path to the deriver plugin you want to use. This causes your YAML file to become a definition of a base plugin and then it’s up to the deriver to take this base definition and generate lots of copies.
The end result is that we end up with lots of migrations that operate and behave the same as normal non-derivative migrations, but rather than multiple YAML files to maintain covering all of these migrations, there is just one base YAML file.
This is a huge time saver when it comes to creating, or later tweaking the migrations. You can just edit the variations in one central location, and avoid having to edit multiple YAML files when you need to change something later.
The derived migration machine names are a bit different from what you may be used to. They take the format of the base identifier, colon and derivative name (base_identifier:derivative_name). For example, if you have a migration for the “page” content type and create a derivative for each site in a multisite, then the derived migrations could have names like “node_page:site1”, “node_page:site2” and so on.
Similarly, a set of derived taxonomy migrations could be created as “taxonomy_term:tags”, “taxonomy_term:product_types”, etc. Other than this slightly different naming structure, derived migrations are exactly the same as normal ones. The article “Migrations can now be derived and executed after a specific other migration” covers the essentials of how to actually use derivers.
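The idea behind a deriver can be modelled in a few lines of Python. This is a language-neutral sketch of the concept, not Drupal’s PHP deriver API: `BASE_DEFINITION` stands in for the single base YAML file, `derive` plays the role of the deriver, and the `SITES` variations and their database keys are hypothetical.

```python
import copy

# Hypothetical base definition, standing in for the single base YAML file.
BASE_DEFINITION = {
    "id": "node_page",
    "source": {"plugin": "d7_node", "node_type": "page"},
    "destination": {"plugin": "entity:node"},
}

# Hypothetical per-site variations, all kept in one central place.
SITES = {
    "site1": {"key": "site1_db"},
    "site2": {"key": "site2_db"},
}


def derive(base, variations):
    """Return {derived_id: definition}: one copy of the base per variation."""
    derived = {}
    for name, overrides in variations.items():
        definition = copy.deepcopy(base)  # never mutate the shared base
        definition["id"] = f"{base['id']}:{name}"  # base_identifier:derivative_name
        definition["source"].update(overrides)
        derived[definition["id"]] = definition
    return derived


migrations = derive(BASE_DEFINITION, SITES)

if __name__ == "__main__":
    print(sorted(migrations))  # ['node_page:site1', 'node_page:site2']
```

Adding a third site in this model means adding one entry to `SITES`, rather than creating and maintaining another full definition.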
Good to go
Don’t be intimidated by migrations. Once you’ve done the preparation, and used some of the tips and tools mentioned above, it should all hopefully go according to plan.
Of course things can change and go wrong when you least expect it, but following the guidelines in this blog should set you up for success.
This blog was written following a presentation by Stella Power and Erik Erskine at DrupalCon Prague, called “Migration at scale. How to not… fail”. A recording of the presentation is available to watch.
Stella Power Managing Director
As well as being the founder and managing director of Annertech, Stella is one of the best known Drupal contributors in the world.