Client: NIH NICHD CSSC
Languages / Platforms: Python, Flask, HTML/CSS/JS, USWDS, Drupal
Role: Technical Lead, Software Developer

The Confluence-to-Drupal migrator is a set of tools I developed to do exactly what the name says: migrate an entire Confluence website into Drupal.

Confluence as a CMS

Several NIH Institutes and Centers (ICs) were using Confluence as a CMS. It's an unusual choice, but the greatest benefit was that you could edit sites using an easy-to-understand block editor, and you could even build your own custom blocks for maximum customizability.

Unfortunately, though this part of the site was customizable, many other aspects of running the site were quite restricted. Editing CSS, for instance, required users to make all changes in a single file in a tiny editor window.

Suddenly, licensing changes!

The original impetus for migrating away from Confluence as a CMS came when Atlassian changed its licensing. Confluence Server reached the beginning of the end in February 2021, when Atlassian stopped selling new Server licenses and started directing everyone to Cloud, a less-capable and less-customizable substitute.

Costs go up, flexibility goes down.

Enter: Drupal

After some research, we decided that Drupal was a pretty clear winner as a replacement. It's free and open source, has an enormous user base and developer community, and is used by many, many government sites.

Additionally, unlike other otherwise-viable alternatives like WordPress, Drupal has robust support for the US Web Design System. The USWDS was designed by and for the US federal government as a way to provide a familiar experience across all .gov websites. The Confluence-based site used USWDS extensively and we had no intention of abandoning that, so Drupal it was.

Unfortunately but unsurprisingly, there existed no tool to convert Confluence-as-CMS to Drupal. And even if there had been, it would've been extremely difficult to also carry over all the structured macros that had been custom-built and put all of them where they're supposed to go on the new site.

We decided that the best approach was to build a bespoke tool that would handle all of these tasks. That's where I came in!

Let's go over the individual problems that had to be overcome.

Problem 1: How to migrate 1400+ pages?

Luckily for us, Confluence can export an XML file containing all or part of a site's contents. The format is very complex and really cuts to the bone of how Confluence works under the hood: one thing references another thing somewhere else in the file, which references another thing, and so on.

I put in the time to untangle this complexity, at least insofar as was needed to extract the contents of all published articles on the site.
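To give a feel for that cross-referencing, here is a minimal Python sketch. The XML layout is a simplified, hypothetical stand-in for the export (the element names, property names, and IDs are assumptions for illustration, not the real schema). The idea it shows is the two-pass approach: index the body records by ID first, then follow each published page's reference to its body.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical export: NOT the real Confluence schema.
SAMPLE_EXPORT = """
<export>
  <object class="Page">
    <id>100</id>
    <property name="title">About Us</property>
    <property name="status">current</property>
    <ref class="BodyContent" id="900"/>
  </object>
  <object class="Page">
    <id>101</id>
    <property name="title">Old Draft</property>
    <property name="status">draft</property>
    <ref class="BodyContent" id="901"/>
  </object>
  <object class="BodyContent">
    <id>900</id>
    <property name="body">&lt;p&gt;Hello&lt;/p&gt;</property>
  </object>
  <object class="BodyContent">
    <id>901</id>
    <property name="body">&lt;p&gt;Unfinished&lt;/p&gt;</property>
  </object>
</export>
"""


def prop(obj, name):
    """Return the text of a named <property> child, or None if absent."""
    for child in obj.findall("property"):
        if child.get("name") == name:
            return child.text
    return None


def extract_published_pages(xml_text):
    """Return (title, body) pairs for every page whose status is 'current'."""
    root = ET.fromstring(xml_text)

    # First pass: index every body record by ID so references can be resolved.
    bodies = {
        obj.findtext("id"): prop(obj, "body")
        for obj in root.findall("object")
        if obj.get("class") == "BodyContent"
    }

    # Second pass: follow each published page's reference to its body.
    pages = []
    for obj in root.findall("object"):
        if obj.get("class") != "Page" or prop(obj, "status") != "current":
            continue
        ref = obj.find("ref")
        body = bodies.get(ref.get("id")) if ref is not None else None
        pages.append((prop(obj, "title"), body))
    return pages
```

Calling `extract_published_pages(SAMPLE_EXPORT)` on the sample above skips the draft and returns only the published page with its resolved body.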

Once everything's processed, the migrator imports each article into Drupal automatically through Drupal's REST API.
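As a rough illustration of the import step, here is a sketch that builds the request for creating one node. It assumes Drupal's core RESTful Web Services module with the node resource enabled for POST and Basic Auth turned on; the content type `article`, the text format `full_html`, and the function names are assumptions of mine, not details of the actual migrator. It only constructs the request, so nothing is sent until you pass it to `urllib.request.urlopen`.

```python
import json
import urllib.request


def build_node_payload(title, body_html, content_type="article", text_format="full_html"):
    """Build the JSON body for creating a node (bundle and format are assumed)."""
    return {
        "type": [{"target_id": content_type}],
        "title": [{"value": title}],
        "body": [{"value": body_html, "format": text_format}],
    }


def build_create_request(base_url, payload, auth_header):
    """Wrap the payload in a POST request aimed at the node REST endpoint."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/node?_format=json",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": auth_header,
        },
        method="POST",
    )


# Usage (example.gov and the credentials are placeholders):
# payload = build_node_payload("About Us", "<p>Hello</p>")
# request = build_create_request("https://example.gov/", payload, "Basic ...")
# with urllib.request.urlopen(request) as response:
#     created = json.load(response)
```

Keeping payload construction separate from sending makes it easy to dry-run a whole batch and inspect what would be posted before touching the live site.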

Problem 2: How to keep the block-based editor?

Unlike Confluence, Drupal doesn't natively support block-based editing. It does, however, have a module available that lets you use Gutenberg, the block editor that comes standard with modern WordPress.

My migrator tool provided one-to-one replacement of structured macros with Gutenberg blocks, with no human intervention needed.
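Here is a minimal sketch of what a one-to-one replacement can look like. Gutenberg stores each block as an HTML comment pair wrapping the block's markup, with attributes as JSON inside the opening comment. The lookup table, the `example/...` block names, and the function names are hypothetical; the real migrator's mapping isn't shown here.

```python
import json

# Hypothetical lookup table: Confluence macro name -> Gutenberg block name.
MACRO_TO_BLOCK = {
    "info": "example/alert",
    "expand": "example/accordion",
}

# Sequences that could end or confuse the surrounding HTML comment are
# swapped for JSON unicode escapes, which decode back to the same text.
COMMENT_ESCAPES = (
    ("--", "\\u002d\\u002d"),
    ("<", "\\u003c"),
    (">", "\\u003e"),
    ("&", "\\u0026"),
)


def serialize_block(block_name, attrs, inner_html):
    """Wrap inner_html in Gutenberg-style block comment delimiters."""
    attr_json = ""
    if attrs:
        raw = json.dumps(attrs, sort_keys=True, separators=(",", ":"))
        for sequence, escaped in COMMENT_ESCAPES:
            raw = raw.replace(sequence, escaped)
        attr_json = raw + " "
    return (
        f"<!-- wp:{block_name} {attr_json}-->\n"
        f"{inner_html}\n"
        f"<!-- /wp:{block_name} -->"
    )


def convert_macro(macro_name, params, body_html):
    """Replace one structured macro with its Gutenberg block equivalent."""
    block_name = MACRO_TO_BLOCK.get(macro_name)
    if block_name is None:
        # Fail loudly so an unmapped macro is never silently dropped.
        raise KeyError(f"No Gutenberg block registered for macro {macro_name!r}")
    return serialize_block(block_name, params, body_html)
```

For example, `convert_macro("info", {"title": "Note"}, "<p>Hi</p>")` produces an `example/alert` block whose opening comment carries `{"title":"Note"}`.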

Problem 3: How to keep special block functionality (and migrate it to boot)?

Of course, it wasn't that simple. Much time had been spent fleshing out the suite of structured macros that made the Confluence-based sites as customizable and easily editable as they were. Unsurprisingly, the format used is not one that can easily be replicated automatically in another CMS.

As such, part of the project was to painstakingly recreate most of the structured macros in Gutenberg by way of building custom blocks.

And yes, that includes migrating all settings from structured macros into their new Gutenberg block analogues!
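To illustrate the settings side, here is a sketch that reads the parameters off a structured macro and translates them into block attributes. The `ac:structured-macro` and `ac:parameter` elements follow Confluence's storage format, but the namespace URI is a placeholder so the fragment parses, and the `open` parameter, the attribute names, and the mapping table are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI so the "ac:" prefix parses; an assumption for this sketch.
AC_NS = "urn:example:confluence-ac"

# Hypothetical settings map: macro -> {parameter name: (attribute name, converter)}.
PARAM_MAP = {
    "expand": {
        "title": ("heading", str),
        "open": ("startExpanded", lambda v: v.strip().lower() == "true"),
    },
}

SAMPLE_MACRO = (
    '<ac:structured-macro ac:name="expand">'
    '<ac:parameter ac:name="title">Details</ac:parameter>'
    '<ac:parameter ac:name="open">true</ac:parameter>'
    "</ac:structured-macro>"
)


def parse_macro(fragment):
    """Return (macro name, {parameter name: raw string value}) for one macro."""
    # Wrap the fragment so the "ac:" prefix is declared for the XML parser.
    wrapped = f'<root xmlns:ac="{AC_NS}">{fragment}</root>'
    macro = ET.fromstring(wrapped).find(f"{{{AC_NS}}}structured-macro")
    name = macro.get(f"{{{AC_NS}}}name")
    params = {
        p.get(f"{{{AC_NS}}}name"): (p.text or "")
        for p in macro.findall(f"{{{AC_NS}}}parameter")
    }
    return name, params


def map_settings(macro_name, params):
    """Rename and type-convert macro parameters into block attributes."""
    mapping = PARAM_MAP.get(macro_name, {})
    attrs = {}
    for key, value in params.items():
        if key in mapping:
            attr_name, convert = mapping[key]
            attrs[attr_name] = convert(value)
    return attrs
```

The converter in each mapping entry matters because macro parameters arrive as strings, while block attributes are typed JSON values such as booleans.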

Security alert

The migrator project was originally planned as a long-term project which wouldn't strictly be needed until the existing licenses expired. It was worked on in the background over approximately two years.

But then one day, the Confluence site was taken offline as the result of a security exploit.

Suddenly, the migrator's day had come!

Working with Woodbourne

The decision was made to get the new Drupal-based site working as quickly as possible, since Confluence was at this point no longer a viable solution.

After initially considering using Drupal as the basis of a static website, we eventually enlisted the expertise of Woodbourne Solutions, which was already responsible for building and administering many of the other Drupal-based sites used by NICHD.

For this project, I was given the role of Technical Lead, as I was the one most familiar with the migrator and associated workflow.

We had just under two months to completely migrate the site and a good chunk of its articles to Drupal. (The plan is to go through the earlier articles over time.) This involved working out the remaining glitches and oddities of the migrator, rapidly adapting to an unfamiliar workflow, and keeping all the pages and their progress organized through a custom-built issue tracker app.

What it did NOT involve, however, was painstakingly reproducing every single article by hand.

Success

The site finally went online in November of 2024.

Having access to the entire inner workings of the site allows for far greater customizability than the Confluence version ever did. We run it on our own servers (well, Woodbourne's servers) and can change pretty much anything we want.

And I'm proud to say it was all made possible (at long last) through the tools and workflow I'd built!

This was a pretty complex project and I've skipped many of the finer details that would have slowed this project profile to a crawl. I'm happy to provide all those details one-on-one :)