Back at the start of this series, I listed four problems within the scope of the WordPress Importers that we needed to address. Three of them are largely technical problems, which I covered in previous posts. In wrapping up this series, I want to focus exclusively on the fourth problem, which has a philosophical side as well as a technical one — but that does not mean we cannot tackle it!
Problem Number 4
Some services work against their customers, and actively prevent site owners from controlling their own content.
Some services are merely inconvenient: they provide exports, but it often involves downloading a bunch of different files. Your CMS content is in one export, your store products are in another, your orders are in another, and your mailing list is in yet another. It’s not ideal, but they at least let you get a copy of your data.
However, there’s another class of services that actively work against their customers. It’s these services I want to focus on: the services that don’t provide any ability to export your content — effectively locking people in to using their platform. We could offer these folks an escape! The aim isn’t to necessarily make them use WordPress, it’s to give them a way out, if they want it. Whether they choose to use WordPress or not after that is immaterial (though I certainly hope they would, of course). The important part is freedom of choice.
It’s worth acknowledging that this is a different approach to how WordPress has historically operated in relation to other CMSes. We provide importers for many CMSes, but we previously haven’t written exporters. However, I don’t think this is a particularly large step: for CMSes that already provide exports, we’d continue to use those export files. This is focussed on the few services that try to lock their customers in.
Why Should WordPress Take This On?
There are several aspects to why we should focus on this.
First of all, it’s the the WordPress mission. Underpinning every part of WordPress is the simplest of statements:
The freedom to build. The freedom to change. The freedom to share.
These freedoms are the pillars of a Free and Open Web, but they’re not invulnerable: at times, they need to be defended, and that needs people with the time and resources to offer a defence.
Which brings me to my second point: WordPress has the people who can offer that defence! The WordPress project has so many individuals working on it, from such a wide variety of backgrounds, we’re able to take on a vast array of projects that a smaller CMS just wouldn’t have the bandwidth for. That’s not to say that we can do everything, but when there’s a need to defend the entire ecosystem, we’re able to devote people to the cause.
Finally, it’s important to remember that WordPress doesn’t exist in a vacuum, we’re part of a broad ecosystem which can only exist through the web remaining open and free. By encouraging all CMSes to provide proper exports, and implementing them for those that don’t, we help keep our ecosystem healthy.
We have the ability to take on these challenges, but we have a responsibility that goes alongside. We can’t do it solely to benefit WordPress, we need to make that benefit available to the entire ecosystem. This is why it’s important to define a WordPress export schema, so that any CMS can make use of the export we produce, not just WordPress. If you’ll excuse the imagery for a moment, we can be the knight in shining armour that frees people — then gives them the choice of what they do with that freedom, without obligation.
How Can We Do It?
Moving on to the technical side of this problem, I can give you some good news: the answer is definitely not screen scraping. 😄 Scraping a site is fragile, impossible to transform into the full content, and provides an incomplete export of the site: anything that’s only available in the site dashboard can’t be obtained through scraping.
I’ve recently been experimenting with an alternative approach to solving this problem. Rather than trying to create something resembling a traditional exporter, it turns out that modern CMSes provide the tools we need, in the form of REST APIs. All we need to do is call the appropriate APIs, and collate the results. The fun part is that we can authenticate with these APIs as the site owner, by calling them from a browser extension! So, that’s what I’ve been experimenting with, and it’s showing a lot of promise.
If you’re interested in playing around with it, the experimental code is living in this repository. It’s a simple proof of concept, capable of exporting the text content of a blog on a Wix site, showing that we can make a smooth, comprehensive, easy-to-use exporter for any Wix site owner.
Clicking the export button starts a background script, which calls Wix’s REST APIs as the site owner, to get the original copy of the content. It then packages it up, and presents it as a WXR file to download.
I’m really excited about how promising this experiment is. It can ultimately provide a full export of any Wix site, and we can add support for other CMS services that choose to artificially lock their customers in.
Where Can I Help?
If you’re a designer or developer who’s excited about working on something new, head on over to the repository and check out the open issues: if there’s something that isn’t already covered, feel free to open a new issue.
Since this is new ground for a WordPress project, both technically and philosophically, I’d love to hear more points of view. It’s being discussed in the WordPress Core Dev Chat this week, and you can also let me know what you think in the comments!