CMS Architecture: Before you begin

It’s important to plan any large-scale software development. Fail to plan, plan to fail as the old saying goes. So before a single line of code is written, before even any architecture or technology decisions have been made, I’m going to define what I want my CMS to be.

Firstly, and most importantly, the CMS needs to be extensible. In the same way that CRM (customer relationship management) systems became XRM (entity relationship) systems, this CMS needs to be an XMS. This will mean that the system can be used as much more than just a system for managing pages, and instead can be used to manage any kind of data.

Content is generally thought of as pages and files, but just below the surface a vast array of content types is easy to find:

– Property websites have pages which represent houses
– Band websites have albums, which have songs
– E-commerce systems have products, which have alternative versions (like sizes) and related products
– Magazine sites have articles, which may be several on one page or longer articles spread across multiple pages

Forcing users to manage and – more crucially – conceptualise their content as just “pages” will lead to confusion and badly maintained data.

WordPress handles these different types of data through Custom Post Types, and does it very well. All post types share the same basic properties, and can be extended by adding further metadata. Relationships can be defined between posts using the taxonomy system.

This capability is crucial to the power afforded by a CMS and needs to be baked in from the very beginning.
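
To make that idea concrete, here is a minimal sketch of a content model where every type shares the same base properties, carries its own metadata, and is related to other items through shared taxonomy terms. It is written in Python purely for illustration; `ContentItem`, `ContentStore` and the term naming scheme are names I've made up for the example, not WordPress's API, and the in-memory list stands in for real storage.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ContentItem:
    """Base properties shared by every content type (hypothetical model)."""
    content_type: str                                        # e.g. "page", "album", "house"
    title: str
    body: str = ""
    metadata: dict[str, Any] = field(default_factory=dict)   # type-specific extras
    terms: set[str] = field(default_factory=set)             # taxonomy terms


class ContentStore:
    """In-memory stand-in for real storage, for illustration only."""

    def __init__(self) -> None:
        self._items: list[ContentItem] = []

    def add(self, item: ContentItem) -> ContentItem:
        self._items.append(item)
        return item

    def of_type(self, content_type: str) -> list[ContentItem]:
        """All items of one content type."""
        return [i for i in self._items if i.content_type == content_type]

    def related(self, item: ContentItem) -> list[ContentItem]:
        """Other items that share at least one taxonomy term with `item`."""
        return [i for i in self._items if i is not item and i.terms & item.terms]


store = ContentStore()
album = store.add(ContentItem("album", "First Album", terms={"album:first-album"}))
song = store.add(ContentItem("song", "Opening Track",
                             metadata={"track_number": 1},
                             terms={"album:first-album"}))
house = store.add(ContentItem("house", "3-bed semi", metadata={"bedrooms": 3}))

# The song is found through the term it shares with the album.
print([i.title for i in store.related(album)])
```

Nothing in the store knows what an album or a house is; adding a new kind of content is a matter of choosing a type name and some metadata, which is the property I'm after.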

Secondly, the CMS needs to be easy for developers to work with (ease of use from a front-end user perspective goes without saying). There are three traits of well-designed systems that make them easy for developers to use.

1) Consistent

If the system is not consistent in its approach every line of code has the potential to trip the developer up. Consistency is not just about ensuring class and method names follow a pattern, but about the ethos behind the architecture of the system. If the user has to instantiate a helper class in module X, but in module Y there is a static manager class then consistency is broken.

Also, having two – or even more – ways to achieve the same thing without clear direction on which one to choose puts doubt in the developer’s mind.

2) Discoverable

Following on from the system being consistent is making it discoverable. It may not be possible to create a system where developers never need to check the documentation, but aiming to get as close to that as possible is a worthy goal.

This can be achieved by applying standard patterns that developers will recognise, and by carefully thinking through each architecture decision to ensure it is as intuitive as possible to use the system.

3) Sensible

If just one of these traits is followed, make it this one; make the system sensible. Have sensible defaults, sensible naming conventions, sensible use of existing patterns, sensible architecture decisions. The moment the system forces developers to do something non-intuitive, a little bit crazy, then a battle is lost.

Better to spend a little time now ensuring the most sensible decision is taken than spend a lot of time later reversing a crazy decision.

Thirdly the system must make as few assumptions as possible. Setting hard limits, such as the number of properties an entity may have, is a recipe for disaster – even if you think those limits are really high. Assuming that users will always need to be authenticated by a username and password will fail the moment an organisation needs users to enter their membership number as well.

Assuming that particular HTML output will always be used is also a sure-fire way to ensure the system will get outdated quickly. And assuming that your users will never want – or need – to use a different database system will also limit future development. Designing a system this flexible is hard work and requires a lot of thought, but when the alternative is painting yourself into a corner the choice is clear.
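
As a rough illustration of avoiding the authentication assumption, the sketch below treats credentials as an open-ended mapping and hides the checking behind a small interface, so a membership-number requirement can be layered on without touching the login code. The `Authenticator` interface, both classes and the sample data are hypothetical, and passwords are compared in plain text only to keep the example short.

```python
from typing import Mapping, Optional, Protocol


class Authenticator(Protocol):
    """Anything that can turn submitted credentials into a user id (hypothetical)."""

    def authenticate(self, credentials: Mapping[str, str]) -> Optional[str]:
        ...


class PasswordAuthenticator:
    def __init__(self, users: Mapping[str, str]) -> None:
        self._users = users  # username -> password (plain text for brevity only)

    def authenticate(self, credentials: Mapping[str, str]) -> Optional[str]:
        username = credentials.get("username", "")
        password = credentials.get("password")
        if password is not None and self._users.get(username) == password:
            return username
        return None


class MembershipAuthenticator:
    """Wraps another authenticator and also requires a membership number."""

    def __init__(self, inner: Authenticator, memberships: Mapping[str, str]) -> None:
        self._inner = inner
        self._memberships = memberships  # user id -> membership number

    def authenticate(self, credentials: Mapping[str, str]) -> Optional[str]:
        user = self._inner.authenticate(credentials)
        if user is None:
            return None
        expected = self._memberships.get(user)
        if expected is not None and credentials.get("membership_number") == expected:
            return user
        return None


def login(auth: Authenticator, credentials: Mapping[str, str]) -> str:
    """The calling code never needs to know which fields were checked."""
    user = auth.authenticate(credentials)
    return f"welcome {user}" if user else "access denied"


basic = PasswordAuthenticator({"alice": "s3cret"})
strict = MembershipAuthenticator(basic, {"alice": "M-1001"})

print(login(basic, {"username": "alice", "password": "s3cret"}))
print(login(strict, {"username": "alice", "password": "s3cret"}))
```

The same shape would apply to the database and output assumptions: the core talks to an interface, and the concrete choice is swapped in behind it.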

Many of these system attributes can only be properly verified as the system is being used. And there’s no better way to verify a system can be used than to actually build something with it. So it makes sense to build the entire front-end system, including a default theme, using the API. This has proved to be a successful approach for several companies, and ensures a level of internal testing beyond the usual unit and functional tests.

Bearing in mind these principles we’re ready to start making actual technical decisions. That will be the subject of the next post in this series.

CMS architecture: Part 1

I’ve been doing a lot of thinking about the architecture of content management systems (CMS) recently. Little wonder, that’s my full-time focus at the moment. By “architecture” I mean pretty much everything to do with the planning and development of a CMS. This blog post is the first in a series that explores some of the elements to think about if you’re going to create a CMS from scratch.

This is unashamedly going to be at an advanced level – I’m not talking about a simple system just to keep a few pages updated. I’ll try to keep as technology-agnostic as possible, but I will be coding at least part of this system to ensure what I say is technically feasible.

The areas I’m going to tackle, in no particular order, and almost certainly incomplete, are:

  • Data storage

    Any serious CMS needs a database, but is a relational database (MySQL, PostgreSQL, SQL Server) a better choice than a NoSQL database? What about extensibility, making complex queries possible for reporting purposes, performance, versioning? How about scalability and data security?

  • System security

    Unless you want everyone to be able to do everything you need to be able to secure aspects of the system. So you need user accounts with authentication mechanisms. Securing individual parts of the system (particular modules, or specific related data) needs to be possible, and what about SSL? There’s also the question of authenticating 3rd party systems, for example users of APIs.

  • Extensibility

    WordPress, which I love, has a fantastic API which enables developers to write plugins for almost every conceivable use. Plugins are cool, and the hooks and filters that power them are a must. But what about cutting a little deeper than that: allowing entire subsystems and modules to be swapped out? What about an API?

  • Output

    Obviously a CMS will have some form of HTML output. But how do you architect the system to hit the sweet spot between allowing front-end developers a large degree of control over the HTML and letting the system produce what it needs to run? How about themes and templates? Repurposing content is going to become increasingly important, so how do you handle microformats and data schemas? What about alternative outputs: PDF, XML, JSON etc? Then there’s the tiny matter of internationalisation.

  • Assets

    Assets are a big part of any CMS. Storing files securely is just one aspect of this, but how do you handle versioning and repurposing of assets (PDFs also available in Word and ODF, for example)? And with images getting more complex with high-DPI displays, how do you handle resizing imagery?

  • Performance and scalability

    Caching is key, but what do you do when you grow from 1 server to 10, to 100, to 1000?
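
To give a flavour of the hooks and filters mentioned under Extensibility, here is a minimal filter registry sketched in Python. The method names `add_filter` and `apply_filters` are borrowed from WordPress's PHP functions of the same name, but the `Hooks` class itself, the `render_title` hook and the priority handling are my own assumptions for illustration, not how WordPress implements them.

```python
from collections import defaultdict
from typing import Any, Callable

Filter = Callable[[Any], Any]


class Hooks:
    """A tiny filter registry (hypothetical, for illustration)."""

    def __init__(self) -> None:
        # hook name -> list of (priority, callback), kept sorted by priority
        self._filters: dict[str, list[tuple[int, Filter]]] = defaultdict(list)

    def add_filter(self, name: str, callback: Filter, priority: int = 10) -> None:
        self._filters[name].append((priority, callback))
        # list.sort is stable, so equal priorities keep registration order.
        self._filters[name].sort(key=lambda entry: entry[0])

    def apply_filters(self, name: str, value: Any) -> Any:
        """Pass `value` through every callback registered for `name`."""
        for _, callback in self._filters.get(name, []):
            value = callback(value)
        return value


hooks = Hooks()

# Core and plugins register callbacks without knowing about each other.
hooks.add_filter("render_title", lambda t: t + " | My Site", priority=30)
hooks.add_filter("render_title", str.strip)
hooks.add_filter("render_title", str.title, priority=20)

title = hooks.apply_filters("render_title", "  cms architecture ")
print(title)  # Cms Architecture | My Site
```

A hook with no registered callbacks simply returns the value unchanged, which is what lets the core call `apply_filters` everywhere without caring whether any plugin is listening.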

I don’t pretend to have all the answers to this stuff, it’s just an area that interests me and I want to explore. If I end up with an experimental CMS at the end of this that handles a few of these thorny issues then I’ll be a slightly better developer than I am now. And even if I don’t I’ll still have done some serious thinking about these issues.

Blog highlights

I enjoyed myself with this trip through my blogging history, but I guess something you’d like to see is some highlights of what I’ve written about. Here are the greatest hits (in my opinion, of course).

*Sniff*. Good times.

Protecting your bits

My car is poorly. Yesterday there was a “big metallic bang”, according to my wife, and then it started “clanking”. Gotta love these technical people! The guys at Kwik Fit soon diagnosed the problem: the front passenger side coil spring had snapped. Great, more expense. And all due to the state of the roads. Thanks, local council.

But one thing the Kwik Fit bloke said interested me. Looking at the broken spring he commented how it was good the car manufacturer had started putting a plate at the bottom of the spring, as in days gone past springs would snap and go through the tyre. That would, obviously, have caused a serious accident. But the spring fortunately broke in a safe way, and I’ve got a reasonably drivable car.

When writing software we have to do the same thing. We code for the possibility that bits will break, to protect other bits and the application as a whole. There are a number of ways we do this, here’s a quick list of the ones I can think of.

  • Ensure that if we’re going to use a variable, it is set
  • Check the type of variables: if a variable must be an int then make sure it is an int
  • Checking whether we need to do an operation at all, for example not looping a collection if there’s nothing to loop
  • Checking a collection’s length before trying to get an item with a non-existent index
  • Catching exceptions
  • Providing meaningful error messages
  • Persisting form information so users can try again if their submission doesn’t work
  • Checking variables are within the required range, for example validating a birth date

And there are probably loads more, including ensuring that the UI looks and functions reasonably, even if the user doesn’t have the latest, greatest browser.
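
A few of the checks from the list above can be sketched in Python like this. The function names, the `birth_date` form field and the 130-year limit are all assumptions I've picked for the example rather than anything from a real system.

```python
from datetime import date
from typing import Any, Optional


def parse_birth_date(form: dict, today: Optional[date] = None) -> date:
    """Validate a submitted birth date, raising ValueError with a readable message."""
    today = today or date.today()
    raw = form.get("birth_date")                 # the value may not be set at all
    if not isinstance(raw, str) or not raw.strip():
        raise ValueError("Please enter your date of birth.")
    try:
        born = date.fromisoformat(raw.strip())   # raises ValueError on bad input
    except ValueError:
        raise ValueError("Date of birth must look like 1980-12-31.") from None
    # Range check: not in the future, and not implausibly long ago.
    if born > today or born.year < today.year - 130:
        raise ValueError("Date of birth must be within the last 130 years.")
    return born


def item_at(items: list, index: int, default: Any = None) -> Any:
    """Return items[index], or `default` when that index does not exist."""
    if not isinstance(index, int):
        raise TypeError("index must be an int")
    if 0 <= index < len(items):
        return items[index]
    return default


def handle_submission(form: dict, today: Optional[date] = None) -> dict:
    """Catch the failure and hand the original values back so the user can retry."""
    try:
        born = parse_birth_date(form, today)
    except ValueError as error:
        return {"ok": False, "error": str(error), "form": dict(form)}
    return {"ok": True, "birth_date": born}


result = handle_submission({"birth_date": "31/12/1980", "name": "Sam"})
print(result["error"])   # Date of birth must look like 1980-12-31.
print(result["form"])    # the submitted values survive for the retry
```

None of this is clever, which is rather the point: each check is a small plate under the spring.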

With all of these things we’re aiming to ensure that if something breaks – and it will, let me assure you – it doesn’t cause an accident. Car manufacturers have figured this out, and rightly so as they have a responsibility towards the safety of road users. I don’t want to think of how many tyres were blown before they added those safety plates.

Here’s an old, but true, saying: an ounce of prevention is worth a pound of cure.

I’ve been certified

It’s been a good week, for several reasons. One of them is that I passed a Microsoft exam: Developing web apps in .NET 4. That means I am now (drum roll) a Microsoft Certified Technology Specialist. I’ve been building websites for 14 years, give or take, so it’s about time I had some kind of qualification in it.

This is, to put it bluntly, an unexpected turn of events. For much of my career I’ve treated the Microsoft-based things I do in my day job as “just” work, and my real passion has been the Open Source web development I’ve done in my own time. When those two worlds collide, which they do occasionally, it always feels a bit strange. The most obvious example of my day job and personal work coming together is my Performer JavaScript library, which I use on every project I can.

So why the dismissal of Microsoft web technologies for so long? I guess it’s because, for much of the lifetime of ASP.NET, they just haven’t been very good. ASP.NET fundamentally ignored the way the web was designed to work for a long time (I’m talking ViewState and WebForms) but I have to admit it also introduced many fantastic features as well. Templated controls and MasterPages are just brilliant. For this and more of the good stuff refer to my articles on Nettuts.

But now, well, things have changed. ASP.NET MVC is a cracking bit of kit, and Microsoft’s embrace of Open Source – particularly jQuery – has changed many opinions about the Redmond behemoth. Plus there’s the fact that finally we’re beginning to see versions of Internet Explorer that don’t suck. Although much of the industry press is still about open technologies, like WordPress, JavaScript and RoR, it no longer feels faintly embarrassing to me to be a Microsoft developer. Yes, things have certainly changed.

So here’s to the future. A future where all companies, technologies and platforms can work towards making the web more open, accessible and useful in people’s lives.