Tagging, the right way…

It’s late, and I’m engrossed in the epic 2006 World Snooker Championship final just down the motorway from me in Sheffield. However, this article from Uncommon Sense for Software caught my eye, as tagging is something I’ve used in the new Wiblog system.

I realised that people make typos, so I have implemented a function that cleans up tags and checks for duplicates whenever tags are saved for a post. Multiple tags can be entered as a comma-separated list, and the function does the following:

1. Split the tags up by commas
2. Trim the whitespace (spaces, line breaks, tabs etc) from both ends of each tag
3. Remove multiple adjacent spaces from tags with multiple words – so my[space][space]tag will become my[space]tag
4. Remove bad characters such as ' and "
5. Check that there are no duplicates of tags (so entering a single tag twice will only link it to a blog entry once)
6. Save each tag into the database
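
The steps above could be sketched roughly like this in Python. This is not the Wiblog code, just a minimal illustration: the name clean_tags and the BAD_CHARS list are made up for the example, the curly quotes are included as a guess, empty entries are dropped as an assumption, and step 6 (the database save) is left to the caller.

```python
import re

# Assumption: the post only names ' and " as bad characters;
# the curly variants are added here purely for illustration.
BAD_CHARS = "'\"\u2018\u2019\u201c\u201d"
STRIP_BAD = str.maketrans("", "", BAD_CHARS)


def clean_tags(raw):
    """Return a cleaned, de-duplicated list of tags from a comma-separated string."""
    cleaned = []
    seen = set()
    for tag in raw.split(","):            # 1. split on commas
        tag = tag.strip()                 # 2. trim whitespace from both ends
        tag = re.sub(r" {2,}", " ", tag)  # 3. collapse runs of spaces into one
        tag = tag.translate(STRIP_BAD)    # 4. remove bad characters
        if not tag:                       # assumption: ignore empty entries
            continue
        if tag in seen:                   # 5. skip exact duplicates
            continue
        seen.add(tag)
        cleaned.append(tag)
    return cleaned                        # 6. the caller saves each tag


print(clean_tags('snooker,  my  tag ,snooker, "quoted",,'))
```

The duplicate check here is an exact match, so "Snooker" and "snooker" would count as two different tags. Whether that is the right behaviour depends on the blog system.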

To make things a little easier later on, I save not just the proper tag words but a link-safe version as well. So for the tag “All about me” I would save the link tag “all-about-me”. That allows me to use the tag in links in the blog system without worrying about letting bad characters in.
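
One way to build a link-safe version like that, sketched in Python. The function name link_tag is hypothetical, and the rule used (lower-case everything, then replace each run of characters other than a-z and 0-9 with a single hyphen) is an assumption; the only thing taken from the example above is that “All about me” should come out as “all-about-me”.

```python
import re


def link_tag(tag):
    """Return a lower-case, hyphen-separated version of a tag for use in links."""
    # Lower-case first, then turn every run of characters that are not
    # a-z or 0-9 into one hyphen (assumption: ASCII-only link tags).
    slug = re.sub(r"[^a-z0-9]+", "-", tag.lower())
    # Drop any hyphens left hanging at the start or end.
    return slug.strip("-")


print(link_tag("All about me"))
```

Saving this alongside the proper tag words means the conversion only happens once, at save time, rather than every time a link is built.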

Anyway, hopefully that will help someone. I’m off to sit on the edge of my seat and chew my fingernails wondering who will win the snooker.