Web development – Page 4 – Still Breathing

Namespaced SQL Columns in Queries

Posted on November 23, 2013 by chris

One of the things I really like about the C# language is the wide range of database ORM and data mapper packages available. In a nutshell, an ORM can help remove the need to write SQL queries and stored procedures in your database. Instead you tell it what the “model” of your data is (i.e. the schema of the tables and how they relate to each other) and the ORM constructs SQL queries to fetch, update and add data. Data mappers do much the same thing.

A simple example would probably be helpful. This is pseudo-code; it’s not from a real ORM:

// create the connection to the database which we have modelled
MyDatabaseModel myDb = new MyDatabaseModel("localhost", "username", "password");

// get all records from the MyTableType table and cast them to MyTableType objects
IEnumerable<MyTableType> allTheThings = myDb.MyTableType.Get();

// create a new MyTableType object
MyTableType newThing = new MyTableType {
	Column1 = "Hello",
	Column2 = "World"
};

// insert it into the database
myDb.MyTableType.Insert(newThing);

You can imagine this saves a LOT of time and effort, particularly for applications with a simple database schema. I’ve used several ORMs including Entity Framework (big, powerful, complex) and Linq2SQL (small, quick, easy, only works with SQL Server). I do want to try some others, in particular Rob Conery’s massive.

In PHP there are ORM packages available, too, although I admit I’ve not used any of them. Sometimes, however, you need to inject a bit of cleverness into your database access code to help speed up development, but don’t have time (or inclination) to rewrite the system around an ORM.

Repetitive SQL

This was the situation I found myself in the other night. I had a medium-sized application with lots of queries, and some tables that were being accessed a lot. I found I had very repetitive SQL strings like this:

select t1.col1, t1.col2, t2.col1, t2.col2 from table1 t1 inner join table2 t2 on t2.parentid = t1.id

Fortunately I’d put all the queries into two sets of files:

The classes for my types, each one of which had its own Get(), Save() and Delete() methods
Manager classes which handle loading lists of objects, or loading sets of related objects

The problem was I had used the same column names across many tables. This code:

select t1.col1, t1.col2, t2.col1, t2.col2

Gives a set of results that looks like this:

col1	col2	col1	col2
Table 1, Row 1, Col 1	Table 1, Row 1, Col 2	Table 2, Row 1, Col 1	Table 2, Row 1, Col 2

Which “col1” column belongs to table1, and which to table2? I was in a pickle, and no mistake. What I needed was a way to namespace the columns I returned.

Namespacing

Namespacing is simply a way to group related functionality together in a system. You can imagine a folder structure on a hard drive being like many namespaces:

/My Documents/
/My Documents/Photos/
/My Documents/Photos/Holidays/
/My Documents/Music/
/My Documents/Music/Country-AND-Western/

You know where all the music of both types (country AND western) is because the folder structure clearly shows you the way.

In C# the different levels in a namespace are separated by a dot, for example:

MyApp.UI.Helpers.DateHelpers

I thought I could use this same naming convention for my queries. Also the SQL language itself uses dots to denote the boundary between a table and its columns:

select table1.* from table1 where table1.id = 123

After a bit of experimentation I found that as long as the column names were put in quotes I could return pretty much anything I wanted. For example:

select t1.id as 'this.is.my.test' from table1 t1

So I realised I could namespace all the columns by the table they belong to, like this:

select t1.col1 as 'table1.col1', t1.col2 as 'table1.col2', t2.col1 as 'table2.col1', t2.col2 as 'table2.col2' from table1 t1 inner join table2 t2 on t2.parentid = t1.id

Now the result set I get back looks like this:

table1.col1	table1.col2	table2.col1	table2.col2
Table 1, Row 1, Col 1	Table 1, Row 1, Col 2	Table 2, Row 1, Col 1	Table 2, Row 1, Col 2

Much better! I can query the columns for each table safely.
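Writing those aliases out by hand gets tedious quickly, so here is a minimal sketch of how the column list could be generated. The function name namespaced_columns() is hypothetical (it is not part of the plugin), and it assumes the column names come from your own code rather than from user input, because they are concatenated straight into the SQL string.

```php
<?php
// Hypothetical helper (not from the original plugin): builds a namespaced
// column list such as "t1.col1 as 'table1.col1', t1.col2 as 'table1.col2'".
// Assumes $alias, $namespace and $columns are trusted values from your own code.
function namespaced_columns( $alias, $namespace, array $columns ) {
    $parts = array();
    foreach ( $columns as $column ) {
        // quote the alias so the dot is treated as part of the column name
        $parts[] = "{$alias}.{$column} as '{$namespace}.{$column}'";
    }
    return implode( ", ", $parts );
}

// rebuild the joined query from the example above
$sql = "select "
    . namespaced_columns( "t1", "table1", array( "col1", "col2" ) ) . ", "
    . namespaced_columns( "t2", "table2", array( "col1", "col2" ) )
    . " from table1 t1 inner join table2 t2 on t2.parentid = t1.id";

print $sql . "\n";
```

This prints the same query as the hand-written version, so the two approaches should be interchangeable.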

Making namespaced column queries easy to use

Now I’d proved the concept I did some work to make it easier to use these namespaced columns. Firstly I added a new method to each of my classes which returned a string which I could use in queries.

As this application is a plugin for WordPress I am using the global $wpdb object to do my database access. Here is PHP for a small, simple class:

class MyType {

    // the properties of this class
    public $id;
    public $name;

    // returns a string for all the columns in this class
    // (static, so it can be called as MyType::ColumnQuery() when building queries)
    static function ColumnQuery( $prefix ) {
        $s = "%1.id as '%2.id', %1.name as '%2.name'";
        $s = str_replace( "%1", $prefix, $s );
        $s = str_replace( "%2", "MyType", $s );
        return $s;
    }

    // loads the row with the given ID
    function Load( $id ) {
        global $wpdb;
        $sql = $wpdb->prepare( "select
            " . MyType::ColumnQuery( "t1" ) . "
            from {$wpdb->prefix}MyType t1
            where t1.id = %d;",
            $id );
        $row = $wpdb->get_row( $sql, ARRAY_A );
        $this->MapRow( $row );
        return $this;
    }

    // maps a database row to the current object
    function MapRow( $row ) {
        $this->id = $row["MyType.id"];
        $this->name = $row["MyType.name"];
    }

}

// create a new instance of a MyType object
$myType = new MyType();

// populate the details of the $myType variable with the details from row 123
$myType->Load( 123 );

One really important thing to note is that I use the ARRAY_A parameter in the $wpdb->get_row() method. This means that the columns are returned as an associative array, which is the only way I could access the namespaced columns because of the dots in them.

Something has just been dropped in my brain…

Actually, and this has only just struck me, I could probably use a different character than a dot and then use $wpdb’s default object output, where each column is a property on the row. Here’s some quick pseudo-code using underscores instead of dots:

// returns a string for all the columns in this class
function ColumnQuery( $prefix ) {
    $s = "%1.id as '%2_id', %1.name as '%2_name'";
    $s = str_replace( "%1", $prefix, $s );
    $s = str_replace( "%2", "MyType", $s );
    return $s;
}

// maps a database row to the current object
function MapRow( $row ) {
    $this->id = $row->MyType_id;
    $this->name = $row->MyType_name;
}

Hmm, I might have to go back and do that as it’s a bit cleaner…
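To make the difference between the two alias styles concrete, here is a small sketch with hard-coded rows standing in for what $wpdb->get_row() would return (the values are made up). It also shows that PHP can read a dotted property from an object using curly-brace syntax, so ARRAY_A is probably not strictly the only route, just the more readable one.

```php
<?php
// Rows are hard-coded here so the sketch runs without a database;
// in the plugin they would come from $wpdb->get_row().

// dotted aliases, returned as an associative array (ARRAY_A)
$dotted = array( "MyType.id" => 123, "MyType.name" => "Hello" );
print $dotted["MyType.id"] . "\n";

// the same row as an object: a dotted property needs the curly-brace syntax,
// because $row->MyType.id would be parsed as $row->MyType concatenated with
// a constant called id
$dottedObject = (object) $dotted;
print $dottedObject->{"MyType.id"} . "\n";

// underscore aliases work with ordinary property access
$underscored = (object) array( "MyType_id" => 123, "MyType_name" => "Hello" );
print $underscored->MyType_id . "\n";
```

All three lines print 123. The underscore version reads the most naturally, at the cost of being ambiguous if a table or column name already contains underscores.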

Table variables

The $prefix parameter in the ColumnQuery method means I can set what the variable for the table is, for example:

$sql = "select " . MyType::ColumnQuery( "t1" ) . " from {$wpdb->prefix}MyType t1";

In this case the table variable is “t1”. The resulting SQL string will be:

select t1.id as 'MyType.id', t1.name as 'MyType.name' from MyType t1

Related tables

That’s great for single tables, but the real power comes when you use this structure for multiple related tables.

Imagine we have two related tables like this:

Child
Parent

These tables are related using a foreign key: a parent_id column in the Child table which stores the id of the Parent.

Whenever we load a Child object we also want to know the details of the parent. This is the kind of thing that an ORM system eats for breakfast. Using our new namespaced column query it’s also pretty easy. Here are the classes for the Child and Parent type:

class Child {

    // the properties of this class
    public $id;
    public $name;
    public $parent_id;
    public $parent;

    // returns a string for all the columns in this class
    static function ColumnQuery( $prefix ) {
        $s = "%1.id as '%2.id', %1.name as '%2.name', %1.parent_id as '%2.parent_id'";
        $s = str_replace( "%1", $prefix, $s );
        $s = str_replace( "%2", "Child", $s );
        return $s;
    }

    // maps a database row to the current object
    function MapRow( $row ) {
        $this->id = $row["Child.id"];
        $this->name = $row["Child.name"];
        $this->parent_id = $row["Child.parent_id"];

        // if we have columns from the parent table then populate the parent object
        if ( isset( $row["Parent.id"] ) && $row["Parent.id"] != "" ) {
            $this->parent = new ParentType();
            $this->parent->MapRow( $row );
        }
    }

}

// the class for the Parent table; it is named ParentType because "parent"
// is a reserved word in PHP and cannot be used as a class name
class ParentType {

    // the properties of this class
    public $id;
    public $name;

    // returns a string for all the columns in this class
    static function ColumnQuery( $prefix ) {
        $s = "%1.id as '%2.id', %1.name as '%2.name'";
        $s = str_replace( "%1", $prefix, $s );
        $s = str_replace( "%2", "Parent", $s );
        return $s;
    }

    // maps a database row to the current object
    function MapRow( $row ) {
        $this->id = $row["Parent.id"];
        $this->name = $row["Parent.name"];
    }

}

This is the magic bit:

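// if we have columns from the parent table then populate the parent object
// (ParentType is the class for the Parent table; "parent" is a reserved word in PHP)
if ( isset( $row["Parent.id"] ) && $row["Parent.id"] != "" ) {
    $this->parent = new ParentType();
    $this->parent->MapRow( $row );
}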

If the system detects that we have columns from the parent table then we can fully populate the parent property. Here’s the Load() method for the Child class that would load and populate a Child instance with the details of its Parent:

class Child {

    // loads the child with the given ID, along with its parent
    // (ParentType is the class for the Parent table; "parent" is a reserved word in PHP)
    function Load( $id ) {
        global $wpdb;
        $sql = $wpdb->prepare( "select
            " . Child::ColumnQuery( "c" ) . ",
            " . ParentType::ColumnQuery( "p" ) . "
            from {$wpdb->prefix}Child c
            inner join {$wpdb->prefix}Parent p on p.id = c.parent_id
            where c.id = %d;",
            $id );
        $row = $wpdb->get_row( $sql, ARRAY_A );
        $this->MapRow( $row );
        return $this;
    }

}

So we would load a child object like this:

$child = new Child();
$child->Load( 123 );

And we can access details of the child and its parent like this:

print "ID: " . $child->id . " ";
print "Name: " . $child->name . " ";
print "Parent Name: " . $child->parent->name;

The same approach can be used when loading multiple rows. For example here’s some SQL to load all children:

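// there are no placeholders here, so the query does not need to go through $wpdb->prepare()
// (ParentType is the class for the Parent table; "parent" is a reserved word in PHP)
$sql = "select " . Child::ColumnQuery( "c" ) . ", " . ParentType::ColumnQuery( "p" ) . " from {$wpdb->prefix}Child c inner join {$wpdb->prefix}Parent p on p.id = c.parent_id;";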
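Once that query has run, each returned row can be mapped to its own object. The following is a rough sketch rather than the plugin's actual code: the rows are hard-coded (with made-up names) to stand in for the result of $wpdb->get_results( $sql, ARRAY_A ), the classes are cut down to just their MapRow() methods, and ParentType is an assumed name for the Parent class because "parent" is a reserved word in PHP.

```php
<?php
// Cut-down versions of the classes, with only what the mapping needs.
// ParentType is the class for the Parent table ("parent" is a reserved
// word in PHP, so a class cannot actually be named Parent).
class ParentType {
    public $id;
    public $name;

    function MapRow( $row ) {
        $this->id = $row["Parent.id"];
        $this->name = $row["Parent.name"];
    }
}

class Child {
    public $id;
    public $name;
    public $parent_id;
    public $parent;

    function MapRow( $row ) {
        $this->id = $row["Child.id"];
        $this->name = $row["Child.name"];
        $this->parent_id = $row["Child.parent_id"];
        // only build the parent if the Parent columns are in the row
        if ( isset( $row["Parent.id"] ) && $row["Parent.id"] != "" ) {
            $this->parent = new ParentType();
            $this->parent->MapRow( $row );
        }
    }
}

// hard-coded stand-in for $wpdb->get_results( $sql, ARRAY_A )
$rows = array(
    array( "Child.id" => 1, "Child.name" => "Alice", "Child.parent_id" => 10,
           "Parent.id" => 10, "Parent.name" => "Carol" ),
    array( "Child.id" => 2, "Child.name" => "Bob", "Child.parent_id" => 10,
           "Parent.id" => 10, "Parent.name" => "Carol" ),
);

// map each row to its own Child instance
$children = array();
foreach ( $rows as $row ) {
    $child = new Child();
    $child->MapRow( $row );
    $children[] = $child;
}

foreach ( $children as $child ) {
    print $child->name . " (parent: " . $child->parent->name . ")\n";
}
```

Note that in this sketch every child gets its own parent object, even when two children share the same parent row. That is fine for display purposes, but a cache keyed on the parent id would be needed if the objects had to be shared.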

Wrapping up

This technique helped me out of a little hole this week. Hopefully it can help you as well.

Responsified and updateified

Posted on October 25, 2013 by chris

That old proverb about the cobbler’s children going barefoot is true. For too long this site has been neglected and unloved. Well, no more! I’ve realigned (not redesigned) this site to take advantage of modern web development techniques. Of course, I’m talking about responsive design.

It’s not a radical departure from what was there before, and taking advantage of the excellent Bones starter theme meant the whole thing was done in a few hours. And even though I haven’t paid too much attention to performance I’ve still ended up with a respectable payload for the homepage.

And even YSlow gives me a B grade.

I’ve also updated to WordPress 3.7, named in honour of the great Count Basie. The update was, it has to be said, not as slick as usual. I had “out of memory” problems which weren’t that hard to fix, but did cause me some stress.

WordPress 3.7 is the first version with auto-updating built in, so hopefully my installation will keep itself up to date and in tip-top condition while I’m asleep.

Bad, good and great programmers

Posted on August 19, 2013 by chris

A recent tweet by Cory House started my brain a-twitching:

One of my new favourite interview questions: What is the difference between a bad, good and great programmer? Which are you and why?

I’ve worked with all three kinds, and thinking about the question of what the differences are I realised (not for the first time) that pure coding skill is nowhere near the whole story. This blog post explores what else may put you in one category or another. First some assumptions:

As this is an interview question I took “programmer” to mean “one who works in a software development team”
The basics of being a decent employee – turning up on time, dressed appropriately and at least partially sober – are a given
The programmer can actually code to a decent level, “decent” being whatever is appropriate to fulfil the job requirements
The employer doesn’t just want a code monkey. I hope and trust that there are very few companies who want programmers to just “shut up and write the code”, and if you find yourself working for one of those then get out as soon as you can.
I’m talking to traditional programmers (C#, PHP, Python, Ruby, JavaScript etc) rather than front-end (HTML, CSS) but many of the points are applicable for any type of developer

Bad programmers

Generally, bad programmers are easy to spot. They write code that generally works (if they didn’t they’d be out of a job), but suffer from many or all of these issues:

They have a bad attitude towards non-geeks, whether non-technical colleagues or users
They have a bad attitude towards other programmers whose views don’t closely match their own
They know it all, or at least everything that matters
They are opinionated about tech; either wanting to use cutting edge stuff because it’s cool (whether or not it gets the job done) or they only trust very established things and refuse to look at new tech
Their communication skills are poor, and they have little desire to improve
They may fulfil specification requirements but don’t go any further than coding for exactly what is written
Their sense of humour is non-existent, or is harsh and rude
They don’t ask questions, and don’t make suggestions

It’s important to take all these points on balance. Some people may be shy, which could mean they don’t ask questions or make suggestions. But a good manager or lead developer will help staff to fulfil their potential, for example by encouraging the quieter members of the team to contribute.

The code from a bad programmer may also have these hallmarks:

Lack of comments, or comments that state the obvious and don’t tell you *why* they coded something a particular way
Obtuse, gnarly, over-abstracted, over-complicated architecture (at any level: method, class, subsystem or app)
Inconsistent or otherwise poorly-considered naming of variables, methods and classes
Lack of attention to detail, such as proper sanitisation of inputs for security
Copy-and-paste-itis. Bonus points if it’s clear they didn’t understand how the copied code works.

You get the picture. We’ve all written code like this at some point and we should be careful not to think of a programmer with less experience, or one who is struggling with a complex task, as a bad programmer. Bad programmers have a bad attitude, and even if they can mask that attitude for a while, eventually the mask will slip.

Good programmers

Good programmers, like bad, are known by their attitude as well as their technical skill. Note that a good programmer may not have as powerful code-fu as a bad programmer, but their attitude and “soft” skills more than make up for any technical shortcoming. (This is not to say that someone who doesn’t know XML from XTC can be classed as a good programmer; the basics mentioned above still apply.)

You’ll know a good programmer because:

They have a good, helpful and professional attitude towards colleagues and customers/users
They have respect for other programmers and are willing to listen to other points of view and, when necessary, respectfully disagree
They have a good sense of humour and contribute to a good atmosphere in the team
They exhibit attention to detail, and can spot the holes and inconsistencies in a specification
They are respected and liked by other members of the team
They don’t make rash decisions and have learned to weigh the pros and cons of different technologies
They can put themselves into the users’ shoes and make good decisions about how their code affects the user interface and experience
They have good communication skills, and avoid jargon wherever possible when communicating with non-technical people

These kinds of people make a team better, and so help an organisation to be better. But it’s not just soft skills; they also have a good grasp of technology:

Their code is readable and well-formatted, adhering to any coding standards adopted by the team
They choose sensible, descriptive and unambiguous names for classes, methods, properties etc
Their comments are actually useful: not just telling you what a method does, but why it is used – and any non-obvious things you may need to watch out for
Their architecture choices (at method, class, subsystem or any other app level) are clear, well-abstracted and only as complicated as they need to be. My rule of thumb for “good” architecture: either you would have written it the same way, or you see why they did it their way and don’t need to change it much.

No doubt anyone with a bit of experience working in a development team can think of several more hallmarks of good and bad programmers. But what about great programmers?

Great programmers

While many people would roughly agree what makes a programmer good or bad, deciding what criteria to use to distinguish the great ones is much more subjective. Here are a few points I can think of that, in addition to everything I noted for a good programmer, would class a programmer as great:

They are aware not just of the users’ needs and the specification, but the organisational environment. They can guide the development of an application so that it serves wider organisational purposes: in the case of a business to serve users and be commercially successful.
Their communication and “people” skills are highly developed; for example they can explain complex technical concepts to non-geeks in an understandable, even entertaining, way
They are known in the team as the person who will help you out with a problem: partly because they have the experience and knowledge, but also because they make themselves available to help others
They keep themselves up to date with technology – not just their own area of expertise but the wider world of development, UI, UX, project management and business
Their thinking in terms of abstracting code takes into consideration not just the next 3 months but the next 3 years (that *almost* means they can tell the future)

In short a great programmer has an expansive view of technology, of people, of the future of the organisation and applications they work on. They are part psychologist, part entrepreneur, part designer, part storyteller.

A couple of years or so of experience, and taking pride in what you do, will take you from being a bad programmer to being a good one. But to be a great programmer takes a special kind of person. The kind that, eventually, will do something really special. Like write a great book, start a great company, write an amazing Open Source project.

There’s lots I haven’t mentioned here (such as writing testable code). For more on this subject check out these links: some are articles and essays, some are great programmers.

CMS Architecture: Before you begin

Posted on May 30, 2013 by chris

It’s important to plan any large-scale software development. Fail to plan, plan to fail as the old saying goes. So before a single line of code is written, before even any architecture or technology decisions have been made, I’m going to define what I want my CMS to be.

Firstly, and most importantly, the CMS needs to be extensible. In the same way that CRM (customer relationship management) systems became XRM (entity relationship) systems, this CMS needs to be an XMS. This will mean that the system can be used as much more than just a system for managing pages, and instead can be used to manage any kind of data.

Content is generally thought of as pages and files, but just below the surface the vast array of types of content is easy to find:

– Property websites have pages which represent houses
– Band websites have albums, which have songs
– E-commerce systems have products, which have alternative versions (like sizes) and related products
– Magazine sites have articles, which may be several on one page or longer articles spread across multiple pages

Forcing users to manage and – more crucially – conceptualise their content as just “pages” will lead to confusion and badly maintained data.

WordPress handles these different types of data through Custom Post Types, and does it very well. All post types share the same basic properties, and can be extended by adding further metadata. Relationships can be defined between posts using the taxonomy system.

This capability is crucial to the power afforded by a CMS and needs to be baked in from the very beginning.

Secondly the CMS needs to be easy to work with by developers (ease of use from a front-end user perspective goes without saying). There are three traits of well-designed systems that make them easy to use by developers.

1) Consistent

If the system is not consistent in its approach every line of code has the potential to trip the developer up. Consistency is not just about ensuring class and method names follow a pattern, but about the ethos behind the architecture of the system. If the user has to instantiate a helper class in module X, but in module Y there is a static manager class then consistency is broken.

Also having two – or even more – ways to achieve the same thing without clear direction on which one to choose puts doubt in the developer’s mind.

2) Discoverable

Following on from the system being consistent is making it discoverable. It may not be possible to create a system where developers never need to check the documentation, but aiming to get as close to that as possible is a worthy goal.

This can be achieved by applying standard patterns that developers will recognise, and by carefully thinking through each architecture decision to ensure it is as intuitive as possible to use the system.

3) Sensible

If just one of these traits is followed, make it this one; make the system sensible. Have sensible defaults, sensible naming conventions, sensible use of existing patterns, sensible architecture decisions. The moment the system forces developers to do something non-intuitive, a little bit crazy, then a battle is lost.

Better to spend a little time now ensuring the most sensible decision is taken than spend a lot of time later reversing a crazy decision.

Thirdly the system must make as few assumptions as possible. Setting hard limits, such as the number of properties an entity may have, is a recipe for disaster – even if you think those limits are really high. Assuming that users will always need to be authenticated by a username and password will fail the moment an organisation needs users to enter their membership number as well.

Assuming that particular HTML output will always be used is also a sure-fire way to ensure the system will get outdated quickly. And assuming that your users will never want – or need – to use a different database system will also limit future development. Designing a system this flexible is hard work and requires a lot of thought, but when the alternative is painting yourself into a corner the choice is clear.
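As one illustration of avoiding a baked-in assumption, here is a rough PHP sketch of the authentication example above. Everything in it is hypothetical (the Authenticator interface, both classes, the user data and the membership number format); it is not a design decision for the CMS, just one way the core could avoid knowing which credential fields exist.

```php
<?php
// Hypothetical interface: the core only knows that it has "some credentials".
interface Authenticator {
    // $credentials is a free-form array, so the core never assumes which fields exist
    public function authenticate( array $credentials );
}

// the usual username and password check
class PasswordAuthenticator implements Authenticator {
    private $users;

    public function __construct( array $users ) {
        $this->users = $users;
    }

    public function authenticate( array $credentials ) {
        $username = isset( $credentials["username"] ) ? $credentials["username"] : "";
        $password = isset( $credentials["password"] ) ? $credentials["password"] : "";
        return isset( $this->users[$username] )
            && password_verify( $password, $this->users[$username]["hash"] );
    }
}

// wraps another authenticator and additionally requires a membership number
class MembershipAuthenticator implements Authenticator {
    private $inner;
    private $memberships;

    public function __construct( Authenticator $inner, array $memberships ) {
        $this->inner = $inner;
        $this->memberships = $memberships;
    }

    public function authenticate( array $credentials ) {
        if ( ! $this->inner->authenticate( $credentials ) ) {
            return false;
        }
        $username = $credentials["username"];
        $number = isset( $credentials["membership_number"] )
            ? (string) $credentials["membership_number"] : "";
        return isset( $this->memberships[$username] )
            && hash_equals( $this->memberships[$username], $number );
    }
}

// made-up data for the example
$users = array( "chris" => array( "hash" => password_hash( "secret", PASSWORD_DEFAULT ) ) );
$auth = new MembershipAuthenticator(
    new PasswordAuthenticator( $users ),
    array( "chris" => "M-1001" )
);

$ok = $auth->authenticate( array(
    "username" => "chris", "password" => "secret", "membership_number" => "M-1001",
) );
$missing = $auth->authenticate( array( "username" => "chris", "password" => "secret" ) );

var_dump( $ok, $missing );
```

The organisation that needs membership numbers wraps the default check instead of changing it, so the core login code stays the same either way.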

Many of these system attributes can only be properly verified as the system is being used. And there’s no better way to verify a system can be used than actually build something. So it makes sense to build the entire front-end system, including a default theme, using the API. This has proved to be a successful approach for several companies, and ensures a level of internal testing beyond the usual unit and functional tests.

Bearing in mind these principles we’re ready to start making actual technical decisions. That will be the subject of the next post in this series.

CMS architecture: Part 1

Posted on May 9, 2013 by chris

I’ve been doing a lot of thinking about the architecture of content management systems (CMS) recently. Little wonder, that’s my full-time focus at the moment. By “architecture” I mean pretty much everything to do with the planning and development of a CMS. This blog post is the first in a series that explores some of the elements to think about if you’re going to create a CMS from scratch.

This is unashamedly going to be at an advanced level – I’m not talking about a simple system just to keep a few pages updated. I’ll try to keep as technology-agnostic as possible, but I will be coding at least part of this system to ensure what I say is technically feasible.

The areas I’m going to tackle, in no particular order, and almost certainly incomplete, are:

Data storage

Any serious CMS needs a database, but is a relational database (MySQL, PostgreSQL, SQL Server) a better choice than a NoSQL database? What about extensibility, making complex queries possible for reporting purposes, performance, versioning? How about scalability and data security?
System security

Unless you want everyone to be able to do everything you need to be able to secure aspects of the system. So you need user accounts with authentication mechanisms. Securing individual parts of the systems (particular modules, or specific related data) needs to be possible, and what about SSL? There’s also the question of authenticating 3rd party systems, for example users of APIs.
Extensibility

WordPress, which I love, has a fantastic API which enables developers to write plugins for almost every conceivable use. Plugins are cool, and the hooks and filters that power them are a must. But what about cutting a little deeper than that; allowing entire subsystems and modules to be swapped out? What about an API?
Output

Obviously a CMS will have some form of HTML output. But how do you architect the system so that it hits the sweet spot between giving front-end developers a large degree of control over the HTML and letting the system produce what it needs to run? How about themes and templates? Repurposing content is going to become increasingly important, so how do you handle microformats and data schemas? What about alternative outputs: PDF, XML, JSON etc? Then there’s the tiny matter of internationalisation.
Assets

Assets are a big part of any CMS. Storing files securely is just one aspect of this, but how do you handle versioning and repurposing of assets (PDFs also available in Word and ODF, for example)? And with images getting more complex with high-DPI displays, how do you handle resizing imagery?
Performance and scalability

Caching is key, but what do you do when you grow from 1 server to 10, to 100, to 1000?

I don’t pretend to have all the answers to this stuff, it’s just an area that interests me and I want to explore. If I end up with an experimental CMS at the end of this that handles a few of these thorny issues then I’ll be a slightly better developer than I am now. And even if I don’t I’ll still have done some serious thinking about these issues.
