How do you decide when a product is ready? Accelerated-ageing testing with a massive amount of random data helps uncover hidden bugs.

Before I started ICanLocalize and WPML, I was a chip designer. Chips are the “brains” in electronic devices, such as computers, phones, TVs, and anything else you can think of.

To produce chips, dozens of engineers work for months or years, designing, coding and testing. Producing a batch of chips costs hundreds of thousands of dollars, so before the first production run, you test exhaustively to make sure the design is free of bugs.

In the last 6 weeks, we did the same for WPML. We reviewed every line of code, tested and fixed. Although WPML 2.0.4 works pretty well, we still uncovered hundreds of bugs, ranging from cosmetic PHP notices to critical functionality problems.

The last remaining test is probably the most important one – an accelerated ageing test, which should flush out the remaining “data dependent” problems. You know, the really nasty bugs which take weeks to surface and are the trickiest to nail.

Accelerated Ageing Testing for WordPress Plugins

The idea is, you can’t fix what you don’t know is broken. When you run hundreds of tests from a checklist, you do the same thing over and over. If there’s a bug you didn’t find in release 1.5, you’re not going to find it in release 1.6 or in release 10.1.

Random-data tests are different. In these tests, you bombard your design with a huge amount of random data, then sit back and see what happens.

Here is how it works:

The stimulus runs the show. It generates random data and drives it into the plugin. We use the WP API functions to create new content. WPML already hooks into these functions, so it adds the language information as the data is created.
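To illustrate the hook idea in isolation, here is a toy Python sketch (WordPress itself is PHP, so this is only an analogy). The `add_action` and `do_action` names are borrowed from WordPress, but the registry, the `post_saved` hook name and the `record_language` callback are hypothetical stand-ins, not WPML's code.

```python
class HookRegistry:
    """Toy hook system: callbacks registered under a name run when it fires."""

    def __init__(self):
        self._callbacks = {}

    def add_action(self, name, callback):
        self._callbacks.setdefault(name, []).append(callback)

    def do_action(self, name, *args):
        for callback in self._callbacks.get(name, []):
            callback(*args)


hooks = HookRegistry()
posts = {}       # stand-in for the posts table
languages = {}   # stand-in for the plugin's language table


def insert_post(post_id, title):
    """Stand-in for a content-creation API call; fires a hook after saving."""
    posts[post_id] = title
    hooks.do_action("post_saved", post_id)  # hypothetical hook name


def record_language(post_id):
    """Stand-in for the plugin: tags each new post with a language."""
    languages[post_id] = "en"


hooks.add_action("post_saved", record_language)

# The stimulus only calls insert_post; the language row appears as a side effect.
insert_post(1, "Hello")
```

Because the stimulus goes through the same creation call that real users trigger, the plugin's hooked code is exercised without any special test entry point.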

The test creates new categories, tags, posts and pages. The process is programmed, but the data is random. We use all sorts of characters, including regular English, Russian, Asian and encoded chars. Data length is also random, ranging from a single character to the maximum allowed for each field.
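A random-data generator along these lines could look like the following minimal sketch. The character pools, the field limits (200 and 5000) and the function names are assumptions chosen for illustration; the real test's values are not given here.

```python
import random

# Hypothetical character pools; each is a plain string of single characters.
CHAR_POOLS = {
    "english": "abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ",
    "russian": "абвгдежзийклмнопрстуфхцчшщъыьэюя",
    "asian": "日本語中文한국어テスト测试",
    "needs_escaping": "&<>%\"'\\",
}


def random_text(rng, max_length):
    """Return 1..max_length characters, each drawn from a randomly chosen pool."""
    length = rng.randint(1, max_length)
    pools = list(CHAR_POOLS.values())
    return "".join(rng.choice(rng.choice(pools)) for _ in range(length))


def random_post(rng, max_title=200, max_body=5000):
    """Build one random post; the limits are assumed, not WordPress's real ones."""
    return {
        "title": random_text(rng, max_title),
        "body": random_text(rng, max_body),
    }


# A fixed seed makes a failing run reproducible.
rng = random.Random(1234)
post = random_post(rng)
```

Seeding the generator is the one design choice worth keeping in any variant: random data is only useful for debugging if the same data can be produced again.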

Every bit of data is written twice: once to the plugin and once to the “expected data”.
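The double write can be sketched as follows. `FakeSite` is a hypothetical in-memory stand-in for the WordPress site under test, and `Stimulus` is an assumed name for the driver; neither comes from the actual test code.

```python
class FakeSite:
    """In-memory stand-in for the site under test (hypothetical)."""

    def __init__(self):
        self.posts = {}
        self._next_id = 1

    def insert_post(self, title, body, language, translation_of=None):
        post_id = self._next_id
        self._next_id += 1
        self.posts[post_id] = {
            "title": title,
            "body": body,
            "language": language,
            "translation_of": translation_of,
        }
        return post_id


class Stimulus:
    """Drives data into the site and keeps an independent expected copy."""

    def __init__(self, site):
        self.site = site
        self.expected = {}

    def create_post(self, title, body, language, translation_of=None):
        # First write: into the system under test.
        post_id = self.site.insert_post(title, body, language, translation_of)
        # Second write: into the expected data, built from the inputs,
        # never read back from the site.
        self.expected[post_id] = {
            "title": title,
            "body": body,
            "language": language,
            "translation_of": translation_of,
        }
        return post_id


site = FakeSite()
stimulus = Stimulus(site)
original_id = stimulus.create_post("Hello", "Some text", "en")
translation_id = stimulus.create_post("Hola", "Algo de texto", "es",
                                      translation_of=original_id)
```

The expected copy is built from the inputs rather than copied from the site, so a bug that corrupts stored data cannot leak into the reference.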

As we add data, we also translate it.

It’s like putting an angry monkey in front of the keyboard and letting him pound at it. The monkey writes random content and translates it. The difference is that a computer does the job, so it happens fast and we can pump in as much data as we want in seconds.

As we create and translate content, we also log everything. This allows us to later analyze the test and trace back exactly what happened.

When it’s all over, we compare the data in the WordPress DB with the “expected data”. They should match exactly. If anything doesn’t agree, there are two possibilities:

  • There’s a bug in the test logic
  • There’s a bug in the plugin
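The final comparison can be as simple as the sketch below. It assumes both data sets are dictionaries keyed by record ID; `find_mismatches` is a hypothetical helper name, and the sample records are made up.

```python
def find_mismatches(actual, expected):
    """Return (record_id, reason) pairs for every record that disagrees."""
    mismatches = []
    for record_id in sorted(set(actual) | set(expected)):
        if record_id not in actual:
            mismatches.append((record_id, "missing from site"))
        elif record_id not in expected:
            mismatches.append((record_id, "not in expected data"))
        elif actual[record_id] != expected[record_id]:
            mismatches.append((record_id, "content differs"))
    return mismatches


expected = {1: "Hello", 2: "Привет", 3: "こんにちは"}
actual = {1: "Hello", 2: "??????", 4: "stray row"}

problems = find_mismatches(actual, expected)
```

An empty list means the run passed. A non-empty list says which records disagree, but not whether the test logic or the plugin is at fault.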

This is where the test log comes into play. We replay the log step by step, comparing the two databases as we go. When we find the first difference, we stop. This process takes longer to run, but it tells us exactly where the problem begins. We fix it and rerun.
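Replaying a log until the first divergence might be sketched like this. The log format, the function names and the deliberately buggy `apply_buggy` stand-in (which truncates titles to 5 characters) are all assumptions made for the example.

```python
def apply_reference(state, op):
    """Reference model: store the title exactly as logged."""
    state[op["id"]] = op["title"]


def apply_buggy(state, op):
    """Stand-in for the plugin, with a planted bug: titles are cut at 5 chars."""
    state[op["id"]] = op["title"][:5]


def first_divergence(log, apply_actual, apply_expected):
    """Replay the log one step at a time; return (step, op) at the first mismatch."""
    actual, expected = {}, {}
    for step, op in enumerate(log):
        apply_actual(actual, op)
        apply_expected(expected, op)
        if actual != expected:
            return step, op  # stop at the first difference
    return None


log = [
    {"id": 1, "title": "abc"},
    {"id": 2, "title": "hello"},
    {"id": 3, "title": "too long"},
    {"id": 4, "title": "x"},
]

result = first_divergence(log, apply_buggy, apply_reference)
```

Comparing after every step is slower than one comparison at the end, but the returned step points at the operation that introduced the error instead of at its later symptoms.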

We’re happy when we see a random test with tens of thousands of posts, translated into dozens of languages, finish without any error.

So, No More Bugs? Ever?

This kind of random test reveals a certain type of bug. We consider these bugs the most critical because they cause accumulating database errors. By the time you realize that these errors are occurring, they have already caused a great deal of trouble. This is why we spend so much time on this single test.

Other bugs are related to WordPress display hooks, AJAX operations and interface-related issues. The good news is that display-related bugs are easier to notice, report and fix. They also (usually) have no long-term impact on the site.

Accelerated ageing testing is the last item in our release list. As soon as it’s done, we’re ready with the first commercial version of WPML.

8 Responses to “One Last Test and We’re Ready”

  1. Wow, that’s interesting. Is this kind of test hand-crafted for every individual plugin? It just occurred to me that it would be really cool if the WordPress plugin repository could have a new section alongside the ratings and all, which listed what type of standardized tests the plugin had gone through.

  2. You can’t automate the creation of these tests. Each test simulates what a user would do, and since each plugin does a different thing, the test is also unique. Much like in electrical engineering, building the tests is about the same amount of work as building the product itself.

    I agree that it would help to know what kind of testing plugins went through. The ‘works for me’ indication on the plugin repository tells you something, but it isn’t comprehensive.

    What can be standardized is the way plugins operate generally. Plugins can be written for testing. This means that all logic lives in classes, outside of output rendering and input processing, so you can call it without simulating HTTP calls. MVC (model, view, controller) design makes this possible. WordPress itself isn’t exactly MVC, but it’s pretty close. Same for WPML. This is why we can create these tests in the first place.

  3. Has anyone else been receiving this piece of news several times in their feed reader? I’m using Google Reader and I think this is the 3rd time I see this popping in as a supposedly new and unread item.

  4. We’re to blame here. It’s due to the site update that we did. We tried to keep all RSS items the same, but somehow Google identified our DB switch and resent it all.

    That’s the end of it. Sorry about the confusion.