Guiding principles for choosing a test framework

tools, testing, and personal experience
A movie still from Dune (2021)

“No one understands my passion for this vain test library, but I shall make it default for everyone”

– A vernacular Software Test Engineer

The problem

Comparing test tools (or frameworks), and actually choosing one, can be a “painful” and overblown exercise in modern tech organizations:

This is oftentimes true regardless of the area of focus of the test tools, be it UI-facing, API-facing, Load/Performance-dedicated, …

Proposed solution(s)

I’d like to share a handful of personal guiding principles that have helped me cut through the noise and disruption that come with a tool-choice exercise:

The underlying principle is to use accessibility and maintainability as a compass (and a shortcut), creating a healthy critical distance while comparing test frameworks.

Let’s look at each of the above points in detail.

Anyone can run

One of the most overlooked aspects of any test tool, and especially of tooling adapted from or built on top of in-house tools, is that it’s not always straightforward for folks to make use of it.

If you have been working in the tech industry for a while, you have probably seen some of the worst symptoms of what I’m describing:

The basis of the problem is always the same: only the test engineers developing the checks know how to run them and/or how to interpret the results, and there is a massive disconnect in value between the automation and everyone else on the team.

In my personal experience, to fight this problem we can resort to one principle:

Whatever complexity we might code into our automated checks, we need to provide multiple accessible interfaces for folks to run them, abstracting away any tool-specific complexity (and setup) at every level.

What this means always depends on the context we’re working in, but here are a few good signals that we should be tuning for:

In the field, the above can look like this: suppose we’re working on a test tool called “testthis” that lets folks mimic user flows at the API level.

Technical folks can run the tool via CLI, which could look like:

testthis run --flow release_goat_for_trex --environment staging --project dinopark

And non-technical folks can run the same tool from the places where they already work, like Slack or other chat-based software, e.g. using a slash command:

/testthis run --flow release_goat_for_trex --environment staging --project dinopark

The trick is abstracting and reducing configuration, which lowers the mental load on anyone using the tool. There’s more I could write about this principle alone, because it’s tricky to design an interface that is simple but not over-simplifying. I’ll leave that for another post.
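For example (purely a sketch, assuming testthis can read team-level defaults, say the default environment and project, from a shared configuration file), most day-to-day runs could shrink to the shortest possible invocation:

testthis run --flow release_goat_for_trex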

Anyone can run from anywhere

One of the coolest things that I’ve come to appreciate over this past year was having direct access to a friend who also happens to be a Docker Captain (yes, that’s right Tom, I’m name dropping you on one of my posts).

The main lesson we absorb when we interact every day with someone who is a Docker (and containers) ace pilot is:

If we can break apart and contain a solution to a problem, e.g. a piece of software, the solution might not be perfect, but in one go we solve a lot of other tiny issues for ourselves and others.

We go from the typical “works on my machine” to being able to share and distribute the thing, contain it, pin its dependencies to a working state, and reuse it without worrying about internals, obscure setup steps, or host-machine specifics. It now “works on my container”, which is a slightly better position than having something work only on the programmer’s host machine.
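As a sketch of where this can take our hypothetical testthis checks (assuming we have written a Dockerfile for them, set the testthis binary as the image’s entrypoint, and published it under a made-up name like our-org/testthis), anyone with Docker installed could run the checks from any machine:

docker run --rm our-org/testthis run --flow release_goat_for_trex --environment staging --project dinopark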

Sadly, what we’ll find inside most organizations in the industry is that, aside from using existing containerized tools, very few test engineers tap into that power: they rarely containerize their own automated checks or their own in-house test tooling, and they fail to extend tools that already allow for containerization. I believe this also aggravates the pickles that a lot of test engineers endure:

Nobody cares because the test engineer is not distributing the thing properly.

Distribution is not just sharing a link to a repository or a CI job. In order to fix these “dormant test tooling” dilemmas, we need to keep in mind:

And probably the key side-pieces that come as a byproduct of the above principles:

When we do the exercise of getting our test tool, or our handful of automated checks, into a state where they can run “anywhere”, more often than not it also forces us to think about the next problem: letting anyone “run anything”, pointed anywhere.

Anyone can run anything

It’s usually the case that our test tools all try to do the same thing: follow a scripted path of a user’s interaction through a certain narrow perspective of a product.

If we wanted the test tool user to run this same interaction at a larger scale, as in a load test, it would be good practice for them to be able to do just that easily: indicate that they want to run the same thing they run when mimicking one user, but for hundreds, thousands, or more users.

Here’s the part where most Test Engineers will spot a gotcha: folks dedicate too much time either:

But they almost never dedicate balanced time to both. This shouldn’t be the case.

My proposed principle for doing things right here is that we need to push for ways to dedicate meaningful time to both implementations. And this is only possible if, from the start, we try to provide both through the “same” interface:

testthis run --flow some_flow (... other arguments)

testthis run-load --flow some_flow (... other arguments) (load-specific arguments, like number of virtual users, iterations)

testthis run-distributed-load --flow some_flow (... other arguments) (load-specific arguments) (distributed arguments)

The end user should be able to run anything easily: they just need to focus on choosing the right “attack” type, while the “parent” test tool abstracts the underlying complexity and acts as an alias for any other underlying tools.
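As a sketch of what that aliasing could mean under the hood (an assumption on my part: that testthis stores each flow as a k6 script and delegates load runs to k6), a call like:

testthis run-load --flow some_flow --virtual-users 100 --duration 5m

could simply translate into:

k6 run --vus 100 --duration 5m flows/some_flow.js

The user never has to learn the underlying load tool to get started, but the option to drop down to it is still there.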

Anyone can understand what failed

This goes back to something I had mentioned in a previous post.

As test engineers, we tend to make it so that when a given automated check suite fails, we get notified with some bland message and a link to a CI job. Problems with this approach:

This is not the way. Notifications for test tool failures should be as “delicious”, enticing, and meaningful as a typical predatory notification for a new social media post… without the shallowness and the ad-revenue-hungry demonic spirits that come with default social media notifications and clickbait.

What would this look like in theory? Well, it means we prioritize:

And what does this look like in practice? Taking the example from my anti-patterns post, it’s all about trying to reach a message that tells a story, like this:

SomeAutoChatbot says: The endpoint ABC in development environment 029 is failing with 502 Bad Gateway for the buy-an-action-man test scenario. Error trace-id is 053de188-7438-42b1. Link to the logs: some kibana/cloudwatch link. Possible solution: restart the orders service here or contact @oncall-support-dev-env-team.

versus saying something bland, like this:

something is not working, please check my failed jenkins job and the ticket 1234 of the test case on JIRA

The point is: whatever you do, optimize for the message itself, making sure that anyone in your surrounding context, including yourself, gets a quick grasp and clear signals of why something failed, and leave breadcrumbs for folks to investigate deeper if they are up for it.
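In practice (a minimal sketch, assuming the checks report to a Slack incoming webhook whose URL lives in a SLACK_WEBHOOK_URL environment variable), “optimizing for the message” can be as simple as composing that story and posting it whenever a check fails:

curl -X POST -H 'Content-type: application/json' --data '{"text": "Endpoint ABC in dev env 029 is failing with 502 Bad Gateway for buy-an-action-man. Trace-id: 053de188-7438-42b1. Logs: <kibana/cloudwatch link>. Possible fix: restart the orders service or ping @oncall-support-dev-env-team."}' "$SLACK_WEBHOOK_URL"

The exact transport matters far less than the discipline of filling in the who, where, what, and what-next before the notification goes out.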

Anyone can tinker with failure

How many times has a developer reached out to a test engineer and asked how they could write a specific automated check, or debug a failing one, only to be met with several flavours of the same response:

You could, but you can’t

We tend to make our lives on a project harder by not looking at what is probably the most useful problem to tackle after containerization:

How do we make it easy for anyone to debug a failure state of our test tool?

This principle depends heavily on the programming language, libraries, and tools with which each of us builds our own automated checks and test tooling, but its importance can make or break a test tool, and even a test engineer.
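As a sketch of one possible answer (reusing the hypothetical containerized image from earlier, and assuming made-up testthis flags for verbose output and for pausing on failure), a developer could poke at a failing check interactively without any local setup:

docker run --rm -it our-org/testthis run --flow release_goat_for_trex --environment staging --verbose --pause-on-failure

The specifics will differ per stack; the point is that a reproducible, inspectable failure state should be one command away for anyone, not just the test engineer who wrote the check.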

Some folks are quick to write this off and will say: “Ah, if folks do this and that on the specific dev environment that I use, they can somewhat debug the test tool… problem solved.”

Those folks fail to realize they are part of the problem. It shouldn’t be this way. There are a few steps I can suggest here:

Wrap-up: A word of caution about tools that try to be human

Here’s what most folks might not talk about when it comes to test tool choice: the “evil” of tools that try to be and do everything a human does.

By this I mean:

They suggest or impose a human-based domain-specific language (DSL)

There are two crucial points to keep in mind regarding this:

Just these points breed the equivalent of the flowers-blossoming problem: weeds blossom thanks to (re)implementation freedom. Oftentimes you end up with a million extra ways of using the “do-it-all” library to solve the same path within a single org, plus the added clutter of libraries that do more than what you actually need in the context of a scripted test.

So, I’ll wrap up this post with a word of caution:

There is such a thing as test tools/frameworks that try so hard to do everything a human does that they become vain tools.

I recommend reading Michael Bolton’s experience reports on Katalon and mabl to get a feel for what this usually means through the lens of a hardcore software tester.


If you read this far, thank you. Feel free to reach out to me with comments, ideas, grammar errors, and suggestions via any of my social media. Until next time, stay safe, take care! If you are up for it, you can also buy me a coffee ☕