Saturday, March 05, 2005

Acceptance tests for Web apps: GUI vs. business logic

I've been looking at Selenium lately (see previous posts) as a tool for acceptance/functional testing of Web applications. In its standard "TestRunner" mode, Selenium allows you to write tests as HTML tables that contain actions related to the HTML elements of the Web pages you want to test. Actions can be commands such as "open" a page, "click" on a link, "type" in a text field, "select" a value from a drop-down box. Actions can also be checks that compare the values of HTML elements against expected values: "verifyText", "verifyValue", "verifyTitle", etc.

A typical Selenium test table looks like this:

Google Test Search
open              | http://www.google.com                | 
verifyTitle       | Google                               | 
type              | q                                    | Selenium ThoughtWorks
verifyValue       | q                                    | Selenium ThoughtWorks
click             | btnG                                 | 
verifyTextPresent | selenium.thoughtworks.com            | 
verifyTitle       | Google Search: Selenium ThoughtWorks | 

This type of test exercises the AUT (application under test) at the GUI level. It is an important part of your testing strategy, since you want to make sure that your users can actually access all the functionality offered by your Web application. However, testing at the GUI level is notoriously brittle. Any time you make a change in the structure of the HTML pages under test, you run the risk of breaking the acceptance/functional tests that act on and verify elements of those HTML pages. Selenium tries to alleviate this issue by offering different ways of referring to HTML elements in your tests: by elementID, by specifying a DOM node, and by indicating an XPath expression. If you start developing a Web application from scratch, then you can collaborate with the HTML designers on the team and have them use identifiers for those HTML elements that you know will be exercised more by your tests. Or you can come up with a naming scheme for the HTML elements that will change as little as possible as development progresses.
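
To make the brittleness point concrete, here is a minimal sketch in Python. It does not use Selenium at all: the two page versions, the element ids and the helper functions are made up for illustration, and the lookups go through the standard library's ElementTree rather than a real browser. It only shows why a locator keyed on an id tends to survive a layout change that breaks a positional one.

```python
import xml.etree.ElementTree as ET

# Two versions of the same hypothetical search page. In the second one the
# form was wrapped in a <div> and a hidden field was added before the text
# box; the ids of the text box and the button did not change.
PAGE_V1 = """
<html><body>
  <form>
    <input id="q" type="text"/>
    <input id="btnG" type="submit"/>
  </form>
</body></html>
"""

PAGE_V2 = """
<html><body>
  <div class="search">
    <form>
      <input id="lang" type="hidden"/>
      <input id="q" type="text"/>
      <input id="btnG" type="submit"/>
    </form>
  </div>
</body></html>
"""


def find_by_id(root, element_id):
    """Locate an element by its id attribute, wherever it sits in the tree."""
    return root.find(f".//*[@id='{element_id}']")


def find_by_position(root):
    """Locate the first input of the form by its position in the tree."""
    return root.find("./body/form/input[1]")


v1 = ET.fromstring(PAGE_V1.strip())
v2 = ET.fromstring(PAGE_V2.strip())

# The id-based lookup finds the text box in both versions of the page.
print(find_by_id(v1, "q").get("type"), find_by_id(v2, "q").get("type"))

# The positional lookup works on the first version only; on the second one
# it finds nothing, because the form is no longer a direct child of <body>.
print(find_by_position(v1).get("id"), find_by_position(v2))
```

A test written against the id keeps passing after the redesign, while the positional one has to be rewritten, which is the maintenance cost the paragraph above describes.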

There is another problem with testing at the GUI level, even assuming the HTML pages are stable and your tests will not break. GUI-level tests exercise the business logic of your application only indirectly. For example, say you're in the business of selling widgets. You put together an online store application, with a shopping cart, credit card processing, etc. If you only test the application at the GUI level, how do you know that orders for your widgets really go through, that the inventory is correspondingly modified, that the credit card was really valid? All you can verify via a GUI-level test is that users can successfully navigate through your site and see the pages that you expect them to see. This is only an indirect validation of your business logic rules. For a direct validation, you need tests that exercise the database backend of your application. These tests can also be written as HTML tables and run through a framework such as FitNesse. Here's a typical FitNesse test that specifies an acceptance test scenario for a payroll application (I copied it directly from this page):

First we add a few employees.

Employees
id | name         | address                             | salary
1  | Jeff Languid | 10 Adamant St; Laurel, MD 20707     | 1005.00
2  | Kelp Holland | 12B Baker St; Cottonmouth, IL 60066 | 2000.00

Next we pay them.

Pay day.
pay date  | check number
1/31/2001 | 1000

We make sure their paychecks are correct. The blank cells will be filled in by the PaycheckInspector fixture. The cells with data in them already will be checked.

Paycheck Inspector.
id | amount | number | name | date
1  | 1005   |        |      | 
2  | 2000   |        |      | 

Finally we make sure that the output contained two, and only two paychecks, and that they had the right check numbers.

Paycheck inspector.
number
1000
1001


There is a very different look and feel for these tests, compared to the Selenium tests. Each table has a name such as Employees, which corresponds to a "fixture", a piece of code that takes the data specified in the table rows as arguments and passes them along to the AUT, then retrieves values that can be checked against expected values in the tables. Acceptance tests written in FitNesse often exhibit the "Build, Operate, Check" pattern: build the test data, operate on it, then check it against expected values. It can truly be said that a FitNesse page containing test tables is another thin GUI layer into your application, but a GUI layer that exercises your business rules directly.

For this to work, the developers need to provide clean interfaces into the business logic code. The application needs to have a design that separates the GUI logic from the business logic. In an agile environment, customers are supposed to pitch in and start writing acceptance tests in FitNesse as soon as the team gets started on a new iteration. Developers can then see what kind of hooks they need to provide as an interface for the fixture code. These hooks will evolve into a testing interface for the application. The resulting design will clearly separate the GUI logic from the business logic, and in fact it will be much easier to change GUIs altogether, or to provide different "views" into the business logic which do not even have to be GUI-based -- I'm thinking primarily of Web services.
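
The "Build, Operate, Check" pattern can be sketched in a few lines of Python. This is not the FitNesse fixture API and not the code behind the payroll example: the Payroll class, the three fixture functions and the rule that a paycheck equals the salary are all assumptions made for illustration. The point is only that each table maps to a small piece of glue code that talks to the business logic directly, with no GUI in between.

```python
from dataclasses import dataclass, field


@dataclass
class Payroll:
    """Hypothetical business-logic layer; it knows nothing about any GUI."""
    employees: dict = field(default_factory=dict)
    paychecks: list = field(default_factory=list)

    def add_employee(self, emp_id, name, address, salary):
        self.employees[emp_id] = {"name": name, "address": address, "salary": salary}

    def run_pay_day(self, pay_date, first_check_number):
        # Assumption: everyone is paid their full salary, and consecutive
        # check numbers are handed out in order of employee id.
        for offset, emp_id in enumerate(sorted(self.employees)):
            emp = self.employees[emp_id]
            self.paychecks.append({
                "id": emp_id, "amount": emp["salary"],
                "number": first_check_number + offset,
                "name": emp["name"], "date": pay_date,
            })


def employees_fixture(payroll, rows):
    """Build: feed each table row to the business logic as test data."""
    for row in rows:
        payroll.add_employee(row["id"], row["name"], row["address"], row["salary"])


def pay_day_fixture(payroll, rows):
    """Operate: trigger the business operation described by each row."""
    for row in rows:
        payroll.run_pay_day(row["pay date"], row["check number"])


def paycheck_inspector_fixture(payroll, expected_rows):
    """Check: compare every non-blank expected cell with the actual value."""
    by_id = {check["id"]: check for check in payroll.paychecks}
    results = []
    for expected in expected_rows:
        actual = by_id.get(expected["id"], {})
        for column, value in expected.items():
            if column == "id" or value is None:  # blank cells are not checked
                continue
            results.append((expected["id"], column, actual.get(column) == value))
    return results


payroll = Payroll()
employees_fixture(payroll, [
    {"id": 1, "name": "Jeff Languid",
     "address": "10 Adamant St; Laurel, MD 20707", "salary": 1005.00},
    {"id": 2, "name": "Kelp Holland",
     "address": "12B Baker St; Cottonmouth, IL 60066", "salary": 2000.00},
])
pay_day_fixture(payroll, [{"pay date": "1/31/2001", "check number": 1000}])
results = paycheck_inspector_fixture(payroll, [
    {"id": 1, "amount": 1005.00, "number": None, "name": None, "date": None},
    {"id": 2, "amount": 2000.00, "number": None, "name": None, "date": None},
])
check_numbers = sorted(check["number"] for check in payroll.paychecks)
print(results, check_numbers)
```

In a real framework the per-cell results would be used to color the table cells green or red; here they are just a list of (id, column, passed) tuples.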

The testing interfaces I mentioned can also evolve from the so-called "admin modules" that many Web applications have. These are alternate interfaces into the application, used by the business people for checking and updating inventory, running reports against the database, etc. All these functions directly exercise business rules by interfacing with the database backend. While writing them, the developers might as well think of them as testing interfaces that can be used in acceptance tests.
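
As a rough sketch of what such a testing interface buys you, the Python below uses an in-memory SQLite database as a stand-in for the widget store. The schema, the place_order function and the two query helpers are hypothetical; a real application would have its own. The test places an order through the business logic and then looks at the database itself to confirm that the order was recorded and the inventory went down, which a GUI-level test cannot do.

```python
import sqlite3


def create_store(conn):
    """Create the hypothetical schema for the widget store."""
    conn.executescript("""
        CREATE TABLE inventory (sku TEXT PRIMARY KEY, quantity INTEGER NOT NULL);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            sku TEXT NOT NULL,
            quantity INTEGER NOT NULL
        );
    """)


def place_order(conn, sku, quantity):
    """Business rule: an order goes through only if there is enough stock."""
    row = conn.execute(
        "SELECT quantity FROM inventory WHERE sku = ?", (sku,)).fetchone()
    if row is None or row[0] < quantity:
        raise ValueError(f"insufficient stock for {sku}")
    with conn:  # commit both statements together, or roll back on error
        conn.execute(
            "UPDATE inventory SET quantity = quantity - ? WHERE sku = ?",
            (quantity, sku))
        cursor = conn.execute(
            "INSERT INTO orders (sku, quantity) VALUES (?, ?)", (sku, quantity))
    return cursor.lastrowid


def stock_level(conn, sku):
    """Admin-style query: current stock for one item."""
    return conn.execute(
        "SELECT quantity FROM inventory WHERE sku = ?", (sku,)).fetchone()[0]


def order_count(conn):
    """Admin-style query: number of orders recorded so far."""
    return conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]


conn = sqlite3.connect(":memory:")
create_store(conn)
with conn:
    conn.execute("INSERT INTO inventory VALUES ('widget', 10)")

place_order(conn, "widget", 3)
print(stock_level(conn, "widget"), order_count(conn))

# An order larger than the remaining stock must be rejected and must leave
# both tables untouched.
try:
    place_order(conn, "widget", 100)
    oversell_rejected = False
except ValueError:
    oversell_rejected = True
print(oversell_rejected, stock_level(conn, "widget"), order_count(conn))
```

The two query helpers play the role of the admin module: they exist for the business people, but the acceptance tests can call them just as well.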

In conclusion, I think that acceptance tests for a Web application (or any application that has a GUI for that matter) need to be run at both levels: GUI and business logic. The GUI tests can be used as a "smoke test" strategy, as a sanity check that navigation through the site works and that users are not faced with ugly 404 errors. For this type of testing, a tool such as Selenium, which drives a real browser, is invaluable. But the bulk of the acceptance testing should be done at the business logic level. Being able to run FitNesse-type acceptance tests not only enhances the testability of the application, but most importantly forces a clean design that separates the GUI layer from the business logic layer and allows the application to easily adapt to GUI changes. Another benefit is that it becomes easy for the application to offer several interfaces into its business logic, for example a Web services interface in addition to the standard HTML-based interface.

One more note: one thing that Selenium-style tests and FitNesse-style tests have in common is that tests can be specified via HTML tables. This provides nice visual feedback during the test run: the rows of these tables get colored green or red, depending on the test outcome. The HTML table format, however, is not the friendliest one for business customers who are supposed to write these tests side-by-side with the testers. FitNesse does offer ways of importing tables from spreadsheets, and Selenium is being extended at the moment so that it can deal with CSV formats (see Ian Bicking's post on some of the work he's doing on this, as well as on automatically generating Selenium scripts via TCPWatch). Selenium can also be run in "driven mode", where scripts written in Python, Ruby or Perl can drive the browser via an API.
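
To show why a CSV format is a natural fit, here is a small Python sketch that turns command/target/value rows from a CSV file into an HTML test table. It is not the Selenium extension mentioned above and makes no claim about how that work is implemented; the csv_to_test_table function and the exact table layout are assumptions for illustration only.

```python
import csv
import io
from html import escape


def csv_to_test_table(csv_text, title):
    """Render CSV rows of (command, target, value) as an HTML test table."""
    rows = [f'<tr><td colspan="3">{escape(title)}</td></tr>']
    for record in csv.reader(io.StringIO(csv_text)):
        if not record:
            continue  # skip empty lines
        cells = (record + ["", ""])[:3]  # pad a missing target or value
        rows.append(
            "<tr>" + "".join(f"<td>{escape(cell)}</td>" for cell in cells) + "</tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"


csv_text = (
    "open,http://www.google.com\n"
    "type,q,Selenium ThoughtWorks\n"
    "click,btnG\n"
)
table_html = csv_to_test_table(csv_text, "Google Test Search")
print(table_html)
```

A customer could then maintain the three-column rows in a spreadsheet, export them as CSV, and never have to touch the HTML by hand.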

7 comments:

Anonymous said...

I just noticed your presentation for PyCon includes looking at web unit/regression/acceptance testing kits. You're not including webunit though. I'd like to see it included :)

http://www.python.org/pypi?:action=display&name=webunit

Grig Gheorghiu said...

Actually I have a set of slides ready for PyCon and I do mention webunit, mechanize, PBP and twill along with MaxQ and Selenium in them. I'll update the PyCon Wiki page for the abstract of my presentation and I'll add those tools in there too.

Anonymous said...

Some interesting items in this and other articles. However, the following got me thinking:

"If you only test the application at the GUI level, how do you know that orders for your widgets really go through, that the inventory is correspondingly modified, that the credit card was really valid?"

Isn't this more a matter of test design than Business vs Presentation layer testing? In any test of a business process with DB access, the applicable DB contents should be validated using something other than the AUT. (Trust but verify, as they say...)

I do agree that testing at the business layer can be more efficient, although not sure if one always has time to test at every level, and if one can only create acceptance tests at one level, I'd vote for the Presentation layer to ensure end-to-end testing is being done (GUI to DB.)

Now if test cases could be defined in one HTML table for use through either the Presentation or Business layer, that might be interesting.... ;-)

Anonymous said...

Heh. The above poster's comment is so completely over-used.

"not sure if one always has time to test at every level, and if one can only create acceptance tests at one level, I'd vote for the Presentation layer to ensure end-to-end testing is being done"

Not having time to test thoroughly during the development cycle is just another way of saying... I would rather debug production problems or have my project QA'd for twice the amount of time that it should be.

Lame.

Unit testing your business logic ruthlessly while still maintaining a small percentage of high level UI/Presentation layer tests is the only way to fly.

Adib said...

Greetings,

Good post on Fitnesse and Selenium. In fact, at Valtech, where I work, we have written a Bridge between Fitnesse and Selenium. This enables us to write Selenium tests in Fitnesse wiki style. Both UI and Service level tests can exist in the same framework, which allows for common reporting.

Stuart Taylor said...

i stumbled in here after a Google for FitNesse and Apache2 (but thats another blog post).

Am i missing something here? you are comparing selenium and FitNesse?

That's not like comparing apples and pears, that's like comparing apples and Diana Ross. Fitnesse is great where your stakeholder requirements are a little thin. The fact it's a wiki lends itself easily to allowing business users to read the business use case, and see the related tests right on the page, written in plain English. Giving them a button to test that use case right on the page empowers them.

You couldn't have the stakeholder look through a bunch of selenium tests and understand them easily.

We too are building commonality between selenium and FitNesse but that's only so our test team can write the tests in a common syntax that QA and DEV can both use.
the rationale being this, any automated testing comes down to a few simple actions:
set [something]
get [something]
validate [something]
waitfor [something]

being able to write in English in Fitnesse only serves to increase the amount of code needed for the testware.

the user inputs
the user types
the user enters

all boils down to
set [something]

like i said apples and Diana Ross ;-)

Adib said...

I am not comparing fitnesse and selenium. I am saying they can complement each other
