February 26, 2007

Expecting the unexpected. Part 1

Posted by Ben Simo

One of the expectations for GUI test automation is unattended running of tests. However, this is often difficult to accomplish. Unexpected application behavior can stop an automated test in its tracks. Manual intervention is then required to run the script. Some automation tools offer run-time options to help the user prod the test along. Other tools require that the script or system under test be fixed before test execution can continue. The process of running a partial test, fixing the script (or waiting for an application fix), and then running another partial test only to find another script-stopping change can be time-consuming. This process often takes longer than manual testing.

The problem is that scripted automation cannot adjust to application issues like a thinking manual tester. The automation script can only do what the scripter told it to expect. Some automation tools offer complex exception handling features that allow users to define expected unexpected behavior. There lies the problem: someone has to expect and code for the unexpected. There will always be unexpected unexpected behavior.

How can we create automation that can deal with the unexpected?


February 13, 2007

Top ten web application security flaws

Posted by Ben Simo

The topic of tonight's SQuAD meeting was software security. Mike Walters' presentation, "Integration of Security into the SDLC", highlighted the need to implement and validate security throughout the software development lifecycle. Mike mentioned that security is moving from the realm of "non-functional" testing to functional testing. Security has become an important functional requirement. Mike stressed the need to define security requirements at the start; get developer buy-in; and provide developers with the tools and training to build secure software. The risks of poor security are often too great to ignore.

Mike recommended the OWASP Top Ten Project as a great starter list of web application security threats to consider.

    • A1 Unvalidated Input. Information from web requests is not validated before being used by a web application. Attackers can use these flaws to attack backend components through a web application.
    • A2 Broken Access Control. Restrictions on what authenticated users are allowed to do are not properly enforced. Attackers can exploit these flaws to access other users' accounts, view sensitive files, or use unauthorized functions.
    • A3 Broken Authentication and Session Management. Account credentials and session tokens are not properly protected. Attackers that can compromise passwords, keys, session cookies, or other tokens can defeat authentication restrictions and assume other users' identities.
    • A4 Cross Site Scripting. The web application can be used as a mechanism to transport an attack to an end user's browser. A successful attack can disclose the end user's session token, attack the local machine, or spoof content to fool the user.
    • A5 Buffer Overflow. Web application components in some languages that do not properly validate input can be crashed and, in some cases, used to take control of a process. These components can include CGI, libraries, drivers, and web application server components.
    • A6 Injection Flaws. Web applications pass parameters when they access external systems or the local operating system. If an attacker can embed malicious commands in these parameters, the external system may execute those commands on behalf of the web application.
    • A7 Improper Error Handling. Error conditions that occur during normal operation are not handled properly. If an attacker can cause errors to occur that the web application does not handle, they can gain detailed system information, deny service, cause security mechanisms to fail, or crash the server.
    • A8 Insecure Storage. Web applications frequently use cryptographic functions to protect information and credentials. These functions and the code to integrate them have proven difficult to code properly, frequently resulting in weak protection.
    • A9 Application Denial of Service. Attackers can consume web application resources to a point where other legitimate users can no longer access or use the application. Attackers can also lock users out of their accounts or even cause the entire application to fail.
    • A10 Insecure Configuration Management. Having a strong server configuration standard is critical to a secure web application. These servers have many configuration options that affect security and are not secure out of the box.
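
As an illustration of A1 (Unvalidated Input) and A6 (Injection Flaws) from the list above, here is a minimal sketch using Python's built-in sqlite3 module. The table, the data, and the function names are made up for this example; it only shows the general idea of binding parameters instead of pasting request data into a command.

```python
import sqlite3

def make_db():
    # Hypothetical in-memory database with a tiny users table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])
    return conn

def find_user_unsafe(conn, name):
    # Vulnerable: the request value is pasted straight into the SQL text,
    # so quote characters in the input change the meaning of the query.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]

def find_user_safe(conn, name):
    # Safer: the value is bound as a parameter and is never parsed as SQL.
    query = "SELECT name FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]

if __name__ == "__main__":
    conn = make_db()
    attack = "x' OR '1'='1"
    print(find_user_unsafe(conn, attack))  # the injected condition matches every row
    print(find_user_safe(conn, attack))    # treated as an odd name; matches nothing
```

A tester can use the same attack string as test input: if the application returns rows for it, the input is reaching the backend unvalidated.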

The next time you are involved in designing, coding, or testing a web application: consider these things.


February 12, 2007

Best Practices Aren’t

Posted by Ben Simo

The first two of the Seven Basic Principles of the Context-Driven School of software testing are:

1. The value of any practice depends on its context.
2. There are good practices in context, but there are no best practices.

As a former quality school gatekeeper, I understand the value of standards – in both products and processes. However, I am concerned by the current “best practices” trends in software development and testing. The rigidity that we demand in product standards can hurt in process standards. Even the CMM (which is often viewed as a rigid process) has “Optimizing” as the highest level of maturity. A mature process includes continuous learning and adjustment of the process. No process should lock us into one way of doing anything.

Nearly 100 years ago, the industrial efficiency pioneer Frederick Taylor wrote “among the various methods and implements used in each element of each trade there is always one method and one implement which is quicker and better than any of the rest”.

I do not disagree that there may be a best practice for a specific task in a specific context. Taylor broke down existing methods and implements (tools) into small pieces and scientifically evaluated them in search of areas for improvement. The problem is that today's best practices are often applied as one-size-fits-all processes. The best practice for one situation is not necessarily the best for all other contexts. And a "best practice" today may no longer be the best practice tomorrow. This is actually the opposite of what Taylor did. Consultants and tool vendors have discovered that there is money to be made taking "best practices" out of one context and applying them to all other contexts. It is harder, and likely less profitable, for the "experts" to seek out the best practices for a specific context. Taylor sought out and applied best practices to small elements. Many of today's "best practices" are applied at a universal level.

I am amazed by what appears to be widespread acceptance of “best practices” by software testers. As testers, it is our job to question. We make a living questioning software. We need to continually do the same for practices. Test your practices.

When presented with a best practice, consider contexts in which the practice is not the best. The broader the scope of the best practice, the more situations it is unlikely to fit. Don’t limit your toolbox to a single practice or set of practices. Be flexible enough to adjust your processes as the context demands. Treat process as an example: apply it where it fits, and be willing to deviate from the process -- or even apply an entirely different process -- if it does not fit the context.

No process should replace human intelligence. Let process guide you when it applies. Don’t let a process make decisions for you.

Seek out continuous improvement. Don't let process become a rut.

Process is best used as a map; not as an auto-pilot.


February 8, 2007

When is a bug a feature?

Posted by Ben Simo

Ever find a bug that "can't" be fixed?

I am not referring to fixes prevented by technical or schedule limitations. I am referring to bugs that have become expected features.

Here are two...

Experience #1
I once found a bug in communications software that would cause most of the systems likely to be on the other end of a connection to crash. I don't mean the failure of a single process. This was a memory allocation issue that could take down entire systems that were essential to their owners. An argument could be made that the bug was really in all those other systems and not in the system that I was testing. However, most of the large number of systems that made up the pre-existing installed base could not create the condition that would make another system fail. Due to the large number of systems (from a variety of vendors) it was determined that the condition that caused the failure should be prevented by new systems (and future releases of existing systems) instead of immediately fixing all the old systems. When I discovered the problem on the new system, the developers were willing and able to make a quick fix but the business said no. The reason? The manuals had already been printed and the application could not be changed in any way that changed the documented user experience.

Experience #2
After being newly assigned to test a product that had been in "maintenance" mode for many years, I discovered and reported numerous bugs. There were also new developers assigned to this product. The developers and I were allowed to work on these long-standing defects because the new developers needed a chance to familiarize themselves with the code before working on upcoming enhancements. One of the bugs we fixed was a yes/no prompt that required that users select "no" when they meant "yes", and "yes" when they meant "no". To both me and the new development team, this was a major problem. However, after shipping the "new and improved" release, we received requests from a customer that the yes/no prompt be put back the way it was. The reason? The customer had created their own documentation and training for their users. The customer was teaching their users that "yes" means "no" and "no" means "yes". We had to back out this change to keep the customer happy.

Some lessons I learned from these experiences are:
1) There are often bigger issues involved in software development than removing all the bugs.
2) Users learn to work around bugs and treat them as features. This can create a situation in which the fix to a bug becomes a new bug. Consult users before making changes that impact their workflow -- especially users that write big checks.
3) Delaying a fix for a bug that impacts how users behave may prevent the bug from ever being fixed. Had the yes/no issue been fixed soon after it was first introduced, the customer would have been happy with the fix. You may end up needing to manage two sets of code: one for customers that want the bug fixed and one for customers that want the bad behavior to stay.
4) Respectfully ask questions when direction doesn't make sense. Work with stakeholders to come up with creative solutions. In the case of the pre-printed documentation, development was able to come up with a creative solution that did not impact the user interface or documentation.

What "non-fixable" bugs have you encountered? Was a solution found?


February 7, 2007

People, Monkeys, and Models

Posted by Ben Simo

Methods I have used for automating “black box” software testing…

I have approached test automation in a number of different ways over the past 15 years. Some have worked well and others have not. Most have worked when applied in the appropriate context. Many would be inappropriate for contexts other than that in which they were successful.

Below is a list of methods I’ve tried in the general order that I first implemented them.

Notice that I did not start out with the record-playback test automation that is demonstrated by tool vendors. The first test automation tool I used professionally was the DOS version of WordPerfect. (Yes, a word processor as a test tool. Right now, Excel is probably the one tool I find most useful.) WordPerfect had a great macro language that could be used for all kinds of automated data manipulation. I then moved to Pascal and C compilers. I even used a pre-HTML hyper-link system called First Class to create front ends for integrated computer-assisted testing systems.

I had been automating tests for many years before I saw my first commercial GUI test automation tool. My first reaction to such tools was something like: "Cool. A scripting language that can easily interact with the user interfaces of other programs."

I have approached test automation as software development since the beginning. I've seen (and helped recover from) a number of failed test automation efforts that were implemented using the guidelines (dare I say "Best Practices"?) of the tools' vendors. I had successfully implemented model-based testing solutions before I knew of keyword-driven testing (as a package by that name). I am currently using model-based test automation for most GUI test automation, including release acceptance and regression testing. I also use computer-assisted testing tools to help generate test data and model applications for MBT.

I've rambled on long enough. Here's my list of methods I've applied in automating "black box" software testing. What methods have worked for you?

Computer-assisted Testing
· How It Works: Manual testers use software tools to assist them with testing. Specific tasks in the manual testing process are automated to improve consistency or speed.
· Pros: Tedious or difficult tasks can be given to the computer. A little coding effort greatly benefits testers. A thinking human being is involved throughout most of the testing process.
· Cons: A human being is involved throughout most of the testing process.

Static Scripted Testing
· How It Works: The test system steps through an application in a pre-defined order, validating a small number of pre-defined requirements. Every time a static test is repeated, it performs the same actions in the same order. This is the type of test created using the record and playback features in most test automation tools.
· Pros: Tests are easy to create for specific features and to retest known problems. Non-programmers can usually record and replay manual testing steps.
· Cons: Specific test cases need to be developed, automated, and maintained. Regular maintenance is required because most automated test tools are not able to adjust for minor application changes that may not even be noticed by a human tester. Test scripts can quickly become complex and may even require a complete redesign each time an application changes. Tests only retrace steps that have already been performed manually. Tests may miss problems that are only evident when actions are taken (or not taken) in a specific order. Recovery from failure can be difficult: a single failure can easily prevent testing of other parts of the application under test.

Wild (or Unmanaged) Monkey Testing
· How It Works: The automated test system simulates a monkey banging on the keyboard by randomly generating input (key-presses; and mouse moves, clicks, drags, and drops) without knowledge of available input options. Activity is logged, and major malfunctions such as program crashes, system crashes, and server/page not found errors are detected and reported.
· Pros: Tests are easy to create, require little maintenance, and given time, can stumble into major defects that may be missed following pre-defined test procedures.
· Cons: The monkey is not able to detect whether or not the software is functioning properly. It can only detect major malfunctions. Reviewing logs to determine just what the monkey did to stumble into a defect can be time consuming.
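
Here is a minimal sketch of a wild monkey, assuming a made-up `ToyApp` that accepts events through a hypothetical `send` method and crashes on one particular input sequence. A real monkey would drive a GUI; this only shows the loop of random input, logging, and crash detection. Seeding the random generator keeps a run repeatable, which helps with the log-review problem.

```python
import random

class ToyApp:
    """Hypothetical system under test: crashes when 'z' is typed three times in a row."""
    def __init__(self):
        self.streak = 0

    def send(self, event):
        kind, value = event
        if kind == "key" and value == "z":
            self.streak += 1
        else:
            self.streak = 0
        if self.streak >= 3:
            raise RuntimeError("simulated crash")

def random_event(rng):
    # The monkey knows nothing about the app: any key, any screen position.
    if rng.random() < 0.5:
        return ("key", rng.choice("xyz"))
    return ("click", (rng.randrange(800), rng.randrange(600)))

def wild_monkey(app, steps, seed):
    """Send random events; return the activity log and the crash (or None)."""
    rng = random.Random(seed)
    log = []
    for _ in range(steps):
        event = random_event(rng)
        log.append(event)          # log before sending so the fatal event is recorded
        try:
            app.send(event)
        except Exception as exc:   # only major malfunctions can be detected
            return log, exc
    return log, None

if __name__ == "__main__":
    log, crash = wild_monkey(ToyApp(), steps=5000, seed=1)
    print(len(log), "events sent; crash:", crash)
```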

Trained (or Managed) Monkey Testing
· How It Works: The automated test system detects available options displayed to the user and randomly enters data and presses buttons that apply to the detected state of the application.
· Pros: Tests are relatively easy to create, require little maintenance, and easily find catastrophic software problems. May find errors more quickly than an unsupervised monkey test.
· Cons: Although a trained monkey is somewhat selective in performing actions, it also knows nothing (or very little) about the expected behavior of the application and can only detect defects that result in major application failures.
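
A trained monkey can be sketched the same way, assuming the application can report which controls are currently usable. The `LoginDialog` class and its `available_actions` and `perform` methods are hypothetical stand-ins for whatever a GUI tool exposes.

```python
import random

class LoginDialog:
    """Hypothetical app that reports which controls are currently enabled."""
    def __init__(self):
        self.screen = "login"
        self.text = ""

    def available_actions(self):
        if self.screen == "login":
            actions = ["type_name"]
            if self.text:                  # OK is enabled only after a name is typed
                actions.append("press_ok")
            return actions
        return ["press_logout"]

    def perform(self, action):
        if action not in self.available_actions():
            raise RuntimeError("action not available: " + action)
        if action == "type_name":
            self.text += "a"
        elif action == "press_ok":
            self.screen = "home"
        elif action == "press_logout":
            self.screen, self.text = "login", ""

def trained_monkey(app, steps, seed):
    """Randomly pick only from the actions the app currently offers."""
    rng = random.Random(seed)
    log = []
    for _ in range(steps):
        action = rng.choice(app.available_actions())
        log.append(action)
        app.perform(action)   # an exception here would be a major failure
    return log

if __name__ == "__main__":
    print(trained_monkey(LoginDialog(), steps=10, seed=7))
```

Note that the monkey still has no idea what each action should do; it only avoids wasting time on input the application cannot accept.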

Tandem Monkey Testing
· How It Works: The automated test system performs trained monkey tests, in tandem, in two versions of an application: one performing an action after the other. The test tool compares the results of each action and reports differences.
· Pros: Specific test cases are not required. Tests are relatively easy to create, require little maintenance, and easily identify differences between two versions of an application.
· Cons: Manual review of differences can be time consuming. Due to the requirement of running two versions of an application at the same time, this type of testing is usually only suited for testing through web browsers and terminal emulators. Both versions of the application under test must be using the same data – unless the data is the subject of the test.
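
A rough sketch of the tandem idea, assuming two versions of a made-up counter application (`CounterV1` and `CounterV2` are hypothetical). The same random action is sent to both, and every difference in the result is recorded for a human to review.

```python
import random

class CounterV1:
    """Hypothetical old release."""
    def __init__(self):
        self.value = 0

    def available_actions(self):
        return ["inc", "dec", "reset"]

    def perform(self, action):
        if action == "inc":
            self.value += 1
        elif action == "dec":
            self.value -= 1
        else:
            self.value = 0
        return self.value

class CounterV2(CounterV1):
    """Hypothetical new release: the counter no longer goes below zero."""
    def perform(self, action):
        super().perform(action)
        self.value = max(self.value, 0)
        return self.value

def tandem_monkey(old, new, steps, seed):
    """Perform the same random actions on both versions and report differences."""
    rng = random.Random(seed)
    differences = []
    for step in range(steps):
        action = rng.choice(old.available_actions())
        old_result = old.perform(action)
        new_result = new.perform(action)
        if old_result != new_result:
            differences.append((step, action, old_result, new_result))
    return differences

if __name__ == "__main__":
    for difference in tandem_monkey(CounterV1(), CounterV2(), steps=50, seed=3):
        print(difference)
```

The tool cannot say which version is right. Here the difference is an intended change, which is exactly why the manual review can be time consuming.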

Data-Reading Scripted Testing
· How It Works: The test system steps through an application using pre-defined procedures with a variety of pre-defined input data. Each time an action is executed, the same procedures are followed; however, the input data changes.
· Pros: Tests are easy to create for specific features and to retest known problems. Recorded manual tests can be parameterized to create data-reading static tests. Performing the same test with a variety of input data can identify data-related defects that may be missed by tests that always use the same data.
· Cons: All the development and maintenance problems associated with pure static scripted tests still exist with most data-reading tests.
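
A small sketch of a data-reading test, assuming a made-up `login` function as the system under test and inline CSV text in place of a data file. The procedure is fixed; only the rows change.

```python
import csv
import io

# Hypothetical test data; in practice this would usually live in a file.
TEST_DATA = """username,password,expected
alice,secret1,ok
alice,wrong,denied
,secret1,denied
"""

def login(username, password):
    """Hypothetical system under test."""
    return "ok" if (username, password) == ("alice", "secret1") else "denied"

def run_data_driven(data_text):
    """Run the same login procedure once per data row and record the outcome."""
    results = []
    for row in csv.DictReader(io.StringIO(data_text)):
        actual = login(row["username"], row["password"])
        passed = actual == row["expected"]
        results.append((row["username"], row["expected"], actual, passed))
    return results

if __name__ == "__main__":
    for result in run_data_driven(TEST_DATA):
        print(result)
```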

Model-Based Testing
· How It Works: Model-based testing is an approach in which the behavior of an application is described in terms of actions that change the state of the system. The test system can then dynamically create test cases by traversing the model and comparing results of each action to the action’s expected result state.
· Pros: Relatively easy to create and maintain. Models can be as simple or complex as desired. Models can be easily expanded to test additional functionality. There is no need to create specific test cases because the test system can generate endless tests from what is described in the model. Maintaining a model is usually easier than managing test cases (especially when an application changes often). Machine-generated “exploratory” testing is likely to find software defects that will be missed by traditional automation that simply repeats steps that have already been performed manually. Human testers can focus on bigger issues that require an intelligent thinker during execution. Model-based automation can also provide information to human testers to help direct manual testing.
· Cons: It requires a change in thinking. This is not how we are used to creating tests. Model-based test automation tools are not readily available.
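
One simple way to picture this is a model written as a table of states, actions, and expected result states, with a random walk as the traversal. This is only a sketch: the `MODEL` table and the `ShopApp` class (with an optional seeded defect) are invented for the example, and a random walk is just one of many possible traversal strategies.

```python
import random

# Model: state -> {action: expected result state}
MODEL = {
    "logged_out": {"login": "logged_in"},
    "logged_in": {"open_cart": "cart", "logout": "logged_out"},
    "cart": {"close_cart": "logged_in", "logout": "logged_out"},
}

class ShopApp:
    """Hypothetical system under test, optionally with a seeded defect."""
    def __init__(self, buggy=False):
        self.state = "logged_out"
        self.buggy = buggy

    def perform(self, action):
        if action == "login":
            self.state = "logged_in"
        elif action == "logout":
            # Seeded defect: logout is ignored while the cart is open.
            if not (self.buggy and self.state == "cart"):
                self.state = "logged_out"
        elif action == "open_cart":
            self.state = "cart"
        elif action == "close_cart":
            self.state = "logged_in"
        return self.state

def random_walk(app, model, start, steps, seed):
    """Generate a test on the fly by walking the model; stop at the first mismatch."""
    rng = random.Random(seed)
    state, path = start, []
    for _ in range(steps):
        action = rng.choice(sorted(model[state]))
        expected = model[state][action]
        path.append(action)
        actual = app.perform(action)
        if actual != expected:
            return path, (expected, actual)
        state = expected
    return path, None

if __name__ == "__main__":
    path, failure = random_walk(ShopApp(buggy=True), MODEL, "logged_out", 300, seed=11)
    print(len(path), "actions; failure:", failure)
```

Changing the seed produces a different test from the same model, and adding a row to the table extends every future test.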

Keyword-Driven Testing
· How It Works: Test design and implementation are separated. Use case components are assigned keywords. Keywords are linked to create test procedures. Small components are automated for each keyword process.
· Pros: Automation maintenance is simplified. Coding skills are not required to create tests from existing components. Small reusable components are easier to manage than long recorded scripts.
· Cons: Test cases still need to be defined. Managing the process can become as time consuming as automating with static scripts. Tools to manage the process are expensive. Cascading bugs can stop automation in its tracks. The same steps are repeated each time a test is executed. (Repeatability is not all it's cracked up to be.)
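
A bare-bones sketch of the keyword idea, assuming a made-up `Calculator` as the system under test. The keyword names, the `KEYWORDS` mapping, and the test table are all invented for illustration; commercial tools manage these pieces very differently.

```python
class Calculator:
    """Hypothetical system under test."""
    def __init__(self):
        self.total = 0

# Small automated components, one per keyword.
def kw_clear(app):
    app.total = 0

def kw_add(app, amount):
    app.total += int(amount)

def kw_subtract(app, amount):
    app.total -= int(amount)

def kw_verify_total(app, expected):
    if app.total != int(expected):
        raise AssertionError(f"expected total {expected}, got {app.total}")

KEYWORDS = {
    "Clear": kw_clear,
    "Add": kw_add,
    "Subtract": kw_subtract,
    "Verify Total": kw_verify_total,
}

# Test design: a table a non-programmer could write (keyword, then arguments).
TEST_TABLE = [
    ("Clear",),
    ("Add", "5"),
    ("Subtract", "2"),
    ("Verify Total", "3"),
]

def run_keyword_test(table):
    """Execute each row by looking up its keyword; return the final total."""
    app = Calculator()
    for keyword, *args in table:
        KEYWORDS[keyword](app, *args)
    return app.total

if __name__ == "__main__":
    print(run_keyword_test(TEST_TABLE))
```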


February 6, 2007

Slogans are models.

Posted by Ben Simo

Harry Robinson posted an answer to inquiries about the Google Testing Blog's slogan: "Life is too short for manual testing." Some were concerned that the slogan implied that Google does not value manual and exploratory testing. I too had such concerns.

Harry pointed out that the slogan is just a slogan and that life really is too short to try all the combinations that might expose important bugs.

This got me to thinking about slogans as models. A slogan is really a model of an idea. It is not complete. It is simpler than the thing it describes.

Consider the following advertising slogans:

  • "The ultimate driving machine"
  • "When it absolutely, positively has to be there overnight."
  • "Finger lickin' good."
  • "Let your fingers do the walking."
  • "Reach out and touch someone."
  • "The quicker picker-upper."
  • "Have it your way."
  • "It's everywhere you want to be."
  • "Betcha can't eat just one."

These slogans bring to mind attributes of the companies and their products that are not an explicit part of the slogan. I don't even have to mention the companies or their products. This is your mind accessing your mental model of the company and products that the slogan represents.

Even with the more detailed model invoked in your mind, it should not be difficult to find faults with these slogans. The slogans are incomplete; yet they are not useless.

Slogans demonstrate both the usefulness and potential complexity of models. A model does not need to be complete to be useful.

So, how does this apply to software testing ... and test automation?

When we develop test cases or perform exploratory testing we are implementing our mental models. When we execute tests, we (hopefully) learn more about the system under test and update our mental models.

In the same way, explicit models used for model-based test automation can be refined after each test execution. There is no need to model all the possible details before the first test run. Running tests based on incomplete models can provide valuable information about your test subject. It can validate or disprove your assumptions. Results from an incomplete model can help lead you to other useful tests -- both manual and automated.

Investigate using Hierarchical State Machines to simplify model definition and maintenance.
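
As a rough sketch of why hierarchy helps, assume a model in which each state names a parent and inherits the parent's transitions. The `HSM` table and the helper names are hypothetical. The point is that "logout" is written once on the parent state instead of once per screen.

```python
# Each state names its parent (None for top level) and its own transitions.
HSM = {
    "logged_out": {"parent": None, "on": {"login": "home"}},
    "logged_in": {"parent": None, "on": {"logout": "logged_out"}},
    "home": {"parent": "logged_in", "on": {"open_cart": "cart"}},
    "cart": {"parent": "logged_in", "on": {"close_cart": "home"}},
}

def transitions(hsm, state):
    """Collect a state's transitions, including those inherited from ancestors."""
    chain = []
    while state is not None:
        chain.append(state)
        state = hsm[state]["parent"]
    merged = {}
    for name in reversed(chain):   # ancestors first, so a child can override
        merged.update(hsm[name]["on"])
    return merged

def flatten(hsm):
    """Expand the hierarchy into a flat state -> {action: next state} table."""
    return {name: transitions(hsm, name) for name in hsm}

if __name__ == "__main__":
    for state, actions in flatten(HSM).items():
        print(state, actions)
```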

Build your models one step at a time.