Beyondrelational

Tuesday, March 17, 2009

Fundamentals of Web Application Performance Testing

Core Activities of Performance Testing

Performance testing is typically done to help identify bottlenecks in a system, establish a baseline for future testing, support a performance tuning effort, determine compliance with performance goals and requirements, and/or collect other performance-related data to help stakeholders make informed decisions related to the overall quality of the application being tested. In addition, the results from performance testing and analysis can help you to estimate the hardware configuration required to support the application(s) when you “go live” to production operation.



The performance testing approach used in this guide consists of the following activities:

  1. Activity 1. Identify the Test Environment. Identify the physical test environment and the production environment as well as the tools and resources available to the test team. The physical environment includes hardware, software, and network configurations. Having a thorough understanding of the entire test environment at the outset enables more efficient test design and planning and helps you identify testing challenges early in the project. In some situations, this process must be revisited periodically throughout the project’s life cycle.
  2. Activity 2. Identify Performance Acceptance Criteria. Identify the response time, throughput, and resource utilization goals and constraints. In general, response time is a user concern, throughput is a business concern, and resource utilization is a system concern. Additionally, identify project success criteria that may not be captured by those goals and constraints; for example, using performance tests to evaluate what combination of configuration settings will result in the most desirable performance characteristics.
  3. Activity 3. Plan and Design Tests. Identify key scenarios, determine variability among representative users and how to simulate that variability, define test data, and establish metrics to be collected. Consolidate this information into one or more models of system usage to be implemented, executed, and analyzed.
  4. Activity 4. Configure the Test Environment. Prepare the test environment, tools, and resources necessary to execute each strategy as features and components become available for test. Ensure that the test environment is instrumented for resource monitoring as necessary.
  5. Activity 5. Implement the Test Design. Develop the performance tests in accordance with the test design.
  6. Activity 6. Execute the Test. Run and monitor your tests. Validate the tests, test data, and results collection. Execute validated tests for analysis while monitoring the test and the test environment.
  7. Activity 7. Analyze Results, Report, and Retest. Consolidate and share results data. Analyze the data both individually and as a cross-functional team. Reprioritize the remaining tests and re-execute them as needed. When all of the metric values are within accepted limits, none of the set thresholds have been violated, and all of the desired information has been collected, you have finished testing that particular scenario on that particular configuration.
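
As a minimal sketch of how the acceptance criteria from Activity 2 might be checked during Activity 7, the Python snippet below compares measured values against thresholds. The metric names and limits are hypothetical placeholders rather than values from this guide; substitute the criteria agreed for your own project.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    """One acceptance threshold. Metric names and limits are hypothetical."""
    metric: str
    limit: float
    higher_is_better: bool = False  # e.g. throughput must meet or exceed its limit

    def passes(self, value: float) -> bool:
        if self.higher_is_better:
            return value >= self.limit
        return value <= self.limit


def violated(criteria, measured):
    """Return the names of metrics that are missing or outside their limits."""
    failures = []
    for criterion in criteria:
        value = measured.get(criterion.metric)
        if value is None or not criterion.passes(value):
            failures.append(criterion.metric)
    return failures


# Illustrative goals: response time (user concern), throughput (business
# concern), and CPU utilization (system concern).
CRITERIA = [
    Criterion("response_time_p95_s", 2.0),
    Criterion("throughput_rps", 150.0, higher_is_better=True),
    Criterion("cpu_utilization_pct", 75.0),
]

if __name__ == "__main__":
    run = {
        "response_time_p95_s": 1.7,
        "throughput_rps": 140.0,
        "cpu_utilization_pct": 68.0,
    }
    print(violated(CRITERIA, run))  # metrics that still need attention
```

An empty result means no threshold was violated for that scenario and configuration. A metric that was never collected is reported as a failure, so missing data is not mistaken for a pass.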

Why Do Performance Testing?

At the highest level, performance testing is almost always conducted to address one or more risks related to expense, opportunity costs, continuity, and/or corporate reputation. Some more specific reasons for conducting performance testing include:

Assessing release readiness by:

  • Enabling you to predict or estimate the performance characteristics of an application in production and evaluate whether or not to address performance concerns based on those predictions. These predictions are also valuable to the stakeholders who make decisions about whether an application is ready for release or capable of handling future growth, or whether it requires a performance improvement/hardware upgrade prior to release.
  • Providing data indicating the likelihood of user dissatisfaction with the performance characteristics of the system.
  • Providing data to aid in the prediction of revenue losses or damaged brand credibility due to scalability or stability issues, or due to users being dissatisfied with application response time.

Assessing infrastructure adequacy by:

  • Evaluating the adequacy of current capacity.
  • Determining the acceptability of stability.
  • Determining the capacity of the application’s infrastructure, as well as determining the future resources required to deliver acceptable application performance.
  • Comparing different system configurations to determine which works best for both the application and the business.
  • Verifying that the application exhibits the desired performance characteristics, within budgeted resource utilization constraints.

Assessing adequacy of developed software performance by:

  • Determining the application’s desired performance characteristics before and after changes to the software.
  • Providing comparisons between the application’s current and desired performance characteristics.

Improving the efficiency of performance tuning by:

  • Analyzing the behavior of the application at various load levels.
  • Identifying bottlenecks in the application.
  • Providing information related to the speed, scalability, and stability of a product prior to production release, thus enabling you to make informed decisions about whether and when to tune the system.

Project Context

For a performance testing project to be successful, both the approach to testing performance and the testing itself must be relevant to the context of the project. Without an understanding of the project context, performance testing is bound to focus on only those items that the performance tester or test team assumes to be important, as opposed to those that truly are important, frequently leading to wasted time, frustration, and conflicts.

The project context is nothing more than those things that are, or may become, relevant to achieving project success. This may include, but is not limited to:

  • The overall vision or intent of the project
  • Performance testing objectives
  • Performance success criteria
  • The development life cycle
  • The project schedule
  • The project budget
  • Available tools and environments
  • The skill set of the performance tester and the team
  • The priority of detected performance concerns
  • The business impact of deploying an application that performs poorly

Some examples of items that may be relevant to the performance-testing effort in your project context include:

  • Project vision. Before beginning performance testing, ensure that you understand the current project vision. The project vision is the foundation for determining what performance testing is necessary and valuable. Revisit the vision regularly, as it has the potential to change as well.
  • Purpose of the system. Understand the purpose of the application or system you are testing. This will help you identify the highest-priority performance characteristics on which you should focus your testing. You will need to know the system’s intent, the actual hardware and software architecture deployed, and the characteristics of the typical end user.
  • Customer or user expectations. Keep customer or user expectations in mind when planning performance testing. Remember that customer or user satisfaction is based on expectations, not simply compliance with explicitly stated requirements.
  • Business drivers. Understand the business drivers – such as business needs or opportunities – that are constrained to some degree by budget, schedule, and/or resources. It is important to meet your business requirements on time and within the available budget.
  • Reasons for testing performance. Understand the reasons for conducting performance testing very early in the project. Failing to do so might lead to ineffective performance testing. These reasons often go beyond a list of performance acceptance criteria and are bound to change or shift priority as the project progresses, so revisit them regularly as you and your team learn more about the application, its performance, and the customer or user.
  • Value that performance testing brings to the project. Understand the value that performance testing is expected to bring to the project by translating the project- and business-level objectives into specific, identifiable, and manageable performance testing activities. Coordinate and prioritize these activities to determine which performance testing activities are likely to add value.
  • Project management and staffing. Understand the team’s organization, operation, and communication techniques in order to conduct performance testing effectively.
  • Process. Understand your team’s process and interpret how that process applies to performance testing. If the team’s process documentation does not address performance testing directly, extrapolate the document to include performance testing to the best of your ability, and then get the revised document approved by the project manager and/or process engineer.
  • Compliance criteria. Understand the regulatory requirements related to your project. Obtain compliance documents to ensure that you have the specific language and context of any statement related to testing, as this information is critical to determining compliance tests and ensuring a compliant product. Also understand that the nature of performance testing makes it virtually impossible to follow the same processes that have been developed for functional testing.
  • Project schedule. Be aware of the project start and end dates, the hardware and environment availability dates, the flow of builds and releases, and any checkpoints and milestones in the project schedule.

The Relationship between Performance Testing and Tuning

When end-to-end performance testing reveals system or application characteristics that are deemed unacceptable, many teams shift their focus from performance testing to performance tuning, to discover what is necessary to make the application perform acceptably. A team may also shift its focus to tuning when performance criteria have been met but the team wants to reduce the amount of resources being used in order to increase platform headroom, decrease the volume of hardware needed, and/or further improve system performance.

Cooperative Effort

Although tuning is not the direct responsibility of most performance testers, the tuning process is most effective when it is a cooperative effort between all of those concerned with the application or system under test, including:

  • Product vendors
  • Architects
  • Developers
  • Testers
  • Database administrators
  • System administrators
  • Network administrators

Without the cooperation of a cross-functional team, it is almost impossible to gain the system-wide perspective necessary to resolve performance issues effectively or efficiently.

The performance tester, or performance testing team, is a critical component of this cooperative team as tuning typically requires additional monitoring of components, resources, and response times under a variety of load conditions and configurations. Generally speaking, it is the performance tester who has the tools and expertise to provide this information in an efficient manner, making the performance tester the enabler for tuning.

Tuning Process Overview

Tuning follows an iterative process that is usually separate from, but not independent of, the performance testing approach a project is following. The following is a brief overview of a typical tuning process:

  • Tests are conducted with the system or application deployed in a well-defined, controlled test environment in order to ensure that the configuration and test results at the start of the testing process are known and reproducible.
  • When the tests reveal performance characteristics deemed to be unacceptable, the performance testing and tuning team enters a diagnosis and remediation stage (tuning) that will require changes to be applied to the test environment and/or the application. It is not uncommon to make temporary changes that are deliberately designed to magnify an issue for diagnostic purposes, or to change the test environment to see if such changes lead to better performance.
  • The cooperative testing and tuning team is generally given full and exclusive control over the test environment in order to maximize the effectiveness of the tuning phase.
  • Performance tests are executed, or re-executed after each change to the test environment, in order to measure the impact of a remedial change.
  • The tuning process typically involves a rapid sequence of changes and tests. This process can take exponentially more time if a cooperative testing and tuning team is not fully available and dedicated to this effort while in a tuning phase.
  • When a tuning phase is complete, the test environment is generally reset to its initial state, the successful remedial changes are applied again, and any unsuccessful remedial changes (together with temporary instrumentation and diagnostic changes) are discarded. The performance test should then be repeated to prove that the correct changes have been identified. It might also be the case that the test environment itself is changed to reflect new expectations as to the minimal required production environment. This is unusual, but a potential outcome of the tuning effort.

Performance, Load, and Stress Testing

Performance tests are usually described as belonging to one of the following three categories:

  • Performance testing. This type of testing determines or validates the speed, scalability, and/or stability characteristics of the system or application under test. Performance is concerned with achieving response times, throughput, and resource-utilization levels that meet the performance objectives for the project or product. In this guide, performance testing represents the superset of all of the other subcategories of performance-related testing.
  • Load testing. This subcategory of performance testing is focused on determining or validating performance characteristics of the system or application under test when subjected to workloads and load volumes anticipated during production operations.
  • Stress testing. This subcategory of performance testing is focused on determining or validating performance characteristics of the system or application under test when subjected to conditions beyond those anticipated during production operations. Stress tests may also include tests focused on determining or validating performance characteristics of the system or application under test when subjected to other stressful conditions, such as limited memory, insufficient disk space, or server failure. These tests are designed to determine under what conditions an application will fail, how it will fail, and what indicators can be monitored to warn of an impending failure.

Baselines

Creating a baseline is the process of running a set of tests to capture performance metric data for the purpose of evaluating the effectiveness of subsequent performance-improving changes to the system or application. A critical aspect of a baseline is that all characteristics and configuration options except those specifically being varied for comparison must remain invariant. Once a part of the system that is not intentionally being varied for comparison to the baseline is changed, the baseline measurement is no longer a valid basis for comparison.

With respect to Web applications, you can use a baseline to determine whether performance is improving or declining and to find deviations across different builds and versions. For example, you could measure load time, the number of transactions processed per unit of time, the number of Web pages served per unit of time, and resource utilization such as memory usage and processor usage. Some considerations about using baselines include:

  • A baseline can be created for a system, component, or application. A baseline can also be created for different layers of the application, including a database, Web services, and so on.
  • A baseline can set the standard for comparison, to track future optimizations or regressions. It is important to validate that the baseline results are repeatable, because considerable fluctuations may occur across test results due to environment and workload characteristics.
  • Baselines can help identify changes in performance. Baselines can help product teams identify changes in performance that reflect degradation or optimization over the course of the development life cycle. Identifying these changes in comparison to a well-known state or configuration often makes resolving performance issues simpler.
  • Baseline assets should be reusable. Baselines are most valuable if they are created by using a set of reusable test assets. It is important that such tests accurately simulate repeatable and actionable workload characteristics.
  • Baselines are metrics. Baseline results can be articulated by using a broad set of key performance indicators, including response time, processor capacity, memory usage, disk capacity, and network bandwidth.
  • Baselines act as a shared frame of reference. Sharing baseline results allows your team to build a common store of acquired knowledge about the performance characteristics of an application or component.
  • Avoid over-generalizing your baselines. If your project entails a major reengineering of the application, you need to reestablish the baseline for testing that application. A baseline is application-specific and is most useful for comparing performance across different versions. Sometimes, subsequent versions of an application are so different that previous baselines are no longer valid for comparisons.
  • Know your application’s behavior. It is a good idea to ensure that you completely understand the behavior of the application at the time a baseline is created. Failure to do so before making changes to the system with a focus on optimization objectives is frequently counterproductive.
  • Baselines evolve. At times you will have to redefine your baseline because of changes that have been made to the system since the time the baseline was initially captured.
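
A baseline comparison of the kind described above might look like the following Python sketch. The metric names, the 5% tolerance, and the split into "lower is better" and "higher is better" metrics are illustrative assumptions. The tolerance stands in for the run-to-run fluctuation you should measure when validating that the baseline is repeatable.

```python
# Metrics where a smaller value is the better outcome (hypothetical names).
LOWER_IS_BETTER = {"load_time_s", "memory_mb", "cpu_pct"}


def compare_to_baseline(baseline, current, tolerance_pct=5.0):
    """Classify each metric as 'regression', 'improvement', or 'unchanged'.

    Assumes every baseline value is non-zero and that both runs used the
    same configuration apart from the change being evaluated.
    """
    report = {}
    for metric, base in baseline.items():
        change_pct = (current[metric] - base) / base * 100.0
        if abs(change_pct) <= tolerance_pct:
            verdict = "unchanged"
        else:
            if metric in LOWER_IS_BETTER:
                worse = change_pct > 0
            else:
                worse = change_pct < 0
            verdict = "regression" if worse else "improvement"
        report[metric] = (round(change_pct, 1), verdict)
    return report


if __name__ == "__main__":
    baseline = {"load_time_s": 2.0, "transactions_per_s": 100.0, "memory_mb": 512.0}
    build_42 = {"load_time_s": 2.5, "transactions_per_s": 120.0, "memory_mb": 520.0}
    for metric, (change, verdict) in compare_to_baseline(baseline, build_42).items():
        print(f"{metric}: {change:+.1f}% ({verdict})")
```

Changes inside the tolerance are reported as unchanged, so normal fluctuation between test runs is not misread as an optimization or a regression.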

Benchmarking

Benchmarking is the process of comparing your system’s performance against a baseline that you have created internally or against an industry standard endorsed by some other organization.

In the case of a Web application, you would run a set of tests that comply with the specifications of an industry benchmark in order to capture the performance metrics necessary to determine your application’s benchmark score. You can then compare your application against other systems or applications that also calculated their score for the same benchmark. You may choose to tune your application performance to achieve or surpass a certain benchmark score. Some considerations about benchmarking include:

  • You need to play by the rules. A benchmark is achieved by working with industry specifications or by porting an existing implementation to meet such standards. Benchmarking entails identifying all of the necessary components that will run together, the market where the product exists, and the specific metrics to be measured.
  • Because you play by the rules, you can be transparent. Benchmarking results can be published to the outside world. Since comparisons may be produced by your competitors, you will want to employ a strict set of standard approaches for testing and data to ensure reliable results.
  • You divulge results across various metrics. Performance metrics may involve load time, number of transactions processed per unit of time, Web pages accessed per unit of time, processor usage, memory usage, search times, and so on.

Summary

Performance testing helps to identify bottlenecks in a system, establish a baseline for future testing, support a performance tuning effort, and determine compliance with performance goals and requirements. Including performance testing very early in your development life cycle tends to add significant value to the project.

For a performance testing project to be successful, the testing must be relevant to the context of the project, which helps you to focus on the items that are truly important.

If the performance characteristics are unacceptable, you will typically want to shift the focus from performance testing to performance tuning in order to make the application perform acceptably. You will likely also focus on tuning if you want to reduce the amount of resources being used and/or further improve system performance.

Performance, load, and stress tests are subcategories of performance testing, each intended for a different purpose.

Creating a baseline against which to evaluate the effectiveness of subsequent performance-improving changes to the system or application will generally increase project efficiency.

Need For Load Testing

Any multi-user application will sooner or later face concurrent access. It is better to test how the application behaves under that load before deploying it and exposing it to multiple users. This process is load testing.

Generating the load with an automated tool rather than with people has several advantages:

  • Minimal infrastructure. It is not practical to gather hundreds or thousands of people for a concurrent-user test, and even less so to keep a large number of users testing for a long time.
  • Reliable. Tests perform precisely the same operations each time they are run, thereby eliminating human error.
  • Repeatable. We can test how the application reacts to repeated execution of the same operations over long durations, even many days.
  • Programmable. We can program sophisticated tests that bring out hidden information.
  • Comprehensive. We can build a suite of tests that covers every feature in our application.
  • Reusable. We can reuse tests on different versions of an application, even if the user interface changes.

Open STA Features

The following are the key features of Open STA:

  • Record and debug single-user scripts (Record Script)
  • Configure and run performance scenarios (Performance Tests and Schedule)
  • Run tests in a distributed manner (Name Server)
  • Analyze graphs (Performance Report)

Load Test Process Steps

  • Plan
  • Create scripts
  • Create scenarios
  • Run & monitor scenarios
  • Analyze results

Load Test Planning

  • Identify the most frequently used transactions
  • Identify the potential number of users
  • Identify the potential number of concurrent users
  • Apply a 10:1 or 5:1 ratio of logged-in users to concurrent users
  • Identify the production platform size and configuration
  • Identify the data to be used for testing
  • Identify the different real-world usage combinations of test scenarios
  • Identify the load test run duration
  • Identify what kind of information is transmitted between server and client
  • Plan load testing only after the product is functionally stable
  • Discuss with other stakeholders, such as the network, database, and server administrators, what information they need from the tests
  • Chalk out the software configurations/settings for the web server, app server, and database server
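
The 10:1 or 5:1 ratio in the planning list above can be turned into a quick estimate. The Python sketch below assumes the ratio means "logged-in users per concurrent user" and rounds up to be conservative; the figure of 2,000 logged-in users is purely illustrative.

```python
import math


def concurrent_users(logged_in_users, ratio=10):
    """Estimate concurrent users from logged-in users using an N:1 ratio.

    The interpretation (N logged-in users for every 1 concurrent user)
    is an assumption; confirm it against real usage records if available.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return math.ceil(logged_in_users / ratio)


if __name__ == "__main__":
    logged_in = 2000  # hypothetical figure
    for ratio in (10, 5):
        estimate = concurrent_users(logged_in, ratio)
        print(f"{ratio}:1 -> {estimate} concurrent users")
```

With 2,000 logged-in users, the two ratios give 200 and 400 concurrent users, which brackets the load level the scenarios should be sized for.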

Functional Testing Vs Load Testing

  • If the preconditions are met and the steps are followed, functional test results are well defined. Load test results are far harder to predict.
  • Functional test results typically do not change by more than 5% when moved from one configuration to another. Load test results may even nose-dive!
  • Functional tests run on a daily basis, whereas load tests run much less frequently.
  • Load test results also depend on database volume, and they change when the number of users changes.

Load Testing Checklist

  • Do we have a near-production hardware configuration? If not, what is the delta between the test hardware and the production hardware?
  • Can the tool record requests for the protocols the application uses (e.g. HTTPS) and replay them?
  • Has the product been functionally cleared before load testing?
  • Can we get user counts from the customer, based on past records?
  • Does the data pool contain unique data?
  • Is trace logging enabled for the database and web servers?
  • Are requests distributed equally across the different servers? Is there a load balancer?
  • Can the tool mimic different line speeds?
  • Can the tool mimic different browser versions?
  • Can the tool log messages selectively?
  • Can the tool export data in xls format?
  • Can the tool auto-synchronize concurrent requests?
  • If the application uses queues, is the queue size monitored during test runs?
  • Do the tests need runs with and without proxy servers?
  • Do the tests need runs with and without firewalls?
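
The data pool question in the checklist above can be answered before a run with a small script. The following Python sketch assumes the pool is a simple list of values (for example, login names read from a file) and that each virtual user consumes one value per iteration; both are assumptions about how your scripts use test data.

```python
from collections import Counter


def check_data_pool(values, virtual_users, iterations_per_user=1):
    """Report whether a data pool holds enough unique values for a test run."""
    needed = virtual_users * iterations_per_user
    counts = Counter(values)
    duplicates = sorted(value for value, n in counts.items() if n > 1)
    return {
        "unique": len(counts),         # distinct values available
        "needed": needed,              # values the run will consume
        "duplicates": duplicates,      # values that appear more than once
        "sufficient": len(counts) >= needed,
    }


if __name__ == "__main__":
    pool = ["user01", "user02", "user02", "user03"]  # hypothetical data
    print(check_data_pool(pool, virtual_users=2, iterations_per_user=2))
```

Here the pool has only three distinct values for a run that needs four, so the check flags it as insufficient and names the duplicated entry.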

Load Test Guidelines

  • Number of users vs. response time must not be linear.
  • Run stress tests for short durations rather than long ones.
  • To the extent possible, let the data pool contain more unique data than is needed.
  • Do not operate the load-generating client machines beyond 80% of CPU or memory capacity.
  • Avoid enabling detailed logging in the tool, because it increases disk I/O on the client machines.
  • Parameterize scripts so that the application's URL is configurable. Then, if the application is moved from one box to another, the script can be reused.
  • Wherever needed, use rendezvous points to synchronize requests before any form submission in the script. This ensures simultaneous hits at the time of form submission.
  • If possible, disable downloading of image files, because in real-world usage image files are not downloaded every time.
  • Check the consistency of response time over the elapsed time of a run and compare it across different test runs.
  • All successful requests must have been submitted, and the log files must match. If the requests trigger database operations, those operations must have been recorded in the database.
  • The queue size must be minimal at any given point in time.
  • Most of the time, the database and the business logic layer should be suspected before the web server.
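
To illustrate the rendezvous-point guideline above, here is a minimal Python sketch that uses `threading.Barrier` to hold simulated users until all of them are ready to submit. It is not how any particular load testing tool implements rendezvous points; `action` is a hypothetical stand-in for the form submission request.

```python
import threading
import time


def run_with_rendezvous(user_count, action):
    """Release all simulated users into `action` at the same moment.

    Returns the per-user results and the spread (in seconds) between the
    first and last release from the rendezvous point.
    """
    barrier = threading.Barrier(user_count)
    release_times = [None] * user_count
    results = [None] * user_count

    def virtual_user(index):
        # Steps before the submission (login, navigation) would go here.
        barrier.wait()  # rendezvous: blocks until every user has arrived
        release_times[index] = time.monotonic()
        results[index] = action(index)

    threads = [
        threading.Thread(target=virtual_user, args=(i,))
        for i in range(user_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, max(release_times) - min(release_times)


if __name__ == "__main__":
    # The lambda is a placeholder for a real form submission.
    results, spread = run_with_rendezvous(5, lambda i: f"submitted by user {i}")
    print(results)
    print(f"release spread: {spread:.4f} s")
```

The spread value gives a rough indication of how close to simultaneous the submissions were on the load-generating machine.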
