MCE Stress Test HOWTO
====================

Oct 10th, 2009

Haicheng Li


Abstract
--------

This document explains the design and structure of MCE stress test suite,
the kernel configurations and user space tools required for automated
stress testing, as well as usage guide and etc.


0. Quick Shortcut
-----------------

- Install the Linux kernel (2.6.32 or newer) with full MCA recovery support.
  Make sure following configuration options are enabled:

  CONFIG_X86_MCE=y
  CONFIG_MEMORY_FAILURE=y

  With these two options enabled, you can do stress testing thru madvise
  syscall (sec 4.1).

- Install page-types tool (sec 3.3), which is accompanied with Linux kernel
  source (2.6.32 or newer).

  # cd $KERNEL_SRC/Documentation/vm/
  # gcc -o page-types page-types.c 
  # cp page-types /usr/bin/

- Get latest LTP (Linux Test Project) image from http://ltp.sf.net. Refer
  to INSTALL of LTP to install LTP on your machine.

- Build and run stress testing

  # make
  # cd stress
  # ./hwpoison.sh -d $YOUR_PARTITION -M -o $YOUR_LTP_DIR -N

  Note here, '-d $YOUR_PARTITION' is a mandatory option. Test will create 
  all temporary files on $YOUR_PARTITION, and error injection will just
  affect the pages associated with $$YOUR_PARTITION. So you must provide a
  free disk partition to stress test driver!

  This will do the stress testing thru madvise syscall (sec 4.1). However,
  there are more advanced test methods provided (sec 4.2, 4.3).

Note, for all examples in the rest of this doc, it is supposed that $PWD is
the stress subdir.  

1. Overview
-----------

The MCE stress test suite is a collection of tools and test scripts, which
intends to achieve stress testing on Linux kernel MCA high level handlers
that include HWPosion page recovery, soft page offline, and so on.

In general, this test suite is designed to do stress testing thru various
test interfaces, i.e. madvise syscall, HWPoison page injector, and APEI
injector (see ACPI4.0 spec). And it's able to support most of popular
Linux File Systems (FS), that is, there is an option for user to specify which
FS type they want the test to be running on.

If you just want to start testing as quickly as possible, you can skip
section 2 & 3, just go to section 4 directly.


2. Design Details
-----------------

The MCE stress test suite consists of four parts: test driver, workload
controller, customized workloads, and background workloads.

The main test idea is described as below:
- Test driver launchs various customized workloads to continuously generate
  lots of pages with expected page states, Note, all of these workloads know
  about their expected results that should not be affected by Linux MCE high
  level handlers.
- Then test driver injects MCE errors to these pages thru either madvise 
  syscall or HWPoison injector or APEI injector. While Linux Kernel handling
  these MCE errors, all the workloads continue running normally,
- After long time running, test driver will collect test result of each
  workload to see if any unexpected failures happened. In such a way, it can 
  decide if any bug is found.
- If any system panics or FS corruption happens, that means there must be a
  bug. It's the bottom line to decide if test gets pass.

2.1 Test Driver

Test driver (a.k.a hwpoison.sh) drives the whole test procedure. It's
responsible for managing test environment, setting up error injection 
interface, controlling test progress, launching workloads, injecting page 
errors, as well as recording test logs and reporting test result.

For detailed usage of hwpoison.sh test driver, please refer to:
# ./hwpoison.sh -h

2.2 Workload Controller

Workload controller needs to have various test workloads running parallelly
and continuously within a required duration time. We select ltp-pan
program of Linux Test Project (LTP) as the workload controller of this
stress test suite.

Test driver (hwpoison.sh) interacts with ltp-pan in following ways:
- hwpoison.sh generates a test config file that lists the workload type
  to be launched by ltp-pan.
- hwpoison also passes test duration time and other workload specific
  parameters to ltp-pan via test config file.
- ltp-pan makes each workload run and get finished in time, then test driver
  can get the result of each workload via corresponding result files.
- finally, hwpoison.sh will decide the overall test result based on each
  workload result, and report final result out.

2.3 Customized Workloads

There are three types of customized workloads, which are intended to generate 
pages with various page state.

* Type0: page-poisoning workload, meant to cover:
  - anonymous pages operations.
  - file data operations.

* Type1: fs-metadata workload, meant to cover:
  - inode operations.

* Type2: fs_type specific workload, meant to cover:
  - extended functions of some special FS.

2.4 Background Workloads

LTP is selected as the background workload to simulate normal system
operations in background while stress testing is running.

Besides LTP, there are also some alternatives, like AIM. We might extend more
background workloads in future.

2.5 Test Result

How to determine that stress testing gets pass?
- at least no kernel panics happens during stress testing.
- fsck on the target disk at the end of stress testing should get pass.
- there is no failure found by customized workloads, especially for
  page-poisoning workload.

Where to get detailed test result?
- When stress testing is done, the general test result is recorded in
  result/hwpoison.result, and the general test log is in result/hwpoison.log.
  However, you can specify them in following way:
  # hwpoison.sh -r $YOUR_RESULT -l $YOUR_LOG
- The test result and test log of each workload are recorded as
  log/$workload/$workload.result and log/$workload/$workload.log.
  For example, for page-poisoning workload, its test result and test logs are
  log/page-poisoning/page-poisoning.result and 
  log/page-poisoning/page-poisoning.log.
- Besides, under each workload result dir, you can find other extra logs
  like pan_log, pan_output and etc. These logs are generated by ltp-pan 
  workload controller. Usually they can help you understand what has been 
  going on with ltp-pan while workload is running. Pls. refer to ltp-pan doc
  for details.


3. Tools
--------

3.1 page-poisoning

It is the page-poisoning workload. page-poisoning workload is an extension of
tinjpage test program with a multi-process model. It spawns thousands of
processes that inject HWPosion error to various pages simultaneously thru
madvise syscall. Then it checks if these errors get handled correctly, 
i.e. whether each test process receives or doesn't receive SIGBUS signal as 
expected.

For more info about page-poisoning workload, pls. read through README file
under stress/tools/page-poisoning/.

3.2 fs-metadata

It is the fs-metadata workload. fs-metadata is designed to test i-node
operations with heavy workload and make sure every i-node operation gets
the expected result. In details, it firstly generates a huge directory
hierarchy on the target disk, then it performs unlink operations on this
directory hierarchy and duplicate a copy of the directory, finally it
checks if these two directories are same as expected.

For more info about fs-metadata workload, pls. read through README file
under stress/tools/fs-metadata/.

3.3 page-types

page-types is a tool to query the page type of every memory page in the
system. We use it to filter out pages with required page types. Test will
inject error to these pages via error injector, although the page filter
of HWPosion handler in Linux Kernel will filter them out for a second
time. Note, the reason we need to use page-types to do first time filtering
is just about performance.

To install page-types on your test machine:

  # cd $KERNEL_SRC/Documentation/vm/
  # gcc -o page-types page-types.c 
  # cp page-types /usr/bin/

3.4 ltp-pan

It's the workload controller of this stress test suite. In fact, ltp-pan
is the test harness of LTP (Linux Test Project), and is included in
LTP package. For more information, please refer to ltp-pan document of LTP.


4. Usage Guide
--------------

This section is trying to show you how to conduct the stress testing thru
various test interfaces. 

As an example, we choose to run stress testing based on partition /dev/sda1 
for 1 hour. Note, we've installed LTP to /ltp.  

4.1 Stress Test thru Madvise Syscall.

To run this stress testing, you need to strictly follow below test
instructions. 

* Test instructions:

- make sure following kernel options are enabled:
  CONFIG_X86_MCE=y
  CONFIG_MEMORY_FAILURE=y

- build and run stress testing
  # make
  # ./hwpoison.sh -d $YOUR_PARTITION -M -o $YOUR_LTP_DIR

* Example: 

- launch testing
  # ./hwpoison.sh -d /dev/sda1 -M -t 3600

- general test results
  result: result/hwpoison.result
  logs: result/hwpoison.log

- detailed workload results
  result: log/page-poisoning/page-poisoning.result
  log: log/page-poisoning/page-poisoning.log

4.2 Stress Test thru HWPosion Page Injector

This is the default test method of this stress test suite.

To run this stress testing, you need to strictly follow below test
instructions. 

* Test instructions:

- make sure following kernel options are enabled:
  CONFIG_X86_MCE=y
  CONFIG_MEMORY_FAILURE=y
  CONFIG_DEBUG_KERNEL=y
  CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y
  CONFIG_HWPOISON_INJECT=y

- build and run stress testing
  # make
  # ./hwpoison.sh -d $YOUR_PARTITION -o $YOUR_LTP_DIR -L 

* Example: 

- launch testing
  # ./hwpoison.sh -d /dev/sda1 -t 3600 -L

- general test results
  result: result/hwpoison.result
  logs: result/hwpoison.log

- detailed workload results
  fs-metadata result: log/fs-metadata/fs-metadata.result
  fs-metadata log: log/fs-metadata/fs-metadata.log
  ltp result: log/ltp/ltp.result
  ltp log: log/ltp/ltp.log
  fs-specific result: log/fs-specific/fs-specific.result
  fs-specific log: log/fs-specific/fs-specific.log

4.3 Stress Test thru APEI Injector

To run this stress testing, you need to follow below test instructions. 

* Test instructions:

- make sure following kernel options are enabled:
  CONFIG_X86_MCE=y
  CONFIG_X86_MCE_INTEL=y
  CONFIG_MEMORY_FAILURE=y
  CONFIG_ACPI_APEI=y
  CONFIG_ACPI_APEI_EINJ=y

- build and run stress testing
  # make
  # ./hwpoison.sh -d $YOUR_PARTITION -o $YOUR_LTP_DIR -L -A

* Example: 

- launch testing
  # ./hwpoison.sh -d /dev/sda1 -t 3600 -L -A

- general test results
  result: result/hwpoison.result
  logs: result/hwpoison.log

- detailed workload results
  fs-metadata result: log/fs-metadata/fs-metadata.result
  fs-metadata log: log/fs-metadata/fs-metadata.log
  ltp result: log/ltp/ltp.result
  ltp log: log/ltp/ltp.log
  fs-specific result: log/fs-specific/fs-specific.result
  fs-specific log: log/fs-specific/fs-specific.log


5. FAQs
-------

Here is a collection of frequently asked questions:

Q: How to tell test driver not to format my disk partition?
A: Use the option '-N'.

Q: Can three types of tests run on same sytem simultaneously?
A: No. There are limitations in Linux Kernel HWPoison page filtering.  

Q: Can I run this stress testing on multiple disks parallely?
A: Yes. But it requires updated Kernel patches for HWPosion page filtering.
   Now, it just supports one same test with same pagetype flags specified. 

