Tutorial 1: Getting Started with ReFrame

New in version 3.1.

This tutorial will give you a first overview of ReFrame and will acquaint you with its basic concepts. We will start with a simple “Hello, World!” test running with the default configuration and we will expand the example along the way. We will also explore performance tests and port our tests to an HPC cluster. The examples of this tutorial can be found under tutorials/basics/.

Getting Ready

All you need to start off with this tutorial is a working installation of ReFrame. If you haven’t installed it yet, you only need Python 3.6 or later; then follow the steps below:

git clone https://github.com/reframe-hpc/reframe.git
cd reframe
./bootstrap.sh
./bin/reframe -V

We’re now good to go!

The “Hello, World!” test

As simple as it may sound, a series of “naive” “Hello, World!” tests can reveal lots of regressions in the programming environment of HPC clusters, and the most basic of them is also the perfect starting point for this tutorial. Here is the C version:

cat tutorials/basics/hello/src/hello.c
#include <stdio.h>


int main()
{
    printf("Hello, World!\n");
    return 0;
}

And here is the ReFrame version of it:

cat tutorials/basics/hello/hello1.py
import reframe as rfm
import reframe.utility.sanity as sn


@rfm.simple_test
class HelloTest(rfm.RegressionTest):
    valid_systems = ['*']
    valid_prog_environs = ['*']
    sourcepath = 'hello.c'

    @sanity_function
    def assert_hello(self):
        return sn.assert_found(r'Hello, World\!', self.stdout)

Regression tests in ReFrame are specially decorated classes that ultimately derive from RegressionTest. The @simple_test decorator registers a test class with ReFrame and makes it available to the framework. The test variables are essentially attributes of the test class and can be defined directly in the class body. Each test must always set the valid_systems and valid_prog_environs attributes. These define the systems and/or system partitions that this test is allowed to run on, as well as the programming environments that it is valid for. A programming environment is essentially a compiler toolchain. We will see later on in the tutorial how a programming environment can be defined. The generic configuration of ReFrame assumes a single programming environment named builtin which comprises a C compiler that can be invoked with cc. In this particular test we set both these attributes to ['*'], essentially allowing this test to run everywhere.

A ReFrame test must either define an executable to execute or a source file (or source code) to be compiled. In this example, it is enough to define the source file of our hello program. ReFrame knows the executable that was produced and will use that to run the test.

Finally, every regression test must always decorate a member function as the test’s @sanity_function. This decorated function is converted into a lazily evaluated expression that asserts the sanity of the test. In this particular case, the specified sanity function checks that the executable has produced the desired phrase in the test’s standard output (stdout). Note that ReFrame does not determine the success of a test by its exit code. Instead, the assessment of success is the responsibility of the test itself.

Before running the test let’s inspect the directory structure surrounding it:

tutorials/basics/hello
├── hello1.py
└── src
    └── hello.c

Our test is hello1.py and its resources, i.e., the hello.c source file, are located inside the src/ subdirectory. If not specified otherwise, the sourcepath attribute is always resolved relative to src/. There is full flexibility in organizing the tests. Multiple tests may be defined in a single file or they may be split in multiple files. Similarly, several tests may share the same resources directory or they can simply have their own.

Now it’s time to run our first test:

./bin/reframe -c tutorials/basics/hello/hello1.py -r
[ReFrame Setup]
  version:           4.0.0-dev.2+5ea6b7a6
  command:           './bin/reframe -c tutorials/basics/hello/hello1.py -r'
  launched by:       user@host
  working directory: '/home/user/Repositories/reframe'
  settings files:    '<builtin>'
  check search path: '/home/user/Repositories/reframe/tutorials/basics/hello/hello1.py'
  stage directory:   '/home/user/Repositories/reframe/stage'
  output directory:  '/home/user/Repositories/reframe/output'
  log files:         '/var/folders/h7/k7cgrdl13r996m4dmsvjq7v80000gp/T/rfm-tgqpdq_b.log'

[==========] Running 1 check(s)
[==========] Started on Sat Nov 12 19:00:44 2022 

[----------] start processing checks
[ RUN      ] HelloTest /2b3e4546 @generic:default+builtin
[       OK ] (1/1) HelloTest /2b3e4546 @generic:default+builtin
[----------] all spawned checks have finished

[  PASSED  ] Ran 1/1 test case(s) from 1 check(s) (0 failure(s), 0 skipped)
[==========] Finished on Sat Nov 12 19:00:45 2022 
Run report saved in '/home/user/.reframe/reports/run-report-319.json'
Log file(s) saved in '/var/folders/h7/k7cgrdl13r996m4dmsvjq7v80000gp/T/rfm-tgqpdq_b.log'

Perfect! We have verified that we have a functioning C compiler in our system.

When ReFrame runs a test, it copies all its resources to a stage directory and performs all test-related operations (compilation, run, sanity checking etc.) from that directory. On successful outcome of the test, the stage directory is removed by default, but interesting files are copied to an output directory for archiving and later inspection. The prefixes of these directories are printed in the first section of the output. Let’s inspect what files ReFrame produced for this test:

ls output/generic/default/builtin/HelloTest/
rfm_HelloTest_build.err rfm_HelloTest_build.sh  rfm_HelloTest_job.out
rfm_HelloTest_build.out rfm_HelloTest_job.err   rfm_HelloTest_job.sh

In the output directory of the test, ReFrame stores the build and run scripts it generated for building and running the code, along with their standard output and error. All these files are prefixed with rfm_.

ReFrame also generates a detailed JSON report for the whole regression testing session. By default, this is stored inside the ${HOME}/.reframe/reports directory and a new report file is generated every time ReFrame is run, but you can control this through the --report-file command-line option.
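For example, to write the report of a single run to a custom location, you could pass the option explicitly (the path below is purely illustrative):

./bin/reframe -c tutorials/basics/hello/hello1.py -r --report-file=/tmp/hello-report.json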

Here are the contents of the report file for our first ReFrame run:

cat ~/.reframe/reports/run-report.json
{
  "session_info": {
    "cmdline": "./bin/reframe -c tutorials/basics/hello/hello1.py -r",
    "config_file": "<builtin>",
    "data_version": "2.0",
    "hostname": "host",
    "prefix_output": "/path/to/reframe/output",
    "prefix_stage": "/path/to/reframe/stage",
    "user": "user",
    "version": "3.10.0-dev.3+c22440c1",
    "workdir": "/path/to/reframe",
    "time_start": "2022-01-22T13:21:50+0100",
    "time_end": "2022-01-22T13:21:51+0100",
    "time_elapsed": 0.8124568462371826,
    "num_cases": 1,
    "num_failures": 0
  },
  "runs": [
    {
      "num_cases": 1,
      "num_failures": 0,
      "num_aborted": 0,
      "num_skipped": 0,
      "runid": 0,
      "testcases": [
        {
          "build_stderr": "rfm_HelloTest_build.err",
          "build_stdout": "rfm_HelloTest_build.out",
          "dependencies_actual": [],
          "dependencies_conceptual": [],
          "description": "HelloTest",
          "display_name": "HelloTest",
          "filename": "/path/to/reframe/tutorials/basics/hello/hello1.py",
          "environment": "builtin",
          "fail_phase": null,
          "fail_reason": null,
          "jobid": "43152",
          "job_stderr": "rfm_HelloTest_job.err",
          "job_stdout": "rfm_HelloTest_job.out",
          "maintainers": [],
          "name": "HelloTest",
          "nodelist": [
            "tresa.local"
          ],
          "outputdir": "/path/to/reframe/output/generic/default/builtin/HelloTest",
          "perfvars": null,
          "prefix": "/path/to/reframe/tutorials/basics/hello",
          "result": "success",
          "stagedir": "/path/to/reframe/stage/generic/default/builtin/HelloTest",
          "scheduler": "local",
          "system": "generic:default",
          "tags": [],
          "time_compile": 0.27164483070373535,
          "time_performance": 0.00010180473327636719,
          "time_run": 0.3764667510986328,
          "time_sanity": 0.0006909370422363281,
          "time_setup": 0.007919073104858398,
          "time_total": 0.8006880283355713,
          "unique_name": "HelloTest"
        }
      ]
    }
  ],
  "restored_cases": []
}

More of “Hello, World!”

We want to extend our test and run a C++ “Hello, World!” as well. We could simply copy-paste hello1.py and change the source file extension to refer to the C++ source code, but this kind of duplication is something we generally want to avoid. ReFrame allows you to avoid it in several ways, the most compact of which is to define the new test as follows:

cat tutorials/basics/hello/hello2.py
import reframe as rfm
import reframe.utility.sanity as sn


@rfm.simple_test
class HelloMultiLangTest(rfm.RegressionTest):
    lang = parameter(['c', 'cpp'])

    valid_systems = ['*']
    valid_prog_environs = ['*']

    @run_before('compile')
    def set_sourcepath(self):
        self.sourcepath = f'hello.{self.lang}'

    @sanity_function
    def assert_hello(self):
        return sn.assert_found(r'Hello, World\!', self.stdout)

This test extends the hello1.py test by defining the lang parameter with the parameter() built-in. The framework will create as many test instantiations as there are parameter values, each one setting the lang attribute to a single value. Hence, this example creates two test instances, one with lang='c' and another with lang='cpp'. The parameter is available as an attribute of the test instance and, in this example, we use it to set the extension of the source file. However, at the class level, a test parameter holds all of its possible values; it is assigned a single value only after the class is instantiated. Therefore, the sourcepath variable, which depends on this parameter, also needs to be set after the class is instantiated. The simplest way to do this would be to move the sourcepath assignment into the __init__() method, as shown in the code snippet below, but this has some disadvantages when writing larger tests.

def __init__(self):
    self.sourcepath = f'hello.{self.lang}'

For example, when writing a base class for a test that puts a large amount of code into the __init__() method, a derived class may want to partially override that code. This would force us to understand the full implementation of the base class’s __init__(), even though we may only be interested in overriding a small part of it. Doable, but not ideal. Instead, through pipeline hooks, ReFrame provides a mechanism to attach independent functions that execute at a given point in time, before the data they set is required by the test. This is exactly what we want to do here, and since we know that the test sources are needed to compile the code, we move the sourcepath assignment into a pre-compile hook.

    @run_before('compile')
    def set_sourcepath(self):
        self.sourcepath = f'hello.{self.lang}'

The use of hooks is covered in more detail later on, but for now, let’s just think of them as a way to defer the execution of a function to a given stage of the test’s pipeline. By using hooks, any user could now derive from this class and attach other hooks (for example, adding some compiler flags) without having to worry about overriding the base method that sets the sourcepath variable.
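As a minimal sketch of this idea, a derived test placed in the same file could attach an extra pre-compile hook of its own; the class name and flags below are purely illustrative and not part of the tutorial sources:

@rfm.simple_test
class HelloMultiLangWarnTest(HelloMultiLangTest):
    # Reuse everything from HelloMultiLangTest; its set_sourcepath() hook
    # still runs. We only add an extra hook that enables compiler warnings.
    build_system = 'SingleSource'

    @run_before('compile')
    def add_warning_flags(self):
        self.build_system.cflags += ['-Wall']
        self.build_system.cxxflags += ['-Wall']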

Let’s run the test now:

./bin/reframe -c tutorials/basics/hello/hello2.py -r
[ReFrame Setup]
  version:           4.0.0-dev.2+5ea6b7a6
  command:           './bin/reframe -c tutorials/basics/hello/hello2.py -r'
  launched by:       user@host
  working directory: '/home/user/Repositories/reframe'
  settings files:    '<builtin>'
  check search path: '/home/user/Repositories/reframe/tutorials/basics/hello/hello2.py'
  stage directory:   '/home/user/Repositories/reframe/stage'
  output directory:  '/home/user/Repositories/reframe/output'
  log files:         '/var/folders/h7/k7cgrdl13r996m4dmsvjq7v80000gp/T/rfm-krmo7oc3.log'

[==========] Running 2 check(s)
[==========] Started on Sat Nov 12 19:00:45 2022 

[----------] start processing checks
[ RUN      ] HelloMultiLangTest %lang=cpp /71bf65a3 @generic:default+builtin
[ RUN      ] HelloMultiLangTest %lang=c /7cfa870e @generic:default+builtin
[     FAIL ] (1/2) HelloMultiLangTest %lang=cpp /71bf65a3 @generic:default+builtin
==> test failed during 'compile': test staged in '/home/user/Repositories/reframe/stage/generic/default/builtin/HelloMultiLangTest_71bf65a3'
rfm_job.out
[       OK ] (2/2) HelloMultiLangTest %lang=c /7cfa870e @generic:default+builtin
[----------] all spawned checks have finished

[  FAILED  ] Ran 2/2 test case(s) from 2 check(s) (1 failure(s), 0 skipped)
[==========] Finished on Sat Nov 12 19:00:46 2022 

================================================================================
SUMMARY OF FAILURES
--------------------------------------------------------------------------------
FAILURE INFO for HelloMultiLangTest_1 
  * Expanded name: HelloMultiLangTest %lang=cpp
  * Description: 
  * System partition: generic:default
  * Environment: builtin
  * Stage directory: /home/user/Repositories/reframe/stage/generic/default/builtin/HelloMultiLangTest_71bf65a3
  * Node list: 
  * Job type: local (id=None)
  * Dependencies (conceptual): []
  * Dependencies (actual): []
  * Maintainers: []
  * Failing phase: compile
  * Rerun with '-n /71bf65a3 -p builtin --system generic:default -r'
  * Reason: build system error: I do not know how to compile a C++ program
--------------------------------------------------------------------------------
Run report saved in '/home/user/.reframe/reports/run-report-320.json'
Log file(s) saved in '/var/folders/h7/k7cgrdl13r996m4dmsvjq7v80000gp/T/rfm-krmo7oc3.log'

Oops! The C++ test has failed. ReFrame complains that it does not know how to compile a C++ program. Remember our discussion above that the default configuration of ReFrame defines a minimal programming environment named builtin which only knows of a cc compiler. We will fix that in a moment, but before doing that it’s worth looking into the failure information provided for the test. For each failed test, ReFrame will print a short summary with information about the system partition and the programming environment that the test failed for, its job or process id (if any), the nodes it was running on, its stage directory, the phase that failed etc.

When a test fails, its stage directory is kept intact, so that users can inspect the failure and try to reproduce it manually. In this case, the stage directory contains only the “Hello, World!” source files, since ReFrame could not produce a build script for the C++ test, as it does not yet know how to compile a C++ program.

ls stage/generic/default/builtin/HelloMultiLangTest_71bf65a3
hello.c   hello.cpp

Let’s go on and fix this failure by defining a new system and programming environments for the machine we are running on. For this we need to create our own configuration file.

vi tutorials/config/tresa.py

Here is what we need to type:

# Copyright 2016-2024 Swiss National Supercomputing Centre (CSCS/ETH Zurich)
# and other ReFrame Project Developers. See the top-level LICENSE file for
# details.
#
# SPDX-License-Identifier: BSD-3-Clause


site_configuration = {
    'systems': [
        {
            'name': 'tresa',
            'descr': 'My Mac',
            'hostnames': ['tresa'],
            'modules_system': 'nomod',
            'partitions': [
                {
                    'name': 'default',
                    'scheduler': 'local',
                    'launcher': 'local',
                    'environs': ['gnu', 'clang'],
                }
            ]
        }
    ],
    'environments': [
        {
            'name': 'gnu',
            'cc': 'gcc-12',
            'cxx': 'g++-12',
            'ftn': 'gfortran-12',
            'target_systems': ['tresa']
        },
        {
            'name': 'clang',
            'cc': 'clang',
            'cxx': 'clang++',
            'ftn': '',
            'target_systems': ['tresa']
        },
    ]
}

We define a system named tresa that has one partition named default. This partition does not use any workload manager; instead, it launches any jobs locally as OS processes. Two programming environments are relevant for that partition, namely gnu and clang, which are defined in the environments section of the configuration file. The gnu programming environment provides GCC 12, whereas the clang one provides the Clang compiler of the system. Notice how you can define the actual commands for invoking the C, C++ and Fortran compilers in each programming environment. As soon as a programming environment defines the different compilers, ReFrame will automatically pick the right compiler based on the source file extension. In addition to C, C++ and Fortran programs, ReFrame also recognizes the .cu extension and will try to invoke the nvcc compiler for CUDA programs. Note also that we set the target_systems for each environment definition. This restricts the environment definition to the specified systems only. ReFrame will always pick the definition that is the closest match for the current system. Restricting the environment definitions is generally a good practice if you plan to define multiple systems in multiple configuration files, as ReFrame would otherwise complain that an environment is redefined. On the other hand, if you want to provide generic definitions of environments that are valid for multiple systems, you may skip it. This is what the builtin configuration of ReFrame does for its generic builtin environment.

Finally, the new system that we defined is identified by the hostname tresa (see the hostnames system configuration parameter) and it does not use any environment modules system (see the modules_system configuration parameter). The hostnames attribute helps ReFrame automatically pick the right configuration when running on that system. Notice how the generic system matches any hostname, so that it acts as a fallback system.

Note

Multiple systems may be defined in a configuration file, in which case they are tried in order and the first match is picked. This means that systems with more generic hostnames patterns should go to the end of the list, as sketched below.
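The following abbreviated configuration is only a sketch of this ordering: the specific system comes first and a catch-all fallback, using a pattern such as '.*', comes last.

site_configuration = {
    'systems': [
        {
            'name': 'tresa',
            'hostnames': ['tresa'],   # specific pattern, tried first
            # ... partitions, etc.
        },
        {
            'name': 'generic',
            'hostnames': ['.*'],      # matches anything, so keep it last
            # ... partitions, etc.
        },
    ],
}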

The Configuring ReFrame for Your Site page describes the configuration file in more detail and the Configuration Reference provides a complete reference guide of all the configuration options of ReFrame.

Let’s now rerun our “Hello, World!” tests:

./bin/reframe -C tutorials/config/tresa.py -c tutorials/basics/hello/hello2.py -r
[ReFrame Setup]
  version:           4.0.0-dev.2+5ea6b7a6
  command:           './bin/reframe -C tutorials/config/tresa.py -c tutorials/basics/hello/hello2.py -r'
  launched by:       user@host
  working directory: '/home/user/Repositories/reframe'
  settings files:    '<builtin>', 'tutorials/config/tresa.py'
  check search path: '/home/user/Repositories/reframe/tutorials/basics/hello/hello2.py'
  stage directory:   '/home/user/Repositories/reframe/stage'
  output directory:  '/home/user/Repositories/reframe/output'
  log files:         '/var/folders/h7/k7cgrdl13r996m4dmsvjq7v80000gp/T/rfm-e3dlf19_.log'

[==========] Running 2 check(s)
[==========] Started on Sat Nov 12 19:00:46 2022 

[----------] start processing checks
[ RUN      ] HelloMultiLangTest %lang=cpp /71bf65a3 @tresa:default+gnu
[ RUN      ] HelloMultiLangTest %lang=cpp /71bf65a3 @tresa:default+clang
[ RUN      ] HelloMultiLangTest %lang=c /7cfa870e @tresa:default+gnu
[ RUN      ] HelloMultiLangTest %lang=c /7cfa870e @tresa:default+clang
rfm_job.out
[       OK ] (1/4) HelloMultiLangTest %lang=c /7cfa870e @tresa:default+gnu
rfm_job.out
[       OK ] (2/4) HelloMultiLangTest %lang=c /7cfa870e @tresa:default+clang
rfm_job.out
[       OK ] (3/4) HelloMultiLangTest %lang=cpp /71bf65a3 @tresa:default+gnu
rfm_job.out
[       OK ] (4/4) HelloMultiLangTest %lang=cpp /71bf65a3 @tresa:default+clang
[----------] all spawned checks have finished

[  PASSED  ] Ran 4/4 test case(s) from 2 check(s) (0 failure(s), 0 skipped)
[==========] Finished on Sat Nov 12 19:00:48 2022 
Run report saved in '/home/user/.reframe/reports/run-report-321.json'
Log file(s) saved in '/var/folders/h7/k7cgrdl13r996m4dmsvjq7v80000gp/T/rfm-e3dlf19_.log'

Notice how the same tests are now tried with both the gnu and clang programming environments, without having to touch them at all! That’s one of the powerful features of ReFrame and we shall see later on how easily we can port our tests to an HPC cluster with minimal changes. In order to instruct ReFrame to use our configuration file, we use the -C command-line option. Since we don’t want to type it throughout the tutorial, we can set the RFM_CONFIG_FILES environment variable, which takes a colon-separated list of configuration files that ReFrame will load. We will take advantage of multiple configuration files later in the tutorial.

export RFM_CONFIG_FILES=$(pwd)/tutorials/config/tresa.py

Tip

If our configuration file was named settings.py and we did not intend to use multiple configuration files in the same directory, we could also set the RFM_CONFIG_PATH environment variable.
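For example, assuming we renamed our file to tutorials/config/settings.py, we could point RFM_CONFIG_PATH at that directory instead:

export RFM_CONFIG_PATH=$(pwd)/tutorials/config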

A Multithreaded “Hello, World!”

We extend our C++ “Hello, World!” example to print the greetings from multiple threads:

cat tutorials/basics/hellomp/src/hello_threads.cpp
#include <iomanip>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>


#ifdef SYNC_MESSAGES
std::mutex hello_mutex;
#endif


void greetings(int tid)
{
#ifdef SYNC_MESSAGES
    const std::lock_guard<std::mutex> lock(hello_mutex);
#endif
    std::cout << "[" << std::setw(2) << tid << "] " << "Hello, World!\n";
}


int main(int argc, char *argv[])
{
    int nr_threads = 1;
    if (argc > 1) {
        nr_threads = std::atoi(argv[1]);
    }

    if (nr_threads <= 0) {
        std::cerr << "thread count must a be positive integer\n";
        return 1;
    }

    std::vector<std::thread> threads;
    for (auto i = 0; i < nr_threads; ++i) {
        threads.push_back(std::thread(greetings, i));
    }

    for (auto &t : threads) {
        t.join();
    }

    return 0;
}

This program takes the number of threads to create as a command-line argument and it uses std::thread, a C++11 addition, which means that we will need to pass -std=c++11 to our compilers. Here is the corresponding ReFrame test, where the new concepts introduced are highlighted:

cat tutorials/basics/hellomp/hellomp1.py
import reframe as rfm
import reframe.utility.sanity as sn


@rfm.simple_test
class HelloThreadedTest(rfm.RegressionTest):
    valid_systems = ['*']
    valid_prog_environs = ['*']
    sourcepath = 'hello_threads.cpp'
    build_system = 'SingleSource'
    executable_opts = ['16']

    @run_before('compile')
    def set_compilation_flags(self):
        self.build_system.cxxflags = ['-std=c++11', '-Wall']
        environ = self.current_environ.name
        if environ in {'clang', 'gnu'}:
            self.build_system.cxxflags += ['-pthread']

    @sanity_function
    def assert_hello(self):
        return sn.assert_found(r'Hello, World\!', self.stdout)

ReFrame delegates the compilation of a test to a build_system, which is an abstraction of the steps needed to compile the test. Build systems also take care of interactions with the programming environment, if necessary. Compilation flags are a property of the build system. If not explicitly specified, ReFrame will try to pick the correct build system (e.g., CMake, Autotools etc.) by inspecting the test resources, but in cases like the one presented here, where we need to set the compilation flags, we have to specify a build system explicitly. In this example, we instruct ReFrame to compile a single source file using the -std=c++11 -pthread -Wall compilation flags. However, the -pthread flag is only needed to compile applications using std::thread with the GCC and Clang compilers. Hence, since this flag may not be valid for other compilers, we need to include it only in the tests that use either GCC or Clang. Similarly to the lang parameter in the previous example, the information about which compiler is being used is only available after the class is instantiated (after completion of the setup pipeline stage), so we also defer the addition of this optional compiler flag with a pipeline hook. In this case, we set the set_compilation_flags() hook to run before the compile pipeline stage.

Note

The pipeline hooks, as well as the regression test pipeline itself, are covered in more detail later on in the tutorial.

In this example, the generated executable takes a single argument, which sets the number of threads to be used. The options passed to the test’s executable can be set through the executable_opts variable, which in this case is set to ['16'].

Let’s run the test now:

./bin/reframe -c tutorials/basics/hellomp/hellomp1.py -r
[ReFrame Setup]
  version:           4.0.0-dev.2+5ea6b7a6
  command:           './bin/reframe -c tutorials/basics/hellomp/hellomp1.py -r'
  launched by:       user@host
  working directory: '/home/user/Repositories/reframe'
  settings files:    '<builtin>', '/home/user/Repositories/reframe/tutorials/config/tresa.py'
  check search path: '/home/user/Repositories/reframe/tutorials/basics/hellomp/hellomp1.py'
  stage directory:   '/home/user/Repositories/reframe/stage'
  output directory:  '/home/user/Repositories/reframe/output'
  log files:         '/var/folders/h7/k7cgrdl13r996m4dmsvjq7v80000gp/T/rfm-v56bz2uo.log'

[==========] Running 1 check(s)
[==========] Started on Sat Nov 12 19:00:48 2022 

[----------] start processing checks
[ RUN      ] HelloThreadedTest /a6fa300f @tresa:default+gnu
[ RUN      ] HelloThreadedTest /a6fa300f @tresa:default+clang
[       OK ] (1/2) HelloThreadedTest /a6fa300f @tresa:default+gnu
[       OK ] (2/2) HelloThreadedTest /a6fa300f @tresa:default+clang
[----------] all spawned checks have finished

[  PASSED  ] Ran 2/2 test case(s) from 1 check(s) (0 failure(s), 0 skipped)
[==========] Finished on Sat Nov 12 19:00:50 2022 
Run report saved in '/home/user/.reframe/reports/run-report-322.json'
Log file(s) saved in '/var/folders/h7/k7cgrdl13r996m4dmsvjq7v80000gp/T/rfm-v56bz2uo.log'

Everything looks fine, but let’s inspect the actual output of one of the tests:

cat output/tresa/default/clang/HelloThreadedTest/rfm_HelloThreadedTest_job.out
[[[[    8] Hello, World!
1] Hello, World!
5[[0[ 7] Hello, World!
] ] Hello, World!
[ Hello, World!
6[] Hello, World!
9] Hello, World!
 2 ] Hello, World!
4] [[10 3] Hello, World!
] Hello, World!
[Hello, World!
11] Hello, World!
[12] Hello, World!
[13] Hello, World!
[14] Hello, World!
[15] Hello, World!

Not exactly what we were looking for! In the following we write a more robust sanity check that can catch this havoc.

More advanced sanity checking

So far, we have seen only a grep-like search for a string in the test’s stdout, but ReFrame sanity functions are much more capable than this. In fact, you can perform practically any operation on the output and process it as you like before assessing the test’s sanity. In the following, we extend the sanity checking of the above multithreaded “Hello, World!” to assert that all the threads produce a greetings line. See the highlighted lines below in the modified version of the @sanity_function.

cat tutorials/basics/hellomp/hellomp2.py
import reframe as rfm
import reframe.utility.sanity as sn


@rfm.simple_test
class HelloThreadedExtendedTest(rfm.RegressionTest):
    valid_systems = ['*']
    valid_prog_environs = ['*']
    sourcepath = 'hello_threads.cpp'
    build_system = 'SingleSource'
    executable_opts = ['16']

    @run_before('compile')
    def set_compilation_flags(self):
        self.build_system.cxxflags = ['-std=c++11', '-Wall']
        environ = self.current_environ.name
        if environ in {'clang', 'gnu'}:
            self.build_system.cxxflags += ['-pthread']

    @sanity_function
    def assert_num_messages(self):
        num_messages = sn.len(sn.findall(r'\[\s?\d+\] Hello, World\!',
                                         self.stdout))
        return sn.assert_eq(num_messages, 16)

This new @sanity_function counts all the pattern matches in the test’s stdout and checks that this count matches the expected value. The execution of the assert_num_messages() function is deferred to the sanity stage of the test’s pipeline, after the executable has run and the stdout file has been populated. In this example, we have used the findall() utility function from the sanity module to conveniently extract the pattern matches. This module provides a broad range of utility functions that can be used to compose more complex sanity checks. However, note that the utility functions in this module are lazily evaluated expressions or deferred expressions, which must be evaluated either implicitly or explicitly (see Deferrable Functions Reference).
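To illustrate the deferred nature of these utilities outside of a test, the short snippet below builds an expression and evaluates it explicitly; the output.txt filename is just a placeholder:

import reframe.utility.sanity as sn

# Nothing is read or counted yet; this only builds a deferred expression.
num_hello = sn.len(sn.findall(r'Hello, World\!', 'output.txt'))

# The file is actually parsed only when the expression is evaluated,
# either implicitly during the sanity stage or explicitly like this:
print(sn.evaluate(num_hello))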

Let’s run this version of the test now and see if it fails:

./bin/reframe -c tutorials/basics/hellomp/hellomp2.py -r
[ReFrame Setup]
  version:           4.0.0-dev.2+5ea6b7a6
  command:           './bin/reframe -c tutorials/basics/hellomp/hellomp2.py -r'
  launched by:       user@host
  working directory: '/home/user/Repositories/reframe'
  settings files:    '<builtin>', '/home/user/Repositories/reframe/tutorials/config/tresa.py'
  check search path: '/home/user/Repositories/reframe/tutorials/basics/hellomp/hellomp2.py'
  stage directory:   '/home/user/Repositories/reframe/stage'
  output directory:  '/home/user/Repositories/reframe/output'
  log files:         '/var/folders/h7/k7cgrdl13r996m4dmsvjq7v80000gp/T/rfm-a2tt4eqp.log'

[==========] Running 1 check(s)
[==========] Started on Sat Nov 12 19:00:50 2022 

[----------] start processing checks
[ RUN      ] HelloThreadedExtendedTest /4733a67d @tresa:default+gnu
[ RUN      ] HelloThreadedExtendedTest /4733a67d @tresa:default+clang
[     FAIL ] (1/2) HelloThreadedExtendedTest /4733a67d @tresa:default+gnu
==> test failed during 'sanity': test staged in '/home/user/Repositories/reframe/stage/tresa/default/gnu/HelloThreadedExtendedTest'
[     FAIL ] (2/2) HelloThreadedExtendedTest /4733a67d @tresa:default+clang
==> test failed during 'sanity': test staged in '/home/user/Repositories/reframe/stage/tresa/default/clang/HelloThreadedExtendedTest'
[----------] all spawned checks have finished

[  FAILED  ] Ran 2/2 test case(s) from 1 check(s) (2 failure(s), 0 skipped)
[==========] Finished on Sat Nov 12 19:00:52 2022 

================================================================================
SUMMARY OF FAILURES
--------------------------------------------------------------------------------
FAILURE INFO for HelloThreadedExtendedTest 
  * Expanded name: HelloThreadedExtendedTest
  * Description: 
  * System partition: tresa:default
  * Environment: gnu
  * Stage directory: /home/user/Repositories/reframe/stage/tresa/default/gnu/HelloThreadedExtendedTest
  * Node list: hostNone
  * Job type: local (id=59525)
  * Dependencies (conceptual): []
  * Dependencies (actual): []
  * Maintainers: []
  * Failing phase: sanity
  * Rerun with '-n /4733a67d -p gnu --system tresa:default -r'
  * Reason: sanity error: 13 != 16
--------------------------------------------------------------------------------
FAILURE INFO for HelloThreadedExtendedTest 
  * Expanded name: HelloThreadedExtendedTest
  * Description: 
  * System partition: tresa:default
  * Environment: clang
  * Stage directory: /home/user/Repositories/reframe/stage/tresa/default/clang/HelloThreadedExtendedTest
  * Node list: hostNone
  * Job type: local (id=59528)
  * Dependencies (conceptual): []
  * Dependencies (actual): []
  * Maintainers: []
  * Failing phase: sanity
  * Rerun with '-n /4733a67d -p clang --system tresa:default -r'
  * Reason: sanity error: 11 != 16
--------------------------------------------------------------------------------
Run report saved in '/home/user/.reframe/reports/run-report-323.json'
Log file(s) saved in '/var/folders/h7/k7cgrdl13r996m4dmsvjq7v80000gp/T/rfm-a2tt4eqp.log'

As expected, only some of the lines are printed correctly, which makes the test fail. To fix this test, we need to compile with -DSYNC_MESSAGES, which will synchronize the printing of messages.

cat tutorials/basics/hellomp/hellomp3.py
import reframe as rfm
import reframe.utility.sanity as sn


@rfm.simple_test
class HelloThreadedExtended2Test(rfm.RegressionTest):
    valid_systems = ['*']
    valid_prog_environs = ['*']
    sourcepath = 'hello_threads.cpp'
    build_system = 'SingleSource'
    executable_opts = ['16']

    @run_before('compile')
    def set_compilation_flags(self):
        self.build_system.cppflags = ['-DSYNC_MESSAGES']
        self.build_system.cxxflags = ['-std=c++11', '-Wall']
        environ = self.current_environ.name
        if environ in {'clang', 'gnu'}:
            self.build_system.cxxflags += ['-pthread']

    @sanity_function
    def assert_num_messages(self):
        num_messages = sn.len(sn.findall(r'\[\s?\d+\] Hello, World\!',
                                         self.stdout))
        return sn.assert_eq(num_messages, 16)

Writing A Performance Test

An important aspect of regression testing is checking for performance regressions. In this example, we write a test that downloads the STREAM benchmark, compiles it, runs it and records its performance. In the test below, we highlight the lines that introduce new concepts.

cat tutorials/basics/stream/stream1.py
import reframe as rfm
import reframe.utility.sanity as sn


@rfm.simple_test
class StreamTest(rfm.RegressionTest):
    valid_systems = ['*']
    valid_prog_environs = ['gnu']
    prebuild_cmds = [
        'wget https://raw.githubusercontent.com/jeffhammond/STREAM/master/stream.c'  # noqa: E501
    ]
    build_system = 'SingleSource'
    sourcepath = 'stream.c'
    env_vars = {
        'OMP_NUM_THREADS': '4',
        'OMP_PLACES': 'cores'
    }

    @run_before('compile')
    def set_compiler_flags(self):
        self.build_system.cppflags = ['-DSTREAM_ARRAY_SIZE=$((1 << 25))']
        self.build_system.cflags = ['-fopenmp', '-O3', '-Wall']

    @sanity_function
    def validate_solution(self):
        return sn.assert_found(r'Solution Validates', self.stdout)

    @performance_function('MB/s', perf_key='Copy')
    def extract_copy_perf(self):
        return sn.extractsingle(r'Copy:\s+(\S+)\s+.*', self.stdout, 1, float)

    @performance_function('MB/s', perf_key='Scale')
    def extract_scale_perf(self):
        return sn.extractsingle(r'Scale:\s+(\S+)\s+.*', self.stdout, 1, float)

    @performance_function('MB/s', perf_key='Add')
    def extract_add_perf(self):
        return sn.extractsingle(r'Add:\s+(\S+)\s+.*', self.stdout, 1, float)

    @performance_function('MB/s', perf_key='Triad')
    def extract_triad_perf(self):
        return sn.extractsingle(r'Triad:\s+(\S+)\s+.*', self.stdout, 1, float)

First of all, notice that we restrict the programming environments to gnu only, since this test requires OpenMP, which our installation of Clang does not have. The next thing to notice is the prebuild_cmds attribute, which provides a list of commands to be executed before the build step. These commands are executed from the test’s stage directory. In this case, we just fetch the source code of the benchmark. For running the benchmark, we need to set the number of OpenMP threads and pin them to the right CPUs through the OMP_NUM_THREADS and OMP_PLACES environment variables. You can set environment variables in a ReFrame test through the env_vars dictionary.

What makes a ReFrame test a performance test is the definition of at least one performance function. Similarly to a test’s @sanity_function, a performance function is a member function decorated with the @performance_function decorator that extracts or computes a performance metric from the test’s output and associates it with a unit. By default, every performance function defined in the test is assigned to a performance variable with the function’s name. A performance variable is a named quantity representing a performance metric that ReFrame will report on, log and, optionally, check against a reference value. The performance variables of a test are stored in the perf_variables dictionary, where the keys are the names of the metrics and the values are performance functions. Apart from turning an ordinary method into a “performance function”, the @performance_function decorator also creates an entry in the perf_variables dictionary. The optional perf_key argument can be used to assign a different name to the newly created performance variable.

In this example, we extract four performance variables, namely the memory bandwidth values for each of the “Copy”, “Scale”, “Add” and “Triad” sub-benchmarks of STREAM, where each performance function uses the extractsingle() utility function. For each of the sub-benchmarks, we extract the “Best Rate MB/s” column of the output (see below) and convert it to a float.

Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:           24939.4     0.021905     0.021527     0.022382
Scale:          16956.3     0.031957     0.031662     0.032379
Add:            18648.2     0.044277     0.043184     0.046349
Triad:          19133.4     0.042935     0.042089     0.044283

Let’s run the test now:

./bin/reframe -c tutorials/basics/stream/stream1.py -r --performance-report

The --performance-report option generates a short report at the end of the run for each performance test that has run. Additionally, as soon as a performance test finishes, the obtained performance for each of the metrics is immediately reported. This is especially useful if you run long suites of performance exploration tests and you do not want to wait until the end of the run to get an overview of the obtained performance.

[ReFrame Setup]
  version:           4.0.0-dev.2+5ea6b7a6
  command:           './bin/reframe -c tutorials/basics/stream/stream1.py -r --performance-report'
  launched by:       user@host
  working directory: '/home/user/Repositories/reframe'
  settings files:    '<builtin>', '/home/user/Repositories/reframe/tutorials/config/tresa.py'
  check search path: '/home/user/Repositories/reframe/tutorials/basics/stream/stream1.py'
  stage directory:   '/home/user/Repositories/reframe/stage'
  output directory:  '/home/user/Repositories/reframe/output'
  log files:         '/var/folders/h7/k7cgrdl13r996m4dmsvjq7v80000gp/T/rfm-v0ig7jt4.log'

[==========] Running 1 check(s)
[==========] Started on Sat Nov 12 19:00:53 2022 

[----------] start processing checks
[ RUN      ] StreamTest /cdf4820d @tresa:default+gnu
[       OK ] (1/1) StreamTest /cdf4820d @tresa:default+gnu
P: Copy: 24031.8 MB/s (r:0, l:None, u:None)
P: Scale: 16297.9 MB/s (r:0, l:None, u:None)
P: Add: 17843.8 MB/s (r:0, l:None, u:None)
P: Triad: 18278.3 MB/s (r:0, l:None, u:None)
[----------] all spawned checks have finished

[  PASSED  ] Ran 1/1 test case(s) from 1 check(s) (0 failure(s), 0 skipped)
[==========] Finished on Sat Nov 12 19:00:56 2022 

================================================================================
PERFORMANCE REPORT
--------------------------------------------------------------------------------
[StreamTest /cdf4820d @tresa:default:gnu]
  num_tasks: 1
  num_gpus_per_node: 0
  performance:
    - Copy: 24031.8 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Scale: 16297.9 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Add: 17843.8 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Triad: 18278.3 MB/s (r: 0 MB/s l: -inf% u: +inf%)
--------------------------------------------------------------------------------
Run report saved in '/home/user/.reframe/reports/run-report-324.json'
Log file(s) saved in '/var/folders/h7/k7cgrdl13r996m4dmsvjq7v80000gp/T/rfm-v0ig7jt4.log'

Setting explicitly the test’s performance variables

Users may also manipulate the test’s perf_variables dictionary directly. This is useful for avoiding code repetition or in cases where relying on decorated methods to populate perf_variables is impractical, e.g., when creating multiple performance variables in a loop.

You might have noticed that in our STREAM example above, all four performance functions are almost identical except for a small part of the regex pattern. In the following example, we define a single performance function, extract_bw(), that can extract any of the requested bandwidth metrics, and we populate the perf_variables ourselves in a pre-performance hook:

cat tutorials/basics/stream/stream2.py
import reframe as rfm
import reframe.utility.sanity as sn


@rfm.simple_test
class StreamAltTest(rfm.RegressionTest):
    valid_systems = ['*']
    valid_prog_environs = ['gnu']
    prebuild_cmds = [
        'wget https://raw.githubusercontent.com/jeffhammond/STREAM/master/stream.c'  # noqa: E501
    ]
    build_system = 'SingleSource'
    sourcepath = 'stream.c'
    env_vars = {
        'OMP_NUM_THREADS': '4',
        'OMP_PLACES': 'cores'
    }

    @run_before('compile')
    def set_compiler_flags(self):
        self.build_system.cppflags = ['-DSTREAM_ARRAY_SIZE=$((1 << 25))']
        self.build_system.cflags = ['-fopenmp', '-O3', '-Wall']

    @sanity_function
    def validate_solution(self):
        return sn.assert_found(r'Solution Validates', self.stdout)

    @performance_function('MB/s')
    def extract_bw(self, kind='Copy'):
        '''Generic performance extraction function.'''

        if kind not in ('Copy', 'Scale', 'Add', 'Triad'):
            raise ValueError(f'illegal value in argument kind ({kind!r})')

        return sn.extractsingle(rf'{kind}:\s+(\S+)\s+.*',
                                self.stdout, 1, float)

    @run_before('performance')
    def set_perf_variables(self):
        '''Build the dictionary with all the performance variables.'''

        self.perf_variables = {
            'Copy': self.extract_bw(),
            'Scale': self.extract_bw('Scale'),
            'Add': self.extract_bw('Add'),
            'Triad': self.extract_bw('Triad'),
        }

As mentioned in the previous section, the @performance_function decorator performs two tasks:

  1. It converts a test method into a performance function, i.e., a function that is suitable for extracting a performance metric.

  2. It updates the perf_variables dictionary with the newly created performance function.

In this example, we are only interested in the first functionality, which is why we completely redefine the test’s perf_variables using the extract_bw() performance function. If you are inheriting from a base test and you don’t want to completely override its performance variables, you could instead call update() on perf_variables, as sketched below.
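For example, a minimal sketch of such a hook, which builds on extract_bw() and only updates the inherited dictionary instead of replacing it, could look like this:

    @run_before('performance')
    def set_perf_variables(self):
        '''Add one performance variable per STREAM sub-benchmark.'''

        self.perf_variables.update({
            kind: self.extract_bw(kind)
            for kind in ('Copy', 'Scale', 'Add', 'Triad')
        })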

Finally, you can convert any arbitrary function or deferred expression into a performance function by calling the make_performance_function() utility as shown below:

@run_before('performance')
def set_perf_vars(self):
    self.perf_variables = {
        'Copy': sn.make_performance_function(
            sn.extractsingle(r'Copy:\s+(\S+)\s+.*',
                             self.stdout, 1, float),
            'MB/s'
         )
    }

Note that in this case the newly created performance function is not automatically registered as a performance variable of the test; you have to add it to perf_variables yourself, as done in the example above.

Adding reference values

In its current state, the above STREAM performance test simply extracts and reports the performance variables regardless of the actual performance values. However, in some situations, it might be useful to check that the extracted performance values are within an expected range, and report a failure whenever a test performs below expectations. To this end, ReFrame tests include the reference variable, which allows setting a reference for each of the performance variables defined in a test, as well as different references for different systems. In the following example, we set the reference values for all the STREAM sub-benchmarks for the system we are currently running on.

Note

Optimizing STREAM benchmark performance is outside the scope of this tutorial.

cat tutorials/basics/stream/stream3.py
import reframe as rfm
import reframe.utility.sanity as sn


@rfm.simple_test
class StreamWithRefTest(rfm.RegressionTest):
    valid_systems = ['*']
    valid_prog_environs = ['gnu']
    prebuild_cmds = [
        'wget https://raw.githubusercontent.com/jeffhammond/STREAM/master/stream.c'  # noqa: E501
    ]
    build_system = 'SingleSource'
    sourcepath = 'stream.c'
    env_vars = {
        'OMP_NUM_THREADS': '4',
        'OMP_PLACES': 'cores'
    }
    reference = {
        'catalina': {
            'Copy':  (25200, -0.05, 0.05, 'MB/s'),
            'Scale': (16800, -0.05, 0.05, 'MB/s'),
            'Add':   (18500, -0.05, 0.05, 'MB/s'),
            'Triad': (18800, -0.05, 0.05, 'MB/s')
        }
    }

    @run_before('compile')
    def set_compiler_flags(self):
        self.build_system.cppflags = ['-DSTREAM_ARRAY_SIZE=$((1 << 25))']
        self.build_system.cflags = ['-fopenmp', '-O3', '-Wall']

    @sanity_function
    def validate_solution(self):
        return sn.assert_found(r'Solution Validates', self.stdout)

    @performance_function('MB/s')
    def extract_bw(self, kind='Copy'):
        '''Generic performance extraction function.'''

        if kind not in ('Copy', 'Scale', 'Add', 'Triad'):
            raise ValueError(f'illegal value in argument kind ({kind!r})')

        return sn.extractsingle(rf'{kind}:\s+(\S+)\s+.*',
                                self.stdout, 1, float)

    @run_before('performance')
    def set_perf_variables(self):
        '''Build the dictionary with all the performance variables.'''

        self.perf_variables = {
            'Copy': self.extract_bw(),
            'Scale': self.extract_bw('Scale'),
            'Add': self.extract_bw('Add'),
            'Triad': self.extract_bw('Triad'),
        }

The performance reference tuple consists of the reference value, the lower and upper thresholds expressed as fractional numbers relative to the reference value, and the unit of measurement. If any of the thresholds is not relevant, None may be used instead. Also, the units in this reference variable are entirely optional, since they were already provided through the @performance_function decorator.
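For instance, a hypothetical reference entry that checks only a lower bound and relies on the decorator for the unit could be written as follows (the system name follows the tresa configuration defined earlier):

reference = {
    'tresa': {
        'Copy':  (25200, -0.05, None, 'MB/s'),  # no upper threshold
        'Triad': (18800, -0.10, None, None)     # unit omitted; taken from the decorator
    }
}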

If any obtained performance value is beyond its respective thresholds, the test will fail with a summary as shown below:

./bin/reframe -c tutorials/basics/stream/stream3.py -r --performance-report
FAILURE INFO for StreamWithRefTest
  * Expanded name: StreamWithRefTest
  * Description:
  * System partition: catalina:default
  * Environment: gnu
  * Stage directory: /Users/user/Repositories/reframe/stage/catalina/default/gnu/StreamWithRefTest
  * Node list: tresa.localNone
  * Job type: local (id=4576)
  * Dependencies (conceptual): []
  * Dependencies (actual): []
  * Maintainers: []
  * Failing phase: performance
  * Rerun with '-n /f925207b -p gnu --system catalina:default -r'
  * Reason: performance error: failed to meet reference: Add=19585.3, expected 18500 (l=17575.0, u=19425.0)

Examining the performance logs

ReFrame has a powerful mechanism for logging its activities as well as performance data. It supports different types of log channels and can send data simultaneously to any number of them. For example, performance data might be logged in files and, at the same time, be sent to Syslog or to a centralized log management server. By default (i.e., starting off from the builtin configuration file), ReFrame sends performance data to files per test under the perflogs/ directory:

perflogs
└── catalina
    └── default
        ├── StreamTest.log
        └── StreamWithRefTest.log

ReFrame creates a log file per test per system and per partition and appends to it every time the test is run on that system/partition combination. Let’s inspect the log file from our last test:

tail perflogs/catalina/default/StreamWithRefTest.log
job_completion_time,version,display_name,system,partition,environ,jobid,result,Copy_value,Copy_unit,Copy_ref,Copy_lower,Copy_upper,Scale_value,Scale_unit,Scale_ref,Scale_lower,Scale_upper,Add_value,Add_unit,Add_ref,Add_lower,Add_upper,Triad_value,Triad_unit,Triad_ref,Triad_lower,Triad_upper
2022-10-18T21:41:25,4.0.0-dev.2+90fbd3ef,StreamWithRefTest,catalina,default,gnu,81351,pass,24235.6,MB/s,25200,-0.05,0.05,16044.2,MB/s,16800,-0.05,0.05,17733.7,MB/s,18500,-0.05,0.05,18232.0,MB/s,18800,-0.05,0.05
2022-10-18T21:41:31,4.0.0-dev.2+90fbd3ef,StreamWithRefTest,catalina,default,gnu,81377,fail,23615.4,MB/s,25200,-0.05,0.05,16394.5,MB/s,16800,-0.05,0.05,17841.3,MB/s,18500,-0.05,0.05,18284.1,MB/s,18800,-0.05,0.05
2022-10-18T21:46:06,4.0.0-dev.2+90fbd3ef,StreamWithRefTest,catalina,default,gnu,81480,fail,23736.4,MB/s,25200,-0.05,0.05,16242.8,MB/s,16800,-0.05,0.05,17699.1,MB/s,18500,-0.05,0.05,18077.3,MB/s,18800,-0.05,0.05

The format of this file is controlled by the handlers_perflog logging configuration parameter and, by default, it contains various pieces of information about the test. For each test, all of its performance variables are logged along with their unit, the obtained value, the reference and the lower and upper thresholds. The default format is CSV, so that it can be easily post-processed. For this reason, a header is also printed to help identify the different fields.

Since version 4.0, ReFrame is very cautious when generating this file: if a change is detected in the information being logged, ReFrame will not append to the file, but will instead create a new one, saving the old file with the .h<N> suffix, where N is an integer that is increased every time a new file is created due to such changes. Examples of changes in the logged information are when the log record format changes or when a new performance metric is added, deleted or renamed. This behavior guarantees that each log file is consistent and will not break existing parsers.

For more information on configuring performance logging in ReFrame as well as logging in general, you may refer to the Logging Configuration reference.

Porting The Tests to an HPC cluster

It’s now time to port our tests to an HPC cluster. Obviously, HPC clusters are much more complex than our laptop or PC. There are usually many more compilers, the user environment is handled in a different way, and the way of launching tests varies significantly, since you have to go through a workload manager in order to access the actual compute nodes. Besides that, there might be multiple types of compute nodes on which we would like to run our tests, and each type might be accessed in a different way. It is already apparent that porting even a test as simple as “Hello, World!” to such a system is not that straightforward. As we shall see in this section, ReFrame makes this pretty easy.

Adapting the configuration

Our target system is the Piz Daint supercomputer at CSCS, but you can adapt the process to your target HPC system. In ReFrame, all the details of the various interactions of a test with the system environment are handled transparently and are set up in its configuration file. Let’s create a new configuration file for Piz Daint:

site_configuration = {
    'systems': [
        {
            'name': 'daint',
            'descr': 'Piz Daint Supercomputer',
            'hostnames': ['daint'],
            'modules_system': 'tmod32',
            'partitions': [
                {
                    'name': 'login',
                    'descr': 'Login nodes',
                    'scheduler': 'local',
                    'launcher': 'local',
                    'environs': ['builtin', 'gnu', 'intel', 'nvidia', 'cray'],
                },
                {
                    'name': 'gpu',
                    'descr': 'Hybrid nodes',
                    'scheduler': 'slurm',
                    'launcher': 'srun',
                    'access': ['-C gpu', '-A csstaff'],
                    'environs': ['gnu', 'intel', 'nvidia', 'cray'],
                    'max_jobs': 100,
                },
                {
                    'name': 'mc',
                    'descr': 'Multicore nodes',
                    'scheduler': 'slurm',
                    'launcher': 'srun',
                    'access': ['-C mc', '-A csstaff'],
                    'environs': ['gnu', 'intel', 'nvidia', 'cray'],
                    'max_jobs': 100,
                    'resources': [
                        {
                            'name': 'memory',
                            'options': ['--mem={size}']
                        }
                    ]
                }
            ]
        }
    ],
    'environments': [
        {
            'name': 'gnu',
            'modules': ['PrgEnv-gnu'],
            'cc': 'cc',
            'cxx': 'CC',
            'ftn': 'ftn',
            'target_systems': ['daint']
        },
        {
            'name': 'cray',
            'modules': ['PrgEnv-cray'],
            'cc': 'cc',
            'cxx': 'CC',
            'ftn': 'ftn',
            'target_systems': ['daint']
        },
        {
            'name': 'intel',
            'modules': ['PrgEnv-intel'],
            'cc': 'cc',
            'cxx': 'CC',
            'ftn': 'ftn',
            'target_systems': ['daint']
        },
        {
            'name': 'nvidia',
            'modules': ['PrgEnv-nvidia'],
            'cc': 'cc',
            'cxx': 'CC',
            'ftn': 'ftn',
            'target_systems': ['daint']
        },
        {
            'name': 'builtin',
            'cc': 'cc',
            'cxx': 'CC',
            'ftn': 'ftn',
            'target_systems': ['daint']
        }
    ]  # end of environments
}

First of all, we need to define a new system and set the list of hostnames that will help ReFrame identify it. We also set the modules_system configuration parameter to instruct ReFrame that this system makes use of the environment modules for managing the user environment. Then we define the system partitions that we want to test. In this case, we define three partitions:

  1. the login nodes,

  2. the multicore partition (2x Broadwell CPUs per node) and

  3. the hybrid partition (1x Haswell CPU + 1x Pascal GPU).

The login nodes are quite similar to the tresa:default partition, which corresponded to our laptop: tests are launched and run locally. The other two partitions are handled by Slurm and parallel jobs are launched with the srun command. Additionally, in order to access the different types of nodes represented by those partitions, users have to specify either the -C mc or the -C gpu option along with their account. This is exactly what we do with the access partition configuration option.

Note

System partitions in ReFrame do not necessarily correspond to real job scheduler partitions.

Piz Daint’s programming environment offers four compilers: Cray, GNU, Intel and NVIDIA. We want to test all of them, so we include them in the environs lists. Notice that we do not include Clang in the list, since there is no such compiler on this particular system. On the other hand, we include a different version of the builtin environment, which corresponds to the default login environment without loading any modules. It is generally useful to define such an environment so as to use it for tests that are running simple utilities and don’t need to compile anything.

Before looking into the definition of the new environments for the four compilers, it is worth mentioning the max_jobs parameter. This parameter specifies the maximum number of ReFrame test jobs that can be simultaneously in flight. ReFrame will try to keep concurrency close to this limit (but not exceeding it). By default, this is set to 8, so you are advised to set it to a higher number if you want to increase the throughput of completed tests.

The new environments are defined similarly to the ones we had for our local system, except that now we also add the modules parameter. The modules parameter is a list of environment modules that need to be loaded in order to make the corresponding compiler available.

Running the tests

We are now ready to run our tests on Piz Daint. We will only do so with the final versions of the tests from the previous section, which we will select using the -n option.

export RFM_CONFIG_FILES=$(pwd)/tutorials/config/daint.py
./bin/reframe -c tutorials/basics/ -R -n 'HelloMultiLangTest|HelloThreadedExtended2Test|StreamWithRefTest' --performance-report -r
[ReFrame Setup]
  version:           4.0.0-dev.2
  command:           './bin/reframe -c tutorials/basics/ -R -n HelloMultiLangTest|HelloThreadedExtended2Test|StreamWithRefTest --performance-report -r'
  launched by:       user@host
  working directory: '/home/user/Devel/reframe'
  settings files:    '<builtin>', '/home/user/Devel/reframe/tutorials/config/daint.py'
  check search path: (R) '/home/user/Devel/reframe/tutorials/basics'
  stage directory:   '/home/user/Devel/reframe/stage'
  output directory:  '/home/user/Devel/reframe/output'
  log files:         '/tmp/rfm-nyqs7jb9.log'

[==========] Running 4 check(s)
[==========] Started on Tue Nov 15 18:20:32 2022

[----------] start processing checks
[ RUN      ] HelloMultiLangTest %lang=cpp /71bf65a3 @daint:login+builtin
[ RUN      ] HelloMultiLangTest %lang=cpp /71bf65a3 @daint:login+gnu
[ RUN      ] HelloMultiLangTest %lang=cpp /71bf65a3 @daint:login+intel
[ RUN      ] HelloMultiLangTest %lang=cpp /71bf65a3 @daint:login+nvidia
[ RUN      ] HelloMultiLangTest %lang=cpp /71bf65a3 @daint:login+cray
[ RUN      ] HelloMultiLangTest %lang=cpp /71bf65a3 @daint:gpu+gnu
[ RUN      ] HelloMultiLangTest %lang=cpp /71bf65a3 @daint:gpu+intel
[ RUN      ] HelloMultiLangTest %lang=cpp /71bf65a3 @daint:gpu+nvidia
[ RUN      ] HelloMultiLangTest %lang=cpp /71bf65a3 @daint:gpu+cray
[ RUN      ] HelloMultiLangTest %lang=cpp /71bf65a3 @daint:mc+gnu
[ RUN      ] HelloMultiLangTest %lang=cpp /71bf65a3 @daint:mc+intel
[ RUN      ] HelloMultiLangTest %lang=cpp /71bf65a3 @daint:mc+nvidia
[ RUN      ] HelloMultiLangTest %lang=cpp /71bf65a3 @daint:mc+cray
[ RUN      ] HelloMultiLangTest %lang=c /7cfa870e @daint:login+builtin
[ RUN      ] HelloMultiLangTest %lang=c /7cfa870e @daint:login+gnu
[ RUN      ] HelloMultiLangTest %lang=c /7cfa870e @daint:login+intel
[ RUN      ] HelloMultiLangTest %lang=c /7cfa870e @daint:login+nvidia
[ RUN      ] HelloMultiLangTest %lang=c /7cfa870e @daint:login+cray
[ RUN      ] HelloMultiLangTest %lang=c /7cfa870e @daint:gpu+gnu
[ RUN      ] HelloMultiLangTest %lang=c /7cfa870e @daint:gpu+intel
[ RUN      ] HelloMultiLangTest %lang=c /7cfa870e @daint:gpu+nvidia
[ RUN      ] HelloMultiLangTest %lang=c /7cfa870e @daint:gpu+cray
[ RUN      ] HelloMultiLangTest %lang=c /7cfa870e @daint:mc+gnu
[ RUN      ] HelloMultiLangTest %lang=c /7cfa870e @daint:mc+intel
[ RUN      ] HelloMultiLangTest %lang=c /7cfa870e @daint:mc+nvidia
[ RUN      ] HelloMultiLangTest %lang=c /7cfa870e @daint:mc+cray
[ RUN      ] HelloThreadedExtended2Test /57223829 @daint:login+builtin
[ RUN      ] HelloThreadedExtended2Test /57223829 @daint:login+gnu
[ RUN      ] HelloThreadedExtended2Test /57223829 @daint:login+intel
[ RUN      ] HelloThreadedExtended2Test /57223829 @daint:login+nvidia
[ RUN      ] HelloThreadedExtended2Test /57223829 @daint:login+cray
[ RUN      ] HelloThreadedExtended2Test /57223829 @daint:gpu+gnu
[ RUN      ] HelloThreadedExtended2Test /57223829 @daint:gpu+intel
[ RUN      ] HelloThreadedExtended2Test /57223829 @daint:gpu+nvidia
[ RUN      ] HelloThreadedExtended2Test /57223829 @daint:gpu+cray
[ RUN      ] HelloThreadedExtended2Test /57223829 @daint:mc+gnu
[ RUN      ] HelloThreadedExtended2Test /57223829 @daint:mc+intel
[ RUN      ] HelloThreadedExtended2Test /57223829 @daint:mc+nvidia
[ RUN      ] HelloThreadedExtended2Test /57223829 @daint:mc+cray
[ RUN      ] StreamWithRefTest /f925207b @daint:login+gnu
[ RUN      ] StreamWithRefTest /f925207b @daint:gpu+gnu
[ RUN      ] StreamWithRefTest /f925207b @daint:mc+gnu
[       OK ] ( 1/42) HelloMultiLangTest %lang=cpp /71bf65a3 @daint:login+builtin
[       OK ] ( 2/42) HelloMultiLangTest %lang=cpp /71bf65a3 @daint:login+gnu
[       OK ] ( 3/42) HelloMultiLangTest %lang=cpp /71bf65a3 @daint:login+intel
[       OK ] ( 4/42) HelloMultiLangTest %lang=cpp /71bf65a3 @daint:login+nvidia
[       OK ] ( 5/42) HelloMultiLangTest %lang=cpp /71bf65a3 @daint:login+cray
[       OK ] ( 6/42) HelloMultiLangTest %lang=c /7cfa870e @daint:login+builtin
[       OK ] ( 7/42) HelloMultiLangTest %lang=c /7cfa870e @daint:login+gnu
[       OK ] ( 8/42) HelloMultiLangTest %lang=c /7cfa870e @daint:login+intel
[       OK ] ( 9/42) HelloMultiLangTest %lang=c /7cfa870e @daint:login+nvidia
[       OK ] (10/42) HelloMultiLangTest %lang=c /7cfa870e @daint:login+cray
[       OK ] (11/42) HelloMultiLangTest %lang=cpp /71bf65a3 @daint:gpu+cray
[       OK ] (12/42) HelloMultiLangTest %lang=cpp /71bf65a3 @daint:mc+nvidia
[       OK ] (13/42) HelloMultiLangTest %lang=cpp /71bf65a3 @daint:mc+cray
[       OK ] (14/42) HelloMultiLangTest %lang=c /7cfa870e @daint:mc+cray
[       OK ] (15/42) HelloMultiLangTest %lang=cpp /71bf65a3 @daint:gpu+nvidia
[       OK ] (16/42) HelloMultiLangTest %lang=c /7cfa870e @daint:gpu+intel
[       OK ] (17/42) HelloMultiLangTest %lang=c /7cfa870e @daint:gpu+nvidia
[       OK ] (18/42) HelloMultiLangTest %lang=c /7cfa870e @daint:mc+intel
[       OK ] (19/42) HelloThreadedExtended2Test /57223829 @daint:login+builtin
[       OK ] (20/42) HelloThreadedExtended2Test /57223829 @daint:login+gnu
[       OK ] (21/42) HelloThreadedExtended2Test /57223829 @daint:login+intel
[       OK ] (22/42) HelloMultiLangTest %lang=c /7cfa870e @daint:gpu+cray
[       OK ] (23/42) HelloMultiLangTest %lang=c /7cfa870e @daint:mc+gnu
[       OK ] (24/42) HelloThreadedExtended2Test /57223829 @daint:login+nvidia
[       OK ] (25/42) HelloThreadedExtended2Test /57223829 @daint:login+cray
[       OK ] (26/42) HelloMultiLangTest %lang=c /7cfa870e @daint:mc+nvidia
[       OK ] (27/42) HelloMultiLangTest %lang=cpp /71bf65a3 @daint:gpu+gnu
[       OK ] (28/42) HelloMultiLangTest %lang=cpp /71bf65a3 @daint:gpu+intel
[       OK ] (29/42) HelloMultiLangTest %lang=cpp /71bf65a3 @daint:mc+gnu
[       OK ] (30/42) HelloMultiLangTest %lang=cpp /71bf65a3 @daint:mc+intel
[       OK ] (31/42) HelloMultiLangTest %lang=c /7cfa870e @daint:gpu+gnu
[       OK ] (32/42) StreamWithRefTest /f925207b @daint:login+gnu
P: Copy: 71061.6 MB/s (r:0, l:None, u:None)
P: Scale: 44201.5 MB/s (r:0, l:None, u:None)
P: Add: 48178.5 MB/s (r:0, l:None, u:None)
P: Triad: 48063.3 MB/s (r:0, l:None, u:None)
[       OK ] (33/42) HelloThreadedExtended2Test /57223829 @daint:mc+cray
[       OK ] (34/42) HelloThreadedExtended2Test /57223829 @daint:mc+intel
[       OK ] (35/42) HelloThreadedExtended2Test /57223829 @daint:mc+gnu
[       OK ] (36/42) HelloThreadedExtended2Test /57223829 @daint:mc+nvidia
[       OK ] (37/42) StreamWithRefTest /f925207b @daint:mc+gnu
P: Copy: 52660.1 MB/s (r:0, l:None, u:None)
P: Scale: 33117.6 MB/s (r:0, l:None, u:None)
P: Add: 34876.7 MB/s (r:0, l:None, u:None)
P: Triad: 35150.7 MB/s (r:0, l:None, u:None)
[       OK ] (38/42) HelloThreadedExtended2Test /57223829 @daint:gpu+intel
[       OK ] (39/42) HelloThreadedExtended2Test /57223829 @daint:gpu+cray
[       OK ] (40/42) HelloThreadedExtended2Test /57223829 @daint:gpu+nvidia
[       OK ] (41/42) HelloThreadedExtended2Test /57223829 @daint:gpu+gnu
[       OK ] (42/42) StreamWithRefTest /f925207b @daint:gpu+gnu
P: Copy: 49682.3 MB/s (r:0, l:None, u:None)
P: Scale: 34452.3 MB/s (r:0, l:None, u:None)
P: Add: 38030.7 MB/s (r:0, l:None, u:None)
P: Triad: 38379.0 MB/s (r:0, l:None, u:None)
[----------] all spawned checks have finished

[  PASSED  ] Ran 42/42 test case(s) from 4 check(s) (0 failure(s), 0 skipped)
[==========] Finished on Tue Nov 15 18:22:48 2022

================================================================================
PERFORMANCE REPORT
--------------------------------------------------------------------------------
[StreamWithRefTest /f925207b @daint:login:gnu]
  num_gpus_per_node: 0
  num_tasks: 1
  performance:
    - Copy: 71061.6 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Scale: 44201.5 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Add: 48178.5 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Triad: 48063.3 MB/s (r: 0 MB/s l: -inf% u: +inf%)
[StreamWithRefTest /f925207b @daint:gpu:gnu]
  num_gpus_per_node: 0
  num_tasks: 1
  performance:
    - Copy: 49682.3 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Scale: 34452.3 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Add: 38030.7 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Triad: 38379.0 MB/s (r: 0 MB/s l: -inf% u: +inf%)
[StreamWithRefTest /f925207b @daint:mc:gnu]
  num_gpus_per_node: 0
  num_tasks: 1
  performance:
    - Copy: 52660.1 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Scale: 33117.6 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Add: 34876.7 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Triad: 35150.7 MB/s (r: 0 MB/s l: -inf% u: +inf%)
--------------------------------------------------------------------------------
Run report saved in '/home/user/.reframe/reports/run-report-1.json'
Log file(s) saved in '/tmp/rfm-nyqs7jb9.log'

There it is! Without any change to our tests, we could simply run them on an HPC cluster with all of its intricacies. Notice how our original four tests expanded to more than 40 test cases on that particular HPC cluster! One reason we could immediately run our tests on a new system is that we have not restricted either the valid systems they can run on or the valid programming environments they can run with (except for the STREAM test). Otherwise, we would have to add daint and its corresponding programming environments to the valid_systems and valid_prog_environs lists, respectively, as sketched below.
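
For example, restricting a test to Piz Daint’s compute partitions and two of its toolchains would look roughly like this (a hypothetical restriction, not what the tutorial tests actually do):

    # Inside the test's class body
    valid_systems = ['daint:gpu', 'daint:mc']
    valid_prog_environs = ['gnu', 'cray']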

Tip

A quick way to try a test on a new system, if it’s not generic, is to pass the --skip-system-check and --skip-prgenv-check command-line options, which will cause ReFrame to skip any test validity checks for systems or programming environments.
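
For example, assuming a test file at some path of your choice:

./bin/reframe -c path/to/test.py --skip-system-check --skip-prgenv-check -r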

Although the tests remain the same, ReFrame has generated completely different job scripts for each test depending on where it was going to run. Let’s check the job script generated for the StreamWithRefTest:

cat output/daint/gpu/gnu/StreamWithRefTest/rfm_StreamWithRefTest_job.sh
#!/bin/bash
#SBATCH --job-name="rfm_StreamWithRefTest_job"
#SBATCH --ntasks=1
#SBATCH --output=rfm_StreamWithRefTest_job.out
#SBATCH --error=rfm_StreamWithRefTest_job.err
#SBATCH --time=0:10:0
#SBATCH -A csstaff
#SBATCH --constraint=gpu
module unload PrgEnv-cray
module load PrgEnv-gnu
export OMP_NUM_THREADS=4
export OMP_PLACES=cores
srun ./StreamWithRefTest

By contrast, the script generated for the exact same test running on our laptop was as simple as the following:

#!/bin/bash
export OMP_NUM_THREADS=4
export OMP_PLACES=cores
 ./StreamWithRefTest

In ReFrame, you don’t have to care about all the system interaction details; you focus instead on the logic of your tests, as we shall see in the next section.

Adapting a test to new systems and programming environments

Unless a test is rather generic, you will need to make some adaptations for the system that you port it to. In this case, we will adapt the STREAM benchmark so as to run it with multiple compilers and adjust its execution based on the target architecture of each partition. Let’s go through the changes:

cat tutorials/basics/stream/stream4.py
import reframe as rfm
import reframe.utility.sanity as sn


@rfm.simple_test
class StreamMultiSysTest(rfm.RegressionTest):
    valid_systems = ['*']
    valid_prog_environs = ['cray', 'gnu', 'intel', 'nvidia']
    prebuild_cmds = [
        'wget https://raw.githubusercontent.com/jeffhammond/STREAM/master/stream.c'  # noqa: E501
    ]
    build_system = 'SingleSource'
    sourcepath = 'stream.c'
    env_vars = {
        'OMP_NUM_THREADS': 4,
        'OMP_PLACES': 'cores'
    }
    reference = {
        'catalina': {
            'Copy':  (25200, -0.05, 0.05, 'MB/s'),
            'Scale': (16800, -0.05, 0.05, 'MB/s'),
            'Add':   (18500, -0.05, 0.05, 'MB/s'),
            'Triad': (18800, -0.05, 0.05, 'MB/s')
        }
    }

    # Flags per programming environment
    flags = variable(dict, value={
        'cray':  ['-fopenmp', '-O3', '-Wall'],
        'gnu':   ['-fopenmp', '-O3', '-Wall'],
        'intel': ['-qopenmp', '-O3', '-Wall'],
        'nvidia':   ['-mp', '-O3']
    })

    # Number of cores for each system
    cores = variable(dict, value={
        'catalina:default': 4,
        'daint:gpu': 12,
        'daint:mc': 36,
        'daint:login': 10
    })

    @run_before('compile')
    def set_compiler_flags(self):
        self.build_system.cppflags = ['-DSTREAM_ARRAY_SIZE=$((1 << 25))']
        environ = self.current_environ.name
        self.build_system.cflags = self.flags.get(environ, [])

    @run_before('run')
    def set_num_threads(self):
        num_threads = self.cores.get(self.current_partition.fullname, 1)
        self.num_cpus_per_task = num_threads
        self.env_vars = {
            'OMP_NUM_THREADS': num_threads,
            'OMP_PLACES': 'cores'
        }

    @sanity_function
    def validate_solution(self):
        return sn.assert_found(r'Solution Validates', self.stdout)

    @performance_function('MB/s')
    def extract_bw(self, kind='Copy'):
        if kind not in {'Copy', 'Scale', 'Add', 'Triad'}:
            raise ValueError(f'illegal value in argument kind ({kind!r})')

        return sn.extractsingle(rf'{kind}:\s+(\S+)\s+.*',
                                self.stdout, 1, float)

    @run_before('performance')
    def set_perf_variables(self):
        self.perf_variables = {
            'Copy': self.extract_bw(),
            'Scale': self.extract_bw('Scale'),
            'Add': self.extract_bw('Add'),
            'Triad': self.extract_bw('Triad'),
        }

First of all, we need to add the new programming environments to the list of supported ones. The problem now is that each compiler has its own flags for enabling OpenMP, so we need to differentiate the behavior of the test based on the programming environment. For this reason, we define the flags for each compiler in a separate dictionary (the flags variable) and we set them in the set_compiler_flags() pipeline hook.

We first encountered pipeline hooks in the multithreaded “Hello, World!” example; now we explain them in more detail. When ReFrame loads a test file, it instantiates all the tests it finds in it. Based on the system ReFrame runs on and the supported environments of the tests, it generates different test cases for each combination of system partition and environment and finally sends the test cases for execution. During its execution, a test case goes through the regression test pipeline, which is a series of well-defined phases. Users can attach arbitrary functions to run before or after any pipeline stage, and this is exactly what the set_compiler_flags() function is. We instruct ReFrame to run this function before the test enters the compile stage and set the compilation flags accordingly. The system partition and the programming environment of the currently running test case are available to a ReFrame test through the current_partition and current_environ attributes, respectively. These attributes, however, are only set after the first stage (setup) of the pipeline has executed, so we can’t use them inside the test’s constructor.

We do exactly the same for setting the OMP_NUM_THREADS environment variable depending on the system partition we are running on, by attaching the set_num_threads() pipeline hook to the run phase of the test. In that same hook we also set the num_cpus_per_task attribute of the test, so as to instruct the backend job scheduler to assign the proper number of CPU cores to the test. In ReFrame tests you can set a series of task allocation attributes that the backend schedulers will use to emit the right job submission script. The section Mapping of Test Attributes to Job Scheduler Backends of the Test API Reference summarizes these attributes and the actual backend scheduler options they correspond to.
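
As a rough illustration of this mapping, the hook below sets two such attributes; with the Slurm backend, num_tasks and num_cpus_per_task are expected to end up as the --ntasks and --cpus-per-task directives of the generated job script (the hook and its values are hypothetical and not part of the tutorial test):

    @run_before('run')
    def set_job_resources(self):
        # E.g., request 2 tasks with 12 cores each; with Slurm this should
        # emit '#SBATCH --ntasks=2' and '#SBATCH --cpus-per-task=12'.
        self.num_tasks = 2
        self.num_cpus_per_task = 12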

For more information about the regression test pipeline and how ReFrame executes the tests in general, have a look at How ReFrame Executes Tests.

Note

ReFrame tests are ordinary Python classes so you can define your own attributes as we do with flags and cores in this example.

Let’s run our adapted test now:

./bin/reframe -c tutorials/basics/stream/stream4.py -r --performance-report
[ReFrame Setup]
  version:           4.0.0-dev.2
  command:           './bin/reframe -c tutorials/basics/stream/stream4.py -r --performance-report'
  launched by:       user@host
  working directory: '/home/user/Devel/reframe'
  settings files:    '<builtin>', '/home/user/Devel/reframe/tutorials/config/daint.py'
  check search path: '/home/user/Devel/reframe/tutorials/basics/stream/stream4.py'
  stage directory:   '/home/user/Devel/reframe/stage'
  output directory:  '/home/user/Devel/reframe/output'
  log files:         '/tmp/rfm-yf6xjn_4.log'

[==========] Running 1 check(s)
[==========] Started on Tue Nov 15 18:22:48 2022

[----------] start processing checks
[ RUN      ] StreamMultiSysTest /eec1c676 @daint:login+gnu
[ RUN      ] StreamMultiSysTest /eec1c676 @daint:login+intel
[ RUN      ] StreamMultiSysTest /eec1c676 @daint:login+nvidia
[ RUN      ] StreamMultiSysTest /eec1c676 @daint:login+cray
[ RUN      ] StreamMultiSysTest /eec1c676 @daint:gpu+gnu
[ RUN      ] StreamMultiSysTest /eec1c676 @daint:gpu+intel
[ RUN      ] StreamMultiSysTest /eec1c676 @daint:gpu+nvidia
[ RUN      ] StreamMultiSysTest /eec1c676 @daint:gpu+cray
[ RUN      ] StreamMultiSysTest /eec1c676 @daint:mc+gnu
[ RUN      ] StreamMultiSysTest /eec1c676 @daint:mc+intel
[ RUN      ] StreamMultiSysTest /eec1c676 @daint:mc+nvidia
[ RUN      ] StreamMultiSysTest /eec1c676 @daint:mc+cray
[       OK ] ( 1/12) StreamMultiSysTest /eec1c676 @daint:login+gnu
P: Copy: 97772.6 MB/s (r:0, l:None, u:None)
P: Scale: 69418.6 MB/s (r:0, l:None, u:None)
P: Add: 71941.0 MB/s (r:0, l:None, u:None)
P: Triad: 73679.7 MB/s (r:0, l:None, u:None)
[       OK ] ( 2/12) StreamMultiSysTest /eec1c676 @daint:login+intel
P: Copy: 85123.0 MB/s (r:0, l:None, u:None)
P: Scale: 79701.7 MB/s (r:0, l:None, u:None)
P: Add: 81632.7 MB/s (r:0, l:None, u:None)
P: Triad: 44391.5 MB/s (r:0, l:None, u:None)
[       OK ] ( 3/12) StreamMultiSysTest /eec1c676 @daint:login+nvidia
P: Copy: 76641.4 MB/s (r:0, l:None, u:None)
P: Scale: 59041.9 MB/s (r:0, l:None, u:None)
P: Add: 64792.5 MB/s (r:0, l:None, u:None)
P: Triad: 69441.4 MB/s (r:0, l:None, u:None)
[       OK ] ( 4/12) StreamMultiSysTest /eec1c676 @daint:login+cray
P: Copy: 35658.5 MB/s (r:0, l:None, u:None)
P: Scale: 27732.2 MB/s (r:0, l:None, u:None)
P: Add: 39037.7 MB/s (r:0, l:None, u:None)
P: Triad: 45310.3 MB/s (r:0, l:None, u:None)
[       OK ] ( 5/12) StreamMultiSysTest /eec1c676 @daint:gpu+gnu
P: Copy: 42666.3 MB/s (r:0, l:None, u:None)
P: Scale: 38491.0 MB/s (r:0, l:None, u:None)
P: Add: 43686.4 MB/s (r:0, l:None, u:None)
P: Triad: 43466.6 MB/s (r:0, l:None, u:None)
[       OK ] ( 6/12) StreamMultiSysTest /eec1c676 @daint:gpu+intel
P: Copy: 51726.7 MB/s (r:0, l:None, u:None)
P: Scale: 54185.6 MB/s (r:0, l:None, u:None)
P: Add: 57608.3 MB/s (r:0, l:None, u:None)
P: Triad: 57390.7 MB/s (r:0, l:None, u:None)
[       OK ] ( 7/12) StreamMultiSysTest /eec1c676 @daint:gpu+nvidia
P: Copy: 51810.8 MB/s (r:0, l:None, u:None)
P: Scale: 39653.4 MB/s (r:0, l:None, u:None)
P: Add: 44008.0 MB/s (r:0, l:None, u:None)
P: Triad: 44384.4 MB/s (r:0, l:None, u:None)
[       OK ] ( 8/12) StreamMultiSysTest /eec1c676 @daint:gpu+cray
P: Copy: 51101.8 MB/s (r:0, l:None, u:None)
P: Scale: 38568.1 MB/s (r:0, l:None, u:None)
P: Add: 43193.6 MB/s (r:0, l:None, u:None)
P: Triad: 43142.9 MB/s (r:0, l:None, u:None)
[       OK ] ( 9/12) StreamMultiSysTest /eec1c676 @daint:mc+gnu
P: Copy: 48292.9 MB/s (r:0, l:None, u:None)
P: Scale: 38499.5 MB/s (r:0, l:None, u:None)
P: Add: 43555.7 MB/s (r:0, l:None, u:None)
P: Triad: 43871.4 MB/s (r:0, l:None, u:None)
[       OK ] (10/12) StreamMultiSysTest /eec1c676 @daint:mc+cray
P: Copy: 46538.3 MB/s (r:0, l:None, u:None)
P: Scale: 40133.3 MB/s (r:0, l:None, u:None)
P: Add: 43363.9 MB/s (r:0, l:None, u:None)
P: Triad: 43450.3 MB/s (r:0, l:None, u:None)
[       OK ] (11/12) StreamMultiSysTest /eec1c676 @daint:mc+nvidia
P: Copy: 46648.2 MB/s (r:0, l:None, u:None)
P: Scale: 40384.5 MB/s (r:0, l:None, u:None)
P: Add: 44001.1 MB/s (r:0, l:None, u:None)
P: Triad: 44489.7 MB/s (r:0, l:None, u:None)
[       OK ] (12/12) StreamMultiSysTest /eec1c676 @daint:mc+intel
P: Copy: 51335.9 MB/s (r:0, l:None, u:None)
P: Scale: 49490.3 MB/s (r:0, l:None, u:None)
P: Add: 56859.9 MB/s (r:0, l:None, u:None)
P: Triad: 56544.5 MB/s (r:0, l:None, u:None)
[----------] all spawned checks have finished

[  PASSED  ] Ran 12/12 test case(s) from 1 check(s) (0 failure(s), 0 skipped)
[==========] Finished on Tue Nov 15 18:24:00 2022

================================================================================
PERFORMANCE REPORT
--------------------------------------------------------------------------------
[StreamMultiSysTest /eec1c676 @daint:login:gnu]
  num_cpus_per_task: 10
  num_gpus_per_node: 0
  num_tasks: 1
  performance:
    - Copy: 97772.6 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Scale: 69418.6 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Add: 71941.0 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Triad: 73679.7 MB/s (r: 0 MB/s l: -inf% u: +inf%)
[StreamMultiSysTest /eec1c676 @daint:login:intel]
  num_cpus_per_task: 10
  num_gpus_per_node: 0
  num_tasks: 1
  performance:
    - Copy: 85123.0 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Scale: 79701.7 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Add: 81632.7 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Triad: 44391.5 MB/s (r: 0 MB/s l: -inf% u: +inf%)
[StreamMultiSysTest /eec1c676 @daint:login:nvidia]
  num_cpus_per_task: 10
  num_gpus_per_node: 0
  num_tasks: 1
  performance:
    - Copy: 76641.4 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Scale: 59041.9 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Add: 64792.5 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Triad: 69441.4 MB/s (r: 0 MB/s l: -inf% u: +inf%)
[StreamMultiSysTest /eec1c676 @daint:login:cray]
  num_cpus_per_task: 10
  num_gpus_per_node: 0
  num_tasks: 1
  performance:
    - Copy: 35658.5 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Scale: 27732.2 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Add: 39037.7 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Triad: 45310.3 MB/s (r: 0 MB/s l: -inf% u: +inf%)
[StreamMultiSysTest /eec1c676 @daint:gpu:gnu]
  num_cpus_per_task: 12
  num_gpus_per_node: 0
  num_tasks: 1
  performance:
    - Copy: 42666.3 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Scale: 38491.0 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Add: 43686.4 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Triad: 43466.6 MB/s (r: 0 MB/s l: -inf% u: +inf%)
[StreamMultiSysTest /eec1c676 @daint:gpu:intel]
  num_cpus_per_task: 12
  num_gpus_per_node: 0
  num_tasks: 1
  performance:
    - Copy: 51726.7 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Scale: 54185.6 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Add: 57608.3 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Triad: 57390.7 MB/s (r: 0 MB/s l: -inf% u: +inf%)
[StreamMultiSysTest /eec1c676 @daint:gpu:nvidia]
  num_cpus_per_task: 12
  num_gpus_per_node: 0
  num_tasks: 1
  performance:
    - Copy: 51810.8 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Scale: 39653.4 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Add: 44008.0 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Triad: 44384.4 MB/s (r: 0 MB/s l: -inf% u: +inf%)
[StreamMultiSysTest /eec1c676 @daint:gpu:cray]
  num_cpus_per_task: 12
  num_gpus_per_node: 0
  num_tasks: 1
  performance:
    - Copy: 51101.8 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Scale: 38568.1 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Add: 43193.6 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Triad: 43142.9 MB/s (r: 0 MB/s l: -inf% u: +inf%)
[StreamMultiSysTest /eec1c676 @daint:mc:gnu]
  num_cpus_per_task: 36
  num_gpus_per_node: 0
  num_tasks: 1
  performance:
    - Copy: 48292.9 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Scale: 38499.5 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Add: 43555.7 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Triad: 43871.4 MB/s (r: 0 MB/s l: -inf% u: +inf%)
[StreamMultiSysTest /eec1c676 @daint:mc:intel]
  num_cpus_per_task: 36
  num_gpus_per_node: 0
  num_tasks: 1
  performance:
    - Copy: 51335.9 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Scale: 49490.3 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Add: 56859.9 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Triad: 56544.5 MB/s (r: 0 MB/s l: -inf% u: +inf%)
[StreamMultiSysTest /eec1c676 @daint:mc:nvidia]
  num_cpus_per_task: 36
  num_gpus_per_node: 0
  num_tasks: 1
  performance:
    - Copy: 46648.2 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Scale: 40384.5 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Add: 44001.1 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Triad: 44489.7 MB/s (r: 0 MB/s l: -inf% u: +inf%)
[StreamMultiSysTest /eec1c676 @daint:mc:cray]
  num_cpus_per_task: 36
  num_gpus_per_node: 0
  num_tasks: 1
  performance:
    - Copy: 46538.3 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Scale: 40133.3 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Add: 43363.9 MB/s (r: 0 MB/s l: -inf% u: +inf%)
    - Triad: 43450.3 MB/s (r: 0 MB/s l: -inf% u: +inf%)
--------------------------------------------------------------------------------
Run report saved in '/home/user/.reframe/reports/run-report-2.json'
Log file(s) saved in '/tmp/rfm-yf6xjn_4.log'

Notice the improved performance of the benchmark in all partitions and the differences in performance between the different compilers.

This concludes our introductory tutorial to ReFrame!