Tutorial 3: Miscellaneous topics¶
This page collects several smaller tutorials that show specific parts of ReFrame.
They all use the configuration file presented in Tutorial 1: Getting Started with ReFrame, which you can find in tutorials/config/settings.py.
They also assume that the reader is already familiar with the concepts presented in the basic tutorial.
Testing a CUDA Code¶
In this example, we will create a regression test for a simple CUDA matrix-vector multiplication kernel.
import reframe as rfm
import reframe.utility.sanity as sn


@rfm.simple_test
class CUDATest(rfm.RegressionTest):
    def __init__(self):
        self.descr = 'Matrix-vector multiplication example with CUDA'
        self.valid_systems = ['daint:gpu']
        self.valid_prog_environs = ['cray', 'gnu', 'pgi']
        self.sourcepath = 'matvec.cu'
        self.executable_opts = ['1024', '100']
        self.modules = ['cudatoolkit']
        self.sanity_patterns = sn.assert_found(
            r'time for single matrix vector multiplication', self.stdout
        )
There are three new things to notice in this test.
First, we restrict the list of valid systems to only the hybrid partition of Piz Daint, since we require GPU nodes.
Second, we set the sourcepath to the CUDA source file, as we would do with any other C, C++ or Fortran file.
ReFrame will recognize the .cu extension of the source file and will try to invoke nvcc to compile the code.
Finally, we define the modules attribute.
This is essentially a list of environment modules that need to be loaded for running the test.
In this case and on this particular system, we need to load the cudatoolkit module, which makes the CUDA SDK available.
More On Building Tests¶
We have already seen how ReFrame can compile a test with a single source file. However, ReFrame can also build tests that use Make or a configure-Make approach. We are going to demonstrate this through a simple C++ program that computes a dot-product of two vectors and is being compiled through a Makefile. Additionally, we can select the type of elements for the vectors at compilation time. Here is the C++ program:
#include <cassert>
#include <cstdlib>
#include <iostream>
#include <random>
#include <vector>

#ifndef ELEM_TYPE
#define ELEM_TYPE double
#endif

using elem_t = ELEM_TYPE;

template<typename T>
T dotprod(const std::vector<T> &x, const std::vector<T> &y)
{
    assert(x.size() == y.size());
    T sum = 0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sum += x[i] * y[i];
    }

    return sum;
}

template<typename T>
struct type_name {
    static constexpr const char *value = nullptr;
};

template<>
struct type_name<float> {
    static constexpr const char *value = "float";
};

template<>
struct type_name<double> {
    static constexpr const char *value = "double";
};

int main(int argc, char *argv[])
{
    if (argc < 2) {
        std::cerr << argv[0] << ": too few arguments\n";
        std::cerr << "Usage: " << argv[0] << " DIM\n";
        return 1;
    }

    // Read the dimension into a signed type first: std::size_t is unsigned,
    // so a `N < 0` check on it could never fire.
    long long dim = std::atoll(argv[1]);
    if (dim <= 0) {
        std::cerr << argv[0]
                  << ": array dimension must be a positive integer: " << argv[1]
                  << "\n";
        return 1;
    }

    std::size_t N = dim;
    std::vector<elem_t> x(N), y(N);
    std::random_device seed;
    std::mt19937 rand(seed());
    std::uniform_real_distribution<> dist(-1, 1);
    for (std::size_t i = 0; i < N; ++i) {
        x[i] = dist(rand);
        y[i] = dist(rand);
    }

    std::cout << "Result (" << type_name<elem_t>::value << "): "
              << dotprod(x, y) << "\n";
    return 0;
}
The directory structure for this test is the following:
tutorials/makefiles/
├── maketest.py
└── src
├── Makefile
└── dotprod.cpp
Let’s have a look at the test itself:
import reframe as rfm
import reframe.utility.sanity as sn


@rfm.parameterized_test(['float'], ['double'])
class MakefileTest(rfm.RegressionTest):
    def __init__(self, elem_type):
        self.descr = 'Test demonstrating use of Makefiles'
        self.valid_systems = ['*']
        self.valid_prog_environs = ['clang', 'gnu']
        self.executable = './dotprod'
        self.executable_opts = ['100000']
        self.build_system = 'Make'
        self.build_system.cppflags = [f'-DELEM_TYPE={elem_type}']
        self.sanity_patterns = sn.assert_found(
            rf'Result \({elem_type}\):', self.stdout
        )
First, if you're using any build system other than SingleSource, you must set the executable attribute of the test, because ReFrame cannot know what the actual executable to be run is.
We then set the build system to Make and set the preprocessor flags as we would do with the SingleSource build system.
Let’s inspect the build script generated by ReFrame:
cat output/catalina/default/clang/MakefileTest_float/rfm_MakefileTest_build.sh
#!/bin/bash

_onerror()
{
    exitcode=$?
    echo "-reframe: command \`$BASH_COMMAND' failed (exit code: $exitcode)"
    exit $exitcode
}

trap _onerror ERR

make -j 1 CPPFLAGS="-DELEM_TYPE=float"
The compiler variables (CC, CXX, etc.) are set based on the corresponding values specified in the configuration of the current environment.
We can instruct the build system to ignore the default values from the environment by setting its flags_from_environ attribute to false:
self.build_system.flags_from_environ = False
In this case, make will be invoked as follows:
make -j 1 CPPFLAGS="-DELEM_TYPE=float"
Notice that the -j 1 option is always generated.
We can increase the build concurrency by setting the max_concurrency attribute.
Finally, we may even use a custom Makefile by setting the makefile attribute:
self.build_system.max_concurrency = 4
self.build_system.makefile = 'Makefile_custom'
As a final note, as with the SingleSource build system, it would not have been necessary to specify a build system in this test at all, had we not needed to set the CPPFLAGS.
ReFrame can automatically figure out the correct build system if sourcepath refers to a directory.
It will inspect the directory and first try to determine whether this is a CMake- or Autotools-based project.
If not, as in this case, it falls back to the Make build system.
More details on ReFrame's build systems can be found here.
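The fallback order described above can be pictured with a small plain-Python sketch. Note that guess_build_system and the marker files it checks are illustrative assumptions for this tutorial, not ReFrame's actual detection logic:

```python
import os


def guess_build_system(path):
    """Illustrative sketch of the fallback order described above."""
    if os.path.exists(os.path.join(path, 'CMakeLists.txt')):
        # A top-level CMakeLists.txt suggests a CMake-based project
        return 'CMake'

    if any(os.path.exists(os.path.join(path, f))
           for f in ('configure.ac', 'configure.in')):
        # Autoconf input files suggest an Autotools-based project
        return 'Autotools'

    # Otherwise, fall back to plain Make
    return 'Make'
```

For the dot-product example above, which ships only a Makefile, such a check would end up in the final Make branch.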
Retrieving the source code from a Git repository¶
It might be the case that a regression test needs to clone its source code from a remote repository.
This can be achieved in two ways with ReFrame.
One way is to set the sourcesdir attribute to None and explicitly clone a repository using the prebuild_cmds:
self.sourcesdir = None
self.prebuild_cmds = ['git clone https://github.com/me/myrepo .']
Alternatively, we can retrieve a Git repository directly by assigning its URL to the sourcesdir attribute:
self.sourcesdir = 'https://github.com/me/myrepo'
ReFrame will attempt to clone this repository inside the stage directory by executing git clone <repo> . and will then proceed with the build procedure as usual.
Note
ReFrame recognizes only URLs in the sourcesdir attribute and requires passwordless access to the repository.
This means that an SCP-style repository specification will not be accepted.
You will have to specify it as a URL using the ssh:// protocol (see the Git documentation page).
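For example, using the illustrative repository path from above, the two specification styles look like this:

```
# SCP-style shorthand -- not accepted by ReFrame:
git@github.com:me/myrepo.git

# Equivalent ssh:// URL -- accepted:
ssh://git@github.com/me/myrepo.git
```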
Adding a configuration step before compiling the code¶
It is often the case that a configuration step is needed before compiling a code with make.
To address these kinds of projects, ReFrame offers specific abstractions for "configure-make" styles of build systems.
It supports CMake-based projects through the CMake build system, as well as Autotools-based projects through the Autotools build system.
For other build systems, you can achieve the same effect using the Make build system and the prebuild_cmds for performing the configuration step.
The following code snippet will configure a code with ./custom_configure before invoking make:
self.prebuild_cmds = ['./custom_configure -with-mylib']
self.build_system = 'Make'
self.build_system.cppflags = ['-DHAVE_FOO']
self.build_system.flags_from_environ = False
The generated build script will then have the following lines:
./custom_configure -with-mylib
make -j 1 CPPFLAGS='-DHAVE_FOO'
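For a CMake-based project, no prebuild_cmds are needed, since the CMake build system emits the configuration step itself. A minimal sketch follows; the attribute names come from ReFrame's CMake build system, but the specific option values here are illustrative assumptions:

```python
self.build_system = 'CMake'
# Options passed to the cmake configuration step (illustrative values)
self.build_system.config_opts = ['-DCMAKE_BUILD_TYPE=Release']
self.build_system.max_concurrency = 4
```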
Writing a Run-Only Regression Test¶
There are cases when it is desirable to perform regression testing for an already built executable.
The following test simply uses the echo Bash shell command to print a random integer between specific lower and upper bounds.
Here is the full regression test:
import reframe as rfm
import reframe.utility.sanity as sn


@rfm.simple_test
class EchoRandTest(rfm.RunOnlyRegressionTest):
    def __init__(self):
        self.descr = 'A simple test that echoes a random number'
        self.valid_systems = ['*']
        self.valid_prog_environs = ['*']
        lower = 90
        upper = 100
        self.executable = 'echo'
        self.executable_opts = [
            'Random: ',
            f'$((RANDOM%({upper}+1-{lower})+{lower}))'
        ]
        self.sanity_patterns = sn.assert_bounded(
            sn.extractsingle(
                r'Random: (?P<number>\S+)', self.stdout, 'number', float
            ),
            lower, upper
        )
There is nothing special about this test compared to those presented so far, except that it derives from RunOnlyRegressionTest.
Run-only regression tests may also have resources, such as a pre-compiled executable or some input data.
These resources may reside under the src/ directory or under any directory specified in the sourcesdir attribute.
They will be copied to the stage directory at the beginning of the run phase.
Writing a Compile-Only Regression Test¶
ReFrame provides the option to write compile-only tests, which consist only of a compilation phase without a specified executable.
This kind of test must derive from the CompileOnlyRegressionTest class provided by the framework.
The following test is a compile-only version of the MakefileTest presented previously, which checks that no warnings are issued by the compiler:
@rfm.parameterized_test(['float'], ['double'])
class MakeOnlyTest(rfm.CompileOnlyRegressionTest):
    def __init__(self, elem_type):
        self.descr = 'Test demonstrating use of Makefiles'
        self.valid_systems = ['*']
        self.valid_prog_environs = ['clang', 'gnu']
        self.build_system = 'Make'
        self.build_system.cppflags = [f'-DELEM_TYPE={elem_type}']
        self.sanity_patterns = sn.assert_not_found(r'warning', self.stdout)
What is worth noting here is that the standard output and standard error of the test, accessible through the stdout and stderr attributes, now correspond to the standard output and error of the compilation command.
Therefore sanity checking can be done in exactly the same way as with a normal test.
Applying a Sanity Function Iteratively¶
It is often the case that a common sanity pattern has to be applied many times.
The following script prints 100 random integers between the limits given by the environment variables LOWER
and UPPER
.
if [ -z "$LOWER" ]; then
    export LOWER=90
fi

if [ -z "$UPPER" ]; then
    export UPPER=100
fi

for i in {1..100}; do
    echo Random: $((RANDOM%($UPPER+1-$LOWER)+$LOWER))
done
In the corresponding regression test we want to check that all the random numbers generated lie between the two limits, which means that a common sanity check has to be applied to all the printed random numbers. Here is the corresponding regression test:
import reframe as rfm
import reframe.utility.sanity as sn


@rfm.simple_test
class DeferredIterationTest(rfm.RunOnlyRegressionTest):
    def __init__(self):
        self.descr = 'Apply a sanity function iteratively'
        self.valid_systems = ['*']
        self.valid_prog_environs = ['*']
        self.executable = './random_numbers.sh'
        numbers = sn.extractall(
            r'Random: (?P<number>\S+)', self.stdout, 'number', float
        )
        self.sanity_patterns = sn.and_(
            sn.assert_eq(sn.count(numbers), 100),
            sn.all(sn.map(lambda x: sn.assert_bounded(x, 90, 100), numbers))
        )
First, we extract all the generated random numbers from the output.
What we want to do is apply the assert_bounded sanity function iteratively to each number.
The problem here is that we cannot simply iterate over the numbers list, because that would prematurely trigger the evaluation of extractall.
We want to defer the iteration as well.
This can be achieved by using the map() ReFrame sanity function, which is a replacement for Python's built-in map() function and does exactly what we want: it applies a function to all the elements of an iterable and returns another iterable with the transformed elements.
Passing the result of map() to the all sanity function ensures that all the elements lie between the desired bounds.
There is still a small complication that needs to be addressed.
As a direct replacement for the built-in all() function, ReFrame's all() sanity function returns True for empty iterables, which is not what we want.
So we must make sure that all 100 numbers are generated.
This is achieved by the sn.assert_eq(sn.count(numbers), 100) statement, which uses the count() sanity function for counting the generated numbers.
Finally, we need to combine these two conditions into a single deferred expression to be assigned to the test's sanity_patterns.
As with the iteration discussed above, we cannot defer the evaluation of the and operator, so we use ReFrame's and_() sanity function to accomplish this.
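The empty-iterable pitfall is inherited directly from Python's built-ins and can be checked in plain Python. This sketch uses the eager built-in map() and all(), not their deferred ReFrame counterparts:

```python
numbers = [92.0, 95.0, 99.0]

# Apply a bounds check to every element, as sn.map/sn.all do in deferred form
assert all(map(lambda x: 90 <= x <= 100, numbers))

# Pitfall: all() is vacuously True for an empty iterable, so a separate
# count check is needed to ensure that numbers were produced at all.
assert all([]) is True

# The eager analogue of sn.assert_eq(sn.count(numbers), 100)
assert len(numbers) == 3
```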
For more information about how exactly sanity functions work and how their execution is deferred, please refer to Understanding the Mechanism of Sanity Functions.
Customizing the Test Job Script¶
It is often the case that we need to run some commands before or after the parallel launch of our executable.
This can be easily achieved by using the prerun_cmds and postrun_cmds attributes of a ReFrame test.
The following example is a slightly modified version of the random numbers test presented above.
The lower and upper limits for the random numbers are now set inside a helper shell script, limits.sh, located in the test's resources, which we need to source before running our test.
Additionally, we also want to print FINISHED after our executable has finished.
Here is the modified test file:
import reframe as rfm
import reframe.utility.sanity as sn


@rfm.simple_test
class PrepostRunTest(rfm.RunOnlyRegressionTest):
    def __init__(self):
        self.descr = 'Pre- and post-run demo test'
        self.valid_systems = ['*']
        self.valid_prog_environs = ['*']
        self.prerun_cmds = ['source limits.sh']
        self.postrun_cmds = ['echo FINISHED']
        self.executable = './random_numbers.sh'
        numbers = sn.extractall(
            r'Random: (?P<number>\S+)', self.stdout, 'number', float
        )
        self.sanity_patterns = sn.all([
            sn.assert_eq(sn.count(numbers), 100),
            sn.all(sn.map(lambda x: sn.assert_bounded(x, 90, 100), numbers)),
            sn.assert_found(r'FINISHED', self.stdout)
        ])
The prerun_cmds and postrun_cmds are lists of commands to be emitted in the generated job script before and after the parallel launch of the executable.
The working directory for these commands is that of the job script itself, which is the stage directory of the test.
The generated job script for this test looks like the following:
cat output/catalina/default/gnu/PrepostRunTest/rfm_PrepostRunTest_job.sh
#!/bin/bash
source limits.sh
./random_numbers.sh
echo FINISHED
Generally, ReFrame generates the job shell scripts using the following pattern:
#!/bin/bash -l
{job_scheduler_preamble}
{test_environment}
{prerun_cmds}
{parallel_launcher} {executable} {executable_opts}
{postrun_cmds}
The job_scheduler_preamble contains the backend job scheduler directives that control the job allocation.
The test_environment comprises the necessary commands for setting up the environment of the test; these include any modules or environment variables set at the system partition level or at the test level.
Then follow the commands specified in prerun_cmds, while those specified in postrun_cmds come after the launch of the parallel job.
The parallel launch itself consists of three parts:

1. the parallel launcher program (e.g., srun, mpirun etc.) with its options,
2. the regression test executable, as specified in the executable attribute, and
3. the options to be passed to the executable, as specified in the executable_opts attribute.
Flexible Regression Tests¶
New in version 2.15.
ReFrame can automatically set the number of tasks of a particular test, if its num_tasks attribute is set to a negative value or zero.
In ReFrame's terminology, such tests are called flexible.
Negative values indicate the minimum number of tasks that are acceptable for the test (a value of -4 indicates that at least 4 tasks are required).
A zero value indicates the default minimum number of tasks, which is equal to num_tasks_per_node.
By default, ReFrame will spawn such a test on all the idle nodes of the current system partition, but this behavior can be adjusted with the --flex-alloc-nodes command-line option.
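The num_tasks semantics above can be summarized in a small plain-Python sketch. min_tasks is an illustrative helper for this tutorial, not a ReFrame API:

```python
def min_tasks(num_tasks, num_tasks_per_node):
    """Minimum acceptable task count, per the rules described above."""
    if num_tasks > 0:
        # Positive values request a fixed task count; the test is not flexible
        return num_tasks

    if num_tasks == 0:
        # Zero means the default minimum: num_tasks_per_node
        return num_tasks_per_node

    # Negative values mean "at least that many", e.g. -4 -> at least 4 tasks
    return -num_tasks
```

For example, min_tasks(-4, 12) yields 4, while min_tasks(0, 12) yields 12.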
Flexible tests are very useful for diagnostics, e.g., for checking the health of a whole set of nodes.
In this example, we demonstrate this feature through a simple test that runs hostname.
The test will verify that all the nodes print the expected host name:
import reframe as rfm
import reframe.utility.sanity as sn


@rfm.simple_test
class HostnameCheck(rfm.RunOnlyRegressionTest):
    def __init__(self):
        self.valid_systems = ['daint:gpu', 'daint:mc']
        self.valid_prog_environs = ['cray']
        self.executable = 'hostname'
        self.num_tasks = 0
        self.num_tasks_per_node = 1
        self.sanity_patterns = sn.assert_eq(
            sn.getattr(self, 'num_tasks'),
            sn.count(sn.findall(r'nid\d+', self.stdout))
        )
The first thing to notice in this test is that num_tasks is set to zero; this is a requirement for flexible tests.
The sanity check of this test simply counts the host names printed and verifies that they are as many as expected.
Notice, however, that the sanity check does not use num_tasks directly, but rather accesses the attribute through the sn.getattr() sanity function, which is a replacement for the getattr() builtin.
The reason is that at the time the sanity check expression is created, num_tasks is 0 and it will only be set to its actual value during the run phase.
Consequently, we need to defer the attribute retrieval, so we use the sn.getattr() sanity function instead of accessing the attribute directly.
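The need for deferral can be illustrated in plain Python. This is only a sketch; ReFrame's deferred expressions are more elaborate than a simple lambda:

```python
class Check:
    num_tasks = 0


check = Check()
eager = check.num_tasks              # evaluated immediately: still 0
deferred = lambda: check.num_tasks   # evaluation postponed until called

check.num_tasks = 16                 # the "run phase" fills in the real value

assert eager == 0                    # the eager snapshot is stale
assert deferred() == 16              # the deferred access sees the real value
```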
Testing containerized applications¶
New in version 2.20.
ReFrame can also be used to test applications that run inside a container. First, we need to enable the container platform support in ReFrame's configuration and, specifically, at the partition configuration level:
    'descr': 'Hybrid nodes',
    'scheduler': 'slurm',
    'launcher': 'srun',
    'access': ['-C gpu', '-A csstaff'],
    'environs': ['gnu', 'intel', 'pgi', 'cray'],
    'container_platforms': [
        {
            'type': 'Singularity',
            'modules': ['singularity']
        }
    ],
    'max_jobs': 100
},
{
    'name': 'mc',
For each partition, users can define a list of supported container platforms using the container_platforms configuration parameter.
In this case, we define the Singularity platform, for which we set the modules parameter in order to instruct ReFrame to load the singularity module whenever it needs to run with this container platform.
The following test will use a Singularity container to run:
import reframe as rfm
import reframe.utility.sanity as sn


@rfm.simple_test
class ContainerTest(rfm.RunOnlyRegressionTest):
    def __init__(self):
        self.descr = 'Run commands inside a container'
        self.valid_systems = ['daint:gpu']
        self.valid_prog_environs = ['cray']
        self.container_platform = 'Singularity'
        self.container_platform.image = 'docker://ubuntu:18.04'
        self.container_platform.commands = [
            'pwd', 'ls', 'cat /etc/os-release'
        ]
        self.container_platform.workdir = '/workdir'
        self.sanity_patterns = sn.all([
            sn.assert_found(r'^' + self.container_platform.workdir,
                            self.stdout),
            sn.assert_found(r'18.04.\d+ LTS \(Bionic Beaver\)', self.stdout),
        ])
A container-based test can be written as a RunOnlyRegressionTest that sets the container_platform attribute.
This attribute accepts a string that corresponds to the name of the container platform that will be used to run the container for this test.
In this case, the test will be using Singularity as its container platform.
If such a platform is not configured for the current system, the test will fail.
As soon as the container platform to be used is defined, you need to specify the container image to use and the commands to run inside the container by setting the image and the commands container platform attributes.
These two attributes are mandatory for container-based checks.
It is important to note that the executable and executable_opts attributes of the actual test are ignored in the case of container-based tests.
ReFrame will run the container as follows:
singularity exec -B"/path/to/test/stagedir:/workdir" docker://ubuntu:18.04 bash -c 'cd /workdir; pwd; ls; cat /etc/os-release'
By default, ReFrame will mount the stage directory of the test under /rfm_workdir inside the container and will always prepend a cd command to that directory.
The user commands are then run from that directory one after the other.
Once the commands are executed, the container is stopped and ReFrame goes on with the sanity and performance checks.
Users may also change the default mount point of the stage directory through the workdir attribute, as the test above does by setting it to /workdir.
Besides the stage directory, additional mount points can be specified through the mount_points attribute:
self.container_platform.mount_points = [('/path/to/host/dir1', '/path/to/container/mount_point1'),
('/path/to/host/dir2', '/path/to/container/mount_point2')]
For a complete list of the available attributes of a specific container platform, please have a look at the Container Platforms section of the ReFrame Programming APIs guide. On how to configure ReFrame for running containerized tests, please have a look at the Container Platform Configuration section of the Configuration Reference.