Tutorial 2: Using Dependencies in ReFrame Tests

New in version 2.21.

A ReFrame test may define dependencies on other tests. A typical scenario is testing different runtime configurations of a benchmark that you need to compile first, or running a scaling analysis of a code. In such cases, you don't want to rebuild the test for each runtime configuration; instead, you can have a single build test on which all the runtime tests depend. This is the approach we take in the following example, which fetches, builds and runs several OSU benchmarks. We first create a basic compile-only test that fetches the benchmarks and builds them for the different programming environments:

@rfm.simple_test
class OSUBuildTest(rfm.CompileOnlyRegressionTest):
    def __init__(self):
        self.descr = 'OSU benchmarks build test'
        self.valid_systems = ['daint:gpu']
        self.valid_prog_environs = ['gnu', 'pgi', 'intel']
        self.sourcesdir = None
        self.prebuild_cmds = [
            'wget http://mvapich.cse.ohio-state.edu/download/mvapich/osu-micro-benchmarks-5.6.2.tar.gz',
            'tar xzf osu-micro-benchmarks-5.6.2.tar.gz',
            'cd osu-micro-benchmarks-5.6.2'
        ]
        self.build_system = 'Autotools'
        self.build_system.max_concurrency = 8
        self.sanity_patterns = sn.assert_not_found('error', self.stderr)

There is nothing special about this test, except perhaps that sourcesdir may be set to None even for a test that needs to compile something. In that case, you should at least provide the commands that fetch the code in the prebuild_cmds attribute.
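
Incidentally, since the version string appears in all three fetch commands, they could be generated from a single variable so that the release is spelled out only once. A minimal sketch, using the same tarball URL as above:

```python
# Derive the fetch commands from a single version variable, so bumping
# the benchmark release requires changing only one line.
version = '5.6.2'
tarball = f'osu-micro-benchmarks-{version}.tar.gz'
base_url = 'http://mvapich.cse.ohio-state.edu/download/mvapich'

prebuild_cmds = [
    f'wget {base_url}/{tarball}',
    f'tar xzf {tarball}',
    f'cd osu-micro-benchmarks-{version}',
]
```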

For the next test, we need the OSU benchmark binaries we just built in order to run the MPI ping-pong benchmark. Here is the relevant part:

class OSUBenchmarkTestBase(rfm.RunOnlyRegressionTest):
    '''Base class of OSU benchmarks runtime tests'''

    def __init__(self):
        self.valid_systems = ['daint:gpu']
        self.valid_prog_environs = ['gnu', 'pgi', 'intel']
        self.sourcesdir = None
        self.num_tasks = 2
        self.num_tasks_per_node = 1
        self.sanity_patterns = sn.assert_found(r'^8', self.stdout)


@rfm.simple_test
class OSULatencyTest(OSUBenchmarkTestBase):
    def __init__(self):
        super().__init__()
        self.descr = 'OSU latency test'
        self.perf_patterns = {
            'latency': sn.extractsingle(r'^8\s+(\S+)', self.stdout, 1, float)
        }
        self.depends_on('OSUBuildTest')
        self.reference = {
            '*': {'latency': (0, None, None, 'us')}
        }

    @rfm.require_deps
    def set_executable(self, OSUBuildTest):
        self.executable = os.path.join(
            OSUBuildTest().stagedir,
            'osu-micro-benchmarks-5.6.2', 'mpi', 'pt2pt', 'osu_latency'
        )
        self.executable_opts = ['-x', '100', '-i', '1000']

First, since we will have multiple similar benchmarks, we move all the common functionality into the OSUBenchmarkTestBase base class. Again, nothing new here: we use two nodes for the benchmark and set sourcesdir to None, since none of the benchmark tests needs any additional resources. The new part comes with the OSULatencyTest in the following line:

        self.depends_on('OSUBuildTest')

Here we tell ReFrame that this test depends on a test named OSUBuildTest. This test may or may not be defined in the same test file; all ReFrame needs is the test name. By default, the depends_on() function creates dependencies between the individual test cases of the OSULatencyTest and the OSUBuildTest, such that the OSULatencyTest using PrgEnv-gnu depends on the outcome of the OSUBuildTest using PrgEnv-gnu, but not on the outcome of the OSUBuildTest using PrgEnv-intel. This behaviour can be changed, as covered in detail in How Test Dependencies Work In ReFrame. You can create arbitrary test dependency graphs, but they need to be acyclic. If ReFrame detects cyclic dependencies, it will refuse to execute the set of tests and will issue an error pointing out the cycle.
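
To make the acyclicity requirement concrete, here is a minimal, self-contained sketch of how a set of test dependencies can be topologically sorted and cycles detected, using Kahn's algorithm. This is an illustration only, not ReFrame's actual implementation:

```python
from collections import deque


def toposort(deps):
    """Topologically sort a dependency graph.

    ``deps`` maps each test name to the list of tests it depends on.
    Raises ValueError if the graph contains a cycle.
    """
    indeg = {t: 0 for t in deps}        # number of unmet dependencies
    dependents = {t: [] for t in deps}  # reverse edges
    for t, targets in deps.items():
        for d in targets:
            indeg[t] += 1
            dependents[d].append(t)

    # Start with the tests that have no dependencies
    ready = deque(t for t, n in indeg.items() if n == 0)
    order = []
    while ready:
        t = ready.popleft()
        order.append(t)
        for u in dependents[t]:
            indeg[u] -= 1
            if indeg[u] == 0:
                ready.append(u)

    if len(order) != len(deps):
        raise ValueError('cyclic dependencies detected')

    return order
```

For the tests of this tutorial, the resulting schedule always places OSUBuildTest before every runtime test, while a mutual dependency between two tests raises an error.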

A ReFrame test with dependencies will execute, i.e., enter its setup stage, only after all of its dependencies have succeeded. If any of its dependencies fails, the current test will be marked as a failure as well.
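
The effect of a failed dependency can be sketched as a traversal that collects every test reachable from the failed one through reverse dependency edges. Again, this is an illustration of the behaviour, not ReFrame's internals:

```python
def failed_cases(deps, failed):
    """Return the tests that cannot run because ``failed`` failed.

    ``deps`` maps each test name to the list of tests it depends on.
    The result includes direct and transitive dependents.
    """
    skipped = set()
    frontier = {failed}
    while frontier:
        # Find tests that depend on something in the current frontier
        nxt = {t for t, targets in deps.items()
               if t not in skipped and frontier & set(targets)}
        skipped |= nxt
        frontier = nxt

    return skipped
```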

The next step for the OSULatencyTest is to set its executable to point to the binary produced by the OSUBuildTest. This is achieved with the following specially decorated function:

    @rfm.require_deps
    def set_executable(self, OSUBuildTest):
        self.executable = os.path.join(
            OSUBuildTest().stagedir,
            'osu-micro-benchmarks-5.6.2', 'mpi', 'pt2pt', 'osu_latency'
        )
        self.executable_opts = ['-x', '100', '-i', '1000']

The @require_deps decorator binds each argument of the decorated function to the result of the dependency with the same name. In this case, it binds the OSUBuildTest function argument to the result of a dependency named OSUBuildTest. For the binding to work correctly, the function arguments must be named after the target dependencies.

However, referring to a dependency only by the test's name is not enough, since a test might be associated with multiple programming environments. For this reason, a dependency argument is actually bound to a function that accepts as an argument the name of a target programming environment. If no arguments are passed to that function, as in this example, the current programming environment is implied, such that OSUBuildTest() is equivalent to OSUBuildTest(self.current_environ.name). This call returns the actual test case of the dependency that has been executed.

This allows you to access any attribute of the target test, as we do in this example by accessing the target test's stage directory, which we use to construct the path of the executable. This concludes the presentation of the OSULatencyTest. The OSUBandwidthTest is completely analogous.

The OSUAllreduceTest shown below is similar to the other two, except that it is parameterized. It is essentially a scalability test that runs the osu_allreduce executable built by the OSUBuildTest on 2, 4, 8 and 16 nodes.

@rfm.parameterized_test(*([1 << i] for i in range(1, 5)))
class OSUAllreduceTest(OSUBenchmarkTestBase):
    def __init__(self, num_tasks):
        super().__init__()
        self.descr = 'OSU Allreduce test'
        self.perf_patterns = {
            'latency': sn.extractsingle(r'^8\s+(\S+)', self.stdout, 1, float)
        }
        self.depends_on('OSUBuildTest')
        self.reference = {
            '*': {'latency': (0, None, None, 'us')}
        }
        self.num_tasks = num_tasks

    @rfm.require_deps
    def set_executable(self, OSUBuildTest):
        self.executable = os.path.join(
            OSUBuildTest().stagedir,
            'osu-micro-benchmarks-5.6.2', 'mpi', 'collective', 'osu_allreduce'
        )
        self.executable_opts = ['-m', '8', '-x', '1000', '-i', '20000']
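
For reference, the bit-shift expression in the @rfm.parameterized_test decorator expands to the four task counts, one test variant each:

```python
# Each inner list is the argument list passed to one test variant's
# __init__, i.e. its num_tasks value: 2, 4, 8 and 16.
variants = [[1 << i] for i in range(1, 5)]
```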

The full set of OSU example tests is shown below:

# Copyright 2016-2020 Swiss National Supercomputing Centre (CSCS/ETH Zurich)
# ReFrame Project Developers. See the top-level LICENSE file for details.
#
# SPDX-License-Identifier: BSD-3-Clause

import os

import reframe as rfm
import reframe.utility.sanity as sn


class OSUBenchmarkTestBase(rfm.RunOnlyRegressionTest):
    '''Base class of OSU benchmarks runtime tests'''

    def __init__(self):
        self.valid_systems = ['daint:gpu']
        self.valid_prog_environs = ['gnu', 'pgi', 'intel']
        self.sourcesdir = None
        self.num_tasks = 2
        self.num_tasks_per_node = 1
        self.sanity_patterns = sn.assert_found(r'^8', self.stdout)


@rfm.simple_test
class OSULatencyTest(OSUBenchmarkTestBase):
    def __init__(self):
        super().__init__()
        self.descr = 'OSU latency test'
        self.perf_patterns = {
            'latency': sn.extractsingle(r'^8\s+(\S+)', self.stdout, 1, float)
        }
        self.depends_on('OSUBuildTest')
        self.reference = {
            '*': {'latency': (0, None, None, 'us')}
        }

    @rfm.require_deps
    def set_executable(self, OSUBuildTest):
        self.executable = os.path.join(
            OSUBuildTest().stagedir,
            'osu-micro-benchmarks-5.6.2', 'mpi', 'pt2pt', 'osu_latency'
        )
        self.executable_opts = ['-x', '100', '-i', '1000']


@rfm.simple_test
class OSUBandwidthTest(OSUBenchmarkTestBase):
    def __init__(self):
        super().__init__()
        self.descr = 'OSU bandwidth test'
        self.perf_patterns = {
            'bandwidth': sn.extractsingle(r'^4194304\s+(\S+)',
                                          self.stdout, 1, float)
        }
        self.depends_on('OSUBuildTest')
        self.reference = {
            '*': {'bandwidth': (0, None, None, 'MB/s')}
        }

    @rfm.require_deps
    def set_executable(self, OSUBuildTest):
        self.executable = os.path.join(
            OSUBuildTest().stagedir,
            'osu-micro-benchmarks-5.6.2', 'mpi', 'pt2pt', 'osu_bw'
        )
        self.executable_opts = ['-x', '100', '-i', '1000']


@rfm.parameterized_test(*([1 << i] for i in range(1, 5)))
class OSUAllreduceTest(OSUBenchmarkTestBase):
    def __init__(self, num_tasks):
        super().__init__()
        self.descr = 'OSU Allreduce test'
        self.perf_patterns = {
            'latency': sn.extractsingle(r'^8\s+(\S+)', self.stdout, 1, float)
        }
        self.depends_on('OSUBuildTest')
        self.reference = {
            '*': {'latency': (0, None, None, 'us')}
        }
        self.num_tasks = num_tasks

    @rfm.require_deps
    def set_executable(self, OSUBuildTest):
        self.executable = os.path.join(
            OSUBuildTest().stagedir,
            'osu-micro-benchmarks-5.6.2', 'mpi', 'collective', 'osu_allreduce'
        )
        self.executable_opts = ['-m', '8', '-x', '1000', '-i', '20000']


@rfm.simple_test
class OSUBuildTest(rfm.CompileOnlyRegressionTest):
    def __init__(self):
        self.descr = 'OSU benchmarks build test'
        self.valid_systems = ['daint:gpu']
        self.valid_prog_environs = ['gnu', 'pgi', 'intel']
        self.sourcesdir = None
        self.prebuild_cmds = [
            'wget http://mvapich.cse.ohio-state.edu/download/mvapich/osu-micro-benchmarks-5.6.2.tar.gz',
            'tar xzf osu-micro-benchmarks-5.6.2.tar.gz',
            'cd osu-micro-benchmarks-5.6.2'
        ]
        self.build_system = 'Autotools'
        self.build_system.max_concurrency = 8
        self.sanity_patterns = sn.assert_not_found('error', self.stderr)

Notice that the order in which dependencies are defined in a test file is irrelevant; in this case, we define OSUBuildTest last. ReFrame will sort the tests properly and execute them in the right order.

Here is the output when running the OSU tests with the asynchronous execution policy:

[==========] Running 7 check(s)
[==========] Started on Wed Jun  3 09:00:40 2020

[----------] started processing OSUBuildTest (OSU benchmarks build test)
[ RUN      ] OSUBuildTest on daint:gpu using PrgEnv-gnu
[ RUN      ] OSUBuildTest on daint:gpu using PrgEnv-intel
[ RUN      ] OSUBuildTest on daint:gpu using PrgEnv-pgi
[----------] finished processing OSUBuildTest (OSU benchmarks build test)

[----------] started processing OSULatencyTest (OSU latency test)
[ RUN      ] OSULatencyTest on daint:gpu using PrgEnv-gnu
[      DEP ] OSULatencyTest on daint:gpu using PrgEnv-gnu
[ RUN      ] OSULatencyTest on daint:gpu using PrgEnv-intel
[      DEP ] OSULatencyTest on daint:gpu using PrgEnv-intel
[ RUN      ] OSULatencyTest on daint:gpu using PrgEnv-pgi
[      DEP ] OSULatencyTest on daint:gpu using PrgEnv-pgi
[----------] finished processing OSULatencyTest (OSU latency test)

[----------] started processing OSUBandwidthTest (OSU bandwidth test)
[ RUN      ] OSUBandwidthTest on daint:gpu using PrgEnv-gnu
[      DEP ] OSUBandwidthTest on daint:gpu using PrgEnv-gnu
[ RUN      ] OSUBandwidthTest on daint:gpu using PrgEnv-intel
[      DEP ] OSUBandwidthTest on daint:gpu using PrgEnv-intel
[ RUN      ] OSUBandwidthTest on daint:gpu using PrgEnv-pgi
[      DEP ] OSUBandwidthTest on daint:gpu using PrgEnv-pgi
[----------] finished processing OSUBandwidthTest (OSU bandwidth test)

[----------] started processing OSUAllreduceTest_2 (OSU Allreduce test)
[ RUN      ] OSUAllreduceTest_2 on daint:gpu using PrgEnv-gnu
[      DEP ] OSUAllreduceTest_2 on daint:gpu using PrgEnv-gnu
[ RUN      ] OSUAllreduceTest_2 on daint:gpu using PrgEnv-intel
[      DEP ] OSUAllreduceTest_2 on daint:gpu using PrgEnv-intel
[ RUN      ] OSUAllreduceTest_2 on daint:gpu using PrgEnv-pgi
[      DEP ] OSUAllreduceTest_2 on daint:gpu using PrgEnv-pgi
[----------] finished processing OSUAllreduceTest_2 (OSU Allreduce test)

[----------] started processing OSUAllreduceTest_4 (OSU Allreduce test)
[ RUN      ] OSUAllreduceTest_4 on daint:gpu using PrgEnv-gnu
[      DEP ] OSUAllreduceTest_4 on daint:gpu using PrgEnv-gnu
[ RUN      ] OSUAllreduceTest_4 on daint:gpu using PrgEnv-intel
[      DEP ] OSUAllreduceTest_4 on daint:gpu using PrgEnv-intel
[ RUN      ] OSUAllreduceTest_4 on daint:gpu using PrgEnv-pgi
[      DEP ] OSUAllreduceTest_4 on daint:gpu using PrgEnv-pgi
[----------] finished processing OSUAllreduceTest_4 (OSU Allreduce test)

[----------] started processing OSUAllreduceTest_8 (OSU Allreduce test)
[ RUN      ] OSUAllreduceTest_8 on daint:gpu using PrgEnv-gnu
[      DEP ] OSUAllreduceTest_8 on daint:gpu using PrgEnv-gnu
[ RUN      ] OSUAllreduceTest_8 on daint:gpu using PrgEnv-intel
[      DEP ] OSUAllreduceTest_8 on daint:gpu using PrgEnv-intel
[ RUN      ] OSUAllreduceTest_8 on daint:gpu using PrgEnv-pgi
[      DEP ] OSUAllreduceTest_8 on daint:gpu using PrgEnv-pgi
[----------] finished processing OSUAllreduceTest_8 (OSU Allreduce test)

[----------] started processing OSUAllreduceTest_16 (OSU Allreduce test)
[ RUN      ] OSUAllreduceTest_16 on daint:gpu using PrgEnv-gnu
[      DEP ] OSUAllreduceTest_16 on daint:gpu using PrgEnv-gnu
[ RUN      ] OSUAllreduceTest_16 on daint:gpu using PrgEnv-intel
[      DEP ] OSUAllreduceTest_16 on daint:gpu using PrgEnv-intel
[ RUN      ] OSUAllreduceTest_16 on daint:gpu using PrgEnv-pgi
[      DEP ] OSUAllreduceTest_16 on daint:gpu using PrgEnv-pgi
[----------] finished processing OSUAllreduceTest_16 (OSU Allreduce test)

[----------] waiting for spawned checks to finish
[       OK ] ( 1/21) OSUBuildTest on daint:gpu using PrgEnv-pgi [compile: 29.581s run: 0.086s total: 29.708s]
[       OK ] ( 2/21) OSUBuildTest on daint:gpu using PrgEnv-gnu [compile: 26.250s run: 69.120s total: 95.437s]
[       OK ] ( 3/21) OSUBuildTest on daint:gpu using PrgEnv-intel [compile: 39.385s run: 89.213s total: 129.871s]
[       OK ] ( 4/21) OSULatencyTest on daint:gpu using PrgEnv-pgi [compile: 0.012s run: 145.355s total: 154.504s]
[       OK ] ( 5/21) OSUAllreduceTest_2 on daint:gpu using PrgEnv-pgi [compile: 0.014s run: 148.276s total: 154.433s]
[       OK ] ( 6/21) OSUAllreduceTest_4 on daint:gpu using PrgEnv-pgi [compile: 0.011s run: 149.763s total: 154.407s]
[       OK ] ( 7/21) OSUAllreduceTest_8 on daint:gpu using PrgEnv-pgi [compile: 0.013s run: 151.262s total: 154.378s]
[       OK ] ( 8/21) OSUAllreduceTest_16 on daint:gpu using PrgEnv-pgi [compile: 0.010s run: 152.716s total: 154.360s]
[       OK ] ( 9/21) OSULatencyTest on daint:gpu using PrgEnv-gnu [compile: 0.014s run: 210.952s total: 220.847s]
[       OK ] (10/21) OSUBandwidthTest on daint:gpu using PrgEnv-pgi [compile: 0.015s run: 213.285s total: 220.758s]
[       OK ] (11/21) OSUAllreduceTest_4 on daint:gpu using PrgEnv-gnu [compile: 0.011s run: 215.596s total: 220.717s]
[       OK ] (12/21) OSUAllreduceTest_16 on daint:gpu using PrgEnv-gnu [compile: 0.011s run: 218.742s total: 220.651s]
[       OK ] (13/21) OSUAllreduceTest_2 on daint:gpu using PrgEnv-intel [compile: 0.013s run: 203.214s total: 206.115s]
[       OK ] (14/21) OSUAllreduceTest_8 on daint:gpu using PrgEnv-intel [compile: 0.016s run: 204.819s total: 206.078s]
[       OK ] (15/21) OSUBandwidthTest on daint:gpu using PrgEnv-gnu [compile: 0.012s run: 258.772s total: 266.873s]
[       OK ] (16/21) OSUAllreduceTest_8 on daint:gpu using PrgEnv-gnu [compile: 0.014s run: 263.576s total: 266.752s]
[       OK ] (17/21) OSULatencyTest on daint:gpu using PrgEnv-intel [compile: 0.011s run: 227.234s total: 231.789s]
[       OK ] (18/21) OSUAllreduceTest_4 on daint:gpu using PrgEnv-intel [compile: 0.013s run: 229.729s total: 231.724s]
[       OK ] (19/21) OSUAllreduceTest_2 on daint:gpu using PrgEnv-gnu [compile: 0.013s run: 286.203s total: 292.444s]
[       OK ] (20/21) OSUAllreduceTest_16 on daint:gpu using PrgEnv-intel [compile: 0.028s run: 242.030s total: 242.091s]
[       OK ] (21/21) OSUBandwidthTest on daint:gpu using PrgEnv-intel [compile: 0.013s run: 243.719s total: 247.384s]
[----------] all spawned checks have finished

[  PASSED  ] Ran 21 test case(s) from 7 check(s) (0 failure(s))
[==========] Finished on Wed Jun  3 09:07:24 2020

Before it starts running the tests, ReFrame topologically sorts them based on their dependencies and schedules their execution using the selected policy. With the serial execution policy, ReFrame simply executes the tests to completion as they "arrive," since they are already topologically sorted. With the asynchronous execution policy, tests are spawned without waiting for them to finish. If a test's dependencies have not yet completed, it will not start executing and a DEP message will be printed to denote this.

Finally, ReFrame's runtime takes care of cleaning up test resources while respecting dependencies. Normally, when an individual test finishes successfully, its stage directory is cleaned up. However, if other tests depend on it, this would be catastrophic, since the dependent tests will most probably need its outcome. ReFrame avoids this by not cleaning up a test's stage directory until all tests that depend on it have finished successfully.
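
This dependency-aware cleanup can be sketched as reference counting over the dependency graph: each test tracks how many of its dependents are still unfinished, and its stage directory is removed only when that count drops to zero. An illustration only, not ReFrame's actual implementation:

```python
class CleanupTracker:
    """Clean up a test's stage directory only after all of its
    dependents have finished."""

    def __init__(self, deps):
        # deps: {test name: [tests it depends on]}
        self.deps = deps
        self.finished = set()
        self.cleaned = []                     # cleanup order, for inspection
        self.pending = {t: 0 for t in deps}   # unfinished dependents per test
        for t, targets in deps.items():
            for d in targets:
                self.pending[d] += 1

    def finish(self, test):
        """Mark `test` as finished and clean up whatever is now safe."""
        self.finished.add(test)
        if self.pending[test] == 0:
            # Nothing depends on this test anymore: safe to clean up
            self.cleaned.append(test)

        for d in self.deps[test]:
            self.pending[d] -= 1
            if self.pending[d] == 0 and d in self.finished:
                # Last dependent just finished: clean up the dependency too
                self.cleaned.append(d)
```

In the OSU example above, this is why the OSUBuildTest stage directory, which holds the built binaries, survives until the last runtime test has completed.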