ReFrame Test Library (experimental)

This is a collection of generic tests that you can either run out of the box, by specializing them for your system with the -S option, or use as a basis for your own site-specific tests.

Microbenchmarks

OSU microbenchmarks

There are two final parameterized tests that represent the various OSU benchmarks:

  • The osu_run test that runs the benchmarks only. This assumes that the OSU microbenchmarks are installed and available.

  • The osu_build_run test that builds and runs the benchmarks. This test uses two fixtures: one to fetch the benchmarks and one to build them.

Depending on your setup, you can select the most appropriate final test. The benchmarks define various variables with reasonable defaults that affect their execution. For collective communication benchmarks, you must set num_tasks explicitly. All tests set num_tasks_per_node to 1 by default.

Examples

Run the run-only version of the point-to-point bandwidth benchmark:

reframe -n 'osu_run.*benchmark_info=mpi.pt2pt.osu_bw' -S modules=my-osu-benchmarks -S valid_systems=mysystem -S valid_prog_environs=myenv -l

Build and run the CUDA-aware version of the allreduce benchmark:

reframe -n 'osu_build_run.*benchmark_info=mpi.collective.osu_allreduce.*build_type=cuda' -S device_buffers=cuda -S num_tasks=16 -S valid_systems=sys:part -S valid_prog_environs=myenv -l
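Beyond command-line specialization, you can also build a small site-specific test on top of the library classes. The following is a minimal sketch, assuming hypothetical system, partition and environment names (mysystem:gpu, myenv); individual variants, e.g. a specific benchmark_info or build_type, are still selected with the -n option as in the examples above:

# my_osu_checks.py -- a minimal site-specific sketch building on the
# library's osu_build_run test; all site names below are assumptions.
import reframe as rfm
from hpctestlib.microbenchmarks.mpi.osu import osu_build_run


@rfm.simple_test
class my_osu_check(osu_build_run):
    valid_systems = ['mysystem:gpu']    # hypothetical partition
    valid_prog_environs = ['myenv']     # hypothetical environment
    device_buffers = 'cuda'
    num_tasks = 16                      # required for collective benchmarks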
class hpctestlib.microbenchmarks.mpi.osu.build_osu_benchmarks(*args, **kwargs)[source]

Bases: CompileOnlyRegressionTest

Fixture for building the OSU benchmarks.

build_type = <reframe.core.parameters.TestParam object>

Build variant parameter.

Type: str
Values: 'cpu', 'cuda', 'rocm', 'openacc'

osu_benchmarks = <reframe.core.fixtures.TestFixture object>

The fixture object that retrieves the benchmarks.

Type: fetch_osu_benchmarks
Scope: session

class hpctestlib.microbenchmarks.mpi.osu.fetch_osu_benchmarks(*args, **kwargs)[source]

Bases: RunOnlyRegressionTest

Fixture for fetching the OSU benchmarks.

version = 5.9

The version of the benchmarks to fetch.

Type: str
Default: '5.9'

class hpctestlib.microbenchmarks.mpi.osu.osu_benchmark(*args, **kwargs)[source]

Bases: RunOnlyRegressionTest

OSU benchmark test base class.

benchmark_info = <reframe.core.parameters.TestParam object>

Parameter indicating the benchmark to execute.

Type: 2-element tuple containing the benchmark name and whether latency or bandwidth is to be measured
Values: mpi.collective.osu_alltoall, mpi.collective.osu_allreduce, mpi.pt2pt.osu_bw, mpi.pt2pt.osu_latency

device_buffers = cpu

Device buffers.

Use accelerator device buffers. Valid values are cpu, cuda, openacc or rocm.

Type: str
Default: 'cpu'

message_size

Maximum message size.

Both the performance and the sanity checks will be done for this message size.

This value is set to 8 for latency benchmarks and to 4194304 for bandwidth benchmarks.

Type: int

num_iters = 1000

Number of iterations.

This value is passed to the executable through the -i option.

Type: int
Default: 1000

num_tasks

Number of tasks to use.

This variable is required. It is set to 2 for point-to-point benchmarks, but it is undefined for collective benchmarks.

Required: Yes

num_warmup_iters = 10

Number of warmup iterations.

This value is passed to the executable through the -x option.

Type: int
Default: 10

class hpctestlib.microbenchmarks.mpi.osu.osu_build_run(*args, **kwargs)[source]

Bases: osu_benchmark

OSU benchmark test (build and run)

osu_binaries = <reframe.core.fixtures.TestFixture object>

The fixture object that builds the OSU binaries.

Type: build_osu_benchmarks
Scope: environment

class hpctestlib.microbenchmarks.mpi.osu.osu_run(*args, **kwargs)[source]

Bases: osu_benchmark

Run-only OSU benchmark test

GPU benchmarks

class hpctestlib.microbenchmarks.gpu.gpu_burn.gpu_burn_build(*args, **kwargs)[source]

Bases: CompileOnlyRegressionTest

Fixture for building the GPU burn benchmark.

Summary

Variables: gpu_arch, gpu_build
Parameters: None
Fixtures: None

gpu_arch = None

Set the GPU architecture.

This variable will be passed to the compiler to generate the arch-specific code.

Type: str or None
Default: None

gpu_build = None

Set the build option to either 'cuda' or 'hip'.

Type: str
Default: 'cuda'

class hpctestlib.microbenchmarks.gpu.gpu_burn.gpu_burn_check(*args, **kwargs)[source]

Bases: RunOnlyRegressionTest

GPU burn benchmark.

This benchmark continuously runs GEMM kernels, in either single or double precision, on a selected set of GPUs of the node where it runs.

The floating point precision of the computations, the duration of the benchmark as well as the list of GPU devices that the benchmark will run on can be controlled through test variables.

This benchmark tries to build the benchmark code through the gpu_burn_build fixture.

This benchmark sets the num_gpus_per_node test attribute, if not already set, based on the number of devices with type == 'gpu' defined in the corresponding partition configuration. Similarly, this benchmark will use the arch device configuration attribute to set the gpu_arch variable, if this is not already set by the user.

Summary

Variables: devices, duration, use_dp
Parameters: None
Metrics: gpu_perf_min, gpu_temp_max
Fixtures: gpu_burn_build
System features: +gpu
Environment features: +cuda OR +hip

devices = []

List of device IDs to run the benchmark on.

If empty, the benchmark will run on all the available devices.

Type: List[int]
Default: []

duration = 10

Duration of the benchmark in seconds.

Type: int
Default: 10

gpu_perf_min()[source]

Lowest performance recorded among all the selected devices.

gpu_temp_max()[source]

Maximum temperature recorded among all the selected devices.

use_dp = True

Use double-precision arithmetic when running the benchmark.

Type: bool
Default: True
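As a usage illustration, a site-specific test could build on this benchmark roughly as follows; the system and environment names, the duration and the reference value are all assumptions to be adapted to your site (the metric unit is taken to be Gflop/s here):

# my_gpu_burn.py -- a minimal sketch building on gpu_burn_check;
# all site-specific values below are assumptions.
import reframe as rfm
from hpctestlib.microbenchmarks.gpu.gpu_burn import gpu_burn_check


@rfm.simple_test
class my_gpu_burn_check(gpu_burn_check):
    valid_systems = ['mysystem:gpu']   # hypothetical partition with the gpu feature
    valid_prog_environs = ['gpu-env']  # hypothetical environment with cuda or hip
    duration = 60                      # burn each GPU for 60 seconds
    reference = {
        'mysystem:gpu': {
            # illustrative number only; calibrate for your hardware
            'gpu_perf_min': (4000.0, -0.10, None, 'Gflop/s'),
        }
    }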

Scientific Applications

class hpctestlib.sciapps.amber.nve.amber_nve_check(*args, **kwargs)[source]

Bases: RunOnlyRegressionTest

Amber NVE test.

Amber is a suite of biomolecular simulation programs. It began in the late 1970s and is maintained by an active development community.

This test is parameterized over the benchmark type (see benchmark_info) and the variant of the code (see variant). Each test instance executes the benchmark, validates its output numerically, and extracts and reports a performance metric.

assert_energy_readout()[source]

Assert that the obtained energy meets the required tolerance.

property bench_name

The benchmark name.

Type: str

benchmark_info = <reframe.core.parameters.TestParam object>

Parameter pack encoding the benchmark information.

The first element of the tuple refers to the benchmark name, the second is the energy reference and the third is the tolerance threshold.

Type: Tuple[str, float, float]

Values:
[
    ('Cellulose_production_NVE', -443246.0, 5.0E-05),
    ('FactorIX_production_NVE', -234188.0, 1.0E-04),
    ('JAC_production_NVE_4fs', -44810.0, 1.0E-03),
    ('JAC_production_NVE', -58138.0, 5.0E-04)
]
property energy_ref

The energy reference value for this benchmark.

Type: float

property energy_tol

The energy tolerance value for this benchmark.

Type: float

input_file

The input file to use.

This is set to mdin.CPU or mdin.GPU depending on the test variant during initialization.

Type: str
Required: Yes

num_tasks

See num_tasks.

The mpi variant of the test requires num_tasks > 1.

Required: Yes

output_file = amber.out

The output file to pass to the Amber executable.

Type: str
Required: No
Default: 'amber.out'

perf()[source]

The performance of the benchmark expressed in ns/day.
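A minimal site-specific sketch for this check follows; the module, system and environment names are assumptions:

# my_amber_checks.py -- a minimal sketch building on amber_nve_check;
# all site-specific values below are assumptions.
import reframe as rfm
from hpctestlib.sciapps.amber.nve import amber_nve_check


@rfm.simple_test
class my_amber_check(amber_nve_check):
    valid_systems = ['mysystem:cpu']   # hypothetical partition
    valid_prog_environs = ['myenv']    # hypothetical environment
    modules = ['Amber']                # hypothetical module name
    num_tasks = 4                      # required; > 1 for the mpi variant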

class hpctestlib.sciapps.gromacs.benchmarks.gromacs_check(*args, **kwargs)[source]

Bases: RunOnlyRegressionTest

GROMACS benchmark test.

GROMACS is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles.

The benchmarks consist of a set of input files, varying in the number of atoms, which can be found in the following versioned repository: https://github.com/victorusu/GROMACS_Benchmark_Suite/.

Each test instance validates its output numerically and extracts and reports a performance metric.

assert_energy_readout()[source]

Assert that the obtained energy meets the benchmark tolerances.

property bench_name

The benchmark name.

Type: str

benchmark_info = <reframe.core.parameters.TestParam object>

Parameter pack encoding the benchmark information.

The first element of the tuple refers to the benchmark name, the second is the energy reference and the third is the tolerance threshold.

Type: Tuple[str, float, float]

Values:

benchmark_version = 1.0.0

The version of the benchmark suite to use.

Type: str
Default: '1.0.0'

property energy_ref

The energy reference value for this benchmark.

Type: float

property energy_tol

The energy tolerance value for this benchmark.

Type: float

nb_impl = <reframe.core.parameters.TestParam object>

Parameter encoding the implementation of the non-bonded calculations.

Type: str
Values: ['cpu', 'gpu']
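A site-specific test can build on this check in the same fashion; the sketch below uses assumed site names, and specific variants can then be selected with -n, e.g. -n 'my_gromacs_check.*nb_impl=cpu':

# my_gromacs_checks.py -- a minimal sketch building on gromacs_check;
# all site-specific values below are assumptions.
import reframe as rfm
from hpctestlib.sciapps.gromacs.benchmarks import gromacs_check


@rfm.simple_test
class my_gromacs_check(gromacs_check):
    valid_systems = ['mysystem:cpu']   # hypothetical partition
    valid_prog_environs = ['myenv']    # hypothetical environment
    modules = ['GROMACS']              # hypothetical module name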

Data Analytics

class hpctestlib.data_analytics.spark.spark_checks.compute_pi_check(*args, **kwargs)[source]

Bases: RunOnlyRegressionTest

Test Apache Spark by computing PI.

Apache Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs. It also supports a rich set of higher-level tools including Spark SQL for SQL and structured data processing, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for incremental computation and stream processing (see spark.apache.org).

This test checks that Spark is functioning correctly. To do this, it computes an approximation of pi and compares it with the value obtained from Python's math library; a configurable tolerance defines the acceptable deviation. The default assumption is that Spark is already installed on the system under test.

assert_pi_readout()[source]

Assert that the obtained pi value meets the specified tolerances.

exec_cores = 1

The number of cores per Spark executor.

Type: int
Required: No
Default: 1

executor_memory

Amount of memory to use per executor process, following the JVM memory strings convention, i.e. a number with a size-unit suffix ("k", "m", "g" or "t"), e.g. 512m or 2g.

Type: str
Required: Yes

num_workers = 1

The number of Spark workers per node.

Type: int
Required: No
Default: 1

spark_local_dirs = /tmp

The local directories used by Spark.

Type: str
Required: No
Default: '/tmp'

spark_prefix

The Spark installation prefix path.

Type: str
Required: Yes

tolerance = 0.01

The absolute tolerance of the computed value of pi.

Type: float
Required: No
Default: 0.01

variant = <reframe.core.parameters.TestParam object>

Parameter encoding the variant of the test.

Type: str
Values: ['spark', 'pyspark']
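Since executor_memory and spark_prefix are required variables, a site-specific test must set them. A minimal sketch, with all site values being assumptions, could look like this:

# my_spark_checks.py -- a minimal sketch building on compute_pi_check;
# all site-specific values below are assumptions.
import reframe as rfm
from hpctestlib.data_analytics.spark.spark_checks import compute_pi_check


@rfm.simple_test
class my_spark_pi_check(compute_pi_check):
    valid_systems = ['mysystem:cpu']   # hypothetical partition
    valid_prog_environs = ['builtin']
    spark_prefix = '/opt/spark'        # hypothetical installation prefix
    executor_memory = '2g'             # required variable
    num_workers = 2
    exec_cores = 4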

Python

class hpctestlib.python.numpy.numpy_ops.numpy_ops_check(*args, **kwargs)[source]

Bases: RunOnlyRegressionTest

NumPy basic operations test.

NumPy is the fundamental package for scientific computing in Python. It provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more.

This test performs some fundamental NumPy linear algebra operations (matrix product, SVD, Cholesky decomposition, eigendecomposition and inverse matrix calculation) and uses the execution time as a performance metric. The default assumption is that NumPy is already installed on the current system.

time_cholesky()[source]

Time of the cholesky kernel in seconds.

time_dot()[source]

Time of the dot kernel in seconds.

time_eigendec()[source]

Time of the eigendec kernel in seconds.

time_inv()[source]

Time of the inv kernel in seconds.

time_svd()[source]

Time of the svd kernel in seconds.
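To turn these timings into pass/fail criteria, a site-specific test can attach references to the performance functions above. The following sketch uses made-up site names and reference timings:

# my_numpy_checks.py -- a minimal sketch building on numpy_ops_check;
# site names and reference timings below are assumptions.
import reframe as rfm
from hpctestlib.python.numpy.numpy_ops import numpy_ops_check


@rfm.simple_test
class my_numpy_check(numpy_ops_check):
    valid_systems = ['mysystem:cpu']   # hypothetical partition
    valid_prog_environs = ['builtin']
    reference = {
        'mysystem:cpu': {
            # illustrative upper bounds; a +20% deviation is allowed
            'time_dot': (1.0, None, 0.20, 's'),
            'time_svd': (5.0, None, 0.20, 's'),
        }
    }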

Interactive Computing

class hpctestlib.interactive.jupyter.ipcmagic.ipcmagic_check(*args, **kwargs)[source]

Bases: RunOnlyRegressionTest

Test ipcmagic via a distributed TensorFlow training with ipyparallel.

ipcmagic is a Python package and a collection of CLI scripts for controlling clusters for Jupyter. For more information, refer to the ipcmagic documentation.

This test checks the ipcmagic performance. To do this, a single-layer neural network is trained against a noisy linear function. The parameters of the fitted linear function are returned in the end along with the resulting loss function. The default assumption is that ipcmagic is already installed on the system under test.

assert_successful_execution()[source]

Checks that the program runs on two different nodes (i.e., the reported hostnames differ), that ipcmagic is configured correctly and that the correct end-of-program message is returned (the slope parameter of the fitted function).

Machine Learning

class hpctestlib.ml.tensorflow.horovod.tensorflow_cnn_check(*args, **kwargs)[source]

Bases: RunOnlyRegressionTest

Run a synthetic CNN benchmark with TensorFlow2 and Horovod.

TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries and community resources that lets researchers push the state-of-the-art in ML and developers easily build and deploy ML powered applications. For more information, refer to https://www.tensorflow.org/.

Horovod is a distributed deep learning training framework for TensorFlow, Keras, PyTorch, and Apache MXNet. The goal of Horovod is to make distributed deep learning fast and easy to use. For more information refer to https://github.com/horovod/horovod.

This test runs the Horovod tensorflow2_synthetic_benchmark.py example, checks its sanity and extracts the GPU performance.

batch_size = 32

The size of the batch used during the learning of models.

Type: int
Default: 32

benchmark_version = v0.21.0

The version of Horovod to use.

Type: str
Default: 'v0.21.0'

model = InceptionV3

The name of the model to use for this benchmark.

Type: str
Default: 'InceptionV3'

num_batches_per_iter = 5

The number of batches per iteration.

Type: int
Default: 5

num_iters = 5

The number of iterations.

Type: int
Default: 5

num_warmup_batches = 5

The number of warmup batches.

Type: int
Default: 5

throughput_iteration()[source]

The average GPU throughput per iteration in images/s.

throughput_total()[source]

The total GPU throughput of the benchmark in images/s.
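A site-specific sketch for this check might look as follows; the system and environment names as well as the task geometry are assumptions:

# my_tf_checks.py -- a minimal sketch building on tensorflow_cnn_check;
# all site-specific values below are assumptions.
import reframe as rfm
from hpctestlib.ml.tensorflow.horovod import tensorflow_cnn_check


@rfm.simple_test
class my_tf_cnn_check(tensorflow_cnn_check):
    valid_systems = ['mysystem:gpu']   # hypothetical partition
    valid_prog_environs = ['tf-env']   # hypothetical environment
    num_tasks = 8                      # assumption: one Horovod rank per GPU
    num_tasks_per_node = 4
    batch_size = 64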

class hpctestlib.ml.pytorch.horovod.pytorch_cnn_check(*args, **kwargs)[source]

Bases: RunOnlyRegressionTest

Run a synthetic CNN benchmark with PyTorch and Horovod.

PyTorch is a Python package that provides tensor computation like NumPy with strong GPU acceleration and deep neural networks built on a tape-based autograd system. For more information, refer to https://pytorch.org/.

Horovod is a distributed deep learning training framework for TensorFlow, Keras, PyTorch, and Apache MXNet. The goal of Horovod is to make distributed deep learning fast and easy to use. For more information refer to https://github.com/horovod/horovod.

This test runs the Horovod pytorch_synthetic_benchmark.py example, checks its sanity and extracts the GPU performance.

batch_size = 64

The size of the batch used during the learning of models.

Type: int
Default: 64

benchmark_version = v0.21.0

The version of Horovod to use.

Type: str
Default: 'v0.21.0'

model = inception_v3

The name of the model to use for this benchmark.

Type: str
Default: 'inception_v3'

num_batches_per_iter = 5

The number of batches per iteration.

Type: int
Default: 5

num_iters = 5

The number of iterations.

Type: int
Default: 5

num_warmup_batches = 5

The number of warmup batches.

Type: int
Default: 5

throughput_iteration()[source]

The average GPU throughput per iteration in images/s.

throughput_total()[source]

The total GPU throughput of the benchmark in images/s.
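The PyTorch check is used in the same way; the sketch below additionally attaches a performance reference to throughput_total, with all names and numbers made up for illustration:

# my_pt_checks.py -- a minimal sketch building on pytorch_cnn_check;
# all site-specific values below are assumptions.
import reframe as rfm
from hpctestlib.ml.pytorch.horovod import pytorch_cnn_check


@rfm.simple_test
class my_pt_cnn_check(pytorch_cnn_check):
    valid_systems = ['mysystem:gpu']   # hypothetical partition
    valid_prog_environs = ['pt-env']   # hypothetical environment
    num_tasks = 8
    num_tasks_per_node = 4
    reference = {
        'mysystem:gpu': {
            # illustrative number; calibrate on your hardware
            'throughput_total': (1500.0, -0.10, None, 'images/s'),
        }
    }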