Testing Practices

Overview

Summary of the testing practices used by RAPIDS projects.

Intended audience

Developers

Project Leads

Operations

Infrastructure

Overall goal: follow the open source ecosystem for infrastructure choices.

Jenkins

An open source project for CI/CD and general automation that is highly extensible and widely adopted.

gpuCI is built on top of Jenkins with a variety of plugins.

GitHub

  • All projects are hosted on GitHub for open source visibility

Docker

  • Industry standard for containerized software
  • Allows for isolating all builds

Code Style / Formatting

  • Run style checks and/or auto-formatters per commit to ensure code is uniformly formatted in a clean and consistent way before it is ever merged to the repository
    • Handles things like maximum line length, trailing whitespace, linebreak semantics, etc.
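The kind of check described above can be illustrated with a minimal sketch. This is not how Flake8 or any other formatter is implemented; the function name `find_style_issues` and the 79-character limit are assumptions chosen for illustration only.

```python
from typing import List, Tuple

MAX_LINE_LENGTH = 79  # assumed limit; real projects configure this per repository


def find_style_issues(source: str, max_len: int = MAX_LINE_LENGTH) -> List[Tuple[int, str]]:
    """Return (line_number, message) pairs for two simple formatting problems."""
    issues = []
    # splitlines() handles both "\n" and "\r\n" line endings.
    for number, line in enumerate(source.splitlines(), start=1):
        if len(line) > max_len:
            issues.append((number, f"line too long ({len(line)} > {max_len})"))
        if line != line.rstrip():
            issues.append((number, "trailing whitespace"))
    return issues


if __name__ == "__main__":
    sample = "x = 1   \ny = 2\n"
    for number, message in find_style_issues(sample):
        print(f"{number}: {message}")
```

A per-commit job would typically fail the build when a check like this returns a non-empty list, so the problem is reported before the change is merged.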

Follow open source ecosystem in use of code formatters

  • C++
    • Clang-Format (Planned)
  • Python
    • Flake8
    • Auto-formatting with Black (Planned)

Unit Testing

  • Runs per commit to ensure code is never pushed in a broken state
    • Reports back to GitHub to give the end user feedback on exactly what the issues are
  • Matrix of tests across supported operating system / CUDA / Python versions, along with runs under cuda-memcheck (planned)
  • Tests that the project builds successfully both with and without GPUs, then ensures that the suite of unit tests runs successfully with GPUs
    • Tests building of packages for conda and pip
  • Tests are designed as black box tests for both external user facing functions and internal functions
    • Tests are written to compare against pandas / NumPy / scikit-learn / NetworkX / etc. to ensure the results match what end users expect
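The comparison-against-a-reference approach can be sketched as follows. To keep the example runnable without a GPU or third-party packages, the standard library's `statistics.fmean` stands in for the trusted reference (the role played by Pandas or NumPy above), and `library_mean` is a hypothetical function under test, not part of any RAPIDS API.

```python
import math
import statistics
from typing import Sequence


def library_mean(values: Sequence[float]) -> float:
    """Hypothetical function under test (stands in for a GPU-accelerated API)."""
    total = 0.0
    for value in values:
        total += value
    return total / len(values)


def check_against_reference(values: Sequence[float]) -> bool:
    """Black-box check: only inputs and outputs are compared, never internals."""
    expected = statistics.fmean(values)  # trusted reference result
    actual = library_mean(values)
    # Floating-point results can differ in the last bits, so compare with a tolerance.
    return math.isclose(actual, expected, rel_tol=1e-9, abs_tol=1e-12)


if __name__ == "__main__":
    print(check_against_reference([1.0, 2.5, 4.0]))
```

Comparing with a tolerance rather than exact equality matters here, because two correct implementations may accumulate floating-point values in a different order.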

Follow open source ecosystem in use of testing frameworks

Datasets

  • Many tests depend on the presence of specific datasets in order to properly verify code correctness. See datasets for more details.

Integration / Workflow Testing and Benchmarking

  • Runs nightly to ensure the different libraries integrate as expected, similar to how other Python libraries integrate (e.g. cuDF with cuML vs. pandas with scikit-learn)
  • In addition to checking that the runs succeed without error and produce correct results, measures performance regressions in the workflows
  • Pipe output of Google Benchmark into ASV dashboards for easy consumption
  • Run with profiling and dump an nvprof / nsight profile per workflow for easy analysis by developers (planned)
  • Allows for naturally using example / workflow notebooks for integration / workflow / performance testing as well
  • Matrix of tests across supported Operating System / CUDA / Python versions
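A test matrix of this kind is the Cartesian product of the supported values on each axis. The sketch below shows one way to enumerate it; the axis values are placeholders, since the supported versions are not listed here, and `build_matrix` is a hypothetical helper rather than part of gpuCI or Jenkins.

```python
from itertools import product
from typing import Dict, List, Sequence

# Placeholder values only; substitute the versions a project actually supports.
OPERATING_SYSTEMS = ["os-a", "os-b"]
CUDA_VERSIONS = ["cuda-x", "cuda-y"]
PYTHON_VERSIONS = ["py-1", "py-2"]


def build_matrix(
    oses: Sequence[str], cudas: Sequence[str], pythons: Sequence[str]
) -> List[Dict[str, str]]:
    """Return one configuration per (OS, CUDA, Python) combination."""
    return [
        {"os": os_name, "cuda": cuda, "python": python}
        for os_name, cuda, python in product(oses, cudas, pythons)
    ]


if __name__ == "__main__":
    for config in build_matrix(OPERATING_SYSTEMS, CUDA_VERSIONS, PYTHON_VERSIONS):
        print(config)
```

Because the matrix grows multiplicatively with each axis, adding one supported version to any axis adds a whole slice of jobs, which is worth keeping in mind when deciding what to run per commit and what to run nightly.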

Follow open source ecosystem in use of testing frameworks

Packaging and Release

  • Follow open source ecosystem in packaging and delivery mechanisms:
    • Conda
    • Pip
    • Docker
  • Releases
    • Release every ~6 weeks with no known critical bugs
    • Allows users to have a stable release that won’t introduce performance regressions / bugs during the development process
    • The full set of unit / integration / workflow tests is run before publishing packages / containers
  • “Nightlies”
    • Allows cutting edge users to install the latest conda package to test new functionality coming in the next release
    • Conda packages are created for each project on a per merge basis
    • Docker containers are built nightly; integration tests are run against them and must pass before the containers are published

Examples of Testing in Action

  • DLPack support for cuDF
  • String support for cuDF
    • https://github.com/rapidsai/cudf/pull/1032
      • C++ implementation and Python bindings
      • Originally depended on an unreleased version of a different library so CI builds failed and prevented merging into the main repository until it was resolved
      • GTest and pytest unit tests, where the results of each incrementally drove development across both C++ and Python
      • Heavily uses unit test parameterization to effectively test different function parameters for sufficient test coverage
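Test parameterization of the kind mentioned above can be sketched as a table of cases driving a single test body. This example uses the standard library's `unittest.subTest` so it runs without extra packages; with pytest, the same table would usually be passed to `pytest.mark.parametrize`. The function `clamp` and the case table are hypothetical and are not taken from the cuDF pull request.

```python
import unittest


def clamp(value, low, high):
    """Hypothetical function under test: limit value to the range [low, high]."""
    return max(low, min(value, high))


# Each tuple is one parameter combination: (value, low, high, expected).
CASES = [
    (5, 0, 10, 5),    # inside the range
    (-3, 0, 10, 0),   # below the lower bound
    (42, 0, 10, 10),  # above the upper bound
    (0, 0, 0, 0),     # degenerate range
]


class ClampTest(unittest.TestCase):
    def test_clamp(self):
        for value, low, high, expected in CASES:
            # subTest reports each failing combination separately.
            with self.subTest(value=value, low=low, high=high):
                self.assertEqual(clamp(value, low, high), expected)


def run_clamp_tests():
    """Run the parameterized test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ClampTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_clamp_tests()
```

Keeping the cases in a table makes it cheap to add a new parameter combination when a bug is found, which is one way coverage grows alongside the implementation.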