HPC and Big Data Technologies for Global Challenges


Reproducible Software Environments & Benchmarks with Ansible and Spack
HiDALGO approach

Sergiy Gogolenko

High-Performance Computing Center, Stuttgart, DE

HiDALGO, 2021-04-14

Motivation

Novel Conventions in Reproducibility for HPC Research

Evolution of approaches to reproducible research in HPC

  • "Experimental setup" section in HPC papers
  • "Measurement bias is significant, commonplace, and unpredictable". T.Mytkowicz et al. (2009) "Producing wrong data without doing anything obviously wrong!" ACM SIGPLAN
    • "Program performance is sensitive to the experimental setup."
      • conventions for reproducible experiments.
        E.g.: I.Jimenez et al (2017) "The Popper Convention: Making Reproducible Systems Evaluation Practical". IPDPS17
    • statistics can help to detect (causal analysis) and avoid (setup randomization) bias
      • careful reporting in terms of statistics.
        E.g.: T.Hoefler and R.Belli (2015) "Scientific Benchmarking of Parallel Computing Systems: Twelve Ways to Tell the Masses when Reporting Performance Results". SC15

Evolution of approaches to reproducible research in HPC

  • personal story: EC cares (2020)

    … to have a reproducible benchmarking, optimisations, and visualisations.

    … to provide how to reproduce the experiments in line with open science and open data standards.

    … scientifically rigorous reporting of results

Popper convention

    • verbal instructions are NOT safe and/or "rigorous"
    • instead, provide a set of precise configs and scripts that describe
      • available packages and default preferences
      • how to deploy the software
      • how to produce plots
    • must report metadata

Popper toolchain

Reproducing Python environment on the local workstation (PyPI)

If you are not doing a performance study, it is as simple as this:

  • take/create a list of required packages

    numpy==1.18.2
    scipy>=1.1.0
    pandas
    mpi4py
    profilehooks
    
  • and run pip

    pip install -r requirements.txt
    
  • report/store the existing software environment

    pip freeze > requirements.txt
    
  • … but in HPC we care!!!
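The `pip freeze` step above can be sketched without pip itself: the standard library exposes the same installed-distribution metadata that pip reports. A minimal sketch (the helper name `freeze` is ours, not a pip API):

```python
# Minimal sketch of what `pip freeze` reports: the name and exact
# version of every installed distribution, via the standard library.
from importlib import metadata

def freeze():
    """Return sorted 'name==version' lines for installed distributions."""
    return sorted(
        "{}=={}".format(dist.metadata["Name"], dist.version)
        for dist in metadata.distributions()
    )

if __name__ == "__main__":
    print("\n".join(freeze()))
```

Redirecting this output to a file gives the same kind of pinned manifest as `pip freeze > requirements.txt`.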

What about reproducing general software environments on the resources of several HPC centers at once?

It's a little bit harder.
Like a little...

... except a lot.

Typical issues

  • management of dependencies (versions, etc)
  • taking care of site- and system-specific details:
    • diversity of build systems and compilers
    • diversity of recommended setups (compilers, options, libs, etc.) for different systems/sites
    • different stacks of pre-installed ("native") software
  • off-line sites
  • maintenance of different installation versions (combinatorial versioning)
  • sometimes one needs to do extra work to port codes (patching)

Typical lifecycle of the HPC system user

"try different options from manuals"

"call support"

"tedious, error-prone and time-consuming process"

Can we reduce the troubles?

"yes, we can approach the level of PyPI simplicity"


Why use Spack in the first place?

  • make it possible to install off-line on bare metal with full control over the installation process
  • consistent build customization for each platform
  • reproducible software environments for all use cases over all platforms
    • way to reproducible science with lock-files
    • installation matrices for benchmarks and performance studies
      • effect of compilers & different configurations
  • provides almost for free such features as:
    • containerization of environments (both Docker & Singularity)
    • GitLab CI pipelines
    • documentation of the installation process

Why are rpm / apt / yum or Homebrew / conan insufficient?

  • binary package managers (rpm, yum, apt, yast, etc.)
    • manage a single stack
    • install one version of each package in a single prefix (/usr)
    • seamless upgrades to a stable, well tested stack
  • port systems (Homebrew, etc.)
    • minimal support for builds parameterized by compiler or dependency versions

Common disadvantage:

  • usually do not support combinatorial versioning
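Combinatorial versioning is essentially a Cartesian product over build axes; a small sketch (spec strings chosen purely for illustration) shows how quickly the number of installations a site must maintain grows:

```python
from itertools import product

# Each axis (version, compiler, MPI implementation) multiplies the
# number of distinct installations that may have to coexist.
versions  = ["python@3.8", "python@3.9"]
compilers = ["%gcc@9", "%clang@10"]
mpis      = ["^openmpi", "^mpich"]

specs = ["{}{} {}".format(v, c, m)
         for v, c, m in product(versions, compilers, mpis)]

print(len(specs))  # 2 * 2 * 2 = 8 distinct builds
```

Binary package managers keep exactly one of these eight builds; Spack can keep all of them side by side.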

For Python users

  • advantages over PyPI or poetry:
    • full support for packages with non-Python dependencies
      • compiles non-Python dependencies
      • can build cythonized versions of a package
      • can link to optimized libraries (e.g., MKL for BLAS/LAPACK)
  • advantages over conda:
    • ability to choose a specific compiler
    • can link to specific libraries (BLAS/LAPACK, MPI, …)
    • better platform support for supercomputers (builds optimized binaries for specific microarchitectures)
  • disadvantages of Spack:
    • vs. PyPI: a huge number of packages are not yet in Spack
    • vs. conda: no Windows support

Even more for Data Scientists (and Java in general)

  • Java: ibm-java, jdk, openjdk, icedtea, etc
  • Spark/PySpark, Hadoop

Package number

Container recipes?

  • Container recipes look like a valid reproducible software environment for HPC, don't they?
  • Formally, yes…
  • … but it is a substitution of terms, isn't it?
  • will you be able to run it on different hardware?

Basics of Spack

Core Spack Concepts


Spec

Syntax of spec (sigils, etc)

$ spack install python                                     # unconstrained
$ spack install python@3                                   # @ custom version
$ spack install python@3.9%clang                           # % custom compiler
$ spack install python@3.9%gcc@5.4 +optimizations~debug    # +/- build option
$ spack install python@3.9 cppflags="-O3 -g3"              # set compiler flags
$ spack install python@3.9 target=skylake                  # set target microarchitecture
$ spack install python@3.9 ^openssl@1.1:%clang ^readline@8 # ^ dependency information
  • installed packages can be referred to by hash

    $ spack find -l readline
    
    ==> 2 installed packages
    -- linux-ubuntu16.04-broadwell / gcc@5.4.0 ----------------------
    sei2v7j readline@8.0  feqfow6 readline@8.0
    
    $ spack install python@3.9 ^openssl@1.1:%clang ^/feq     # / hash
    
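The sigil grammar above is regular enough to sketch a toy parser for it. This is a hypothetical helper for illustration only, not part of Spack (Spack's real parser also handles flags, targets, hashes, and nested constraints):

```python
import re

def parse_spec(spec):
    """Split a simplified spec into name, @version, %compiler,
    +/~variants, and ^dependencies (toy parser, not Spack's)."""
    name = re.match(r"[\w.-]+", spec).group(0)
    version = re.search(r"@([\w.:]+)", spec)
    compiler = re.search(r"%([\w.@-]+)", spec)
    variants = re.findall(r"([+~])(\w+)", spec)
    deps = re.findall(r"\^(\S+)", spec)
    return {
        "name": name,
        "version": version.group(1) if version else None,
        "compiler": compiler.group(1) if compiler else None,
        "variants": {v: sigil == "+" for sigil, v in variants},
        "deps": deps,
    }
```

For example, `parse_spec("python@3.9%gcc@5.4+optimizations~debug")` yields name `python`, version `3.9`, compiler `gcc@5.4`, and variants `optimizations` on / `debug` off.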

How to construct spec?

  • hash (concretize or find -l)
  • package name

    $ spack list                          # all packages
    $ spack list *pyth*                   # simple filters 
$ spack list -d MPI                   # also search the description for a match (like apt-cache search)
    
    • list extensible packages

      $ spack extensions                # extendable packages
      
      ==> Extendable packages:
          aspell  go-bootstrap  jdk      lua     mofem-cephas  openjdk  python  ruby  spiral  yorick
          go      icedtea       kim-api  matlab  octave        perl     r       rust  tcl
      
    • list Python extensions

      $ spack extensions -s packages python | head -n4
      
      ==> python@3.9.0%gcc@5.4.0+bz2+ctypes+dbm~debug+libxml2+lzma~nis+optimizations+pic...
      ==> 1345 extensions:
      adios2
      akantu
      

How to construct spec?

  • version

    $ spack versions python                   # basic
    $ spack versions  *mpi*                   # include virtual packages in list
    $ spack list --format version_json *mpi*  # print output JSON with versions, deps, homepage, path to package
    
  • compiler

    $ spack compilers                     # list of specs for registered compilers
    $ spack compiler list                 # ----/----
$ spack compiler info clang           # learn more about the configuration of a specific compiler
    

How to construct spec? Configuration and Options

  • flags (for compilers & linkers -> NOT RECOMMENDED)
    • compilers: cppflags, cflags, cxxflags, and fflags
    • linkers: ldflags, and ldlibs
  • variants and dependencies

    $ spack info libosrm
    
    CMakePackage:   libosrm
    
    Description:
        libOSRM is a high performance routing library written in C++14 designed
        to run on OpenStreetMap data.
    
    Homepage: http://project-osrm.org/
    
    Tags: 
        None
    
    Preferred version:  
        5.24.0    https://github.com/Project-OSRM/osrm-backend/archive/v5.24.0.tar.gz
    
    Safe versions:  
        5.24.0    https://github.com/Project-OSRM/osrm-backend/archive/v5.24.0.tar.gz
    
    Variants:
        Name [Default]          Allowed values    Description
        ====================    ==============    =====================================
    
        build_type [Release]    Debug, Release    The build type to build
        doxygen [off]           on, off           Install with doxygen
        ipo [off]               on, off           CMake interprocedural optimization
        lib_only [on]           on, off           Install OSRM in a library only mode
        osmium [off]            on, off           Install with libosmium
        protozero [off]         on, off           Install with third party Protozero
        shared [off]            on, off           Enables the build of shared libraries
    
    Installation Phases:
        cmake    build    install
    
    Build Dependencies:
        binutils  boost  bzip2  cmake  doxygen  git  intel-tbb  libosmium  libxml2  libzip
        lua  pkg-config  protozero  zlib
    
    Link Dependencies:
        boost  bzip2  doxygen  intel-tbb  libosmium  libxml2  libzip  lua  protozero  zlib
    
    Run Dependencies:
        None
    
    Virtual Packages: 
        None
    

How to construct spec? Virtual packages

  • list virtual packages

    $ spack providers
    
    Virtual packages:
        D     fftw-api  golang  lapack          mysql-client  rpc        unwind
        awk   flame     iconv   mariadb-client  opencl        scalapack  yacc
        blas  gl        ipp     mkl             osmesa        sycl
        daal  glu       java    mpe             pil           szip
        elf   glx       jpeg    mpi             pkgconfig     tbb
    
  • get providers list

    $ spack providers mpi
    
    mpi:
    fujitsu-mpi            mpich@1:   mpt@3:         mvapich2x       openmpi@2.0.0:
    intel-mpi              mpich@3:   mvapich2       nvhpc           spectrum-mpi
    intel-oneapi-mpi       mpilander  mvapich2@2.1:  openmpi
    intel-parallel-studio  mpt        mvapich2@2.3:  openmpi@1.6.5
    mpich                  mpt@1:     mvapich2-gdr   openmpi@1.7.5:
    

How to construct spec? Platform

  • full specification of the current platform

    $ spack arch -f               # frontend
    $ spack arch -b               # backend
    
  • parts of specification of the current platform

    $ spack arch -b -t            # only the target
    $ spack arch -b -p            # only platform
    $ spack arch -b -o            # only the operating system  
    
  • list targets known to Spack

    $ spack arch --known-targets
    

Packages

from spack import *


class Libosrm(CMakePackage):
    """libOSRM is a high performance routing library written in C++14
    designed to run on OpenStreetMap data."""

    homepage = "http://project-osrm.org/"
    url      = "https://github.com/Project-OSRM/osrm-backend/archive/v5.24.0.tar.gz"

    maintainers = ['sgo-go']

    # To add checksums for more versions, one can use =spack checksum libosrm=
    version('master', branch='master')
    version('5.24.0',                   sha256='a66b20e7ffe83e5e5fe12324980320e12a6ec2b05f2befd157de5c60c665613c')
    version('5.23.0-rc.2',              sha256='bc1f6024bbfd491bddaf02ac9a15fc7786274e22fd1097014a4590430fd41199')
    version('5.23.0-rc.1',              sha256='df4ad08a758be487809f477fbbdc80558c5bca3ed2a24b8719fb636a6ace8b36')
    version('5.23.0',                   sha256='8527ce7d799123a9e9e99551936821cc0025baae6f2120dbf2fbc6332c709915')
    version('5.22.0-customsnapping.3',  sha256='414922ec383f9cbfcb10f2ced80688359f1ee5e87b920b0d00b3d6eda9b5925b')
    version('5.21.0-customsnapping.11', sha256='9dcb8795ae8c581264655c818dfc2f33f394557869a738675cc41021e1c07b78')
    version('5.21.0-customsnapping.10', sha256='bbd6c3878ec559742f700e92202a7239a6c61cedfc399921f68b7d4e5eb30eb4')
    version('5.21.0-customsnapping.9',  sha256='933b6bb7b29b0f96d54814ac5d81478e0de1a05cb3f1e3d6748c941c3efc87bd')
    version('5.21.0-customsnapping.8',  sha256='ff1ac87b8671145a6dbf8d2985df07627293c939794f49b9114574f48821f2ca')
    version('5.21.0-customsnapping.7',  sha256='34562aa5ee13dd18113d926ab91147ca29677ceddec21e8e11676c51c51012a2')

    variant('shared', default=False,
	    description='Enables the build of shared libraries')
    variant('build_type', default='Release',
	    description='The build type to build',
	    values=('Debug', 'Release'))
    variant('osmium', default=False,
	    description='Install with libosmium')
    variant('doxygen', default=False,
	    description='Install with doxygen')
    variant('protozero', default=False,
	    description='Install with third party Protozero (otherwise download it during installation)')

    # ---- See about OSRM library at:
    # https://github.com/Project-OSRM/osrm-backend/blob/master/docs/libosrm.md
    variant('lib_only', default=True,
	    description='Install OSRM in a library only mode')

    # Build-time dependencies:
    # depends_on('m4', type='build') # build-essential
    depends_on('binutils', type='build')
    depends_on('pkg-config', type='build')
    depends_on('cmake@3.1:', type='build', when='@5.21:')
    depends_on('git', type='build', when='@master')

    # depends_on('expat@2.2.6:')
    depends_on('bzip2')
    depends_on('libxml2')
    depends_on('libzip')
    depends_on('zlib')
    depends_on('boost@1.54.0:')
    depends_on('lua@5.2:')
    depends_on('intel-tbb') # 2019.3

    depends_on('libosmium', when='+osmium')
    depends_on('doxygen', when='+doxygen')
    depends_on('protozero', when='+protozero')

    conflicts('%gcc', when='@:5.0', msg='libOSRM needs C++14 support (GCC >= 5)')
    # conflicts('+cxx', when='@:0.6', msg='Variant cxx requires Julia >= 1.0.0')
    # conflicts('@:0.7.0', when='target=aarch64:')
    # conflicts('@:0.5.1', when='%gcc@8:', msg='Julia <= 0.5.1 needs GCC <= 7')

    def setup_build_environment(self, env):
	env.set('LC_CTYPE', 'en_US.UTF-8')
	env.set('LC_ALL', 'en_US.UTF-8')

	env.set('BOOST_INCLUDEDIR', self.spec['boost'].headers.directories[0])
	env.set('BOOST_LIBRARYDIR', self.spec['boost'].libs.directories[0])

    def cmake_args(self):
	variant_bool = lambda feature: str(feature in self.spec)
	cmake_args = []

	cmake_args.append('-DLUA_INCLUDE_DIR=%s' % self.spec['lua'].headers.directories[0])
	cmake_args.append('-DBUILD_SHARED_LIBS:BOOL=%s' % variant_bool('+shared'))
	cmake_args.append('-DBoost_USE_STATIC_LIBS=ON') # %s' % variant_bool('+shared')

	return cmake_args
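The `cmake_args` pattern above can be illustrated standalone (no Spack required): `variant_bool` simply maps the presence of a variant to the string "True"/"False", which CMake accepts as boolean values. Here `spec_variants` stands in for Spack's `self.spec` membership test:

```python
# Standalone sketch of the variant-to-flag mapping in cmake_args() above.
def variant_bool(spec_variants, feature):
    """Mimic `str(feature in self.spec)` for a set of active variants."""
    return str(feature in spec_variants)

def cmake_args(spec_variants):
    # Translate variants into CMake cache definitions.
    return [
        "-DBUILD_SHARED_LIBS:BOOL=%s" % variant_bool(spec_variants, "+shared"),
        "-DBoost_USE_STATIC_LIBS=ON",
    ]
```

So a `+shared` spec produces `-DBUILD_SHARED_LIBS:BOOL=True`, and a `~shared` spec produces `False`.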

Supported build systems:

Installation with Spack: Command install

$ spack install           python@3.9+optimized # install the package along with all its dependencies
$ spack install -f            requirements.yml # install from file (similar to pip -r)

$ spack install --only package      python@3.9 # install only the package itself, skipping dependencies
$ spack install --only dependencies python@3.9 # install only the dependencies of the package

$ spack install -j32      python@3.9+optimized # explicitly set 32 parallel jobs

$ spack install --fail-fast         python@3.9 # stop all builds if any build fails
$ spack install -y        python@3.9+optimized # use "yes" to every confirmation request

$ spack install --overwrite         python@3.9 # reinstall an existing spec

Installation with Spack: Mirrors, Tests, and Logging

Mirrors

$ spack install -n                   libosrm # do not use checksums to verify downloaded files (unsafe)
$ spack install --no-cache           libosrm # do not check for pre-built Spack packages in mirrors (default: --use-cache)
$ spack install --cache-only         libosrm # only install package from binary mirrors
$ spack install --no-check-signature libosrm # do not check signatures of binary packages

Tests

$ spack install --test root   libosrm # run package tests for top-level packages
$ spack install --test all    libosrm # run package tests during installation for all packages

Logs (CDash and JUnit reporters)

$ spack install --log-format cdash  libosrm # format for the log file (here: cdash)
$ spack install --log-file osrm.log libosrm # filename for the log file. if not passed a default will be used
$ spack install --help-cdash                # show usage instructions for CDash reporting

Installation with Spack: Debugging options

# Debug
$ spack install -v                  libosrm # display verbose build output while installing
$ spack install --show-log-on-error libosrm # print full build log to stderr if build fails
$ spack install --keep-prefix       libosrm # don't remove the install prefix if installation fails
$ spack install --dont-restage      libosrm # if a partial install is detected, don't delete prior state
$ spack install --source            libosrm # install source files in prefix
$ spack install --fake              libosrm # fake install
$ spack install --dirty             libosrm # preserve user environment in spack's build environment (danger!)

Installation with Spack: Installation steps. What can fail?

Installation steps

  1. concretization of spec
  2. installation of dependencies
  3. fetch and stage
  4. create build directory
  5. config/build/install
  6. clean up build directory and stage

Installation with Spack: What can fail? Concretizer

$ spack spec -l boost libosrm # show dependency hashes as well as versions
$ spack spec -L boost libosrm # show full dependency hashes as well as versions
$ spack spec -I boost libosrm # show install status of packages.
$ spack spec -N boost libosrm # show fully qualified package names
$ spack spec -t boost libosrm # show dependency types
$ spack --color always spec -lIt libosrm | less -R
Input spec
--------------------------------
 -   [    ]  .libosrm

Concretized
--------------------------------
[+]  cljend3  [    ]  hidalgo.libosrm@5.24.0%gcc@5.4.0~doxygen~ipo+lib_only~osmium~protozero~shared build_type=Release arch=linux-ubuntu16.04-broadwell
 -   faipele  [b   ]      ^builtin.binutils@2.35.1%gcc@5.4.0+gold~headers~interwork~ld~libiberty~lto+nls~plugins arch=linux-ubuntu16.04-broadwell
 -   5lobbsd  [b   ]          ^builtin.diffutils@3.3%gcc@5.4.0 arch=linux-ubuntu16.04-broadwell
 -   cerl5gx  [bl  ]          ^builtin.gettext@0.21%gcc@5.4.0+bzip2+curses+git~libunistring+libxml2+tar+xz arch=linux-ubuntu16.04-broadwell
[+]  44w5m3q  [bl  ]              ^builtin.bzip2@1.0.6%gcc@5.4.0+shared arch=linux-ubuntu16.04-broadwell
[+]  2lu3ocr  [bl  ]              ^builtin.libiconv@1.16%gcc@5.4.0 arch=linux-ubuntu16.04-broadwell
[+]  qxsf3og  [bl  ]              ^builtin.libxml2@2.9.10%gcc@5.4.0~python arch=linux-ubuntu16.04-broadwell
[+]  uqlvynn  [b   ]                  ^builtin.pkg-config@0.29.1%gcc@5.4.0+internal_glib patches=49ffcd644e190dc5efcb2fab491177811ea746c1a526f75d77118c2706574358 arch=linux-ubuntu16.04-broadwell
[+]  z4e64mz  [blr ]                  ^builtin.xz@5.1.0alpha%gcc@5.4.0~pic arch=linux-ubuntu16.04-broadwell
[+]  lhfrrg2  [bl  ]                  ^builtin.zlib@1.2.11%gcc@5.4.0+optimize+pic+shared arch=linux-ubuntu16.04-broadwell
[+]  kmjncy3  [bl  ]              ^builtin.ncurses@6.2%gcc@5.4.0~symlinks+termlib arch=linux-ubuntu16.04-broadwell
[+]  j3btfe7  [bl  ]              ^builtin.tar@1.28%gcc@5.4.0 patches=08921fcbd732050c74ddf1de7d8ad95ffdbc09f8b4342456fa2f6a0dd02a957c,125cd6142fac2cc339e9aebfe79e40f90766022b8e8401532b1729e84fc148c2,5c314db58d005043bf407abaf25eb9823b9032a22fd12a0b142d4bf548130fa4,d428578be7fb99b831eb61e53b8d88a859afe08b479a21238180899707d79ce4 arch=linux-ubuntu16.04-broadwell
[+]  teztdxn  [bl  ]      ^builtin.boost@1.74.0%gcc@5.4.0+atomic+chrono~clanglibcpp~container~context~coroutine+date_time~debug+exception~fiber+filesystem+graph~icu+iostreams+locale+log+math~mpi+multithreaded~numpy~pic+program_options~python+random+regex+serialization+shared+signals~singlethreaded+system~taggedlayout+test+thread+timer~versionedlayout+wave cxxstd=98 visibility=hidden arch=linux-ubuntu16.04-broadwell
 -   begrqoj  [b   ]      ^builtin.cmake@3.5.1%gcc@5.4.0~doc+ncurses+openssl+ownlibs~qt arch=linux-ubuntu16.04-broadwell
[+]  oytuwse  [bl  ]      ^builtin.intel-tbb@2020.3%gcc@5.4.0+shared+tm cxxstd=default patches=62ba015ebd1819c45bef47411540b789b493e31ca668c4ff4cb2afcbc306b476,ce1fb16fb932ce86a82ca87cf0431d1a8c83652af9f552b264213b2ff2945d73 arch=linux-ubuntu16.04-broadwell
[+]  3ctx5ya  [bl  ]      ^builtin.libzip@1.2.0%gcc@5.4.0 arch=linux-ubuntu16.04-broadwell
[+]  gaavhbi  [bl  ]      ^builtin.lua@5.3.5%gcc@5.4.0+shared arch=linux-ubuntu16.04-broadwell
[+]  sei2v7j  [bl  ]          ^builtin.readline@8.0%gcc@5.4.0 arch=linux-ubuntu16.04-broadwell
[+]  svmtcfx  [  r ]          ^builtin.unzip@6.0%gcc@5.4.0 arch=linux-ubuntu16.04-broadwell

-I marks packages already installed with Spack by [+]

Installation with Spack: What can fail? Concretizer

Solutions:

  1. depending on error message change spec (e.g., manually define dependency) AND GO TO 4.
  2. play with option -c / --cover (how extensively to traverse the DAG)
  3. try clingo concretizer (e.g., see the issue #19951)

    $ spack install clingo%aocc+python ^python@2.7.5
    $ spack load clingo
    $ python -m pip install --user --upgrade clingo
    $ spack find -p clingo
    
  4. try to modify package.py

Installation with Spack: What can fail? Dependency

  • A failing dependency can be identified with

    $ spack install --fail-fast --only dependencies python@3.9
    

Solution:

  • find out how to install failed package

Installation with Spack: What can fail? Config/Build/Install of Package

Solution:

  1. install dependencies, stage package and switch there

    $ spack clean python@3.9 # clean old builds
    $ spack stage python@3.9 # stage package (in particular, create build directory)
    $ spack cd    python@3.9 # switch to build directory
    
    • try to fix issue with config/build/install manually…
    • … in the build environment

      $ spack build-env python@3.9 bash # use build environment in bash session
      
  2. change package.py correspondingly
    • prepare patch if needed
  3. clean up stage

Installation with Spack: Reproducibility mechanisms

  • content of folder

    ls -a $(spack location -i boost)
    
    .  ..  include  lib  .spack
    
  • metadata (.spack)

    $ tree $(spack location -i boost)/.spack
    
    ~/spack/opt/spack/linux-ubuntu16.04-broadwell/gcc-5.4.0/boost-1.74.0-teztdxn6e7aqxjlnactwivwoyrza5uan/.spack
    ├── install_manifest.json           # info about files
    ├── repos                           # packages from repos required to install package 
    │   └── builtin
    │       ├── packages
    │       │   ├── boost
    │       │   │   ├── 1.72_boost_process.patch
    │       │   │   ├── boost_11856.patch
    │       │   │   ├── boost_154.patch
    │       │   │   ├── boost_1.63.0_pgi_17.4_workaround.patch
    │       │   │   ├── boost_1.63.0_pgi.patch
    │       │   │   ├── boost_1.67.0_pgi.patch
    │       │   │   ├── bootstrap-path.patch
    │       │   │   ├── call_once_variadic.patch
    │       │   │   ├── clang-linux_add_option2.patch
    │       │   │   ├── clang-linux_add_option.patch
    │       │   │   ├── darwin_clang_version.patch
    │       │   │   ├── fujitsu_version_analysis.patch
    │       │   │   ├── nvhpc.patch
    │       │   │   ├── package.py
    │       │   │   ├── python_jam.patch
    │       │   │   ├── python_jam_pre156.patch
    │       │   │   ├── system-non-virtual-dtor-include.patch
    │       │   │   ├── system-non-virtual-dtor-test.patch
    │       │   │   └── xl_1_62_0_le.patch
    │       │   └── zlib
    │       │       ├── package.py
    │       │       └── w_patch.patch
    │       └── repo.yaml
    ├── spack-build-env.txt             # build environment
    ├── spack-build-out.txt             # output produced during build
    └── spec.yaml                       # fully concretized specs
    
  • see also package analyzers

Basic commands

We met already

$ spack list pyt*   # basic search
$ spack list -d MPI # --search-description (also search the description)
$ spack list -v mpi # --virtuals (also include virtual packages like mpi)

$ spack install python@3.9%clang+optimizations ^openssl@1.1
$ spack spec python@3.9%clang@3.8+optimizations~debug ^openssl@1.1

find: check what can be used?

Mental model: module avail

  • spec queries

    $ spack find ^python@3.7:   # every installed package that depends on python@3.7 or newer
    $ spack find cppflags="-O3" # packages built with cppflags="-O3"
    
  • many options

    $ spack find -vdfpl libosrm # variants; dependencies; compiler flags; path; dependency hashes and versions
    
    ==> 1 installed package
    -- linux-ubuntu16.04-broadwell / gcc@5.4.0 ----------------------
    cljend3 libosrm@5.24.0%gcc ~doxygen~ipo+lib_only~osmium~protozero~shared build_type=Release \
      ~/spack/opt/spack/linux-ubuntu16.04-broadwell/gcc-5.4.0/libosrm-5.24.0-cljend3d5xl67apn443bisynsc2qow5n
    ...
    oytuwse     intel-tbb@2020.3%gcc +shared+tm cxxstd=default \
      patches=62ba015ebd1819c45bef47411540b789b493e31ca668c4ff4cb2afcbc306b476,\
        ce1fb16fb932ce86a82ca87cf0431d1a8c83652af9f552b264213b2ff2945d73\
    ~/spack/opt/spack/linux-ubuntu16.04-broadwell/gcc-5.4.0/intel-tbb-2020.3-oytuwsekyhdkpzgr23lpaikphlbi5b7i
    ...
    

extensions: check what can be used? Extensible packages

Mental model: pip freeze

  • the same command can be used to list installed Python extensions

    $ spack extensions -s installed -l python@3.9
    
    ==> 91 installed:
    -- linux-ubuntu16.04-broadwell / gcc@5.4.0 ----------------------
    jfsg4va py-absl-py@0.7.1           uvfpwty py-ipywidgets@7.5.1           wk45srk py-notebook@6.1.4
    zaheb77 py-argon2-cffi@20.1.0      vb3fqto py-jedi@0.13.3                ia52w5s py-numexpr@2.7.0 
    6t6bpnu py-astunparse@1.6.3        qyep3ry py-jinja2@2.10.3              kl356sd py-numpy@1.19.4  
    hx3smrj py-async-generator@1.10    btz2obt py-joblib@0.14.0              vu2hbdd py-pandas@1.1.4  
    sm57agu py-attrs@20.3.0            zeezzki py-jsonschema@3.2.0           64thtja py-pandocfilters@1.4.2
    hlkiwaw py-babel@2.7.0             ky5z5w3 py-jupyter@1.0.0              lx7tvvc py-parso@0.6.1        
    qrtn4qe py-backcall@0.1.0          fuwgb36 py-jupyter-client@6.1.7       lym624m py-pexpect@4.7.0      
    pfqlasj py-bleach@3.1.0            mivk4ck py-jupyter-console@6.1.0      nm2pvbb py-pickleshare@0.7.5  
    dqtwtaq py-bottleneck@1.2.1        otvcfqc py-jupyter-core@4.6.3         yhh6l3o py-pillow@8.0.0       
    xxfvwb3 py-cached-property@1.5.2   touqak7 py-jupyterlab-pygments@0.1.1  yggtq4k py-pip@20.2           
    cm4lc2e py-certifi@2020.6.20       cbigt3n py-keras@2.2.4                opm6xwx py-pkgconfig@1.5.1        
    dh6ieic py-cffi@1.14.3             f3eeils py-keras-applications@1.0.8   bmm2pm6 py-prometheus-client@0.7.1
    loavcnk py-cycler@0.10.0           dq2lhng py-keras-preprocessing@1.1.2  5b22yba py-prompt-toolkit@2.0.9   
    lljlmpx py-cython@0.29.21          uyfbpmu py-kiwisolver@1.1.0           tm2do3c py-protobuf@3.12.2        
    qlttezg py-decorator@4.4.2         ecjsden py-markupsafe@1.1.1           6ltbrs4 py-ptyprocess@0.6.0       
    zet4w5e py-defusedxml@0.6.0        qvpjxp4 py-matplotlib@3.3.3           atkpmfo py-py@1.8.0               
    4krwyqj py-entrypoints@0.3         o2euz3e py-mistune@0.8.4              uxe5b4t py-pybind11@2.5.0         
    6jpojou py-gast@0.3.3              6blhip6 py-mpi4py@3.0.3               fypwsnl py-pycparser@2.20         
    ue4m7zr py-google-pasta@0.1.8      f7zo6x4 py-nbclient@0.5.0             lfpzpfz py-pygments@2.6.1         
    oqbyjff py-grpcio@1.32.0           fdqc7ms py-nbconvert@6.0.1            vqnjzqp py-pyparsing@2.4.2        
    mknyf7g py-ipykernel@5.3.4         srd52cl py-nbformat@5.0.7             l55llo6 py-pyrsistent@0.15.7      
    57ncgy7 py-ipython@7.18.1          vwhb77l py-nest-asyncio@1.4.0         pt74pv5 py-python-dateutil@2.8.0  
    3etfubk py-ipython-genutils@0.2.0  sra7n3k py-nose@1.3.7                 4wrfs5r py-pytz@2020.1
    

load and activate: How to access packages?

Mental model: module load

  • load package

    $ spack load --first libosrm@5.24.0 # first match if multiple packages match the spec
    $ spack load --sh    libosrm@5.24.0 # print sh commands to load the package
    $ spack unload       libosrm@5.24.0
    $ spack unload -a                   # unload all loaded Spack packages
    
  • global activation/deactivation of extensions

    $ spack activate         py-numpy
    $ spack activate -f      py-numpy # without first activating dependencies
    $ spack deactivate       py-numpy
    $ spack deactivate --all python
    
    • drawbacks:
      • some extensions may still need their C dependencies to be loaded manually (e.g., spack load openblas for py-numpy)
      • multiple versions of a package cannot be activated side-by-side

load and activate: How to access packages? Check active packages

  • can check packages with find

    $ spack load libosrm@5.24.0
    $ spack find --loaded
    
    ==> 13 installed packages
    -- linux-ubuntu16.04-broadwell / gcc@5.4.0 ----------------------
    boost@1.74.0  intel-tbb@2020.3  libosrm@5.24.0  libzip@1.2.0  ncurses@6.2   unzip@6.0      zlib@1.2.11
    bzip2@1.0.6   libiconv@1.16     libxml2@2.9.10  lua@5.3.5     readline@8.0  xz@5.1.0alpha
    
    • load package only

      $ spack load --only=package libosrm@5.24.0
      
  • can check extensions with extensions

    $ spack extensions -s activated -l python@3
    

uninstall

$ spack uninstall    python@3.9 # basic uninstall (refuses if other packages depend on this one)
$ spack uninstall -f python@3.9 # remove regardless of whether other packages or environments depend on this one
$ spack uninstall -R python@3.9 # also uninstall any packages that depend on the ones given via command line
$ spack uninstall -y python@3.9 # assume "yes" is the answer to every confirmation request
$ spack uninstall -a python@3.9 # remove all installed packages that match each supplied spec

Environments

Mental model: Python virtual environments with PyPI

Typical work flow for installing environment

$ spack env create sna
$ spack env activate sna

# Add specs to the environment
$ spack add python@3.9.0+optimizations
$ spack add py-numpy ^python@3+optimizations
$ spack add py-scipy ^python@3+optimizations
$ spack add py-scikit-learn ^python@3+optimizations

$ spack concretize
$ spack install

despacktivate  # alias for `spack env deactivate`

Content of environment folder

  • content of the folder

    $ tree -a $(spack location -e sna)
    ~/spack/var/spack/environments/sna
    ├── .spack-env
    │   └── transaction_lock
    ├── spack.lock
    └── spack.yaml
    
  • content of spack.yaml

    # This is a Spack Environment file.
    #
    # It describes a set of packages to be installed, along with
    # configuration settings.
    spack:
      # add package specs to the `specs` list
      specs: [python@3.9.0+optimizations, py-numpy ^python@3+optimizations, py-scipy ^python@3+optimizations,
        py-scikit-learn ^python@3+optimizations]
      view: true  
    

Editing Environment Config: Matrices and Combinatorial versioning

  • open environment config in editor

    $ spack config edit
    
  • 1st iteration of improvements: matrices

    spack:
      specs:
      - python@3.9.0+optimizations
      - matrix:
        - [py-numpy, py-scipy, py-scikit-learn]
        - [^python@3.9.0+optimizations]
      view: true
    

Editing Config: Definitions

  • 2nd iteration of improvements: definitions

    spack:
      definitions:
      - packages:
        - py-numpy
        - py-scipy
        - py-scikit-learn
      - pythons: [python@3.9.0+optimizations]
      # - compilers: [gcc@8.1.0]
      specs:
      - $pythons
      - matrix:
        - [$packages]
        - [$^pythons]
        # - [$%compilers]
      view: true
    

Editing Config: More features

  • 3rd iteration of improvements: containers and concretization together

    spack:
      definitions:
      - packages:
        - py-numpy
        - py-scipy
        - py-scikit-learn
      - pythons: [python@3.9.0+optimizations]
      specs:
      - $pythons
      - matrix:
        - [$packages]
        - [$^pythons]
      concretization: together
      view: true
      container:
        format: singularity
    

Editing Environment Config: Installation preferences

spack:
  packages:
    python:
      version: [3.9.0, "3.8:"]
      variants: +optimizations
      # buildable: true
  specs:
  - python
  - matrix:
    - [py-numpy, py-scipy, py-scikit-learn]
    - [^python]
  view: true

Reproducing environments

  • create environment from file
    • logical: a set of abstract specs (manifest)

      $ spack env create sna spack.yaml
      
    • exact: a set of all fully concretized specs

      $ spack env create sna spack.lock
      
  • concretize

    $ spack concretize -f
    

Stacks

Allow to define "a set of packages we want to install across a wide range of compilers".

  • constraints resolution for matrices:
    • dependencies and variants can be used regardless of whether they apply to every package
  • instruments
    • keywords for environment definitions
      • matrix: cartesian product for lists of spec parts
      • exclude: excludes specific combinations from matrix
    • conditional definitions (e.g., when: arch.satisfies('x86_64:'))
    • view descriptors
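The keywords above can be combined in a spack.yaml like the following sketch (the package names, compiler versions, and the excluded combination are illustrative, not taken from the HiDALGO setup):

```yaml
spack:
  definitions:
  - compilers: ['%gcc@8.1.0', '%clang@10.0.0']
  # conditional definition: only included on x86_64 targets
  - packages: [py-numpy, py-scipy]
    when: arch.satisfies('x86_64:')
  specs:
  - matrix:          # cartesian product: packages x compilers
    - [$packages]
    - [$compilers]
    exclude:         # drop one specific combination from the matrix
    - py-scipy%clang@10.0.0
  view: true
```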

Environment vs. BundlePackage

  • basic bundle packages: opencl-headers, fastmath
  • advanced bundles: Xsdk
  • PythonPackage bundles: py-exodus-bundler, py-jupyter
class Fastmath(BundlePackage):
    homepage = "https://fastmath-scidac.org/"
    version('latest')

    depends_on('amrex')  # default is 3 dimensions
    depends_on('chombo@3.2')
    depends_on('hypre~internal-superlu')
    depends_on('mpi')
    depends_on('arpack-ng')
    depends_on('petsc')
    depends_on('phasta')
    depends_on('pumi')
    depends_on('sundials')
    depends_on('superlu-dist')
    depends_on('trilinos')
    depends_on('zoltan')
  • bundle packages in the context of HiDALGO:
    • definition of benchmark suites and sets of miniapps (e.g., Ceed)

Configuring Spack

Configuration scopes

  • common (lower-precedence scopes first):

    Scope     Location                               Description
    defaults  $SPACK_ROOT/etc/spack/defaults         "factory" settings
    system    /etc/spack                             settings for this machine
    site      $SPACK_ROOT/etc/spack                  settings for this Spack instance
    user      ~/.spack                               all instances of Spack for this user
    custom    --config-scope / -C </path/to/scope>   custom location
  • platform-specific: <base-scope>/<platform> (darwin, linux, …)

Configuration sections

$ spack config list
compilers mirrors repos packages modules config upstreams
Section    Description
config     basic configuration options
packages   build customization
compilers  compiler definitions
mirrors    list of local and remote mirrors
repos      list of repos with build instructions
upstreams  installation trees of external Spack instances
modules    setup for modulefile generation

Commands

  • browsing and editing

    $ spack config get upstreams
    EDITOR=emacsclient spack config edit packages &
    
  • check end result for the superposition of scopes

    $ spack config blame config
    $ spack --insecure -C /path/to/first/scope -C /path/to/second/scope config blame config
    
  • update to the latest format (for the given Spack version)

    $ spack config update config
    $ spack config revert config  
    

config

config:
  # Directories
  install_tree:         # path to the root of the Spack install tree.
    root: $spack/opt/spack
  build_stage::         # temporary locations Spack can try to use for builds.
    - $spack/var/spack/stage
  test_stage: $spack/tmp/$user/test     # directory to run tests and store test results.
  source_cache: $spack/var/spack/cache  # cache directory for tarballs and archived repositories.
  misc_cache: $spack/var/$user/cache    # cache directory for miscellaneous files.

  # Options
  install_missing_compilers: false      # do not auto-build compilers that specs request but are missing
  checksum: true                        # always check checksums after downloading archives.
  ccache: false                         # do not use ccache to cache C compiles.
  build_jobs: 32                        # the maximum number of jobs in `make`

packages

  • permissions

    all:
      permissions:
        read: world # read-world is default, can remove
        write: group
        group: spack
    uap_simulator:
      permissions:
        read: group
        group: hid_uap
    
  • concretization preferences (for packages and virtual packages)

    packages:
      python:
        compiler: [gcc@10.2.0]
        variants: +optimization
        version: [3.6, 3.9.0, 3.7]
      all:
        compiler: [gcc@10.2.0, 'clang@10:', 'gcc', intel]
        target: [zen2]
        providers:
          mpi: [mpt, openmpi]
    

External packages

  • specification of external packages

    mpt:
      buildable: False
      externals:
      - spec: mpt@2.23
        modules:
        - mpt/2.23
    openmpi:
      externals:
      - spec: openmpi@4.0.5
        modules: [openmpi/4.0.5]
      - spec: openmpi@4.0.4%gcc@9.2.0 arch=linux-centos8-zen2
        prefix: /opt/hlrs/non-spack/mpi/openmpi/4.0.4-gcc-9.2.0
    
  • better: make the mpi virtual package non-buildable, covering all providers

    mpi:
      buildable: False
    

Automatic search for externals

$ spack external find --not-buildable cmake 
  • limited to finding a subset of common build-only dependencies (discoverable packages)

    $ spack external list
    
    ==> Detectable packages per repository
    Repository: builtin
        python    jdk        gcc          gpgme    meson     openssl
        autoconf  cpio       gdal         hugo     mpich     perl
        automake  cuda       ruby         intel    mvapich2  pkg-config
        bash      diffutils  ghostscript  krb5     nag       pkgconf
        bazel     findutils  git          libtool  ncurses   spectrum-mpi
        bison     fish       git-lfs      llvm     ninja     tar
        bzip2     fj         gmake        lustre   opengl    texinfo
        ccache    flex       gmt          m4       openjdk   xz
        cmake     fzf        go           maven    openmpi
    
  • Spack does not
    • collect or examine anything beyond executable files
    • search through module files
    • overwrite existing entries in the packages configuration

compilers

  • definition of compiler

    - compiler:
        spec: clang@10.0.0
        paths:
          cc: /opt/hlrs/non-spack/compiler/aocc/2.2.0/bin/aocc-clang
          cxx: /opt/hlrs/non-spack/compiler/aocc/2.2.0/bin/aocc-clang++
          f77: /opt/hlrs/non-spack/compiler/aocc/2.2.0/bin/aocc-flang
          fc: /opt/hlrs/non-spack/compiler/aocc/2.2.0/bin/aocc-flang
        flags: {}
        operating_system: centos8
        target: x86_64
        modules: [aocc/2.2.0]
        environment: {}
        extra_rpaths: []
    

Mixing compilers

  • example of mixing (C/C++ from clang 3.8 with gfortran)

    - compiler:
        spec: clang-gfortran@3.8.0
        paths:
          cc: /usr/bin/clang-3.8
          cxx: /usr/bin/clang++-3.8
          f77: /usr/bin/gfortran
          fc: /usr/bin/gfortran
        operating_system: ubuntu16.04
        flags:
          cflags: -O3 -fPIC
        target: x86_64
        modules: []
        environment: {}
        extra_rpaths: []
    

Automatic search and registration of compilers

  • register a compiler if it is absent
    • automatic

      $ module load gcc
      $ spack compiler find gcc
      
    • by default it goes to ~/.spack/<os>/compilers.yaml

      $ spack -C $SPACK_ROOT/etc/spack compiler find gcc    
      
  • use Spack-installed compilers

    $ spack install clang@8.0.0
    $ spack -C $SPACK_ROOT/etc/spack compiler add $(spack location -i clang@8.0.0)
    

mirrors

  • mirrors.yaml from default scope

    mirrors:
      spack-public: https://spack-llnl-mirror.s3-us-west-2.amazonaws.com/
    
  • specification of off-line mirror

    mirrors::
      hidalgo: file:///path/to/hidalgo/mirror
    

repos

  • repos.yaml from default scope

    repos:
      - $spack/var/spack/repos/builtin
    
  • commands

    $ spack repo create /path/to/new/repo hidalgo # create a new repo
    $ spack repo list                             # show registered repos
    $ spack repo add            /path/to/repo     # add to Spack's configuration
    $ spack -C defaults repo rm       builtin     # remove from Spack's configuration
    

upstreams

  • default location of Spack installations at Hawk

    ll /opt/hlrs/spack/current
    # lrwxrwxrwx 1 hpcoft28 hpc43203 18 Jul  3  2020 /opt/hlrs/spack/current -> rev-004_2020-06-17
    
  • more installations

    ll /opt/hlrs/spack/rev*
    # rev-008_2020-10-03
    
  • register it as an upstream

    upstreams:
      spack-hawk-new:
        install_tree:
          /opt/hlrs/spack/rev-008_2020-10-03
      spack-hawk-default:
        install_tree:
          /opt/hlrs/spack/current
    
  • check packages compiled for the given microarchitecture

    $ spack find target=zen2
    

Ideally, sites should provide their own configs

Ansible


Prerequisites

Local  Remote  Dependencies
AS     S       Python 2.6+ or 3.5+
A              extra Python modules
       A       bash
A              ssh
       S       a C/C++ compiler for building
       AS      make
A      S       tar, gzip, unzip, bzip2, xz
       S       patch
S              git and curl for fetching

(A = Ansible, S = Spack)

Inventory files

  • inventory folder structure

    inventory
    ├── 01-clusters.yaml
    └── group_vars
        ├── HLRS.yaml
        └── PSNC.yaml
    
  • hosts and groups definition (01-clusters.yaml)

    ---
    HLRS:
      hosts:
        hawk:
          ansible_host: hawk.hww.hlrs.de
        vulcan:
          ansible_host: vulcan.hww.hlrs.de
      vars:
        use_workspace: true
    PSNC:
      hosts:
        eagle:
          ansible_host: eagle.man.poznan.pl
    hidalgo:
      children:
        HLRS:
        PSNC:
      vars:
        spack_prefix: ~/spack-hidalgo/{{ inventory_hostname }}
        use_workspace: false
    

Inventory: Check Setup

  • check which variables Ansible sets for each host in the inventory

    ansible-inventory all -i inventory --graph --vars
    
    @all:
      |--@hidalgo:
      |  |--@HLRS:
      |  |  |--hawk
      |  |  |  |--{ansible_host = hawk.hww.hlrs.de}
      |  |  |  |--{ansible_user = hpcgogol}
      |  |  |  |--{spack_prefix = ~/spack-hidalgo/{{ inventory_hostname }}}
      |  |  |  |--{use_workspace = True}
      |  |  |--vulcan
      |  |  |  |--{ansible_host = vulcan.hww.hlrs.de}
      |  |  |  |--{ansible_user = hpcgogol}
      |  |  |  |--{spack_prefix = ~/spack-hidalgo/{{ inventory_hostname }}}
      |  |  |  |--{use_workspace = True}
      |  |  |--{ansible_user = hpcgogol}
      |  |  |--{use_workspace = True}
      |  |--@PSNC:
      |  |  |--eagle
      |  |  |  |--{ansible_host = eagle.man.poznan.pl}
      |  |  |  |--{ansible_user = gogolenko}
      |  |  |  |--{spack_prefix = ~/spack-hidalgo/{{ inventory_hostname }}}
      |  |  |  |--{use_workspace = False}
      |  |  |--{ansible_user = gogolenko}
      |  |--{spack_prefix = ~/spack-hidalgo/{{ inventory_hostname }}}
      |  |--{use_workspace = False}
      |--@ungrouped:
    

Inventory: Check Accessibility

  • check accessibility of HiDALGO HPC infrastructure

    ansible hidalgo -i inventory -m ping
    
    PLAY [Ansible Ad-Hoc] **********************************************************
    
    TASK [ping] ********************************************************************
    ok: [vulcan]
    ok: [hawk]
    ok: [eagle]
    
    PLAY RECAP *********************************************************************
    eagle                      : ok=1    changed=0    unreachable=0    failed=0    skipped=0
    hawk                       : ok=1    changed=0    unreachable=0    failed=0    skipped=0
    vulcan                     : ok=1    changed=0    unreachable=0    failed=0    skipped=0
    

Simple Playbook

  • simple playbook: download locally and distribute

    ---
    - name: Install Spack on HiDALGO infrastructure
      hosts: hidalgo
      gather_facts: no
      vars:
        spack_tarball: ./spack-0.16.1.tar.gz
        spack_version: '0.16.1'
        spack_checksum: 'md5:b6f9fdea5b5228f0a591c7cdcf44513c'
        spack_extra_repos:
        - $spack/var/spack/repos/hidalgo
    
      tasks:
        - name: Download new Spack release locally
          run_once: True
          delegate_to: localhost
          delegate_facts: True
          ansible.builtin.get_url:
            url: 'https://github.com/spack/spack/releases/download/v{{ spack_version }}/spack-{{ spack_version }}.tar.gz'
            dest: '{{ spack_tarball | default(".") }}'
            checksum: '{{ spack_checksum }}'
            force: no
          register: spack_local_tarball

        - name: Unarchive Spack tarball {{ spack_local_tarball.dest }} on the remote hosts
          ansible.builtin.unarchive:
            src: '{{ spack_local_tarball.dest }}'
            dest: '{{ spack_prefix }}'
            extra_opts: [--strip-components=1]
            creates: '{{ spack_prefix }}/bin/spack'
    
  • launch

    ansible-playbook -i inventory -l hawk,eagle install_spack.yml
    

Assembling configs

  • task for creating configs from variables

    - name: Create repos.yaml to enable access to additional repos
      ansible.builtin.copy:
        content: |
          repos:
          {{ spack_extra_repos | default([]) | to_nice_yaml }}
        dest: '{{ spack_prefix }}/etc/spack/repos.yaml'
        force: true
    
  • result

    ssh hawk 'cat ~/spack-hidalgo/hawk/etc/spack/repos.yaml'
    
    repos:
    - $spack/var/spack/repos/hidalgo
    

So we've done it!

Not really...

What about "off-line" clusters?!
mirror

Workflow: Spack deployment with "cluster agnostic" mirror


Mirror for spec list

  • mirror for individual specs

    - name: Mirror individual packages
      ansible.builtin.shell:
        cmd: |
          spack \
            mirror create -D -n{{ spack_mirror_versions_per_spec }} -d {{ spack_mirror_dir }} \
            {{ spack_mirror_packages | join(' ') }}
        creates: '{{ spack_mirror_dir }}'
      when: spack_mirror_packages is defined and (spack_mirror_packages | length > 0)
    
  • issue: interference of ~/.spack
    • packages marked non-buildable in the user's packages.yaml are skipped, so they end up missing from the mirror

Mirror for spec list: Interference of ~/.spack

  • visualize dependency graphs

    spack graph -d libosrm | dot -Tpdf | zathura -
    spack graph -d libosrm | dot -Tsvg -o/tmp/spack.stdin.svg && eog /tmp/spack.stdin.svg
    spack graph -i -d | dot -Tsvg -o./full_installation.svg # all installed (not only in env)
    


Mirror for spec list: corrected

  • mirror for individual specs

    - name: Mirror individual packages
      ansible.builtin.shell:
        cmd: |
          spack -C {{ tempdir_spack_build.path }}/etc/spack \
            mirror create -D -n{{ spack_mirror_versions_per_spec }} -d {{ spack_mirror_dir }} \
            {{ spack_mirror_packages | join(' ') }}
        creates: '{{ spack_mirror_dir }}'
      environment:
        PATH: "{{ ansible_env.PATH }}:{{ tempdir_spack_build.path }}/bin"
      when: spack_mirror_packages is defined and (spack_mirror_packages | length > 0)
    
    
  • tempting solution: put the following text into packages.yaml

    packages:: {}
    
  • correct solution: copy packages.yaml from etc/spack/defaults to '{{ tempdir_spack_build.path }}/etc/spack'
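A minimal sketch of that copy step, with placeholder paths standing in for the real Spack checkout and the Ansible temp directory:

```shell
# placeholders: in the playbook these come from the Spack tarball
# location and from tempdir_spack_build.path
SPACK_SRC=./spack-src
SCOPE=./spack-scope/etc/spack
mkdir -p "$SPACK_SRC/etc/spack/defaults" "$SCOPE"
# stand-in for the packages.yaml that ships with Spack
echo 'packages: {}' > "$SPACK_SRC/etc/spack/defaults/packages.yaml"
# copy the factory defaults into the isolated scope so that
# ~/.spack cannot mark packages non-buildable during mirroring
cp "$SPACK_SRC/etc/spack/defaults/packages.yaml" "$SCOPE/packages.yaml"
# the mirror is then created with: spack -C "$SCOPE" mirror create ...
```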

Mirror for environments

- name: Mirror environments
  ansible.builtin.shell:
    cmd: |
      . {{ tempdir_spack_build.path }}/share/spack/setup-env.sh \
      && spack env create --without-view tmpenv{{ index }} {{ item }} \
      && spack env activate tmpenv{{ index }} \
      && spack -C {{ tempdir_spack_build.path }}/etc/spack concretize \
      && spack -C {{ tempdir_spack_build.path }}/etc/spack \
           mirror create -a -n{{ spack_mirror_versions_per_spec }} -d {{ spack_mirror_dir }}
  # environment:
  #   PATH: "{{ ansible_env.PATH }}:{{ tempdir_spack_build.path }}/bin"
  loop: "{{ spack_mirror_envs | flatten(levels=1) }}"
  loop_control:
    index_var: index
  when: spack_mirror_envs is defined

Workflow: Spack deployment with "cluster aware" mirror


Summary: Reproducing general software environments on the resources of several HPC centers at once

If Ansible inventory is available,

  • take exact or prepare logical Spack environment. E.g.,

    spack:
      specs:
      - python@3.9.0+optimizations
      - matrix:
        - [py-numpy, py-scipy, py-scikit-learn]
        - [^python@3.9.0+optimizations]
      view: true
    
  • run Spack deployment playbook

    ansible-playbook -i inventory install_spack.yml    
    
  • launch Spack command for installing environment

    ansible hidalgo -i inventory -m shell -a \
      'cmd=". {{ spack_prefix }}/share/spack/setup-env.sh \
        && spacktivate sna && spack install"'
    

Spack Deployment: Summary

Highlights

  • decouple Spack config variables from deployment rules in Ansible
  • two strategies for creating mirrors:
    • unsafe: download single generic mirror for all systems
    • safe: download mirrors for each target system separately

Result

  • deploy reproducible software environments (and specs)
  • deploy simultaneously on several hosts

Reporting hardware and software with Ansible

  • platform (hardware, OS): variable '{{ ansible_facts }}'

    - name: Collect facts returned by facter
      ansible.builtin.setup:
        gather_subset:
          - '!all'
          - '!min'
          - hardware
        filter: "*processor*"
    
  • "exact" software: slurp spack.lock and report it with

    {{ spack_env_lock['content']  | b64decode | from_json | to_nice_yaml }}
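The filter chain is roughly equivalent to this Python sketch (the slurp result below is a mock, and the final to_nice_yaml step is omitted since it only changes the output format):

```python
import base64
import json

# mock of what ansible.builtin.slurp would return for spack.lock
lock_json = json.dumps({"roots": [{"spec": "py-numpy"}]})
spack_env_lock = {"content": base64.b64encode(lock_json.encode()).decode()}

# b64decode | from_json steps of the filter chain
lock = json.loads(base64.b64decode(spack_env_lock["content"]))
print(lock["roots"][0]["spec"])  # -> py-numpy
```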
    

Reporting hardware and software with Ansible: Alternatives for reporting software

  • "exact" software: options --yaml and --json

    spack find --json
    spack spec -y libosrm
    
    spec:
    - libosrm:
        version: 5.24.0
        arch:
          platform: linux
          platform_os: ubuntu16.04
          target:
            name: broadwell
            vendor: GenuineIntel
            features:
            - adx
            - aes
            - avx
            - avx2
            - bmi1
            - bmi2
            - f16c
            - fma
            - mmx
            - movbe
            - pclmulqdq
            - popcnt
            - rdrand
            - rdseed
            - sse
            - sse2
            - sse4_1
            - sse4_2
            - ssse3
            generation: 0
            parents:
            - haswell
        compiler:
          name: gcc
          version: 5.4.0
        namespace: hidalgo
        parameters:
          build_type: Release
          doxygen: false
          ipo: false
          lib_only: true
          osmium: false
    

Reporting hardware and software with Ansible: From compute node

  • run from Eagle compute node

    ansible all -i 127.0.0.1, -m ansible.builtin.setup --connection=local
    
  • known issues

    facts:
      discovered_interpreter_python: /usr/bin/python
      processor:
      - '0'
      - GenuineIntel
      - Common KVM processor
      - '1'
      - GenuineIntel
      - Common KVM processor
      - '2'
      - GenuineIntel
      - Common KVM processor
      - '3'
      - GenuineIntel
      - Common KVM processor
      - '4'
      - GenuineIntel
      - Common KVM processor
      - '5'
      - GenuineIntel
      - Common KVM processor
      - '6'
      - GenuineIntel
      - Common KVM processor
      - '7'
      - GenuineIntel
      - Common KVM processor
      - '8'
      - GenuineIntel
      - Common KVM processor
      - '9'
      - GenuineIntel
      - Common KVM processor
      - '10'
      - GenuineIntel
      - Common KVM processor
      - '11'
      - GenuineIntel
      - Common KVM processor
      processor_cores: 12
      processor_count: 1
      processor_nproc: 12
      processor_threads_per_core: 1
      processor_vcpus: 12
    
  • the same information appears in /proc/cpuinfo
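On a Linux node the facts above can be cross-checked directly (assuming /proc/cpuinfo is readable; the vendor_id field may be absent on non-x86 CPUs):

```shell
# number of logical processors (cf. processor_nproc above)
grep -c '^processor' /proc/cpuinfo
# CPU vendor and model of the first core
grep -m1 '^vendor_id' /proc/cpuinfo
grep -m1 '^model name' /proc/cpuinfo
```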

Further topics

Spack

Resources (readings, videos)

Thanks to

  • LLNL team led by Todd Gamblin, Adam Stewart from UIUC, and Spack community
  • Ansible team
  • users: oshch (Oleksandr) and ktokm (Kamil)
  • EC for HiDALGO project
  • Matt Groening
  • … and you for your attention!