easybuild.tools.filetools module

Set of file tools.

author: Stijn De Weirdt (Ghent University)
author: Dries Verdegem (Ghent University)
author: Kenneth Hoste (Ghent University)
author: Pieter De Baets (Ghent University)
author: Jens Timmerman (Ghent University)
author: Toon Willems (Ghent University)
author: Ward Poelmans (Ghent University)
author: Fotis Georgatos (Uni.Lu, NTUA)
author: Sotiris Fragkiskos (NTUA, CERN)
author: Davide Vanzo (ACCRE, Vanderbilt University)
author: Damian Alvarez (Forschungszentrum Juelich GmbH)
author: Maxime Boissonneault (Compute Canada)
class easybuild.tools.filetools.ZlibChecksum(algorithm)

Bases: object

Wrapper class for adler32 and crc32 checksums, to match the interface of the hashlib module.


Return hex string of the checksum


Calculates a new checksum using the old one and the new data
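
The idea of wrapping the zlib checksum functions behind a hashlib-like interface can be sketched as follows. This is a minimal illustration, not the EasyBuild implementation: the class name ZlibChecksumSketch and the zero-padded hex format are assumptions made for the example.

```python
import zlib


class ZlibChecksumSketch:
    """Hypothetical stand-in exposing update()/hexdigest() like hashlib objects."""

    def __init__(self, algorithm):
        # algorithm is a callable such as zlib.adler32 or zlib.crc32
        self.algorithm = algorithm
        # start from the checksum of empty input
        self.checksum = algorithm(b'')

    def update(self, data):
        """Compute a new checksum from the old one and the new data."""
        self.checksum = self.algorithm(data, self.checksum)

    def hexdigest(self):
        """Return the checksum as a hex string (format chosen for this sketch)."""
        return format(self.checksum & 0xffffffff, '08x')


crc = ZlibChecksumSketch(zlib.crc32)
crc.update(b'hello ')
crc.update(b'world')
print(crc.hexdigest())
```

Feeding the data in pieces gives the same result as a single zlib.crc32 call on the whole input, which is what makes block-wise reading of large files possible.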

easybuild.tools.filetools.adjust_permissions(provided_path, permission_bits, add=True, onlyfiles=False, onlydirs=False, recursive=True, group_id=None, relative=True, ignore_errors=False, skip_symlinks=None)

Change permissions for specified path, using specified permission bits

  • add – add permissions relative to current permissions (only relevant if ‘relative’ is set to True)
  • onlyfiles – only change permissions on files (not directories)
  • onlydirs – only change permissions on directories (not files)
  • recursive – change permissions recursively (only makes sense if path is a directory)
  • group_id – also change group ownership to group with this group ID
  • relative – add/remove permissions relative to current permissions (if False, hard set specified permissions)
  • ignore_errors – ignore errors that occur when changing permissions (up to a maximum ratio specified by --max-fail-ratio-adjust-permissions configuration option)

Add or remove (if add is False) permission_bits from all files (if onlydirs is False) and directories (if onlyfiles is False) in path
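
How 'add' and 'relative' interact can be sketched for a single path as shown below. The helper adjust_mode is hypothetical and covers only the bit arithmetic; recursion, group ownership and error handling are left out.

```python
import os
import stat


def adjust_mode(path, permission_bits, add=True, relative=True):
    """Hypothetical single-path helper illustrating the add/relative semantics."""
    current = stat.S_IMODE(os.stat(path).st_mode)
    if relative:
        if add:
            # add the given bits on top of the current permissions
            new_mode = current | permission_bits
        else:
            # remove the given bits from the current permissions
            new_mode = current & ~permission_bits
    else:
        # hard set: ignore current permissions entirely
        new_mode = permission_bits
    os.chmod(path, new_mode)
    return new_mode
```

For example, starting from mode 0o600, adding stat.S_IRGRP gives 0o640, and then removing stat.S_IWUSR gives 0o440.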

easybuild.tools.filetools.apply_patch(patch_file, dest, fn=None, copy=False, level=None, use_git_am=False, use_git=False)

Apply a patch to source code in directory dest - assume unified diff created with “diff -ru old new”

Raises EasyBuildError on any error and returns True on success

easybuild.tools.filetools.apply_regex_substitutions(paths, regex_subs, backup='.orig.eb', on_missing_match=None)

Apply specified list of regex substitutions.

  • paths – list of paths to files to patch (or just a single filepath)
  • regex_subs – list of substitutions to apply, specified as (<regexp pattern>, <replacement string>)
  • backup – create backup of original file with specified suffix (no backup if value evaluates to False)
  • on_missing_match – define what to do when no match was found in the file: ‘error’ to raise an error, ‘warn’ to print a warning, or ‘ignore’ to do nothing; defaults to the value of --strict
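
The parameters above can be illustrated with a small standalone sketch. The helper apply_subs is hypothetical: it handles one file, uses multi-line matching as an assumption, and returns the patterns that did not match, which the real function is not documented to do.

```python
import re
import shutil


def apply_subs(path, regex_subs, backup='.orig.eb', on_missing_match='ignore'):
    """Hypothetical helper: apply (pattern, replacement) pairs to a single file."""
    if backup:
        # keep a copy of the original file next to it, with the given suffix
        shutil.copy2(path, path + backup)
    with open(path) as handle:
        text = handle.read()

    missing = []
    for pattern, replacement in regex_subs:
        text, count = re.subn(pattern, replacement, text, flags=re.M)
        if count == 0:
            missing.append(pattern)
            if on_missing_match == 'error':
                raise ValueError("no match for %r in %s" % (pattern, path))
            if on_missing_match == 'warn':
                print("WARNING: no match for %r in %s" % (pattern, path))

    with open(path, 'w') as handle:
        handle.write(text)
    return missing
```

A typical use would be swapping a compiler in a Makefile, e.g. apply_subs('Makefile', [(r'^CC\s*=.*$', 'CC = icc')]).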
easybuild.tools.filetools.back_up_file(src_file, backup_extension='bak', hidden=False, strip_fn=None)

Backs up a file appending a backup extension and timestamp to it (if there is already an existing backup).

  • src_file – file to back up
  • backup_extension – extension to use for the backup file (can be empty or None)
  • hidden – make backup hidden (leading dot in filename)
  • strip_fn – strip specified trailing substring from filename of backup

Returns: location of backed up file

easybuild.tools.filetools.calc_block_checksum(path, algorithm)

Calculate a checksum of a file by reading it in blocks
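
Block-wise checksumming keeps memory use constant regardless of file size. A minimal sketch, assuming 'algorithm' is an object with update() and hexdigest() methods (such as a hashlib hash object); the function name and the default block size are choices made for this example.

```python
import hashlib


def block_checksum(path, algorithm, blocksize=16 * 1024 * 1024):
    """Feed a file to a hashlib-style object one block at a time."""
    with open(path, 'rb') as handle:
        # iter() with a sentinel stops as soon as read() returns b'' (end of file)
        for block in iter(lambda: handle.read(blocksize), b''):
            algorithm.update(block)
    return algorithm.hexdigest()
```

Usage would look like block_checksum('/path/to/file', hashlib.sha256()).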


Change to directory at specified location.

Parameters: path – location to change to
Returns: previous location we were in

Check whether a lock with specified name already exists.

If it exists, either wait until it’s released, or raise an error (depending on --wait-on-lock configuration option).


Clean up all still existing locks that were created in this session.

easybuild.tools.filetools.clean_up_locks_signal_handler(signum, frame)

Signal handler, cleans up locks & exits with received signal number.

easybuild.tools.filetools.cleanup(logfile, tempdir, testing, silent=False)

Cleanup the specified log file and the tmp directory, if desired.

  • logfile – path to log file to clean up
  • tempdir – path to temporary directory to clean up
  • testing – are we in testing mode? if so, don’t actually clean up anything
  • silent – be silent (don’t print anything to stdout)
easybuild.tools.filetools.compute_checksum(path, checksum_type='md5')

Compute checksum of specified file.

  • path – Path of file to compute checksum for
  • checksum_type – type(s) of checksum (‘adler32’, ‘crc32’, ‘md5’ (default), ‘sha1’, ‘sha256’, ‘sha512’, ‘size’)
easybuild.tools.filetools.convert_name(name, upper=False)

Converts name so it can be used as variable name

easybuild.tools.filetools.copy(paths, target_path, force_in_dry_run=False, **kwargs)

Copy single file/directory or list of files and directories to specified location

  • paths – path(s) to copy
  • target_path – target location
  • force_in_dry_run – force running the command during dry run
  • kwargs – additional named arguments to pass down to copy_dir
easybuild.tools.filetools.copy_dir(path, target_path, force_in_dry_run=False, dirs_exist_ok=False, check_for_recursive_symlinks=True, **kwargs)

Copy a directory from specified location to specified location

  • path – the original directory path
  • target_path – path to copy the directory to
  • force_in_dry_run – force running the command during dry run
  • dirs_exist_ok – boolean indicating whether it’s OK if the target directory already exists
  • check_for_recursive_symlinks – check for recursive symlinks first, if the symlinks argument is not given or is False

shutil.copytree is used if the target path does not exist yet; if the target path already exists, the ‘copy’ function will be used to copy the contents of the source path to the target path

Additional specified named arguments are passed down to shutil.copytree/copy if used.

easybuild.tools.filetools.copy_easyblocks(paths, target_dir)

Find right location for easyblock file and copy it there

easybuild.tools.filetools.copy_file(path, target_path, force_in_dry_run=False)

Copy a file from specified location to specified location

  • path – the original filepath
  • target_path – path to copy the file to
  • force_in_dry_run – force copying of file during dry run
easybuild.tools.filetools.copy_files(paths, target_path, force_in_dry_run=False, target_single_file=False, allow_empty=True, verbose=False)

Copy list of files to specified target path. Target directory is created if it doesn’t exist yet.

  • paths – list of filepaths to copy
  • target_path – path to copy files to
  • force_in_dry_run – force copying of files during dry run
  • target_single_file – if there’s only a single file to copy, copy to a file at target path (not a directory)
  • allow_empty – allow empty list of paths to copy as input (if False: raise error on empty input list)
  • verbose – print a message to report copying of files
easybuild.tools.filetools.copy_framework_files(paths, target_dir)

Find right location for framework file and copy it there

easybuild.tools.filetools.copytree(src, dst, symlinks=False, ignore=None)

DEPRECATED and removed. Use copy_dir

easybuild.tools.filetools.create_index(path, ignore_dirs=None)

Create index for files in specified path.


Create lock with specified name.


Create info dictionary from specified patch spec.

easybuild.tools.filetools.create_unused_dir(parent_folder, name)

Create a new folder in parent_folder using name as the name. If a folder of that name already exists, ‘_0’ is appended, and this is retried with increasing numbers until an unused name is found.
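
The retry scheme described above can be sketched as follows. This is an illustration under the assumption that the suffixes are '_0', '_1', and so on; the function name is hypothetical and the parent folder is assumed to exist.

```python
import os


def create_unused_dir_sketch(parent_folder, name):
    """Create and return a directory under parent_folder that did not exist before."""
    candidate = os.path.join(parent_folder, name)
    number = 0
    while True:
        try:
            # os.mkdir fails if the path exists, so check and creation are one step
            os.mkdir(candidate)
            return candidate
        except FileExistsError:
            candidate = os.path.join(parent_folder, '%s_%d' % (name, number))
            number += 1
```

Relying on the failure of os.mkdir, rather than testing for existence first, avoids a race between checking and creating when several processes use the same parent folder.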


Return decoded version of class name.


Decoding function to revert result of encode_string.


Derive alternate PyPI URL for given URL.


Determine common path prefix for a given list of paths.


Determine size of file from provided HTTP header info (without downloading it).


Determine full path for lock with specified name.

easybuild.tools.filetools.det_patched_files(path=None, txt=None, omit_ab_prefix=False, github=False, filter_deleted=False)

Determine list of patched files from a patch. It searches for “+++ path/to/patched/file” lines to determine the patched files. Note: does not correctly handle filepaths with spaces.

  • path – the path to the diff
  • txt – the contents of the diff (either path or txt should be given)
  • omit_ab_prefix – ignore the a/ or b/ prefix of the files
  • github – only consider lines that start with ‘diff --git’ to determine list of patched files
  • filter_deleted – filter out all files that were deleted by the patch
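
Scanning for “+++ path/to/patched/file” lines can be sketched with a single regular expression. The helper below is hypothetical and covers only the txt and omit_ab_prefix cases; like the documented function, it does not handle file paths with spaces.

```python
import re

# '+++' at the start of a line, whitespace, an optional a/ or b/ prefix, then the path
PATCHED_FILE_REGEX = re.compile(r"^\+{3}\s+(?P<ab_prefix>[ab]/)?(?P<file>\S+)", re.M)


def patched_files_from_text(txt, omit_ab_prefix=False):
    """Return the paths found on '+++ ' lines of a unified diff."""
    files = []
    for match in PATCHED_FILE_REGEX.finditer(txt):
        path = match.group('file')
        prefix = match.group('ab_prefix')
        if prefix and not omit_ab_prefix:
            path = prefix + path
        files.append(path)
    return files


diff_text = """--- a/src/main.c
+++ b/src/main.c
@@ -1 +1 @@
-old line
+new line
"""
print(patched_files_from_text(diff_text, omit_ab_prefix=True))
```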

Determine total size of given filepath (in bytes).

easybuild.tools.filetools.diff_files(path1, path2)

Return unified diff between two files

easybuild.tools.filetools.dir_contains_files(path, recursive=True)

Return True if the given directory contains any file

  • recursive – if False, only the path itself is considered; otherwise all subdirectories are also searched
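
The difference between the recursive and non-recursive check can be sketched with the standard library. The function name is hypothetical, and treating every non-directory entry as a file is an assumption of this example.

```python
import os


def dir_contains_files_sketch(path, recursive=True):
    """Return True if path (and, if recursive, any subdirectory) holds a file."""
    if recursive:
        # os.walk yields (dirpath, dirnames, filenames) for every directory in the tree
        return any(filenames for _dirpath, _dirnames, filenames in os.walk(path))
    # non-recursive: only look at direct entries of path
    return any(os.path.isfile(os.path.join(path, entry)) for entry in os.listdir(path))
```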

easybuild.tools.filetools.download_file(filename, url, path, forced=False)

Download a file from the given URL, to the specified path.

easybuild.tools.filetools.dump_index(path, max_age_sec=None)

Create index for files in specified path, and dump it to file (alphabetically sorted).


Return encoded version of class name.

This encoding function handles funky software names ad infinitum, for example: ‘0_foo+0x0x#-$__’ becomes ‘0_underscore_foo_plus_0x0x_hash__minus__dollar__underscore__underscore_’

The intention is to have a robust escaping mechanism for names like c++, C# et al.

It has been inspired by the concepts seen at the following locations, but in lowercase style:

  • http://fossies.org/dox/netcdf-
  • http://celldesigner.org/help/CDH_Species_01.html
  • http://research.cs.berkeley.edu/project/sbp/darcsrepo-no-longer-updated/src/edu/berkeley/sbp/misc/ReflectiveWalker.java

It can be extended freely as per ISO/IEC 10646:2012 / Unicode 6.1 names:

  • http://www.unicode.org/versions/Unicode6.1.0/

For readability of >2 words, it is suggested to use _CamelCase_ style. So, yes, ‘_GreekSmallLetterEtaWithPsiliAndOxia_’ could indeed be a fully valid software name; software “electron” in the original spelling anyone? ;-)
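
The escaping scheme can be sketched with a small character table. The table below is an assumption inferred only from the example ‘0_foo+0x0x#-$__’; the real table presumably covers more characters, and the function names are hypothetical.

```python
import re

# Partial, assumed mapping: only the characters that occur in the documented example
CHAR_NAMES = {
    '_': '_underscore_',
    '+': '_plus_',
    '#': '_hash_',
    '-': '_minus_',
    '$': '_dollar_',
}
NAME_CHARS = {encoded: char for char, encoded in CHAR_NAMES.items()}
ENCODED_REGEX = re.compile('|'.join(re.escape(encoded) for encoded in NAME_CHARS))


def encode_name(name):
    """Replace each special character by its spelled-out, underscore-delimited name."""
    return ''.join(CHAR_NAMES.get(char, char) for char in name)


def decode_name(encoded):
    """Revert encode_name: every underscore in the input delimits a spelled-out name."""
    return ENCODED_REGEX.sub(lambda match: NAME_CHARS[match.group(0)], encoded)


print(encode_name('c++'))
```

Because literal underscores are themselves escaped, every underscore left in the encoded string is a delimiter, which is what makes decoding unambiguous.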


Expand specified glob paths to a list of unique non-glob paths to only files.

easybuild.tools.filetools.extract_cmd(filepath, overwrite=False)

Determines the file type of file at filepath, returns extract cmd based on file suffix

easybuild.tools.filetools.extract_file(fn, dest, cmd=None, extra_options=None, overwrite=False, forced=False, change_into_dir=None)

Extract file at given path to specified directory

  • fn – path to file to extract
  • dest – location to extract to
  • cmd – extract command to use (derived from filename if not specified)
  • extra_options – extra options to pass to extract command
  • overwrite – overwrite existing unpacked file
  • forced – force extraction in (extended) dry run mode
  • change_into_dir – change into resulting directory; None (current default) implies True, but this is deprecated: this named argument should be set to False or True explicitly (in a future major release, the default will be changed to False)

Returns: path to directory (in case of success)

Returns a non-existing file to be used as destination for backup files


Try to locate a possible new base directory. This is typically a single subdir, e.g. from untarring a tarball. When extracting multiple tarballs in the same directory, expect only the first one to give the correct path.
easybuild.tools.filetools.find_easyconfigs(path, ignore_dirs=None)

Find .eb easyconfig files in path


Find EasyBuild script with given name (in easybuild/scripts subdirectory).


Find best match for filename extension.

easybuild.tools.filetools.find_flexlm_license(custom_env_vars=None, lic_specs=None)

Find FlexLM license.

Considers the specified list of environment variables; checks for path to existing license file or valid license server specification; duplicate paths are not retained in the returned list of license specs.

If no license is found through environment variables, also consider ‘lic_specs’.

  • custom_env_vars – list of environment variables to consider (if None, only consider $LM_LICENSE_FILE)
  • lic_specs – list of license specifications

Returns: tuple with list of valid license specs found and name of first valid environment variable

easybuild.tools.filetools.find_glob_pattern(glob_pattern, fail_on_no_match=True)

Find unique file/dir matching glob_pattern (raises error if more than one match is found)


Make sure file is an easyblock and get easyblock class name

easybuild.tools.filetools.get_source_tarball_from_git(filename, targetdir, git_config)

Downloads a git repository, at a specific tag or commit, recursively or not, and makes an archive with it

  • filename – name of the archive to save the code to (must be .tar.gz)
  • targetdir – target directory where to save the archive to
  • git_config – dictionary containing url, repo_name, recursive, and one of tag or commit
easybuild.tools.filetools.guess_patch_level(patched_files, parent_dir)

Guess patch level based on list of patched files and specified directory.

Check the given directory for recursive symlinks.

That is, symlinks to folders inside the path, which would cause infinite loops when traversed regularly.

Parameters: path – Path to directory to check

Put fake ‘vsc’ Python package in place, to catch easyblocks/scripts that still import from vsc.* namespace (vsc-base & vsc-install were ingested into the EasyBuild framework for EasyBuild 4.0,


Determine whether specified URL is already an alternate PyPI URL, i.e. whether it contains a hash.


Check whether given bytestring represents the contents of a binary file or not.


Return whether specified easyblock name is a generic easyblock or not.


Determine whether file at specified path is a patch file (based on +++ and --- lines being present).


Return whether file at specified location exists and is readable.


Check whether provided string is a SHA256 checksum.
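
Such a check can be sketched as a pattern match on the string. This is a minimal sketch assuming a SHA256 checksum is written as exactly 64 lowercase hexadecimal characters; the function name is hypothetical.

```python
import re

# 256 bits = 64 hexadecimal digits
SHA256_REGEX = re.compile(r'[0-9a-f]{64}')


def looks_like_sha256(value):
    """Return True if value is a string of exactly 64 lowercase hex characters."""
    return isinstance(value, str) and SHA256_REGEX.fullmatch(value) is not None
```

Note that this only checks the shape of the string; an MD5 checksum (32 hex characters), for instance, is rejected on length alone.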

easybuild.tools.filetools.load_index(path, ignore_dirs=None)

Load index for specified path, and return contents (or None if no index exists).

easybuild.tools.filetools.locate_files(files, paths, ignore_subdirs=None)

Determine full path for list of files, in given list of paths (directories).

easybuild.tools.filetools.mkdir(path, parents=False, set_gid=None, sticky=None)

Create a directory; path is the directory to create

  • parents – create parent directories if needed (mkdir -p)
  • set_gid – set group ID bit, to make subdirectories and files inherit group
  • sticky – set the sticky bit on this directory (a.k.a. the restricted deletion flag), to prevent users from removing/renaming files in this directory
easybuild.tools.filetools.modify_env(old, new)

NO LONGER SUPPORTED: use modify_env from easybuild.tools.environment instead

easybuild.tools.filetools.move_file(path, target_path, force_in_dry_run=False)

Move a file from path to target_path

  • path – the original filepath
  • target_path – path to move the file to
  • force_in_dry_run – force running the command during dry run
easybuild.tools.filetools.move_logs(src_logfile, target_logfile)

Move log file(s).


Normalize path removing empty and dot components.

Similar to os.path.normpath but does not resolve ‘..’ which may return a wrong path when symlinks are used
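
The difference from os.path.normpath can be sketched as follows. The function name is hypothetical, only '/' is treated as a separator, and returning '.' for an empty relative result is an assumption of this example.

```python
def normalize_path_sketch(path):
    """Drop empty and '.' components, but leave '..' components untouched."""
    is_absolute = path.startswith('/')
    # empty components come from duplicate or trailing slashes
    parts = [part for part in path.split('/') if part and part != '.']
    normalized = '/'.join(parts)
    if is_absolute:
        normalized = '/' + normalized
    return normalized or '.'


print(normalize_path_sketch('/foo//./bar/../baz'))
```

With os.path.normpath, the same input collapses to '/foo/baz', which points somewhere else entirely if '/foo/bar' is a symlink.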

easybuild.tools.filetools.open_file(path, mode)

Open a (usually) text file. If mode is not binary, then utf-8 encoding will be used for Python 3.x

easybuild.tools.filetools.parse_http_header_fields_urlpat(arg, urlpat=None, header=None, urlpat_headers_collection=None, maxdepth=3)

Recurse into multi-line string “[URLPAT::][HEADER:]FILE|FIELD” where FILE may be another such string or file containing lines matching the same format, such as “^https://www.example.com::/path/to/headers.txt”, and flatten the result to dict e.g. {‘^https://www.example.com’: [‘Authorization: Basic token’, ‘User-Agent: Special Agent’]}

easybuild.tools.filetools.parse_log_for_error(txt, regExp=None, stdout=True, msg=None)

NO LONGER SUPPORTED: use parse_log_for_error from easybuild.tools.run instead

easybuild.tools.filetools.path_matches(path, paths)

Check whether given path matches any of the provided paths.


Fetch list of source URLs (incl. source filename) for specified Python package from PyPI, using ‘simple’ PyPI API.

easybuild.tools.filetools.read_file(path, log_error=True, mode='r')

Read contents of file at given path, in a robust way.


Register signal handler for signals that cancel the current EasyBuild session, so we can clean up the locks that were created first.


Remove single file/directory or list of files and directories

Parameters: paths – path(s) to remove

Remove directory at specified path.


Remove file at specified path.


Remove lock with specified name.


Return fully resolved path for given path.

Parameters: path – path that (maybe) contains symlinks
easybuild.tools.filetools.rmtree2(path, n=3)

Wrapper around shutil.rmtree to make it more robust when used on NFS mounted file systems.

easybuild.tools.filetools.run_cmd(cmd, log_ok=True, log_all=False, simple=False, inp=None, regexp=True, log_output=False, path=None)

NO LONGER SUPPORTED: use run_cmd from easybuild.tools.run instead

easybuild.tools.filetools.run_cmd_qa(cmd, qa, no_qa=None, log_ok=True, log_all=False, simple=False, regexp=True, std_qa=None, path=None)

NO LONGER SUPPORTED: use run_cmd_qa from easybuild.tools.run instead

easybuild.tools.filetools.search_file(paths, query, short=False, ignore_dirs=None, silent=False, filename_only=False, terse=False, case_sensitive=False)

Search for files in specified paths using specified search query (regular expression)

  • paths – list of paths to search in
  • query – search query to use (regular expression); matched case-insensitively by default
  • short – figure out common prefix of hits, use variable to factor it out
  • ignore_dirs – list of directories to ignore (default: [‘.git’, ‘.svn’])
  • silent – whether or not to remain silent (don’t print anything)
  • filename_only – only return filenames, not file paths
  • terse – stick to terse (machine-readable) output, as opposed to pretty-printing
easybuild.tools.filetools.set_gid_sticky_bits(path, set_gid=None, sticky=None, recursive=False)

Set GID/sticky bits on specified path.

Create a symlink at the specified path to the given path.

  • source_path – source file path
  • symlink_path – symlink file path
  • use_abspath_source – resolves the absolute path of source_path
easybuild.tools.filetools.verify_checksum(path, checksums)

Verify checksum of specified file.

  • path – path of file to verify checksum of
  • checksums – checksum value (and type, optionally, default is MD5), e.g., ‘af314’, (‘sha’, ‘5ec1b’)
easybuild.tools.filetools.weld_paths(path1, path2)

Weld two paths together, taking into account overlap between tail of 1st path with head of 2nd path.
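
Welding can be sketched by looking for the longest tail of the first path that equals a head of the second. The function name is hypothetical, only '/' is treated as a separator, and edge cases may differ from the real function.

```python
def weld_paths_sketch(path1, path2):
    """Join two paths, merging the overlap between the end of path1 and the start of path2."""
    parts1 = [part for part in path1.split('/') if part]
    parts2 = [part for part in path2.split('/') if part]

    # try the largest possible overlap first, then shrink
    overlap = 0
    for size in range(min(len(parts1), len(parts2)), 0, -1):
        if parts1[-size:] == parts2[:size]:
            overlap = size
            break

    welded = '/'.join(parts1 + parts2[overlap:])
    if path1.startswith('/'):
        welded = '/' + welded
    return welded


print(weld_paths_sketch('/foo/bar/baz', 'bar/baz/qux'))
```

If the two paths share no components at the seam, the result is a plain join of both.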

easybuild.tools.filetools.which(cmd, retain_all=False, check_perms=True, log_ok=True, log_error=None, on_error=None)

Return (first) path in $PATH for specified command, or None if command is not found

  • retain_all – returns all locations to the specified command in $PATH, not just the first one
  • check_perms – check whether candidate path has read/exec permissions before accepting it as a match
  • log_ok – Log an info message where the command has been found (if any)
  • on_error – What to do if the command was not found, default: WARN. Possible values: IGNORE, WARN, ERROR
easybuild.tools.filetools.write_file(path, data, append=False, forced=False, backup=False, always_overwrite=True, verbose=False, show_progress=False, size=None)

Write given contents to file at given path; overwrites current file contents without backup by default!

  • path – location of file
  • data – contents to write to file. Can be a file-like object of binary data
  • append – append to existing file rather than overwrite
  • forced – force actually writing file in (extended) dry run mode
  • backup – back up existing file before overwriting or modifying it
  • always_overwrite – don’t require --force to overwrite an existing file
  • verbose – be verbose, i.e. inform where backup file was created
  • show_progress – show progress bar while writing file
  • size – size (in bytes) of data to write (used for progress bar)