easybuild.tools.filetools module

Set of file tools.

author: Stijn De Weirdt (Ghent University)
author: Dries Verdegem (Ghent University)
author: Kenneth Hoste (Ghent University)
author: Pieter De Baets (Ghent University)
author: Jens Timmerman (Ghent University)
author: Toon Willems (Ghent University)
author: Ward Poelmans (Ghent University)
author: Fotis Georgatos (Uni.Lu, NTUA)
author: Sotiris Fragkiskos (NTUA, CERN)
author: Davide Vanzo (ACCRE, Vanderbilt University)
author: Damian Alvarez (Forschungszentrum Juelich GmbH)
author: Maxime Boissonneault (Compute Canada)
class easybuild.tools.filetools.ZlibChecksum(algorithm)

Bases: object

wrapper class for adler32 and crc32 checksums to match the interface of the hashlib module


Return hex string of the checksum


Calculates a new checksum using the old one and the new data
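To illustrate the idea of giving zlib checksums a hashlib-like interface, here is a minimal sketch. The class name `ZlibChecksumSketch` is hypothetical, and the method names `update` and `hexdigest` are assumed from the hashlib interface; the exact output format of the real class is not stated here.

```python
import zlib


class ZlibChecksumSketch(object):
    """Hypothetical stand-in: wraps zlib.adler32 or zlib.crc32 behind a hashlib-like interface."""

    def __init__(self, algorithm):
        # algorithm is a function such as zlib.adler32 or zlib.crc32
        self.algorithm = algorithm
        # start from the checksum of empty data
        self.checksum = algorithm(b'')

    def update(self, data):
        """Calculate a new checksum using the old one and the new data."""
        self.checksum = self.algorithm(data, self.checksum)

    def hexdigest(self):
        """Return the checksum as a hex string (format chosen for this sketch)."""
        return '0x%x' % (self.checksum & 0xffffffff)


# feeding the data in two pieces gives the same result as feeding it at once
running = ZlibChecksumSketch(zlib.crc32)
running.update(b'hello ')
running.update(b'world')
```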

easybuild.tools.filetools.adjust_permissions(name, permissionBits, add=True, onlyfiles=False, onlydirs=False, recursive=True, group_id=None, relative=True, ignore_errors=False, skip_symlinks=None)

Add or remove (if add is False) permissionBits from all files (if onlydirs is False) and directories (if onlyfiles is False) in path

easybuild.tools.filetools.apply_patch(patch_file, dest, fn=None, copy=False, level=None, use_git_am=False)

Apply a patch to source code in directory dest - assume unified diff created with “diff -ru old new”

easybuild.tools.filetools.apply_regex_substitutions(path, regex_subs, backup='.orig.eb')

Apply specified list of regex substitutions.

  • path – path to file to patch
  • regex_subs – list of substitutions to apply, specified as (<regexp pattern>, <replacement string>)
  • backup – create backup of original file with specified suffix (no backup if value evaluates to False)
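The parameters above can be sketched with the standard library alone. The helper name `apply_regex_subs_sketch` is hypothetical, and the use of `re.M` is a choice made for this sketch rather than documented behaviour of the real function.

```python
import os
import re
import shutil
import tempfile


def apply_regex_subs_sketch(path, regex_subs, backup='.orig.eb'):
    """Apply (pattern, replacement) pairs to the file at path, in order."""
    if backup:
        # keep a copy of the original file next to it, using the given suffix
        shutil.copy2(path, path + backup)

    with open(path) as handle:
        txt = handle.read()

    for pattern, replacement in regex_subs:
        # re.M lets ^ and $ match at line boundaries (assumption for this sketch)
        txt = re.sub(pattern, replacement, txt, flags=re.M)

    with open(path, 'w') as handle:
        handle.write(txt)


# small demonstration on a throwaway file
demo_path = os.path.join(tempfile.mkdtemp(), 'Makefile')
with open(demo_path, 'w') as handle:
    handle.write('CC = gcc\nCFLAGS = -O0\n')

apply_regex_subs_sketch(demo_path, [(r'^CC\s*=.*$', 'CC = icc'), (r'-O0', '-O2')])

with open(demo_path) as handle:
    patched = handle.read()
```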
easybuild.tools.filetools.back_up_file(src_file, backup_extension='bak', hidden=False, strip_fn=None)

Backs up a file appending a backup extension and timestamp to it (if there is already an existing backup).

  • src_file – file to be backed up
  • backup_extension – extension to use for the backup file (can be empty or None)
  • hidden – make backup hidden (leading dot in filename)
  • strip_fn – strip specified trailing substring from filename of backup

Returns: location of backed up file

easybuild.tools.filetools.calc_block_checksum(path, algorithm)

Calculate a checksum of a file by reading it in blocks
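Reading in blocks keeps memory use constant regardless of file size. A minimal sketch, assuming `algorithm` is any object with hashlib-style `update` and `hexdigest` methods; the helper name and the default block size are assumptions of this example.

```python
import hashlib
import os
import tempfile


def block_checksum_sketch(path, algorithm, blocksize=16 * 1024 * 1024):
    """Feed the file at path to a hashlib-style object, one block at a time."""
    with open(path, 'rb') as handle:
        # iter() with a sentinel stops at the first empty read (end of file)
        for block in iter(lambda: handle.read(blocksize), b''):
            algorithm.update(block)
    return algorithm.hexdigest()


# demonstration: a small block size forces several reads
fd, demo_file = tempfile.mkstemp()
with os.fdopen(fd, 'wb') as handle:
    handle.write(b'x' * 100000)

blockwise = block_checksum_sketch(demo_file, hashlib.sha256(), blocksize=4096)
os.remove(demo_file)
```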


Change to directory at specified location.

Parameters: path – location to change to
Returns: previous location we were in
easybuild.tools.filetools.cleanup(logfile, tempdir, testing, silent=False)

Cleanup the specified log file and the tmp directory, if desired.

  • logfile – path to log file to clean up
  • tempdir – path to temporary directory to clean up
  • testing – are we in testing mode? if so, don’t actually clean up anything
  • silent – be silent (don’t print anything to stdout)
easybuild.tools.filetools.compute_checksum(path, checksum_type='md5')

Compute checksum of specified file.

  • path – Path of file to compute checksum for
  • checksum_type – type(s) of checksum (‘adler32’, ‘crc32’, ‘md5’ (default), ‘sha1’, ‘sha256’, ‘sha512’, ‘size’)
easybuild.tools.filetools.convert_name(name, upper=False)

Converts name so it can be used as variable name

easybuild.tools.filetools.copy(paths, target_path, force_in_dry_run=False)

Copy single file/directory or list of files and directories to specified location

  • paths – path(s) to copy
  • target_path – target location
  • force_in_dry_run – force running the command during dry run
easybuild.tools.filetools.copy_dir(path, target_path, force_in_dry_run=False, **kwargs)

Copy a directory from specified location to specified location

  • path – the original directory path
  • target_path – path to copy the directory to
  • force_in_dry_run – force running the command during dry run

Additional specified named arguments are passed down to shutil.copytree

easybuild.tools.filetools.copy_file(path, target_path, force_in_dry_run=False)

Copy a file from specified location to specified location

  • path – the original filepath
  • target_path – path to copy the file to
  • force_in_dry_run – force running the command during dry run
easybuild.tools.filetools.copytree(src, dst, symlinks=False, ignore=None)

Copied from Lib/shutil.py in Python 2.7, since we need this to work for Python 2.4 as well and this code can be improved…

Recursively copy a directory tree using copy2().

The destination directory must not already exist. If exception(s) occur, an Error is raised with a list of reasons.

If the optional symlinks flag is true, symbolic links in the source tree result in symbolic links in the destination tree; if it is false, the contents of the files pointed to by symbolic links are copied.

The optional ignore argument is a callable. If given, it is called with the src parameter, which is the directory being visited by copytree(), and names which is the list of src contents, as returned by os.listdir():

callable(src, names) -> ignored_names

Since copytree() is called recursively, the callable will be called once for each directory that is copied. It returns a list of names relative to the src directory that should not be copied.

XXX Consider this example code rather than the ultimate tool.
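The ignore protocol described above can be tried out with `shutil.copytree` from the standard library, which this function was copied from; the sketch assumes both behave the same way for the `ignore` argument. The callable `ignore_object_files` and the file names are made up for the example.

```python
import os
import shutil
import tempfile


def ignore_object_files(src, names):
    """Called once per visited directory; returns the names to skip in that directory."""
    return [name for name in names if name.endswith('.o')]


# build a small source tree with a subdirectory
base = tempfile.mkdtemp()
src_dir = os.path.join(base, 'src')
os.makedirs(os.path.join(src_dir, 'sub'))
for rel in ('main.c', 'main.o', os.path.join('sub', 'util.c'), os.path.join('sub', 'util.o')):
    with open(os.path.join(src_dir, rel), 'w') as handle:
        handle.write('')

# the destination directory must not exist yet
dst_dir = os.path.join(base, 'dst')
shutil.copytree(src_dir, dst_dir, symlinks=False, ignore=ignore_object_files)

# list what was copied, relative to the destination
copied = sorted(
    os.path.relpath(os.path.join(root, name), dst_dir)
    for root, _, files in os.walk(dst_dir)
    for name in files
)
```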


Return decoded version of class name.


Decoding function to revert result of encode_string.


Derive alternate PyPI URL for given URL.


Determine common path prefix for a given list of paths.

easybuild.tools.filetools.det_patched_files(path=None, txt=None, omit_ab_prefix=False, github=False, filter_deleted=False)

Determine list of patched files from a patch. It searches for “+++ path/to/patched/file” lines to determine the patched files. Note: does not correctly handle filepaths with spaces.

  • path – the path to the diff
  • txt – the contents of the diff (either path or txt should be given)
  • omit_ab_prefix – ignore the a/ or b/ prefix of the files
  • github – only consider lines that start with ‘diff --git’ to determine list of patched files
  • filter_deleted – filter out all files that were deleted by the patch
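The “+++ path/to/patched/file” search can be sketched with a regular expression. This is an illustration only: the helper name and the regex are assumptions, and the `github` and `filter_deleted` options are not covered.

```python
import re


def patched_files_sketch(txt, omit_ab_prefix=False):
    """Collect paths from '+++ path' lines of a unified diff (paths with spaces not handled)."""
    patched_regex = re.compile(r'^\+{3}\s+(?P<file>\S+)', re.M)
    files = []
    for match in patched_regex.finditer(txt):
        path = match.group('file')
        # optionally drop the a/ or b/ prefix
        if omit_ab_prefix and path[:2] in ('a/', 'b/'):
            path = path[2:]
        files.append(path)
    return files


demo_diff = """--- a/src/main.c
+++ b/src/main.c
@@ -1 +1 @@
-int x;
+int y;
--- a/README
+++ b/README
@@ -1 +1 @@
-old
+new
"""

with_prefix = patched_files_sketch(demo_diff)
without_prefix = patched_files_sketch(demo_diff, omit_ab_prefix=True)
```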

Determine total size of given filepath (in bytes).

easybuild.tools.filetools.diff_files(path1, path2)

Return unified diff between two files

easybuild.tools.filetools.download_file(filename, url, path, forced=False)

Download a file from the given URL, to the specified path.


return encoded version of class name

This encoding function handles funky software names ad infinitum, like:
example: ‘0_foo+0x0x#-$__’ becomes: ‘0_underscore_foo_plus_0x0x_hash__minus__dollar__underscore__underscore_’

The intention is to have a robust escaping mechanism for names like c++, C# et al

It has been inspired by the concepts seen at the following locations, but in lowercase style:

  • http://fossies.org/dox/netcdf-
  • http://celldesigner.org/help/CDH_Species_01.html
  • http://research.cs.berkeley.edu/project/sbp/darcsrepo-no-longer-updated/src/edu/berkeley/sbp/misc/ReflectiveWalker.java

and can be extended freely as per ISO/IEC 10646:2012 / Unicode 6.1 names:

  • http://www.unicode.org/versions/Unicode6.1.0/

For readability of >2 words, it is suggested to use _CamelCase_ style. So, yes, ‘_GreekSmallLetterEtaWithPsiliAndOxia_’ could indeed be a fully valid software name; software “electron” in the original spelling anyone? ;-)
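The escaping scheme can be illustrated with a small sketch that reproduces the example given above. The mapping `CHAR_NAMES` is hypothetical and covers only the characters appearing in that example; the real mapping may be larger and may differ.

```python
# Hypothetical mapping, limited to the characters in the documented example.
CHAR_NAMES = {
    '_': 'underscore',
    '+': 'plus',
    '#': 'hash',
    '-': 'minus',
    '$': 'dollar',
}


def encode_name_sketch(name):
    """Replace each special character by its name, wrapped in underscores."""
    return ''.join('_%s_' % CHAR_NAMES[char] if char in CHAR_NAMES else char for char in name)


encoded = encode_name_sketch('0_foo+0x0x#-$__')
```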


Expand specified glob paths to a list of unique non-glob paths to only files.

easybuild.tools.filetools.extract_cmd(filepath, overwrite=False)

Determines the file type of file at filepath, returns extract cmd based on file suffix

easybuild.tools.filetools.extract_file(fn, dest, cmd=None, extra_options=None, overwrite=False, forced=False)

Extract file at given path to specified directory

  • fn – path to file to extract
  • dest – location to extract to
  • cmd – extract command to use (derived from filename if not specified)
  • extra_options – extra options to pass to extract command
  • overwrite – overwrite existing unpacked file
  • forced – force extraction in (extended) dry run mode

Returns: path to directory (in case of success)


Returns a non-existing file to be used as destination for backup files


Try to locate a possible new base directory. This is typically a single subdir, e.g. from untarring a tarball. When extracting multiple tarballs in the same directory, expect only the first one to give the correct path.
easybuild.tools.filetools.find_easyconfigs(path, ignore_dirs=None)

Find .eb easyconfig files in path


Find EasyBuild script with given name (in easybuild/scripts subdirectory).


Find best match for filename extension.

easybuild.tools.filetools.find_flexlm_license(custom_env_vars=None, lic_specs=None)

Find FlexLM license.

Considers the specified list of environment variables; checks for path to existing license file or valid license server specification; duplicate paths are not retained in the returned list of license specs.

If no license is found through environment variables, also consider ‘lic_specs’.

  • custom_env_vars – list of environment variables to consider (if None, only consider $LM_LICENSE_FILE)
  • lic_specs – list of license specifications

Returns: tuple with list of valid license specs found and name of first valid environment variable

easybuild.tools.filetools.get_source_tarball_from_git(filename, targetdir, git_config)

Downloads a git repository, at a specific tag or commit, recursively or not, and makes an archive with it

  • filename – name of the archive to save the code to (must be .tar.gz)
  • targetdir – target directory where to save the archive to
  • git_config – dictionary containing url, repo_name, recursive, and one of tag or commit
easybuild.tools.filetools.guess_patch_level(patched_files, parent_dir)

Guess patch level based on list of patched files and specified directory.


Determine whether specified URL is already an alternate PyPI URL, i.e. whether it contains a hash.


Determine whether file at specified path is a patch file (based on +++ and --- lines being present).


Return whether file at specified location exists and is readable.


Check whether provided string is a SHA256 checksum.
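Such a check can be sketched as a pattern match for 64 hexadecimal characters. The helper name is hypothetical, and accepting only lowercase digits is an assumption of this sketch.

```python
import hashlib
import re

# a SHA256 digest in hex notation is exactly 64 hexadecimal characters
SHA256_REGEX = re.compile(r'[0-9a-f]{64}')


def looks_like_sha256(value):
    """Return True if value is a string of exactly 64 lowercase hex characters."""
    return isinstance(value, str) and SHA256_REGEX.fullmatch(value) is not None


sha256_value = hashlib.sha256(b'example').hexdigest()
md5_value = hashlib.md5(b'example').hexdigest()
```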

easybuild.tools.filetools.mkdir(path, parents=False, set_gid=None, sticky=None)

Create a directory; path is the location of the directory to create

  • parents – create parent directories if needed (mkdir -p)
  • set_gid – set group ID bit, to make subdirectories and files inherit group
  • sticky – set the sticky bit on this directory (a.k.a. the restricted deletion flag), to prevent users from removing/renaming files in this directory
easybuild.tools.filetools.modify_env(old, new)

NO LONGER SUPPORTED: use modify_env from easybuild.tools.environment instead

easybuild.tools.filetools.move_file(path, target_path, force_in_dry_run=False)

Move a file from path to target_path

  • path – the original filepath
  • target_path – path to move the file to
  • force_in_dry_run – force running the command during dry run
easybuild.tools.filetools.move_logs(src_logfile, target_logfile)

Move log file(s).

easybuild.tools.filetools.parse_log_for_error(txt, regExp=None, stdout=True, msg=None)

NO LONGER SUPPORTED: use parse_log_for_error from easybuild.tools.run instead

easybuild.tools.filetools.path_matches(path, paths)

Check whether given path matches any of the provided paths.


Fetch list of source URLs (incl. source filename) for specified Python package from PyPI, using ‘simple’ PyPI API.

easybuild.tools.filetools.read_file(path, log_error=True)

Read contents of file at given path, in a robust way.


Remove single file/directory or list of files and directories

Parameters: paths – path(s) to remove

Remove directory at specified path.


Remove file at specified path.


Return fully resolved path for given path.

Parameters: path – path that (maybe) contains symlinks
easybuild.tools.filetools.rmtree2(path, n=3)

Wrapper around shutil.rmtree to make it more robust when used on NFS mounted file systems.
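One way to make removal more robust is to retry a few times with a short pause. The sketch below assumes that `n` is the number of attempts; the helper name, the `wait` parameter and the boolean return value are assumptions of this example, and the real function may report failure differently.

```python
import os
import shutil
import tempfile
import time


def rmtree_retry_sketch(path, n=3, wait=0.1):
    """Try shutil.rmtree up to n times, pausing between attempts."""
    for _ in range(n):
        try:
            shutil.rmtree(path)
            return True
        except OSError:
            # e.g. a transient failure on a network file system; wait and try again
            time.sleep(wait)
    return False


demo_dir = tempfile.mkdtemp()
removed = rmtree_retry_sketch(demo_dir)
# a second call fails on every attempt, since the directory is gone
removed_again = rmtree_retry_sketch(demo_dir, n=2, wait=0)
```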

easybuild.tools.filetools.run_cmd(cmd, log_ok=True, log_all=False, simple=False, inp=None, regexp=True, log_output=False, path=None)

NO LONGER SUPPORTED: use run_cmd from easybuild.tools.run instead

easybuild.tools.filetools.run_cmd_qa(cmd, qa, no_qa=None, log_ok=True, log_all=False, simple=False, regexp=True, std_qa=None, path=None)

NO LONGER SUPPORTED: use run_cmd_qa from easybuild.tools.run instead

easybuild.tools.filetools.search_file(paths, query, short=False, ignore_dirs=None, silent=False, filename_only=False, terse=False)

Search for files in specified paths using specified search query (regular expression)

  • paths – list of paths to search in
  • query – search query to use (regular expression); will be used case-insensitive
  • short – figure out common prefix of hits, use variable to factor it out
  • ignore_dirs – list of directories to ignore (default: [‘.git’, ‘.svn’])
  • silent – whether or not to remain silent (don’t print anything)
  • filename_only – only return filenames, not file paths
  • terse – stick to terse (machine-readable) output, as opposed to pretty-printing

Create a symlink at the specified path to the given path.

  • source_path – source file path
  • symlink_path – symlink file path
  • use_abspath_source – resolves the absolute path of source_path
easybuild.tools.filetools.verify_checksum(path, checksums)

Verify checksum of specified file.

  • path – path of file to verify checksum of
  • checksums – checksum value (and type, optionally, default is MD5), e.g., ‘af314’, (‘sha’, ‘5ec1b’)
easybuild.tools.filetools.weld_paths(path1, path2)

Weld two paths together, taking into account overlap between tail of 1st path with head of 2nd path.
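A minimal sketch of welding, assuming POSIX-style paths and that the overlap is compared per path component; the helper name is hypothetical and the handling of edge cases in the real function may differ.

```python
def weld_paths_sketch(path1, path2):
    """Join two paths, dropping the head of path2 where it repeats the tail of path1."""
    parts1 = [part for part in path1.split('/') if part]
    parts2 = [part for part in path2.split('/') if part]

    # look for the longest overlap first
    for size in range(min(len(parts1), len(parts2)), 0, -1):
        if parts1[-size:] == parts2[:size]:
            parts2 = parts2[size:]
            break

    return '/'.join([path1.rstrip('/')] + parts2)


welded = weld_paths_sketch('/foo/bar', 'bar/baz')
no_overlap = weld_paths_sketch('/a/b', 'c/d')
```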

easybuild.tools.filetools.which(cmd, retain_all=False, check_perms=True)

Return (first) path in $PATH for specified command, or None if command is not found

  • retain_all – returns all locations to the specified command in $PATH, not just the first one
  • check_perms – check whether candidate path has read/exec permissions before accepting it as a match
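The lookup described above can be sketched as a walk over the entries of $PATH. The helper name and the extra `path_env` parameter (used here to avoid depending on the real environment) are assumptions of this example.

```python
import os
import tempfile


def which_sketch(cmd, retain_all=False, check_perms=True, path_env=None):
    """Return first match for cmd in the search path, all matches if retain_all, else None."""
    if path_env is None:
        path_env = os.environ.get('PATH', '')

    hits = []
    for directory in path_env.split(os.pathsep):
        candidate = os.path.join(directory, cmd)
        if not os.path.isfile(candidate):
            continue
        # optionally require read and execute permissions
        if check_perms and not os.access(candidate, os.R_OK | os.X_OK):
            continue
        if not retain_all:
            return candidate
        hits.append(candidate)

    return hits if retain_all else None


# demonstration: the first directory holds a non-executable file, the second an executable one
demo_root = tempfile.mkdtemp()
plain_dir = os.path.join(demo_root, 'plain')
exec_dir = os.path.join(demo_root, 'exec')
for directory, mode in ((plain_dir, 0o644), (exec_dir, 0o755)):
    os.mkdir(directory)
    tool = os.path.join(directory, 'demo-tool')
    with open(tool, 'w') as handle:
        handle.write('#!/bin/sh\n')
    os.chmod(tool, mode)

demo_search_path = os.pathsep.join([plain_dir, exec_dir])
first_hit = which_sketch('demo-tool', path_env=demo_search_path)
all_files = which_sketch('demo-tool', retain_all=True, check_perms=False, path_env=demo_search_path)
```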
easybuild.tools.filetools.write_file(path, txt, append=False, forced=False, backup=False, always_overwrite=True, verbose=False)

Write given contents to file at given path; overwrites current file contents without backup by default!

  • path – location of file
  • txt – contents to write to file
  • append – append to existing file rather than overwrite
  • forced – force actually writing file in (extended) dry run mode
  • backup – back up existing file before overwriting or modifying it
  • always_overwrite – don’t require --force to overwrite an existing file
  • verbose – be verbose, i.e. inform where backup file was created