filetools
Set of file tools.
Authors:
- Stijn De Weirdt (Ghent University)
- Dries Verdegem (Ghent University)
- Kenneth Hoste (Ghent University)
- Pieter De Baets (Ghent University)
- Jens Timmerman (Ghent University)
- Toon Willems (Ghent University)
- Ward Poelmans (Ghent University)
- Fotis Georgatos (Uni.Lu, NTUA)
- Sotiris Fragkiskos (NTUA, CERN)
- Davide Vanzo (ACCRE, Vanderbilt University)
- Damian Alvarez (Forschungszentrum Juelich GmbH)
- Maxime Boissonneault (Compute Canada)
ZlibChecksum
¶
adjust_permissions(provided_path, permission_bits, add=True, onlyfiles=False, onlydirs=False, recursive=True, group_id=None, relative=True, ignore_errors=False, skip_symlinks=None)
¶
Change permissions for specified path, using specified permission bits.
Add or remove (if add is False) permission_bits from all files (if onlydirs is False) and directories (if onlyfiles is False) in path.
PARAMETER | DESCRIPTION |
---|---|
add | add permissions relative to current permissions (only relevant if 'relative' is set to True). DEFAULT: True |
onlyfiles | only change permissions on files (not directories). DEFAULT: False |
onlydirs | only change permissions on directories (not files). DEFAULT: False |
recursive | change permissions recursively (only makes sense if path is a directory). DEFAULT: True |
group_id | also change group ownership to group with this group ID. DEFAULT: None |
relative | add/remove permissions relative to current permissions (if False, hard set specified permissions). DEFAULT: True |
ignore_errors | ignore errors that occur when changing permissions (up to a maximum ratio specified by --max-fail-ratio-adjust-permissions configuration option). DEFAULT: False |
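The interplay between 'relative' and 'add' can be illustrated with plain os.chmod. This is a minimal sketch under stated assumptions, not the framework's implementation: the helper name adjust_mode is hypothetical, and it handles a single path only (no recursion, group ownership or error ratio).

```python
import os
import stat
import tempfile


def adjust_mode(path, permission_bits, add=True, relative=True):
    """Hypothetical helper: apply permission_bits to one path, return the new mode."""
    current = stat.S_IMODE(os.stat(path).st_mode)
    if not relative:
        new = permission_bits             # hard set the specified bits
    elif add:
        new = current | permission_bits   # add bits to the current permissions
    else:
        new = current & ~permission_bits  # remove bits from the current permissions
    os.chmod(path, new)
    return new


with tempfile.NamedTemporaryFile() as tmp:
    os.chmod(tmp.name, 0o600)
    added = adjust_mode(tmp.name, stat.S_IRGRP)               # grant group read
    removed = adjust_mode(tmp.name, stat.S_IRGRP, add=False)  # revoke it again
```

With relative=False the current mode is ignored, which is why 'add' is only relevant when 'relative' is True.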
apply_patch(patch_file, dest, fn=None, copy=False, level=None, use_git_am=False, use_git=False)
¶
Apply a patch to source code in directory dest - assume unified diff created with "diff -ru old new"
Raises EasyBuildError on any error and returns True on success
apply_regex_substitutions(paths, regex_subs, backup='.orig.eb', on_missing_match=None)
¶
Apply specified list of regex substitutions.
PARAMETER | DESCRIPTION |
---|---|
paths | list of paths to files to patch (or just a single filepath) |
regex_subs | list of substitutions to apply, specified as ( |
backup | create backup of original file with specified suffix (no backup if value evaluates to False). DEFAULT: '.orig.eb' |
on_missing_match | what to do when no match was found in the file: 'error' to raise an error, 'warn' to print a warning, or 'ignore' to do nothing; defaults to the value of --strict. DEFAULT: None |
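The general idea (back up the file, then apply each substitution in turn) can be sketched with the standard library. The helper name substitute_in_file is hypothetical, and the sketch assumes each substitution is a (pattern, replacement) pair applied in multi-line mode; it is not the framework's implementation.

```python
import os
import re
import shutil
import tempfile


def substitute_in_file(path, regex_subs, backup='.orig.eb'):
    """Hypothetical helper: apply (pattern, replacement) pairs to a single file."""
    if backup:
        shutil.copy2(path, path + backup)  # keep the original next to the patched file
    with open(path) as handle:
        text = handle.read()
    for pattern, replacement in regex_subs:
        # assumption: patterns are matched per line (re.M)
        text = re.sub(pattern, replacement, text, flags=re.M)
    with open(path, 'w') as handle:
        handle.write(text)
    return text


workdir = tempfile.mkdtemp()
makefile = os.path.join(workdir, 'Makefile')
with open(makefile, 'w') as handle:
    handle.write('CC = gcc\nCFLAGS = -O0\n')
patched = substitute_in_file(makefile, [(r'^CC\s*=.*$', 'CC = icc'), (r'-O0', '-O2')])
```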
back_up_file(src_file, backup_extension='bak', hidden=False, strip_fn=None)
¶
Backs up a file, appending a backup extension and timestamp to it (if there is already an existing backup).
PARAMETER | DESCRIPTION |
---|---|
src_file | file to back up |
backup_extension | extension to use for the backup file (can be empty or None). DEFAULT: 'bak' |
hidden | make backup hidden (leading dot in filename). DEFAULT: False |
strip_fn | strip specified trailing substring from filename of backup. DEFAULT: None |
RETURNS | DESCRIPTION |
---|---|
location of backed up file |
calc_block_checksum(path, algorithm)
¶
Calculate a checksum of a file by reading it in blocks.
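Reading in blocks keeps memory use constant regardless of file size. A minimal sketch, assuming 'algorithm' is a hashlib-style object with update() and hexdigest(); the helper name block_checksum and the block size are illustrative assumptions.

```python
import hashlib
import os
import tempfile


def block_checksum(path, algorithm, blocksize=1024 * 1024):
    """Hypothetical helper: feed a file to a hashlib-style object block by block."""
    with open(path, 'rb') as handle:
        # iter() with a sentinel stops once read() returns an empty bytestring
        for block in iter(lambda: handle.read(blocksize), b''):
            algorithm.update(block)
    return algorithm.hexdigest()


fd, sample = tempfile.mkstemp()
with os.fdopen(fd, 'wb') as handle:
    handle.write(b'x' * 3000)
digest = block_checksum(sample, hashlib.sha256(), blocksize=1024)
os.remove(sample)
```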
change_dir(path)
¶
Change to directory at specified location.
PARAMETER | DESCRIPTION |
---|---|
path | location to change to |
RETURNS | DESCRIPTION |
---|---|
previous location we were in |
check_lock(lock_name)
¶
Check whether a lock with specified name already exists.
If it exists, either wait until it's released, or raise an error (depending on --wait-on-lock configuration option).
clean_up_locks()
¶
Clean up all still existing locks that were created in this session.
clean_up_locks_signal_handler(signum, frame)
¶
Signal handler, cleans up locks & exits with received signal number.
cleanup(logfile, tempdir, testing, silent=False)
¶
Clean up the specified log file and the temporary directory, if desired.
PARAMETER | DESCRIPTION |
---|---|
logfile | path to log file to clean up |
tempdir | path to temporary directory to clean up |
testing | are we in testing mode? if so, don't actually clean up anything |
silent | be silent (don't print anything to stdout). DEFAULT: False |
compute_checksum(path, checksum_type=DEFAULT_CHECKSUM)
¶
Compute checksum of specified file.
PARAMETER | DESCRIPTION |
---|---|
path | path of file to compute checksum for |
checksum_type | type(s) of checksum ('adler32', 'crc32', 'md5' (default), 'sha1', 'sha256', 'sha512', 'size'). DEFAULT: DEFAULT_CHECKSUM |
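The listed checksum types fall into three groups: zlib checksums, hashlib digests, and the file size. A minimal sketch of that dispatch follows; the helper name checksum_sketch and the hex formatting of the zlib values are assumptions, not the framework's output format.

```python
import hashlib
import os
import tempfile
import zlib


def checksum_sketch(path, checksum_type='md5'):
    """Hypothetical helper: compute one checksum of the given type for a file."""
    with open(path, 'rb') as handle:
        data = handle.read()  # fine for a small demo; large files should be read in blocks
    if checksum_type == 'size':
        return len(data)
    if checksum_type in ('adler32', 'crc32'):
        # zlib checksums are integers; the hex formatting here is an assumption
        return '0x%08x' % getattr(zlib, checksum_type)(data)
    return hashlib.new(checksum_type, data).hexdigest()


fd, sample = tempfile.mkstemp()
with os.fdopen(fd, 'wb') as handle:
    handle.write(b'hello')
results = {ctype: checksum_sketch(sample, ctype) for ctype in ('size', 'crc32', 'sha256')}
os.remove(sample)
```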
convert_name(name, upper=False)
¶
Converts name so it can be used as variable name
copy(paths, target_path, force_in_dry_run=False, **kwargs)
¶
Copy single file/directory or list of files and directories to specified location
PARAMETER | DESCRIPTION |
---|---|
paths | path(s) to copy |
target_path | target location |
force_in_dry_run | force running the command during dry run. DEFAULT: False |
kwargs | additional named arguments to pass down to copy_dir |
copy_dir(path, target_path, force_in_dry_run=False, dirs_exist_ok=False, check_for_recursive_symlinks=True, **kwargs)
¶
Copy a directory from specified location to specified location.
shutil.copytree is used if the target path does not exist yet; if the target path already exists, the 'copy' function will be used to copy the contents of the source path to the target path.
Additional specified named arguments are passed down to shutil.copytree/copy if used.
PARAMETER | DESCRIPTION |
---|---|
path | the original directory path |
target_path | path to copy the directory to |
force_in_dry_run | force running the command during dry run. DEFAULT: False |
dirs_exist_ok | boolean indicating whether it's OK if the target directory already exists. DEFAULT: False |
check_for_recursive_symlinks | if the symlink arg is not given or False, check for recursive symlinks first. DEFAULT: True |
copy_easyblocks(paths, target_dir)
¶
Find right location for easyblock file and copy it there
copy_file(path, target_path, force_in_dry_run=False)
¶
Copy a file from specified location to specified location
PARAMETER | DESCRIPTION |
---|---|
path | the original filepath |
target_path | path to copy the file to |
force_in_dry_run | force copying of file during dry run. DEFAULT: False |
copy_files(paths, target_path, force_in_dry_run=False, target_single_file=False, allow_empty=True, verbose=False)
¶
Copy list of files to specified target path. Target directory is created if it doesn't exist yet.
PARAMETER | DESCRIPTION |
---|---|
paths | list of filepaths to copy |
target_path | path to copy files to |
force_in_dry_run | force copying of files during dry run. DEFAULT: False |
target_single_file | if there's only a single file to copy, copy to a file at target path (not a directory). DEFAULT: False |
allow_empty | allow empty list of paths to copy as input (if False: raise error on empty input list). DEFAULT: True |
verbose | print a message to report copying of files. DEFAULT: False |
copy_framework_files(paths, target_dir)
¶
Find right location for framework file and copy it there
copytree(src, dst, symlinks=False, ignore=None)
¶
DEPRECATED and removed. Use copy_dir
create_index(path, ignore_dirs=None)
¶
Create index for files in specified path.
create_lock(lock_name)
¶
Create lock with specified name.
create_patch_info(patch_spec)
¶
Create info dictionary from specified patch spec.
create_unused_dir(parent_folder, name)
¶
Create a new folder in parent_folder using name as the name. If a folder with that name already exists, '_0' is appended, and this is retried with increasing numbers until an unused name is found.
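The retry scheme can be sketched as follows. The helper name unused_dir is hypothetical, and relying on FileExistsError from os.mkdir (rather than checking for existence first) is a design choice of this sketch, made so that two processes cannot both claim the same name.

```python
import os
import tempfile


def unused_dir(parent_folder, name):
    """Hypothetical helper: create parent_folder/name, or name_0, name_1, ... if taken."""
    candidate = os.path.join(parent_folder, name)
    suffix = 0
    while True:
        try:
            os.mkdir(candidate)
            return candidate
        except FileExistsError:
            # name is taken: try the next numbered variant
            candidate = os.path.join(parent_folder, '%s_%d' % (name, suffix))
            suffix += 1


parent = tempfile.mkdtemp()
created = [os.path.basename(unused_dir(parent, 'build')) for _ in range(3)]
```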
decode_class_name(name)
¶
Return decoded version of class name.
decode_string(name)
¶
Decoding function to revert result of encode_string.
derive_alt_pypi_url(url)
¶
Derive alternate PyPI URL for given URL.
det_common_path_prefix(paths)
¶
Determine common path prefix for a given list of paths.
det_file_size(http_header)
¶
Determine size of file from provided HTTP header info (without downloading it).
det_lock_path(lock_name)
¶
Determine full path for lock with specified name.
det_patched_files(path=None, txt=None, omit_ab_prefix=False, github=False, filter_deleted=False)
¶
Determine list of patched files from a patch. It searches for "+++ path/to/patched/file" lines to determine the patched files. Note: does not correctly handle filepaths with spaces.
PARAMETER | DESCRIPTION |
---|---|
path | the path to the diff. DEFAULT: None |
txt | the contents of the diff (either path or txt should be given). DEFAULT: None |
omit_ab_prefix | ignore the a/ or b/ prefix of the files. DEFAULT: False |
github | only consider lines that start with 'diff --git' to determine list of patched files. DEFAULT: False |
filter_deleted | filter out all files that were deleted by the patch. DEFAULT: False |
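Scanning for "+++" lines can be sketched with a single regular expression. The helper name patched_files and the exact pattern are assumptions; matching the path with \S+ also shows why paths containing spaces are not handled correctly.

```python
import re

# assumption: a patched file is whatever non-whitespace token follows '+++' at line start
PATCHED_FILE_REGEX = re.compile(r'^\+{3}\s+(?P<file>\S+)', re.M)


def patched_files(txt, omit_ab_prefix=False):
    """Hypothetical helper: list files named on '+++' lines of a unified diff."""
    files = []
    for match in PATCHED_FILE_REGEX.finditer(txt):
        path = match.group('file')
        if omit_ab_prefix and path[:2] in ('a/', 'b/'):
            path = path[2:]
        # '/dev/null' marks a deleted file rather than a patched one
        if path != '/dev/null' and path not in files:
            files.append(path)
    return files


diff_txt = (
    "--- a/src/main.c\t2020-01-01\n"
    "+++ b/src/main.c\t2020-01-02\n"
    "@@ -1 +1 @@\n"
    "-int x;\n"
    "+int y;\n"
    "--- a/README\n"
    "+++ b/README\n"
)
with_prefix = patched_files(diff_txt)
without_prefix = patched_files(diff_txt, omit_ab_prefix=True)
```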
det_size(path)
¶
Determine total size of given filepath (in bytes).
diff_files(path1, path2)
¶
Return unified diff between two files
dir_contains_files(path, recursive=True)
¶
Return True if the given directory contains any file.
PARAMETER | DESCRIPTION |
---|---|
recursive | if False, only the path itself is considered; otherwise all subdirectories are searched as well. DEFAULT: True |
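The difference between the recursive and non-recursive check can be sketched with os.walk and os.listdir. The helper name contains_files is hypothetical; this is an illustration, not the framework's implementation.

```python
import os
import tempfile


def contains_files(path, recursive=True):
    """Hypothetical helper: True if path (optionally including subdirectories) holds a file."""
    if recursive:
        # os.walk yields the list of files for every directory below path
        return any(files for _root, _dirs, files in os.walk(path))
    return any(os.path.isfile(os.path.join(path, name)) for name in os.listdir(path))


top = tempfile.mkdtemp()
os.mkdir(os.path.join(top, 'sub'))
with open(os.path.join(top, 'sub', 'data.txt'), 'w') as handle:
    handle.write('x')
deep = contains_files(top)                      # finds sub/data.txt
shallow = contains_files(top, recursive=False)  # only sees the 'sub' directory
```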
download_file(filename, url, path, forced=False)
¶
Download a file from the given URL, to the specified path.
dump_index(path, max_age_sec=None)
¶
Create index for files in specified path, and dump it to file (alphabetically sorted).
encode_class_name(name)
¶
Return encoded version of class name.
encode_string(name)
¶
This encoding function handles funky software names ad infinitum. For example, '0_foo+0x0x#-$__' becomes '0_underscore_foo_plus_0x0x_hash__minus__dollar__underscore__underscore_'. The intention is to have a robust escaping mechanism for names like c++, C# et al.
It has been inspired by the concepts seen at the following locations, but in lowercase style:
* http://fossies.org/dox/netcdf-4.2.1.1/escapes_8c_source.html
* http://celldesigner.org/help/CDH_Species_01.html
* http://research.cs.berkeley.edu/project/sbp/darcsrepo-no-longer-updated/src/edu/berkeley/sbp/misc/ReflectiveWalker.java
It can be extended freely as per ISO/IEC 10646:2012 / Unicode 6.1 names:
* http://www.unicode.org/versions/Unicode6.1.0/
For readability of >2 words, it is suggested to use CamelCase style. So, yes, 'GreekSmallLetterEtaWithPsiliAndOxia' could indeed be a fully valid software name; software "electron" in the original spelling anyone? ;-)
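The character-by-character escaping can be sketched as below. CHARMAP is a hypothetical subset containing only the characters that occur in the example above (the actual mapping is assumed to be larger), and the helper name encode is illustrative.

```python
# Hypothetical subset of the character map, limited to the documented example
CHARMAP = {
    '_': '_underscore_',
    '+': '_plus_',
    '#': '_hash_',
    '-': '_minus_',
    '$': '_dollar_',
}


def encode(name):
    """Replace every mapped character by its spelled-out name, keep all others."""
    return ''.join(CHARMAP.get(char, char) for char in name)


encoded = encode('0_foo+0x0x#-$__')
cplusplus = encode('c++')
```

Because the underscore itself is escaped, every underscore in the result belongs to an escape token, which is what makes the encoding reversible.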
expand_glob_paths(glob_paths)
¶
Expand specified glob paths to a list of unique non-glob paths to only files.
extract_cmd(filepath, overwrite=False)
¶
Determines the file type of file at filepath, returns extract cmd based on file suffix
extract_file(fn, dest, cmd=None, extra_options=None, overwrite=False, forced=False, change_into_dir=None)
¶
Extract file at given path to specified directory
PARAMETER | DESCRIPTION |
---|---|
fn | path to file to extract |
dest | location to extract to |
cmd | extract command to use (derived from filename if not specified). DEFAULT: None |
extra_options | extra options to pass to extract command. DEFAULT: None |
overwrite | overwrite existing unpacked file. DEFAULT: False |
forced | force extraction in (extended) dry run mode. DEFAULT: False |
change_into_dir | change into resulting directory; None (current default) implies True, but this is deprecated: this named argument should be set to False or True explicitly (in a future major release, the default will be changed to False). DEFAULT: None |
RETURNS | DESCRIPTION |
---|---|
path to directory (in case of success) |
find_backup_name_candidate(src_file)
¶
Returns a non-existing file to be used as destination for backup files
find_base_dir()
¶
Try to locate a possible new base directory.
- this is typically a single subdir, e.g. from untarring a tarball
- when extracting multiple tarballs in the same directory, expect only the first one to give the correct path
find_easyconfigs(path, ignore_dirs=None)
¶
Find .eb easyconfig files in path
find_eb_script(script_name)
¶
Find EasyBuild script with given name (in easybuild/scripts subdirectory).
find_extension(filename)
¶
Find best match for filename extension.
find_flexlm_license(custom_env_vars=None, lic_specs=None)
¶
Find FlexLM license.
Considers the specified list of environment variables; checks for path to existing license file or valid license server specification; duplicate paths are not retained in the returned list of license specs.
If no license is found through environment variables, also consider 'lic_specs'.
PARAMETER | DESCRIPTION |
---|---|
custom_env_vars | list of environment variables to consider (if None, only consider $LM_LICENSE_FILE). DEFAULT: None |
lic_specs | list of license specifications. DEFAULT: None |
RETURNS | DESCRIPTION |
---|---|
tuple with list of valid license specs found and name of first valid environment variable |
find_glob_pattern(glob_pattern, fail_on_no_match=True)
¶
Find unique file/dir matching glob_pattern (raises error if more than one match is found)
get_easyblock_class_name(path)
¶
Make sure file is an easyblock and get easyblock class name
get_source_tarball_from_git(filename, targetdir, git_config)
¶
Downloads a git repository, at a specific tag or commit, recursively or not, and makes an archive of it.
PARAMETER | DESCRIPTION |
---|---|
filename | name of the archive to save the code to (must be .tar.gz) |
targetdir | target directory where to save the archive to |
git_config | dictionary containing url, repo_name, recursive, and one of tag or commit |
guess_patch_level(patched_files, parent_dir)
¶
Guess patch level based on list of patched files and specified directory.
has_recursive_symlinks(path)
¶
Check the given directory for recursive symlinks.
That means symlinks to folders inside the path which would cause infinite loops when traversed regularly.
PARAMETER | DESCRIPTION |
---|---|
path | path to directory to check |
install_fake_vsc()
¶
Put fake 'vsc' Python package in place, to catch easyblocks/scripts that still import from vsc.* namespace (vsc-base & vsc-install were ingested into the EasyBuild framework for EasyBuild 4.0, see https://github.com/easybuilders/easybuild-framework/pull/2708)
is_alt_pypi_url(url)
¶
Determine whether specified URL is already an alternate PyPI URL, i.e. whether it contains a hash.
is_binary(contents)
¶
Check whether given bytestring represents the contents of a binary file or not.
is_generic_easyblock(easyblock)
¶
Return whether specified easyblock name is a generic easyblock or not.
is_patch_file(path)
¶
Determine whether file at specified path is a patch file (based on +++ and --- lines being present).
is_readable(path)
¶
Return whether file at specified location exists and is readable.
is_sha256_checksum(value)
¶
Check whether provided string is a SHA256 checksum.
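A SHA256 checksum in hexadecimal form is 64 characters long, so the check can be sketched with one regular expression. The helper name looks_like_sha256 is hypothetical, and accepting lowercase digits only is an assumption of this sketch.

```python
import hashlib
import re


def looks_like_sha256(value):
    """Hypothetical helper: True if value is a string of exactly 64 lowercase hex digits."""
    return isinstance(value, str) and re.fullmatch(r'[0-9a-f]{64}', value) is not None


valid = looks_like_sha256(hashlib.sha256(b'example').hexdigest())
too_short = looks_like_sha256('af314')
```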
load_index(path, ignore_dirs=None)
¶
Load index for specified path, and return contents (or None if no index exists).
locate_files(files, paths, ignore_subdirs=None)
¶
Determine full path for list of files, in given list of paths (directories).
mkdir(path, parents=False, set_gid=None, sticky=None)
¶
Create a directory at the specified path.
PARAMETER | DESCRIPTION |
---|---|
parents | create parent directories if needed (mkdir -p). DEFAULT: False |
set_gid | set group ID bit, to make subdirectories and files inherit group. DEFAULT: None |
sticky | set the sticky bit on this directory (a.k.a. the restricted deletion flag), to prevent users from removing/renaming files in this directory. DEFAULT: None |
modify_env(old, new)
¶
NO LONGER SUPPORTED: use modify_env from easybuild.tools.environment instead
move_file(path, target_path, force_in_dry_run=False)
¶
Move a file from path to target_path
PARAMETER | DESCRIPTION |
---|---|
path | the original filepath |
target_path | path to move the file to |
force_in_dry_run | force running the command during dry run. DEFAULT: False |
move_logs(src_logfile, target_logfile)
¶
Move log file(s).
normalize_path(path)
¶
Normalize path removing empty and dot components.
Similar to os.path.normpath, but does not resolve '..', since resolving it may yield a wrong path when symlinks are used.
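The contrast with normpath can be sketched as follows. The helper name normalize is hypothetical, and this sketch uses '/' as separator and collapses any run of leading slashes to one, which is a simplification.

```python
import posixpath


def normalize(path):
    """Hypothetical helper: drop empty and '.' components, keep '..' untouched."""
    leading = '/' if path.startswith('/') else ''
    comps = [comp for comp in path.split('/') if comp and comp != '.']
    return leading + '/'.join(comps)


messy = '/opt//eb/./software/../modules/'
kept = normalize(messy)                 # '..' is preserved
resolved = posixpath.normpath(messy)    # '..' is resolved lexically
```

If 'software' were a symlink to a directory elsewhere, the lexically resolved path would point to a different location than the one the file system would actually reach.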
open_file(path, mode)
¶
Open a (usually) text file. If mode is not binary, then utf-8 encoding will be used for Python 3.x
parse_http_header_fields_urlpat(arg, urlpat=None, header=None, urlpat_headers_collection=None, maxdepth=3)
¶
Recurse into multi-line string "[URLPAT::][HEADER:]FILE|FIELD" where FILE may be another such string or file containing lines matching the same format, such as "^https://www.example.com::/path/to/headers.txt", and flatten the result to dict e.g. {'^https://www.example.com': ['Authorization: Basic token', 'User-Agent: Special Agent']}
parse_log_for_error(txt, regExp=None, stdout=True, msg=None)
¶
NO LONGER SUPPORTED: use parse_log_for_error from easybuild.tools.run instead
path_matches(path, paths)
¶
Check whether given path matches any of the provided paths.
pypi_source_urls(pkg_name)
¶
Fetch list of source URLs (incl. source filename) for specified Python package from PyPI, using 'simple' PyPI API.
read_file(path, log_error=True, mode='r')
¶
Read contents of file at given path, in a robust way.
register_lock_cleanup_signal_handlers()
¶
Register signal handler for signals that cancel the current EasyBuild session, so we can clean up the locks that were created first.
remove(paths)
¶
Remove single file/directory or list of files and directories
PARAMETER | DESCRIPTION |
---|---|
paths | path(s) to remove |
remove_dir(path)
¶
Remove directory at specified path.
remove_file(path)
¶
Remove file at specified path.
remove_lock(lock_name)
¶
Remove lock with specified name.
resolve_path(path)
¶
Return fully resolved path for given path.
PARAMETER | DESCRIPTION |
---|---|
path | path that (maybe) contains symlinks |
rmtree2(path, n=3)
¶
Wrapper around shutil.rmtree to make it more robust when used on NFS mounted file systems.
run_cmd(cmd, log_ok=True, log_all=False, simple=False, inp=None, regexp=True, log_output=False, path=None)
¶
NO LONGER SUPPORTED: use run_cmd from easybuild.tools.run instead
run_cmd_qa(cmd, qa, no_qa=None, log_ok=True, log_all=False, simple=False, regexp=True, std_qa=None, path=None)
¶
NO LONGER SUPPORTED: use run_cmd_qa from easybuild.tools.run instead
search_file(paths, query, short=False, ignore_dirs=None, silent=False, filename_only=False, terse=False, case_sensitive=False)
¶
Search for files in specified paths using specified search query (regular expression).
PARAMETER | DESCRIPTION |
---|---|
paths | list of paths to search in |
query | search query to use (regular expression); case-insensitive by default |
short | figure out common prefix of hits, use variable to factor it out. DEFAULT: False |
ignore_dirs | list of directories to ignore (default: ['.git', '.svn']). DEFAULT: None |
silent | whether or not to remain silent (don't print anything). DEFAULT: False |
filename_only | only return filenames, not file paths. DEFAULT: False |
terse | stick to terse (machine-readable) output, as opposed to pretty-printing. DEFAULT: False |
set_gid_sticky_bits(path, set_gid=None, sticky=None, recursive=False)
¶
Set GID/sticky bits on specified path.
symlink(source_path, symlink_path, use_abspath_source=True)
¶
Create a symlink at the specified path to the given path.
PARAMETER | DESCRIPTION |
---|---|
source_path | source file path |
symlink_path | symlink file path |
use_abspath_source | resolves the absolute path of source_path. DEFAULT: True |
verify_checksum(path, checksums, computed_checksums=None)
¶
Verify checksum of specified file.
PARAMETER | DESCRIPTION |
---|---|
path | path of file to verify checksum of |
checksums | checksum values to compare to (and type, optionally, default is MD5), e.g., 'af314', ('sha', '5ec1b') |
computed_checksums | optional dictionary of (current) checksum(s) for this file, indexed by the checksum type (e.g. 'sha256'). Each existing entry will be used, missing ones will be computed. DEFAULT: None |
weld_paths(path1, path2)
¶
Weld two paths together, taking into account overlap between tail of 1st path with head of 2nd path.
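Welding means the overlapping part appears only once in the result. A minimal sketch that compares whole path components follows; the helper name weld and the component-wise comparison are assumptions of this sketch, not the framework's implementation.

```python
def weld(path1, path2):
    """Hypothetical helper: join two paths, merging the tail of path1 with the head of path2."""
    parts1 = [part for part in path1.split('/') if part]
    parts2 = [part for part in path2.split('/') if part]
    overlap = 0
    # look for the longest tail of path1 that equals a head of path2
    for size in range(min(len(parts1), len(parts2)), 0, -1):
        if parts1[-size:] == parts2[:size]:
            overlap = size
            break
    prefix = '/' if path1.startswith('/') else ''
    return prefix + '/'.join(parts1 + parts2[overlap:])


welded = weld('/foo/bar/baz', 'bar/baz/qux')
joined = weld('/foo', 'bar')  # no overlap: behaves like a plain join
```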
which(cmd, retain_all=False, check_perms=True, log_ok=True, log_error=None, on_error=None)
¶
Return (first) path in $PATH for specified command, or None if command is not found
PARAMETER | DESCRIPTION |
---|---|
retain_all | returns all locations to the specified command in $PATH, not just the first one. DEFAULT: False |
check_perms | check whether candidate path has read/exec permissions before accepting it as a match. DEFAULT: True |
log_ok | log an info message where the command has been found (if any). DEFAULT: True |
on_error | what to do if the command was not found, default: WARN. Possible values: IGNORE, WARN, ERROR. DEFAULT: None |
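The lookup itself can be sketched with the standard library. The helper name which_sketch and its search_path argument (used here to avoid depending on the real $PATH) are hypothetical; logging and error handling are left out.

```python
import os
import tempfile


def which_sketch(cmd, retain_all=False, check_perms=True, search_path=None):
    """Hypothetical helper: look up cmd in the directories of search_path (default: $PATH)."""
    if search_path is None:
        search_path = os.environ.get('PATH', '')
    hits = []
    for directory in search_path.split(os.pathsep):
        candidate = os.path.join(directory, cmd)
        if not os.path.isfile(candidate):
            continue
        # optionally require read and execute permissions before accepting the match
        if check_perms and not os.access(candidate, os.R_OK | os.X_OK):
            continue
        hits.append(candidate)
    if retain_all:
        return hits
    return hits[0] if hits else None


bindir = tempfile.mkdtemp()
tool = os.path.join(bindir, 'mytool')
with open(tool, 'w') as handle:
    handle.write('#!/bin/sh\n')
os.chmod(tool, 0o644)
not_executable = which_sketch('mytool', search_path=bindir)
os.chmod(tool, 0o755)
found = which_sketch('mytool', search_path=bindir)
```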
write_file(path, data, append=False, forced=False, backup=False, always_overwrite=True, verbose=False, show_progress=False, size=None)
¶
Write given contents to file at given path; overwrites current file contents without backup by default!
PARAMETER | DESCRIPTION |
---|---|
path | location of file |
data | contents to write to file. Can be a file-like object of binary data |
append | append to existing file rather than overwrite. DEFAULT: False |
forced | force actually writing file in (extended) dry run mode. DEFAULT: False |
backup | back up existing file before overwriting or modifying it. DEFAULT: False |
always_overwrite | don't require --force to overwrite an existing file. DEFAULT: True |
verbose | be verbose, i.e. inform where backup file was created. DEFAULT: False |
show_progress | show progress bar while writing file. DEFAULT: False |
size | size (in bytes) of data to write (used for progress bar). DEFAULT: None |