New storage FAQs and issues
Workflows with some of the issues highlighted below will have a knock-on effect for other users, so please take the time to check and change your code to make appropriate use of the new storage system. If used correctly, the new storage offers us a high-performance, scalable file system with the capability for object storage as tools and interfaces evolve, and we can continue to serve the growing demand for storage in the most cost-effective manner.
We understand these changes may cause you some extra work, but we hope that you can understand why they were necessary and how to adapt to them. We will continue to add to this page when new issues or solutions are found.
Parallel threads can update the same file concurrently, from the same or from different servers.
Suggested solution: use a /work/scratch-pw* volume, which is PFS (but not /work/scratch-nopw*!), then move output to SOF storage.
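As a rough sketch of that pattern (the scratch volume, group workspace path and program name below are placeholders, not prescribed values), a batch job can write its shared output to a PFS scratch volume and only move the finished files onto SOF:

```bash
#!/bin/bash
# Sketch only: write concurrently-updated output to PFS scratch,
# then move the finished files to the SOF group workspace.
OUTDIR=/work/scratch-pw2/$USER/run_$SLURM_JOB_ID   # PFS scratch: safe for parallel writes
GWSDIR=/gws/nopw/j04/mygws/$USER/results           # SOF GWS: final destination (placeholder)

mkdir -p "$OUTDIR"
./my_parallel_program --output "$OUTDIR/shared_output.dat"   # many threads write to this file

# Move the result to SOF storage only once the parallel writes have finished
mv "$OUTDIR"/shared_output.dat "$GWSDIR"/
```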
Suggested solution: see the job submission advice here showing how to use SBATCH options to write distinct output and log files for each job, or each element of a job array.
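For example (the job name, array size and program are illustrative), Slurm's filename patterns can give every job or array element its own log files and its own data file:

```bash
#!/bin/bash
#SBATCH --job-name=myjob
#SBATCH --array=1-10
#SBATCH -o %x_%A_%a.out   # stdout per array element: jobname_arrayjobid_taskid.out
#SBATCH -e %x_%A_%a.err   # stderr per array element (use %j instead for a single job)

# Each array element writes its own data file, so no two tasks touch the same path.
./process_chunk --index "$SLURM_ARRAY_TASK_ID" --output "chunk_${SLURM_ARRAY_TASK_ID}.nc"
```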
This is a form of parallel write truncation.
Suggested solution: take care to check for completion of one process before another process deletes or modifies a file. Be sure to check that a job has completed before interactively deleting files from any server you are logged into (e.g. sci1.jasmin.ac.uk).
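One way to check (the job ID and scratch path here are illustrative) is to query the job's state with sacct before removing anything from an interactive session:

```bash
# Check the state of job 1234567 before deleting its working files.
STATE=$(sacct -j 1234567 --format=State --noheader | head -n1 | awk '{print $1}')

if [ "$STATE" = "COMPLETED" ]; then
    rm -r "/work/scratch-pw2/$USER/run_1234567"   # safe: the job has finished
else
    echo "Job 1234567 is still $STATE - not deleting its files yet."
fi
```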
This can happen with rsync, leading to duplicate copying processes.
Suggested solution: check for successful termination of one process before starting another.
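A simple way to avoid launching a second rsync while the first is still running (the lock file location and paths are only examples) is to wrap the transfer with flock, which exits rather than starting a duplicate copy:

```bash
#!/bin/bash
# Only one instance of this transfer can hold the lock at a time.
LOCK="$HOME/.mytransfer.lock"

flock --nonblock "$LOCK" \
    rsync -av /work/scratch-nopw2/$USER/results/ /gws/nopw/j04/mygws/$USER/results/ \
    || echo "Previous transfer still running - not starting another."
```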
Here’s an example of how a stuck process shows up, using “lsof” and by listing user processes with “ps”. The same file, “ISIMIPnc_to_SDGVMtxt.py”, is being edited in two separate “vim” editors. In this case, the system team was unable to kill the processes on behalf of the user, so the only solution was to reboot sci1.
```
lsof /gws/nopw/j04/gwsnnn/
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
vim 20943 fbloggs cwd DIR 0,43 0 2450 /gws/nopw/j04/gwsnnn/fbloggs/sdgvm/ISIMIP
vim 20943 fbloggs 4u REG 0,43 24576 2896 /gws/nopw/j04/gwsnnn/fbloggs/sdgvm/ISIMIP/.ISIMIPnc_to_SDGVMtxt.py.swp
vim 31843 fbloggs cwd DIR 0,43 0 2450 /gws/nopw/j04/gwsnnn/fbloggs/sdgvm/ISIMIP
vim 31843 fbloggs 3r REG 0,43 12111 2890 /gws/nopw/j04/gwsnnn/fbloggs/sdgvm/ISIMIP/ISIMIPnc_to_SDGVMtxt.py

ps -ef | grep fbloggs
......
fbloggs 20943 1 0 Jan20 ? 00:00:00 vim ISIMIPnc_to_SDGVMtxt.py
fbloggs 31843 1 0 Jan20 ? 00:00:00 vim ISIMIPnc_to_SDGVMtxt.py smc_1D-2D_1979-2012_Asia_NewDelhi.py
```
Suggested solution: If you are unable to kill the processes yourself, contact the helpdesk with sufficient information to ask for it to be done for you. In some cases, the only solution at present is for the host or hosts to be rebooted.
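Before raising a helpdesk ticket, it is usually worth trying to stop your own stuck processes on the server concerned. Using the PIDs from the example above:

```bash
# List your own processes holding files open under the group workspace
lsof /gws/nopw/j04/gwsnnn/ | grep "$USER"

# Try a normal termination first, then force-kill anything that refuses to exit
kill 20943 31843
sleep 10
kill -9 20943 31843
```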
The larger file systems in operation within JASMIN are suitable for storing and manipulating large datasets and are not currently optimised for handling small (<64 kByte) files. These systems are not the same as those you would find on a desktop computer or even a large server, and often involve many disks to store the data itself and metadata servers to store the file system metadata (such as file size, modification dates, ownership, etc.). Compiling code from source files, or running code from Python virtual environments, are examples of activities which can involve accessing large numbers of small files.
Later versions of our PFS systems handled this by using SSD storage for small files, transparently to the user. SOF, however, can't do this (until later in 2019), so in Phase 4 we introduced larger home directories based on SSD, as well as an additional and larger scratch area.
Suggested solution: Please consider using your home directory for small-file storage, or /work/scratch-nopw2 for situations involving LOTUS intermediate job storage. It should be possible to share code, scripts and other small files from your home directory by changing the file and directory permissions yourself.
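As a sketch (the directory name is just an example), read access to a directory of scripts held in your home directory can be opened up like this:

```bash
# Allow other users to traverse your home directory (without listing its contents)...
chmod o+x "$HOME"

# ...then grant read access (and directory traversal) on just the shared area.
chmod -R o+rX "$HOME/shared-scripts"
```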
We are planning to address this further in Phase 5 by deploying additional SSD storage which could be made available in small amounts to GWSs as an additional type of storage. [Now available: please ask about adding an “SMF” volume to your workspace]
Writing netCDF3 classic files to SOF storage (e.g. /gws/nopw/j04/*) should be avoided. This is because operations involving a lot of repositioning of the file pointer (as happens with netCDF3 writing) suffer from similar issues to writing large numbers of small files to SOF storage (known as QB).
Suggested solution: It is more efficient to write netCDF3 classic files to another filesystem type (e.g. /work/scratch-pw* or /work/scratch-nopw2) and then move them to a SOF GWS, rather than writing directly to SOF.
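For example (the program name, scratch directory and GWS path are placeholders), write the netCDF3 files to scratch first and move them onto SOF once they are complete:

```bash
#!/bin/bash
# Write netCDF3 classic output to a non-SOF scratch area first...
TMPOUT=/work/scratch-nopw2/$USER/nc3_out
GWS=/gws/nopw/j04/mygws/$USER/data

mkdir -p "$TMPOUT"
./write_netcdf3_output --outdir "$TMPOUT"   # placeholder for your netCDF3-writing program

# ...then move the completed files to the SOF group workspace in one step.
mv "$TMPOUT"/*.nc "$GWS"/
```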
This can be due to overloading of the scientific analysis servers (sci*.jasmin.ac.uk) which we provide for interactive use. They're great for testing code and developing a workflow, but are not designed for actually doing the big processing. Please take this heavy-lifting or long-running work to the LOTUS batch processing cluster, leaving the interactive compute nodes responsive enough for everyone to use.
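A minimal LOTUS submission might look like this (the partition name, time limit and memory request are illustrative; check the current LOTUS documentation for suitable values), submitted with sbatch myscript.sh:

```bash
#!/bin/bash
#SBATCH --partition=short-serial   # example partition name - check the LOTUS docs
#SBATCH --time=01:00:00
#SBATCH --mem=4G
#SBATCH -o %j.out
#SBATCH -e %j.err

# Run the analysis you prototyped interactively on a sci machine,
# but on a LOTUS compute node instead.
python my_analysis.py
```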
Suggested solution: When you log in via one of the login*.jasmin.ac.uk nodes, you are shown a ‘message of the day’: a list of all the sci* machines, along with memory usage and the number of users on each node at that time. This can help you select a less-used machine (but don’t necessarily expect the same machine to be the right choice next time!).
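Once logged in, you can also check how busy a particular sci machine is yourself, for example:

```bash
# Show the load averages and how long the machine has been up...
uptime

# ...and count the distinct users currently logged in to this machine.
who | awk '{print $1}' | sort -u | wc -l
```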