Data Transfer Tools: Copy service
*** Unfortunately, this service is not yet available on the new SCI machines (sci*.jasmin.ac.uk), which submit LOTUS jobs using SLURM. It is still available on the older SCI machines (jasmin-sci*.ceda.ac.uk), which submit via LSF. However, the LSF system is now being drained of nodes as we move over to SLURM, and the older SCI machines will be replaced in due course. Please look out for further announcements. ***
This article provides information about the bcopy service, including:
- Where you can use it
- How to use it
- Further advice
The bcopy command provides a service which uses the LOTUS batch cluster to copy data efficiently between disk areas within JASMIN, for example:
- home directory*
- group workspaces
- scratch volumes
It is the recommended method for copying large amounts of data between group workspaces.
* Home directories are unlikely to contain large enough quantities of data to make it worthwhile copying data this way, but this may be useful for testing purposes.
Where you can use it
The service is available on the scientific analysis servers cems-sci[1,2].ceda.ac.uk as the command /usr/local/bin/bcopy. These servers are able to submit jobs to LOTUS.
It is not installed on the data transfer servers, as these have no access to LOTUS.
How to use it
$ bcopy -s SourceDirectory -d DestinationDirectory [-c] [--retry]
-c is optional, and checks whether the source is readable by the current user
--retry is only used when the system reports an error and asks the user to re-run the job with the --retry switch appended at the end of the command
- You need to issue the command in a directory where you have write permission, because it will attempt to create a number of files associated with running a LOTUS job.
- Both source and destination need to be directories. It is not possible to specify particular files within a directory to be copied.
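The two conditions above can be checked before submitting anything to LOTUS. The following is a minimal sketch, not part of the bcopy service itself: the function name bcopy_preflight is hypothetical, and it assumes that the destination directory must already exist, which the text above does not state explicitly.

```shell
# Hypothetical pre-flight check to run before bcopy (not part of the service).
# Usage: bcopy_preflight SOURCE_DIR DEST_DIR
bcopy_preflight() {
    local src="$1"
    local dst="$2"

    # bcopy copies whole directories, not individual files.
    if [ ! -d "$src" ]; then
        echo "source is not a directory: $src" >&2
        return 1
    fi
    # Assumption: the destination must already exist as a directory.
    if [ ! -d "$dst" ]; then
        echo "destination is not a directory: $dst" >&2
        return 1
    fi
    # Files associated with the LOTUS job are created in the current directory.
    if [ ! -w "." ]; then
        echo "no write permission in current directory: $PWD" >&2
        return 1
    fi
    return 0
}
```

A typical use would be `bcopy_preflight SourceDirectory DestinationDirectory && bcopy -s SourceDirectory -d DestinationDirectory`, so that the job is only submitted when all three checks pass.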
$ bcopy -s /group_workspaces/jasmin/gws1/mydir -d /group_workspaces/jasmin/gws2/dest
Once the command has been issued, you should receive confirmation of job submission:
PANFS source directory: /group_workspaces/jasmin/gws1/mydir
PANFS destination directory: /group_workspaces/jasmin/gws2/dest
Job <8063810> is submitted to queue <copy>.
Done
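If you want to keep track of the job number in a script, it can be pulled out of the confirmation message. This is a sketch only: the helper name extract_job_id is hypothetical, and it assumes the message has the same wording as the example above and is written to standard output.

```shell
# Hypothetical helper: read a bcopy submission message on stdin and
# print the job number found in "Job <NNN> is submitted to queue ...".
extract_job_id() {
    sed -n 's/.*Job <\([0-9][0-9]*\)> is submitted to queue.*/\1/p'
}

# Example (assumes the confirmation goes to standard output):
#   jobid=$(bcopy -s SourceDirectory -d DestinationDirectory | extract_job_id)
```

Nothing is printed if the input contains no submission line, so an empty result can be treated as a failed submission.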
Once the job has completed, you should see output files in the directory where you issued the command, with names that include JOBID, where JOBID is the job number given in the submission confirmation above. You should inspect these text files to check any output given by the service.
Further advice
- Although some of the messages generated by the service refer to PANFS or Panasas, the service can be used for copying between any two file systems which are mounted on LOTUS (since LOTUS nodes are used as workers for the copy process).
- You are strongly recommended to use the bcopy service in combination with a synchronisation tool like rsync to ensure that all data has been copied. For example, you could use the bcopy service to do the main bulk of the copying, then run a subsequent rsync command to "mop up" any remaining files which may have been missed. In this case, a data transfer server should be used for the rsync operation, in case the remaining transfer is large. See rsync.
- You should cease any further changes in the source directory once the initial bcopy command has been issued.
- For full verification, you should consider generating a manifest including checksums of each file before and after the copy operation. The two manifests can then be compared to ensure that all the data has been transferred and that its integrity has been preserved.
- Note that, as with other transfers (including rsync) invoked by a user with normal system privileges, file and directory ownership are NOT preserved when moving data which contains items belonging to multiple users. Files and directories created in the destination will take on the ownership of the user who invoked the transfer. If you need to preserve ownership metadata, please contact the helpdesk email@example.com.
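The "mop up" and manifest steps described above could be scripted roughly as follows. This is a sketch under stated assumptions rather than a supported tool: the function names mop_up, make_manifest and compare_manifests are hypothetical, sha256sum (from GNU coreutils) is assumed to be installed, and the manifest files should be written somewhere outside the directories being compared.

```shell
# Hypothetical helper: copy anything bcopy missed from SRC into DST.
# -a (archive mode) recurses and keeps timestamps and permissions;
# the trailing slashes make rsync copy the contents of SRC into DST.
# Intended to be run on a data transfer server.
mop_up() {
    rsync -a "$1/" "$2/"
}

# Hypothetical helper: write a sorted checksum manifest of every regular
# file under DIR to OUT. Paths are stored relative to DIR so that
# manifests of the source and destination can be compared directly.
# Assumes sha256sum is available.
make_manifest() {
    local dir="$1"
    local out="$2"
    ( cd "$dir" && find . -type f -exec sha256sum {} + | LC_ALL=C sort -k 2 ) > "$out"
}

# Hypothetical helper: show differences between two manifests.
# Returns 0 only when the manifests are identical.
compare_manifests() {
    diff "$1" "$2"
}
```

For example, after the bcopy job and `mop_up` have finished, `make_manifest` can be run once on the source and once on the destination, and `compare_manifests` then reports any file that is missing or whose checksum differs.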