Data Transfer Tools: Using the Globus Command-Line Interface

This article describes how to transfer data using the Globus Command-Line Interface (CLI). It covers how an end-user can set up their host (laptop, desktop or home directory on their departmental server) with

  • the Globus Command-Line Interface (CLI) and
  • Globus Connect Personal

so that it can act as a Globus endpoint and be used for efficient data transfers to JASMIN’s Globus endpoint (data-xfer1.ceda.ac.uk, a high-performance server in the JASMIN Data Transfer Zone).
The Globus CLI is fully documented in the Globus documentation. It provides a command-line interface for managed transfers via the Globus cloud-based transfer service, which usually achieves a higher transfer rate over a given route than other methods. Typically this will be significantly faster than can be achieved with scp, rsync or sftp transfers, particularly if the physical network path is long.
The Globus CLI is designed for use either interactively in a shell or in scripts. A Python software development kit (SDK) is also available and should be considered for more sophisticated workflows.
Alternatively, the Globus web interface at https://www.globus.org/app/transfer can be used as an easy-to-use way to orchestrate transfers interactively.
Whichever method is used (CLI, SDK or web interface), transfers are invoked as asynchronous, managed tasks which can then be monitored and, if need be, set to retry automatically until some pre-set deadline.

Prerequisites

  • Linux environment with normal user privileges, or
  • Mac environment with ability to install applications, or
  • Windows environment with ability to install applications
  • Python environment for that platform, with virtualenv installed (to enable installation of additional packages)
  • An active JASMIN user account, with “jasmin-login” and “hpxfer” privileges. Currently, your JASMIN account must also be linked to your CEDA account (we hope to remove this dependency in future).
    • You can check whether this is the case by going to your JASMIN account profile and checking the "Linked CEDA account" section.

Note on IP address requirement for hpxfer services

The JASMIN “hpxfer” registration asks for a specific IP address. In this particular case, using Globus, a real address is not required, so a dummy value should be specified: please use the IP address of host jasmin-xfer1.ceda.ac.uk, 130.246.142.210, which will be recognised as a dummy value during the registration process.

Please note that if you subsequently need to access the higher-performance machines jasmin-xfer[23].ceda.ac.uk for ssh-based transfers, you may still need to contact the helpdesk to supply the specific IP address of the source host at your institution. However, you can still access these two machines from within JASMIN (via the login nodes) to pull data from external hosts: you only need to supply the IP address if you need to initiate a direct connection from a host at your institution to one of these two machines.

Steps

In summary, the steps are as follows; each is explained in detail below:
  • Get a Globus ID if you haven’t already got one
  • Set up Globus CLI on end-user machine
  • Set up Globus Connect Personal on end-user machine. This is usually possible with regular (non-admin) user privileges.
  • Transfer some data, first using Globus Tutorial endpoints
  • Transfer some data to the JASMIN Globus Endpoint
In detail:
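
The CLI setup step can be sketched as follows. This is a minimal sketch, assuming the Globus CLI is installed from the Python Package Index as the globus-cli package into a virtualenv (listed under Prerequisites). The directory name globus-cli-venv and the function name setup_globus_cli are hypothetical choices for illustration and are not part of Globus.

```shell
# Hypothetical location for the virtualenv; any writable directory will do.
VENV_DIR="${VENV_DIR:-$HOME/globus-cli-venv}"

# Wrap the steps in a function so that nothing is installed until it is called.
setup_globus_cli() {
    virtualenv "$VENV_DIR" &&        # create an isolated Python environment
    . "$VENV_DIR/bin/activate" &&    # activate it in the current shell
    pip install globus-cli &&        # install the CLI package from PyPI
    globus --version                 # confirm the globus command is available
}

# Run the steps when ready:
# setup_globus_cli
```

Because the virtualenv is activated in the current shell, the globus command is available for the login step that follows. In a new shell session the environment would need to be activated again.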

$ globus login

Copy and paste the resulting URL into a browser, obtain the authorization code, and enter it at the command line where you ran “globus login”. This particular Globus CLI instance is now logged in.

The instructions below show the process of setting up Globus Connect Personal on Linux (command-line):
$ tar xzf globusconnectpersonal-latest.tgz
$ cd globusconnectpersonal-2.x.x # replace 2.x.x with the actual version in the directory name
$ globus whoami # Verify that you are already logged in (after “globus login” above)
username@globusid.org 
$ globus endpoint create --personal my-endpoint # choose label for this endpoint
Message:     Endpoint created successfully
Endpoint ID: 3922ca0e-5727-11e7-bf07-22000b9a448b
Setup Key:   5177b3ce-9292-46e4-91a7-ae0219f845f3

Complete the installation using the setup key:

$ ./globusconnectpersonal -setup 5177b3ce-9292-46e4-91a7-ae0219f845f3
Configuration directory: /home/users/mpritcha/.globusonline/lta
Contacting relay.globusonline.org:2223
Done!

Check that you can see your endpoint listed (the endpoint ID and label should correspond to the values above):

$ globus endpoint search --filter-scope my-endpoints
ID                                   | Owner                      | Display Name
------------------------------------ | -------------------------- | ------------
3922ca0e-5727-11e7-bf07-22000b9a448b | mattpritchard@globusid.org | my-endpoint

Start Globus Connect Personal:

$ ./globusconnectpersonal -start &

List the contents of a directory on the endpoint you have created, as a test:

(Note the syntax <endpointID>:<path>)

$ globus ls 3922ca0e-5727-11e7-bf07-22000b9a448b:/home/users/mpritcha/
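
The <endpointID>:<path> syntax is used by every listing and transfer command in this article, so it can help to build and take apart these strings in one place. The following is a small sketch using plain shell string handling only. The helper names globus_path, endpoint_of and path_of are hypothetical and are not part of the Globus CLI.

```shell
# Join an endpoint ID and a path into the <endpointID>:<path> form.
globus_path() {
    printf '%s:%s\n' "$1" "$2"
}

# Split such a string again: the endpoint ID is everything before the first
# colon, the path is everything after it.
endpoint_of() { printf '%s\n' "${1%%:*}"; }
path_of()     { printf '%s\n' "${1#*:}"; }

# Endpoint ID and path taken from the listing example above.
spec=$(globus_path 3922ca0e-5727-11e7-bf07-22000b9a448b /home/users/mpritcha/)
echo "$spec"
endpoint_of "$spec"
path_of "$spec"
```

The resulting string can be passed directly to a command, for example globus ls "$spec".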

So now we have a working Globus Connect Personal Endpoint. We can now try transferring files to another Globus endpoint.

Transfer some data, first using Globus Tutorial endpoints

Globus provides some open-access endpoints for testing: these can be activated simply with your Globus ID (created above) and are therefore useful for checking that everything is working properly. Let’s set up some shorthand names for these endpoints:

$ go1=ddb59aef-6d04-11e5-ba46-22000b92c6ec
$ go2=ddb59af0-6d04-11e5-ba46-22000b92c6ec

You can also find these with a search:

$ globus endpoint search tutorial

Activate the “globus tutorial endpoint 1”

$ globus endpoint activate $go1
Autoactivation succeeded with message: Endpoint activated successfully using Globus Online credentials.

Activate the “globus tutorial endpoint 2”

$ globus endpoint activate $go2
Autoactivation succeeded with message: Endpoint activated successfully using Globus Online credentials.

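Transfer a file from our local endpoint to “globus tutorial endpoint 1”. Both the source and the destination are given in the <endpointID>:<path> form:

$ globus transfer 3922ca0e-5727-11e7-bf07-22000b9a448b:/home/users/username/myfile.dat $go1:/~/myfile.dat # myfile.dat is a file on our local endpoint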

Transfer some data to the JASMIN Globus endpoint

Let’s locate the JASMIN gridftp server endpoint. You can find it with a search as follows:

$ globus endpoint search "jasmin gridftp server"
ID                                   | Owner                | Display Name         
------------------------------------ | -------------------- | ---------------------
4cc8c764-0bc1-11e6-a740-22000bf2d559 | ceda@globusid.org    | JASMIN gridftp server

Its endpoint ID is 4cc8c764-0bc1-11e6-a740-22000bf2d559. Please check the endpoint ID in your search result to make sure it matches.
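
The check can also be scripted. The sketch below works on a copy of the search output shown above, held in a shell variable, and compares the ID found for the display name “JASMIN gridftp server” with the expected value. The function name endpoint_id_for is hypothetical, and the parsing assumes the pipe-separated table layout shown in this article.

```shell
expected=4cc8c764-0bc1-11e6-a740-22000bf2d559

# Sample output copied from the search above.
search_output='ID                                   | Owner                | Display Name
------------------------------------ | -------------------- | ---------------------
4cc8c764-0bc1-11e6-a740-22000bf2d559 | ceda@globusid.org    | JASMIN gridftp server'

# Print the ID of each row whose display name equals $1.
# Rows 1 and 2 are the header and the separator, so they are skipped.
endpoint_id_for() {
    awk -F ' *[|] *' -v name="$1" \
        'NR > 2 { sub(/ +$/, "", $3); if ($3 == name) print $1 }'
}

found=$(printf '%s\n' "$search_output" | endpoint_id_for "JASMIN gridftp server")
if [ "$found" = "$expected" ]; then
    echo "endpoint ID matches"
else
    echo "endpoint ID mismatch: got '$found'" >&2
fi
```

In real use, the output of the search command would be piped into endpoint_id_for in place of the sample text.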

In order to use this endpoint, we need to activate it.

Let’s set up some shorthands:

$ ep1=3922ca0e-5727-11e7-bf07-22000b9a448b # our local endpoint
$ ep2=4cc8c764-0bc1-11e6-a740-22000bf2d559 # JASMIN gridftp server on data-xfer1

Activate the JASMIN endpoint (ep2). This particular endpoint is already configured to use the CEDA “SLCS” service to provide short-term credentials using your CEDA ID (not your JASMIN one; we’re working on that):

$ globus endpoint activate $ep2 --myproxy -U username
Myproxy password: 
Endpoint activated successfully using a credential fetched from a MyProxy server.

Note (1) You can also specify the password in the command using the -P option, to do this in one action, but this is less secure as your password will be visible in your system’s command history.

Note (2) Activation is a separate step from transfer, so you can activate or re-activate your credential at any time, independently of any transfers.

Alternatively, you can activate your endpoint using the --web option, which opens your default web browser to complete the activation using the Globus web interface. You can then return to your terminal window once this step has completed successfully:

$ globus endpoint activate $ep2 --web

Try a listing on the JASMIN endpoint (ep2). The path you choose needs to be one for which you have access permissions, for example your home directory or a group workspace you belong to.

$ globus ls $ep2:/group_workspaces/jasmin/cedaproc/username/
mydir/
testfile

Do a transfer from ep1 to ep2

$ globus transfer $ep1:/home/users/username/1G.dat $ep2:/group_workspaces/jasmin/cedaproc/username/1G.dat --label "my first transfer"
Message: The transfer has been accepted and a task has been created and queued for execution
Task ID: 86e4a498-572b-11e7-bf07-22000b9a448b
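
Since the transfer runs asynchronously, a script usually needs the task ID in order to follow it up. The sketch below extracts the ID from a copy of the submission message shown above. The function name extract_task_id is hypothetical, and the parsing assumes the “Task ID:” line format shown in this article.

```shell
# Sample submission message copied from the transfer above.
submit_output='Message: The transfer has been accepted and a task has been created and queued for execution
Task ID: 86e4a498-572b-11e7-bf07-22000b9a448b'

# Print only the value that follows "Task ID:".
extract_task_id() {
    sed -n 's/^Task ID:[[:space:]]*//p'
}

task_id=$(printf '%s\n' "$submit_output" | extract_task_id)
echo "$task_id"

# With the ID in a variable, the task can be followed up, for example:
# globus task wait "$task_id"    # blocks until the task has finished
```

In real use, the output of the globus transfer command would be piped into extract_task_id in place of the sample text.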

Check on the progress of the task:

$ globus task list
Task ID                              | Status    | Type     | Source Display Name | Dest Display Name     | Label
------------------------------------ | --------- | -------- | ------------------- | --------------------- | -----------------
86e4a498-572b-11e7-bf07-22000b9a448b | SUCCEEDED | TRANSFER | my-endpoint         | JASMIN gridftp server | my first transfer
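
For use in scripts, the task listing can be reduced to just the task ID and status of each task. The sketch below does this for a copy of the listing shown above. The function name task_statuses is hypothetical, and the parsing assumes the pipe-separated table layout shown in this article.

```shell
# Sample listing copied from the output above.
task_list='Task ID                              | Status    | Type     | Source Display Name | Dest Display Name     | Label
------------------------------------ | --------- | -------- | ------------------- | --------------------- | -----------------
86e4a498-572b-11e7-bf07-22000b9a448b | SUCCEEDED | TRANSFER | my-endpoint         | JASMIN gridftp server | my first transfer'

# Print "<task ID> <status>" for every data row.
# Rows 1 and 2 are the header and the separator, so they are skipped.
task_statuses() {
    awk -F ' *[|] *' 'NR > 2 { print $1, $2 }'
}

printf '%s\n' "$task_list" | task_statuses
```

In real use this would be run as globus task list | task_statuses, and the second column could then be tested for a value such as SUCCEEDED.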
