Globus Command-Line Interface
Data Transfer Tools: Using the Globus Command-Line Interface
Please read Globus transfers with JASMIN first for a wider introduction to Globus on JASMIN.
This article describes how to install the Globus CLI and use it to orchestrate transfers.
It is not necessary to use the Globus CLI on a JASMIN server: it is a tool that you can use anywhere (for example, your own desktop or laptop) to interact with the Globus service and orchestrate a transfer between two endpoints (collections, in current Globus terminology). The CLI is not centrally installed on JASMIN, and does not need to be in the same place as either of the two collections involved in the transfer. You could use the CLI on your own laptop or desktop even if the two collections were institutional Globus collections on opposite sides of the world. You could, of course, install the CLI in your home directory on JASMIN if that were useful as part of your processing or data transfer workflow.
The Globus CLI is fully documented, with examples, at https://docs.globus.org/cli/. It provides a command-line interface to managed transfers via the Globus cloud-based transfer service, which usually achieves the best possible transfer rate over a given route compared to other methods. Typically this will be significantly faster than scp, rsync or sftp transfers, particularly if the physical network path is long.
The Globus CLI is designed for use either interactively within an interactive shell or in scripts. An alternative Python software development kit (SDK) is also available and should be considered for more sophisticated workflows.
Alternatively, the Globus web interface at https://app.globus.org provides an easy way to orchestrate transfers interactively.
Whichever method is used (CLI, SDK or web interface), transfers are invoked as asynchronous, managed tasks which can then be monitored and, if need be, set to retry automatically until some pre-set deadline.
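The asynchronous pattern can be sketched in shell. The two functions below are stand-ins for real CLI calls (such as globus transfer to submit a task and globus task show to query its status), and the task ID is made up for illustration:

```shell
#!/bin/sh
# Sketch of the asynchronous, managed-task pattern. submit_task and
# task_status are stand-ins for real CLI calls; a real task starts in the
# ACTIVE state and moves to a terminal state such as SUCCEEDED or FAILED.
submit_task() { echo "11111111-2222-3333-4444-555555555555"; }  # made-up ID
task_status() { echo "SUCCEEDED"; }

task_id=$(submit_task)
while [ "$(task_status "$task_id")" = "ACTIVE" ]; do
    sleep 30   # poll the service at a polite interval
done
echo "Task $task_id finished with status $(task_status "$task_id")"
```

With the real CLI you rarely need to write this loop yourself, since the service manages the task for you and the CLI can wait on it.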
Go to https://app.globus.org and log in, or create a Globus account if you do not already have one.
See also https://docs.globus.org/how-to/get-started/
Do the following on your own (local) machine. Make a Python virtual environment and activate it:
python3 -m venv ./venv
source ./venv/bin/activate
Download the Globus CLI and install it into the virtual environment (venv).
pip install globus-cli
Try the globus login
command. The first time you run this, you will be
prompted to authorise the Globus CLI to carry out operations on behalf of your
Globus ID. The URL will open in your default browser, where you should
authenticate with your Globus ID credentials. If you prefer, you can
copy/paste the URL from the command line into a browser of your choice. Either
way, you then need to click “Allow” in the browser window, then copy/paste the
resulting “Native App Authorization Code” back to the terminal window where
you issued the globus login
command:
globus login --no-local-server
Please authenticate with Globus here:
------------------------------------
https://auth.globus.org/v2/oauth2/authorize?client_id=abc1234-9c3c-4ad42-be31-8d6c87101239014&redirect_uri=https%3A%2F%2Fauth.globus.org%2Fv2%2Fweb%2Fauth-code&scope=openid+profile+email+urn%3Aglobus%3Aauth%3Ascope%3Aauth.globus.org%3Aview_identity_set+urn%3Aglobus%3Aauth%3Ascope%3Atransfer.api.globus.org%3Aall+urn%3Aglobus%3Aauth%3Ascope%3Agroups.api.globus.org%3Aall+urn%3Aglobus%3Aauth%3Ascope%3Asearch.api.globus.org%3Aall&state=_default&response_type=code&access_type=offline&prompt=login
------------------------------------
Enter the resulting Authorization Code here:
You should then see the following:
You have successfully logged in to the Globus CLI!
You can check your primary identity with
globus whoami
For information on which of your identities are in session use
globus session show
Logout of the Globus CLI with
globus logout
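In scripts, the exit status of globus whoami can be used as a guard: it exits non-zero when no identity is logged in. A minimal sketch (the command -v check additionally makes the snippet safe on machines where the CLI is not installed):

```shell
#!/bin/sh
# Guard for scripts: check that the CLI is present and a login session
# exists before attempting any transfer commands.
if command -v globus >/dev/null 2>&1 && globus whoami >/dev/null 2>&1; then
    echo "logged in"
else
    echo "not logged in (or CLI not installed): run 'globus login' first"
fi
```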
You can now use the Globus CLI commands as listed by the following command:
globus --help
Usage: globus [OPTIONS] COMMAND [ARGS]...
Interact with Globus from the command line
All `globus` subcommands support `--help` documentation.
Use `globus login` to get started!
The documentation is also online at https://docs.globus.org/cli/
Options:
-v, --verbose Control level of output
-h, --help Show this message and exit.
-F, --format [unix|json|text] Output format for stdout. Defaults to text
--jmespath, --jq TEXT A JMESPath expression to apply to json
output. Takes precedence over any specified '
--format' and forces the format to be json
processed by this expression
--map-http-status TEXT Map HTTP statuses to any of these exit codes:
0,1,50-99. e.g. "404=50,403=51"
Commands:
bookmark Manage endpoint bookmarks
collection Manage your Collections
delete Submit a delete task (asynchronous)
endpoint Manage Globus endpoint definitions
get-identities Lookup Globus Auth Identities
group Manage Globus Groups
list-commands List all CLI Commands
login Log into Globus to get credentials for the Globus CLI
logout Logout of the Globus CLI
ls List endpoint directory contents
mkdir Create a directory on an endpoint
rename Rename a file or directory on an endpoint
rm Delete a single path; wait for it to complete
search Use Globus Search to store and query for data
session Manage your CLI auth session
task Manage asynchronous tasks
transfer Submit a transfer task (asynchronous)
update Update the Globus CLI to its latest version
version Show the version and exit
whoami Show the currently logged-in identity
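Note the -F/--format and --jmespath options shown in the help above: they produce machine-readable output, which is the robust choice for scripting. If you only have captured human-readable output, a field such as the Task ID can still be pulled out with standard tools. A minimal sketch, using sample output of the form produced by the transfer example later in this article:

```shell
#!/bin/sh
# Extract the Task ID field from captured `globus transfer` text output.
# The sample text mirrors the transfer example in this article.
output='Message: The transfer has been accepted and a task has been created and queued for execution
Task ID: 74cb181c-bf63-11ee-a90e-032e06ca0965'

task_id=$(printf '%s\n' "$output" | awk -F': ' '/^Task ID:/ {print $2}')
echo "$task_id"
```

For anything beyond a quick one-off, prefer the machine-readable route, e.g. --format unix with a --jmespath expression, rather than parsing text output.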
We will use the globus endpoint search
subcommand. Find help on the
particular options for that with
globus endpoint search --help
Usage: globus endpoint search [OPTIONS] [FILTER_FULLTEXT]
Search for Globus endpoints with search filters. If --filter-scope is set to
the default of 'all', then FILTER_FULLTEXT is required.
If FILTER_FULLTEXT is given, endpoints which have attributes (display name,
legacy name, description, organization, department, keywords) that match the
search text will be returned. The result size limit is 100 endpoints.
Options:
--filter-scope [all|administered-by-me|my-endpoints|my-gcp-endpoints|recently-used|in-use|shared-by-me|shared-with-me]
The set of endpoints to search over.
[default: all]
--filter-owner-id TEXT Filter search results to endpoints owned by
a specific identity. Can be the Identity ID,
or the Identity Username, as in
"go@globusid.org"
--limit INTEGER RANGE The maximum number of results to return.
[default: 25; 1<=x<=1000]
-v, --verbose Control level of output
-h, --help Show this message and exit.
-F, --format [unix|json|text] Output format for stdout. Defaults to text
--jmespath, --jq TEXT A JMESPath expression to apply to json
output. Takes precedence over any specified
'--format' and forces the format to be json
processed by this expression
--map-http-status TEXT Map HTTP statuses to any of these exit
codes: 0,1,50-99. e.g. "404=50,403=51"
Search for the collections matching the search term “tutorial”:
globus endpoint search "tutorial"
ID | Owner | Display Name
------------------------------------ | ------------------------------------------------------------ | ----------------------------------------------
6c54cade-bde5-45c1-bdea-f4bd71dba2cc | 6df1b656-c953-40a3-91a9-e9e8ad5173ea@clients.auth.globus.org | Globus Tutorial Collection 1
31ce9ba0-176d-45a5-add3-f37d233ba47d | 6df1b656-c953-40a3-91a9-e9e8ad5173ea@clients.auth.globus.org | Globus Tutorial Collection 2
The two Globus Tutorial Collections actually “see” the same filesystem, so we’ll just use the first one.
For convenience, let’s set environment variables representing the ID of this collection:
export c1=6c54cade-bde5-45c1-bdea-f4bd71dba2cc
echo $c1
6c54cade-bde5-45c1-bdea-f4bd71dba2cc
Let’s try listing that collection, so that we know we can interact with it. We are prompted to grant consent first:
globus ls $c1
The collection you are trying to access data on requires you to grant consent for the Globus CLI to access it.
Please run:
globus session consent 'urn:globus:auth:scope:transfer.api.globus.org:all[*https://auth.globus.org/scopes/6c54cade-bde5-45c1-bdea-f4bd71dba2cc/data_access]'
to login with the required scopes.
Copy & paste the command it gives you (don’t copy the one above) and run it, which should open a web browser window. Follow the instructions which should complete the process, then return to your terminal session.
Now let’s find another collection, this time a public test collection which can be used for performance testing:
globus endpoint search "star dtn"
ID | Owner | Display Name
------------------------------------ | ------------------ | -------------------------------------------------
ff2ee779-54fb-4dac-ade2-57568c587ae3 | esnet@globusid.org | ESnet STAR DTN private collection
ece400da-0182-4777-91d6-27a1808f8371 | esnet@globusid.org | ESnet Starlight DTN (Anonymous read only testing)
e9e0d9f4-c419-44e0-8198-017fd61bf0c4 | esnet@globusid.org | ESnet Starlight DTN (read-write testing)
We’ll use the one labelled “Anonymous read only testing”. Set stardtn to the ID of this endpoint:
export stardtn=ece400da-0182-4777-91d6-27a1808f8371
Use the globus ls command to list the contents of the stardtn endpoint at the path /:
globus ls $stardtn:/
500GB-in-large-files/
50GB-in-medium-files/
5GB-in-small-files/
5MB-in-tiny-files/
Climate-Huge/
Climate-Large/
Climate-Medium/
Climate-Small/
bebop/
logs/
write-testing/
100G.dat
100M.dat
10G.dat
10M.dat
1G.dat
1M.dat
500G.dat
50G.dat
50M.dat
These are files and directories containing dummy data which can be used for test purposes.
Let’s transfer the file 1M.dat from the stardtn endpoint to c1:
globus transfer $stardtn:/1M.dat $c1:/~/1M.dat
Message: The transfer has been accepted and a task has been created and queued for execution
Task ID: 74cb181c-bf63-11ee-a90e-032e06ca0965
The transfer task is a separate activity and does not require any connection from the CLI client to either of the two endpoints: the Globus transfer service manages the transfer for us. We can check on the progress of this transfer task with:
globus task show 74cb181c-bf63-11ee-a90e-032e06ca0965
Label: None
Task ID: 74cb181c-bf63-11ee-a90e-032e06ca0965
Is Paused: False
Type: TRANSFER
Directories: 0
Files: 1
Status: SUCCEEDED
Request Time: 2024-01-30T11:33:58+00:00
Faults: 0
Total Subtasks: 2
Subtasks Succeeded: 2
Subtasks Pending: 0
Subtasks Retrying: 0
Subtasks Failed: 0
Subtasks Canceled: 0
Subtasks Expired: 0
Subtasks with Skipped Errors: 0
Completion Time: 2024-01-30T11:34:01+00:00
Source Endpoint: ESnet Starlight DTN (Anonymous read only testing)
Source Endpoint ID: ece400da-0182-4777-91d6-27a1808f8371
Destination Endpoint: Globus Tutorial Collection 1
Destination Endpoint ID: 6c54cade-bde5-45c1-bdea-f4bd71dba2cc
Bytes Transferred: 1000000
Bytes Per Second: 421388
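Rather than polling globus task show by hand, the CLI provides globus task wait, which blocks until the task reaches a terminal state or the timeout expires. A sketch (the timeout value is just an example, and the guard makes the snippet safe to run on a machine where the CLI is not installed):

```shell
#!/bin/sh
# Wait up to 10 minutes for the transfer task from the example above to
# reach a terminal state. The guard skips the call if the CLI is absent.
TASK_ID=74cb181c-bf63-11ee-a90e-032e06ca0965
if command -v globus >/dev/null 2>&1; then
    globus task wait "$TASK_ID" --timeout 600
else
    echo "globus CLI not installed"
fi
```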
We can also list the destination collection to check that the file has reached its destination:
globus ls $c1:/~/
1M.dat
We can also make a subdirectory with mkdir:
globus mkdir $c1:/~/mydata/
The directory was created successfully
We can move our 1M.dat into that directory with a globus rename command:
globus rename $c1 /~/1M.dat /~/mydata/1M.dat
File or directory renamed successfully
We now have a directory mydata containing the file 1M.dat:
globus ls $c1:/~/mydata/
1M.dat
Now let’s copy a directory containing some small files from the stardtn collection to our destination collection c1 (the Globus Tutorial Collections only provide very limited storage space).
The files we want to copy are at the path /5MB-in-tiny-files/a/a/ on the stardtn endpoint, and are small, as their names suggest:
globus ls $stardtn:/5MB-in-tiny-files/a/a/
a-a-1KB.dat
a-a-2KB.dat
a-a-5KB.dat
Copy the parent directory recursively to c1:
globus transfer -r $stardtn:/5MB-in-tiny-files/a/a $c1:/~/star-data
Message: The transfer has been accepted and a task has been created and queued for execution
Task ID: 4ae9bab0-7d40-11ec-bef3-a18800fa5978
Check destination content:
globus ls $c1
mydata/
star-data/
globus ls $c1:/~/star-data
a-a-1KB.dat
a-a-2KB.dat
a-a-5KB.dat
We could now delete one of the small files using the globus delete
command:
globus delete $c1:/~/star-data/a-a-2KB.dat
Message: The delete has been accepted and a task has been created and queued for execution
Task ID: be4d6934-7d40-11ec-891f-939ceb6dfaf1
And list contents again, to verify that it has been deleted:
globus ls $c1:/~/star-data
a-a-1KB.dat
a-a-5KB.dat
We could now repeat the copying of the source data, but this time using the -s or --sync-level option with the value exists, so that we only copy the data that is now missing from the destination. The full set of sync levels is [exists|size|mtime|checksum].
globus transfer -s exists -r $stardtn:/5MB-in-tiny-files/a/a $c1:/~/star-data
Message: The transfer has been accepted and a task has been created and queued for execution
Task ID: 759a3cac-7d41-11ec-bef3-a18800fa5978
This should only copy the data that does not already exist at the destination. We end up with the same set of files at the destination:
globus ls $c1:/~/star-data
a-a-1KB.dat
a-a-2KB.dat
a-a-5KB.dat
But we can see that only 2000 bytes were transferred (so we know it only copied that one file, which is what we wanted):
globus task show 759a3cac-7d41-11ec-bef3-a18800fa5978
Label: None
Task ID: 759a3cac-7d41-11ec-bef3-a18800fa5978
Is Paused: False
Type: TRANSFER
Directories: 1
Files: 3
Status: SUCCEEDED
Request Time: 2022-01-24T18:14:24+00:00
Faults: 0
Total Subtasks: 5
Subtasks Succeeded: 5
Subtasks Pending: 0
Subtasks Retrying: 0
Subtasks Failed: 0
Subtasks Canceled: 0
Subtasks Expired: 0
Subtasks with Skipped Errors: 0
Completion Time: 2022-01-24T18:14:58+00:00
Source Endpoint: ESnet Starlight DTN (Anonymous read only testing)
Source Endpoint ID: ece400da-0182-4777-91d6-27a1808f8371
Destination Endpoint: Globus Tutorial Collection 1
Destination Endpoint ID: 6c54cade-bde5-45c1-bdea-f4bd71dba2cc
Bytes Transferred: 2000
Bytes Per Second: 60
This task could be repeated in a shell script, cron job or even using the Globus timer functionality, for either a source or destination directory that is expected to change.
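For instance, a scheduled repeat could be a single crontab entry like the sketch below. The paths are assumptions (adjust them to wherever you created the virtual environment), and the collection IDs are the ones used above. Because the CLI caches its login tokens, no interactive step is needed while the authentication remains valid:

```shell
# Illustrative crontab entry: re-run the sync nightly at 02:00 using the
# CLI installed in a virtual environment (paths are examples only).
0 2 * * * . "$HOME/venv/bin/activate" && globus transfer -s mtime -r ece400da-0182-4777-91d6-27a1808f8371:/5MB-in-tiny-files/a/a 6c54cade-bde5-45c1-bdea-f4bd71dba2cc:/~/star-data >> "$HOME/globus-sync.log" 2>&1
```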
Most Globus Connect Server endpoints are configured to require some form of authentication & authorization process. In the case of the JASMIN Default Collection, you link your Globus identity to your JASMIN identity. This may be different for other collections that you use elsewhere.
Let’s find the JASMIN Default Collection, then set up an alias to it. We can search for that name:
globus endpoint search "jasmin default"
ID | Owner | Display Name
------------------------------------ | ------------------------------------------------------------ | -------------------------
a2f53b7f-1b4e-4dce-9b7c-349ae760fee0 | a77928d3-f601-40bb-b497-2a31092f8878@clients.auth.globus.org | JASMIN Default Collection
Set up an alias for this collection:
export jdc=a2f53b7f-1b4e-4dce-9b7c-349ae760fee0
If you’ve already interacted with this collection recently, you should find that you can list it with the CLI already. If not, you will be prompted to authenticate. Follow through all the steps until you complete the process, then return to the terminal session.
If successful, you can now interact with the JASMIN endpoint, for example listing your home directory:
globus ls $jdc:/~/
...
(file listing of your JASMIN home directory)
...
The authentication via your JASMIN account lasts for 30 days, so you can run and re-run transfers during that period without needing to repeat the process (hence without any human interaction, if you have scheduled/automated transfers, see below).
If this needs to be renewed, the simplest way is to attempt an operation on the collection again (for example, listing a directory). If the authentication has timed out, you will be prompted to follow instructions to renew it, then the action (listing the directory) should complete successfully.
There are ways to use a “refresh token” programmatically to renew the authentication. Watch this space for details of how to do that.
The functionality demonstrated above can be combined into scripts which can perform useful, repeatable tasks such as:
Globus provides two implementations of this in its Examples of automation using the Globus CLI, specifically:
We have not covered the Python SDK here, but this is a useful example of how you could integrate Globus transfer functionality into your own code and workflows. You would need to install and authorise this SDK first.
Taking the first of these examples, we can adapt it slightly:
1. Select the JASMIN endpoint at the destination, and set the destination path. Modify the corresponding variables in the script to these values:
DESTINATION_COLLECTION='a2f53b7f-1b4e-4dce-9b7c-349ae760fee0' #JASMIN Default Collection ID
DESTINATION_PATH='/home/users/<username>/sync-demo/' #replace <username> with your JASMIN username
9efc947f-5212-4b5f-8c9d-47b93ae676b7.
2. If you haven’t already, activate the Python virtual environment where you have the CLI installed, and login:
source ~/.globus-cli-venv/bin/activate
globus login
3. Check that you can interact with the JASMIN collection from the CLI by trying to list it.
Follow any instructions needed, if you need to renew your authentication.
4. Run the script to sync the data from the Globus Tutorial Endpoint to the destination directory.
You should see output similar to that shown below.
./cli-sync.sh
Checking for a previous transfer
Last transfer f5db7238-8f06-11ec-8fe0-dfc5b31adbac SUCCEEDED, continuing
Verified that source is a directory
Submitted sync from 6c54cade-bde5-45c1-bdea-f4bd71dba2cc:/share/godata/ to a2f53b7f-1b4e-4dce-9b7c-349ae760fee0:/~/sync-demo/
Link:
https://app.globus.org/activity/04e277f4-8f07-11ec-811e-493dd0cf73a1/overview
Saving sync transfer ID to last-transfer-id.txt
5. Check on the status of the task. You could do this with:
globus task show <taskid>
6. You could then make some change to either the source or destination directory, and simply re-run the script:
./cli-sync.sh
7. Experiment by changing the SYNCTYPE. The available sync levels are:
EXISTS: copy files that do not already exist at the destination
SIZE: copy files whose size differs between source and destination
MTIME: copy files where the source modification time is newer than the destination’s
CHECKSUM: copy files whose checksums differ (the most thorough, but slowest, option)
See the Globus documentation for full descriptions of the available sync levels.
8. Automating repeats of the sync operation
You could then consider how to repeat the task automatically. For example:
invoke the cli-sync.sh command according to some condition that’s met in your workflow
schedule the cli-sync.sh command on your own machine using cron, making sure the scheduled job uses the same virtualenv as the main globus CLI