What is a Group Workspace?
Introduction
This article describes the way that storage is usually provided to projects on JASMIN: the Group Workspace (GWS).
What is a GWS?
GWSs are portions of disk allocated for particular projects to manage themselves, enabling collaborating scientists to share network accessible storage on JASMIN. Users can pull data from external sites to a common cache, process and analyse their data, and where allowed, exploit data available from other GWSs and from the CEDA archive.
It is important to understand that these workspaces are not the same as the CEDA archive. Data in a GWS can be earmarked for ingestion into the CEDA archive, but this process should be discussed directly with the CEDA Archive team (via the CEDA Helpdesk). Ingestion is not automatic and will not happen without prior arrangement.
Data within GWSs are the responsibility of the designated GWS manager and are not backed up by the JASMIN team (see below).
Accessing a GWS
Where are GWSs available?
Each GWS volume is found in a directory mounted under a path whose pattern reflects the capabilities of the underlying storage, i.e.:
Path | Type | Capability |
---|---|---|
/gws/nopw/j04/* | SOF | no parallel write (nopw) |
/gws/pw/j07/* | PFS | parallel write capable (pw) |
/gws/smf/j04/* | SSD | small-files optimised (smf) |
A project may have several GWS volumes, perhaps of different types, for example:
/gws/nopw/j04/jules : SOF volume for the jules project
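The path patterns in the table above can be read mechanically. The following is a minimal sketch, not an official JASMIN tool: the function name `describe_gws_path` and the dictionary `GWS_TYPES` are hypothetical, and the mapping only covers the three prefixes listed in the table.

```python
from pathlib import PurePosixPath

# Prefix-to-capability mapping copied from the table above.
# Any prefix not listed there is treated as unknown.
GWS_TYPES = {
    ("nopw", "j04"): ("SOF", "no parallel write"),
    ("pw", "j07"): ("PFS", "parallel write capable"),
    ("smf", "j04"): ("SSD", "small-files optimised"),
}


def describe_gws_path(path):
    """Return volume name, storage type and capability for a GWS path.

    Returns None if the path does not match one of the known patterns.
    (Hypothetical helper, for illustration only.)
    """
    parts = PurePosixPath(path).parts
    # Expected layout: ('/', 'gws', <capability>, <j-number>, <volume>, ...)
    if len(parts) < 5 or parts[0] != "/" or parts[1] != "gws":
        return None
    entry = GWS_TYPES.get((parts[2], parts[3]))
    if entry is None:
        return None
    storage_type, capability = entry
    return {"volume": parts[4], "type": storage_type, "capability": capability}
```

For example, `describe_gws_path("/gws/nopw/j04/jules")` reports the `jules` volume as SOF storage without parallel write, matching the example above.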
GWSs are available on:
- Transfer servers including xfer*, hpxfer*, gridftp* and the JASMIN Default Collection Globus endpoint.
- The general scientific analysis servers sci*
- All nodes in the LOTUS and ORCHID clusters
- Some application-specific servers (by arrangement)
They are NOT available on login or nx servers.
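As a rough illustration of the list above, a script could check a hostname before trying to use a GWS path. This is a sketch under stated assumptions: the function `gws_expected_on` is hypothetical, the `login*` and `nx*` patterns assume that those servers' hostnames begin with those strings, and LOTUS, ORCHID and application-specific hosts are not covered because their naming is not given here.

```python
from fnmatch import fnmatchcase

# Hostname patterns taken from the list above.
GWS_HOST_PATTERNS = ["xfer*", "hpxfer*", "gridftp*", "sci*"]
# Assumption: login and nx server hostnames start with these strings.
NO_GWS_HOST_PATTERNS = ["login*", "nx*"]


def gws_expected_on(hostname):
    """Return True/False if the host type is listed above, else None.

    Only the short hostname (text before the first dot) is compared.
    (Hypothetical helper, for illustration only.)
    """
    short = hostname.split(".")[0].lower()
    if any(fnmatchcase(short, p) for p in NO_GWS_HOST_PATTERNS):
        return False
    if any(fnmatchcase(short, p) for p in GWS_HOST_PATTERNS):
        return True
    # Unknown host type (e.g. cluster nodes): make no claim either way.
    return None
```

The current machine's name could be passed in with `socket.gethostname()` from the standard library. A result of `None` means only that the host does not match any of the patterns listed here.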
Requesting access to a GWS
There is a Unix group associated with each GWS to provide convenient access control. Any JASMIN user can apply for access to a given GWS by following the links provided in the list of available GWSs on the JASMIN Accounts Portal. Important: The GWS Manager (not the JASMIN team) will need to authorise the request before you are granted access.
Once you have been granted the relevant access role, the relevant Unix group will be added to your account. If you are not sure of the group name for your GWS, you can find it by entering the command groups to see the names of the groups you belong to. The group name normally has the prefix "gws_".
Files and directories in a GWS are normally associated with the group gws_<name>, so that group read/write permissions apply to this group rather than the default group, users.
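The same check that the groups command provides can be made from a script. The following is a minimal sketch using Python's standard `os` and `grp` modules (Unix only). The helper names `current_group_names` and `gws_groups` are hypothetical, and the "gws_" prefix is the usual convention described above rather than a guarantee.

```python
import grp
import os


def current_group_names():
    """Return the names of the Unix groups the current process belongs to."""
    names = []
    for gid in os.getgroups():
        try:
            names.append(grp.getgrgid(gid).gr_name)
        except KeyError:
            # Group ID without a name entry on this host: skip it.
            continue
    return names


def gws_groups(names, prefix="gws_"):
    """Return, sorted, the group names that carry the GWS prefix."""
    return sorted(name for name in names if name.startswith(prefix))
```

Calling `gws_groups(current_group_names())` on a JASMIN server would list the GWS groups on your account. An empty list suggests that no GWS access has been granted yet, or that you need to log in again for a newly added group to take effect.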
GWS management
Each GWS has a designated manager. See the article on managing a GWS.
Backup
Please note that data in GWSs are only backed up if the GWS Manager has put tasks in place to do this. The Elastic Tape service is available for making a secondary near-line copy of data. Please discuss the details with your GWS Manager.
Recommended directory structure for a GWS
We recommend that a sensible directory structure is set up within your GWS, and that the following conventions are used:
<your_gws>/
    users/
        <userid>/    # each user can create their own directory here
    public/          # required if you want to share data via HTTP
    data/
        internal/    # internal/intermediate data
        incoming/    # third-party data brought to the GWS
        output/      # output data generated by project
See the GWS etiquette article for more details about GWSs and the GWS data sharing via HTTP article for information about the use of the public directory.
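The recommended layout could be created in one step by a GWS manager. This is a minimal sketch, assuming that internal/, incoming/ and output/ sit under data/ and that per-user directories sit under users/. The function `create_gws_layout` is hypothetical, and it does not set group ownership or permissions, which remain site-specific.

```python
from pathlib import Path

# Relative directories from the recommended structure above.
# Assumption: internal/, incoming/ and output/ are nested under data/.
GWS_SUBDIRS = [
    "users",
    "public",
    "data/internal",
    "data/incoming",
    "data/output",
]


def create_gws_layout(root, userid=None):
    """Create the recommended directories under root; return their paths.

    Existing directories are left untouched. If userid is given, a
    personal directory users/<userid> is created as well.
    (Hypothetical helper, for illustration only.)
    """
    root = Path(root)
    targets = list(GWS_SUBDIRS)
    if userid:
        targets.append(f"users/{userid}")
    created = []
    for rel in targets:
        directory = root / rel
        directory.mkdir(parents=True, exist_ok=True)
        created.append(directory)
    return created
```

For example, `create_gws_layout("/gws/nopw/j04/jules", userid="<userid>")` with your own user ID substituted would set up the skeleton in the example volume mentioned earlier, provided you have write permission there.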