What is a Group Workspace?
This article describes the way that storage is usually provided to projects on JASMIN: the Group Workspace (GWS).
GWSs are portions of disk allocated for particular projects to manage themselves, enabling collaborating scientists to share network accessible storage on JASMIN. Users can pull data from external sites to a common cache, process and analyse their data, and where allowed, exploit data available from other GWSs and from the CEDA archive.
It is important to understand that these workspaces are not the same as the CEDA archive. Data in a GWS can be earmarked for ingestion into the CEDA archive, but this process should be discussed directly with the CEDA Archive team (via the CEDA Helpdesk): it is not automatic and will not happen without prior arrangement.
Data within GWSs are the responsibility of the designated GWS manager and are not backed up by the JASMIN team (see below).
Each GWS volume is found in a directory mounted under a path pattern that reflects the capabilities of the storage, i.e.:
Path | Type | Capability |
---|---|---|
/gws/nopw/j04/* | SOF | no parallel write (nopw) |
/gws/pw/j07/* | PFS | parallel write capable (pw) |
/gws/smf/j04/* | SSD | small-files optimised (smf) |
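
The path patterns in the table can be checked programmatically. The following is a minimal sketch, assuming only the three patterns listed above; the function name `describe_gws_path` and the `GWS_STORAGE` mapping are illustrative and not part of any JASMIN tool.

```python
from pathlib import PurePosixPath

# (capability, volume set) -> (storage type, capability description),
# copied from the table above. Other patterns may exist on JASMIN.
GWS_STORAGE = {
    ("nopw", "j04"): ("SOF", "no parallel write"),
    ("pw", "j07"): ("PFS", "parallel write capable"),
    ("smf", "j04"): ("SSD", "small-files optimised"),
}


def describe_gws_path(path):
    """Return (type, capability) for a GWS path, or None if no pattern matches."""
    parts = PurePosixPath(path).parts
    # Expected layout: ("/", "gws", <capability>, <volume set>, <gws name>, ...)
    if len(parts) < 5 or parts[0] != "/" or parts[1] != "gws":
        return None
    return GWS_STORAGE.get((parts[2], parts[3]))


print(describe_gws_path("/gws/nopw/j04/jules"))  # ('SOF', 'no parallel write')
```

Knowing the storage type up front can help decide, for example, whether a workflow that writes to the same file from several processes is suitable for a given volume.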
A project may have several GWS volumes, perhaps of different types, for example:
`/gws/nopw/j04/jules`: SOF volume for the jules project
GWSs are available on `sci*`, `xfer*`, `hpxfer*` and `gridftp*` servers, and via the JASMIN Default Collection Globus endpoint.
They are NOT available on `login` or `nx` servers.
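
A script that depends on a GWS can check where it is running before it tries to read or write data. The sketch below assumes that the server name patterns listed above (`sci*`, `xfer*`, `hpxfer*`, `gridftp*`) can be matched against the short hostname; the function name and the example hostnames are hypothetical.

```python
from fnmatch import fnmatch

# Server name patterns on which GWSs are mounted, as listed above.
GWS_HOST_PATTERNS = ["sci*", "xfer*", "hpxfer*", "gridftp*"]


def gws_mounted_on(hostname):
    """Return True if the short hostname matches a pattern where GWSs are available."""
    short_name = hostname.split(".")[0]  # drop any domain part
    return any(fnmatch(short_name, pattern) for pattern in GWS_HOST_PATTERNS)


# Hypothetical hostnames, for illustration only.
for host in ["sci1.example.org", "login1.example.org"]:
    print(host, gws_mounted_on(host))
```

In a real script, `socket.gethostname()` from the standard library could supply the hostname to test.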
There is a Unix group associated with each GWS to provide convenient access control. Any JASMIN user can apply for access to a given GWS by following the links provided in the list of available GWSs on the JASMIN Accounts Portal. Important: The GWS Manager (not the JASMIN team) will need to authorise the request before you are granted access.
Once you have been granted the relevant access role, the relevant Unix group will be added to your account. If you are not sure of the group name for your GWS, you can find it by entering the command `groups` to see the names of the groups you belong to. The group name normally has the form `gws_<name>`. Data in the GWS should belong to this group, so that group read/write permissions apply to it rather than to the default group, `users`.
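
The same check can be made from a script. This is a minimal sketch that assumes a Unix system and the `gws_` prefix described above; the helper names `gws_group_names` and `current_user_groups` are illustrative.

```python
import grp
import os


def gws_group_names(group_names, prefix="gws_"):
    """Return the sorted group names that look like GWS groups."""
    return sorted(name for name in group_names if name.startswith(prefix))


def current_user_groups():
    """Return the names of the groups the current process belongs to."""
    names = []
    for gid in os.getgroups():
        try:
            names.append(grp.getgrgid(gid).gr_name)
        except KeyError:
            # This gid has no entry in the group database; skip it.
            continue
    return names


if __name__ == "__main__":
    print(gws_group_names(current_user_groups()))
```

If the list is empty shortly after access has been granted, logging out and back in may be needed before the new group membership is visible, as is usual for Unix group changes.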
Each GWS has a designated manager. See the article on managing a GWS.
Please note that data in GWSs are only backed up if the GWS Manager has put tasks in place to do this. The Elastic Tape service is available to make a secondary near-line copy of data. Please discuss the details with your GWS Manager.
We recommend that a sensible directory structure is set up within your GWS, using conventions such as the following:
    <your_gws>/
        users/
            <userid>/     # each user can create their own directory here
        public/           # required if you want to share data via HTTP
        data/
            internal/     # internal/intermediate data
            incoming/     # third-party data brought to the GWS
            output/       # output data generated by project
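
A GWS manager could create this layout with a few lines of Python. The sketch below is one possible approach, not a JASMIN-provided tool: the function name `create_gws_layout` is hypothetical, and the demonstration runs in a temporary directory so that it is safe to try anywhere. It does not set group ownership or permissions.

```python
import tempfile
from pathlib import Path

# Directories from the recommended structure, relative to the GWS root.
# Per-user directories under users/ are left for each user to create.
GWS_LAYOUT = [
    "users",
    "public",
    "data/internal",
    "data/incoming",
    "data/output",
]


def create_gws_layout(gws_root):
    """Create the recommended directories under gws_root and return their paths."""
    root = Path(gws_root)
    created = []
    for relative in GWS_LAYOUT:
        target = root / relative
        # exist_ok=True makes the function safe to run more than once.
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created


# Demonstration in a throwaway directory rather than a real GWS.
with tempfile.TemporaryDirectory() as tmp:
    for path in create_gws_layout(tmp):
        print(path.relative_to(tmp))
```

On JASMIN, the argument would be the path to your own GWS, for example a volume such as `/gws/nopw/j04/jules`.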
See the GWS etiquette article for more details about GWSs and the GWS data sharing via HTTP article for information about the use of the `public` directory.