CEDA Archive
This article describes accessing the CEDA Archive from JASMIN:
- Overview
- Register for a CEDA Account
- Accessing the CEDA Archive on JASMIN servers
- Archive access groups
- Data licensing
- Example of accessing data in the CEDA Archive
Overview
The CEDA Archive provides direct access to thousands of atmospheric, climate change and earth observation datasets. The Archive is directly accessible as a file system from the shared science machines on JASMIN.
It is a separate service run by the CEDA team - it is not a JASMIN service. Therefore, many of the links in this document will take you to the CEDA Archive help documentation site (as the information relates to CEDA Archive services). This is separate from the JASMIN help documentation site (which is specifically about JASMIN services).
Register for a CEDA Account
First, you need a CEDA Archive account. If you do not have one, please follow the steps in this CEDA help document to register as a new CEDA user. It also explains how to reset your password if you have forgotten it. Once you have a CEDA account, you will need to link it to your JASMIN account.
The JASMIN Accounts Portal manages access to JASMIN resources (e.g. compute and storage), whereas MyCEDA (the CEDA Accounts Portal) manages access to CEDA resources (e.g. datasets in the archive). You need both accounts, and they must be linked, in order to access CEDA Archive data from JASMIN - you can check whether your accounts are linked from within the JASMIN Accounts Portal.
Accessing the CEDA Archive on JASMIN servers
Once you have linked your CEDA and JASMIN accounts, you will have access to large parts of the archive straight away.
The contents of the CEDA Archive are available on the file system under /badc and /neodc. Note: do not access data via any symlinks that point to /datacentre/archvol* - these are not permanent links and may change when data are migrated to new storage. Please use the archive path names under /badc and /neodc. Search the CEDA data catalogue for further details about data held in the archive.
Note: badc is for atmospheric data, neodc is for earth observation data - they are named after CEDA's previous archive names (British Atmospheric Data Centre, and the NERC Earth Observation Data Centre).
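As a minimal sketch of the path advice above, the shell function below checks whether a path string starts with one of the stable archive prefixes (/badc or /neodc). The function name is_stable_archive_path and the example path under /datacentre are hypothetical and only serve as illustration; the check works on the text of the path and does not touch the file system.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the path starts with a stable
# archive prefix (/badc or /neodc). It inspects the string only.
is_stable_archive_path() {
    case "$1" in
        /badc|/badc/*|/neodc|/neodc/*) return 0 ;;
        *) return 1 ;;
    esac
}

# The /datacentre path below is a made-up example of a symlink target.
for p in /badc/cru/data /neodc /datacentre/archvolX/example; do
    if is_stable_archive_path "$p"; then
        echo "stable: $p"
    else
        echo "avoid:  $p"
    fi
done
```

A check like this can be useful in scripts that store archive paths for later use, so that a resolved symlink target is not recorded by accident.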
Most data in the Archive are open access - however, some datasets are restricted. You can tell which by checking the UNIX access group that the data belong to (see below). If the datasets you need are restricted, you can apply for specific access via the data centre (see this article for more details). If direct access is not possible, the data can be obtained via the standard FTP and web-based access methods to the CEDA Archive and transferred to a suitable group workspace on JASMIN. As the data centres use the same JASMIN infrastructure, the transfer rates are high.
The CEDA Data Catalogue is a useful tool to find and apply for access to datasets.
Archive access groups
The UNIX access groups used within the CEDA Archive are listed below with links to example datasets in the CEDA data catalogue for those wishing to use them:
open
- Available to any logged in JASMIN user with a linked CEDA user account. See a full list of available datasets here.
cmip5_research
- Restricted CMIP3 and CMIP5 datasets.
esacat1
- Satellite data including MERIS, MIPAS and SCIAMACHY.
ecmwf
- Access to the ECMWF Operational Datasets.
eurosat
- Satellite data including IASI, AVHRR-3 and GOME-2.
ukmo_wx
- Met Office observational dataset collections including LIDARNET, MIDAS, MetDB and NIMROD.
ukmo_clim
- Climatology datasets from the Met Office, including Central England Temperature dataset collection, HadISST.
byacl
- These data have specific restrictions on them meaning that they can't be accessed directly from JASMIN, but can be obtained via FTP and web access.
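One way to compare your own group membership with the group of a dataset is sketched below. It assumes a Linux machine with GNU coreutils (for `stat -c %G`) and assumes that access groups appear in the output of `id -Gn` once your accounts are linked. The helper name has_group is hypothetical. Group membership is only part of the picture, since the permission bits on the files also apply.

```shell
#!/bin/sh
# Hypothetical helper: has_group GROUP "LIST"
# Succeeds if GROUP appears in the space-separated LIST.
has_group() {
    wanted=$1
    for g in $2; do          # unquoted on purpose: split the list on whitespace
        if [ "$g" = "$wanted" ]; then
            return 0
        fi
    done
    return 1
}

dataset=/badc/cru/data       # example archive path used in this article
my_groups=$(id -Gn)          # group names of the logged-in user

if [ -e "$dataset" ]; then
    data_group=$(stat -c %G "$dataset")   # GNU stat: name of the owning group
    if has_group "$data_group" "$my_groups"; then
        echo "You are in '$data_group', the group that owns $dataset"
    else
        echo "You are not in '$data_group'; you may need to apply for access"
    fi
else
    echo "$dataset is not visible from this machine"
fi
```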
Data licensing
Data accessed directly from the CEDA Archive must be used in line with the data licence in place for the relevant dataset, and for the purposes stated in the access application. Data licence information can be found on the relevant CEDA Data Catalogue page, which is linked from the 00README_catalogue_and_licence.txt files found in the archive. For specific data licences granted for restricted datasets, users should log into their MyCEDA page to view their granted licence and the usage purpose under which access was granted. Any use of the data beyond the purpose stated in the original licence application requires a new licence application to be granted.
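The sketch below shows one possible way to locate a 00README_catalogue_and_licence.txt file for a given data path. It assumes that the file sits in the dataset directory or in one of its parent directories, which this article does not state. The function name find_licence_readme is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: walk upwards from a directory and print the first
# 00README_catalogue_and_licence.txt found. Returns 1 if none is found.
find_licence_readme() {
    dir=$1
    while [ -n "$dir" ] && [ "$dir" != "/" ] && [ "$dir" != "." ]; do
        if [ -f "$dir/00README_catalogue_and_licence.txt" ]; then
            printf '%s\n' "$dir/00README_catalogue_and_licence.txt"
            return 0
        fi
        dir=$(dirname "$dir")    # move one level up
    done
    return 1
}

start=/badc/cru/data             # example archive path used in this article
if readme=$(find_licence_readme "$start"); then
    echo "Licence details: $readme"
else
    echo "No 00README_catalogue_and_licence.txt found at or above $start"
fi
```

The file can then be read with any text viewer to find the link to the catalogue page.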
Example of accessing data in the CEDA Archive
In the example below, the logged-in user lists the contents of the CRU datasets within the BADC archive. These belong to the "open" group, so all logged-in users with a linked CEDA account can access them:
$ ls -l /badc/cru/data
total 320
-rw-r-----  1 badc open  396 Feb 18  2015 00README
drwxr-x---  8 badc open 4096 Mar 22 10:32 cru_cy
drwxr-x---  4 badc open 4096 Dec  6  2014 crutem
drwxr-x--- 12 badc open 4096 May  9 14:11 cru_ts
drwxr-x---  3 badc open 4096 Feb 18  2015 PDSI