JASMIN Help Site logo JASMIN Help Site logo
  • Docs 
  • Guides 
  • Training 
  • Discussions   

  •   Search this site  

Can't find what you're looking for?

Try our Google custom search, across all JASMIN sites

Docs
  • getting started
    • get started with jasmin
    • generate ssh key pair
    • get jasmin portal account
    • get login account
    • beginners training workshop
    • how to contact us about jasmin issues
    • jasmin status
    • jasmin training accounts
    • tips for new users
    • how to login
    • multiple account types
    • present ssh key
    • reconfirm email address
    • reset jasmin account password
    • ssh auth
    • storage
    • understanding new jasmin storage
    • update a jasmin account
  • interactive computing
    • interactive computing overview
    • check network details
    • login servers
    • login problems
    • graphical linux desktop access using nx
    • sci servers
    • tenancy sci analysis vms
    • transfer servers
    • jasmin notebooks service
    • jasmin notebooks service with gpus
    • creating a virtual environment in the notebooks service
    • project specific servers
    • dask gateway
    • access from vscode
  • batch computing
    • lotus overview
    • slurm scheduler overview
    • slurm queues
    • lotus cluster specification
    • how to monitor slurm jobs
    • how to submit a job
    • how to submit an mpi parallel job
    • example job 2 calc md5s
    • orchid gpu cluster
    • slurm status
    • slurm quick reference
  • software on jasmin
    • software overview
    • quickstart software envs
    • python virtual environments
    • additional software
    • community software esmvaltool
    • community software checksit
    • compiling and linking
    • conda environments and python virtual environments
    • conda removal
    • creating and using miniforge environments
    • idl
    • jasmin sci software environment
    • jasmin software faqs
    • jaspy envs
    • matplotlib
    • nag library
    • name dispersion model
    • geocat replaces ncl
    • postgres databases on request
    • running python on jasmin
    • running r on jasmin
    • rocky9 migration 2024
    • share software envs
  • data transfer
    • data transfer overview
    • data transfer tools
    • globus transfers with jasmin
    • bbcp
    • ftp and lftp
    • globus command line interface
    • globus connect personal
    • gridftp ssh auth
    • rclone
    • rsync scp sftp
    • scheduling automating transfers
    • transfers from archer2
  • short term project storage
    • apply for access to a gws
    • elastic tape command line interface hints
    • faqs storage
    • gws etiquette
    • gws scanner ui
    • gws scanner
    • gws alert system
    • install xfc client
    • xfc
    • introduction to group workspaces
    • jdma
    • managing a gws
    • secondary copy using elastic tape
    • share gws data on jasmin
    • share gws data via http
    • using the jasmin object store
    • configuring cors for object storage
  • long term archive storage
    • ceda archive
  • mass
    • external access to mass faq
    • how to apply for mass access
    • moose the mass client user guide
    • setting up your jasmin account for access to mass
  • for cloud tenants
    • introduction to the jasmin cloud
    • jasmin cloud portal
    • cluster as a service
    • cluster as a service kubernetes
    • cluster as a service identity manager
    • cluster as a service slurm
    • cluster as a service pangeo
    • cluster as a service shared storage
    • adding and removing ssh keys from an external cloud vm
    • provisioning tenancy sci vm managed cloud
    • sysadmin guidance external cloud
    • best practice
  • workflow management
    • rose cylc on jasmin
    • using cron
  • uncategorized
    • mobaxterm
    • requesting resources
    • processing requests for resources
    • acknowledging jasmin
    • approving requests for access
    • working with many linux groups
    • jasmin conditions of use
  • getting started
    • get started with jasmin
    • generate ssh key pair
    • get jasmin portal account
    • get login account
    • beginners training workshop
    • how to contact us about jasmin issues
    • jasmin status
    • jasmin training accounts
    • tips for new users
    • how to login
    • multiple account types
    • present ssh key
    • reconfirm email address
    • reset jasmin account password
    • ssh auth
    • storage
    • understanding new jasmin storage
    • update a jasmin account
  • interactive computing
    • interactive computing overview
    • check network details
    • login servers
    • login problems
    • graphical linux desktop access using nx
    • sci servers
    • tenancy sci analysis vms
    • transfer servers
    • jasmin notebooks service
    • jasmin notebooks service with gpus
    • creating a virtual environment in the notebooks service
    • project specific servers
    • dask gateway
    • access from vscode
  • batch computing
    • lotus overview
    • slurm scheduler overview
    • slurm queues
    • lotus cluster specification
    • how to monitor slurm jobs
    • how to submit a job
    • how to submit an mpi parallel job
    • example job 2 calc md5s
    • orchid gpu cluster
    • slurm status
    • slurm quick reference
  • software on jasmin
    • software overview
    • quickstart software envs
    • python virtual environments
    • additional software
    • community software esmvaltool
    • community software checksit
    • compiling and linking
    • conda environments and python virtual environments
    • conda removal
    • creating and using miniforge environments
    • idl
    • jasmin sci software environment
    • jasmin software faqs
    • jaspy envs
    • matplotlib
    • nag library
    • name dispersion model
    • geocat replaces ncl
    • postgres databases on request
    • running python on jasmin
    • running r on jasmin
    • rocky9 migration 2024
    • share software envs
  • data transfer
    • data transfer overview
    • data transfer tools
    • globus transfers with jasmin
    • bbcp
    • ftp and lftp
    • globus command line interface
    • globus connect personal
    • gridftp ssh auth
    • rclone
    • rsync scp sftp
    • scheduling automating transfers
    • transfers from archer2
  • short term project storage
    • apply for access to a gws
    • elastic tape command line interface hints
    • faqs storage
    • gws etiquette
    • gws scanner ui
    • gws scanner
    • gws alert system
    • install xfc client
    • xfc
    • introduction to group workspaces
    • jdma
    • managing a gws
    • secondary copy using elastic tape
    • share gws data on jasmin
    • share gws data via http
    • using the jasmin object store
    • configuring cors for object storage
  • long term archive storage
    • ceda archive
  • mass
    • external access to mass faq
    • how to apply for mass access
    • moose the mass client user guide
    • setting up your jasmin account for access to mass
  • for cloud tenants
    • introduction to the jasmin cloud
    • jasmin cloud portal
    • cluster as a service
    • cluster as a service kubernetes
    • cluster as a service identity manager
    • cluster as a service slurm
    • cluster as a service pangeo
    • cluster as a service shared storage
    • adding and removing ssh keys from an external cloud vm
    • provisioning tenancy sci vm managed cloud
    • sysadmin guidance external cloud
    • best practice
  • workflow management
    • rose cylc on jasmin
    • using cron
  • uncategorized
    • mobaxterm
    • requesting resources
    • processing requests for resources
    • acknowledging jasmin
    • approving requests for access
    • working with many linux groups
    • jasmin conditions of use
  1.   For Cloud Tenants
  1. Home
  2. Docs
  3. For Cloud Tenants
  4. Cluster-as-a-Service - Pangeo

Cluster-as-a-Service - Pangeo

 

Share via
JASMIN Help Site
Link copied to clipboard

Cluster-as-a-Service - Pangeo

On this page
Introduction   Cluster configuration   Accessing the cluster   Using the Pangeo environment   Administering the hub  

This article describes how to deploy and use a Pangeo cluster using JASMIN Cluster-as-a-Service (CaaS).

Introduction  

Pangeo  is a community that promotes a philosophy of open, scalable and reproducible science, focusing primarily on the geosciences. As part of that effort, the Pangeo community provides a curated Python  ecosystem based on popular open-source packages like xarray  , Iris  , Dask  and Jupyter notebooks  , along with documentation and recipes for deployment on various infrastructures.

The Pangeo cluster type in CaaS is a multi-user implementation of the Pangeo ecosystem using JupyterHub  deployed on Kubernetes  , giving users a scalable and fault- tolerant infrastructure to use for doing science, all through a web-browser interface. Authentication is handled by the Identity Manager for the tenancy via JupyterHub’s LDAP integration. Each authenticated user gets their own Jupyter notebook environment running in its own container, isolated from other users. The automatic spawning of containers for authenticated users is handled by JupyterHub, which also provides an interface for admins to manage the running containers. This is achieved by using the Pangeo Helm chart  to deploy the Pangeo ecosystem on a CaaS Kubernetes cluster.

Cluster configuration  

The Pangeo ecosystem is deployed on top of CaaS Kubernetes, so all the configuration variables for Kubernetes also apply to Pangeo clusters.

In addition, the following variables are available to configure the Pangeo installation:

Variable Description Required? Can be updated?
Notebook CPUs The number of CPUs to allocate to each user notebook environment. Yes Yes
Notebook RAM The amount of RAM, in GB, to allocate to each user notebook environment. Yes Yes
Notebook storage The amount of persistent storage, in GB, to allocate to each user notebook environment. This is where users will store their notebooks and any other files they import. The storage is persistent across notebook restarts - a user can shut down their notebook server and start a new server later without losing their data. Backups are not provided - if required, they are the responsibility of the user or cluster admins. Yes Yes
Pangeo domain The domain to use for the Pangeo notebook web interface.

If left empty, pangeo.<dashed-external-ip>.sslip.io is used. For example, if the selected external IP is 192.171.139.83, the domain will be pangeo.192-171-139-83.sslip.io.

If given, the domain must already be configured to point to the selected External IP (see Kubernetes configuration), otherwise configuration will fail. Only use this option if you have control over your own DNS entries - the CaaS system or Kubernetes will not create a DNS entry for you.
No No

These variables define the amount of resource available to each user for processing - in order to appropriately configure (a) the size of your cluster worker nodes and (b) the resources for each notebook environment, you will need to consider how many users you are expecting and what workloads they might want to run.

Accessing the cluster  

Access to the underlying Kubernetes cluster is achieved in the same way as any other CaaS Kubernetes cluster.

The Pangeo web interface will be available at https://<pangeo domain>. Access to the Pangeo interface is managed through FreeIPA, and users sign in with the same username and password as for other clusters. As part of the cluster configuration, CaaS will create a FreeIPA group called <clustername>_notebook_users. Granting access to the Pangeo interface is as simple as adding a user to this group.

Before adding user to group:

Adding user to group: before
Adding user to group: before

And after:

Adding user to group: after
Adding user to group: after

Using the Pangeo environment  

A full discussion of the capabilities of Jupyter notebooks, the JupyterLab environment and the many libraries included in the Pangeo ecosystem is beyond the scope of this documentation. There are plenty of examples on the web, and Pangeo provide some example notebooks  .

As a very brief example of the power and simplicity of Jupyter notebooks, especially on Kubernetes, this short video (no sound) shows a user signing into a CaaS Pangeo cluster and uploading a notebook. When run, the notebook spawns a Dask cluster inside Kubernetes and uses it to perform a noddy but relatively time-consuming calculation. You can see the Dask cluster scale up to meet demand in the Dask dashboard:

Administering the hub  

JupyterHub allows admin users to manage the notebooks running in the system, and even to impersonate other users. Unfortunately, the JupyterHub LDAP integration does not currently allow for an entire group to be designated as admins, so admin access in JupyterHub is granted specifically to the FreeIPA admin user.

To access the admin section, first click Hub > Control Panel in the JupyterLab interface:

Administering the hub (1)
Administering the hub (1)

This will open the JupyterHub control panel - any user can use this to start and stop their notebook server, but admins see an extra Admin tab:

Administering the hub (2)
Administering the hub (2)

Clicking on this tab will show a list of the users in the hub, along with buttons to start and stop the servers for those users:

Administering the hub (3)
Administering the hub (3)

Additional admins can be added by clicking the edit user button for that user. This will pop up a dialogue:

Administering the hub (4)
Administering the hub (4)

Check the Admin checkbox and click Edit User to save. The user is now an admin for the hub.

Last updated on 2024-09-05 as part of:  replacing refs using old syntax & tidied some other links (f03769a9c)
On this page:
Introduction   Cluster configuration   Accessing the cluster   Using the Pangeo environment   Administering the hub  
Follow us

Social media & development

   

Useful links

  • CEDA Archive 
  • CEDA Catalogue 
  • JASMIN 
  • JASMIN Accounts Portal 
  • JASMIN Projects Portal 
  • JASMIN Cloud Portal 
  • JASMIN Notebooks Service 
  • JASMIN Community Discussions 

Contact us

  • Helpdesk
UKRI/STFC logo
UKRI/NERC logo
NCAS logo
NCEO logo
Accessibility | Terms and Conditions | Privacy and Cookies
Copyright © 2025 Science and Technology Facilities Council.
Hinode theme for Hugo licensed under Creative Commons (CC BY-NC-SA 4.0).
JASMIN Help Site
Code copied to clipboard