HPC Users@UiO Newsletter #2, 2022

News on HPC systems @ UiO, application deadline for CPU time through Sigma2, interesting conferences and external courses.

  • USIT's Underavdeling for IT i forskning (ITF), in English the Division for Research Computing (RC), is responsible for delivering IT support for research at the University of Oslo.
  • The division's departments operate infrastructure for research, and support researchers in the use of computational resources, data storage, application portals, parallelization and optimization of code, and advanced user support.
  • Announcement of this newsletter is done on the hpc-users mailing list. To join the hpc-users list, send a mail to sympa@usit.uio.no with the subject "subscribe hpc-users Mr Fox" (if your name is Mr Fox). The newsletter will be issued at least twice a year.



Hi all, and welcome back from summer vacations! An exciting fall is coming up, with LUMI-G and the new NIRD going into production, conferences going in-person again, and more. But first, let's take some time to look back at where we were before summer. Enjoy!

Data@UiO conference - video recordings available

Big thanks to all who participated in the Data@UiO conference! If you missed parts of the conference or need a refresher after summer, video recordings are now available. In addition to some very nice keynote presentations from Anders Kvellestad and Einar Broch Johnsen, you can find demos on accessing LightHPC and Fox, an overview of our and Sigma2's resources, how to find these, and more.

NIRD2020 installed

NIRD2020 is now in place, aiming for production in late October or early November. The system is based on IBM's Spectrum Scale, Spectrum Discover and Spectrum Protect, with a total capacity of 32 petabytes (PB).

The new infrastructure offers, among other things, adaptable application services, support for file and object storage and several protocols, and application programming interfaces (APIs). With an I/O throughput of 229 GB/s, the new NIRD will support demanding workflows such as artificial intelligence/machine learning and data-intensive analysis.

NIRD is located in the Lefdal Mines Datacenter (LMD), in the containers you can see in the photo (the fox in front is not to scale). LMD will also be the location of the machine that will replace Fram, code-named A2, currently scheduled to be procured in 2023, with the procurement process starting this summer.

Fox Supercomputer - get access now

The Fox cluster is the 'general use' HPC system within Educloud, open to researchers and students at UiO and their external collaborators. There are 24 regular compute nodes with 3,000 total cores, and five GPU-accelerated nodes with NVIDIA RTX 3090 and NVIDIA A100 cards available. Access to Fox requires an Educloud user; see the registration instructions. About 100 projects have already joined Educloud!

For instructions and guidelines on how to use Fox, see Foxdocs - the Fox User Manual.
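If you want a feel for what running a job looks like before diving into Foxdocs, here is a minimal sketch of a Slurm batch script. The project name ec12 is a placeholder, and the options shown are generic Slurm; check Foxdocs for the settings that fit your workload:

#!/bin/bash
# Placeholder account; replace ec12 with your own Educloud project
#SBATCH --account=ec12
#SBATCH --job-name=hello
# Wall-clock limit, task count and memory per CPU; adjust to your workload
#SBATCH --time=00:10:00
#SBATCH --ntasks=1
#SBATCH --mem-per-cpu=1G

echo "Hello from $(hostname)"

Submit it with sbatch hello.sh and follow it in the queue with squeue -u $USER.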

SSH, 2FA, jump hosts and all that

As was mentioned at the Data@UiO conference, we have to tighten the ship a bit when it comes to security and SSH. Historically, Linux machines at UiO had their SSH port open to the world. As the world has evolved, there is now a rather urgent need to reduce the number of potential attack surfaces at UiO, and one of the steps is to close down SSH for most machines except a few jump hosts. For our machines in the hpc.uio.no domain, this has been enforced since 20 June. This means that if you want to log in to an app node, LightHPC node, etc., you need to go through a jump host, e.g., login.uio.no:

ssh -J login.uio.no server.hpc.uio.no
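If you pass through the jump host often, you can make the jump automatic instead of typing -J every time. A minimal sketch for your .ssh/config, assuming your targets live under hpc.uio.no:

Host *.hpc.uio.no
    # Route all hpc.uio.no hosts through the jump host automatically
    ProxyJump login.uio.no

With this in place, a plain ssh server.hpc.uio.no goes via login.uio.no by itself.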

A next step will be to enforce two-factor authentication on the jump hosts, much like what we already have for Fox in Educloud. So if you are already using Fox, you know how this works - and if you are not yet using Fox, then why not? Speaking of two-factor authentication on Fox, if you want to avoid typing in the two-factor code every time you log in, put the following in your .ssh/config:

Host fox.educloud.no
    # Reuse one authenticated connection for subsequent logins
    ControlMaster auto
    # Where to keep the control socket (%h = host, %p = port)
    ControlPath ~/.ssh/controlsock-%h-%p

If you then log in to Fox again, you will (as usual) be asked for your two-factor code and password, but as long as this SSH session is active, all subsequent logins to Fox will reuse its connection. Once the first login session has closed you have to authenticate again, but for many people this means one full login process in the morning and simplified subsequent logins throughout the day.
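If you would rather not depend on keeping that first session open, OpenSSH also has a ControlPersist option that keeps the master connection alive in the background for a chosen time after the last session exits. A sketch, extending the block above (pick a duration you are comfortable with, security-wise):

Host fox.educloud.no
    ControlMaster auto
    ControlPath ~/.ssh/controlsock-%h-%p
    # Keep the shared connection open for 2 hours after the last session closes
    ControlPersist 2h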

Software request form

If you need additional software or want us to upgrade an existing software package, we are happy to do this for you (or to help you install it yourself, if you prefer). In order for us to get all the relevant information and take care of the installation as quickly as possible, we have created a software request form. After you fill in the form, a ticket will be created in RT, and we will get back to you with the installation progress.

To request software, go to the software request form.

LUMI-G soon in production

LUMI, the third fastest HPC system in the world (May 2022), is aiming to bring LUMI-G, its GPU-accelerated partition, into production very soon. If you wish to use LUMI, you apply for access through the regular Sigma2 call for resources prior to each period start (1 April and 1 October); see below. LUMI access is managed by Sigma2 as an extension of the national e-infrastructure resources.

LUMI-G is specifically aimed at machine learning use cases, so if you require GPU compute, please consider applying. Each LUMI-G node is built around 4x AMD MI250X GPUs that are directly connected to the global interconnect, which means GPUs on different nodes can share data directly without involving any CPUs!

New e-Infrastructure allocation period 2022.2, application deadline 19 August 2022

The e-Infrastructure period 2022.2 (01.10.2022 - 31.03.2023) is getting nearer, and the deadline for applications for HPC CPU hours and storage (for both regular and sensitive data) is 19 August. This also includes access to LUMI-C and LUMI-G. New from period 2022.2 is the introduction of the LARGE allocation class. In this period, a LARGE project is a project applying for 20 million CPU hours or more. LARGE projects will be evaluated by external review in addition to the Resource Allocation Committee, and when accepted will receive an allocation for two compute periods (if applicable). As more than one allocation period will be needed to fully implement the LARGE allocation procedure, 2022.2 will be a soft start with flexible case handling and temporary allocations as necessary to cover for any case-handling delays. LARGE projects will be contacted individually by the administration to follow up on the allocation process.

Please note that although applications for allocations can span multiple allocation periods, they require verification from the applicants prior to each application deadline to be processed by the Resource Allocation Committee for a subsequent period. Hence any existing multi-period application must be verified before the deadline to be evaluated and receive an allocation before the new period starts. This does not apply to LARGE projects.

Kind reminder: if you have many CPU hours remaining in the current period, you should of course try to utilize them as soon as possible, but since many users will be doing the same, there is likely to be a resource squeeze and potentially long queue times. The quotas are allocated according to several criteria, of which publications registered in Cristin is an important one (in addition to historical usage). The quotas assume even use throughout the allocation period. If you think you will be unable to spend all your allocated CPU hours, we highly appreciate a notification to sigma@uninett.no so that the CPU hours can be released for someone else; you may get extra hours later if you need more. For those of you who have already run out of hours, or are about to, take a look at the Sigma2 extra allocation page to see how to ask for more. No guarantees, of course.

Run

projects

to list project accounts you are able to use.

Run

cost -p nn0815k

to check your allocation (replace nn0815k with your project's account name).

Run

cost -p nn0815k --detail

to check your allocation and print consumption for all users of that allocation.

HPC Course week/training

Norwegian Research Infrastructure Services (NRIS), formerly known as the Metacenter, has an extensive education and training program to assist existing and future users of our services. UiO has joined NRIS training, providing training to all Norwegian HPC users instead of focusing only on UiO users; consolidating the training events makes it possible to provide more streamlined and consistent training. The courses aim to give the participants an understanding of our services as well as of how to use the resources effectively.

There are two courses coming up.

1. HPC On-boarding October 2022 - Link to more details and registration

2. Best Practices on NRIS Clusters, November 2022 - dates and schedule are not set yet; please visit here for more information.

Other hardware needs

If you are in need of particular types of hardware (fancy GPUs, ARM, Kunluns, Dragons, Graphcore, etc.) not provided through our local infrastructure, please contact us (hpc-drift@usit.uio.no), and we'll try to help you as best we can.

Also, if you have a computational challenge where your laptop is too small but a full-blown HPC solution is a bit of an overkill, it might be worth checking out NREC. This service can provide you with your own dedicated server, with a range of operating systems to choose from.

With the ongoing turmoil around computing architectures, we are also looking into RISC-V. The European Processor Initiative is aiming for ARM and RISC-V, and UiO needs to stay on top of things.

Creative Computing Hub Oslo (in short C2HO)

Creative Computing Hub Oslo (C2HO), an interdisciplinary network focusing on gathering and sharing creative-computing expertise and resources at UiO and within the Oslo region, was recently established. USIT is a partner in this project.

Project website: https://c2ho.no/

Project launch: 25 August 14:00 - 16:00 (all are welcome, registration required)

Registration: https://www.hf.uio.no/imv/english/research/news-and-events/events/other/c2ho/c2ho-opening-event.html


There will be talks, panel discussions, and a live coding music performance.

C2HO Hackathon:

Dates: 22 - 25 August

https://www.hf.uio.no/imv/english/research/news-and-events/events/other/c2ho/c2ho-hackathon-2022.html

During the same week, the C2HO network is also organizing a hackathon. The hackathon primarily targets university students (from any institute), but anyone can join participating teams. Winners will receive awards and will showcase their work during the opening event.


Publication tracker

USIT's Department for Research Computing (RC) is interested in keeping track of publications where computation on RC services is involved. We greatly appreciate an email to:

hpc-publications@usit.uio.no

about any publications (including in the general media). If you would like to cite use of our services, please follow this information.

Fox checking out A2 machine room construction site

Stuffed fox checking out construction work for A2 in LMD. Photo: Jon Kerr Nilsen, UiO

Published Aug. 10, 2022