CPK messages are first sent to the CPK mailing list; you can (un)subscribe via this link. You can also follow service interruption messages via RSS, using the link in the title under the RSS icon. If a CPK takes longer to resolve, updates are published on this website.

For RU-wide service interruptions, see meldingen.ru.nl.

Service Interruptions


1415: Clusternode maintenance day - February 6th 2026

Every half year we do clusternode maintenance, with at least a package upgrade and a reboot. Other maintenance can also happen, such as changes to filesystems or network configurations. The upcoming date for this maintenance is Friday, February 6th, 2026.

Resolved Reports


1392: Network failure in part of network

Some servers are unreachable due to an unknown problem with the 25Gbit network switch in room ak008. We don’t know the root cause yet.

Update: rebooting the affected switch has resolved the problem.

DOWN amanda22.science.ru.nl
DOWN cephgrafana.science.ru.nl
DOWN cephgw3.science.ru.nl
DOWN cephgw4.science.ru.nl
DOWN cephmon2.science.ru.nl
DOWN cephosd07.science.ru.nl
DOWN cephosd08.science.ru.nl
DOWN cephosd09.science.ru.nl
DOWN cephosd10.science.ru.nl
DOWN cephosd11.science.ru.nl
DOWN cephosd12.science.ru.nl
DOWN cephosd13.science.ru.nl
DOWN cephosd23.science.ru.nl
DOWN cephosd26.science.ru.nl
DOWN cephosd27.science.ru.nl
DOWN chemotionvm.science.ru.nl
DOWN containervm02.science.ru.nl
DOWN dockervm01.science.ru.nl
DOWN dockervm02.science.ru.nl
DOWN dockervm03.science.ru.nl
DOWN ds3-huyg.net.science.ru.nl
DOWN elabrolabvm.science.ru.nl
DOWN filemakervm.science.ru.nl
DOWN home3.science.ru.nl
DOWN jupytervm.science.ru.nl
DOWN mariavm02.science.ru.nl
DOWN mossvm.vm.science.ru.nl
DOWN rootcell.science.ru.nl
DOWN vmhost03.science.ru.nl
DOWN vmhost04.science.ru.nl
DOWN vmhost07.science.ru.nl
DOWN zammad3.vm.science.ru.nl

1391: Planned network disruption on some server networks

Services on networks that are currently behind our old picos switches will be moved behind our firewalls. Moving the gateway functionality may cause a few minutes of downtime. If all goes well, there will be a single interruption of a few minutes; if we need to roll back and fix things, there may be a second period of downtime. Update: all went well. There were a few minutes of interruption on the cncz homepage due to ARP caching issues. ...

1390: Authentication downtime

Following the recent update of our LDAP servers' certificates, multiple users have reported authentication failures when attempting to log in via RADIUS. The issue appears to affect only users with RADIUS-based authentication, while LDAP-based authentication continues to work. Affected users typically include HFML technicians, guest console users, and Science logins with Wi-Fi access.

Updated Jun 6, 2025  ·  Erik Joost Visser · Created Jun 6, 2025 ·  Erik Visser

1389: Servers unreachable

During the transition from the older 25Gbit switches to newer switches, two blocks of 4 connections were temporarily unavailable. This was caused by a peculiarity of the switches, which operate their ports in blocks of 4: removing an unused cable disabled the other ports in the same block. Reseating one of these connections in one case, and re-inserting the unused cable, got the blocks of ports working again. ...

1388: /vol/astro2 and /vol/astro5 unavailable

Due to errors on the /vol/astro2 filesystem, we had to reboot the fileserver comas1 and take it offline to perform repairs. During this process, both /vol/astro2 and /vol/astro5 have been unavailable.

Updated May 12, 2025  ·  Erik Joost Visser · Created May 12, 2025 ·  Erik Visser

1387: /vol/astro6 and /vol/astro7 unavailable

Due to errors on the filesystem /vol/astro7, we had to reboot the fileserver comas2 and take it offline to run repairs on the filesystems. During those repairs, /vol/astro7 and /vol/astro6 are not available. The last time this happened, the repairs took 24 hours to complete.

Updated May 13, 2025  ·  Erik Joost Visser · Created May 12, 2025 ·  Eric Lieffers

1386: Gitlab/Mattermost/Pages offline after package updates

After the gitlab package was updated, the service did not come up again due to a problem with the post-install scripts. After a reboot, the package installation could finish and service was restored.

1385: DHCP on 25Gbit not working

Due to a change in the network setup for the 25Gbit networks (which are being moved to new switches and a new management method), the DHCP servers for these networks could not reach the DNS servers. As a result, the DHCP servers were unable to resolve the names in their configuration to IP addresses. It was fixed by a change in the central firewall.
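To illustrate the failure mode, here is a minimal Python sketch of a check that reports which hostnames cannot be resolved from the machine it runs on. It is not the tooling we use; the function name `unresolvable_names`, the hostnames, and the stub resolver are hypothetical and only serve as an example.

```python
import socket
from typing import Callable, Iterable, List


def unresolvable_names(
    names: Iterable[str],
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> List[str]:
    """Return the names for which the resolver raised an error.

    By default the system resolver is used; a different callable can be
    passed in, which makes the check easy to try without network access.
    """
    failed = []
    for name in names:
        try:
            resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            failed.append(name)
    return failed


# Hypothetical stub resolver: it only knows one name and fails for all others,
# mimicking a host that cannot reach its DNS servers for most lookups.
def stub_resolver(name: str) -> str:
    known = {"host-a.example.org": "192.0.2.10"}
    if name not in known:
        raise socket.gaierror(f"cannot resolve {name}")
    return known[name]


missing = unresolvable_names(
    ["host-a.example.org", "host-b.example.org"], resolve=stub_resolver
)
print(missing)
```

Called without the `resolve` argument, the same function uses the system resolver, so running it on a host with a name-based configuration would show which names currently fail to resolve.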

1384: ubuntu 20 docker broken

A package upgrade broke the internal networking on our Ubuntu 20.04 Docker container VMs. Internal name resolution no longer works.

1383: lilo7 access with managed SSH keys

We have updated the authentication process for login server lilo7 (lilo7.science.ru.nl) as part of a test. As of now, your ~/.ssh/authorized_keys file on lilo7 is ignored. This change is designed to prevent bad actors from easily gaining persistent access. The good news is that we can now grant access to lilo7 from outside the previously very restricted IP ranges. To gain access from a new IP range, you will need to provide us with your public SSH key and the IP range you will be connecting from. ...
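If you are unsure whether the address you connect from falls inside an IP range you already reported, a small check like the following Python sketch can help. It only illustrates range matching with the standard `ipaddress` module and says nothing about how lilo7 itself enforces access; the function name `ip_in_ranges` and the ranges shown (reserved documentation ranges) are assumptions made for the example.

```python
import ipaddress
from typing import Iterable


def ip_in_ranges(ip: str, ranges: Iterable[str]) -> bool:
    """Return True if the address lies inside at least one CIDR range."""
    addr = ipaddress.ip_address(ip)
    # strict=False accepts ranges written with host bits set, e.g. 192.0.2.5/24
    return any(addr in ipaddress.ip_network(r, strict=False) for r in ranges)


# Hypothetical ranges, taken from the blocks reserved for documentation.
reported_ranges = ["192.0.2.0/24", "198.51.100.0/24"]

inside = ip_in_ranges("192.0.2.17", reported_ranges)
outside = ip_in_ranges("203.0.113.5", reported_ranges)
print(inside, outside)
```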