CPK messages are initially sent to the CPK mailing list; you can (un)subscribe via this link. You can also follow the service interruption messages via RSS, using the link in the title under the RSS icon. If a CPK takes more time to resolve, any updates are published on this website.

For RU-wide service interruptions, see meldingen.ru.nl.

 

Service Interruptions


 No service interruptions.


Resolved Reports


1339: motherboard replacement vmhost06

Apologies for the short notice: we are now going to replace the motherboard of one of our main vmhost servers, meaning that the VMs gitlab9 (pep), slurm22, pep3, jitsivm, poliep, indicoimapp2vm, pep4, mariavm01 and smtp2 will be down for up to 1 hour. Several services depend on mariavm01 (websites, slurm), so they are affected too.

1338: Daily backups offline

Our daily backup system relies on cephfs storage, which is currently offline (see CPK#1337). This means that as of July 22nd we are unable to perform or restore daily backups. Once the cephfs problems are resolved, the daily backups should work and be restorable again. NB: this has no effect on the monthly backups, which continue to work normally.

1337: Cephfs offline

After the power-down of the Huygens building we are experiencing a problem bringing the Ceph file system back online. We currently do not know when the Ceph cluster will be operational again. Update 2023-08-01 10:30: Ceph is working again. This CPK is now closed. CPK#1338 is also closed. Update 2023-07-31 12:30: After some more support from 42on, we managed to restart the cephfs; we cannot be sure all files are there, but almost all files are....

Updated Aug 1, 2023  ·  Miek Gieben · Created Jul 22, 2023

1336: VPN service downtime

The VPNsec service will be moved to a new server. This move will cause downtime, and existing VPN connections will be terminated. Downtime is not expected to exceed several minutes.

Updated Jul 6, 2023  ·  Wim Janssen · Created Jul 4, 2023

1335: Mailman disruption

Last Friday, a change in the mailman configuration was rolled out, which had the inadvertent effect that mails were no longer delivered to external addresses. However, these mailman posts were delivered successfully to internal Science mail addresses. The change has been rolled back for the moment, but it remains necessary, so we are looking for another solution.

Updated Sep 28, 2023  ·  Miek Gieben · Created Jul 3, 2023

1334: router change for most Science services (dr-huyg)

The connecting router (dr-huyg) for all servers in the subnets 131.174.30.0/24, 131.174.31.0/24 and 131.174.16.128/26 will be replaced. This is expected to interrupt connectivity for approximately 10 minutes, but unforeseen circumstances may extend the interruption. We are doing this now because of the planned power interruption on July 22, which the old router hardware is unlikely to survive.

1333: Science IT services down July 21 and 22 - Huygens building power outage

On Friday July 21 from 17:00, we will start shutting down compute cluster nodes to prepare for the power outage of the Huygens building on Saturday July 22. Other servers will be shut down later. The most important servers (mail, home, file, Ceph, gitlab, loginservers) will be shut down starting Saturday morning at 7:00. We will try to keep basic services (DNS/DHCP, SMTP (mail) and license servers) up during this power outage. RU services are not serviced from the Huygens building, so will not be affected....

1332: Certificate of authentication server expired

Due to the expiration of an LDAP certificate, it is temporarily not possible to log in to various services. A new certificate is being installed urgently. Affected services include Eduroam in combination with Science logins, VPN, GitLab and Mattermost.

1331: Downtime Felixdisk and bioboost

Due to a failure in a power distribution unit (PDU), the servers felixdisk and bioboost went down. Both servers have been connected to another PDU and are now working again.

1330: networking problems due to routing change

The planned routing change, which should not have caused issues for more than a few seconds, didn’t work as planned and caused problems for up to 15 minutes. Update 2023-06-12 - 22:00 The situation has become worse: DNS resolution, some fileservers and jupyterhub are having problems due to the network change. We will attempt to resolve the issue as soon as possible. Update 2023-06-13 - 11:30 After correcting errors (fixed IP addresses), all services are up again....