CPK messages are initially sent to the CPK mailing list; you can (un)subscribe via this link. You can also follow the service interruption messages via RSS, using the link under the RSS icon in the title. If a CPK takes longer to resolve, updates are published on this website.

For RU-wide service interruptions, see meldingen.ru.nl.

 

Service Interruptions


1407: NFS problems under investigation

After we moved the gateways of our networks from the old location to the new firewalls, we received complaints about NFS filesystems being slow, showing longer delays, or being unavailable. In general, a clusternode job should avoid depending on NFS wherever possible, because NFS I/O is always much slower than local I/O on /scratch. We are investigating how to optimise the network to resolve this issue, but we are hard pressed to know the exact cause of the problem....
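As an illustration of the advice to prefer local I/O on /scratch, a job can copy its input from NFS once, work on the local copy, and copy the result back at the end. The following is a minimal sketch only: the function names, the per-user directory prefix, and the assumption that /scratch is writable by the job are ours, not part of the cluster setup.

```python
import os
import shutil
import tempfile
from pathlib import Path


def stage_to_scratch(nfs_input: Path, scratch_root: Path = Path("/scratch")) -> Path:
    """Copy one input file from NFS into a private directory on local scratch.

    Assumes scratch_root already exists and is writable (hypothetical default).
    """
    user = os.environ.get("USER", "job")
    # mkdtemp creates a unique directory, so parallel jobs do not collide.
    workdir = Path(tempfile.mkdtemp(prefix=f"{user}-", dir=scratch_root))
    local_copy = workdir / nfs_input.name
    shutil.copy2(nfs_input, local_copy)
    return local_copy


def copy_back_and_clean(local_result: Path, nfs_output_dir: Path) -> Path:
    """Copy a result file back to NFS and remove the scratch working directory."""
    nfs_output_dir.mkdir(parents=True, exist_ok=True)
    target = nfs_output_dir / local_result.name
    shutil.copy2(local_result, target)
    # Clean up so the local disk does not fill up for other jobs.
    shutil.rmtree(local_result.parent)
    return target
```

With this pattern the job touches NFS only twice, at the start and at the end, so slow or briefly unavailable NFS mounts affect the staging steps rather than every read and write during the computation.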

1415: Clusternode maintenance day - February 6th 2026

Every six months we do clusternode maintenance, with at least a package upgrade and a reboot. Sometimes other maintenance takes place as well, such as changes to filesystems or network configurations. The upcoming date for this maintenance is Friday, February 6th, 2026.

Resolved Reports


1256: Problems with a virtual host

The virtual machine host ‘oscar’ stopped running at 5:30am. The listed virtual machines are not running at the moment. We are working on it. Update: Resolved. Again, a broken LVM snapshot caused the problem.

Updated Oct 11, 2022  ·  Bram Daams · Created Nov 5, 2019 · 

1255: Power Distribution Unit failure

A failed PDU caused the servers with a single power supply to lose power; they restarted after the PDU was replaced. Jobs running on these nodes did not finish.

Updated Oct 11, 2022  ·  Bram Daams · Created Oct 23, 2019 · 

1254: Webserver 'havik' offline

A loose power cable caused an interruption of one of our web servers. The machine is equipped with redundant power supplies. It’s not clear why a single loose cable caused a service interruption.

Updated Oct 11, 2022  ·  Bram Daams · Created Oct 6, 2019 · 

1253: NFS service interruption

Due to a configuration error, the NFS service on file server ‘peck’ refused to start. After fixing the error, the NFS service started up correctly.

Updated Oct 11, 2022  ·  Bram Daams · Created Sep 23, 2019 · 

1252: Boot problems of virtual host

At the weekly reboot of vmhost ‘oscar’, it did not boot properly. The listed virtual machines are not running at the moment. We are working on it. Update: Most probably, the problems were caused by full snapshots. After removal of the snapshots, the system booted properly.

Updated Oct 11, 2022  ·  Bram Daams · Created Sep 3, 2019 · 

1251: Mailbox problems for a few users

Due to an administrative error, the mailboxes of a few hundred users became inaccessible. Mail couldn’t be read from or delivered to these mailboxes until this had been repaired. During this period, the mail clients of 19 users tried to read their mailboxes and thus possibly reported an error.

Updated Oct 11, 2022  ·  Bram Daams · Created Aug 23, 2019 · 

1250: Mail problems with newly introduced incoming mail server

A new receiving mailserver was introduced on July 1. This server has the most recent version of Ubuntu and other software and settings that conform more closely to the directives (in Dutch). After the introduction, problems appeared with receiving mail from Microsoft Office 365 Exchange Online (mail.protection.outlook.com). An even bigger problem was that some accepted mails were not forwarded, but bounced to the sender. Because that problem couldn’t be fixed immediately, the newly introduced server was shut down on July 2....

Updated Oct 11, 2022  ·  Bram Daams · Created Jul 1, 2019 · 

1249: Network interruption central RU network

Due to a faulty RU core network device, several services couldn’t be reached. This device is in a redundant setup. The ISC network department will investigate what exactly happened and how to prevent such disruptions in the future.

Updated Oct 11, 2022  ·  Bram Daams · Created May 22, 2019 · 

1247: DNS nameservice interruption for Science domains

Yesterday afternoon a DNS error with science.ru.nl (caused by the introduction of DNSSEC) made science.ru.nl disappear from the internet for many users when viewed from outside campus. The error was corrected in the evening. Because DNS responses are cached, the last problems should disappear tonight. In the meantime, workarounds include rebooting the home router and/or PC, using a different network (mobile provider), or using rainloop....

Updated Oct 11, 2022  ·  Bram Daams · Created May 14, 2019 · 

1248: Dozens of Science websites down, also Roundcube

Because a web application used too much memory, all websites on this webserver had problems. After the webserver was restarted, all websites could be reached again. We will make sure that web applications get memory limits, which should prevent this problem in the future.

Updated Oct 11, 2022  ·  Bram Daams · Created May 14, 2019 ·