CPK messages are initially sent to the CPK mailing list; you can (un)subscribe via this link. You can also follow the service interruption messages via RSS, using the link in the title under the RSS icon. If a CPK takes more time to resolve, updates are published on this website.

 

Service Interruptions


1365: Restore old Ceph shares to new locations

Even though we closed CPK #1359, not all user shares have been restored. We have recovered the data from cephfs to temporary storage that is not accessible to users; it will take a bit more time to find new permanent locations for this data. The storage will no longer be cluster-based, but single-server ZFS with snapshots.

Resolved Reports


1346: Jupyter and Chemotion offline

For an unknown reason, the Jupyter and Chemotion servers didn’t start. We are currently looking at how we can start the machines again. The virtual machines have been rebuilt, based on their existing disks.

1345: Ceph performance degraded due to broken storage node

Ceph filesystem storage is experiencing reduced performance because one of the storage nodes is currently offline. A combination of factors causes this to affect performance: fullness and uneven distribution of data over the storage nodes. We expect this to be resolved when the node is back in the cluster. Update 7 Oct 2023: The cluster has been mostly complete again for a while, but there are still many remaining issues to be resolved by the cluster....

1344: Increased chance of SPAM/phishing mails in Science mailbox

The RU contract with the anti-SPAM/anti-phishing service Proofpoint expired on September 21, 2023, which means that ‘Proofpoint End User Digest’ mails are no longer sent after that date. C&CZ is migrating Science mailboxes to the anti-SPAM/anti-phishing service of Microsoft Exchange Online Protection. There is an increased chance of receiving SPAM/phishing mails in the Science mailbox during this migration period. P.S. The RU central mailboxes (with addresses ending in @ru.nl) have already been migrated to Microsoft Exchange Online Protection....

Updated Apr 2, 2024  ·  Erik Joost Visser · Created Sep 14, 2023 ·  Erik Visser

1343: Science and Radboud Newsletter in Spam

The Science newsletter is sent from an @ru.nl mail account. Recently, the @ru.nl mail service was changed by adding Exchange Online Protection (EOP). EOP is Radboud University's successor to the Proofpoint e-mail security filter. EOP added a mail header line to the newsletter, "List-Unsubscribe: https://c8ce19ec62454d21b73b7d5a25559d8f.svc.dynamics.com/t/lu/7...", which made the mail appear to be spam to the Science spam filter SpamAssassin, which reported: "1.1 URI_HEX URI: URI hostname has long hexadecimal sequence". When we noticed this, we fixed it by welcome listing (allowlisting/passlisting) the sender address communications-science@ru....

1342: Some home directories not available

After the Monday morning reboot, the NFS server on home1 refused to start properly. We are investigating why a manual restart of NFS was needed.

1341: Missed RU mail due to stopped external forwarding

RU mail management let us know that forwarding to external (non-RU) mail addresses was stopped yesterday, as announced earlier. Unfortunately, mail for several dozen Science users was/is not forwarded to the Science mail servers. These mails can still be found in MS365 (RU mail), either in the Inbox or in the Deleted Items. RU mail management promised that the forwarding will be corrected tomorrow.

1340: CPU replacement vmhost06

Announcement of maintenance: on Wednesday afternoon we are going to replace the CPU of one of our main vmhost servers, meaning the VMs gitlab9 (pep), slurm22, pep3, jitsivm, poliep, indicoimapp2vm, pep4, mariavm01 and smtp2 will be down for up to 1 hour. Several services depend on mariavm01 (websites, slurm), so they will be affected too.

Updated Aug 17, 2023  ·  Eric Lieffers · Created Aug 14, 2023

1339: Motherboard replacement vmhost06

Apologies for the short notice: we are now going to replace the motherboard of one of our main vmhost servers, meaning the VMs gitlab9 (pep), slurm22, pep3, jitsivm, poliep, indicoimapp2vm, pep4, mariavm01 and smtp2 will be down for up to 1 hour. Several services depend on mariavm01 (websites, slurm), so they are affected too.

1338: Daily backups offline

Our daily backup system relies on cephfs storage, which is currently offline (see CPK#1337). This means that as of July 22nd we are unable to perform or restore daily backups. When the cephfs problems are resolved, the daily backups should also be OK and restorable again. NB: this has no effect on the monthly backups, which continue to work normally.

1337: Cephfs offline

After the power-down of the Huygens building, we are experiencing a problem with bringing the Ceph file system back online. We currently do not know when the Ceph cluster will be operational again. Update 2023-08-01 10:30: Ceph is working again. This CPK is now closed. CPK#1338 is also closed. Update 2023-07-31 12:30: After some more support from 42on, we managed to restart the cephfs; we cannot be sure all files are there, but almost all files are....

Updated Aug 1, 2023  ·  Miek Gieben · Created Jul 22, 2023