HCoop Status (hcoopstatus) group
Notices
-
Preparing to deploy the domtool changes to switch the default web node
about 3 months ago from web -
Shared MoinMoin upgraded to 1.9.6 for security fixes; no action required by members.
about 4 months ago from web -
Domtool-server was left wedged in kernel space; rebooting deleuze. Mail will be down for ~5 minutes.
about 4 months ago from web -
navajos has taken a lunch break due to AFS weirdness; halting and restarting now. Web services down for a few moments.
about 4 months ago from web -
All services back to normal, sorry for the minor blip there. Cron seems to hate us from time to time.
about 4 months ago from web -
Apache mod_disk_cache is to blame again; nuking the cache and killing it forever, since deleuze is on its way out this year anyway.
about 4 months ago from web -
Deleuze's kernel oopsed yesterday when the AFS server disappeared for a few minutes; preparing to restart. Mail will be offline for ~20 min.
about 4 months ago from web -
domtool is working again and another machine was pulled into service for ns2.hcoop.net, but on-site. Off-site DNS returns tomorrow night.
about 5 months ago from web -
outpost.hcoop.net disappeared from the net today; no response from provider, switching to a new provider tomorrow. domtool down until then.
about 5 months ago from web -
mdstat estimates 40 minutes to rebuild fritz's primary RAID1. Will reboot afterwards to ensure the failed drive stays failed and out of the array.
about 5 months ago from web -
Databases are down again, still trying to acquire the replacement drive for fritz.
about 5 months ago from web -
Fritz is going up and down today; it appears sdb is dying, but the problems are intermittent. Calling Dell for replacement.
about 5 months ago from web -
Managed to reboot, but munin started as soon as anacron finished; rebooting again in an attempt to purge munin.
about 5 months ago from web -
Yes, fritz is down again. Somehow munin was disabled but not removed, ran at 3 a.m. and took the RAID offline.
about 5 months ago from web -
Salvager completed, all volumes back up. If you migrated to navajos this weekend, it was down for unrelated reasons but will be up shortly.
about 5 months ago from web -
Remote hands request has been entered, fritz should be rebooted within an hour or two.
about 5 months ago from web -
Fritz is down due to a kernel panic, and the IPKVM is unable to reboot it. Waiting for a response from remote hands for the reboot. May not be until 9 a.m. EST.
about 5 months ago from web -
Deleuze is back up, mail services should be restored. Unfortunately, the system RAID5 is degraded and we need to replace a drive.
about 6 months ago from web -
Load is catastrophically high on deleuze, rebooting. Mail services will be offline for a few minutes.
about 6 months ago from web -
mire is down after running out of RAM/swap; force-rebooting now.
about 6 months ago from web