Emergency Router Cut-Over

Today Was Fun

At around 1650 CDT, my Buffalo router+WAP started dropping ~20% of my WAN packets. By 1700, the WLAN was still broadcasting but refusing to switch traffic. The wife was en route with a massive speech to finish & practice for tomorrow, so I had little time to fix it myself.

Because I was less sure about my Ruckus ZoneFlex 7372's config and stability, I first pulled her a Cat6 drop from a 2nd-floor switch down the hall to her office. Then I reworked my Ansible playbooks, backing off from the 6-VLAN config I had been planning to cut over to by this weekend at the latest and going to a single VLAN that simply restores the status quo. Basically exactly what I had before my Buffalo drank the wrong Kool-Aid.
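
For illustration only, here is a minimal Python sketch of the kind of rendering a role like that might do when collapsing several VLANs down to one. It is not the actual Ansible role: the interface name eth1, the bridge names, and the addresses are all made up, and it assumes Debian's ifupdown with the bridge-utils and vlan packages installed.

```python
from ipaddress import ip_interface


def render_stanzas(raw_iface, networks):
    """Render /etc/network/interfaces stanzas for a list of bridged networks.

    networks: list of dicts with 'name' (bridge name), 'vlan' (int, or None
    for untagged), and 'cidr' (router address with prefix length).
    All names and addresses passed in are hypothetical examples.
    """
    blocks = []
    for net in networks:
        addr = ip_interface(net["cidr"])
        vlan = net.get("vlan")
        if vlan is None:
            # Untagged: bridge the raw interface directly.
            port = raw_iface
        else:
            # Tagged: create a VLAN sub-interface and bridge that instead.
            port = f"{raw_iface}.{vlan}"
            blocks.append(
                f"auto {port}\n"
                f"iface {port} inet manual\n"
                f"    vlan-raw-device {raw_iface}\n"
            )
        blocks.append(
            f"auto {net['name']}\n"
            f"iface {net['name']} inet static\n"
            f"    address {addr.ip}\n"
            f"    netmask {addr.netmask}\n"
            f"    bridge_ports {port}\n"
            f"    bridge_stp off\n"
            f"    bridge_fd 0\n"
        )
    return "\n".join(blocks)


# The planned multi-VLAN layout (only two of the six shown, values invented).
multi = render_stanzas("eth1", [
    {"name": "br10", "vlan": 10, "cidr": "10.0.10.1/24"},
    {"name": "br20", "vlan": 20, "cidr": "10.0.20.1/24"},
])

# The emergency fallback: one untagged bridge, same data shape.
single = render_stanzas("eth1", [
    {"name": "br0", "vlan": None, "cidr": "192.168.1.1/24"},
])
print(single)
```

The point of keeping the layout as data is that the 6-VLAN plan and the single-VLAN fallback differ only in the list that gets fed in, which is roughly why a cut-over like this can be done in a hurry.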

What Resulted Tonight?

  • Cable modem physically moved in the house from the 2nd floor to the basement
  • Cable modem downlink moved from dd-wrt Buffalo to Debian Linux router
  • Buffalo demoted from DHCP & DNS so it's mostly an unmanaged switch
  • Ruckus AP moved to the basement & directly connected to the new router
  • Important WLAN stations moved to temporary SSID & WPA PSK

What Did I Learn?

  • Ansible saved my life

    I had spent a lot of time working on a role to configure bridges, VLANs, raw ifaces, and addressing. That shit paid better dividends than Enron downing power plants.

  • My network needs more HA

    • Hot-spare or load-balanced ISP uplink
    • Hot-spare or load-balanced DOCSIS 3 modem on-hand
    • At least 2x UPS (workstation + infrastructure, maybe 4x for workstation and infrastructure redundancy?)
    • Hot-spare workstation (or at least enough equipment to rebuild what I currently have)

    These are all pipe dreams at this point, but the maintenance cycle to upgrade my screens and now this event have shown me that I'm woefully unprepared.

    For a guy that gets paid to assure service responsiveness & uptime, I'm clearly doing a shitty job of it at home.

    And that could seriously fuck me...

  • I have the best wife in the world

    Even with a critical deadline on the hook, she fully appreciated my diagnosis of a failing router & suggestion that we migrate immediately. We worked closely to coordinate the downtime window I needed and everything went well. I think a lot of IT spouses could learn something from this incident.
