Instaclustr Node Patching: Finding a Kernel Bug with the AWS EC2 Xen Hypervisor

How does the Instaclustr OS patching cycle work?

  • Step 1: Perform thorough internal testing of various node infrastructures, including extensive performance testing
  • Step 2: Upgrade a small portion of our non-production SLA tier clusters (developer size nodes)
  • Step 3: Upgrade all remaining non-production SLA tier clusters
  • Step 4: Make this Operating System version the default for all new nodes
  • Step 5: Upgrade all production SLA tier clusters
  • Step 6: Perform final analysis of the fleet to ensure that all applicable nodes are running the correct OS version
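The staged ordering above can be sketched in a few lines of Python. This is only an illustration, not Instaclustr's actual tooling: the `Cluster` type, the `healthy` callback, and the `canary_fraction` default are hypothetical names and values introduced for the example. Step 1 (internal testing) and Step 4 (changing the default image for new nodes) happen outside this loop and are not modelled.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Cluster:
    name: str
    production: bool   # True for the production SLA tier
    os_version: str


def staged_rollout(
    clusters: List[Cluster],
    target: str,
    healthy: Callable[[Cluster], bool],
    canary_fraction: float = 0.1,   # hypothetical size of the "small portion"
) -> List[str]:
    """Upgrade non-production clusters first, then production.

    Returns the names of clusters that are not on the target version.
    """
    non_prod = [c for c in clusters if not c.production]
    prod = [c for c in clusters if c.production]
    n_canary = max(1, int(len(non_prod) * canary_fraction)) if non_prod else 0

    # Steps 2, 3 and 5: canary batch, remaining non-production, then production.
    for batch in (non_prod[:n_canary], non_prod[n_canary:], prod):
        for cluster in batch:
            cluster.os_version = target
        if not all(healthy(c) for c in batch):
            break  # stop before touching the next, more critical batch

    # Step 6: report any cluster that is not running the target version.
    return [c.name for c in clusters if c.os_version != target]
```

The point of the ordering is that a problem found in an early batch halts the rollout before it reaches production clusters; the returned list then doubles as the input for the final fleet analysis.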

Issues found in testing

[    6.902955] ena: The ena device sent a completion but the driver didn't receive a MSI-X interrupt (cmd 8), autopolling mode is OFF
[    6.914601] ena: Failed to submit get_feature command 12 error: -62
[    6.921175] ena 0000:00:03.0: Cannot init indirect table
[    6.927008] ena 0000:00:03.0: Cannot init RSS rc: -62
[    6.947883] ena: probe of 0000:00:03.0 failed with error -62
[   65.783000] nvme nvme0: I/O 15 QID 0 timeout, completion polled
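A check for this failure signature could be automated during testing along the following lines. This is a minimal sketch, assuming the kernel log is available as text (for example, captured from `dmesg`); the helper names `split_entries` and `failed_probes` are hypothetical. It splits on the kernel timestamp, so it works whether the entries arrive one per line or run together.

```python
import re
from typing import List, Tuple

# Matches a kernel timestamp such as "[    6.902955] ".
TIMESTAMP = re.compile(r"\[\s*(\d+\.\d+)\]\s*")


def split_entries(raw: str) -> List[Tuple[float, str]]:
    """Split raw kernel log text into (seconds_since_boot, message) pairs."""
    # With one capture group, re.split yields [prefix, ts1, msg1, ts2, msg2, ...].
    parts = TIMESTAMP.split(raw)
    return [(float(ts), msg.strip()) for ts, msg in zip(parts[1::2], parts[2::2])]


def failed_probes(entries: List[Tuple[float, str]]) -> List[str]:
    """Return messages that report a driver probe failure."""
    return [
        msg for _, msg in entries
        if "probe of" in msg and "failed with error" in msg
    ]


if __name__ == "__main__":
    sample = (
        "[    6.927008] ena 0000:00:03.0: Cannot init RSS rc: -62"
        "[    6.947883] ena: probe of 0000:00:03.0 failed with error -62"
    )
    for message in failed_probes(split_entries(sample)):
        print(message)
```

On Linux, error 62 is `ETIME` (timer expired), which fits the first log entry: the driver waited for an MSI-X interrupt that never arrived.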

Recommendations from AWS

Instaclustr Actions

Next Steps



