Fri, 28 March 2008

To you straight from the mouths of our datacenter team:

"I will try to explain it as simply as possible. At 18:45 EST on 3-27-2008, the PFC3B line card in the primary supervisor of Core1 went into a loop. This prevented Core1 from processing Layer 3 traffic across the MSFC3. Layer 2, however, was not affected, which allowed Core2 to continue running normally and pass Layer 2 traffic across the Core1 Layer 2 backbone. HSRP, which is used as a countermeasure, still showed Core1 as functional even though it was experiencing a loop.

"At 19:00 EST, a Cisco SMARTnet tech was dispatched with additional hardware (SUP, MSFC3, PFC3) in case a defective board was found. At 19:35 EST, Core1 was taken offline in preparation for Cisco's arrival. Upon arrival at about 20:45 EST, a debug was generated and Cisco re-certified the module. We are waiting on a final explanation from Cisco as to what caused the PFC3 daughter board to act the way it did.

"We have taken numerous physical countermeasures to make sure that this will not happen again. Some of these countermeasures include modification to the HSRP algorithm as well as physical scripting. Lastly, this incident only affected clients for which HSRP was active on that device. We have placed more sophisticated HSRP countermeasures."

Layman's explanation: there was a hardware failure. Our side of the blame line is not having enough replication of our DNS records in another datacenter. That is being fixed today. Sorry again for the downtime.

Category: general -- posted at: 10:51 AM

