FCC Offers Lessons Learned to Avoid Network Outages
Tuesday, April 17, 2018

In a public notice, the FCC disseminated lessons learned from major network outages and reminded communications service providers to review industry best practices to ensure network reliability. The commission also touted its new network reliability page on its website to help ensure that network providers, public-safety entities and the public can readily find the FCC’s work in promoting industry best practices.

Based on its recent analysis of several major network outages that affected subscribers, including those calling 9-1-1 for emergency assistance, FCC staff determined that the outages could likely have been prevented or mitigated if the provider had followed certain network reliability best practices. Therefore, the commission encourages communications service providers to implement the following practices:
1. Minimize impact of maintenance windows. Network operators and service providers should be aware of the dynamic nature of peak traffic periods and should consider scheduling potentially service-affecting procedures to minimize the impact on end-user services.
2. Monitor 9-1-1 network components. Network operators, service providers and public-safety entities should actively monitor and manage the 9-1-1 network components using network management controls, where available, to quickly restore 9-1-1 service and provide priority repair during network failure events. When multiple interconnecting providers and vendors are involved, they will need to cooperate to provide end-to-end analysis of complex call-handling problems.
3. Ensure real-world testing conditions. Service providers and network operators should consider validating upgrades, new procedures and commands in a lab or other test environment that simulates the target network and load prior to the first application in the field.
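
As an illustration of the first practice, a provider could pick a maintenance window from recent traffic measurements rather than from a fixed schedule. The Python sketch below is hypothetical and not part of the FCC notice: the function name `quietest_window` and the per-hour traffic figures are assumptions made for the example. It returns the start of the contiguous block of hours with the lowest total traffic and allows the block to wrap past midnight.

```python
def quietest_window(hourly_traffic, window_hours):
    """Return the starting index of the contiguous window with the least traffic.

    hourly_traffic: one measurement per hour (hypothetical data, e.g. call counts).
    window_hours: length of the planned maintenance window, in hours.
    The window may wrap around the end of the list (past midnight).
    """
    n = len(hourly_traffic)
    best_start, best_total = 0, float("inf")
    for start in range(n):
        # Sum traffic over the window, wrapping with modulo arithmetic.
        total = sum(hourly_traffic[(start + i) % n] for i in range(window_hours))
        if total < best_total:
            best_start, best_total = start, total
    return best_start


# Hypothetical six-slot profile: the quietest two-slot window starts at index 3.
print(quietest_window([5, 9, 9, 1, 2, 9], 2))
```

Because peak periods shift, the measurements would need to be refreshed before each scheduling decision.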

In addition, the following practices could prevent or mitigate similar outages in the future:
1. Registration traffic. Include registration traffic in the highest priority category of network traffic. Attach critical alarms to failures in the registration process.
2. Data packet monitoring. Monitor traffic to detect when data packets do not progress across a network element.
3. Redundancy failover. Fail over to redundant equipment when the number of error messages within a predetermined period of time exceeds a certain threshold, rather than continuing to try to use the equipment that is generating the error messages.
4. Redundancy during maintenance. When performing maintenance activity on multiple pieces of equipment that have the same function for redundancy, perform maintenance on only one piece of equipment at a time. Once successful maintenance has been verified, maintenance activity can begin on the next piece of equipment.
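
The redundancy failover practice above can be sketched as a sliding-window error counter. This is a minimal Python illustration under stated assumptions, not a method described in the FCC notice: the class name `FailoverMonitor`, the `"primary"` and `"standby"` labels, and the threshold and window values are all hypothetical.

```python
from collections import deque


class FailoverMonitor:
    """Switch to standby equipment when errors in a time window exceed a threshold."""

    def __init__(self, max_errors, window_seconds):
        self.max_errors = max_errors          # hypothetical threshold
        self.window_seconds = window_seconds  # hypothetical observation period
        self._errors = deque()                # timestamps of recent errors
        self.active = "primary"

    def record_error(self, timestamp):
        """Record an error at `timestamp` (seconds) and return the active unit."""
        self._errors.append(timestamp)
        # Discard errors that fall outside the observation window.
        while self._errors and timestamp - self._errors[0] > self.window_seconds:
            self._errors.popleft()
        # Fail over once the count exceeds the threshold, instead of retrying.
        if self.active == "primary" and len(self._errors) > self.max_errors:
            self.active = "standby"
        return self.active


monitor = FailoverMonitor(max_errors=3, window_seconds=60)
for t in (0, 10, 20, 30):
    state = monitor.record_error(t)
print(state)  # four errors within 60 seconds exceed the threshold of 3
```

In this sketch the monitor never switches back on its own; returning traffic to the original equipment is left as a deliberate operator decision.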

The full notice is here.
