Next-generation networks are increasingly challenging to manage as service touchpoints are continuously added. Because legacy networks must also work alongside new network rollouts, operators are seeing a growing number of “alarm storms” generated by all these systems and services. These alarm storms not only extend the time needed to evaluate issues, but also make it harder to balance the resources used to investigate issues and manage networks.
Automation, artificial intelligence (AI), and machine learning (ML) in network operations are increasingly popular with multiple system operators (MSOs) as a means to reduce costs, predict network performance, and drive network efficiencies. Neural networks were trained, alongside a classification engine, on test data representative of an MSO network to explore the relationships between nodes and the grouping of like behavior. This paper shows how AI/ML techniques were successfully implemented to suppress 99% of the alarms, locate and partition the root cause of an alarm storm with high accuracy (first-recommendation accuracy of up to 80%), and reduce time-to-solution from hours to minutes, resulting in higher customer satisfaction and network reliability.