A Profile Management Application (PMA) is a critical component of DOCSIS downstream and upstream for both speed and reliability, especially given the increased bandwidth demands of recent years. Reacting quickly to network issues is therefore essential to providing the best customer experience: faster mitigation reduces customer impact and call volumes. The profile recommendation interval was lowered from 6 hours in the previous system (Harb, 2020) to 5 minutes in the DOCSIS 3.0 (D3.0) upstream (US), and from 3.5 days to 1 day in the DOCSIS 3.1 (D3.1) US and downstream (DS), while reducing operational costs and improving capacity.
Ingesting and analyzing large amounts of data at a high rate creates high demand for both storage and CPU. The technology stack was refactored, and costs were lowered by eliminating redundancy and leveraging streaming, batching, cloud computing, and parallel processing. Aligning the polling and PMA processing using Amazon Simple Storage Service (S3), Simple Notification Service (SNS), and Simple Queue Service (SQS) allows a single batch of related data to be processed immediately after polling. Storage demand was reduced by moving components from a relational database to S3 with a large batch size. CPU demand was reduced by moving the analysis logic from a large Apache Spark cluster to a smaller Elastic Kubernetes Service (EKS) cluster, and was further reduced by refactoring the clustering algorithm to use Single Instruction Multiple Data (SIMD) parallel processing.
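As an illustration of how an S3, SNS, and SQS chain can hand a freshly polled batch to a processing worker, the sketch below extracts the bucket and object key from one SQS message body. It is a minimal sketch, not the system described here: it assumes the standard S3 event notification layout delivered through an SNS topic without raw message delivery, and the function name, bucket name, and object key are hypothetical.

```python
import json
from typing import List, Tuple
from urllib.parse import unquote_plus


def s3_objects_from_sqs_body(body: str) -> List[Tuple[str, str]]:
    """Return (bucket, key) pairs named in one SQS message body (hypothetical helper)."""
    envelope = json.loads(body)
    # When SNS fans out to SQS without raw message delivery, the S3 event is a
    # JSON string stored in the SNS envelope's "Message" field.
    event = json.loads(envelope["Message"]) if "Message" in envelope else envelope
    objects = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys in S3 event notifications are URL-encoded.
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects


# Illustrative message only; the bucket and key are made up.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "example-poll-bucket"},
                "object": {"key": "polls/2024-01-01T00%3A05/batch-0001.parquet"},
            }
        }
    ]
}
sample_body = json.dumps({"Type": "Notification", "Message": json.dumps(sample_event)})
batch_objects = s3_objects_from_sqs_body(sample_body)
```

A worker that receives such a message can fetch exactly the objects written by the latest poll, which is what lets processing start as soon as a batch lands rather than on a separate schedule.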
Making PMA recommendations more often improves capacity to an extent, but larger capacity gains came from changes to the profile selection and clustering algorithms. For D3.1, the Modulation Error Ratio (MER) data model was improved by using histograms and a time decay function. The model and its added time dimension produced better utilization and capacity estimates. Optimal percentiles and corresponding weights were generated for each modem and used as inputs to the clustering algorithm. The result was capacity gains of more than 8% and 5 Tb/s.
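The idea of a MER histogram with time decay can be sketched as follows. This is a minimal sketch under stated assumptions rather than the model used here: the exponential half-life decay, the 1 dB bin width, the function names, and the sample values are all hypothetical, and the selection of optimal percentiles and weights per modem is not shown.

```python
import math
from typing import Dict, Iterable, Tuple


def decayed_histogram(
    samples: Iterable[Tuple[float, float]],
    now_s: float,
    half_life_s: float,
    bin_width_db: float = 1.0,
) -> Dict[float, float]:
    """Build a MER histogram in which older samples count for less.

    samples holds (timestamp_s, mer_db) pairs. The exponential half-life
    weighting is an assumed decay function chosen for illustration.
    """
    hist: Dict[float, float] = {}
    for ts, mer_db in samples:
        age_s = max(0.0, now_s - ts)
        weight = 0.5 ** (age_s / half_life_s)
        bin_edge = math.floor(mer_db / bin_width_db) * bin_width_db
        hist[bin_edge] = hist.get(bin_edge, 0.0) + weight
    return hist


def histogram_percentile(hist: Dict[float, float], pct: float) -> float:
    """Return the lower edge of the bin where cumulative weight first reaches pct percent."""
    total = sum(hist.values())
    if total <= 0:
        raise ValueError("histogram is empty")
    target = total * pct / 100.0
    running = 0.0
    for edge in sorted(hist):
        running += hist[edge]
        if running >= target:
            return edge
    return max(hist)


# Made-up samples: one per hour, with MER improving over time.
samples = [(0.0, 30.2), (3600.0, 36.7), (7200.0, 38.1)]
hist = decayed_histogram(samples, now_s=7200.0, half_life_s=3600.0)
low_pct = histogram_percentile(hist, 10)
median = histogram_percentile(hist, 50)
```

With these samples the decayed median falls in the 38 dB bin, whereas an unweighted median would fall in the 36 dB bin, which shows how the time dimension shifts the estimate toward recent channel conditions while a low percentile still reflects the older, worse reading.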