In 2020, Comcast deployed a Profile Management Application (PMA) system that optimizes the DOCSIS 3.0 (D3.0) configuration of upstream channels across the network, balancing efficiency against robustness (fault tolerance). The established PMA system advances organizational objectives to increase network capacity, respond proactively to impairments, and sustain reliable service for customers. At its core, the current system takes a rules-based approach, in which a static policy, expressed as fixed thresholds on telemetry features (e.g., signal-to-noise ratio and codeword error rate), governs the choice of channel configuration.
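To make the rules-based approach concrete, the following is a minimal sketch of a static threshold policy of the kind described above. The feature names, threshold values, and profile labels are illustrative assumptions, not Comcast's production rules.

```python
# Hypothetical rules-based profile selection: fixed telemetry thresholds
# map directly to a D3.0 upstream modulation profile. Values are
# illustrative, not production thresholds.

def select_profile(snr_db: float, codeword_error_rate: float) -> str:
    """Choose an upstream channel profile from static telemetry thresholds."""
    if snr_db >= 33.0 and codeword_error_rate < 1e-6:
        return "64-QAM"   # efficient: highest capacity, least robust
    if snr_db >= 27.0 and codeword_error_rate < 1e-4:
        return "16-QAM"   # intermediate efficiency/robustness trade-off
    return "QPSK"         # conservative: most robust fallback

print(select_profile(snr_db=34.2, codeword_error_rate=1e-7))  # -> "64-QAM"
```

Every channel is evaluated against the same fixed boundaries, which is exactly where the adaptability limits discussed next arise.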
The rules-based PMA is limited in how far its telemetry thresholds can be tuned to cover a wide range of environmental conditions. Currently, channels are assigned conservative profiles first and are moved progressively toward more efficient, but less robust, profiles. Further innovation within Comcast's PMA implementation therefore centers on a delicate balance: applying intelligent, dynamic decision-making policies while preserving correct configurations for a diverse population of network devices.
A reinforcement learning (RL) approach to PMA learns an optimal policy from experience, thereby improving the criteria applied at each decision point. At the same time, RL simplifies policy management by consolidating the many permutations of telemetry thresholds into a single entity, called a 'state'.
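As a sketch of this consolidation, the snippet below discretizes continuous telemetry into a single state tuple. The feature set and bin edges are assumptions chosen for illustration, not the state design used in the system studied here.

```python
# Illustrative state consolidation: instead of branching on ad hoc
# threshold rules, continuous telemetry is binned into one discrete
# state the RL agent can reason over. Bin edges are hypothetical.

from typing import Tuple
import bisect

SNR_BINS = [24.0, 30.0, 36.0]   # dB edges -> 4 SNR bins
CER_BINS = [1e-6, 1e-4, 1e-2]   # codeword error rate edges -> 4 bins

def telemetry_to_state(snr_db: float, cer: float) -> Tuple[int, int]:
    """Map telemetry readings to a discrete (snr_bin, cer_bin) state.

    All permutations of threshold crossings collapse into a single
    state space of 4 x 4 = 16 states.
    """
    return (bisect.bisect(SNR_BINS, snr_db), bisect.bisect(CER_BINS, cer))

print(telemetry_to_state(31.5, 5e-5))  # -> (2, 1)
```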
RL improves PMA efficacy by shortening the time needed to transition channels onto optimal, efficient profiles, and by doing so with greater confidence across varying network conditions. Inherent in this implementation is reduced risk for operators in deploying more profile changes that maximize capacity without crossing the boundary at which service is disrupted. This paper introduces a proof-of-concept RL-based PMA system, together with a performance study based on initial experimentation conducted in our laboratory.
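As a preview of the mechanism behind this capacity/disruption trade-off, the following is a minimal tabular Q-learning sketch in which capacity gains are rewarded and disruptive transitions are heavily penalized. The action set, reward values, and hyperparameters are illustrative assumptions, not the parameters of the system evaluated in this paper.

```python
# Minimal tabular Q-learning sketch: reward capacity gains, heavily
# penalize profile changes that disrupt service. All constants are
# hypothetical choices for illustration.

from collections import defaultdict
import random

ACTIONS = ["step_down", "hold", "step_up"]   # move along the profile ladder
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2        # learning rate, discount, exploration

Q = defaultdict(float)                        # Q[(state, action)] -> value

def choose_action(state) -> str:
    """Epsilon-greedy selection: mostly exploit, occasionally explore."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def reward(capacity_gain_mbps: float, disrupted: bool) -> float:
    """Capacity gains pay off only when the change did not disrupt service."""
    return -100.0 if disrupted else capacity_gain_mbps

def update(state, action, r: float, next_state) -> None:
    """Standard Q-learning backup toward observed reward plus bootstrapped value."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (r + GAMMA * best_next - Q[(state, action)])
```

Over repeated interactions, the learned values steer the agent toward efficient profiles while the large disruption penalty keeps it clear of the service-impairing boundary.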