Several approaches, including Pay per View, Near Video on Demand (NVOD), and Video on Demand (VOD), have been used to deliver video services to customers over cable networks, and a variety of network architectures have been proposed for VOD. This paper models the performance characteristics of different VOD architectures, paying special attention to their scaling properties. To observe fundamental video stream traffic characteristics and the scalability of servers and the transmission infrastructure, we perform simulation experiments on various VOD architectures to reveal which bottlenecks are the most serious. Different VOD architectures assume different locations or types of bottlenecks. Sensitivity analysis is conducted by varying the inputs, including technical ones such as headend locations, content distribution, and streaming mechanisms. The simulations cover different load-balancing scenarios, such as server-load-based and round-robin assignment, and scalability is examined by adding server caching at the local hubs. Failure-mode recovery analysis is also conducted as one of the scenarios to study fault tolerance in VOD networks.
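To make the two load-balancing scenarios concrete, the following is a minimal sketch, not the simulator used in this paper. It assumes each request is an (arrival time, duration) pair and that a server's load is simply its number of active streams; the function names `round_robin`, `least_loaded`, and `simulate` are hypothetical and chosen only for illustration.

```python
from itertools import cycle


def round_robin(num_servers):
    """Return a chooser that cycles through servers in fixed order,
    ignoring their current load (illustrative assumption)."""
    order = cycle(range(num_servers))

    def choose(loads):
        return next(order)

    return choose


def least_loaded(loads):
    """Choose the server with the fewest active streams
    (ties go to the lowest server index)."""
    return min(range(len(loads)), key=lambda i: loads[i])


def simulate(requests, num_servers, choose):
    """Assign each (arrival, duration) request to a server.

    Returns the peak number of concurrent streams seen on each server.
    """
    loads = [0] * num_servers   # active streams per server
    peaks = [0] * num_servers   # highest load observed per server
    active = []                 # (end_time, server) for running streams

    for arrival, duration in sorted(requests):
        # Release streams that have finished by this arrival time.
        still_running = []
        for end, server in active:
            if end <= arrival:
                loads[server] -= 1
            else:
                still_running.append((end, server))
        active = still_running

        # Dispatch the new stream according to the chosen policy.
        server = choose(loads)
        loads[server] += 1
        peaks[server] = max(peaks[server], loads[server])
        active.append((arrival + duration, server))

    return peaks
```

Under these assumptions, calling `simulate(requests, 2, round_robin(2))` and `simulate(requests, 2, least_loaded)` on the same request list allows the peak per-server load of the two policies to be compared directly. When long and short streams are mixed, round robin can stack streams on a busy server while another sits idle, which the load-based policy avoids.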