Delay estimation in computer networks
Johnson, Nicholas Alexander
Computer networks are becoming increasingly large and complex, more so with the recent penetration of the internet into all walks of life. It is essential to be able to monitor and analyse networks in a timely and efficient manner, extracting important metrics and measurements in a way which does not unduly disturb or affect the performance of the network under test. Network tomography is one possible method to accomplish these aims. Drawing upon the principles of statistical inference, it is often possible to determine the statistical properties of either the links or the paths of the network, whichever is desired, by measuring at the most convenient points, thus reducing the effort required. In particular, bottleneck-link detection methods, in which estimates of the delay distributions on network links are inferred from measurements made at end-points on network paths, are examined as a means to determine which links of the network are experiencing the highest delay. Initially two published methods, one based upon a single Gaussian distribution and the other based upon the method-of-moments, are examined by comparing their performance using three metrics: robustness to scaling, bottleneck detection accuracy and computational complexity. Whilst there are many published algorithms, there is little literature in which those algorithms are objectively compared. In this thesis, two network topologies are considered, each with three configurations, in order to determine performance in six scenarios. Two new estimation methods are then introduced, both based on Gaussian mixture models, which are believed to offer an advantage over existing methods in certain scenarios. Computationally, a mixture model algorithm is much more complex than a simple parametric algorithm, but the flexibility in modelling an arbitrary distribution is vastly increased. Better model accuracy potentially leads to more accurate estimation and detection of the bottleneck.
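As an illustration of the idea of inferring link delays from end-to-end path measurements, the following is a minimal sketch and not one of the algorithms evaluated in the thesis. It assumes mutually independent link delays, so that path means and path variances are linear in the corresponding link quantities, and it recovers them by least squares. The routing matrix `A`, the function names and the synthetic exponential delays are all hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical routing matrix: rows are paths, columns are links.
# A[i, j] = 1 when path i traverses link j.
A = np.array([
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
], dtype=float)


def estimate_link_moments(A, path_delays):
    """Least-squares estimate of the per-link delay mean and variance.

    path_delays has shape (n_samples, n_paths). Link delays are assumed
    independent, so path means and variances are sums over the links used.
    """
    path_mean = path_delays.mean(axis=0)
    path_var = path_delays.var(axis=0, ddof=1)
    link_mean, *_ = np.linalg.lstsq(A, path_mean, rcond=None)
    link_var, *_ = np.linalg.lstsq(A, path_var, rcond=None)
    return link_mean, link_var


def simulate(n=20000, seed=0):
    """Generate synthetic path delays (ms); link 2 is the bottleneck."""
    rng = np.random.default_rng(seed)
    true_mean = np.array([2.0, 3.0, 10.0])  # assumed values
    links = rng.exponential(true_mean, size=(n, 3))
    # Each path delay is the sum of the delays on the links it traverses.
    return links @ A.T, true_mean


if __name__ == "__main__":
    delays, true_mean = simulate()
    link_mean, link_var = estimate_link_moments(A, delays)
    print("estimated link means:", link_mean)
    print("suspected bottleneck link:", int(np.argmax(link_mean)))
```

The example matrix is deliberately full rank. In many real topologies the routing matrix is not full column rank, in which case this least-squares step alone does not identify individual links.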
The concept of increasing flexibility is again considered by using a Pearson type-1 distribution as an alternative to the single Gaussian distribution. This increases the flexibility but with reduced complexity when compared with mixture model approaches, which necessitate the use of iterative approximation methods. A hybrid approach is also considered, in which the method-of-moments is combined with the Pearson type-1 method in order to circumvent problems with the output stage of the former. This algorithm has a higher variance than the method-of-moments but its output stage is more convenient for manipulation. Also considered is a new approach to detection algorithms which is not dependent on any a priori parameter selection and makes use of the Kullback-Leibler divergence. The results show that it accomplishes its aim but is not robust enough to replace the current algorithms. Delay estimation is then cast in a different role, as an integral part of an algorithm to correlate input and output streams in an anonymising network such as the onion router (TOR). Users employ TOR in an attempt to conceal their network traffic from observation. Breaking the encryption protocols used is not possible without significant effort, but by correlating the unencrypted input and output streams from the TOR network, it is possible to provide a degree of certainty about the ownership of traffic streams. The delay model is essential as the network is treated as providing a pseudo-random delay to each packet; having an accurate model allows the algorithm to better correlate the streams.
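The Kullback-Leibler divergence mentioned above can be illustrated with a short sketch. The closed form for two univariate Gaussians is standard; the ranking rule built on it here is an assumption made purely for illustration and is not the detection algorithm of the thesis. The function names and the per-link (mean, standard deviation) values are hypothetical.

```python
import math


def kl_gaussian(m0, s0, m1, s1):
    """KL(N(m0, s0^2) || N(m1, s1^2)) in nats; s0 and s1 are standard deviations."""
    return math.log(s1 / s0) + (s0**2 + (m0 - m1) ** 2) / (2 * s1**2) - 0.5


def most_divergent_link(params):
    """Return (index, scores) for the link whose fitted Gaussian is, on
    average, furthest from those of the other links.

    params is a list of (mean, std) pairs, one per link. Averaging the
    divergence to every other link is an illustrative choice only.
    """
    scores = []
    for i, (mi, si) in enumerate(params):
        others = [
            kl_gaussian(mi, si, mj, sj)
            for j, (mj, sj) in enumerate(params)
            if j != i
        ]
        scores.append(sum(others) / len(others))
    best = max(range(len(params)), key=scores.__getitem__)
    return best, scores


if __name__ == "__main__":
    # Hypothetical fitted delay models (ms) for three links.
    fitted = [(2.0, 1.0), (3.0, 1.5), (10.0, 4.0)]
    index, scores = most_divergent_link(fitted)
    print("most divergent link:", index)
```

Note that no threshold is supplied by the caller in this sketch: the flagged link is simply the one with the largest average divergence.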