On mobile usage data analysis, data-driven network optimization and data synthesis
Item status: Restricted Access
Embargo end date: 30/11/2022
Recent advancements in mobile systems have led to a rapid proliferation of smartphones and smart devices in our daily lives. This has given rise to a plethora of new services and applications for commercial businesses as well as personal end-user consumption. Moreover, the rate at which mobile Internet data is consumed has increased manyfold in recent years. Therefore, in order to optimally serve mobile data consumers and improve the underlying system performance, understanding and characterizing mobile traffic data becomes essential. In the recent past, the mobile networking domain has witnessed several technological innovations, including cloudification and virtualization of radio access networks (RAN), advanced resource orchestration via network slicing, massive MIMO antennas, and millimeter wave. These technological advancements, however, represent a paradigm shift from traditional cellular networks and make the management and orchestration of mobile networks more complex, because the number of decision variables increases. Therefore, it becomes important to develop efficient, scalable, traffic-aware solutions for the optimization of mobile network performance. In this thesis, we take a step forward in mobile network data analysis and network performance optimization. We start by experimenting with a nationwide traffic dataset and build our understanding of mobile traffic usage characteristics at the level of services (i.e., mobile apps). We derive the correlation between service usage and the underlying spatial demographic features. Further, using a data-driven approach, we address the problem of network resource orchestration in the virtualized RAN (vRAN) setting. We propose a distributed and scalable heuristic solution, GreenRAN, to minimize the overall vRAN energy consumption.
A major factor limiting this traffic-driven analytics research is the scarce availability of real-world networking datasets. Access to these datasets is generally restricted, either because of the challenges involved in large-scale database access management or because of privacy issues. To lower this dataset access barrier, we develop a tool, MTGAN, for generating synthetic high-fidelity mobile traffic datasets. Our synthetic traffic generator is a deep neural network based on Conditional Generative Adversarial Networks. Finally, we study and compare different state-of-the-art mechanisms for privacy-preserving data publication. We develop a metric, STRAP, to evaluate the user privacy provided by different privacy-preserving mechanisms and compare them under a common measure of privacy as determined by STRAP.