Design and application of dispersion entropy algorithms for physiological time-series analysis
Changes in the variability of recorded physiological time-series have been connected with transitions in the state of the monitored physiological system. The two primary paradigms describing this connection are the Critical Slow Down (CSD) and the Loss of Complexity (LoC) paradigms. The CSD paradigm holds that, during frail or pathological states, the capacity of the system to recover from external stressors slows down, resulting in increased output complexity for certain regulated variables. The LoC paradigm suggests that, when the equilibrium of a system is disrupted, certain effector variables that displayed multi-scale complexity produce output measurements of reduced variability, indicating a loss in the system’s flexibility and capacity to adapt in the presence of external stressors. To study these changes, entropy has emerged as a prominent metric capable of assessing the nonlinear dynamics and variability of time-series. Consequently, multiple entropy quantification algorithms have been developed for the analysis of time-series. Some of these algorithms are based on Shannon Entropy, such as the Permutation Entropy and Dispersion Entropy (DisEn) algorithms, and others on Conditional Entropy, such as the Sample Entropy and Fuzzy Entropy algorithms. Within the scope of this study, the univariate and multivariate DisEn algorithms, first introduced in 2016 and 2019 respectively, are used as the foundation and benchmark for the introduction of novel algorithmic variations. The DisEn algorithms are selected because they produce features with significant discrimination capacity, take amplitude-based information into consideration, maintain linear computational complexity, and have a functional multivariate variation capable of quantifying cross-channel dynamics.
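For orientation, the univariate DisEn computation referred to above can be sketched as follows. This is a minimal illustration of the commonly described procedure (normal cumulative distribution mapping, assignment to c classes, counting of dispersion patterns of length m with delay d, and Shannon entropy normalized by ln(c^m)), not the implementation used in this thesis. The function name, the default parameters (m=2, c=6, d=1) and the tie-handling in the class assignment are assumptions made for the example.

```python
import math
from collections import Counter


def dispersion_entropy(x, m=2, c=6, d=1):
    """Normalized dispersion entropy of a 1-D sequence (illustrative sketch).

    m: embedding dimension, c: number of classes, d: time delay.
    The defaults are assumptions chosen for this example.
    """
    n = len(x)
    if n <= (m - 1) * d:
        raise ValueError("sequence too short for the chosen m and d")

    mu = sum(x) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in x) / n)
    if sigma == 0:
        # A constant signal yields a single dispersion pattern.
        return 0.0

    # Map every sample to (0, 1) through the normal CDF.
    y = [0.5 * (1.0 + math.erf((v - mu) / (sigma * math.sqrt(2)))) for v in x]

    # Assign each mapped sample to one of the classes 1..c.
    z = [min(c, int(c * v) + 1) for v in y]

    # Count every dispersion pattern of length m with delay d.
    counts = Counter(
        tuple(z[i + j * d] for j in range(m))
        for i in range(n - (m - 1) * d)
    )
    total = sum(counts.values())

    # Shannon entropy of the pattern distribution, normalized by ln(c^m).
    h = -sum((k / total) * math.log(k / total) for k in counts.values())
    return h / (m * math.log(c))
```

Each sample is visited a fixed number of times, which is consistent with the linear computational complexity noted above. Under these assumptions, white Gaussian noise gives values close to 1, whereas a strictly periodic signal gives a much lower value.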
As a first step towards ensuring effective quantification of DisEn during univariate physiological time-series analysis, the effect of missing and outlier samples, which are common occurrences in physiological recordings, is studied and quantified. To improve algorithmic robustness, novel variations of the univariate DisEn algorithm are introduced for the analysis of time-series of low recording quality. The original algorithm and its variations are tested under different experimental setups that are replicated across heart rate variability, electroencephalogram, and respiratory impedance time-series. The analysis indicates that missing samples have a reduced effect on the output DisEn and that the error percentage can be maintained at values lower than 8% with the introduction of a variation that skips invalid values. In contrast to missing samples, outliers have a major disruptive effect, with error percentages in the range of 57% to 73% for the original DisEn algorithm, which are limited to values lower than 22% with the introduction of the respective variations. To expand the study from univariate to multivariate analysis, the multivariate DisEn algorithm is applied to physiological network segments formulated from multi-channel recordings of synchronized electroencephalogram, nasal respiratory, blood pressure, and electrocardiogram signals. The effect of outliers, present across different channels, is quantified for both univariate and multivariate DisEn features. The sensitivity of DisEn features to outliers is utilized for the detection of artifactual network segments using logistic regression classifiers. Two variations of the classifier are deployed in several experimental setups, with the first utilizing solely univariate DisEn features and the second utilizing both univariate and multivariate DisEn features. Noteworthy performance is achieved, with the percentage of correct network segment classifications surpassing 95% in a number of experimental setups for both configurations.
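One way a variation that skips invalid values could be realized is sketched below. The abstract does not specify the mechanism, so the following choices are assumptions made for illustration: missing samples are encoded as NaN, the mean and standard deviation are estimated from the valid samples only, and every embedding vector that contains a missing sample is discarded. The function name is hypothetical.

```python
import math
from collections import Counter


def dispersion_entropy_skip_nan(x, m=2, c=6, d=1):
    """Normalized DisEn that ignores patterns touching a NaN (hypothetical sketch)."""
    valid = [v for v in x if not math.isnan(v)]
    if not valid:
        raise ValueError("no valid samples")

    # Statistics are estimated from the valid samples only (assumption).
    mu = sum(valid) / len(valid)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in valid) / len(valid))
    if sigma == 0:
        return 0.0

    def to_class(v):
        # Missing samples receive no class and are marked with None.
        if math.isnan(v):
            return None
        y = 0.5 * (1.0 + math.erf((v - mu) / (sigma * math.sqrt(2))))
        return min(c, int(c * y) + 1)

    z = [to_class(v) for v in x]

    counts = Counter()
    for i in range(len(z) - (m - 1) * d):
        pattern = tuple(z[i + j * d] for j in range(m))
        # Skip any embedding vector that contains a missing sample.
        if None not in pattern:
            counts[pattern] += 1

    total = sum(counts.values())
    if total == 0:
        raise ValueError("no valid embedding vectors")

    h = -sum((k / total) * math.log(k / total) for k in counts.values())
    return h / (m * math.log(c))
```

With this design the number of counted patterns shrinks as samples go missing, but the surviving patterns are not distorted by imputed values.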
Finally, to improve DisEn quantification during the analysis of multivariate systems for physiological monitoring applications, the framework of Stratified Entropy is introduced. Based on the framework, a set of strata with a clear hierarchy of prioritization is defined. Each channel of an input multi-channel time-series is allocated to a stratum, and its contribution to the output DisEn value is determined by this allocation. Three novel Stratified DisEn algorithms are presented as implementations of the framework, allowing multivariate analysis with controllable contribution from each channel to the output DisEn value. The original algorithm and the novel variations are applied to synthetic time-series consisting of 1/f and white Gaussian noise, waveform physiological time-series, and derived physiological data. The introduced Stratified DisEn variations operate as expected and correctly prioritize the channels allocated to the primary stratum of the hierarchy across all synthetic time-series setups. The results for waveform physiological time-series indicate that some of the novel features extracted through Stratified DisEn achieve effect size increases in the range of 0.2 to 1.4 when discriminating between states of healthy sleep and sleep with obstructive sleep apnea. The derived physiological data results further highlight the increased discrimination capacity of the novel features, with increases in the range of 5% to 30% in the mean absolute difference between values extracted during steady versus stressful physiological states. Furthermore, a decrease in the output DisEn values when moving from a steady to a stressful physiological state is observed during the prioritization of the heart rate channel, in alignment with LoC, providing an example of how Stratified Entropy could be used to test hypotheses based on the CSD and LoC paradigms.
By taking steps towards addressing the challenge of low data quality and by providing a new framework of analysis, this thesis aims to improve the process of assessing and measuring the variability of physiological time-series, leading to the extraction of viable physiological information.