Edinburgh Research Archive

Multimodal and disentangled representation learning for medical image analysis

dc.contributor.advisor
Tsaftaris, Sotirios
en
dc.contributor.advisor
Escudero Rodriguez, Javier
en
dc.contributor.author
Chartsias, Agisilaos
en
dc.date.accessioned
2021-01-18T18:17:48Z
dc.date.available
2021-01-18T18:17:48Z
dc.date.issued
2020-11-30
dc.description.abstract
Automated medical image analysis is a growing research field with various applications in modern healthcare. Furthermore, a multitude of imaging techniques (or modalities) have been developed, such as Magnetic Resonance (MR) and Computed Tomography (CT), to highlight different organ characteristics. Research on image analysis is predominantly driven by deep learning methods due to their demonstrated performance. In this thesis, we argue that their success and generalisation rely on learning good latent representations. We propose methods for learning spatial representations that are suitable for medical image data and can combine information coming from different modalities. Specifically, we aim to improve cardiac MR segmentation, a challenging task due to varied images and limited expert annotations, by considering complementary information present in (potentially unaligned) images of other modalities. In order to evaluate the benefit of multimodal learning, we initially consider a synthesis task on spatially aligned multimodal brain MR images. We propose a deep network of multiple encoders and decoders, which we demonstrate outperforms existing approaches. The encoders (one per input modality) map the multimodal images into modality-invariant spatial feature maps. Common and unique information is combined into a fused representation, which is robust to missing modalities and can be decoded into synthetic images of the target modalities. Different experimental settings demonstrate the benefit of multimodal over unimodal synthesis, although input and output image pairs are required for training. The need for paired images can be overcome with the cycle consistency principle, which we use in conjunction with adversarial training to transform images from one modality (e.g. MR) to images in another (e.g. CT). This is especially useful in cardiac datasets, where different spatial and temporal resolutions make image pairing difficult, if not impossible.
Segmentation can also be considered as a form of image synthesis, if one modality consists of semantic maps. We consider the task of extracting segmentation masks for cardiac MR images, and aim to overcome the challenge of limited annotations by taking into account unannotated images, which are commonly ignored. We achieve this by defining suitable latent spaces, which represent the underlying anatomies (spatial latent variable), as well as the imaging characteristics (non-spatial latent variable). Anatomical information is required for tasks such as segmentation and regression, whereas imaging information can capture variability in intensity characteristics, for example due to different scanners. We propose two models that disentangle cardiac images at different levels: the first extracts the myocardium from the surrounding information, whereas the second fully separates the anatomical from the imaging characteristics. Experimental analysis confirms the utility of disentangled representations in semi-supervised segmentation, and in regression of cardiac indices, while maintaining robustness to intensity variations such as the ones induced by different modalities. Finally, our prior research is aggregated into one framework that encodes multimodal images into disentangled anatomical and imaging factors. Several challenges of multimodal cardiac imaging, such as input misalignments and the lack of expert annotations, are successfully handled in the shared anatomy space. Furthermore, we demonstrate that this approach can be used to combine complementary anatomical information for the purpose of multimodal segmentation. This can be achieved even when no annotations are provided for one of the modalities. This thesis creates new avenues for further research in the area of multimodal and disentangled learning with spatial representations, which we believe are key to more generalised deep learning solutions in healthcare.
en
dc.identifier.uri
https://hdl.handle.net/1842/37483
dc.identifier.uri
http://dx.doi.org/10.7488/era/767
dc.language.iso
en
dc.publisher
The University of Edinburgh
en
dc.relation.hasversion
Chartsias, A., Papanastasiou, G., Wang, C., Semple, S., Newby, D.E., Dharmakumar, R., Tsaftaris, S.A., 2019. Disentangle, align and fuse for multimodal and semi-supervised image segmentation. IEEE Transactions on Medical Imaging
en
dc.relation.hasversion
Chartsias, A., Papanastasiou, G., Wang, C., Stirrat, C., Semple, S., Newby, D.E., Dharmakumar, R., Tsaftaris, S.A., 2019. Multimodal Cardiac Segmentation Using Disentangled Representation Learning. In International Workshop on Statistical Atlases and Computational Models of the Heart (pp. 128-137). Springer, Cham.
en
dc.relation.hasversion
Chartsias, A., Joyce, T., Papanastasiou, G., Semple, S., Williams, M., Newby, D.E., Dharmakumar, R., Tsaftaris, S.A., 2019. Disentangled representation learning in cardiac image analysis. Medical Image Analysis, 58, p.101535.
en
dc.relation.hasversion
Chartsias, A., Joyce, T., Papanastasiou, G., Semple, S., Williams, M., Newby, D.E., Dharmakumar, R., Tsaftaris, S.A., 2018. Factorised spatial representation learning: Application in semi-supervised myocardial segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention (pp. 490-498). Springer, Cham.
en
dc.relation.hasversion
Chartsias, A., Joyce, T., Dharmakumar, R., Tsaftaris, S.A., 2017, September. Adversarial image synthesis for unpaired multi-modal cardiac data. In International Workshop on Simulation and Synthesis in Medical Imaging (pp. 3-13). Springer, Cham.
en
dc.relation.hasversion
Chartsias, A., Joyce, T., Giuffrida, M.V., Tsaftaris, S.A., 2018. Multimodal MR synthesis via modality-invariant latent representation. IEEE Transactions on Medical Imaging, 37(3), pp.803-814.
en
dc.relation.hasversion
Joyce, T., Chartsias, A., Tsaftaris, S.A., 2017. Robust multi-modal MR image synthesis. In International Conference on Medical Image Computing and Computer-Assisted Intervention (pp. 347-355). Springer, Cham.
en
dc.subject
medical imaging analysis
en
dc.subject
automating image analysis
en
dc.subject
multimodal learning
en
dc.subject
intuitive disentanglement
en
dc.subject
multimodal processing
en
dc.subject
image decomposition
en
dc.title
Multimodal and disentangled representation learning for medical image analysis
en
dc.type
Thesis or Dissertation
en
dc.type.qualificationlevel
Doctoral
en
dc.type.qualificationname
PhD Doctor of Philosophy
en

Files

Original bundle

Name:
Chartsias2020.pdf
Size:
15.54 MB
Format:
Adobe Portable Document Format