Registration, Semantic Segmentation and Change Detection with Deep Learning in Urban Environments

Παπαδομανωλάκη, Μαρία; Papadomanolaki, Maria

dc.contributor.author	Παπαδομανωλάκη, Μαρία	el
dc.contributor.author	Papadomanolaki, Maria	en
dc.date.accessioned	2022-10-13T07:57:07Z
dc.date.available	2022-10-13T07:57:07Z
dc.identifier.uri	https://dspace.lib.ntua.gr/xmlui/handle/123456789/55905
dc.identifier.uri	http://dx.doi.org/10.26240/heal.ntua.23603
dc.rights	Αναφορά Δημιουργού-Όχι Παράγωγα Έργα 3.0 Ελλάδα	*
dc.rights.uri	http://creativecommons.org/licenses/by-nd/3.0/gr/	*
dc.subject	Remote sensing	en
dc.subject	Deep learning	el
dc.subject	Semantic segmentation	el
dc.subject	Change detection	el
dc.subject	Image registration	el
dc.title	Registration, Semantic Segmentation and Change Detection with Deep Learning in Urban Environments	en
dc.contributor.department	Τηλεπισκόπισης	el
heal.type	doctoralThesis
heal.classification	Deep Learning	en
heal.language	en
heal.access	free
heal.recordProvider	ntua	el
heal.publicationDate	2022-05-26
heal.abstract	In recent years, the wide availability of heterogeneous satellite data has enabled the remote sensing community to investigate numerous research fields in order to observe the earth’s surface in a systematic way. One way to monitor the globe is by comparing properly registered image sequences acquired at different time points. Additionally, semantic maps of single images can contribute to the automatic recognition of a specific region’s layout. In the current thesis, we address the topics of registration, semantic segmentation and change detection, proposing novel data-driven methods mainly focusing on deep learning techniques. Even though many state-of-the-art techniques have obtained promising results on these subjects, the formulation of generic and effective algorithms remains a challenge, due to the diverse nature of satellite images in terms of size and modalities, especially for high and very high resolution images. Our contributions are the following: (i) a novel deep learning semantic segmentation framework where object priors are integrated in order to produce smoother object shapes, (ii) a multi-task encoder-decoder change detection network that exploits not only semantic but also temporal features through fully convolutional LSTMs, proposed for the first time on this problem, and (iii) an automatic, end-to-end trained registration algorithm based on deep learning that directly outputs the deformable transformation parameters, taking into account the regions of change. All our proposed methods have been evaluated on a variety of public and private high and very high resolution optical datasets, demonstrating performance superior to current state-of-the-art methods. The first contribution focuses on the task of urban semantic segmentation, addressing the scattered erroneous pixel predictions and irregular output shapes that conventional deep learning frameworks are reported to produce in the literature. 
Specifically, we present an object-based fully convolutional framework where the satellite images are first segmented into superpixels based on the spectral values of individual pixels as well as the spatial relationships between them. The object information is incorporated into the training process through an additional loss function that assigns the dominant class of each superpixel to all the pixels belonging to that segmented object. This method was thoroughly evaluated on a variety of very high resolution satellite datasets. It was also compared with other patch-based and fully convolutional networks, demonstrating its efficiency both quantitatively and qualitatively. The second contribution is related to urban change detection and tackles the problem of numerous false positive detections, which result from the variety of spectral intensities, rooftop alterations and the limited number of change data samples. Image time-series are processed by a multi-task fully convolutional network, where the tasks of change detection and semantic segmentation are implemented using two separate decoders. The synergy of these two tasks during training benefits both of them, leading to more stable training and better performance. Additionally, the temporal information among the images is captured at different resolutions by positioning fully convolutional LSTM units at each encoding level of the network. In particular, we were the first to introduce the use of fully convolutional LSTM units for this problem, considerably reducing the number of parameters required compared with the traditional LSTMs that were mainly used by the community. The robustness of the method was assessed on high and very high resolution datasets, while other fully convolutional, multi-task and multi-temporal state-of-the-art schemes were employed in order to conduct a detailed comparison. 
In the last contribution, we aim to resolve the geometric distortions of image pairs caused by different off-nadir angles or sensors with dissimilar characteristics, a problem which also affects change detection algorithms. In this study we were the first to propose an end-to-end trained framework for the regression of dense deformations between pairs of very high resolution optical data. In particular, we proposed a novel, data-driven, semi-supervised multi-step approach, where the deformable transformation parameters are the direct output of a fully convolutional network, automatically registering the image pairs through a 2D transformer layer. The images’ edges are also registered during training, in an attempt to better preserve the object shapes. Lastly, the map of changed areas between the pair of images is exploited during training to loosen the registration restrictions in the areas of change, so that the model is not misled by extreme spectral and geometric differences. Experiments were performed using high and very high resolution datasets, comparing the results with other learning-based and non-learning-based techniques from the literature. Solutions are provided for all three aforementioned topics, as they are inextricably related. That is, semantic segmentation can be adopted as an extra task for enhancing the extracted feature information during the training of change detection algorithms. In addition, change detection is essentially a semantic segmentation problem, since the outcome is a semantic map with changed and non-changed areas; hence successful methods emerging in semantic segmentation can provide ideas and inspiration for further evolving change detection techniques. Finally, change detection requires that the image pairs first be properly registered in order to avoid false positive detections caused by geometric alterations. 
In this thesis we aimed to provide solutions that are modular and combine information from these three problems, yielding more efficient, faster and more robust algorithms for the automatic monitoring of the earth’s surface. All the algorithms that were proposed and developed in this thesis have been published and made publicly available to the community.	en
heal.sponsor	ΕΛΚΕ	el
heal.advisorName	Καράντζαλος, Κωνσταντίνος
heal.committeeMemberName	Καράντζαλος, Κωνσταντίνος
heal.committeeMemberName	Αργιαλάς, Δημήτριος
heal.committeeMemberName	Δουλάμης, Αναστάσιος
heal.committeeMemberName	Βακαλοπούλου, Μαρία
heal.committeeMemberName	Κομοντάκης, Νικόλαος
heal.committeeMemberName	Κοντοές, Χαράλαμπος
heal.committeeMemberName	Ιωαννίδης, Χαράλαμπος
heal.academicPublisher	Εθνικό Μετσόβιο Πολυτεχνείο. Σχολή Αγρονόμων και Τοπογράφων Μηχανικών	el
heal.academicPublisherID	ntua
heal.numberOfPages	196
heal.fullTextAvailability	false