Academic papers
Recent Publications
Journals
A Single-Step Multiclass SVM based on Quantum Annealing for Remote Sensing Data Classification
Kernel Approximation on a Quantum Annealer for Remote Sensing Regression Tasks
Edoardo Pasetto, Morris Riedel, Kristel Michielsen, Gabriele Cavallaro
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (JSTARS)
The increased development of quantum computing hardware in recent years has led to increased interest in its application to various areas. Finding effective ways to apply this technology to real-world use-cases is a current area of research in the Remote Sensing (RS) community. This paper proposes an Adiabatic Quantum Kitchen Sinks (AQKS) kernel approximation algorithm with parallel quantum annealing on the D-Wave Advantage quantum annealer. The proposed implementation is applied to Support Vector Regression (SVR) and Gaussian Process Regression (GPR) algorithms. To evaluate its performance, a regression problem related to estimating chlorophyll concentration in water is considered. The proposed algorithm was tested on two real-world datasets and its results were compared with those obtained from a classical implementation of kernel-based algorithms and a Random Kitchen Sinks (RKS) implementation. On average, the parallel AQKS achieved comparable results to the benchmark methods, indicating its potential for future applications.
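For readers unfamiliar with the classical Random Kitchen Sinks (RKS) baseline mentioned above, the following is a minimal sketch of how random cosine features can approximate a Gaussian kernel. It is not the paper's AQKS implementation and involves no quantum annealer; the function names, the `gamma` parameter, and the sample points are assumptions made only for illustration.

```python
import math
import random


def rbf_kernel(x, y, gamma):
    """Exact Gaussian (RBF) kernel exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def sample_rks_params(dim, n_features, gamma, rng):
    """Draw random frequencies and phases for the RBF kernel.

    Frequencies are sampled from N(0, 2 * gamma * I), the spectral density
    of the kernel above; phases are uniform on [0, 2 * pi).
    """
    scale = math.sqrt(2.0 * gamma)
    weights = [[rng.gauss(0.0, scale) for _ in range(dim)]
               for _ in range(n_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    return weights, phases


def rks_features(x, weights, phases):
    """Map x to random cosine features z(x) = sqrt(2/D) * cos(W x + b)."""
    norm = math.sqrt(2.0 / len(weights))
    return [
        norm * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
        for w, b in zip(weights, phases)
    ]


def approx_kernel(x, y, weights, phases):
    """Approximate k(x, y) by the inner product of the random features."""
    zx = rks_features(x, weights, phases)
    zy = rks_features(y, weights, phases)
    return sum(a * b for a, b in zip(zx, zy))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
weights, phases = sample_rks_params(dim=2, n_features=2000, gamma=0.5, rng=rng)
x, y = [0.2, -0.1], [0.5, 0.3]
exact = rbf_kernel(x, y, gamma=0.5)
approx = approx_kernel(x, y, weights, phases)
```

With more random features the approximation error shrinks roughly as one over the square root of the feature count, which is why a linear model on these features can stand in for a kernel method.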
Sen4Map: Advancing Mapping with Sentinel-2 by Providing Detailed Semantic Descriptions and Customizable Land-Use and Land-Cover Data
Surbhi Sharma, Rocco Sedona, Morris Riedel, Gabriele Cavallaro, Claudia Paris
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (JSTARS)
This paper presents Sen4Map, a large-scale benchmark dataset designed to enhance the capability of generating land-cover maps using Sentinel-2 data. Comprising non-overlapping 64x64 patches extracted from Sentinel-2 time series images, the dataset spans 335,125 geo-tagged locations across the European Union. These locations are associated with detailed land-cover and land-use information gathered by expert surveyors in 2018. Unlike most existing large datasets available in the literature, the presented database provides: (1) a detailed description of the land-cover and land-use properties of each sampled area; (2) independence of scale, as it is associated with reference data collected in-situ by expert surveyors; (3) the ability to test both temporal and spatial classification approaches because of the availability of time series of 64x64 patches associated with each labeled sample; and (4) a statistically representative spatial distribution of land-cover classes throughout the European Union, as samples were collected following a stratified random sample design. To showcase the properties and challenges offered by Sen4Map, we benchmarked the current state-of-the-art land-cover classification approaches. The dataset and code can be downloaded at: https://datapub.fz-juelich.de/sen4map.
Local Binary and Multiclass SVMs Trained on a Quantum Annealer
Enrico Zardini, Amer Delilbasic, Enrico Blanzieri, Gabriele Cavallaro, Davide Pastorello
IEEE Transactions on Quantum Engineering
Support vector machines (SVMs) are widely used machine learning models, with formulations for both classification and regression tasks. In recent years, with the advent of working quantum annealers, hybrid SVM models characterised by quantum training and classical execution have been introduced. These models have demonstrated comparable performance to their classical counterparts. However, they are limited in the training set size due to the restricted connectivity of the current quantum annealers. Hence, to take advantage of large datasets, a strategy is required. In the classical domain, local SVMs, namely, SVMs trained on the data samples selected by a k-nearest neighbors model, have already proven successful. Here, the local application of quantum-trained SVM models is proposed and empirically assessed. In particular, this approach allows overcoming the constraints on the training set size of the quantum-trained models while enhancing their performance. In practice, the Fast Local Kernel Support Vector Machine (FaLK-SVM) method, designed for efficient local SVMs, has been combined with quantum-trained SVM models for binary and multiclass classification. In addition, for comparison, FaLK-SVM has been interfaced for the first time with a classical single-step multiclass SVM model (CS SVM). Concerning the empirical evaluation, D-Wave's quantum annealers and real-world datasets taken from the remote sensing domain have been employed. The results have shown the effectiveness and scalability of the proposed approach, as well as its practical applicability in a real-world large-scale scenario.
Vectorized Highly Parallel Density-Based Clustering for Applications With Noise
J. Arnold, J. P. G. Hermosillo Muriedas, S. Nassyr, R. Sedona, M. Götz, A. Streit, M. Riedel, G. Cavallaro
IEEE Access
Clustering in data mining involves grouping similar objects into categories based on their characteristics. As the volume of data continues to grow and advancements in high-performance computing evolve, a critical need has emerged for algorithms that can efficiently process these computations and exploit the various levels of parallelism offered by modern supercomputing systems. Exploiting Single Instruction Multiple Data (SIMD) instructions enhances parallelism at the instruction level and minimizes data movement within the memory hierarchy. To fully harness a processor’s SIMD capabilities and achieve optimal performance, adapting algorithms for better compatibility with vector operations is necessary. In this paper, we introduce a vectorized implementation of the Density-based Clustering for Applications with Noise (DBSCAN) algorithm suitable for the execution on both shared and distributed memory systems. By leveraging SIMD, we enhance the performance of distance computations. Our proposed Vectorized HPDBSCAN (VHPDBSCAN) demonstrates a performance improvement of up to two times over the state-of-the-art parallel version, Highly Parallel DBSCAN (HPDBSCAN), on the ARM-based A64FX processor on two different datasets with varying dimensions. We have parallelized computations which are essential for the efficient workload distribution. This has significantly enhanced the performance on higher dimensional datasets. Additionally, we evaluate VHPDBSCAN’s energy consumption on the A64FX and Intel Xeon processors. The results show that in both processors, due to the reduced runtime, the total energy consumption of the application is reduced by 50% on the A64FX Central Processing Unit (CPU) and by approximately 19% on the Intel Xeon 8368 CPU compared to HPDBSCAN.
Few-Shot Remote Sensing Image Classification with Meta-Learning [preprint]
Surbhi Sharma, Ribana Roscher, Morris Riedel, Gabriele Cavallaro
TechRxiv
The performance of machine learning models relies on the quality, quantity, and diversity of annotated remote sensing datasets. However, the expensive effort required to annotate samples from diverse locations around the globe, coupled with the need for higher computation, often leads to models that are less generalizable across different regions. This paper explores the use of few-shot learning with meta-learning to improve the generalization capability of deep learning models on remote sensing image classification problems with limited annotated samples. The experiments show that metric-based meta-learners, such as prototypical and matching networks, provide comparable performance to more complex optimization-based meta-learning approaches like model-agnostic meta-learning and its variations. Few-shot learning with meta-learning can unlock greater generalization capabilities in machine learning models, thereby significantly impacting various remote sensing applications.
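As a rough illustration of the metric-based idea behind prototypical networks mentioned above, the sketch below classifies a query by its nearest class prototype, where each prototype is the mean of that class's support embeddings. It assumes the embeddings have already been produced by some trained network; the function names, class labels, and toy vectors are hypothetical and are not taken from the paper.

```python
def class_prototypes(support_embeddings, support_labels):
    """Average the support embeddings of each class into one prototype."""
    sums, counts = {}, {}
    for vec, label in zip(support_embeddings, support_labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def classify(query_embedding, prototypes):
    """Return the label whose prototype is closest in squared Euclidean distance."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return min(prototypes, key=lambda label: sq_dist(query_embedding, prototypes[label]))


# Toy 2-way, 2-shot episode with made-up 2-D embeddings.
support = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [12.0, 10.0]]
labels = ["water", "water", "urban", "urban"]
prototypes = class_prototypes(support, labels)
prediction = classify([1.0, 1.0], prototypes)
```

Because only the prototypes depend on the support set, a new region with a handful of labeled samples can be handled without retraining the embedding network, which is the property few-shot methods of this kind rely on.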
Deep Learning-based 3D Surface Reconstruction - A Survey
A. Farshian, M. Götz, G. Cavallaro, C. Debus, M. Nießner, J. A. Benediktsson, A. Streit
Proceedings of the IEEE
In the last decade, deep learning has significantly impacted industry and science. Initially largely motivated by computer vision tasks in two-dimensional imagery, the focus has shifted towards three-dimensional data analysis. In particular, 3D surface reconstruction, i.e., reconstructing a three-dimensional shape from sparse input, is of great interest to a large variety of application fields. Deep learning-based approaches show promising quantitative and qualitative surface reconstruction performance compared to traditional computer vision and geometric algorithms. This survey provides a comprehensive overview of these deep learning-based methods for 3D surface reconstruction. To this end, we will first discuss input data modalities, such as volumetric data, point clouds as well as RGB, single-view, multi-view, and depth images, along with corresponding acquisition technologies and common benchmark datasets. For practical purposes, we also discuss evaluation metrics for judging the reconstructive performance of different methods. The main part of the document will introduce a methodological taxonomy ranging from point- and mesh-based techniques to volumetric and implicit neural approaches. Recent research trends, both methodological and for applications, are highlighted, pointing towards future developments.
Enhancing Distributed Neural Network Training Through Node-Based Communications
S. Moreno-Álvarez, M. E. Paoletti, G. Cavallaro and J. M. Haut
IEEE Transactions on Neural Networks and Learning Systems
The amount of data needed to effectively train modern deep neural architectures has grown significantly, leading to increased computational requirements. These intensive computations are tackled by the combination of last generation computing resources, such as accelerators, or classic processing units. Nevertheless, gradient communication remains the major bottleneck, hindering efficiency notwithstanding the improvements in runtimes obtained through data parallelism strategies. Data parallelism involves all processes in a global exchange of a potentially high amount of data, which may impede the achievement of the desired speedup and the elimination of noticeable delays or bottlenecks. As a result, communication latency issues pose a significant challenge that profoundly impacts the performance on distributed platforms. This research presents node-based optimization steps to significantly reduce the gradient exchange between model replicas whilst ensuring model convergence. The proposal serves as a versatile communication scheme, suitable for integration into a wide range of general-purpose deep neural network (DNN) algorithms. The optimization takes into consideration the specific location of each replica within the platform. To demonstrate the effectiveness, different neural network approaches and datasets with disjoint properties are used. In addition, multiple types of applications are considered to demonstrate the robustness and versatility of our proposal. The experimental results show a global training time reduction whilst slightly improving accuracy. Code: https://github.com/mhaut/eDNNcomm.
Toward the Production of Spatiotemporally Consistent Annual Land Cover Maps using Sentinel-2 Time Series
R. Sedona, C. Paris, J. Ebert, M. Riedel, G. Cavallaro
IEEE Geoscience and Remote Sensing Letters (GRSL)
Predicting Classification Performance for Benchmark Hyperspectral Datasets
B. Zhao, H. I. Ragnarsson, M. O. Ulfarsson, G. Cavallaro, J. A. Benediktsson
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
The classification of hyperspectral images (HSIs) is an essential application of remote sensing and it is addressed by numerous publications every year. A large body of these papers present new classification algorithms and benchmark them against established methods on public hyperspectral datasets. The metadata contained in these research papers (i.e., the size of the image, the number of classes, the type of classifier, etc.) present an unexploited source of information that can be used to estimate the performance of classifiers before doing the actual experiments. In this article, we propose a novel approach to investigate to what degree HSIs can be classified by using only metadata. This can guide remote sensing researchers to identify optimal classifiers and develop new algorithms. In the experiments, different linear and nonlinear prediction methods are trained and tested by using data on classification accuracy and metadata from 100 HSIs classification papers. The experimental results demonstrate that the proposed ensemble learning voting method outperforms other comparative methods in quantitative assessments.
Remote Sensing Image Classification Using CNNs With Balanced Gradient for Distributed Heterogeneous Computing
S. Moreno-Álvarez, M. E. Paoletti, G. Cavallaro, J. A. Rico, J. M. Haut
IEEE Geoscience and Remote Sensing Letters
Land-cover classification methods are based on the processing of large image volumes to accurately extract representative features. Particularly, convolutional models provide notable characterization properties for image classification tasks. Distributed learning mechanisms on high-performance computing platforms have been proposed to speed up the processing, while achieving an efficient feature extraction. High-performance computing platforms are commonly composed of a combination of central processing units (CPUs) and graphics processing units (GPUs) with different computational capabilities. As a result, current homogeneous workload distribution techniques for deep learning (DL) become obsolete due to their inefficient use of computational resources. To address this, new computational balancing proposals, such as heterogeneous data parallelism, have been implemented. Nevertheless, these techniques should be improved to handle the peculiarities of working with heterogeneous data workloads in the training of distributed DL models. The objective of handling heterogeneous workloads for current platforms motivates the development of this work. This letter proposes an innovative heterogeneous gradient calculation applied to land-cover classification tasks through convolutional models, considering the data amount assigned to each device in the platform while maintaining the acceleration. Extensive experimentation has been conducted on multiple datasets, considering different deep models on heterogeneous platforms to demonstrate the performance of the proposed methodology.
Quantum SVR for Chlorophyll Concentration Estimation in Water with Remote Sensing
E. Pasetto, M. Riedel, F. Melgani, K. Michielsen, G. Cavallaro
IEEE Geoscience and Remote Sensing Letters (GRSL)
The increasing availability of quantum computers motivates researching their potential capabilities in enhancing the performance of data analysis algorithms. As in other research communities, it is not yet clear in remote sensing (RS) how applications can benefit from the usage of quantum computing (QC). This letter proposes a formulation of the support vector regression (SVR) algorithm that can be executed by D-Wave quantum computers. Specifically, the SVR is mapped to a quadratic unconstrained binary optimization (QUBO) problem that is solved with quantum annealing (QA). The algorithm is tested on two different types of computing environments offered by D-Wave: the Advantage system, which directly embeds the problem into the quantum processing unit (QPU), and a hybrid solver that employs both classical and QC resources. For the evaluation, we considered a biophysical variable estimation problem with RS data. The experimental results show that the proposed quantum SVR implementation can achieve comparable or, in some cases, better results than the classical implementation. This work is one of the first attempts to provide insight into how QA could be exploited and integrated in future RS workflows based on machine learning (ML) algorithms.
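To make the QUBO form referred to above concrete, here is a minimal sketch that evaluates and minimizes a tiny QUBO by exhaustive search on a classical machine. It does not reproduce the letter's SVR-to-QUBO mapping and does not call any D-Wave software; the coefficient dictionary `Q` and the function names are hypothetical and chosen only to show the shape of the problem an annealer is asked to solve.

```python
from itertools import product


def qubo_energy(x, Q):
    """Energy sum_{i<=j} Q[i, j] * x_i * x_j for a binary assignment x."""
    return sum(coeff * x[i] * x[j] for (i, j), coeff in Q.items())


def solve_qubo_brute_force(Q, n_vars):
    """Try all 2**n_vars binary assignments and keep the lowest-energy one.

    Only feasible for very small n_vars; an annealer searches the same
    energy landscape heuristically instead of enumerating it.
    """
    best_x, best_energy = None, float("inf")
    for x in product((0, 1), repeat=n_vars):
        energy = qubo_energy(x, Q)
        if energy < best_energy:
            best_x, best_energy = x, energy
    return best_x, best_energy


# Toy problem: diagonal terms reward setting a bit, the off-diagonal
# term penalizes setting both bits at once.
Q = {(0, 0): -1.0, (1, 1): -2.0, (0, 1): 2.0}
best_x, best_energy = solve_qubo_brute_force(Q, n_vars=2)
```

In this toy case the four assignments have energies 0, -1, -2, and -1, so the search returns the assignment that sets only the second bit.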
Learning from Data for Remote Sensing Image Analysis
Y. Bazi, G. Cavallaro, B. Demir and F. Melgani
International Journal of Remote Sensing
Recent advances in satellite technology have led to a regular, frequent and high-resolution monitoring of Earth at the global scale, providing an unprecedented amount of Earth observation (EO) data. The growing operational capability of global Earth monitoring from space provides a wealth of information on the state of our planet Earth that waits to be mined for several different EO applications, e.g. climate change analysis, urban area studies, forestry applications, risk and damage assessment, water quality assessment, crop monitoring and so on. Recent studies in machine learning have triggered substantial performance gain for the above-mentioned tasks. Advanced machine learning models such as deep convolutional neural networks (CNNs), recursive neural networks and transformers have recently made great progress in a wide range of remote sensing (RS) tasks, such as object detection, RS image classification, image captioning and so on. The study of Bai et al. (2021) analyzes the research progress, hotspots, trends and methods in the field of deep learning in remote sensing, and deep learning is becoming an important tool for remote sensing and has been widely used in numerous remote sensing tasks related to image processing and analysis. In this context, the present special issue aims at gathering a collection of papers in the most advanced and trendy areas dealing with learning from data and with applications to remote sensing image analysis. The manuscripts can be subdivided into five groups depending mainly on the processing or learning task. A specific collection for hyperspectral imagery has been included given the special attention by the remote sensing community to this kind of data.
A High-Performance Multispectral Adaptation GAN for Harmonizing Dense Time Series of Landsat-8 and Sentinel-2 images
R Sedona, C Paris, G Cavallaro, L Bruzzone, M Riedel
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (J-STARS)
The combination of data acquired by Landsat-8 and Sentinel-2 Earth Observation (EO) missions produces dense Time Series (TSs) of multispectral images that are essential for monitoring the dynamics of land-cover and land-use classes across the Earth’s surface with high temporal resolution. However, the optical sensors of the two missions have different spectral and spatial properties, thus they require a harmonization processing step before they can be exploited in Remote Sensing (RS) applications. In this work, we propose a workflow based on a Deep Learning (DL) approach to harmonize these two products, developed and deployed on a High-Performance Computing (HPC) environment. In particular, we use a multispectral Generative Adversarial Network (GAN) with a U-Net generator and a PatchGAN discriminator to integrate existing Landsat-8 TSs with data sensed by the Sentinel-2 mission. We show a qualitative and quantitative comparison with an existing physical method (NASA Harmonized Landsat and Sentinel (HLS)) and analyze original and generated data in different experimental setups with the support of spectral distortion metrics. To demonstrate the effectiveness of the proposed approach, a crop type mapping task is addressed using the harmonized dense TS of images, which achieved an Overall Accuracy (OA) of 87.83% compared to 81.66% of the state-of-the-art method.
Exploration of Machine Learning Methods for the Classification of Infrared Limb Spectra of Polar Stratospheric Clouds
R. Sedona, L. Hoffmann, R. Spang, G. Cavallaro, S. Griessbach, M. Höpfner, M. Book, M. Riedel
Atmospheric Measurement Techniques
Polar stratospheric clouds (PSCs) play a key role in polar ozone depletion in the stratosphere. Improved observations and continuous monitoring of PSCs can help to validate and improve chemistry–climate models that are used to predict the evolution of the polar ozone hole. In this paper, we explore the potential of applying machine learning (ML) methods to classify PSC observations of infrared limb sounders. Two datasets were considered in this study. The first dataset is a collection of infrared spectra captured in Northern Hemisphere winter 2006/2007 and Southern Hemisphere winter 2009 by the Michelson Interferometer for Passive Atmospheric Sounding (MIPAS) instrument on board the European Space Agency's (ESA) Envisat satellite. The second dataset is the cloud scenario database (CSDB) of simulated MIPAS spectra. We first performed an initial analysis to assess the basic characteristics of the CSDB and to decide which features to extract from it. Here, we focused on an approach using brightness temperature differences (BTDs). From both the measured and the simulated infrared spectra, more than 10 000 BTD features were generated. Next, we assessed the use of ML methods for the reduction of the dimensionality of this large feature space using principal component analysis (PCA) and kernel principal component analysis (KPCA) followed by a classification with the support vector machine (SVM). The random forest (RF) technique, which embeds the feature selection step, has also been used as a classifier. All methods were found to be suitable to retrieve information on the composition of PSCs. Of these, RF seems to be the most promising method, being less prone to overfitting and producing results that agree well with established results based on conventional classification methods.
Cloud Deep Networks for Hyperspectral Image Analysis
J. M. Haut, J. A. Gallardo, M. E. Paoletti, G. Cavallaro, J. Plaza, A. Plaza and M. Riedel
IEEE Transactions on Geoscience and Remote Sensing
Advances in remote sensing hardware have led to a significantly increased capability for high-quality data acquisition, which allows the collection of remotely sensed images with very high spatial, spectral, and radiometric resolution. This trend calls for the development of new techniques to enhance the way that such unprecedented volumes of data are stored, processed, and analyzed. An important approach to deal with massive volumes of information is data compression, related to how data are compressed before their storage or transmission. For instance, hyperspectral images (HSIs) are characterized by hundreds of spectral bands. In this sense, high-performance computing (HPC) and high-throughput computing (HTC) offer interesting alternatives. Particularly, distributed solutions based on cloud computing can manage and store huge amounts of data in fault-tolerant environments, by interconnecting distributed computing nodes so that no specialized hardware is needed. This strategy greatly reduces the processing costs, making the processing of high volumes of remotely sensed data a natural and even cheap solution. In this paper, we present a new cloud-based technique for spectral analysis and compression of HSIs. Specifically, we develop a cloud implementation of a popular deep neural network for non-linear data compression, known as autoencoder (AE). Apache Spark serves as the backbone of our cloud computing environment by connecting the available processing nodes using a master-slave architecture. Our newly developed approach has been tested using two widely available HSI data sets. Experimental results indicate that cloud computing architectures offer an adequate solution for managing big remotely sensed data sets.
Remote Sensing Big Data Classification with High Performance Distributed Deep Learning
R. Sedona, G. Cavallaro, J. Jitsev, A. Strube, M. Riedel, J. A. Benediktsson
Remote Sensing
High-Performance Computing (HPC) has recently been attracting more attention in remote sensing applications due to the challenges posed by the increased amount of open data that are produced daily by Earth Observation (EO) programs. The unique parallel computing environments and programming techniques that are integrated in HPC systems are able to solve large-scale problems such as the training of classification algorithms with large amounts of Remote Sensing (RS) data. This paper shows that the training of state-of-the-art deep Convolutional Neural Networks (CNNs) can be efficiently performed in distributed fashion using parallel implementation techniques on HPC machines containing a large number of Graphics Processing Units (GPUs). The experimental results confirm that distributed training can drastically reduce the amount of time needed to perform full training, resulting in near linear scaling without loss of test accuracy.
Parallel Computation of Component Trees on Distributed Memory Machines
M. Goetz, G. Cavallaro, T. Geraud, M. Book and M. Riedel
IEEE Transactions on Parallel and Distributed Systems (TPDS)
Component trees are region-based representations that encode the inclusion relationship of the threshold sets of an image. These representations are one of the most promising strategies for the analysis and the interpretation of spatial information of complex scenes as they allow the simple and efficient implementation of connected filters. This work proposes a new efficient hybrid algorithm for the parallel computation of two particular component trees, the max-tree and the min-tree, in shared and distributed memory environments. For the node-local computation, a modified version of the flooding-based algorithm of Salembier is employed. A novel tuple-based merging scheme allows merging the acquired partial images into a globally correct view. Using the proposed approach, a speed-up of up to 44.88 using 128 processing cores on eight-bit gray-scale images could be achieved. This is more than a five-fold increase over the state-of-the-art shared-memory algorithm, while also requiring only one-thirty-second of the memory.
Automatic Attribute Profiles
G. Cavallaro, N. Falco, M. D. Mura and J. A. Benediktsson
IEEE Transactions on Image Processing (TIP)
Morphological attribute profiles are multilevel decompositions of images obtained with a sequence of transformations performed by connected operators. They have been extensively employed in performing multiscale and region-based analysis in a large number of applications. One main, still unresolved, issue is the selection of filter parameters able to provide representative and non-redundant threshold decomposition of the image. This paper presents a framework for the automatic selection of filter thresholds based on Granulometric Characteristic Functions (GCFs). GCFs describe the way that non-linear morphological filters simplify a scene according to a given measure. Since attribute filters rely on a hierarchical representation of an image (e.g., the Tree of Shapes) for their implementation, GCFs can be efficiently computed by taking advantage of the tree representation. Eventually, the study of the GCFs allows the identification of a meaningful set of thresholds. Therefore, a trial and error approach is not necessary for the threshold selection, automating the process and in turn decreasing the computational time. It is shown that the redundant information is reduced within the resulting profiles (a problem of high occurrence, as regards manual selection). The proposed approach is tested on two real remote sensing data sets, and the classification results are compared with strategies present in the literature.
Integration of LiDAR and Hyperspectral Data for Land-cover Classification: A Case Study
P. Ghamisi, G. Cavallaro, Dan Wu, J. A. Benediktsson and A. Plaza
Computer Vision and Pattern Recognition
In this paper, an approach is proposed to fuse LiDAR and hyperspectral data, which considers both spectral and spatial information in a single framework. Here, an extended self-dual attribute profile (ESDAP) is investigated to extract spatial information from a hyperspectral data set. To extract spectral information, a few well-known classifiers have been used such as support vector machines (SVMs), random forests (RFs), and artificial neural networks (ANNs). The proposed method accurately classifies the relatively volumetric data set in little CPU processing time in a real ill-posed situation where there is no balance between the number of training samples and the number of features. The classification part of the proposed approach is fully automatic.
Remote Sensing Image Classification Using Attribute Filters Defined over the Tree of Shapes
G. Cavallaro, M. Dalla Mura, J. A. Benediktsson and A. Plaza
IEEE Transactions on Geoscience and Remote Sensing (TGRS)
Remotely sensed images with very high spatial resolution provide a detailed representation of the surveyed scene with a geometrical resolution that, at the present, can be up to 30 cm (WorldView-3). A set of powerful image processing operators have been defined in the mathematical morphology framework. Among those, connected operators [e.g., attribute filters (AFs)] have proven their effectiveness in processing very high resolution images. AFs are based on attributes which can be efficiently implemented on tree-based image representations. In this paper, we considered the definition of min, max, direct, and subtractive filter rules for the computation of AFs over the tree-of-shapes representation. We study their performance on the classification of remotely sensed images. We compare the classification results over the tree of shapes with the results obtained when the same rules are applied on the component trees. The random forest is used as a baseline classifier, and the experiments are conducted using multispectral data sets acquired by QuickBird and IKONOS sensors over urban areas.
Extended Self-Dual Attribute Profiles for the Classification of Hyperspectral Images
G. Cavallaro, M. Dalla Mura, J. A. Benediktsson and L. Bruzzone
IEEE Geoscience and Remote Sensing Letters (GRSL)
In this letter, we explore the use of self-dual attribute profiles (SDAPs) for the classification of hyperspectral images. The hyperspectral data are reduced into a set of components by nonparametric weighted feature extraction (NWFE), and a morphological processing is then performed by the SDAPs separately on each of the extracted components. Since the spatial information extracted by SDAPs results in a high number of features, the NWFE is applied a second time in order to extract a fixed number of features, which are finally classified. The experiments are carried out on two hyperspectral images, and the support vector machines and random forest are used as classifiers. The effectiveness of SDAPs is assessed by comparing its results against those obtained by an approach based on extended APs.
On Understanding Big Data Impacts in Remotely Sensed Image Classification Using Support Vector Machine Methods
G. Cavallaro, M. Riedel, M. Richerzhagen, J. A. Benediktsson and A. Plaza
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (J-STARS)
Owing to the recent development of sensor resolutions onboard different Earth observation platforms, remote sensing is an important source of information for mapping and monitoring natural and man-made land covers. Of particular importance are the increasing amounts of available hyperspectral data originating from airborne and satellite sensors such as AVIRIS, HyMap, and Hyperion with very high spectral resolution (i.e., high number of spectral channels) containing rich information for a wide range of applications. A relevant example is the separation of different types of land-cover classes using the data in order to understand, e.g., impacts of natural disasters or changing of city buildings over time. More recently, such increases in the data volume, velocity, and variety of data contributed to the term big data, which stands for challenges shared with many other scientific disciplines. On one hand, the amount of available data is increasing in a way that raises the demand for automatic data analysis elements since many of the available data collections are massively underutilized, lacking experts for manual investigation. On the other hand, proven statistical methods (e.g., dimensionality reduction) driven by manual approaches have a significant impact in reducing the amount of big data toward smaller smart data contributing to the more recently used terms data value and veracity (i.e., less noise, lower dimensions that capture the most important information). This paper aims to take stock of which proven statistical data mining methods in remote sensing are used to contribute to smart data analysis processes in the light of possible automation as well as scalable and parallel processing techniques. We focus on parallel support vector machines (SVMs) as one of the best out-of-the-box classification methods.
Automatic Framework for Spectral–Spatial Classification Based on Supervised Feature Extraction and Morphological Attribute Profiles
P. Ghamisi, J. A. Benediktsson, G. Cavallaro and A. Plaza
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (J-STARS)
Supervised classification plays a key role in terms of accurate analysis of hyperspectral images. Many applications can greatly benefit from the wealth of spectral and spatial information provided by this kind of data, including land-use and land-cover mapping. Conventional classifiers treat hyperspectral images as a list of spectral measurements and do not consider spatial dependencies of the adjacent pixels. To overcome these limitations, classifiers need to use both spectral and spatial information. In this paper, a framework for automatic spectral-spatial classification of hyperspectral images is proposed. In order to extract the spatial information, Extended Multi-Attribute Profiles (EMAPs) are taken into account. In addition, in order to reduce the redundancy of features and address the so-called curse of dimensionality, different supervised feature extraction (FE) techniques are considered. The final classification map is provided by using a random forest classifier. The proposed automatic framework is tested on two widely used hyperspectral data sets: Pavia University and Indian Pines. Experimental results confirm that the proposed framework automatically provides accurate classification maps in acceptable CPU processing times.
Conference Papers
Quantum Annealing for Semantic Segmentation in Remote Sensing: Potential and Limitations
A. Delilbasic, B. Le Saux, M. Riedel, K. Michielsen, G. Cavallaro
Proceedings of the IEEE Mediterranean and Middle-East Geoscience and Remote Sensing Symposium (M2GARSS)
Quantum Annealing (QA) is a powerful method for combinatorial optimization derived from adiabatic quantum computation. The development of computing devices implementing QA accelerated its adoption in practical use cases. In this paper, we summarize the main features and limitations of QA and its application to remote sensing, specifically to semantic segmentation. We provide indications for successfully applying it to real problems, and techniques for improving its performance. This overview can support practitioners in the adoption of this innovative computing technology.
A Hybrid Quantum-Classical CNN Architecture for Semantic Segmentation of Radar Sounder Data
R. Ghosh, A. Delilbasic, G. Cavallaro and F. Bovolo
Proceedings of the IEEE Mediterranean and Middle-East Geoscience and Remote Sensing Symposium (M2GARSS)
The article presents for the first time a hybrid quantum-classical architecture in the context of subsurface target detection in the radar sounder signal. We enhance the classical convolutional neural network (CNN) based architecture by integrating a quantum layer in the latent space. We investigate two quantum circuits with the classical neural networks by exploiting fundamental properties of quantum mechanics such as entanglement and superposition. The proposed hybrid architecture is used for the downstream task of patch-wise semantic segmentation of radar sounder subsurface images. Experimental results on the MCoRDS and MCoRDS3 datasets demonstrated the capability of the hybrid quantum-classical approach for radar sounder information extraction.
Reverse Quantum Annealing for Hybrid Quantum-Classical Satellite Mission Planning
A. Delilbasic, B. Le Saux, M. Riedel, K. Michielsen, G. Cavallaro
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
The trend of building larger and more complex imaging satellite constellations leads to the challenge of managing multiple acquisition requests for the Earth's surface. Optimally planning these acquisitions is an intractable optimization problem, and heuristic algorithms are used today to find sub-optimal solutions. Recently, quantum algorithms have been considered for this purpose due to the potential breakthroughs they can bring in optimization, expecting either a speedup or an increase in solution quality. Hybrid quantum-classical methods have been considered a short-term solution for taking advantage of small quantum machines. In this paper, we propose reverse quantum annealing as a method for improving the acquisition plan obtained by a classical optimizer. We investigate the benefits of the method with different annealing schedules and different problem sizes. The obtained results provide guidelines for designing a larger hybrid quantum-classical framework based on reverse quantum annealing for this application.
Supporting Seismic Data Survey Design Through the Integration of Satellite-Based Land Cover Maps
L. Tian, N. Akram, R. Sedona, N. Savva, M. Riedel, G. Cavallaro, E. Verschuur
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
Seismic imaging (SI) survey design for onshore applications faces challenges such as accessibility and poor data quality due to unexpected (near-)surface conditions. In this paper, we explore the correlation between the surface conditions provided by land-cover (LC) maps generated using remote sensing (RS) data and different settings of seismic processing (SP) parameters. The study involves a 2D seismic line related to geothermal exploration in the Netherlands.
A CNN Architecture Tailored for Quantum Feature Map-Based Radar Sounder Signal Segmentation
R. Ghosh, A. Delilbasic, G. Cavallaro, F. Bovolo
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
This article presents a hybrid quantum-classical framework by incorporating quantum feature maps into a classical Convolutional Neural Network (CNN) architecture for detecting different subsurface targets in radar sounder signals. The quantum feature maps are generated by quantum circuits to utilize spatially-bound input information from the training samples. The associated spectral probabilistic amplitudes of the feature maps are further fed into the classical CNN-based network to classify the subsurface targets in the radargram. Experimental results on the MCoRDS and MCoRDS3 datasets demonstrated the capability of enhancing the classical architecture through quantum feature maps for characterizing radar sounder data.
Enhancing Land Cover Mapping: A Novel Automatic Approach to Improve Mixed Spectral Pixel Classification
S. Sharma, R. Sedona, M. Riedel, G. Cavallaro, C. Paris
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
The increasing availability of high-resolution, open-access satellite data facilitates the production of global land cover (LC) maps, an essential source of information for managing and monitoring natural and human-induced processes. However, the accuracy of the obtained LC maps can be affected by the discrepancy between the spatial resolution of the satellite images and the extent of the LC present in the scene. Indeed, several pixels may be misclassified because of their mixed spectral signatures, i.e., more than two LC classes are present in the pixel. To solve this problem, this paper proposes an approach that explores the possibility of using simple but effective unmixing techniques to enhance the classification accuracy of the mixed spectral pixels. The results showed that several pixels, including buildings and grassland LC, are typically classified as cropland. By unmixing their spectral content, it is possible to extract the most prevalent class within the area of each pixel to update the classification map, thus sharply increasing the map accuracy. These promising preliminary results indicate the potential for broader applicability and efficiency in global LC mapping.
Enhancing Training Set Through Multi-temporal Attention Analysis in Transformers for Multi-Year Land Cover Mapping
R. Sedona, J. Ebert, C. Paris, M. Riedel, G. Cavallaro
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
The continuous stream of high spatial resolution satellite data offers the opportunity to regularly produce land cover (LC) maps. To this end, Transformer deep learning (DL) models have recently proven their effectiveness in accurately classifying long time series (TS) of satellite images. The continual generation of regularly updated LC maps can be used to analyze dynamic phenomena and extract multi-temporal information. However, several challenges need to be addressed. Our paper aims to study how the performance of a Transformer model changes when classifying TS of satellite images acquired in years later than those in the training set. In particular, the behavior of the attention in the Transformer model is analyzed to determine when the information provided by the initial training set needs to be updated to keep generating accurate LC products. Preliminary results show that: (i) the selection of the positional encoding strategy used in the Transformer has a significant impact on the classification accuracy obtained with multi-year TS, and (ii) the most affected classes are the seasonal ones.
Challenges and Opportunities in the Adoption of High Performance Computing for Earth Observation Applications in the Exascale Era
G. Cavallaro, R. Sedona, M. Riedel, A. Lintermann, K. Michielsen
Conference on Big Data from Space (BiDS)
High-Performance Computing (HPC) enables precise analysis of large and complex Earth Observation (EO) datasets. However, the adoption of supercomputing in the EO community faces challenges from the increasing heterogeneity of HPC systems, limited expertise, and the need to leverage novel computing technologies. This paper explores the implications of exascale computing advancements and the inherent heterogeneity of HPC architectures. It highlights EU-supported projects optimizing software development and harnessing the capabilities of heterogeneous HPC configurations. Methodologies addressing challenges of modular supercomputing, large-scale Deep Learning (DL) models, and hybrid quantum-classical algorithms are presented, aiming to enhance the utilization of supercomputing in EO for improved research, industrial applications, and SME support.
End-to-End Process Orchestration of Earth Observation Data Workflows with Apache Airflow on High Performance Computing
L. Tian, R. Sedona, A. Mozaffari, E. Kreshpa, C. Paris, M. Riedel, M. G. Schultz, G. Cavallaro
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
Earth Observation (EO) data processing faces challenges due to large volumes, multiple sources, and diverse formats. To address this issue, this paper presents a scalable and parallelizable workflow using Apache Airflow, capable of integrating Machine Learning (ML) and Deep Learning (DL) models with Modular Supercomputing Architecture (MSA) systems. To test the workflow, we considered the production of large-scale Land-Cover (LC) maps as a case study. The workflow manager, Airflow, offers scalability, extensibility, and programmable task definition in Python. It allows us to execute different steps of the workflow in different High-Performance Computing (HPC) systems. The workflow is demonstrated on the Dynamical Exascale Entry Platform (DEEP) and Jülich Research on Exascale Cluster Architectures (JURECA) hosted at the Jülich Supercomputing Centre (JSC), a platform that incorporates heterogeneous JSC systems.
Adiabatic Quantum Kitchen Sinks With Parallel Annealing For Remote Sensing Regression Problems
Edoardo Pasetto, Morris Riedel, Kristel Michielsen, Gabriele Cavallaro
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
Kernel methods are a class of Machine Learning (ML) models that have been widely employed in the literature for Earth Observation (EO) applications. The increasing development of quantum computing hardware motivates further research to improve the capabilities and the performance of data analysis algorithms. In this manuscript, an implementation of the Adiabatic Quantum Kitchen Sinks (AQKS) kernel estimation algorithm integrated with parallel quantum annealing is presented. This combination with the concept of parallel quantum annealing allows multiple problem instances to be solved in the same annealing cycle, thus reducing the number of required calls to the quantum annealing solver. The proposed workflow is then implemented using a D-Wave Advantage system and tested on a regression problem on a real Remote Sensing (RS) dataset. The obtained results are then analyzed and compared with those obtained by a classical kernel approximation algorithm based on Random Fourier Features.
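For readers unfamiliar with the classical baseline named in this abstract, the following is a minimal sketch of Random Fourier Features approximating a Gaussian (RBF) kernel. It is not the authors' code and does not involve quantum annealing; the function names `make_rff` and `rbf_kernel`, the seed, and the parameter values are assumptions chosen for illustration.

```python
import math
import random


def make_rff(dim, n_features, gamma, seed=0):
    """Build a random feature map z(x) such that z(x).z(y) ~ exp(-gamma * ||x - y||^2).

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    # Frequencies are drawn from N(0, 2*gamma), the spectral density of the RBF kernel.
    sigma = math.sqrt(2.0 * gamma)
    weights = [[rng.gauss(0.0, sigma) for _ in range(dim)] for _ in range(n_features)]
    # Random phase offsets drawn uniformly from [0, 2*pi].
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def feature_map(x):
        return [
            scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
            for w, b in zip(weights, offsets)
        ]

    return feature_map


def rbf_kernel(x, y, gamma):
    """Exact Gaussian kernel, used here only as a reference value."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))
```

With a few thousand random features, `dot(phi(x), phi(y))` for `phi = make_rff(...)` should land close to `rbf_kernel(x, y, gamma)`, and the approximation error shrinks as the number of features grows.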
Accuracy Assessment of Land-Use-Land-Cover Maps: the Semantic Gap between in Situ and Satellite Data
C. Paris, L. Martinez-Sanchez, M. van der Velde, S. Sharma, R. Sedona, G. Cavallaro
Proceedings Volume 12733, Image and Signal Processing for Remote Sensing XXIX, SPIE Remote Sensing
The availability of high-resolution, open, and free satellite data has facilitated the production of global Land-Use-Land-Cover (LULC) maps, which are extremely important to monitor the Earth’s surface constantly. However, generating these maps demands significant efforts in collecting a vast amount of data to train the classifier and to assess their accuracy. Although in-situ surveys are generally regarded as reliable sources of information, it is important to note that there may be inconsistencies between the in-situ data and the information derived from satellite data. This can be attributed to various factors: (1) differences in viewpoint perspectives, i.e., aerial versus ground views, and (2) spatial resolution of the satellite images versus the extent of the Land-Cover (LC) present in the scene. The aim of this paper is to explore the feasibility of using geo-referenced street-level imagery to bridge the gap between information provided by field surveys and satellite data. Unlike conventional in-situ surveys that typically provide geo-tagged location-specific information on LULC, street-level images offer a richer semantic context for the sampling point under examination. This allows for (1) an improved interpretation of LC characteristics, and (2) a stronger correlation with satellite data. The experimental analysis was conducted considering the 2018 Land Use and Coverage Area Frame Survey (LUCAS) in-situ data, the LUCAS landscape (street-level) images and three high-resolution thematic products derived from satellite data, namely, Google’s Dynamic World, ESA’s World Cover, and Esri’s Land Cover maps.
Practice and Experience using High Performance Computing and Quantum Computing to Speed-up Data Science Methods in Scientific Applications
M. Riedel, M. Book, H. Neukirchen, G. Cavallaro, A. Lintermann
45th Jubilee International Convention on Information, Communication and Electronic Technology (MIPRO)
High-Performance Computing (HPC) can quickly process scientific data and perform complex calculations at extremely high speeds. A vast increase in HPC use across scientific communities is observed, especially in using parallel data science methods to speed up scientific applications. HPC enables scaling up machine and deep learning algorithms that inherently solve optimization problems. More recently, the field of quantum machine learning evolved as another HPC related approach to speed up data science methods. This paper will address primarily traditional HPC and partly the new quantum machine learning aspects, whereby the latter specifically focuses on our experiences in using quantum annealing at the Juelich Supercomputing Centre (JSC). Quantum annealing is particularly effective for solving optimization problems like those that are inherent in machine learning methods. We contrast these new experiences with our lessons learned from using many parallel data science methods with a high number of Graphics Processing Units (GPUs). That includes modular supercomputers such as JUWELS, the fastest European supercomputer at the time of writing. Apart from practice and experience with HPC co-design applications, technical challenges and solutions are discussed, such as using interactive access via JupyterLab on typical batch-oriented HPC systems or enabling distributed training tools for deep learning on our HPC systems.
Accelerating Hyperparameter Tuning of a Deep Learning Model for Remote Sensing Image Classification
M. Aach, R. Sedona, A. Lintermann, G. Cavallaro, H. Neukirchen, M. Riedel
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
Deep Learning models have proven necessary in dealing with the challenges posed by the continuous growth of data volume acquired from satellites and the increasing complexity of new Remote Sensing applications. To obtain the best performance from such models, it is necessary to fine-tune their hyperparameters. Since the models might have massive amounts of parameters that need to be tuned, this process requires many computational resources. In this work, a method to accelerate hyperparameter optimization on a High-Performance Computing system is proposed. The data batch size is increased during the training, leading to a more efficient execution on Graphics Processing Units (GPUs). The experimental results confirm that this method reduces the runtime of the hyperparameter optimization step by a factor of 3 while achieving the same validation accuracy as a standard training procedure with a fixed batch size.
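The idea of growing the batch size during training can be illustrated with a small schedule function. This is a sketch under stated assumptions, not the schedule used in the paper: the function name `batch_size_schedule`, the doubling rule, the interval of 10 epochs, and the cap of 1024 are all hypothetical.

```python
def batch_size_schedule(epoch, initial=64, growth=2, every=10, maximum=1024):
    """Return the batch size to use at a given epoch (hypothetical schedule).

    The batch size starts at `initial`, is multiplied by `growth` every
    `every` epochs, and never exceeds `maximum`.
    """
    size = initial * growth ** (epoch // every)
    return min(size, maximum)
```

A training loop would call `batch_size_schedule(epoch)` at the start of each epoch and rebuild its data loader with the returned value, so later epochs process more samples per step.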
An Automatic Approach for the production of a Time Series of Consistent Land-cover Maps Based on Long-short Term Memory
R. Sedona, C. Paris, L. Tian, M. Riedel, G. Cavallaro
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
This paper presents an approach that aims to produce a Time-Series (TS) of consistent Land-Cover (LC) maps, typically needed to perform environmental monitoring. First, it creates an annual training set for each TS to be classified, leveraging publicly available thematic products. These annual training sets are then used to generate a set of preliminary LC maps that allow for the identification of the unchanged areas, i.e., the stable temporal component. Such areas can be used to define an informative and reliable multi-year training set, by selecting samples belonging to the different years for all the classes. The multi-year training set is finally employed to train a unique multi-year Long Short Term Memory (LSTM) model, which enhances the consistency of the annual LC maps. The preliminary results carried out on three TSs of Sentinel 2 images acquired in Italy in 2018, 2019 and 2020 demonstrate the capability of the method to improve the consistency of the annual LC maps. The agreement of the obtained maps is ≈ 78%, compared to the ≈ 74% achieved by the LSTM models trained separately.
Optimizing Distributed Deep Learning in Heterogeneous Computing Platforms for Remote Sensing Data Classification
S. M. Álvarez, M. E. Paoletti Ávila, J. A. Rico Gallego, G. Cavallaro, J. M. Haut
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
Applications from Remote Sensing (RS) unveiled unique challenges to Deep Learning (DL) due to the high volume and complexity of their data. On the one hand, deep neural network architectures have the capability to automatically extract informative features from RS data. On the other hand, these models have massive amounts of tunable parameters, requiring high computational capabilities. Distributed DL with data parallelism on High-Performance Computing (HPC) systems has proved necessary in dealing with the demands of DL models. Nevertheless, a single HPC system can be already highly heterogeneous and include different computing resources with uneven processing power. In this context, a standard data parallelism strategy does not partition the data efficiently according to the available computing resources. This paper proposes an alternative approach to compute the gradient, which guarantees that the contribution to the gradient calculation is proportional to the processing speed of each DL model’s replica. The experimental results are obtained in a heterogeneous HPC system with RS data and demonstrate that the proposed approach provides a significant training speed up and gain in the global accuracy compared to one of the state-of-the-art distributed DL frameworks.
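One simple way to make each replica's contribution proportional to its processing speed is to size its share of the global batch by relative speed and then average the gradients weighted by sample count. The sketch below illustrates that general idea only; it is not the method of the paper, and the helper names `split_batch` and `combine_gradients` as well as the example speeds are assumptions.

```python
def split_batch(global_batch, speeds):
    """Assign each replica a share of the global batch proportional to its speed."""
    total = sum(speeds)
    sizes = [int(global_batch * s / total) for s in speeds]
    # Rounding down can leave a few samples unassigned; give them to the fastest replicas.
    leftover = global_batch - sum(sizes)
    fastest_first = sorted(range(len(speeds)), key=lambda i: speeds[i], reverse=True)
    for i in fastest_first[:leftover]:
        sizes[i] += 1
    return sizes


def combine_gradients(gradients, sizes):
    """Average per-replica gradients, weighting each by the samples it processed."""
    total = sum(sizes)
    n_params = len(gradients[0])
    return [
        sum(g[k] * s for g, s in zip(gradients, sizes)) / total
        for k in range(n_params)
    ]
```

For two replicas where the second is three times faster, `split_batch(100, [1.0, 3.0])` yields shares of 25 and 75 samples, and `combine_gradients` then weights the second replica's gradient three times as heavily as the first.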
Quantum Support Vector Regression for Biophysical Variable Estimation in Remote Sensing
E. Pasetto, A. Delilbasic, G. Cavallaro, M. Willsch, F. Melgani, M. Riedel, K. Michielsen
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
Regression analysis has a crucial role in many Earth Observation (EO) applications. The increasing availability and recent development of new computing technologies motivate further research to expand the capabilities and enhance the performance of data analysis algorithms. In this paper, the biophysical variable estimation problem is addressed. A novel approach is proposed, which consists in a reformulated Support Vector Regression (SVR) and leverages Quantum Annealing (QA). In particular, the SVR optimization problem is reframed to a Quadratic Unconstrained Binary Optimization (QUBO) problem. The algorithm is then tested on the D-Wave Advantage quantum annealer. The experiments presented in this paper show good results, despite current hardware limitations, suggesting that this approach is viable and has great potential.
Improving Generalization for Few-Shot Remote Sensing Classification with Meta-learning
S. Sharma, R. Roscher, M. Riedel, S. Memon, G. Cavallaro
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
In Remote Sensing (RS) classification, generalization ability is one of the measures that characterize the success of Machine Learning (ML) models, but it is often impeded by the scarce availability of annotated training data. Annotated RS samples are expensive to obtain and can present large disparities when produced by different annotators. In this paper, we utilize Few-Shot Learning (FSL) with meta-learning to address the challenge of generalization using a limited amount of training information. The data used in this paper is leveraged from different datasets that have diverse distributions, i.e., distinct feature spaces. We tested our approach on publicly available RS benchmark datasets to perform few-shot RS image classification using meta-learning. The results of the experiments suggest that our approach is able to generalize well on the unseen data even with a limited number of training samples and reasonable training time.
Hybrid Quantum-Classical Workflows in Modular Supercomputing Architectures with the Jülich Unified Infrastructure for Quantum Computing
G. Cavallaro, M. Riedel, T. Lippert, K. Michielsen
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
The implementation of scalable processing workflows is essential to improve the access to and analysis of the vast amount of high-resolution and multi-source Remote Sensing (RS) data and to provide decision-makers with timely and valuable information. The Modular Supercomputing Architecture (MSA) systems that are operated by the Jülich Supercomputing Centre (JSC) are a concrete solution for data-intensive RS applications that rely on big data storage and processing capabilities. To meet the requirements of applications with more complex computational tasks, JSC plans to connect the High Performance Computing (HPC) systems of its MSA environment to different quantum computers via the Jülich UNified Infrastructure for Quantum computing (JUNIQ). The paper describes this unique computing environment and highlights its potential to address real RS application scenarios through high-performance and hybrid quantum-classical processing workflows.
Quantum Support Vector Machine Algorithms for Remote Sensing Data Classification
A. Delilbasic, G. Cavallaro, M. Willsch, F. Melgani, M. Riedel and K. Michielsen
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
Recent developments in Quantum Computing (QC) have paved the way for an enhancement of computing capabilities. Quantum Machine Learning (QML) aims at developing Machine Learning (ML) models specifically designed for quantum computers. The availability of the first quantum processors enabled further research, in particular the exploration of possible practical applications of QML algorithms. In this work, quantum formulations of the Support Vector Machine (SVM) are presented. Then, their implementation using existing quantum technologies is discussed and Remote Sensing (RS) image classification is considered for evaluation.
Practice and Experience in using Parallel and Scalable Machine Learning in Remote Sensing from HPC over Clouds to Quantum Computing
M. Riedel, G. Cavallaro and J. A. Benediktsson
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
Using computationally efficient techniques for transforming the massive amount of Remote Sensing (RS) data into scientific understanding is critical for Earth science. The utilization of efficient techniques through innovative computing systems in RS applications has become more widespread in recent years. The continuously increased use of Deep Learning (DL) as a specific type of Machine Learning (ML) for data-intensive problems (i.e., ’big data’) requires powerful computing resources with equally increasing performance. This paper reviews recent advances in High-Performance Computing (HPC), Cloud Computing (CC), and Quantum Computing (QC) applied to RS problems. It thus represents a snapshot of the state-of-the-art in ML in the context of the most recent developments in those computing areas, including our lessons learned over the last years. Our paper also includes some recent challenges and good experiences by using Europe's fastest supercomputer for hyper-spectral and multi-spectral image analysis with state-of-the-art data analysis tools. It offers a thoughtful perspective of the potential and emerging challenges of applying innovative computing paradigms to RS problems.
Enhancing Large Batch Size Training of Deep Models for Remote Sensing Applications
R. Sedona, G. Cavallaro, M. Riedel and M. Book
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
A wide variety of Remote Sensing (RS) missions are continuously acquiring a large volume of data every day. The availability of large datasets has propelled Deep Learning (DL) methods also in the RS domain. Convolutional Neural Networks (CNNs) have become the state of the art when tackling the classification of images; however, the process of training is time consuming. In this work, we exploit the Layer-wise Adaptive Moments optimizer for Batch training (LAMB) to use large batch size training on High-Performance Computing (HPC) systems. With the use of LAMB combined with learning rate scheduling and warm-up strategies, the experimental results on RS data classification demonstrate that a ResNet50 can be trained faster with batch sizes up to 32K.
Practice and Experience in using Parallel and Scalable Machine Learning with Heterogenous Modular Supercomputing Architectures
M. Riedel, R. Sedona, C. Barakat, P. Einarsson, R. Hassanian, G. Cavallaro, M. Book, H. Neukirchen and A. Lintermann
Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
We observe a continuously increased use of Deep Learning (DL) as a specific type of Machine Learning (ML) for data-intensive problems (i.e., ’big data’) that requires powerful computing resources with equally increasing performance. Consequently, innovative heterogeneous High-Performance Computing (HPC) systems based on multi-core CPUs and many-core GPUs require an architectural design that addresses end user communities’ requirements that take advantage of ML and DL. Still, the workloads of end user communities of the simulation sciences (e.g., using numerical methods based on known physical laws) need to be equally supported in those architectures. This paper offers insights into the Modular Supercomputer Architecture (MSA) developed in the Dynamic Exascale Entry Platform (DEEP) series of projects to address the requirements of both simulation sciences and data-intensive sciences such as High Performance Data Analytics (HPDA). It shares insights into implementing the MSA in the Jülich Supercomputing Centre (JSC) hosting Europe's No. 1 supercomputer, the Jülich Wizard for European Leadership Science (JUWELS). We augment the technical findings with experience and lessons learned from two application community case studies (i.e., remote sensing and health sciences) using the MSA with JUWELS and the DEEP systems in practice. Thus, the paper provides details into specific MSA design elements that enable significant performance improvements of ML and DL algorithms. While this paper focuses on MSA-based HPC systems and application experience, we are not losing sight of advances in Cloud Computing (CC) and Quantum Computing (QC) relevant for ML and DL.
Design and Evaluation of an HPC-based Expert System to speed-up Retail Data Analysis using Residual Networks Combined with Parallel Association Rule Mining and Scalable Recommenders
C. Barakat, M. Riedel, S. Brynjólfsson, G. Cavallaro, J. Busch, R. Sedona
44th International Convention on Information, Communication and Electronic Technology (MIPRO)
Given the Covid-19 pandemic, the retail industry shifts many business models to enable more online purchases that produce large transaction data quantities (i.e., big data). Data science methods infer seasonal trends about products from this data and spikes in purchases, the effectiveness of advertising campaigns, or brand loyalty but require extensive processing power leveraging High-Performance Computing to deal with large transaction datasets. This paper proposes a High-Performance Computing-based expert system architectural design tailored for ‘big data analysis’ in the retail industry, providing data science methods and tools to speed up the data analysis with conceptual interoperability to commercial cloud-based services. Our expert system leverages an innovative Modular Supercomputer Architecture to enable the fast analysis by using parallel and distributed algorithms such as association rule mining (i.e., FP-Growth) and recommender methods (i.e., collaborative filtering). It enables the seamless use of accelerators of supercomputers or cloud-based systems to perform automated product tagging (i.e., residual deep learning networks for product image analysis) to automatically obtain colour, shape, and other product features. We validate our expert system and its enhanced knowledge representation with commercial datasets obtained from our ON4OFF research project in a retail case study in the beauty sector.
JUWELS Booster–A Supercomputer for Large-Scale AI Research
J. Jitsev, M. Cherti, M. Langguth, B. Gong, S. Stadtler, A. Mozaffari, G. Cavallaro, R. Sedona, A. Schug, A. Strube, R. Kamath, M. G. Schultz, M. Riedel, T. Lippert
High Performance Computing: ISC High Performance Digital 2021 International Workshops
In this article, we present JUWELS Booster, a recently commissioned high-performance computing system at the Jülich Supercomputing Center. With its system architecture, most importantly its large number of powerful Graphics Processing Units (GPUs) and its fast interconnect via InfiniBand, it is an ideal machine for large-scale Artificial Intelligence (AI) research and applications. We detail its system architecture, parallel, distributed model training, and benchmarks indicating its outstanding performance. We exemplify its potential for research application by presenting large-scale AI research highlights from various scientific fields that require such a facility.
Approaching Remote Sensing Image Classification with Ensembles of Support Vector Machines on the D-WAVE Quantum Annealer
G. Cavallaro, D. Willsch, M. Willsch, K. Michielsen and M. Riedel
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
Support Vector Machine (SVM) is a popular supervised Machine Learning (ML) method that is widely used for classification and regression problems. Recently, a method to train SVMs on a D-Wave 2000Q Quantum Annealer (QA) was proposed for binary classification of some biological data. First, ensembles of weak quantum SVMs are generated by training each classifier on a disjoint training subset that can be fit into the QA. Then, the computed weak solutions are fused for making predictions on unseen data. In this work, the classification of Remote Sensing (RS) multispectral images with SVMs trained on a QA is discussed. Furthermore, an open code repository is released to facilitate an early entry into the practical application of QA, a new disruptive compute technology.
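The ensemble idea in the abstract above (train weak classifiers on disjoint training subsets, then fuse their decisions) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the weak learner is a simple nearest-centroid classifier standing in for the quantum SVM, the fusion rule is a plain majority vote, and the function names and toy data are hypothetical. It is not the code from the released repository.

```python
def split_disjoint(samples, n_subsets):
    """Partition the training set into n_subsets disjoint, interleaved subsets."""
    return [samples[k::n_subsets] for k in range(n_subsets)]

def train_weak(subset):
    """Stand-in weak learner (NOT a quantum SVM): one centroid per class label."""
    groups = {}
    for x, y in subset:
        groups.setdefault(y, []).append(x)
    return {y: [sum(col) / len(xs) for col in zip(*xs)] for y, xs in groups.items()}

def weak_predict(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean distance)."""
    return min(centroids, key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])))

def ensemble_predict(models, x):
    """Fuse the weak decisions by majority vote over labels in {-1, +1}."""
    votes = sum(weak_predict(m, x) for m in models)
    return 1 if votes >= 0 else -1

# Toy, deterministic two-class data: ((feature_1, feature_2), label)
samples = []
for i in range(40):
    label = 1 if i % 2 == 0 else -1
    offset = (i % 7 - 3) * 0.1
    samples.append(((2 * label + offset, 2 * label - offset), label))

subsets = split_disjoint(samples, n_subsets=5)   # each subset is small enough to "fit"
models = [train_weak(s) for s in subsets]        # one weak model per subset
print(ensemble_predict(models, (1.5, 2.5)), ensemble_predict(models, (-1.0, -3.0)))
```

In this sketch the subset size plays the role of the hardware limit mentioned in the abstract: each weak model only ever sees the samples of its own subset.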
Super-Resolution of Large Volumes of Sentinel-2 Images with High Performance Distributed Deep Learning
R. Zhang, G. Cavallaro and J. Jitsev
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
This work proposes a novel distributed deep learning model for RS image super-resolution. HPC systems with GPUs are used to accelerate the learning of the unknown low-to-high-resolution mapping from large volumes of Sentinel-2 data. The proposed deep learning model is based on a self-attention mechanism and residual learning. The results demonstrate that state-of-the-art performance can be achieved while keeping the size of the model relatively small. Synchronous data parallelism is applied to scale up the training process without severe performance loss. Distributed training is thus shown to speed up learning substantially while keeping performance intact.
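As a rough illustration of synchronous data parallelism, the sketch below splits a toy dataset into equal shards, lets each simulated worker compute a gradient on its own shard, averages the gradients (the role an all-reduce plays on a real cluster), and applies the same update everywhere. The 1-D linear model, the function names, and the sequential "workers" are assumptions for illustration only and are unrelated to the paper's actual model or framework.

```python
def gradient(w, shard):
    """Gradient of the mean squared error for the 1-D model y = w * x on one shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)

def all_reduce_mean(values):
    """Stand-in for a collective all-reduce: every worker receives the mean."""
    return sum(values) / len(values)

def sync_data_parallel_step(w, shards, lr):
    """One synchronous step: local gradients, global average, identical update."""
    local_grads = [gradient(w, shard) for shard in shards]  # one per worker
    return w - lr * all_reduce_mean(local_grads)

def train(data, n_workers, lr=0.01, steps=200):
    """Train with equally sized, interleaved shards (assumes len(data) % n_workers == 0)."""
    shards = [data[k::n_workers] for k in range(n_workers)]
    w = 0.0
    for _ in range(steps):
        w = sync_data_parallel_step(w, shards, lr)
    return w

data = [(float(x), 3.0 * x) for x in range(1, 9)]  # toy data generated with w = 3
print(train(data, n_workers=4), train(data, n_workers=1))
```

With equally sized shards, the average of the per-worker gradients equals the full-batch gradient, which is why the synchronous scheme follows the same trajectory as single-worker training up to floating-point rounding.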
Scaling up a Multispectral Resnet-50 to 128 GPUs
R. Sedona, G. Cavallaro, J. Jitsev, A. Strube, M. Riedel and M. Book
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
As in other scientific domains, Deep Learning (DL) holds great promise for fulfilling the challenging needs of Remote Sensing (RS) applications. However, the increase in volume, variety and complexity of acquisitions that are carried out on a daily basis by Earth Observation (EO) missions generates new processing and storage challenges within operational processing pipelines. The aim of this work is to show that High-Performance Computing (HPC) systems can speed up the training of Convolutional Neural Networks (CNNs). Particular attention is paid to monitoring the classification accuracy, which usually degrades when using large batch sizes. The experimental results of this work show that the training of the model scales up to a batch size of 8,000, obtaining classification accuracy in line with that obtained using smaller batch sizes.
Multi-Scale Convolutional SVM Networks for Multi-Class Classification Problems of Remote Sensing Images
G. Cavallaro, Y. Bazi, F. Melgani and M. Riedel
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
The classification of land-cover classes in remote sensing images can suit a variety of interdisciplinary applications such as the interpretation of natural and man-made processes on the Earth's surface. The Convolutional Support Vector Machine (CSVM) network was recently proposed as a binary classifier for the detection of objects in Unmanned Aerial Vehicle (UAV) images. The training phase of the CSVM is based on convolutional layers that learn the kernel weights via a set of linear Support Vector Machines (SVMs). This paper proposes the Multi-scale Convolutional Support Vector Machine (MCSVM) network, which is an ensemble of CSVM classifiers that process patches of different spatial sizes and can deal with multi-class classification problems. The experiments are carried out on the EuroSAT Sentinel-2 dataset and the results are compared to those obtained with recent transfer learning approaches based on pre-trained Convolutional Neural Networks (CNNs).
Scalable workflows for Remote Sensing Data Processing with the DEEP-EST Modular Supercomputing Architecture
E. Erlingsson, G. Cavallaro, H. Neukirchen and M. Riedel
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
The implementation of efficient remote sensing workflows is essential to improve the access to and analysis of the vast amount of sensed data and to provide decision-makers with clear, timely, and useful information. The Dynamical Exascale Entry Platform (DEEP) is a European pre-exascale platform that incorporates heterogeneous High-Performance Computing (HPC) systems, i.e., hardware modules which include specialised accelerators. This paper demonstrates the potential of such diverse modules for the deployment of remote sensing data workflows that include diverse processing tasks. Particular focus is put on pipelines which can use the Network Attached Memory (NAM), which is a novel supercomputer module that allows near processing and/or fast shared storage of big remote sensing datasets.
Remote Sensing Data Analytics with the Udocker Container Tool using Multi-GPU Deep Learning Systems
G. Cavallaro, V. Kozlov, M. Götz and M. Riedel
Proceedings of the Conference on Big Data from Space (BiDS)
Multi-GPU systems are in continuous development to deal with the challenges of computationally intensive big data problems. On the one hand, parallel architectures provide a tremendous computation capacity and outstanding scalability. On the other hand, the production path in multi-user environments faces several roadblocks, since such environments do not grant root privileges to the users. Containers provide flexible strategies for packing, deploying and running isolated application processes within multi-user systems and enable scientific reproducibility. This paper describes the usage and advantages that the uDocker container tool offers for the development of deep learning models in the described context. The experimental results show that uDocker is more transparent to deploy for less tech-savvy researchers and allows the application to achieve processing times with negligible overhead compared to an uncontainerized environment.
Automated Analysis of Remotely Sensed Images Using the UNICORE Workflow Management System
M. Shahbaz, G. Cavallaro, B. Hagemeier, M. Riedel and H. Neukirchen
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
The progress of remote sensing technologies leads to an increased supply of high-resolution image data. However, solutions for processing large volumes of data are lagging behind: desktop computers can no longer cope with the requirements of macro-scale remote sensing applications; therefore, parallel methods running in High-Performance Computing (HPC) environments are essential. Managing an HPC processing pipeline is non-trivial for a scientist, especially when the computing environment is heterogeneous and the set of tasks has complex dependencies. This paper proposes an end-to-end scientific workflow approach based on the UNICORE workflow management system for automating the full chain of Support Vector Machine (SVM)-based classification of remotely sensed images. The high-level nature of UNICORE workflows makes it possible to deal with the heterogeneity of HPC computing environments and offers powerful workflow operations such as those needed for parameter sweeps. As a result, the remote sensing workflow of SVM-based classification becomes re-usable across different computing environments, thus increasing usability and reducing effort for a scientist.
The Influence of Sampling Methods on Pixel-Wise Hyperspectral Image Classification with 3D Convolutional Neural Networks
J. Lange, G. Cavallaro, M. Götz, E. Erlingsson and M. Riedel
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
Supervised image classification is one of the essential techniques for generating semantic maps from remotely sensed images. The lack of labeled ground truth datasets, due to the inherent time effort and cost involved in collecting training samples, has led to the practice of training and validating new classifiers within a single image. In line with that, the dominant approach for the division of the available ground truth into disjoint training and test sets is random sampling. This paper discusses the problems that arise when this strategy is adopted in conjunction with spectral-spatial and pixel-wise classifiers such as 3D Convolutional Neural Networks (3D CNN). It is shown that a random sampling scheme leads to a violation of the independence assumption and to the illusion that global knowledge is extracted from the training set. To tackle this issue, two improved sampling strategies based on the Density-Based Clustering Algorithm (DBSCAN) are proposed. They minimize the violation of the train and test samples independence assumption and thus ensure an honest estimation of the generalization capabilities of the classifier.
Scaling Support Vector Machines Towards Exascale Computing for Classification of Large-Scale High-Resolution Remote Sensing Images
E. Erlingsson, G. Cavallaro, M. Riedel and H. Neukirchen
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
Progress in sensor technology leads to an ever-increasing amount of remote sensing data which needs to be classified in order to extract information. This large amount of data requires parallel processing by running parallel implementations of classification algorithms, such as Support Vector Machines (SVMs), on High-Performance Computing (HPC) clusters. Tomorrow's supercomputers will be able to provide exascale computing performance by using specialised hardware accelerators. However, existing software processing chains need to be adapted to make use of the best fitting accelerators. To address this problem, a mapping of an SVM remote sensing classification chain to the Dynamical Exascale Entry Platform (DEEP), a European pre-exascale platform, is presented. It will allow SVM-based classifications to scale towards exascale performance on tomorrow's hardware.
Facilitating Efficient Data Analysis of Remotely Sensed Images Using Standards-Based Parameter Sweep Models
S. Memon, G. Cavallaro, M. Riedel and H. Neukirchen
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
Classification of remote sensing images often uses Support Vector Machines (SVMs) that require an n-fold cross-validation phase in order to do model selection. This phase is characterized by sweeping through a wide set of parameter combinations of SVM kernel and cost parameters. As a consequence, this process is computationally expensive, but it represents a principled way of tuning a model for better accuracy and of preventing overfitting, together with the regularization that is inherently part of the SVM optimization. Since the cross-validation technique is carried out in a principled way, also known as 'gridsearch', we aim at supporting remote sensing scientists in two ways: firstly, by reducing the time-to-solution of the cross-validation by applying state-of-the-art parallel processing methods, because the parameter sweep and the cross-validation runs themselves can be readily parallelized; secondly, by reducing manual labour by automating the parallel submission processes, since manually performing cross-validation is very time consuming, unintuitive, and error-prone, especially in large-scale cluster or supercomputing environments (e.g., batch job scripts, node/core/task parameters, etc.).
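To make the parallel structure of such a sweep concrete, the following minimal sketch treats every (parameter value, fold) pair as an independent job and runs the jobs in a thread pool. It rests on several assumptions: the classifier is a simple RBF-weighted vote standing in for an SVM, only a single kernel parameter `gamma` is swept (with an SVM the grid would also include the cost parameter), and all names and the toy data are hypothetical. It does not reproduce the standards-based job submission described in the paper.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def rbf_vote(train, x, gamma):
    """Stand-in classifier (NOT an SVM): sign of an RBF-weighted sum of labels."""
    s = sum(y * math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, xi)))
            for xi, y in train)
    return 1 if s >= 0 else -1

def fold_accuracy(task):
    """Evaluate one (gamma, fold) pair; each pair is an independent job."""
    data, n_folds, gamma, fold = task
    test = data[fold::n_folds]
    train = [s for i, s in enumerate(data) if i % n_folds != fold]
    hits = sum(rbf_vote(train, x, gamma) == y for x, y in test)
    return gamma, hits / len(test)

def grid_search(data, gammas, n_folds=5, max_workers=4):
    """Sweep all (gamma, fold) jobs in parallel and average accuracy per gamma."""
    tasks = [(data, n_folds, g, f) for g, f in product(gammas, range(n_folds))]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(fold_accuracy, tasks))
    mean_acc = {g: sum(a for gg, a in results if gg == g) / n_folds for g in gammas}
    best = max(gammas, key=lambda g: mean_acc[g])
    return best, mean_acc

# Toy, deterministic two-class data: ((feature_1, feature_2), label)
data = []
for i in range(40):
    label = 1 if i % 2 == 0 else -1
    offset = (i % 7 - 3) * 0.1
    data.append(((2 * label + offset, 2 * label - offset), label))

best_gamma, accuracies = grid_search(data, gammas=[0.01, 0.1, 1.0])
print(best_gamma, accuracies)
```

On a cluster, each job in `tasks` would typically become its own batch submission rather than a thread; the thread pool here only mirrors the independence of the jobs.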
Tree-Based Supervised Feature Extraction Method Based on Self-Dual Attribute Profiles
G. Cavallaro, M. Dalla Mura, M. Riedel, and J. A. Benediktsson
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
Self-Dual Attribute Profiles (SDAPs) have proven to be an effective method for extracting spatial features able to improve scene classification of remote sensing images with very high spatial resolution. An SDAP is a multilevel decomposition of an image obtained with a sequence of transformations performed by attribute filters over the Tree of Shapes (ToS). One of the main issues with this technique is the identification of the filter thresholds generating an SDAP composed of features that are relevant for the classification problem. This paper proposes a tree-based supervised feature extraction strategy, which is based on Fisher's linear discriminant analysis relying on the available class information. The exploitation of the ToS structure in the threshold selection procedure allows one to avoid any prior full image filtering, as in other related techniques. Furthermore, the ToS automates and optimizes the whole process by decreasing the computational time and overcoming the conventional selection procedure based on trial and error attempts. The proposed automatic spatial feature extraction technique has been tested in the classification of a very high resolution image, proving its effectiveness with respect to a conventional selection strategy.
Unsupervised Change Detection Analysis to Multi-Channel Scenario based on Morphological Contextual Analysis
N. Falco, G. Cavallaro, P. R. Marpu and J. A. Benediktsson
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
A novel unsupervised change detection approach for multi-spectral remote sensing data based on morphological transformation is presented. Profiles obtained by attribute filters can provide a rich multi-level analysis of the contextual information. The proposed method is based on the assumption that pixels belonging to changed areas exhibit profiles with significant differences due to a variation in their geometry, whereas pixels within unchanged areas result in similar profiles due to their similar spatial characteristics. The extension to the multi-spectral scenario is performed by applying the morphological analysis to the available bands that compose a given data set. In such a scenario, radiometric normalization is mandatory in order to minimize the effect of different acquisition conditions. For this purpose, IR-MAD is performed as pre-processing. In the paper, preliminary results obtained considering a multi-temporal Landsat ETM+ data set acquired over an agricultural area are shown.
Region-Based Classification of Remote Sensing Images with the Morphological Tree of Shapes
G. Cavallaro, M. D. Mura, E. Carlinet, T. Geraud, N. Falco and J. A. Benediktsson
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
Satellite image classification is a key task used in remote sensing for the automatic interpretation of a large amount of information. Today there exist many types of classification algorithms using advanced image processing methods to enhance the classification accuracy. One of the best state-of-the-art methods, which significantly improves the classification of complex scenes, relies on Self-Dual Attribute Profiles (SDAPs). In this approach, the underlying representation of an image is the Tree of Shapes, which encodes the inclusion of connected components of the image. The SDAP computes for each pixel a vector of attributes providing a local multiscale representation of the information and hence leading to a fine description of the local structures of the image. Instead of performing a pixel-wise classification on features extracted from the Tree of Shapes, it is proposed to directly classify its nodes. Extending a specific interactive segmentation algorithm enables it to deal with the multi-class classification problem. The method does not involve any statistical learning and it is based entirely on morphological information related to the tree. Consequently, a very simple and effective region-based classifier relying on basic attributes is presented.
On Scalable Data Mining Techniques for Earth Science
M. Goetz, M. Richerzhagen, C. Bodenstein, G. Cavallaro, P. Glock, M. Riedel and J. A. Benediktsson
Proceedings of the International Conference On Computational Science (ICCS)
One of the observations made in earth data science is the massive increase of data volume (e.g., higher resolution measurements) and dimensionality (e.g., hyper-spectral bands). Traditional data mining tools (Matlab, R, etc.) are becoming redundant in the analysis of these datasets, as they are unable to process or even load the data. Parallel and scalable techniques, though, bear the potential to overcome these limitations. In this contribution we therefore evaluate said techniques in a High Performance Computing (HPC) environment on the basis of two earth science case studies: (a) Density-based Spatial Clustering of Applications with Noise (DBSCAN) for automated outlier detection and noise reduction in a 3D point cloud and (b) land cover type classification using multi-class Support Vector Machines (SVMs) in multi-spectral satellite images. The paper compares implementations of the algorithms in traditional data mining tools with HPC realizations and ‘big data’ technology stacks. Our analysis reveals that a wide variety of them are not yet suited to deal with the coming challenges of data mining tasks in earth sciences.
An Advanced Classifier for the Joint Use of LiDAR and Hyperspectral Data: Case Study in Queensland, Australia
P. Ghamisi, D. Wu, G. Cavallaro, J. A. Benediktsson, S. Phinn and N. Falco
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
Given the exponential increase in the number of available remote sensors in recent years, the possibility of having different types of data captured over the same scene has resulted in many research works related to the joint use of passive and active sensors for the accurate classification of different materials. However, until now, there has been only a small number of research works related to the integration of highly valuable information obtained from the joint use of LiDAR and hyperspectral data. This paper proposes a classification approach that is efficient in terms of accuracy and required CPU processing time for integrating big data sets (e.g., LiDAR and hyperspectral) to provide land cover mapping capabilities at a range of spatial scales. In addition, the proposed approach is fully automatic and is able to efficiently handle big data containing a huge number of features with a very limited number of training samples in a few seconds.
Processing High Resolution Images of Urban Areas with Self-Dual Attribute Filters (Invited Paper)
G. Cavallaro, M. Dalla Mura and J. A. Benediktsson
Proceedings of the Joint Urban Remote Sensing Event (JURSE)
The application of remote sensing to the study of human settlements relies on the availability of different types of image sources which provide complementary measurements for the characterization of urban areas. By analyzing images of very high spatial resolution (metric and submetric pixel size) it is possible to retrieve information on buildings (e.g., characterizing their size and shape) and districts (e.g., assessing settlement density and urban sprawl). In this context, mathematical morphology provides a set of tools that are useful for the characterization of geometrical features in urban images. Among those tools, attribute filters (AF) have proven to effectively extract these spatial characteristics. In this paper, we propose AF based on the inclusion tree structure as an efficient technique for generating features suitable for structure extraction in an urban environment. We address the issue by combining the area and moment of inertia attributes and proving the potential of this filter in the analysis of the data acquired by different types of sensors (i.e., Optical, LiDAR and SAR images).
Automatic Threshold Selection for Profiles of Attribute Filters Based on Granulometric Characteristic Functions
G. Cavallaro, N. Falco, M. Dalla Mura, L. Bruzzone and J. A. Benediktsson
Proceedings of the 12th International Symposium on Mathematical Morphology (ISMM)
Morphological attribute filters have been widely exploited for characterizing the spatial structures in remote sensing images. They have proven their effectiveness especially when computed in multi-scale architectures, such as for Attribute Profiles. However, the question of how to choose a proper set of filter thresholds in order to build a representative profile remains one of the main issues. In this paper, a novel methodology for the selection of the filters’ parameters is presented. A set of thresholds is selected by analysing granulometric characteristic functions, which provide information on the image decomposition according to a given measure. The method exploits a tree (i.e., min-, max- or inclusion-tree) representation of an image, which allows us to avoid the filtering steps usually required prior to the threshold selection, making the process computationally efficient. The experimental analysis performed on two real remote sensing images shows the effectiveness of the proposed approach in providing representative and non-redundant multi-level image decompositions.
Automatic Morphological Attribute Profiles
G. Cavallaro, M. Dalla Mura, N. Falco and J. A. Benediktsson
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
Attribute profiles (APs) have received increasing attention in recent years, as they are able to extract and model spatial information that is useful for the analysis of remote sensing images of very high spatial resolution (VHR). However, one of the major issues in employing APs is the choice of a proper range of thresholds, able to provide a representative and non-redundant multi-level image decomposition. This paper presents a novel method for the automatic selection of adequate thresholds to compute the AP. A new concept of cumulative function, which can be seen as an extension of the basic notion of granulometry, is introduced. In particular, different information on the spatial context is obtained according to the measure used for computing the cumulative function, which is computed on the AP composed by considering all possible values of the attribute. The proposed approach aims at selecting the set of thresholds that provides the best approximation of the resulting cumulative function based on the chosen measure. Experimental analysis carried out on a very high resolution image shows the effectiveness of the presented strategy in providing a set of thresholds able to retain the salient spatial structures in the scene.
Scalable Developments for Big Data Analytics in Remote Sensing
G. Cavallaro, M. Riedel, M. Goetz, C. Bodenstein, M. Richerzhagen, P. Glock and J. A. Benediktsson
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
Big Data Analytics methods take advantage of techniques from the fields of data mining, machine learning, or statistics with a focus on analysing large quantities of data (aka 'big datasets') with modern technologies. Big data sets appear in remote sensing in the sense of large volumes, but also in the sense of an ever increasing number of spectral bands (i.e., high-dimensional data). Remote sensing has traditionally used the above described techniques for a wide variety of applications such as classification (e.g., land cover analysis using different spectral bands from satellite data), but more recently scalability challenges have arisen when using traditional (often serial) methods. This paper addresses observed scalability limits when using support vector machines (SVMs) for classification and discusses scalable and parallel developments used in concrete application areas of remote sensing. Different approaches that are based on massively parallel methods are discussed as well as recent developments in parallel methods.
A Comparison of Self-Dual Attribute Profiles Based on Different Filter Rules for Classification
G. Cavallaro, M. Dalla Mura, J. A. Benediktsson and L. Bruzzone
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
In this paper we compare features obtained by different filtering strategies for morphological attribute filters by considering non-increasing attributes. Attribute Profiles (APs) and Self-Dual Attribute Profiles (SDAPs) are obtained by sequentially applying attribute filters on tree-based image representations, such as Min- or Max-trees and the Inclusion tree, respectively. This work aims to study the effects of using the filtering rules max, min, direct and subtractive, when considering the non-increasing attributes moment of inertia and standard deviation. A very high spatial resolution data set is used in the experiments, and the extracted information obtained by the profiles is analyzed. This is done by studying the effects on the classification accuracy by using the profiles as additional input features to a Random Forest classifier.
Smart Data Analytics Methods for Remote Sensing Applications
G. Cavallaro, M. Riedel, J. A. Benediktsson, M. Goetz, T. Runarsson, K. Jonasson and T. Lippert
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS)
The big data analytics approach has emerged as a way of extracting information from large quantities of scientific data in a systematic way. In order to have a more concrete understanding of this term, we refer to its refinement as smart data analytics, which examines large quantities of scientific data to uncover hidden patterns and unknown correlations, or to extract information in cases where there is no exact formula (e.g., known physical laws). Our concrete big data problem is the classification of land cover types in image-based datasets that have been created using remote sensing technologies, because the resolution can be high (i.e., large volumes) and there are various types, such as panchromatic images or different spectral bands like red, green, blue, and near-infrared (i.e., large variety). We investigate various smart data analytics methods that take advantage of machine learning algorithms (i.e., support vector machines) and state-of-the-art parallelization approaches in order to overcome limitations of big data processing using non-scalable serial approaches.
Detection of Hedges Based on Attribute Filters
G. Cavallaro, B. Arbelot, M. Fauvel, M. D. Mura, J. A. Benediktsson, L. Bruzzone, J. Chanussot and D. Sheeren
Proceedings of the SPIE 8537, Image and Signal Processing for Remote Sensing XVIII
The detection of hedges is a very important task for the monitoring of a rural environment and aiding the management of their related natural resources. Hedges are narrow vegetated areas composed of shrubs and/or trees that are usually present at the boundaries of adjacent agricultural fields. In this paper, a technique for detecting hedges is presented. It exploits the spectral and spatial characteristics of hedges. In detail, spatial features are extracted with attribute filters, which are connected operators defined in the mathematical morphology framework. Attribute filters are flexible operators that can perform a simplification of a grayscale image driven by an arbitrary measure. Such a measure can be related to characteristics of regions in the scene such as the scale, shape, contrast, etc. Attribute filters can be computed on tree representations of an image (such as the component tree) which represent either bright or dark regions (with respect to the gray levels of their surroundings). In this work, it is proposed to compute attribute filters on the inclusion tree, which is a hierarchical dual representation of an image, in which nodes of the tree correspond to both bright and dark regions. Specifically, attribute filters are employed to aid the detection of woody elements in the image, which is a step in the process aimed at detecting hedges. In order to perform a characterization of the spatial information of the hedges in the image, different attributes have been considered in the analysis. The final decision is obtained by fusing the results of different detectors applied to the filtered image.
Magazine articles
High Performance and Disruptive Computing in Remote Sensing - The third edition of the school organized by the HDCRS Working Group of the GRSS Earth Science Informatics Technical Committee
G. Cavallaro, D. B. Heras, M. Maskey
IEEE Geoscience and Remote Sensing Magazine
The University of Iceland in Reykjavik hosted the third edition of the “High Performance and Disruptive Computing in Remote Sensing” school from 29 May to 1 June 2023. This event was organized by the High-Performance and Disruptive Computing in Remote Sensing (HDCRS) Working Group of the IEEE Geoscience and Remote Sensing Society (GRSS) Earth Science Informatics Technical Committee (ESI TC). Its goal was to acquaint participants with advancements in parallel and scalable methods using state-of-the-art computing technologies as they apply to remote sensing (RS). In addition to fostering a deeper understanding of these topics, the school provided an opportunity for students and young professionals to network with established researchers in the field, thereby promoting collaboration in HDCRS interdisciplinary research.
A Summer School Session on Mastering Geospatial Artificial Intelligence: From Data Production to Artificial Intelligence Foundation Model Development and Downstream Applications [Technical Committees]
Manil Maskey, Gabriele Cavallaro, Dora Blanco Heras, Paolo Fraccaro, Blair Edwards, Iksha Gurung, Brian Freitag, Muthukumaran Ramasubramanian, Johannes Jakubik, Linsong Chu, Raghu Ganti, Rahul Ramachandran, Kommy Weldemariam, Sujit Roy, Carlos Costa, Alex Corvin, Anish Asthana
IEEE Geoscience and Remote Sensing Magazine
In collaboration with IBM Research, the NASA Interagency Implementation and Advanced Concepts Team (IMPACT) organized a specialized one-day summer school session focused on exploring the topic of data science at scale. This session was a part of the “High Performance and Disruptive Computing in Remote Sensing” summer school hosted by the University of Iceland from 29 May to 1 June 2023 in Reykjavik, Iceland. This marked the third edition of the school organised by the High Performance and Disruptive Computing in Remote Sensing (HDCRS) Working Group of the IEEE Geoscience and Remote Sensing Society’s (GRSS’s) Earth Science Informatics (ESI) Technical Committee (TC).
High-Performance and Disruptive Computing in Remote Sensing: HDCRS-A New Working Group of the GRSS Earth Science Informatics Technical Committee
G. Cavallaro, D. B. Heras, Z. Wu, M. Maskey, S. Lopez, P. Gawron, M. Coca, M. Datcu
IEEE Geoscience and Remote Sensing Magazine
The High-Performance and Disruptive Computing in Remote Sensing (HDCRS) Working Group (WG) was recently established under the IEEE Geoscience and Remote Sensing Society (GRSS) Earth Science Informatics (ESI) Technical Committee to connect a community of interdisciplinary researchers in remote sensing (RS) who specialize in advanced computing technologies, parallel programming models, and scalable algorithms. HDCRS focuses on three major research topics in the context of RS: 1) supercomputing and distributed computing, 2) specialized hardware computing, and 3) quantum computing (QC). This article presents these computing technologies as they play a major role for the development of RS applications. The HDCRS disseminates information and knowledge through educational events and publication activities which will also be introduced in this article.
Book chapters
Proven Approaches of Using Innovative High-Performance Computing Architectures in Remote Sensing
R. Sedona, G. Cavallaro, M. Riedel, J. A. Benediktsson
Signal and Image Processing for Remote Sensing
This chapter underscores the essential role of high-performance computing (HPC) in the realm of remote sensing (RS), effectively addressing the growing demand for processing extensive and complex datasets. HPC, empowered by parallel programming paradigms, significantly speeds up a range of tasks, including image processing, data mining, and modeling, vital in the context of Earth observation (EO) applications. More notably, HPC can build even better models by employing systematic hyperparameter optimization methods that are computationally demanding, given a large search space. Furthermore, with deep learning (DL) progressively gravitating toward foundation models, extensively trained on substantial datasets, endowing them with the remarkable capability to transfer knowledge across diverse tasks, there is an increased demand for computational resources in the fast-paced landscape of artificial intelligence (AI) and consequently a heightened interest in HPC. Solutions to provide optimized resources on HPC resources, however, have increased their complexity and heterogeneity. This chapter highlights the advantages of embracing HPC while acknowledging current challenges, solutions, and future trends.
Remote Sensing Data Fusion: Markov Models and Mathematical Morphology for Multisensor, Multiresolution, and Multiscale Image Classification
J. A. Benediktsson, G. Cavallaro, N. Falco, I. Hedhli, V. A. Krylov, G. Moser, S. B. Serpico and J. Zerubia
Mathematical Models for Remote Sensing Image Processing
Current and forthcoming sensor technologies and space missions are providing remote sensing scientists and practitioners with an increasing wealth and variety of data modalities. They encompass multisensor, multiresolution, multiscale, multitemporal, multipolarization, and multifrequency imagery. While they represent remarkable opportunities for the applications, they pose important challenges to the development of mathematical methods aimed at fusing the information conveyed by the input multisource data. In this framework, the present chapter continues the discussion of remote sensing data fusion, which was opened in the previous chapter. Here, the focus is on data fusion for image classification purposes. Both methodological issues of feature extraction and supervised classification are addressed. In both respects, the focus is on hierarchical image models rooted in graph theory. First, multilevel feature extraction is addressed through the latest advances in Mathematical Morphology and attribute profile theory with respect to component trees and trees of shapes. Then, joint supervised classification of multisensor, multiscale, multiresolution, and multitemporal imagery is formulated through hierarchical Markov random fields on quad-trees. Examples of experimental results with data from current VHR optical and SAR missions are shown and analysed.
Analyzing Remote Sensing Images with Hierarchical Morphological Representations
G. Cavallaro, M. Dalla Mura and J. A. Benediktsson
Handbook of Pattern Recognition and Computer Vision, 5th edition
Very high resolution (VHR) images provide a detailed representation of the surveyed scene with a geometrical resolution that at present can be up to 30 cm (WorldView-3). One of the most promising strategies for the analysis and the interpretation of a scene relies on hierarchical representations of the spatial content of an image. The hierarchical structure of a tree is useful for gathering the heterogeneous characteristics of the objects among different spatial scales (i.e., from the pixel level up to the entire image). In the remote sensing literature, there are several contributions addressing the use of hierarchical representations in many tasks such as filtering, segmentation, classification, object detection, change detection, and considering different types of images (e.g., panchromatic, multispectral, hyperspectral). The structure of each representation can vary significantly and can be efficiently adopted for a specific remote sensing application. After providing an overview of the different hierarchical representations, this chapter focuses on the tree structures on which image filtering (here we consider morphological attribute filters) can be efficiently implemented. Particular attention is paid to the tree of shapes, which is an important morphological structure that represents images in a self-dual way. Moreover, the identification of structures with heterogeneous characteristics (i.e., scale and shape) can be effectively done when computing attribute filters in multi-scale architectures, for instance Self-Dual Attribute Profiles (SDAPs). In this chapter we focus on the use of multilevel filtering based on hierarchical representations of the image for land cover classification. In particular, the experimental results reported in the literature for the classification of SDAPs computed on remotely sensed images are presented. Since the effectiveness of SDAPs is strictly correlated to the set of filter parameters selected for the filtering, we report on a technique for the automatic selection of the filter parameters in order to obtain a profile that is both representative and non-redundant.