X. Dong, N. Yokoya, L. Wang, and T. Uezato, “Learning mutual modulation for self-supervised cross-modal super-resolution,” Proc. ECCV, 2022. Abstract: Self-supervised cross-modal super-resolution (SR) can overcome the difficulty of acquiring paired training data, but is challenging because only low-resolution (LR) source and high-resolution (HR) guide images from different modalities are available. Existing methods utilize pseudo or weak supervision in LR space and thus deliver results that are blurry or not faithful to the source modality. To address this issue, we present a mutual modulation SR (MMSR) model, which tackles the task by a mutual modulation strategy, including a source-to-guide modulation and a guide-to-source modulation. In these modulations, we develop cross-domain adaptive filters to fully exploit cross-modal spatial dependency, helping induce the source to emulate the resolution of the guide and the guide to mimic the modality characteristics of the source. Moreover, we adopt a cycle consistency constraint to train MMSR in a fully self-supervised manner. Experiments on various tasks demonstrate the state-of-the-art performance of our MMSR.
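The cycle consistency constraint mentioned in the abstract can be made concrete: degrading the super-resolved output must reproduce the LR source, so no HR source ground truth is needed. Below is a minimal PyTorch sketch of that training loop, assuming a toy placeholder network (`ToySRNet`, not the paper's MMSR architecture) and an average-pooling degradation; shapes and the scale factor are illustrative.

```python
# Sketch of cycle-consistency self-supervision for guided cross-modal SR.
# ToySRNet is a placeholder, NOT the mutual modulation model from the paper;
# only the loss structure follows the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToySRNet(nn.Module):
    """Concatenate the upsampled source with the HR guide, then refine."""
    def __init__(self, scale=4):
        super().__init__()
        self.scale = scale
        self.body = nn.Sequential(
            nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1),
        )

    def forward(self, lr_source, hr_guide):
        up = F.interpolate(lr_source, scale_factor=self.scale,
                           mode="bicubic", align_corners=False)
        return self.body(torch.cat([up, hr_guide], dim=1))

scale = 4
net = ToySRNet(scale)
opt = torch.optim.Adam(net.parameters(), lr=1e-4)

lr_source = torch.rand(1, 1, 16, 16)   # e.g. LR depth or thermal image
hr_guide = torch.rand(1, 1, 64, 64)    # e.g. HR guide band

for step in range(10):
    sr = net(lr_source, hr_guide)
    # Cycle consistency: degrading the SR result must recover the LR input.
    cycle = F.avg_pool2d(sr, kernel_size=scale)
    loss = F.l1_loss(cycle, lr_source)
    opt.zero_grad(); loss.backward(); opt.step()
```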
W. He, N. Yokoya, and X. Yuan, “Fast hyperspectral image recovery via non-iterative fusion of dual-camera compressive hyperspectral imaging,” IEEE Transactions on Image Processing, vol. 30, pp. 7170-7183, 2021. Abstract: Coded aperture snapshot spectral imaging (CASSI) is a promising technique to capture a three-dimensional hyperspectral image (HSI) from a single coded two-dimensional (2D) measurement, with algorithms used to solve the inverse problem. Due to its ill-posed nature, various regularizers have been exploited to reconstruct the 3D data from the 2D measurement. Unfortunately, both the accuracy and the computational complexity remain unsatisfactory. One feasible solution is to utilize additional information, such as the RGB measurement in CASSI. Considering the combined CASSI and RGB measurements, in this paper we propose a new fusion model for HSI reconstruction. We exploit the spectral low-rank property of the HSI, decomposing it into a spectral basis and spatial coefficients. Specifically, the RGB measurement is utilized to estimate the coefficients, while the CASSI measurement provides the orthogonal spectral basis. We further propose a patch processing strategy to enhance the spectral low-rank property of the HSI. The proposed model requires neither non-local processing nor iteration, nor the spectral sensing matrix of the RGB detector. Extensive experiments on both simulated and real HSI datasets demonstrate that our method not only outperforms previous state-of-the-art methods in quality but also speeds up reconstruction by a factor of more than 5000.
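The core decomposition can be sketched in a few lines: the HSI is approximated as per-pixel spatial coefficients times an orthogonal spectral basis, with the basis coming from the CASSI branch and the coefficients from the RGB branch. The NumPy toy below uses random stand-ins for all measurements and, unlike the paper (which explicitly avoids it), assumes a known RGB spectral response `R` purely to keep the coefficient step to one line.

```python
# Illustrative sketch of the spectral low-rank fusion model:
# HSI ≈ spatial coefficients A x spectral basis E. All data are random
# stand-ins; the paper's actual estimators and patch strategy differ.
import numpy as np

H, W, L, r = 64, 64, 31, 8            # height, width, bands, rank

# Pretend this rough HSI estimate came from the CASSI measurement.
coarse_hsi = np.random.rand(H * W, L)

# Orthogonal spectral basis E (r x L) via truncated SVD of the coarse cube.
_, _, vt = np.linalg.svd(coarse_hsi, full_matrices=False)
E = vt[:r]                             # rows are orthonormal spectra

# Assumption for this toy only: a known 3 x L RGB spectral response.
R = np.random.rand(3, L)
rgb = coarse_hsi @ R.T                 # stand-in for the real RGB image

# Estimate per-pixel coefficients A (HW x r) by least squares:
# rgb ≈ A @ (E @ R.T), so invert the small r x 3 system.
M = E @ R.T                            # r x 3
A = rgb @ np.linalg.pinv(M)            # HW x r, min-norm solution

hsi = (A @ E).reshape(H, W, L)         # fused reconstruction
print(hsi.shape)                       # (64, 64, 31)
```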
T. Uezato, D. Hong, N. Yokoya, and W. He, “Guided deep decoder: Unsupervised image pair fusion,” Proc. ECCV (spotlight), 2020. Abstract: The fusion of input and guidance images that have a tradeoff in their information (e.g., hyperspectral and RGB image fusion or pansharpening) can be interpreted as one general problem. However, previous studies applied a task-specific handcrafted prior and did not address the problems with a unified approach. To address this limitation, in this study, we propose a guided deep decoder network as a general prior. The proposed network is composed of an encoder-decoder network that exploits multi-scale features of a guidance image and a deep decoder network that generates an output image. The two networks are connected by feature refinement units to embed the multi-scale features of the guidance image into the deep decoder network. The proposed network allows the network parameters to be optimized in an unsupervised way without training data. Our results show that the proposed network can achieve state-of-the-art performance in various image fusion problems.
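The unsupervised per-image optimization this paper builds on works like a deep image prior: network parameters are fitted so that a degraded version of the output matches the observation, while guide features steer the decoder. A minimal PyTorch sketch under those assumptions follows; `TinyGuidedDecoder` is a crude stand-in, not the guided deep decoder with its feature refinement units.

```python
# Deep-prior style sketch of unsupervised guided fusion: fit the network to
# one image pair so that the degraded output matches the LR observation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyGuidedDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.guide_enc = nn.Conv2d(3, 16, 3, padding=1)    # guide features
        self.decode = nn.Sequential(
            nn.Conv2d(16 + 16, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 31, 3, padding=1),               # e.g. 31 HS bands
        )

    def forward(self, z, guide):
        g = F.relu(self.guide_enc(guide))
        return self.decode(torch.cat([z, g], dim=1))

guide = torch.rand(1, 3, 64, 64)       # HR RGB guidance image
lr_obs = torch.rand(1, 31, 16, 16)     # LR hyperspectral observation
z = torch.rand(1, 16, 64, 64)          # fixed random code, as in deep priors

net = TinyGuidedDecoder()
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(10):
    out = net(z, guide)
    # No training data: the only loss is fidelity of the degraded output.
    loss = F.mse_loss(F.avg_pool2d(out, 4), lr_obs)
    opt.zero_grad(); loss.backward(); opt.step()
```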
J. Xia, N. Yokoya, B. Adriano, and C. Broni-Bedaiko, “OpenEarthMap: A benchmark dataset for global high-resolution land cover mapping,” Proc. WACV, 2023. Abstract: We introduce OpenEarthMap, a benchmark dataset for global high-resolution land cover mapping. OpenEarthMap consists of 2.2 million segments of 5000 aerial and satellite images covering 97 regions from 44 countries across 6 continents, with manually annotated 8-class land cover labels at a 0.25-0.5 m ground sampling distance. Semantic segmentation models trained on OpenEarthMap generalize worldwide and can be used as off-the-shelf models in a variety of applications. We evaluate the performance of state-of-the-art methods for unsupervised domain adaptation and present challenging problem settings suitable for further technical development. We also investigate lightweight models obtained via automated neural architecture search for limited computational resources and fast mapping. The dataset will be made publicly available.
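A minimal sketch of how such a benchmark is typically used: train an off-the-shelf segmentation model with 8 output classes on image/label tiles. Random tensors stand in for actual OpenEarthMap tiles below, and the choice of DeepLabV3 is an illustrative assumption, not a model from the paper.

```python
# Sketch of 8-class land cover segmentation training of the kind benchmarked
# on OpenEarthMap; random tensors replace real dataset tiles.
import torch
import torch.nn as nn
from torchvision.models.segmentation import deeplabv3_resnet50

model = deeplabv3_resnet50(weights=None, num_classes=8)  # 8 land cover classes
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

images = torch.rand(2, 3, 256, 256)            # stand-in RGB tiles
labels = torch.randint(0, 8, (2, 256, 256))    # stand-in class masks

model.train()
for step in range(5):
    logits = model(images)["out"]              # (N, 8, H, W)
    loss = criterion(logits, labels)
    opt.zero_grad(); loss.backward(); opt.step()
```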
C. Robinson, K. Malkin, N. Jojic, H. Chen, R. Qin, C. Xiao, M. Schmitt, P. Ghamisi, R. Haensch, and N. Yokoya, “Global land cover mapping with weak supervision: Outcome of the 2020 IEEE GRSS Data Fusion Contest,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 14, pp. 3185-3199, 2021. Abstract: This paper presents the scientific outcomes of the 2020 Data Fusion Contest organized by the Image Analysis and Data Fusion Technical Committee of the IEEE Geoscience and Remote Sensing Society. The 2020 Contest addressed the problem of automatic global land-cover mapping with weak supervision, i.e., estimating high-resolution semantic maps while only low-resolution reference data are available during training. Two separate competitions were organized to assess two different scenarios: 1) high-resolution labels are not available at all, and 2) a small amount of high-resolution labels is available in addition to low-resolution reference data. In this paper, we describe the DFC2020 dataset, which remains available for further evaluation of corresponding approaches, and report the results of the best-performing methods during the contest.
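The first scenario (no high-resolution labels at all) can be made concrete in a few lines: predict on the high-resolution grid, aggregate the predictions down to the low-resolution label grid, and supervise there. The sketch below is a toy; the class count, scale factor, and one-layer predictor are all assumed for illustration.

```python
# Sketch of weak supervision from low-resolution labels: average the HR
# logits over each low-resolution cell before computing the loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

num_classes, factor = 10, 8
model = nn.Conv2d(4, num_classes, 3, padding=1)   # toy stand-in predictor

hr_image = torch.rand(1, 4, 256, 256)                   # HR input imagery
lr_labels = torch.randint(0, num_classes, (1, 32, 32))  # LR reference map

logits = model(hr_image)                          # HR class scores
lr_logits = F.avg_pool2d(logits, kernel_size=factor)  # aggregate to LR grid
loss = F.cross_entropy(lr_logits, lr_labels)
loss.backward()
```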
S. Kunwar, H. Chen, M. Lin, H. Zhang, P. D'Angelo, D. Cerra, S. M. Azimi, M. Brown, G. Hager, N. Yokoya, R. Haensch, and B. Le Saux, “Large-scale semantic 3D reconstruction: Outcome of the 2019 IEEE GRSS Data Fusion Contest - Part A,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 14, pp. 922-935, 2020. Abstract: In this paper, we present the scientific outcomes of the 2019 Data Fusion Contest organized by the Image Analysis and Data Fusion Technical Committee of the IEEE Geoscience and Remote Sensing Society. The 2019 Contest addressed the problem of 3D reconstruction and 3D semantic understanding on a large scale. Several competitions were organized to assess specific issues, such as elevation estimation and semantic mapping from a single view, two views, or multiple views. In this Part A, we report the results of the best-performing approaches for semantic 3D reconstruction according to these various setups, while 3D point cloud semantic mapping is discussed in Part B.
B. Adriano, N. Yokoya, J. Xia, H. Miura, W. Liu, M. Matsuoka, and S. Koshimura, “Learning from multimodal and multitemporal earth observation data for building damage mapping,” ISPRS Journal of Photogrammetry and Remote Sensing, vol. 175, pp. 132-143, 2021. Abstract: Earth observation (EO) technologies, such as optical imaging and synthetic aperture radar (SAR), provide excellent means to continuously monitor ever-growing urban environments. Notably, in the case of large-scale disasters (e.g., tsunamis and earthquakes), in which a response is highly time-critical, images from both data modalities can complement each other to accurately convey the full damage condition in the disaster's aftermath. However, due to several factors, such as weather and satellite coverage, it is often uncertain which data modality will be the first available for rapid disaster response efforts. Hence, novel methodologies that can utilize all accessible EO datasets are essential for disaster management. In this study, we developed a global multisensor and multitemporal dataset for building damage mapping. We included building damage characteristics from three disaster types, namely earthquakes, tsunamis, and typhoons, and considered three building damage categories. The global dataset contains high-resolution optical imagery and high-to-moderate-resolution multiband SAR data acquired before and after each disaster. Using this comprehensive dataset, we analyzed five data modality scenarios for damage mapping: single-mode (optical and SAR datasets), cross-modal (pre-disaster optical and post-disaster SAR datasets), and mode fusion scenarios. We defined a damage mapping framework for the semantic segmentation of damaged buildings based on a deep convolutional neural network and compared our approach to a state-of-the-art baseline model for damage mapping. The results indicated that our dataset, together with a deep learning network, enables acceptable predictions for all the data modality scenarios.
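At its simplest, the cross-modal scenario described above reduces to stacking pre-disaster optical bands with post-disaster SAR bands channel-wise and segmenting damage classes from the stack. A toy sketch under assumed band counts and a placeholder network (not the paper's framework):

```python
# Sketch of the cross-modal input scenario: pre-disaster optical plus
# post-disaster SAR, stacked channel-wise, mapped to per-pixel damage classes.
import torch
import torch.nn as nn

optical_bands, sar_bands, damage_classes = 3, 2, 4  # 3 damage levels + background
net = nn.Sequential(
    nn.Conv2d(optical_bands + sar_bands, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, damage_classes, 3, padding=1),
)

pre_optical = torch.rand(1, optical_bands, 128, 128)  # before the disaster
post_sar = torch.rand(1, sar_bands, 128, 128)         # first image after it

x = torch.cat([pre_optical, post_sar], dim=1)         # cross-modal stack
logits = net(x)
damage_map = logits.argmax(dim=1)                     # per-pixel damage class
```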
N. Yokoya, K. Yamanoi, W. He, G. Baier, B. Adriano, H. Miura, and S. Oishi, “Breaking limits of remote sensing by deep learning from simulated data for flood and debris flow mapping,” IEEE Transactions on Geoscience and Remote Sensing, (early access), 2020. Abstract: We propose a framework that estimates inundation depth (maximum water level) and debris-flow-induced topographic deformation from remote sensing imagery by integrating deep learning and numerical simulation. A water and debris flow simulator generates training data for various artificial disaster scenarios. We show that regression models based on Attention U-Net and LinkNet architectures trained on such synthetic data can predict the maximum water level and topographic deformation from a remote sensing-derived change detection map and a digital elevation model. The proposed framework has an inpainting capability, thus mitigating the false negatives that are inevitable in remote sensing image analysis. Our framework breaks limits of remote sensing and enables rapid estimation of inundation depth and topographic deformation, essential information for emergency response, including rescue and relief activities. We conduct experiments with both synthetic and real data for two disaster events that caused simultaneous flooding and debris flows and demonstrate the effectiveness of our approach quantitatively and qualitatively. Our code and datasets are available at https://github.com/nyokoya/dlsim.
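The estimation task itself is a per-pixel regression: from a change detection map and a DEM, predict the maximum water level, with training pairs supplied by the simulator. A minimal sketch with random stand-ins for the simulated data and a small placeholder network instead of Attention U-Net or LinkNet (see the repository above for the actual models):

```python
# Sketch of simulation-supervised depth regression: inputs are a change
# detection map and a DEM; the target is simulated maximum water level.
import torch
import torch.nn as nn
import torch.nn.functional as F

net = nn.Sequential(
    nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1),            # per-pixel depth estimate
)
opt = torch.optim.Adam(net.parameters(), lr=1e-4)

for step in range(10):
    change_map = torch.rand(4, 1, 128, 128)    # stand-in change detection map
    dem = torch.rand(4, 1, 128, 128)           # stand-in elevation model
    depth_gt = torch.rand(4, 1, 128, 128)      # stand-in simulated water level
    pred = net(torch.cat([change_map, dem], dim=1))
    loss = F.l1_loss(pred, depth_gt)
    opt.zero_grad(); loss.backward(); opt.step()
```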
T. D. Pham, N. Yokoya, T. T. T. Nguyen, N. N. Le, N. T. Ha, J. Xia, W. Takeuchi, and T. D. Pham, “Improvement of mangrove soil carbon stocks estimation in North Vietnam using Sentinel-2 data and machine learning approach,” GIScience & Remote Sensing, vol. 58, no. 1, pp. 68-87, 2021. Abstract: Quantifying total carbon (TC) stocks in soil across various mangrove ecosystems is key to understanding the global carbon cycle and to reducing greenhouse gas emissions. Estimating mangrove TC at a large scale remains challenging due to the difficulty and high cost of soil carbon measurements when the number of samples is high. In the present study, we investigated the capability of Sentinel-2 multispectral data together with a state-of-the-art machine learning (ML) technique that combines CatBoost regression (CBR) with a genetic algorithm (GA) for feature selection and optimization (the CBR-GA model) to estimate mangrove soil carbon stocks across the mangrove ecosystems of North Vietnam. We used field survey data collected from 177 soil cores. We compared the performance of the proposed model with those of four other ML algorithms: extreme gradient boosting regression (XGBR), light gradient boosting machine regression (LGBMR), support vector regression (SVR), and random forest regression (RFR). Our proposed model estimated the TC level in the soil as 35.06–166.83 Mg ha⁻¹ (average = 92.27 Mg ha⁻¹) with satisfactory accuracy (R² = 0.665, RMSE = 18.41 Mg ha⁻¹) and yielded the best prediction performance among all the ML techniques. We conclude that Sentinel-2 data combined with the CBR-GA model can improve estimates of mangrove TC at 10 m spatial resolution in tropical areas. The effectiveness of the proposed approach should be further evaluated for the different mangrove soils of other mangrove ecosystems in tropical and semi-tropical regions.
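The CBR-GA combination can be sketched as a genetic algorithm searching over binary feature masks, scoring each mask by the validation RMSE of a CatBoost regressor trained on the selected features. The toy below runs on synthetic data; the population size, mutation rate, and GA operators are illustrative assumptions, not the paper's settings.

```python
# Bare-bones GA feature selection wrapped around CatBoost regression,
# illustrating the CBR-GA idea on synthetic data.
import numpy as np
from catboost import CatBoostRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((200, 12))                                 # stand-in features
y = X[:, 0] * 50 + X[:, 3] * 30 + rng.normal(0, 2, 200)   # synthetic TC target
X_tr, X_va, y_tr, y_va = train_test_split(X, y, random_state=0)

def fitness(mask):
    """Negative validation RMSE of CatBoost trained on the masked features."""
    if mask.sum() == 0:
        return -np.inf
    model = CatBoostRegressor(iterations=50, verbose=0, random_seed=0)
    model.fit(X_tr[:, mask], y_tr)
    pred = model.predict(X_va[:, mask])
    return -np.sqrt(np.mean((pred - y_va) ** 2))

pop = rng.random((8, X.shape[1])) > 0.5                   # initial masks
for gen in range(5):
    scores = np.array([fitness(m) for m in pop])
    parents = pop[np.argsort(scores)[-4:]]                # keep the best half
    children = []
    for _ in range(4):
        a, b = parents[rng.integers(4)], parents[rng.integers(4)]
        cut = rng.integers(1, X.shape[1])
        child = np.concatenate([a[:cut], b[cut:]])        # one-point crossover
        flip = rng.random(X.shape[1]) < 0.1               # mutation
        children.append(np.where(flip, ~child, child))
    pop = np.vstack([parents, np.array(children)])

best = pop[np.argmax([fitness(m) for m in pop])]
print("selected features:", np.flatnonzero(best))
```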