Naoto Yokoya

Yokoya Naoto



University of Tokyo




remote sensing


pattern recognition


data fusion


Current position

Naoto Yokoya is an associate professor in the Department of Complexity Science and Engineering, the Department of Computer Science, and the Department of Information Science at the University of Tokyo, where he co-runs the Machine Learning and Statistical Data Analysis Laboratory. He is also the team leader of the Geoinformatics Team at the RIKEN Center for Advanced Intelligence Project (AIP).

His research interests include image processing, data fusion, and machine learning for understanding remote sensing images, with applications to disaster management and environmental monitoring.

He is an associate editor of the IEEE Transactions on Geoscience and Remote Sensing (TGRS) and the ISPRS Journal of Photogrammetry and Remote Sensing (P&RS).

Featured publications

Z. Liu, Y. Li, F. Tu, R. Zhang, Z. Cheng, and N. Yokoya, “DeepTreeSketch: Neural graph prediction for faithful 3D tree modeling from sketches,” Proc. CHI, 2024.

Abstract: We present DeepTreeSketch, a novel AI-assisted sketching system that enables users to create realistic 3D tree models from 2D freehand sketches. Our system leverages a tree graph prediction network, TGP-Net, to learn the underlying structural patterns of trees from a large collection of 3D tree models. The TGP-Net simulates the iterative growth of botanical trees and progressively constructs the 3D tree structures in a bottom-up manner. Furthermore, our system supports a flexible sketching mode for both precise and coarse control of the tree shapes by drawing branch strokes and foliage strokes, respectively. Combined with a procedural generation strategy, users can freely control the foliage propagation with diverse and fine details. We demonstrate the expressiveness, efficiency, and usability of our system through various experiments and user studies. Our system offers a practical tool for 3D tree creation, especially for natural scenes in games, movies, and landscape applications.

W. Gan, H. Xu, Y. Huang, S. Chen, and N. Yokoya, “V4D: Voxel for 4D novel view synthesis,” IEEE Transactions on Visualization and Computer Graphics, 2023.

Abstract: Neural radiance fields have made a remarkable breakthrough in the novel view synthesis task at the 3D static scene. However, for the 4D circumstance (e.g., dynamic scene), the performance of the existing method is still limited by the capacity of the neural network, typically in a multilayer perceptron network (MLP). In this paper, we utilize 3D Voxel to model the 4D neural radiance field, short as V4D, where the 3D voxel has two formats. The first one is to regularly model the 3D space and then use the sampled local 3D feature with the time index to model the density field and the texture field by a tiny MLP. The second one is in look-up tables (LUTs) format that is for the pixel-level refinement, where the pseudo-surface produced by the volume rendering is utilized as the guidance information to learn a 2D pixel-level refinement mapping. The proposed LUTs-based refinement module achieves the performance gain with little computational cost and could serve as the plug-and-play module in the novel view synthesis task. Moreover, we propose a more effective conditional positional encoding toward the 4D data that achieves performance gain with negligible computational burdens. Extensive experiments demonstrate that the proposed method achieves state-of-the-art performance at a low computational cost. The relevant code is available in https://github.com/GANWANSHUI/V4D.

J. Xia, N. Yokoya, B. Adriano, and C. Broni-Bediako, “OpenEarthMap: A benchmark dataset for global high-resolution land cover mapping,” Proc. WACV, 2023.

Abstract: We introduce OpenEarthMap, a benchmark dataset, for global high-resolution land cover mapping. OpenEarthMap consists of 2.2 million segments of 5000 aerial and satellite images covering 97 regions from 44 countries across 6 continents, with manually annotated 8-class land cover labels at a 0.25-0.5m ground sampling distance. Semantic segmentation models trained on the OpenEarthMap generalize worldwide and can be used as off-the-shelf models in a variety of applications. We evaluate the performance of state-of-the-art methods for unsupervised domain adaptation and present challenging problem settings suitable for further technical development. We also investigate lightweight models using automated neural architecture search for limited computational resources and fast mapping. The dataset will be made publicly available.

X. Dong, N. Yokoya, L. Wang, and T. Uezato, “Learning mutual modulation for self-supervised cross-modal super-resolution,” Proc. ECCV, 2022.

Abstract: Self-supervised cross-modal super-resolution (SR) can overcome the difficulty of acquiring paired training data, but is challenging because only low-resolution (LR) source and high-resolution (HR) guide images from different modalities are available. Existing methods utilize pseudo or weak supervision in LR space and thus deliver results that are blurry or not faithful to the source modality. To address this issue, we present a mutual modulation SR (MMSR) model, which tackles the task by a mutual modulation strategy, including a source-to-guide modulation and a guide-to-source modulation. In these modulations, we develop cross-domain adaptive filters to fully exploit cross-modal spatial dependency and help induce the source to emulate the resolution of the guide and induce the guide to mimic the modality characteristics of the source. Moreover, we adopt a cycle consistency constraint to train MMSR in a fully self-supervised manner. Experiments on various tasks demonstrate the state-of-the-art performance of our MMSR.

H. Chen, N. Yokoya, C. Wu, and B. Du, “Unsupervised multimodal change detection based on structural relationship graph representation learning,” IEEE Transactions on Geoscience and Remote Sensing, 2022.

Abstract: Unsupervised multimodal change detection is a practical and challenging topic that can play an important role in time-sensitive emergency applications. To address the challenge that multimodal remote sensing images cannot be directly compared due to their modal heterogeneity, we take advantage of two types of modality-independent structural relationships in multimodal images. In particular, we present a structural relationship graph representation learning framework for measuring the similarity of the two structural relationships. Firstly, structural graphs are generated from preprocessed multimodal image pairs by means of an object-based image analysis approach. Then, a structural relationship graph convolutional autoencoder (SR-GCAE) is proposed to learn robust and representative features from graphs. Two loss functions aiming at reconstructing vertex information and edge information are presented to make the learned representations applicable for structural relationship similarity measurement. Subsequently, the similarity levels of two structural relationships are calculated from learned graph representations and two difference images are generated based on the similarity levels. After obtaining the difference images, an adaptive fusion strategy is presented to fuse the two difference images. Finally, a morphological filtering-based postprocessing approach is employed to refine the detection results. Experimental results on six datasets with different modal combinations demonstrate the effectiveness of the proposed method.


2013 Mar. D.Eng. | Department of Aeronautics and Astronautics | The University of Tokyo | Japan
2010 Sep. M.Eng. | Department of Aeronautics and Astronautics | The University of Tokyo | Japan
2008 Mar. B.Eng. | Department of Aeronautics and Astronautics | The University of Tokyo | Japan


2023 Apr.-present Team Leader | RIKEN | Japan
2022 Dec.-present Associate Professor | The University of Tokyo | Japan
2020 May-2022 Nov. Lecturer | The University of Tokyo | Japan
2018 Jan.-2023 Mar. Unit Leader | RIKEN | Japan
2019 Apr.-2020 Mar. Visiting Associate Professor | Tokyo University of Agriculture and Technology | Japan
2015 Dec.-2017 Nov. Alexander von Humboldt Research Fellow | DLR and TUM | Germany
2013 Jul.-2017 Dec. Assistant Professor | The University of Tokyo | Japan
2013 Aug.-2014 Jul. Visiting Scholar | National Food Research Institute (NFRI) | Japan
2012 Apr.-2013 Jun. JSPS Research Fellow | The University of Tokyo | Japan


Clarivate Highly Cited Researcher in Geosciences (2022, 2023).
1st place in the 2017 IEEE GRSS Data Fusion Contest.
Alexander von Humboldt research fellowship for postdoctoral researchers (2015).
Best presentation award of the Remote Sensing Society of Japan (2011, 2012, 2019).

Research grants

2022 Apr.-2026 Mar. PI, Grant-in-Aid for Scientific Research B, Japan Society for the Promotion of Science (JSPS)
2021 Apr.-2028 Mar. PI, FOREST (Fusion Oriented REsearch for disruptive Science and Technology), Japan Science and Technology Agency (JST)
2021 Apr.-2024 Mar. CI, Grant-in-Aid for Scientific Research B, Japan Society for the Promotion of Science (JSPS)
2019 Apr.-2022 Mar. CI, Grant-in-Aid for Scientific Research B, Japan Society for the Promotion of Science (JSPS)
2018 Apr.-2021 Mar. PI, Grant-in-Aid for Young Scientists, Japan Society for the Promotion of Science (JSPS)
2015 Apr.-2018 Mar. PI, Grant-in-Aid for Young Scientists (B), Japan Society for the Promotion of Science (JSPS)
2015 Jan.-2016 Dec. PI, Research Grant Program, Kayamori Foundation of Informational Science Advancement
2013 Aug.-2014 Mar. PI, Adaptable and Seamless Technology Transfer Program through Target-driven R&D (A-STEP), Japan Science and Technology Agency (JST)
2012 Apr.-2013 Jun. PI, Grant-in-Aid for JSPS Fellows, Japan Society for the Promotion of Science (JSPS)


Mathematics for Information Science (in Japanese) | The University of Tokyo | since 2020
Computer Vision (in Japanese) | The University of Tokyo | since 2021
Remote Sensing Image Analysis (in English) | The University of Tokyo | since 2021

International mobility

2015 Sep.-2017 Nov. Visiting scholar at DLR and TUM, München, Germany.
2011 Oct.-2012 Mar. Visiting student at the Grenoble Institute of Technology, Grenoble, France.


Organizer | IJCAI CDCEO Workshop 2022
Organizer | CVPR EarthVision Workshop 2019, 2020, 2021, 2022
Chair & Co-Chair | IEEE GRSS Image Analysis and Data Fusion Technical Committee (2017-2021)
Secretary | IEEE GRSS All Japan Joint Chapter (2018-2021)
Organizer | IEEE GRSS Data Fusion Contest 2018, 2019, 2020, 2021
Student Activity & TIE Event Chair | IEEE IGARSS 2019
Program Chair | IEEE WHISPERS 2015

Editorial activity

Associate Editor | ISPRS Journal of Photogrammetry and Remote Sensing, since 2024.
Associate Editor | IEEE Transactions on Geoscience and Remote Sensing, since 2021.
Associate Editor | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2018-2021.
Editorial Board Member | Remote Sensing, since 2018.
Guest Editor | Remote Sensing
 List of Special Issues

 "Remote Sensing on Land Surface Albedo"
 "Advanced Machine Learning Techniques for High-Resolution Remote Sensing Data Analysis"
 "Data Fusion for Urban Applications"
 "Spectral Data Meets Machine Learning: From Datasets to Algorithms and Applications"
 "Deep Learning and Feature Mining Using Hyperspectral Imagery"
 "Point Cloud Processing in Remote Sensing"
 "Multisensor Data Fusion in Remote Sensing"
 "Spatial Enhancement of Hyperspectral Data and Applications"

Guest Editor | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
 List of Special Issues

 "Benchmarking in Remote Sensing Data Science"
 "Semantic Extraction and Fusion of Multimodal Remote Sensing Data: Algorithms and Applications"
 "2020 Gaofen Challenge on Automated High-Resolution Earth Observation Image Interpretation"
 "Integrating Physics and Artificial Intelligence for Remote Sensing Applications"
 "Computer Vision-based Approaches for Earth Observation"
 "Hyperspectral Remote Sensing and Imaging Spectroscopy"

Guest Editor | IEEE Geoscience and Remote Sensing Letters
 Special Issue on "Advanced Processing for Multimodal Optical Remote Sensing Imagery"


Reviewer for various journals and conferences:
- IEEE Transactions on Geoscience and Remote Sensing
- IEEE Transactions on Image Processing
- IEEE Transactions on Signal Processing
- IEEE Transactions on Computational Imaging
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
- IEEE Journal of Selected Topics in Signal Processing
- IEEE Geoscience and Remote Sensing Letters
- IEEE Geoscience and Remote Sensing Magazine
- Proceedings of the IEEE
- Remote Sensing
- Remote Sensing of Environment
- International Journal of Remote Sensing
- Pattern Recognition
- Pattern Recognition Letters
- Neurocomputing