Abstract
Measuring the material, geometry, and ambient lighting of surfaces is a key step in reconstructing an object's appearance. In this article, we propose a novel deep-learning-based method that extracts this information from a single RGB image to reconstruct the object's appearance. First, we design new deep convolutional neural network architectures that improve performance by fusing complementary features from hierarchical layers and from different tasks. We then generate a synthetic dataset to train the proposed model, addressing the absence of ground truth for real images. To bridge the domain gap between the synthetic data and a specific real image, we introduce a self-supervised test-time training strategy that fine-tunes the trained model. The proposed architecture requires only one image as input when inferring material, geometry, and ambient lighting. We evaluate the proposed method on both synthetic and real data. The results show that our trained model outperforms existing baselines on each task and yields a clear improvement in the final appearance reconstruction, verifying the effectiveness of the proposed method.
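The self-supervised test-time training idea mentioned in the abstract can be illustrated with a toy numerical sketch (an assumption for illustration only, not the authors' actual network or renderer): a "pretrained" linear predictor maps an image to appearance parameters, a fixed linear "renderer" maps parameters back to an image, and at test time the predictor is fine-tuned by minimizing the self-supervised reconstruction loss on the single real input image.

```python
import numpy as np

# Toy sketch of self-supervised test-time training (hypothetical sizes and
# linear stand-ins for the CNN predictor and the differentiable renderer).
rng = np.random.default_rng(0)
D, K = 16, 4                        # image dim, appearance-parameter dim
render = rng.normal(size=(D, K))    # fixed "renderer" R: params -> image
W = rng.normal(size=(K, D)) * 0.1   # "pretrained" predictor weights

x = rng.normal(size=D)              # the single real test image

def recon_loss(W):
    """Self-supervised loss: re-render the predicted parameters and
    compare with the input image itself (no ground truth needed)."""
    x_hat = render @ (W @ x)
    return 0.5 * np.sum((x_hat - x) ** 2)

lr = 1e-3
losses = [recon_loss(W)]
for _ in range(200):                # a few fine-tuning steps on this one image
    params = W @ x                  # predict appearance parameters
    x_hat = render @ params         # re-render the image
    grad_W = np.outer(render.T @ (x_hat - x), x)  # dL/dW via the chain rule
    W -= lr * grad_W                # gradient step on the predictor only
    losses.append(recon_loss(W))

assert losses[-1] < losses[0]       # reconstruction improves on the test image
```

In the paper's setting the predictor is the trained CNN and the loss compares the image re-rendered from the estimated material, geometry, and lighting against the real input, but the adaptation mechanism is the same: a few gradient steps on a label-free loss computed from the test image alone.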
Original language | English |
---|---|
Pages (from-to) | 201861-201873 |
Number of pages | 13 |
Journal | IEEE Access |
Volume | 8 |
DOIs | |
Publication status | Published - 2020 |
Keywords
- Attention
- Deep learning
- Feature fusion
- Image-based rendering
- Inverse rendering
- Lighting recovery
- Material estimation