This is the Python reference implementation of AdaptionSeg proposed in "Curriculum Domain Adaptation for Semantic Segmentation of Urban Scenes".
Curriculum Domain Adaptation for Semantic Segmentation of Urban Scenes Yang Zhang; Philip David; Boqing Gong; International Conference on Computer Vision, 2017
[ICCV paper] [ArXiv Extended paper] [Poster]
We introduced a set of constraints to domain-adapt an arbitrary segmentation convolutional neural network (CNN) trained on a source domain (synthetic images) to a target domain (real images) without accessing target-domain annotations.
- Linux
- A CUDA-enabled NVIDIA GPU; at least 11 GB of video memory is recommended
The code requires the following dependencies:
- Python 2/3
- Theano (installation)
- Keras>=2.0.5 (lower versions might encounter a Conv2DTranspose problem with the Theano backend) (installation; you might want to install through pip, since conda only offers Keras<=2.0.2)
- Pillow (installation)
Make sure your Keras image_data_format is channels_first.
It is recommended to use Theano as the backend. However, TensorFlow should also work. Note that using TensorFlow will result in lower initial/baseline model performance because the baseline model was trained using Theano.
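The two settings above can be checked before training. The following is a minimal sketch, assuming the Keras configuration lives in the default `~/.keras/keras.json` file; the helper names `load_keras_config` and `keras_config_warnings` are hypothetical and not part of this repository.

```python
import json
import os


def load_keras_config(path=None):
    """Load the Keras JSON config; assumes the default ~/.keras/keras.json."""
    if path is None:
        path = os.path.join(os.path.expanduser("~"), ".keras", "keras.json")
    with open(path) as f:
        return json.load(f)


def keras_config_warnings(cfg):
    """Return warnings for settings this README advises against."""
    warnings = []
    # The code expects channels-first image tensors.
    if cfg.get("image_data_format") != "channels_first":
        warnings.append("image_data_format should be 'channels_first'")
    # Theano is recommended because the baseline model was trained with it.
    if cfg.get("backend") != "theano":
        warnings.append("backend is not 'theano'; baseline performance may be lower")
    return warnings
```

Calling `keras_config_warnings(load_keras_config())` returns an empty list when both settings match the recommendations. Note that the `KERAS_BACKEND` environment variable, if set, overrides the backend stored in the file, which this sketch does not account for.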
1. Download leftImg8bit_trainvaltest.zip and leftImg8bit_trainextra.zip from the CityScape dataset here. (Registration required.)
2. Download SYNTHIA-RAND-CITYSCAPES from the SYNTHIA dataset here.
3. Download our auxiliary pre-inferred target domain properties (including both the superpixel landmark and label distribution described in the paper) and parsed annotations here.
4. Download the submodule cityscapesScripts for evaluation.
5. Unzip and organize them in this way:
./
├── train_val_FCN_DA.py
├── ...
├── cityscapesScripts/
│   ├── ...
│   └── cityscapesscripts/
│       ├── ...
│       └── evaluation/...
└── data/
    ├── Image/
    │   ├── CityScape/  # Unzip from two CityScape zips
    │   │   ├── test/
    │   │   ├── train/
    │   │   ├── train_extra/
    │   │   └── val/
    │   └── SYNTHIA/  # Unzip from the SYNTHIA dataset
    │       └── train/
    ├── label_distribution/  # Unzip from our auxiliary dataset
    │   └── ...
    ├── segmentation_annotation/  # Unzip from our auxiliary dataset
    │   └── ...
    ├── SP_labels/  # Unzip from our auxiliary dataset
    │   └── ...
    └── SP_landmark/  # Unzip from our auxiliary dataset
        └── ...
(Hint: If you have already downloaded the datasets but do not want to move them around, you can create symbolic links to the existing local datasets.)
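Before training, it may be worth confirming that the layout matches the tree above. The sketch below only checks the directories named in that tree; `EXPECTED_DIRS` and `missing_dirs` are hypothetical names introduced for illustration and are not part of this repository.

```python
import os

# Directories taken from the tree above, relative to the repository root.
EXPECTED_DIRS = [
    "cityscapesScripts/cityscapesscripts/evaluation",
    "data/Image/CityScape/test",
    "data/Image/CityScape/train",
    "data/Image/CityScape/train_extra",
    "data/Image/CityScape/val",
    "data/Image/SYNTHIA/train",
    "data/label_distribution",
    "data/segmentation_annotation",
    "data/SP_labels",
    "data/SP_landmark",
]


def missing_dirs(root="."):
    """Return the expected directories that do not exist under `root`."""
    return [
        d for d in EXPECTED_DIRS
        # os.path.isdir follows symbolic links, so linked datasets count.
        if not os.path.isdir(os.path.join(root, *d.split("/")))
    ]
```

Running `missing_dirs(".")` from the repository root returns an empty list when everything is in place. Because `os.path.isdir` follows symbolic links, a dataset linked with, for example, `os.symlink("/path/to/existing/CityScape", "data/Image/CityScape")` passes the check as well (the source path here is a placeholder).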
Run train_val_FCN_DA.py either in your favorite Python IDE or from the terminal by typing:
python train_val_FCN_DA.py
This trains the model for six epochs and saves the best model found during training. If training takes too long, you can stop it early and proceed to the evaluation, but performance is not guaranteed in that case.
After running train_val_FCN_DA.py for at least 500 steps, run test_FCN_DA.py either in your favorite Python IDE or from the terminal by typing:
python test_FCN_DA.py
This evaluates both the pre-trained SYNTHIA-FCN and the adapted FCN on the CityScape dataset and prints the mean IoU of each.
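For readers unfamiliar with the metric, mean IoU (intersection over union, averaged over classes) can be illustrated as follows. This is a simplified sketch for intuition only and is not the code that test_FCN_DA.py or cityscapesScripts runs; in particular, it omits the handling of ignored labels. The function name `mean_iou` and its arguments are assumptions made for this example.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Mean IoU over classes present in the ground truth or the prediction.

    y_true, y_pred: flat sequences of integer class ids, one per pixel.
    """
    intersection = [0] * num_classes
    true_count = [0] * num_classes
    pred_count = [0] * num_classes
    for t, p in zip(y_true, y_pred):
        true_count[t] += 1
        pred_count[p] += 1
        if t == p:
            intersection[t] += 1

    ious = []
    for c in range(num_classes):
        union = true_count[c] + pred_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both label maps
            ious.append(intersection[c] / float(union))
    return sum(ious) / len(ious) if ious else 0.0
```

For example, with ground truth `[0, 0, 1, 1]` and prediction `[0, 1, 1, 1]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean IoU is 7/12.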
The original framework was implemented in Keras 1 with custom transposed convolution ops. The performance might be slightly different from that reported in the paper. Also, some commits to the TF/Theano optimizer implementations after the code release have broken the losses' numerical stability. I have changed the code's optimizer to SGD, although the original paper used Adadelta. You are welcome to try Adadelta/Adam; however, they seem to result in a NaN loss right after training starts. If the NaN problem persists, try removing the label distribution loss from the training.
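When experimenting with other optimizers, it can help to notice a NaN loss as early as possible. The sketch below assumes you have collected per-step loss values as plain floats (for instance from your own training log); `first_non_finite` is a hypothetical helper and not part of this repository.

```python
import math


def first_non_finite(losses):
    """Return the index of the first NaN/inf loss, or None if all are finite."""
    for step, loss in enumerate(losses):
        if math.isnan(loss) or math.isinf(loss):
            return step
    return None
```

If the returned step is very small, the loss diverged right after training started, which matches the behavior described above for Adadelta/Adam.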
Please cite our paper if this code benefits your research:
@InProceedings{Zhang_2017_ICCV,
author = {Zhang, Yang and David, Philip and Gong, Boqing},
title = {Curriculum Domain Adaptation for Semantic Segmentation of Urban Scenes},
booktitle={The IEEE International Conference on Computer Vision (ICCV)},
volume={2},
number={5},
pages={6},
month = {Oct},
year = {2017}
}