International Journal of Fuzzy Logic and Intelligent Systems 2020; 20(1): 59-68
Published online March 25, 2020
https://doi.org/10.5391/IJFIS.2020.20.1.59
© The Korean Institute of Intelligent Systems
Seoung-Ho Choi1 and Sung Hoon Jung2
1Department of Electronics and Information Engineering, Hansung University, Seoul, Korea
2Division of Mechanical and Electronics Engineering, Hansung University, Seoul, Korea
Correspondence to: Sung Hoon Jung (shjung@hansung.ac.kr)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted noncommercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Acquisition of fine-grained segments is important in most semantic segmentation applications, especially for clothing images composed of fine-grained textures. However, most existing semantic segmentation methods based on the fully convolutional network (FCN) are not sufficient to acquire fine-grained segments because they operate at a single resolution and cannot distinguish well between objects in the images. To stabilize the acquisition of fine-grained segments, we propose a method that adds two components to the U-Net structure for processing multi-scale fine-grained segments. The first component is normalization at all layers. We found from experiments that normalization is a key process in stabilizing the acquisition of fine-grained segments, especially in U-Net based methods, because they operate on multi-scale fine-grained segments. The second component is model prediction correction using focal loss with L1 regularization. Focal loss can be used to control the model prediction term as regularization in the training process. Experiments showed that our method outperformed the existing methods.
Keywords: Multi-scale segments, Batch normalization, Model prediction correction
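The training objective described above combines a focal loss term with an L1 penalty on the model weights. As a minimal illustrative sketch (not the authors' implementation; the function name, the probability-matrix input format, and the flat weight vector are assumptions for demonstration), the combined loss can be written as:

```python
import numpy as np

def focal_loss_with_l1(probs, targets, weights, gamma=2.0, l1_coef=0.5):
    """Per-pixel focal loss plus an L1 penalty on the model weights.

    probs:   (N, C) predicted class probabilities, one row per pixel
    targets: (N,) integer class labels
    weights: flat array standing in for the model parameters
    """
    # Probability assigned to the true class of each pixel.
    pt = probs[np.arange(len(targets)), targets]
    # Focal loss down-weights well-classified pixels by (1 - pt)^gamma,
    # so training focuses on hard, fine-grained regions.
    focal = np.mean(-((1.0 - pt) ** gamma) * np.log(pt + 1e-8))
    # L1 regularization term on the weights, scaled by the coefficient
    # (the paper's experiments use coefficients 0.0 and 0.5).
    l1 = l1_coef * np.sum(np.abs(weights))
    return focal + l1
```

With `l1_coef=0.0` this reduces to plain focal loss, matching settings (i) and (ii) in the experiments; with `l1_coef=0.5` it matches the "focal loss with L1 0.5" setting.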
Proposed U-Net structure.
Result of focal loss regularization model: (a) FCN, (b) attention U-Net, and (c) U-Net BN. (I) Cross-entropy with L1 0.0 and L2 0.0 regularization coefficient, (II) cross-entropy with L1 0.0 and L2 0.5 regularization coefficient, (III) cross-entropy with L1 0.5 and L2 0.0 regularization coefficient, (IV) cross-entropy with L1 0.5 and L2 0.5 regularization coefficient, (V) focal loss with L1 0.0 and L2 0.0 regularization coefficient, (VI) focal loss with L1 0.0 and L2 0.5 regularization coefficient, (VII) focal loss with L1 0.5 and L2 0.0 regularization coefficient, and (VIII) focal loss with L1 0.5 and L2 0.5 regularization coefficient.
Comparison with and without BN for two loss function types and various regularization coefficients within training time. (I) Cross-entropy with L1 0.0 and L2 0.0 regularization coefficient, (II) cross-entropy with L1 0.0 and L2 0.5 regularization coefficient, (III) cross-entropy with L1 0.5 and L2 0.0 regularization coefficient, (IV) cross-entropy with L1 0.5 and L2 0.5 regularization coefficient, (V) focal loss with L1 0.0 and L2 0.0 regularization coefficient, (VI) focal loss with L1 0.0 and L2 0.5 regularization coefficient, (VII) focal loss with L1 0.5 and L2 0.0 regularization coefficient, and (VIII) focal loss with L1 0.5 and L2 0.5 regularization coefficient.
Comparison of F1 score according to the addition of an attention gate to the segmentation model within training time: (a) with the attention gate and (b) without the attention gate.
Four methods on four clothing images: (a) case 1, (b) case 2, (c) case 3, and (d) case 4. The proposed method uses focal loss with an L1 0.5 regularization coefficient. (I) FCN, (II) U-Net, (III) attention U-Net, and (IV) U-Net BN.
Comparison of regularization effect using non-regularization and regularization with L1 and L2 regularization: (a) cross-entropy loss and (b) focal loss. (i) L1 0.0 and L2 0.0 regularization coefficient, (ii) L1 0.5 and L2 0.5 regularization coefficient.
Table 1. Comparison of focal loss results across U-Net models.
Model | Setting | IoU | Precision | Recall
---|---|---|---|---
U-Net | (i) | 0.496 ± 0.000 | 0.001 ± 0.000 | 0.000 ± 0.000
U-Net | (ii) | 0.496 ± 0.000 | 0.001 ± 0.001 | 0.000 ± 0.000
U-Net | (iii) | 0.632 ± 0.011 | 0.344 ± 0.029 | 0.333 ± 0.004
U-Net | (iv) | 0.496 ± 0.000 | 0.003 ± 0.002 | 0.000 ± 0.000
Attention U-Net | (i) | 0.715 ± 0.008 | 0.536 ± 0.002 | 0.420 ± 0.002
Attention U-Net | (ii) | 0.724 ± 0.005 | 0.555 ± 0.005 | 0.444 ± 0.013
Attention U-Net | (iii) | 0.723 ± 0.002 | 0.554 ± 0.006 | 0.441 ± 0.004
Attention U-Net | (iv) | 0.723 ± 0.005 | 0.548 ± 0.012 | 0.437 ± 0.013
U-Net BN | (i) | 0.731 ± 0.002 | 0.565 ± 0.007 | 0.464 ± 0.005
U-Net BN | (ii) | 0.728 ± 0.004 | 0.564 ± 0.004 | 0.458 ± 0.004
U-Net BN | (iii) | 0.729 ± 0.003 | 0.567 ± 0.003 | 0.459 ± 0.007
U-Net BN | (iv) | 0.730 ± 0.001 | 0.564 ± 0.006 | 0.461 ± 0.001
(i) L1 0.0 and L2 0.0 regularization coefficient, (ii) L1 0.0 and L2 0.5 regularization coefficient, (iii) L1 0.5 and L2 0.0 regularization coefficient, and (iv) L1 0.5 and L2 0.5 regularization coefficient.