Original Article

International Journal of Fuzzy Logic and Intelligent Systems 2022; 22(4): 339-349

Published online December 25, 2022

https://doi.org/10.5391/IJFIS.2022.22.4.339

© The Korean Institute of Intelligent Systems

Extended Siamese Convolutional Neural Networks for Discriminative Feature Learning

Sangyun Lee and Sungjun Hong

School of Information Technology, Sungkonghoe University, Seoul, Korea

Correspondence to: Sungjun Hong (sjhong@skhu.ac.kr)

Received: November 9, 2021; Revised: June 30, 2022; Accepted: October 17, 2022

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted noncommercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Siamese convolutional neural networks (SCNNs) are considered among the best deep learning architectures for visual object verification. However, these models have the drawback that each branch extracts features independently, without considering the other branch, which sometimes leads to unsatisfactory performance. In this study, we propose a new architecture called the extended SCNN (ESCNN) that addresses this limitation by learning both independent and relative features for a pair of images. ESCNNs also have a feature augmentation architecture that exploits the multi-level features of the underlying SCNN. Feature visualization showed that the proposed ESCNN can encode relative and discriminative information for the two input images at multiple scales. Finally, we applied an ESCNN model to a person verification problem; the experimental results indicate that the ESCNN achieved an accuracy of 97.7%, outperforming an SCNN model with 91.4% accuracy. Ablation studies also showed that a small version of the ESCNN performed 5.6% better than an SCNN model.

Keywords: Discriminative feature, Feature augmentation, Object verification, Siamese convolutional neural network

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (grant number NRF-2019R1I1A1A01059759).

No potential conflict of interest relevant to this article was reported.

Sangyun Lee received the B.S. and Ph.D. degrees in Electrical and Electronic Engineering from Yonsei University, Seoul, Korea, in 2011 and 2018, respectively. From 2018 to 2021, he was a Senior Researcher at Samsung Electronics Co., Ltd. Since 2021, he has been with the faculty of the School of Information Technology, Sungkonghoe University, Seoul, Korea, where he is currently an Assistant Professor. His current research interests include artificial intelligence, computer vision, and their various applications.

Sungjun Hong received the B.S. degree in Electrical and Electronic Engineering and Computer Science and the Ph.D. degree in Electrical and Electronic Engineering from Yonsei University, Seoul, Korea, in 2005 and 2012, respectively. Upon graduation, he worked as a Senior Researcher at LG Electronics in the connected-car industry from 2012 to 2013, and then as a Lead Software Engineer with The Pinkfong Company from 2013 to 2016. He was a Postdoctoral Researcher and a Research Professor with the School of Electrical and Electronic Engineering, Yonsei University, from 2016 to 2020, prior to his current appointment. He is currently an Assistant Professor with the School of Information Technology, Sungkonghoe University, Seoul, Korea. His research interests include machine learning, deep learning, computer vision, and their various applications. He received the IET Computer Vision Premium Award from the Institution of Engineering and Technology (IET), U.K., in 2015.


Figure 1. Architecture of a conventional SCNN. The network is trained by contrastive loss in the training stage, whereas a distance function is used to compute the similarity metric in the testing stage.
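The contrastive training described in the Figure 1 caption can be sketched as follows. This is a minimal NumPy illustration of the standard contrastive loss on embedding pairs, not the authors' exact formulation; the label convention (y = 1 for matching pairs) and the margin value are assumptions.

```python
import numpy as np

def contrastive_loss(z1, z2, y, margin=1.0):
    """Contrastive loss on a batch of embedding pairs.

    z1, z2 : (n, d) embeddings from the two Siamese branches
    y      : (n,) labels, 1 for matching pairs, 0 for non-matching
    """
    d = np.linalg.norm(z1 - z2, axis=1)             # Euclidean distance per pair
    pos = y * d**2                                  # pull matching pairs together
    neg = (1 - y) * np.maximum(margin - d, 0.0)**2  # push non-matching pairs beyond the margin
    return 0.5 * np.mean(pos + neg)
```

At test time, the same pairwise distance is thresholded by a distance function to produce the match/non-match decision.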

Figure 2. The proposed ESCNN architecture, which consists of three parts: (a) Siamese, (b) extension, and (c) decision parts. The feature dimensions are denoted as h × w × d, and the bracketed numbers correspond to ESCNN-tiny, introduced in Section 4.3.

Figure 3. Visualization of the features learned by the ESCNN: (a) positive and (b) negative samples.

Figure 4. Training strategy of the proposed network. The network is optimized by a combination of two loss functions: 1) contrastive loss for the Siamese part and 2) cross-entropy loss for all parts, including the extension and decision parts.
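The two-term objective in the Figure 4 caption can be written as a weighted sum of the contrastive loss and the cross-entropy loss. The sketch below illustrates one such combination under assumed conventions (y = 1 for matching pairs, a weighting factor lam that this excerpt does not specify); the paper's exact combination rule may differ.

```python
import numpy as np

def combined_loss(d, p, y, margin=1.0, lam=1.0):
    """Joint objective: contrastive loss on the Siamese part's pair
    distances d, plus binary cross-entropy on the decision part's
    predicted match probabilities p."""
    pos = y * d**2                                      # matching pairs: minimize distance
    neg = (1 - y) * np.maximum(margin - d, 0.0)**2      # non-matching: enforce the margin
    contrastive = 0.5 * np.mean(pos + neg)
    eps = 1e-12                                         # guard against log(0)
    ce = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    return contrastive + lam * ce
```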

Figure 5. Examples from the iLIDS-VID dataset.

Figure 6. Some example results: (a) positive and (b) negative samples.

Figure 7. ROC curves for the methods under consideration.

Table 1. Quantitative results.

Method                      | TPR (recall) | TNR (specificity) | PPV (precision) | F1 score | Accuracy
SIFT [4]                    | 0.886        | 0.890             | 0.889           | 0.888    | 0.888
SURF [5]                    | 0.748        | 0.751             | 0.750           | 0.750    | 0.750
Standard Siamese model [12] | 0.979        | 0.849             | 0.866           | 0.919    | 0.914
ESCNN-abs                   | 0.994        | 0.905             | 0.913           | 0.952    | 0.949
ESCNN                       | 0.983        | 0.971             | 0.971           | 0.977    | 0.977
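The five columns in Table 1 follow the standard confusion-matrix definitions; a small helper to reproduce them from raw counts (names are illustrative):

```python
def verification_metrics(tp, fn, tn, fp):
    """Compute the metrics reported in Table 1 from confusion-matrix counts."""
    tpr = tp / (tp + fn)              # TPR: recall / sensitivity
    tnr = tn / (tn + fp)              # TNR: specificity
    ppv = tp / (tp + fp)              # PPV: precision
    f1 = 2 * ppv * tpr / (ppv + tpr)  # harmonic mean of precision and recall
    acc = (tp + tn) / (tp + fn + tn + fp)
    return tpr, tnr, ppv, f1, acc
```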

Table 2. Comparison of the number of trainable weights.

Weight type                  | Standard Siamese model [12] | ESCNN & ESCNN-abs | ESCNN-tiny
Convolution                  | 144,540                     | 612,540           | 62,820
Convolution (bias)           | 300                         | 1,100             | 350
Batch normalization          | 800                         | 2,400             | 900
Fully connected layer        | 100,200                     | 350,200           | 105,200
Fully connected layer (bias) | 102                         | 102               | 102
Total number of weights      | 245,942                     | 966,342           | 169,372
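Per-layer trainable-weight counts like those in Table 2 follow directly from the layer shapes. The helpers below show the standard counting rules; the ESCNN's actual kernel sizes and channel widths are not given in this excerpt, so the example shapes are hypothetical.

```python
def conv2d_weights(k, c_in, c_out):
    """k x k convolution: weight tensor plus one bias per output channel."""
    return k * k * c_in * c_out, c_out

def batchnorm_weights(c):
    """Trainable scale (gamma) and shift (beta) per channel."""
    return 2 * c

def fc_weights(n_in, n_out):
    """Fully connected layer: weight matrix plus one bias per output unit."""
    return n_in * n_out, n_out
```

For example, a 3 × 3 convolution from 3 to 16 channels contributes 432 weights and 16 biases.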

Table 3. Quantitative results for the ESCNN-tiny.

Method                      | TPR (recall) | TNR (specificity) | PPV (precision) | F1 score | Accuracy
Standard Siamese model [12] | 0.979        | 0.849             | 0.866           | 0.919    | 0.914
ESCNN-tiny                  | 0.985        | 0.955             | 0.957           | 0.971    | 0.970