Deep learning techniques have shown superior performance in dermatological clinical inspection. Nevertheless, melanoma diagnosis remains a challenging task due to the difficulty of incorporating useful dermatologist clinical knowledge into the learning process. In this paper, we propose a novel knowledge-aware deep framework that incorporates clinical knowledge into the collaborative learning of two important melanoma diagnosis tasks, i.e., skin lesion segmentation and melanoma recognition. Specifically, to exploit the knowledge of morphological expressions of the lesion region and also the periphery region for melanoma identification, a lesion-based pooling and shape extraction (LPSE) scheme is designed, which transfers the structure information obtained from skin lesion segmentation into melanoma recognition. Meanwhile, to pass skin lesion diagnosis knowledge from melanoma recognition to skin lesion segmentation, an effective diagnosis guided feature fusion (DGFF) strategy is designed. Moreover, we propose a recursive mutual learning mechanism that further promotes inter-task cooperation and thus iteratively improves the joint learning capability of the model for both skin lesion segmentation and melanoma recognition. Experimental results on two publicly available skin lesion datasets show the effectiveness of the proposed method for melanoma analysis.
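As a rough illustration of how segmentation structure can feed recognition, the sketch below pools features separately over a predicted lesion mask and its periphery. This is a hypothetical `lesion_pooling` helper written for illustration, not the paper's actual LPSE module:

```python
import numpy as np

def lesion_pooling(feat, mask):
    """Masked average pooling over the lesion region and its periphery.

    feat: (C, H, W) feature maps; mask: (H, W) binary lesion mask.
    Returns a (2C,) vector: lesion-region pooled features followed by
    periphery pooled features.
    """
    m = mask.astype(feat.dtype)[None]               # (1, H, W), broadcast over C
    lesion = (feat * m).sum(axis=(1, 2)) / (m.sum() + 1e-8)
    periphery = (feat * (1 - m)).sum(axis=(1, 2)) / ((1 - m).sum() + 1e-8)
    return np.concatenate([lesion, periphery])
```

A recognition head could then consume this vector alongside appearance features, so that the classifier sees both lesion and periphery statistics.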
Xiaohong Wang; Xudong Jiang; Henghui Ding; Yuqian Zhao; Jun Liu. Knowledge-aware deep framework for collaborative skin lesion segmentation and melanoma recognition. Pattern Recognition 2021, 120, 108075.
AMA Style: Xiaohong Wang, Xudong Jiang, Henghui Ding, Yuqian Zhao, Jun Liu. Knowledge-aware deep framework for collaborative skin lesion segmentation and melanoma recognition. Pattern Recognition. 2021;120:108075.
Chicago/Turabian Style: Xiaohong Wang; Xudong Jiang; Henghui Ding; Yuqian Zhao; Jun Liu. 2021. "Knowledge-aware deep framework for collaborative skin lesion segmentation and melanoma recognition." Pattern Recognition 120: 108075.
Conventional deep neural networks use simple classifiers to obtain highly accurate results. However, they have limitations in practical applications. This study demonstrates a robust deep metric neural network model for rare bioparticle detection.
Shaobo Luo; Yuzhi Shi; Lip Ket Chin; Yi Zhang; Bihan Wen; Ying Sun; Binh T. T. Nguyen; Giovanni Chierchia; Hugues Talbot; Tarik Bourouina; Xudong Jiang; Ai-Qun Liu. Rare bioparticle detection via deep metric learning. RSC Advances 2021, 11, 17603-17610.
AMA Style: Shaobo Luo, Yuzhi Shi, Lip Ket Chin, Yi Zhang, Bihan Wen, Ying Sun, Binh T. T. Nguyen, Giovanni Chierchia, Hugues Talbot, Tarik Bourouina, Xudong Jiang, Ai-Qun Liu. Rare bioparticle detection via deep metric learning. RSC Advances. 2021;11(29):17603-17610.
Chicago/Turabian Style: Shaobo Luo; Yuzhi Shi; Lip Ket Chin; Yi Zhang; Bihan Wen; Ying Sun; Binh T. T. Nguyen; Giovanni Chierchia; Hugues Talbot; Tarik Bourouina; Xudong Jiang; Ai-Qun Liu. 2021. "Rare bioparticle detection via deep metric learning." RSC Advances 11, no. 29: 17603-17610.
Motor imagery brain-computer interfaces (MI-BCI) have many promising applications, but problems such as poor classification accuracy and robustness still need to be addressed. We propose a novel approach called time-frequency common spatial patterns (TFCSP) to enhance the robustness and accuracy of electroencephalogram (EEG) signal classification. The proposed approach decomposes the EEG signal into time stages and frequency components to find the most robust and discriminative features. Common spatial patterns (CSP) are extracted from every decomposed time-frequency cell; unreliable features are removed, while the remaining features are weighted and regularized for classification. Comparison on three publicly available datasets from BCI competitions III and IV shows that the proposed TFCSP outperforms state-of-the-art methods. This demonstrates that adopting the subject reaction time paradigm is useful for enhancing classification performance. It also shows that the complex CSP in the frequency domain is significantly more effective than the commonly used bandpass filters in the time domain. Finally, this work shows that weighting and regularizing CSP features is a better technique than selecting the leading CSP features, because the former alleviates information loss.
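The CSP extraction applied to a single time-frequency cell can be sketched as follows. This is a minimal NumPy version for two classes of band-filtered EEG trials; the feature removal, weighting, and regularization steps that TFCSP adds are omitted, and the function name is our own:

```python
import numpy as np

def csp_filters(trials_a, trials_b, n_filters=4):
    """Common spatial patterns for two classes of EEG trials.

    trials_*: (n_trials, n_channels, n_samples) band-filtered EEG.
    Returns (n_filters, n_channels): spatial filters whose projected
    variance is maximal for one class and minimal for the other.
    """
    def mean_cov(trials):
        # trace-normalized covariance averaged over trials
        return np.mean([x @ x.T / np.trace(x @ x.T) for x in trials], axis=0)

    Ca, Cb = mean_cov(trials_a), mean_cov(trials_b)
    # whiten the composite covariance, then diagonalize class A in the
    # whitened space; extreme eigenvalues give the most discriminative filters
    d, U = np.linalg.eigh(Ca + Cb)
    P = (U / np.sqrt(d)).T                      # whitening transform
    s, B = np.linalg.eigh(P @ Ca @ P.T)
    W = B.T @ P                                 # CSP filters as rows
    order = np.argsort(s)
    idx = np.r_[order[: n_filters // 2], order[-(n_filters - n_filters // 2):]]
    return W[idx]
```

TFCSP would run this per time-frequency cell and then score, weight, and regularize the resulting log-variance features before classification.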
Vasilisa Mishuhina; Xudong Jiang. Complex common spatial patterns on time-frequency decomposed EEG for brain-computer interface. Pattern Recognition 2021, 115, 107918.
AMA Style: Vasilisa Mishuhina, Xudong Jiang. Complex common spatial patterns on time-frequency decomposed EEG for brain-computer interface. Pattern Recognition. 2021;115:107918.
Chicago/Turabian Style: Vasilisa Mishuhina; Xudong Jiang. 2021. "Complex common spatial patterns on time-frequency decomposed EEG for brain-computer interface." Pattern Recognition 115: 107918.
Imaging flow cytometry has become a popular technology for bioparticle image analysis because of its capability of capturing thousands of images per second. Nevertheless, the vast number of images generated by imaging flow cytometry imposes great challenges for data analysis especially when the species have similar morphologies. In this work, we report a deep learning‐enabled high‐throughput system for predicting Cryptosporidium and Giardia in drinking water. This system combines imaging flow cytometry and an efficient artificial neural network called MCellNet, which achieves a classification accuracy >99.6%. The system can detect Cryptosporidium and Giardia with a sensitivity of 97.37% and a specificity of 99.95%. The high‐speed analysis reaches 346 frames per second, outperforming the state‐of‐the‐art deep learning algorithm MobileNetV2 in speed (251 frames per second) with a comparable classification accuracy. The reported system empowers rapid, accurate, and high throughput bioparticle detection in clinical diagnostics, environmental monitoring and other potential biosensing applications.
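For reference, the reported sensitivity and specificity follow the standard confusion-matrix definitions:

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN): fraction of true positives detected.
    Specificity = TN / (TN + FP): fraction of negatives correctly rejected."""
    return tp / (tp + fn), tn / (tn + fp)
```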
Shaobo Luo; Kim Truc Nguyen; Binh T. T. Nguyen; Shilun Feng; Yuzhi Shi; Ahmed Elsayed; Yi Zhang; Xiaohong Zhou; Bihan Wen; Giovanni Chierchia; Hugues Talbot; Tarik Bourouina; Xudong Jiang; Ai Qun Liu. Deep learning‐enabled imaging flow cytometry for high‐speed Cryptosporidium and Giardia detection. Cytometry Part A 2021, 1.
AMA Style: Shaobo Luo, Kim Truc Nguyen, Binh T. T. Nguyen, Shilun Feng, Yuzhi Shi, Ahmed Elsayed, Yi Zhang, Xiaohong Zhou, Bihan Wen, Giovanni Chierchia, Hugues Talbot, Tarik Bourouina, Xudong Jiang, Ai Qun Liu. Deep learning‐enabled imaging flow cytometry for high‐speed Cryptosporidium and Giardia detection. Cytometry Part A. 2021:1.
Chicago/Turabian Style: Shaobo Luo; Kim Truc Nguyen; Binh T. T. Nguyen; Shilun Feng; Yuzhi Shi; Ahmed Elsayed; Yi Zhang; Xiaohong Zhou; Bihan Wen; Giovanni Chierchia; Hugues Talbot; Tarik Bourouina; Xudong Jiang; Ai Qun Liu. 2021. "Deep learning‐enabled imaging flow cytometry for high‐speed Cryptosporidium and Giardia detection." Cytometry Part A: 1.
Complex-valued neural networks have many advantages over their real-valued counterparts. Conventional digital electronic computing platforms are incapable of executing truly complex-valued representations and operations. In contrast, optical computing platforms that encode information in both phase and magnitude can execute complex arithmetic by optical interference, offering significantly enhanced computational speed and energy efficiency. However, to date, most demonstrations of optical neural networks still only utilize conventional real-valued frameworks that are designed for digital computers, forfeiting many of the advantages of optical computing such as efficient complex-valued operations. In this article, we highlight an optical neural chip (ONC) that implements truly complex-valued neural networks. We benchmark the performance of our complex-valued ONC in four settings: simple Boolean tasks, species classification of an Iris dataset, classifying nonlinear datasets (Circle and Spiral), and handwriting recognition. Strong learning capabilities (i.e., high accuracy, fast convergence and the capability to construct nonlinear decision boundaries) are achieved by our complex-valued ONC compared to its real-valued counterpart.
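Below is a minimal numerical sketch of the complex-valued arithmetic involved, in ordinary NumPy rather than on the optical chip; `mod_relu` is one common complex nonlinearity, shown purely for illustration and not taken from the paper:

```python
import numpy as np

def complex_dense(x, W, b):
    """A fully connected layer whose inputs, weights, and biases are all
    complex-valued, so each multiply-accumulate acts jointly on phase and
    magnitude -- the operation optical interference implements natively."""
    return W @ x + b

def mod_relu(z, bias=-0.1):
    """modReLU nonlinearity: thresholds the magnitude, preserves the phase."""
    mag = np.abs(z)
    scale = np.maximum(mag + bias, 0.0) / np.maximum(mag, 1e-12)
    return scale * z
```

On a digital computer each complex multiply costs four real multiplies; the appeal of the optical platform is that interference performs it directly.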
H. Zhang; M. Gu; X. D. Jiang; J. Thompson; H. Cai; S. Paesani; R. Santagati; A. Laing; Y. Zhang; M. H. Yung; Y. Z. Shi; F. K. Muhammad; G. Q. Lo; X. S. Luo; B. Dong; D. L. Kwong; L. C. Kwek; A. Q. Liu. An optical neural chip for implementing complex-valued neural network. Nature Communications 2021, 12, 1-11.
AMA Style: H. Zhang, M. Gu, X. D. Jiang, J. Thompson, H. Cai, S. Paesani, R. Santagati, A. Laing, Y. Zhang, M. H. Yung, Y. Z. Shi, F. K. Muhammad, G. Q. Lo, X. S. Luo, B. Dong, D. L. Kwong, L. C. Kwek, A. Q. Liu. An optical neural chip for implementing complex-valued neural network. Nature Communications. 2021;12(1):1-11.
Chicago/Turabian Style: H. Zhang; M. Gu; X. D. Jiang; J. Thompson; H. Cai; S. Paesani; R. Santagati; A. Laing; Y. Zhang; M. H. Yung; Y. Z. Shi; F. K. Muhammad; G. Q. Lo; X. S. Luo; B. Dong; D. L. Kwong; L. C. Kwek; A. Q. Liu. 2021. "An optical neural chip for implementing complex-valued neural network." Nature Communications 12, no. 1: 1-11.
Scalable algorithms for variational posterior approximation allow Bayesian nonparametric models such as the Dirichlet process mixture to scale up to larger datasets at a fraction of the cost. Recent algorithms, notably stochastic variational inference, perform local learning from minibatches. The main problem with stochastic variational inference is that it relies on closed-form solutions. Stochastic gradient ascent is a modern approach to machine learning and is widely deployed in the training of deep neural networks. In this work, we explore using stochastic gradient ascent as a fast algorithm for the posterior approximation of the Dirichlet process mixture. However, stochastic gradient ascent alone is not optimal for learning. In order to achieve both speed and performance, we turn our focus to stepsize optimization in stochastic gradient ascent. As an intermediate approach, we first optimize the stepsize using the momentum method. Finally, we introduce Fisher information to allow an adaptive stepsize in our posterior approximation. In the experiments, we show that our stochastic gradient ascent approach does not sacrifice performance for speed when compared to closed-form coordinate ascent learning on these datasets. Lastly, our approach is also compatible with deep ConvNet features and scales to large-class datasets such as Caltech256 and SUN397.
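The stepsize idea can be sketched on a toy objective: precondition the stochastic gradient by an estimate of the Fisher information and add momentum. This scalar example is a simplification of the paper's variational update, with all names our own:

```python
import numpy as np

def sga_step(theta, grad, fisher, velocity, lr=0.1, momentum=0.9):
    """One gradient-ascent step with a Fisher-scaled (natural) gradient
    and heavy-ball momentum; the Fisher term adapts the effective stepsize."""
    nat_grad = grad / (fisher + 1e-8)
    velocity = momentum * velocity + lr * nat_grad
    return theta + velocity, velocity

# Toy problem: maximize f(theta) = -(theta - 3)^2 / 2.
# The gradient is (3 - theta), and the Fisher information of this
# unit-variance Gaussian-style objective is simply 1.
theta, v = 0.0, 0.0
for _ in range(300):
    theta, v = sga_step(theta, 3.0 - theta, 1.0, v)
```

In the actual algorithm the gradient comes from a minibatch estimate of the variational objective and the Fisher term is computed per variational parameter.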
Kart-Leong Lim; Xudong Jiang. Variational posterior approximation using stochastic gradient ascent with adaptive stepsize. Pattern Recognition 2020, 112, 107783.
AMA Style: Kart-Leong Lim, Xudong Jiang. Variational posterior approximation using stochastic gradient ascent with adaptive stepsize. Pattern Recognition. 2020;112:107783.
Chicago/Turabian Style: Kart-Leong Lim; Xudong Jiang. 2020. "Variational posterior approximation using stochastic gradient ascent with adaptive stepsize." Pattern Recognition 112: 107783.
High accuracy measurement of size is essential in the physical and biomedical sciences. Various sizing techniques have been widely used in sorting colloidal materials, analyzing bioparticles, and monitoring the quality of food and the atmosphere. Most imaging-free methods, such as light scattering, measure the averaged size of particles and have difficulty characterizing non-spherical particles. Image acquisition using a camera can observe individual nanoparticles in real time, but the accuracy is compromised by image defocusing and instrumental calibration. In this work, a machine learning-based pipeline is developed to facilitate high accuracy imaging-based particle sizing. The pipeline consists of an image segmentation module for cell identification and a machine learning model for accurate pixel-to-size conversion. The results manifest a significantly improved accuracy, showing great potential for a wide range of applications in environmental sensing, biomedical diagnostics, and material characterization.
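The pixel-to-size conversion stage can be sketched as a learned regression on calibration particles. The polynomial model below is a hypothetical stand-in for the paper's trained machine learning model:

```python
import numpy as np

def fit_pixel_to_size(pixel_diameters, true_sizes_um, degree=2):
    """Fit a polynomial mapping from measured pixel diameter to physical
    size, absorbing systematic bias (e.g. defocus blur) that a fixed
    pixels-per-micron scale cannot capture.  Returns a predictor function."""
    coeffs = np.polyfit(pixel_diameters, true_sizes_um, degree)
    return lambda px: np.polyval(coeffs, px)
```

In practice one would fit on beads of known diameter and then apply the predictor to diameters measured by the segmentation module.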
Shaobo Luo; Yi Zhang; Kim Truc Nguyen; Shilun Feng; Yuzhi Shi; Yang Liu; Paul Hutchinson; Giovanni Chierchia; Hugues Talbot; Tarik Bourouina; Xudong Jiang; Ai Qun Liu. Machine Learning-Based Pipeline for High Accuracy Bioparticle Sizing. Micromachines 2020, 11, 1084.
AMA Style: Shaobo Luo, Yi Zhang, Kim Truc Nguyen, Shilun Feng, Yuzhi Shi, Yang Liu, Paul Hutchinson, Giovanni Chierchia, Hugues Talbot, Tarik Bourouina, Xudong Jiang, Ai Qun Liu. Machine Learning-Based Pipeline for High Accuracy Bioparticle Sizing. Micromachines. 2020;11(12):1084.
Chicago/Turabian Style: Shaobo Luo; Yi Zhang; Kim Truc Nguyen; Shilun Feng; Yuzhi Shi; Yang Liu; Paul Hutchinson; Giovanni Chierchia; Hugues Talbot; Tarik Bourouina; Xudong Jiang; Ai Qun Liu. 2020. "Machine Learning-Based Pipeline for High Accuracy Bioparticle Sizing." Micromachines 11, no. 12: 1084.
Existing interactive object segmentation methods mainly take spatial interactions such as bounding boxes or clicks as input. However, these interactions do not contain information about explicit attributes of the target-of-interest and thus cannot quickly specify what the selected object exactly is, especially when there are diverse scales of candidate objects or the target-of-interest contains multiple objects. Therefore, excessive user interactions are often required to reach desirable results. On the other hand, attribute information of objects is often not well utilized in existing interactive segmentation approaches. We propose to employ phrase expressions as another interaction input to infer the attributes of the target object. In this way, we can 1) leverage spatial clicks to locate the target object and 2) utilize semantic phrases to qualify the attributes of the target object. Specifically, the phrase expressions focus on “what” the target object is and the spatial clicks are in charge of “where” the target object is, which together help to accurately segment the target-of-interest with a smaller number of interactions. Moreover, the proposed approach is flexible in terms of interaction modes and can efficiently handle complex scenarios by leveraging the strengths of each type of input. Our multi-modal phrase+click approach achieves new state-of-the-art performance on interactive segmentation. To the best of our knowledge, this is the first work to leverage both clicks and phrases for interactive segmentation.
Henghui Ding; Scott Cohen; Brian Price; Xudong Jiang. PhraseClick: Toward Achieving Flexible Interactive Segmentation by Phrase and Click. Transactions on Petri Nets and Other Models of Concurrency XV 2020, 417-435.
AMA Style: Henghui Ding, Scott Cohen, Brian Price, Xudong Jiang. PhraseClick: Toward Achieving Flexible Interactive Segmentation by Phrase and Click. Transactions on Petri Nets and Other Models of Concurrency XV. 2020:417-435.
Chicago/Turabian Style: Henghui Ding; Scott Cohen; Brian Price; Xudong Jiang. 2020. "PhraseClick: Toward Achieving Flexible Interactive Segmentation by Phrase and Click." Transactions on Petri Nets and Other Models of Concurrency XV: 417-435.
Motivated by the previous success of the Two-Dimensional Convolutional Neural Network (2D CNN) on image recognition, researchers endeavor to leverage it to characterize videos. However, one limitation of applying 2D CNN to analyze videos is that different frames of a video share the same 2D CNN kernels, which may result in repeated and redundant information utilization, especially in the spatial semantics extraction process, hence neglecting the critical variations among frames. In this paper, we attempt to tackle this issue in two ways. 1) We design a sequential channel filtering mechanism, i.e., a Progressive Enhancement Module (PEM), to excite the discriminative channels of features from different frames step by step, and thus avoid repeated information extraction. 2) We create a Temporal Diversity Loss (TD Loss) to force the kernels to concentrate on and capture the variations among frames rather than image regions with similar appearance. Our method is evaluated on the benchmark temporal reasoning datasets Something-Something V1 and V2, where it achieves visible improvements over the best competitor by 2.4% and 1.3%, respectively. Besides, performance improvements over the 2D-CNN-based state of the art on the large-scale Kinetics dataset are also observed.
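The TD Loss idea can be sketched as penalizing cosine similarity between the same channel's responses in different frames; this simplified NumPy version may differ from the paper's exact formulation:

```python
import numpy as np

def temporal_diversity_loss(feats):
    """feats: (T, C, H*W) per-frame channel responses.

    Averages the channel-wise cosine similarity over all frame pairs;
    minimizing it pushes kernels to respond differently across frames
    instead of repeating the same spatial pattern."""
    T = feats.shape[0]
    normed = feats / (np.linalg.norm(feats, axis=-1, keepdims=True) + 1e-8)
    loss, pairs = 0.0, 0
    for i in range(T):
        for j in range(i + 1, T):
            # cosine similarity of each channel between frames i and j
            loss += np.mean(np.sum(normed[i] * normed[j], axis=-1))
            pairs += 1
    return loss / pairs
```

Identical frames give a loss of 1 (maximal redundancy), so the loss directly measures what the paper calls repeated information.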
Junwu Weng; Donghao Luo; Yabiao Wang; Ying Tai; Chengjie Wang; Jilin Li; Feiyue Huang; Xudong Jiang; Junsong Yuan. Temporal Distinct Representation Learning for Action Recognition. Transactions on Petri Nets and Other Models of Concurrency XV 2020, 363-378.
AMA Style: Junwu Weng, Donghao Luo, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Xudong Jiang, Junsong Yuan. Temporal Distinct Representation Learning for Action Recognition. Transactions on Petri Nets and Other Models of Concurrency XV. 2020:363-378.
Chicago/Turabian Style: Junwu Weng; Donghao Luo; Yabiao Wang; Ying Tai; Chengjie Wang; Jilin Li; Feiyue Huang; Xudong Jiang; Junsong Yuan. 2020. "Temporal Distinct Representation Learning for Action Recognition." Transactions on Petri Nets and Other Models of Concurrency XV: 363-378.
Paulo Lobato Correia; Xudong Jiang. Editorial: Keeping identity in times of change. IET Biometrics 2020, 9, 223.
AMA Style: Paulo Lobato Correia, Xudong Jiang. Editorial: Keeping identity in times of change. IET Biometrics. 2020;9(6):223.
Chicago/Turabian Style: Paulo Lobato Correia; Xudong Jiang. 2020. "Editorial: Keeping identity in times of change." IET Biometrics 9, no. 6: 223.
Unmanned aerial vehicles (UAVs) have been used in a wide range of applications and have become an increasingly important radar target. To better model radar data and to tackle the curse of dimensionality, a three-step classification framework is proposed for UAV detection. First, we propose to utilize greedy subspace clustering to handle potential outliers and the complex sample distribution of radar data. The parameters of the resulting multi-Gaussian model, especially the covariance matrices, cannot be reliably estimated due to insufficient training samples and the high dimensionality. Thus, in the second step, a multi-Gaussian subspace reliability analysis is proposed to handle the unreliable feature dimensions of these covariance matrices. To address the challenges of classifying samples using the complex multi-Gaussian model and of fusing the distances of a sample to different clusters at different dimensionalities, a subspace-fusion scheme is proposed in the third step. The proposed approach is validated on a large benchmark dataset and significantly outperforms state-of-the-art approaches.
Jianfeng Ren; Xudong Jiang. A Three-Step Classification Framework to Handle Complex Data Distribution for Radar UAV Detection. Pattern Recognition 2020, 111, 107709.
AMA Style: Jianfeng Ren, Xudong Jiang. A Three-Step Classification Framework to Handle Complex Data Distribution for Radar UAV Detection. Pattern Recognition. 2020;111:107709.
Chicago/Turabian Style: Jianfeng Ren; Xudong Jiang. 2020. "A Three-Step Classification Framework to Handle Complex Data Distribution for Radar UAV Detection." Pattern Recognition 111: 107709.
In this paper, we address the challenging task of estimating 6D object poses from a single RGB image. Motivated by deep learning-based object detection methods, we propose a concise and efficient network that integrates 6D object pose parameter estimation into the object detection framework. Furthermore, for estimation that is more robust to occlusion, a non-local self-attention module is introduced. The experimental results show that the proposed method reaches state-of-the-art performance on the YCB-Video and Linemod datasets.
Jianhan Mei; Henghui Ding; Xudong Jiang. Object 6D pose estimation with non-local attention. Twelfth International Conference on Digital Image Processing (ICDIP 2020) 2020, 11519, 115191H.
AMA Style: Jianhan Mei, Henghui Ding, Xudong Jiang. Object 6D pose estimation with non-local attention. Twelfth International Conference on Digital Image Processing (ICDIP 2020). 2020;11519:115191H.
Chicago/Turabian Style: Jianhan Mei; Henghui Ding; Xudong Jiang. 2020. "Object 6D pose estimation with non-local attention." Twelfth International Conference on Digital Image Processing (ICDIP 2020) 11519: 115191H.
This paper aims to calibrate the orientation of glass and the field of view of the camera from a single reflection-contaminated image. We show how a reflective amplitude coefficient map can be used as a calibration cue. Different from existing methods, the proposed solution is free from image contents. To reduce the impact of a noisy calibration cue estimated from a reflection-contaminated image, we propose two strategies: an optimization-based method that imposes only the reliable entries of the map, and a learning-based method that fully exploits all entries. We collect a dataset containing 320 samples together with their camera parameters for evaluation. We demonstrate that our method not only facilitates a general single image camera calibration method that leverages image contents, but also contributes to improving the performance of single image reflection removal. Furthermore, we show that our byproduct output helps alleviate the ill-posed problem of estimating the panorama from a single image.
Qian Zheng; Jinnan Chen; Zhan Lu; Boxin Shi; Xudong Jiang; Kim-Hui Yap; Ling-Yu Duan; Alex C. Kot. What Does Plate Glass Reveal About Camera Calibration? 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020, 3019-3029.
AMA Style: Qian Zheng, Jinnan Chen, Zhan Lu, Boxin Shi, Xudong Jiang, Kim-Hui Yap, Ling-Yu Duan, Alex C. Kot. What Does Plate Glass Reveal About Camera Calibration? 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2020:3019-3029.
Chicago/Turabian Style: Qian Zheng; Jinnan Chen; Zhan Lu; Boxin Shi; Xudong Jiang; Kim-Hui Yap; Ling-Yu Duan; Alex C. Kot. 2020. "What Does Plate Glass Reveal About Camera Calibration?" 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR): 3019-3029.
Skin lesion segmentation from dermoscopy images is a fundamental yet challenging task in the computer-aided skin diagnosis system due to the large variations in the views and scales of lesion areas. We propose a novel and effective generative adversarial network (GAN) to meet these challenges. Specifically, this network architecture integrates two modules: a skip connection and dense convolution U-Net (UNet-SCDC) based segmentation module and a dual discrimination (DD) module. While the UNet-SCDC module uses dense dilated convolution blocks to generate a deep representation that preserves fine-grained information, the DD module makes use of two discriminators to jointly decide whether the input of the discriminators is real or fake. While one discriminator, with a traditional adversarial loss, focuses on the differences at the boundaries of the generated segmentation masks and the ground truths, the other examines the contextual environment of the target object in the original image using a conditional discriminative loss. We integrate these two modules and train the proposed GAN in an end-to-end manner. The proposed GAN is evaluated on the public International Skin Imaging Collaboration (ISIC) Skin Lesion Challenge Datasets of 2017 and 2018. Extensive experimental results demonstrate that the proposed network achieves segmentation performance superior to state-of-the-art methods.
Baiying Lei; Zaimin Xia; Feng Jiang; Xudong Jiang; Zongyuan Ge; Yanwu Xu; Jing Qin; Siping Chen; Tianfu Wang; Shuqiang Wang. Skin lesion segmentation via generative adversarial networks with dual discriminators. Medical Image Analysis 2020, 64, 101716.
AMA Style: Baiying Lei, Zaimin Xia, Feng Jiang, Xudong Jiang, Zongyuan Ge, Yanwu Xu, Jing Qin, Siping Chen, Tianfu Wang, Shuqiang Wang. Skin lesion segmentation via generative adversarial networks with dual discriminators. Medical Image Analysis. 2020;64:101716.
Chicago/Turabian Style: Baiying Lei; Zaimin Xia; Feng Jiang; Xudong Jiang; Zongyuan Ge; Yanwu Xu; Jing Qin; Siping Chen; Tianfu Wang; Shuqiang Wang. 2020. "Skin lesion segmentation via generative adversarial networks with dual discriminators." Medical Image Analysis 64: 101716.
The goal of early action recognition is to predict the action label when a sequence is only partially observed. Existing methods treat early action recognition as a sequential classification problem at different observation ratios of an action sequence. Since these models are trained by differentiating the positive category from all negative classes, the diverse information of different negative categories is ignored, which we believe can be exploited to improve recognition performance. In this paper, we take a step in a new direction by introducing category exclusion to early action recognition. We model the exclusion as a mask operation on the classification probability output of a pre-trained early action recognition classifier. Specifically, we use policy-based reinforcement learning to train an agent. The agent generates a series of binary masks to exclude interfering negative categories during action execution and hence helps improve recognition accuracy. The proposed method is evaluated on three benchmark recognition datasets: NTU-RGBD, First-Person Hand Action, and UCF-101. The proposed method enhances recognition accuracy consistently over all observation ratios on the three datasets, and the accuracy improvements in the early stages are especially significant.
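The mask operation itself is simple: zero out the excluded categories in the classifier's probability output and renormalize. The learned part is the agent that chooses the mask; the helper below only illustrates the masking step:

```python
import numpy as np

def apply_exclusion_mask(probs, mask):
    """Exclude interfering negative categories from a classifier output.

    probs: (..., K) class probabilities; mask: (..., K) binary, 0 = excluded.
    Returns renormalized probabilities over the surviving categories.
    """
    masked = probs * mask
    return masked / masked.sum(axis=-1, keepdims=True)
```

Renormalization redistributes the excluded mass to the remaining classes, which is why removing a confusable negative category can flip an early, uncertain prediction to the correct label.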
Junwu Weng; Xudong Jiang; Wei-Long Zheng; Junsong Yuan. Early Action Recognition With Category Exclusion Using Policy-Based Reinforcement Learning. IEEE Transactions on Circuits and Systems for Video Technology 2020, 30, 4626-4638.
AMA Style: Junwu Weng, Xudong Jiang, Wei-Long Zheng, Junsong Yuan. Early Action Recognition With Category Exclusion Using Policy-Based Reinforcement Learning. IEEE Transactions on Circuits and Systems for Video Technology. 2020;30(12):4626-4638.
Chicago/Turabian Style: Junwu Weng; Xudong Jiang; Wei-Long Zheng; Junsong Yuan. 2020. "Early Action Recognition With Category Exclusion Using Policy-Based Reinforcement Learning." IEEE Transactions on Circuits and Systems for Video Technology 30, no. 12: 4626-4638.
An autoencoder that learns a latent space in an unsupervised manner has many applications in signal processing. However, the latent space of an autoencoder does not pursue the same clustering goal as K-means or GMM. A recent work by Song et al. proposes to artificially re-align each point in the latent space of an autoencoder to its nearest class neighbors during training. The resulting new latent space is found to be much more suitable for clustering, since clustering information is used. Inspired by Song et al., in this paper we propose several extensions to this technique. First, we propose a probabilistic approach to generalize Song's approach, such that Euclidean distance in the latent space is replaced by KL divergence. Second, as a consequence of this generalization, we can now use probability distributions as inputs rather than points in the latent space. Third, we propose using a Bayesian Gaussian mixture model for clustering in the latent space. We demonstrate our proposed method on the digit recognition datasets MNIST, USPS, and SVHN, as well as the scene datasets Scene15 and MIT67, with interesting findings.
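Replacing Euclidean distance with KL divergence means comparing the encoder's output distributions rather than latent points. For diagonal Gaussians, the natural fit for a VAE encoder, the divergence has a closed form, sketched here:

```python
import numpy as np

def kl_diag_gauss(mu0, var0, mu1, var1):
    """KL( N(mu0, diag(var0)) || N(mu1, diag(var1)) ) in closed form.

    All arguments are 1-D arrays of per-dimension means and variances.
    This is the distance used in place of Euclidean distance between
    encoder outputs and cluster centers in the latent space.
    """
    return 0.5 * np.sum(
        np.log(var1 / var0) + (var0 + (mu0 - mu1) ** 2) / var1 - 1.0
    )
```

Unlike Euclidean distance, this divergence shrinks toward zero only when both the means and the variances agree, so uncertain encodings are treated differently from confident ones.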
Kart-Leong Lim; Xudong Jiang; Chenyu Yi. Deep Clustering With Variational Autoencoder. IEEE Signal Processing Letters 2020, 27, 231-235.
AMA Style: Kart-Leong Lim, Xudong Jiang, Chenyu Yi. Deep Clustering With Variational Autoencoder. IEEE Signal Processing Letters. 2020;27:231-235.
Chicago/Turabian Style: Kart-Leong Lim; Xudong Jiang; Chenyu Yi. 2020. "Deep Clustering With Variational Autoencoder." IEEE Signal Processing Letters 27: 231-235.
Semantic image segmentation aims to classify every pixel of a scene image into one of many classes. It implicitly involves object recognition, localization, and boundary delineation. In this paper, we propose a segmentation network called CGBNet to enhance the parsing results via context encoding and multi-path decoding. We first propose a context encoding module that generates context-contrasted local features to make use of informative context and discriminative local information. This context encoding module greatly improves the segmentation performance, especially for inconspicuous objects. Furthermore, we propose a scale-selection scheme to selectively fuse the parsing results from different scales of features at every spatial position. It adaptively selects appropriate score maps from rich scales of features. To improve the parsing results at boundaries, we further propose a boundary delineation module that encourages the location-specific very-low-level features near the boundaries to take part in the final prediction and suppresses them far from the boundaries. Without bells and whistles, the proposed segmentation network achieves very competitive performance in terms of all three different evaluation metrics consistently on the four popular scene segmentation datasets: Pascal Context, SUN-RGBD, Sift Flow, and COCO Stuff.
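The scale-selection idea can be sketched as a per-pixel softmax over scales that softly picks which scale's score map to trust at each position; this is our simplified reading of the scheme, not the paper's exact module:

```python
import numpy as np

def scale_select_fuse(score_maps, gate_logits):
    """Per-position soft selection among multi-scale parsing scores.

    score_maps: (S, C, H, W) class scores from S scales of features.
    gate_logits: (S, H, W) learned per-position scale preferences.
    Returns (C, H, W): the fused score map.
    """
    # softmax over the scale axis, computed stably
    g = np.exp(gate_logits - gate_logits.max(axis=0, keepdims=True))
    g = g / g.sum(axis=0, keepdims=True)
    # weight each scale's scores by its per-pixel gate and sum over scales
    return np.sum(score_maps * g[:, None], axis=0)
```

When a gate saturates, the fusion degenerates to hard selection of one scale at that pixel, which matches the "adaptively selects appropriate score maps" behavior described above.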
Henghui Ding; Xudong Jiang; Bing Shuai; Ai Qun Liu; Gang Wang. Semantic Segmentation With Context Encoding and Multi-Path Decoding. IEEE Transactions on Image Processing 2020, 29, 3520-3533.
AMA Style: Henghui Ding, Xudong Jiang, Bing Shuai, Ai Qun Liu, Gang Wang. Semantic Segmentation With Context Encoding and Multi-Path Decoding. IEEE Transactions on Image Processing. 2020;29:3520-3533.
Chicago/Turabian Style: Henghui Ding; Xudong Jiang; Bing Shuai; Ai Qun Liu; Gang Wang. 2020. "Semantic Segmentation With Context Encoding and Multi-Path Decoding." IEEE Transactions on Image Processing 29: 3520-3533.
Qian Zheng; Yiming Jia; Boxin Shi; Xudong Jiang; Lingyu Duan; Alex Kot. SPLINE-Net: Sparse Photometric Stereo Through Lighting Interpolation and Normal Estimation Networks. 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019, 1.
AMA Style: Qian Zheng, Yiming Jia, Boxin Shi, Xudong Jiang, Lingyu Duan, Alex Kot. SPLINE-Net: Sparse Photometric Stereo Through Lighting Interpolation and Normal Estimation Networks. 2019 IEEE/CVF International Conference on Computer Vision (ICCV). 2019:1.
Chicago/Turabian Style: Qian Zheng; Yiming Jia; Boxin Shi; Xudong Jiang; Lingyu Duan; Alex Kot. 2019. "SPLINE-Net: Sparse Photometric Stereo Through Lighting Interpolation and Normal Estimation Networks." 2019 IEEE/CVF International Conference on Computer Vision (ICCV): 1.
Henghui Ding; Xudong Jiang; Ai Qun Liu; Nadia Magnenat Thalmann; Gang Wang. Boundary-Aware Feature Propagation for Scene Segmentation. 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019, 1.
AMA Style: Henghui Ding, Xudong Jiang, Ai Qun Liu, Nadia Magnenat Thalmann, Gang Wang. Boundary-Aware Feature Propagation for Scene Segmentation. 2019 IEEE/CVF International Conference on Computer Vision (ICCV). 2019:1.
Chicago/Turabian Style: Henghui Ding; Xudong Jiang; Ai Qun Liu; Nadia Magnenat Thalmann; Gang Wang. 2019. "Boundary-Aware Feature Propagation for Scene Segmentation." 2019 IEEE/CVF International Conference on Computer Vision (ICCV): 1.
This paper presents a novel adversarial scheme to perform image denoising for the tasks of rain streak removal and reflection removal. Similar to several previous works, the proposed method first estimates a prior image and then uses it to guide the inference of the noise-free image. The novelty of our approach is to jointly learn the gradient and the noise-free image based on an adversarial scheme. More specifically, we use the gradient map as the prior image. The noise-free image inferred from an estimated gradient is regarded as a negative sample, while the noise-free image inferred from the ground-truth gradient is taken as a positive sample. With the anchor defined by the ground-truth noise-free image, we play a min-max game to jointly train two optimizers for the estimation of the gradient and the inference of noise-free images. We show that both the prior image and the noise-free image can be accurately obtained under this adversarial scheme. Our state-of-the-art performance on two public benchmark datasets validates the effectiveness of our approach.
Qian Zheng; Boxin Shi; Xudong Jiang; Ling-Yu Duan; Alex C. Kot. Denoising Adversarial Networks for Rain Removal and Reflection Removal. 2019 IEEE International Conference on Image Processing (ICIP) 2019, 2766-2770.
AMA Style: Qian Zheng, Boxin Shi, Xudong Jiang, Ling-Yu Duan, Alex C. Kot. Denoising Adversarial Networks for Rain Removal and Reflection Removal. 2019 IEEE International Conference on Image Processing (ICIP). 2019:2766-2770.
Chicago/Turabian Style: Qian Zheng; Boxin Shi; Xudong Jiang; Ling-Yu Duan; Alex C. Kot. 2019. "Denoising Adversarial Networks for Rain Removal and Reflection Removal." 2019 IEEE International Conference on Image Processing (ICIP): 2766-2770.