Jian Yang
School of Computer Science and Technology, Nanjing University of Science and Technology, Suzhou 210094, China

Feed

Journal article
Published: 02 August 2021 in IEEE Transactions on Intelligent Transportation Systems

In recent years, how to strike a good trade-off between accuracy, inference speed, and model size has become the core issue for real-time semantic segmentation applications, which play a vital role in real-world scenarios such as autonomous driving systems and drones. In this study, we devise a novel lightweight network with a multi-scale context fusion scheme (MSCFNet), which explores an asymmetric encoder-decoder architecture to alleviate these problems. More specifically, the encoder adopts efficient asymmetric residual (EAR) modules, which are composed of factorized depth-wise convolutions and dilated convolutions. Meanwhile, instead of complicated computation, simple deconvolution is applied in the decoder to further reduce the number of parameters while still maintaining high segmentation accuracy. Also, MSCFNet has branches with efficient attention modules at different stages of the network to capture multi-scale contextual information well. We then combine them before the final classification to enhance the expression of the features and improve segmentation efficiency. Comprehensive experiments on challenging datasets have demonstrated that the proposed MSCFNet, which contains only 1.15M parameters, achieves 71.9% mean IoU on the Cityscapes test set and can run at over 50 FPS on a single Titan XP GPU.
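As a rough illustration of why the factorized depth-wise design in the EAR modules keeps the network lightweight, the sketch below (with a hypothetical channel count of 64) compares the weight count of a plain 3x3 convolution against a 3x1 + 1x3 depth-wise pair:

```python
def conv_params(c_in, c_out, k_h, k_w, depthwise=False):
    """Weight count of a convolution layer (biases ignored)."""
    if depthwise:
        return c_in * k_h * k_w   # one k_h x k_w filter per input channel
    return c_in * c_out * k_h * k_w

C = 64                                             # hypothetical channel count
standard = conv_params(C, C, 3, 3)                 # plain 3x3 convolution
factorized = (conv_params(C, C, 3, 1, depthwise=True)
              + conv_params(C, C, 1, 3, depthwise=True))  # 3x1 + 1x3 depth-wise pair
print(standard, factorized)                        # 36864 vs. 384
```

The roughly two-orders-of-magnitude gap is what makes a sub-2M-parameter segmentation network feasible; the real EAR module adds dilation and residual connections on top of this.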

ACS Style

Guangwei Gao; Guoan Xu; Yi Yu; Jin Xie; Jian Yang; Dong Yue. MSCFNet: A Lightweight Network With Multi-Scale Context Fusion for Real-Time Semantic Segmentation. IEEE Transactions on Intelligent Transportation Systems 2021, PP, 1-11.

AMA Style

Guangwei Gao, Guoan Xu, Yi Yu, Jin Xie, Jian Yang, Dong Yue. MSCFNet: A Lightweight Network With Multi-Scale Context Fusion for Real-Time Semantic Segmentation. IEEE Transactions on Intelligent Transportation Systems. 2021; PP(99):1-11.

Chicago/Turabian Style

Guangwei Gao; Guoan Xu; Yi Yu; Jin Xie; Jian Yang; Dong Yue. 2021. "MSCFNet: A Lightweight Network With Multi-Scale Context Fusion for Real-Time Semantic Segmentation." IEEE Transactions on Intelligent Transportation Systems PP, no. 99: 1-11.

Journal article
Published: 01 July 2021 in IEEE Transactions on Neural Networks and Learning Systems

Magnetic resonance (MR) image acquisition is an inherently prolonged process, whose acceleration has long been the subject of research. This is commonly achieved by simultaneously obtaining multiple undersampled images through parallel imaging. In this article, we propose the dual-octave network (DONet), which is capable of learning multiscale spatial-frequency features from both the real and imaginary components of MR data, for parallel fast MR image reconstruction. More specifically, our DONet consists of a series of dual-octave convolutions (Dual-OctConvs), which are connected in a dense manner for better reuse of features. In each Dual-OctConv, the input feature maps and convolutional kernels are first split into two components (i.e., real and imaginary) and then divided into four groups according to their spatial frequencies. Our Dual-OctConv then conducts intragroup information updating and intergroup information exchange to aggregate the contextual information across different groups. Our framework provides three appealing benefits: 1) it encourages information interaction and fusion between the real and imaginary components at various spatial frequencies to achieve richer representational capacity; 2) the dense connections between the real and imaginary groups in each Dual-OctConv make the propagation of features more efficient through feature reuse; and 3) DONet enlarges the receptive field by learning multiple spatial-frequency features of both the real and imaginary components. Extensive experiments on two popular datasets (i.e., clinical knee and fastMRI), under different undersampling patterns and acceleration factors, demonstrate the superiority of our model for accelerated parallel MR image reconstruction.
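The four-group split at the heart of a Dual-OctConv can be sketched as follows; the `alpha` ratio and the 2x2 average pooling used to store the low-frequency groups at half resolution are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np

def dual_octave_split(feat, alpha=0.5):
    """Split a complex-valued feature tensor (C, H, W) into the four groups of a
    Dual-OctConv-style layer: {real, imaginary} x {high, low frequency}. The
    low-frequency groups are held at half spatial resolution (2x2 avg pooling)."""
    c = feat.shape[0]
    c_low = int(alpha * c)                     # channels assigned to low frequency
    def pool2x2(x):                            # 2x2 average pooling
        return x.reshape(x.shape[0], x.shape[1] // 2, 2,
                         x.shape[2] // 2, 2).mean(axis=(2, 4))
    real, imag = feat.real, feat.imag
    return {
        "real_high": real[c_low:],
        "real_low":  pool2x2(real[:c_low]),
        "imag_high": imag[c_low:],
        "imag_low":  pool2x2(imag[:c_low]),
    }

x = np.random.randn(8, 16, 16) + 1j * np.random.randn(8, 16, 16)
groups = dual_octave_split(x, alpha=0.5)
print({k: v.shape for k, v in groups.items()})
```

The actual Dual-OctConv then updates each group and exchanges information between groups (up/downsampling as needed), which this split only sets up.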

ACS Style

Chun-Mei Feng; Zhanyuan Yang; Huazhu Fu; Yong Xu; Jian Yang; Ling Shao. DONet: Dual-Octave Network for Fast MR Image Reconstruction. IEEE Transactions on Neural Networks and Learning Systems 2021, PP, 1-11.

AMA Style

Chun-Mei Feng, Zhanyuan Yang, Huazhu Fu, Yong Xu, Jian Yang, Ling Shao. DONet: Dual-Octave Network for Fast MR Image Reconstruction. IEEE Transactions on Neural Networks and Learning Systems. 2021; PP(99):1-11.

Chicago/Turabian Style

Chun-Mei Feng; Zhanyuan Yang; Huazhu Fu; Yong Xu; Jian Yang; Ling Shao. 2021. "DONet: Dual-Octave Network for Fast MR Image Reconstruction." IEEE Transactions on Neural Networks and Learning Systems PP, no. 99: 1-11.

Journal article
Published: 29 June 2021 in Pattern Recognition

Recently, Self-Expressive-based Subspace Clustering (SESC) has been widely applied in pattern clustering and machine learning, as it aims to learn a representation that faithfully reflects the correlation between data points. However, most existing SESC methods directly use the original data as the dictionary, which misses the intrinsic structure (e.g., low-rank and nonlinear) of real-world data. To address this problem, we propose a novel Projection Low-Rank Subspace Clustering (PLRSC) method that integrates feature extraction and subspace clustering into a unified framework. In particular, PLRSC learns a projection transformation to extract low-dimensional features and utilizes a low-rank regularizer to preserve the informative and important structures of the extracted features. The extracted low-rank features effectively enhance the self-expressive property of the dictionary. Furthermore, we extend PLRSC to a nonlinear version (i.e., NPLRSC) by integrating a nonlinear activator into the projection transformation. NPLRSC can not only effectively extract features but also guarantee the data structure of the extracted features. The corresponding optimization problem is solved by the Alternating Direction Method (ADM), and we also prove that the algorithm converges to a stationary point. Experimental results on real-world datasets validate the superiority of our model over existing subspace clustering methods.
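The self-expressive property the paper builds on can be illustrated with a ridge-regularized closed form; the function and the toy two-subspace data below are illustrative only, not PLRSC itself (which additionally learns the projection and the low-rank dictionary):

```python
import numpy as np

def self_expressive_coeffs(X, lam=0.1):
    """Ridge-regularized self-expression: each column of X is represented as a
    linear combination of all columns, C = argmin ||X - XC||_F^2 + lam||C||_F^2,
    with closed form C = (X^T X + lam I)^{-1} X^T X."""
    G = X.T @ X
    return np.linalg.solve(G + lam * np.eye(G.shape[0]), G)

# Two orthogonal 1-D subspaces: coefficients then stay within each subspace.
rng = np.random.default_rng(0)
b1 = rng.standard_normal(20)
b2 = rng.standard_normal(20)
b2 -= (b2 @ b1) / (b1 @ b1) * b1            # make the subspaces orthogonal
X = np.column_stack([b1, 2 * b1, -1.5 * b1, b2, -2 * b2, 0.5 * b2])
C = self_expressive_coeffs(X, lam=0.01)
affinity = np.abs(C) + np.abs(C).T          # affinity fed to spectral clustering
```

The block-diagonal structure of `C` (points only express themselves through their own subspace) is exactly what a good dictionary should induce; PLRSC's projection and low-rank regularizer aim to restore this structure when raw data lack it.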

ACS Style

Yesong Xu; Shuo Chen; Jun Li; Lei Luo; Jian Yang. Learnable low-rank latent dictionary for subspace clustering. Pattern Recognition 2021, 120, 108142.

AMA Style

Yesong Xu, Shuo Chen, Jun Li, Lei Luo, Jian Yang. Learnable low-rank latent dictionary for subspace clustering. Pattern Recognition. 2021; 120:108142.

Chicago/Turabian Style

Yesong Xu; Shuo Chen; Jun Li; Lei Luo; Jian Yang. 2021. "Learnable low-rank latent dictionary for subspace clustering." Pattern Recognition 120: 108142.

Journal article
Published: 03 June 2021 in IEEE Signal Processing Letters

Video inpainting aims to fill missing regions with plausible content in a video sequence. Deep learning-based video inpainting methods have made promising progress over the past few years. However, these methods tend to generate degraded completion content, such as missing textural details. To address this issue, we propose a novel Deformable Alignment and Pyramid-context Completion Network for video inpainting (DAPC-Net), which takes advantage of temporally redundant information across the video sequence. Specifically, we construct a deformable convolution alignment network (DANet) for aligning reference frames at the feature level. After alignment, we further devise a pyramid-context completion network (PCNet) to complete missing regions of the target frame. In particular, a pyramid completion mechanism and a cross-scale transference strategy are used to ensure the visual and semantic coherence of the completed target frame. Experimental results show that the proposed method not only achieves better quantitative and qualitative performance but also improves the inference speed by 35.4%.

ACS Style

Zhiliang Wu; Kang Zhang; Hanyu Xuan; Jian Yang; Yan Yan. DAPC-Net: Deformable Alignment and Pyramid Context Completion Networks for Video Inpainting. IEEE Signal Processing Letters 2021, 28, 1145-1149.

AMA Style

Zhiliang Wu, Kang Zhang, Hanyu Xuan, Jian Yang, Yan Yan. DAPC-Net: Deformable Alignment and Pyramid Context Completion Networks for Video Inpainting. IEEE Signal Processing Letters. 2021; 28:1145-1149.

Chicago/Turabian Style

Zhiliang Wu; Kang Zhang; Hanyu Xuan; Jian Yang; Yan Yan. 2021. "DAPC-Net: Deformable Alignment and Pyramid Context Completion Networks for Video Inpainting." IEEE Signal Processing Letters 28: 1145-1149.

Editorial
Published: 18 May 2021 in Computer Vision and Image Understanding
ACS Style

Jinshan Pan; Deqing Sun; Jian Yang; Wangmeng Zuo; Paolo Favaro; Yasuyuki Matsushita; Ming-Hsuan Yang. Editorial for CVIU_DL for image restoration. Computer Vision and Image Understanding 2021, 208-209, 103222.

AMA Style

Jinshan Pan, Deqing Sun, Jian Yang, Wangmeng Zuo, Paolo Favaro, Yasuyuki Matsushita, Ming-Hsuan Yang. Editorial for CVIU_DL for image restoration. Computer Vision and Image Understanding. 2021; 208-209:103222.

Chicago/Turabian Style

Jinshan Pan; Deqing Sun; Jian Yang; Wangmeng Zuo; Paolo Favaro; Yasuyuki Matsushita; Ming-Hsuan Yang. 2021. "Editorial for CVIU_DL for image restoration." Computer Vision and Image Understanding 208-209: 103222.

Journal article
Published: 14 May 2021 in IEEE Transactions on Knowledge and Data Engineering

Corporate relative valuation (CRV) refers to assessing a company's market value by comparison, based on the company's products, core staff, and other related information, which is critical for venture capital firms. Traditionally, relative valuation methods rely heavily on tedious and expensive human effort, especially for non-publicly listed companies. However, the availability of information about a company's invisible assets, such as patents, talent, and investors, enables a new paradigm for learning and evaluating corporate relative values automatically. Indeed, in this paper, we reveal that, if the companies and their core members are formed into a heterogeneous graph and the attributes of different nodes include semantically rich multi-modal data, a latent embedding can be extracted for each company. Along this line, we develop an end-to-end heterogeneous multi-modal graph neural network method, named HM^2. Specifically, HM^2 first performs representation learning for the heterogeneous neighbors of an input company by taking the relationships among nodes into consideration, aggregating node attributes via a linkage-aware multi-head attention mechanism rather than multi-instance-based methods. Then, HM^2 adopts a self-attention network to aggregate different modal embeddings for the final prediction, and employs a dynamic triplet loss with the embeddings of competitors as the constraint.
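The neighbor-aggregation step can be sketched with generic dot-product attention; the function below is a single-head simplification (HM^2 uses linkage-aware multi-head attention with relation-specific parameters), and all names and shapes here are hypothetical:

```python
import numpy as np

def softmax(z):
    z = z - z.max()                     # numerical stability
    e = np.exp(z)
    return e / e.sum()

def attention_aggregate(h_company, neighbor_feats, W_q, W_k, W_v):
    """Aggregate heterogeneous neighbor embeddings into one vector with scaled
    dot-product attention: the company embedding forms the query, neighbors
    supply keys and values (single head; a simplification of the real module)."""
    q = W_q @ h_company                 # query from the target company
    K = neighbor_feats @ W_k.T          # (n_neighbors, d) keys
    V = neighbor_feats @ W_v.T          # (n_neighbors, d) values
    scores = K @ q / np.sqrt(len(q))
    weights = softmax(scores)           # attention distribution over neighbors
    return weights @ V, weights

rng = np.random.default_rng(1)
d = 8
W_q, W_k, W_v = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
out, w = attention_aggregate(rng.standard_normal(d),
                             rng.standard_normal((5, d)), W_q, W_k, W_v)
```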

ACS Style

Yang Yang; Jia-Qi Yang; Ran Bao; De-Chuan Zhan; Hengshu Zhu; Xiao-Ru Gao; Hui Xiong; Jian Yang. Corporate Relative Valuation using Heterogeneous Multi-Modal Graph Neural Network. IEEE Transactions on Knowledge and Data Engineering 2021, PP, 1-1.

AMA Style

Yang Yang, Jia-Qi Yang, Ran Bao, De-Chuan Zhan, Hengshu Zhu, Xiao-Ru Gao, Hui Xiong, Jian Yang. Corporate Relative Valuation using Heterogeneous Multi-Modal Graph Neural Network. IEEE Transactions on Knowledge and Data Engineering. 2021; PP(99):1-1.

Chicago/Turabian Style

Yang Yang; Jia-Qi Yang; Ran Bao; De-Chuan Zhan; Hengshu Zhu; Xiao-Ru Gao; Hui Xiong; Jian Yang. 2021. "Corporate Relative Valuation using Heterogeneous Multi-Modal Graph Neural Network." IEEE Transactions on Knowledge and Data Engineering PP, no. 99: 1-1.

Journal article
Published: 10 May 2021 in IEEE Transactions on Geoscience and Remote Sensing

Recently, graph convolutional networks (GCNs) have progressed significantly and gained increasing attention in hyperspectral image (HSI) classification due to their impressive representation power. However, existing GCN-based methods do not give full consideration to multiscale spatial information, since the convolution operations are governed by a fixed neighborhood. As a result, their performance can be limited, particularly in regions with diverse land cover appearances. In this article, we develop a new dual interactive GCN (DIGCN), which introduces dual GCN branches to capture spatial information at different scales. More significantly, a dual interactive module is embedded across the GCN branches so that the correlation of multiscale spatial information can be leveraged to refine the graph information. To be concrete, the edge information contained in one GCN branch can be refined by incorporating the feature representations from the other branch. Analogously, improved feature representations can be generated in one GCN branch by fusing the edge information from the other branch. As such, the refined graph information helps enhance the representation power of the model. Furthermore, to avoid the negative effects of a manually constructed graph, our proposed model adaptively learns a discriminative region-induced graph, which also accelerates the convolution operation. We comprehensively evaluate the proposed method on four commonly used HSI benchmark data sets, achieving state-of-the-art results compared with several typical HSI classification methods.
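A single graph-convolution layer of the kind underlying each branch can be sketched as follows; this is a generic GCN layer with symmetric normalization, not DIGCN's dual interactive module, and the tiny region graph is invented for illustration:

```python
import numpy as np

def gcn_layer(A, X, W):
    """One graph-convolution layer: ReLU(D^{-1/2} (A + I) D^{-1/2} X W),
    the symmetric normalization standard in GCN-style models."""
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ X @ W, 0.0)         # aggregate, project, ReLU

# 4 graph nodes (e.g., image regions) on a path graph, 5-D features, 3 outputs
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
rng = np.random.default_rng(2)
H = gcn_layer(A, rng.standard_normal((4, 5)), rng.standard_normal((5, 3)))
```

DIGCN's contribution sits on top of this primitive: two such branches at different neighborhood scales, with edge weights in one branch refined by features from the other.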

ACS Style

Sheng Wan; Shirui Pan; Ping Zhong; Xiaojun Chang; Jian Yang; Chen Gong. Dual Interactive Graph Convolutional Networks for Hyperspectral Image Classification. IEEE Transactions on Geoscience and Remote Sensing 2021, PP, 1-14.

AMA Style

Sheng Wan, Shirui Pan, Ping Zhong, Xiaojun Chang, Jian Yang, Chen Gong. Dual Interactive Graph Convolutional Networks for Hyperspectral Image Classification. IEEE Transactions on Geoscience and Remote Sensing. 2021; PP(99):1-14.

Chicago/Turabian Style

Sheng Wan; Shirui Pan; Ping Zhong; Xiaojun Chang; Jian Yang; Chen Gong. 2021. "Dual Interactive Graph Convolutional Networks for Hyperspectral Image Classification." IEEE Transactions on Geoscience and Remote Sensing PP, no. 99: 1-14.

Article
Published: 08 April 2021 in International Journal of Computer Vision

Pedestrian detection and re-identification have progressed significantly in the last few years. However, occluded people are notoriously hard to detect and recognize, as their appearance varies substantially depending on a wide range of occlusion patterns. In this paper, we aim to propose a simple and compact method based on CNNs for occlusion handling. We start with interpreting CNN channel features of a pedestrian detector, and we find that different channels activate responses for different body parts respectively. These findings motivate us to employ an attention mechanism across channels to represent various occlusion patterns in one single model, as each occlusion pattern can be formulated as some specific combination of body parts. Therefore, an attention network with self or external guidances is proposed as an add-on to the baseline CNN method. Also, we propose an attention guided self-paced learning method to balance the optimization across different occlusion levels. Our proposed method shows significant improvements over the baseline methods for both pedestrian detection and re-identification tasks. For pedestrian detection, we achieve a considerable improvement of 8pp to the baseline FasterRCNN detector on the heavy occlusion subset of CityPersons and on Caltech we outperform the state-of-the-art method by 5pp. For pedestrian re-identification, our method surpasses the baseline and achieves state-of-the-art performance on multiple re-identification benchmarks.
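The channel-attention idea, treating each occlusion pattern as a re-weighting of body-part channels, can be sketched with a squeeze-and-excitation-style gate; this is a generic illustration of per-channel attention, not the paper's guided attention network, and the bottleneck ratio is an assumption:

```python
import numpy as np

def channel_attention(feat, w1, w2):
    """Squeeze-and-excitation-style channel gating: global average pooling,
    a bottleneck MLP, and a sigmoid gate that re-weights each channel
    (channels that respond to occluded parts can be suppressed)."""
    squeeze = feat.mean(axis=(1, 2))                # (C,) global context
    hidden = np.maximum(w1 @ squeeze, 0.0)          # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))     # per-channel weight in (0, 1)
    return feat * gate[:, None, None], gate

rng = np.random.default_rng(3)
C, r = 16, 4                                        # channels, bottleneck ratio
feat = rng.standard_normal((C, 8, 8))
out, gate = channel_attention(feat,
                              rng.standard_normal((r, C)),
                              rng.standard_normal((C, r)))
```

In the paper, the gate is produced with self- or external guidance (e.g., a visible-box cue) rather than from the features alone; the gating mechanics are the same.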

ACS Style

Shanshan Zhang; Di Chen; Jian Yang; Bernt Schiele. Guided Attention in CNNs for Occluded Pedestrian Detection and Re-identification. International Journal of Computer Vision 2021, 129, 1875-1892.

AMA Style

Shanshan Zhang, Di Chen, Jian Yang, Bernt Schiele. Guided Attention in CNNs for Occluded Pedestrian Detection and Re-identification. International Journal of Computer Vision. 2021; 129(6):1875-1892.

Chicago/Turabian Style

Shanshan Zhang; Di Chen; Jian Yang; Bernt Schiele. 2021. "Guided Attention in CNNs for Occluded Pedestrian Detection and Re-identification." International Journal of Computer Vision 129, no. 6: 1875-1892.

Journal article
Published: 01 April 2021 in Pattern Recognition

The popularity of smartphones with digital cameras makes photographing with smartphones an important daily activity. Moiré patterns can easily appear when shooting objects with rich textures, such as computer screens, and severely degrade image quality. Image demoiréing is an important image restoration task that aims to remove moiré patterns and reveal the underlying clean image. Two key properties of moiré patterns, their widely distributed frequency spectrum and the dynamic nature of moiré textures, make the image demoiréing task challenging. In this paper, we propose an improved Multi-scale convolutional network with Dynamic feature encoding for image DeMoiréing (MDDM+). We design two schemes in our network to attack, respectively, the broad frequency spectrum and the dynamic texture of moiré: a multi-scale structure to process images at different spatial resolutions and a dynamic feature encoding module to encode the texture dynamically. To capture more moiré and texture information from different frequencies, we further propose a novel L1 wavelet loss used to train our model. Extensive experiments on two benchmarks show that our proposed image demoiréing network outperforms the state of the art in terms of fidelity as well as perception.
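A wavelet-domain L1 loss of the kind described can be sketched with a Haar decomposition; the averaging-based scaling and the two-level depth are assumptions for illustration, not necessarily the paper's exact loss:

```python
import numpy as np

def haar_level(img):
    """One level of the 2-D Haar transform: returns (LL, LH, HL, HH) subbands."""
    a, b = img[0::2, :], img[1::2, :]
    lo, hi = (a + b) / 2.0, (a - b) / 2.0          # row-wise average / difference
    def cols(x):
        c, d = x[:, 0::2], x[:, 1::2]
        return (c + d) / 2.0, (c - d) / 2.0        # column-wise average / difference
    LL, LH = cols(lo)
    HL, HH = cols(hi)
    return LL, LH, HL, HH

def wavelet_l1_loss(pred, target, levels=2):
    """L1 distance accumulated over Haar subbands at several scales, penalizing
    errors in each frequency band separately."""
    loss = 0.0
    for _ in range(levels):
        p, t = haar_level(pred), haar_level(target)
        for ps, ts in zip(p[1:], t[1:]):           # detail subbands LH, HL, HH
            loss += np.abs(ps - ts).mean()
        pred, target = p[0], t[0]                  # recurse on the LL band
    loss += np.abs(pred - target).mean()           # final low-pass term
    return loss

rng = np.random.default_rng(4)
a, b = rng.standard_normal((16, 16)), rng.standard_normal((16, 16))
```

Because moiré energy spans many frequency bands, penalizing each subband explicitly gives the network a gradient signal that a pixel-space L1 loss tends to blur together.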

ACS Style

Xi Cheng; Zhenyong Fu; Jian Yang. Improved multi-scale dynamic feature encoding network for image demoiréing. Pattern Recognition 2021, 116, 107970.

AMA Style

Xi Cheng, Zhenyong Fu, Jian Yang. Improved multi-scale dynamic feature encoding network for image demoiréing. Pattern Recognition. 2021; 116:107970.

Chicago/Turabian Style

Xi Cheng; Zhenyong Fu; Jian Yang. 2021. "Improved multi-scale dynamic feature encoding network for image demoiréing." Pattern Recognition 116: 107970.

Journal article
Published: 10 March 2021 in IEEE Transactions on Neural Networks and Learning Systems

Video object segmentation (VOS) is one of the most fundamental tasks for numerous downstream video applications. The crucial issue of online VOS is the drifting of the segmenter when incrementally updated on continuous video frames under unconfident supervision constraints. In this work, we propose a self-teaching VOS (ST-VOS) method that makes the segmenter learn online adaptation as confidently as possible. In the segmenter learning at each time slice, the segment hypothesis and the segmenter update are enclosed in a self-looping optimization circle such that they can mutually improve each other. To reduce error accumulation in the self-looping process, we introduce a metalearning strategy to learn how to carry out this optimization within only a few iteration steps. To this end, the learning rates of the segmenter are adaptively derived through metaoptimization in the channel space of the convolutional kernels. Furthermore, to better launch the self-looping process, we calculate an initial mask map through part detectors and motion flow to establish a solid foundation for subsequent refinement, which improves the robustness of the segmenter update. Extensive experiments demonstrate that this self-teaching idea can boost the performance of baselines; meanwhile, our ST-VOS achieves encouraging performance on the DAVIS16, YouTube-Objects, DAVIS17, and SegTrackV2 data sets, where, in particular, an accuracy of 75.7% in the J-mean metric is obtained on the multi-instance DAVIS17 data set.

ACS Style

Chuanwei Zhou; Chunyan Xu; Zhen Cui; Tong Zhang; Jian Yang. Self-Teaching Video Object Segmentation. IEEE Transactions on Neural Networks and Learning Systems 2021, PP, 1-15.

AMA Style

Chuanwei Zhou, Chunyan Xu, Zhen Cui, Tong Zhang, Jian Yang. Self-Teaching Video Object Segmentation. IEEE Transactions on Neural Networks and Learning Systems. 2021; PP(99):1-15.

Chicago/Turabian Style

Chuanwei Zhou; Chunyan Xu; Zhen Cui; Tong Zhang; Jian Yang. 2021. "Self-Teaching Video Object Segmentation." IEEE Transactions on Neural Networks and Learning Systems PP, no. 99: 1-15.

Journal article
Published: 01 March 2021 in IEEE Transactions on Neural Networks and Learning Systems

Feature selection aims to select strongly relevant features and discard the rest. Recently, embedded feature selection methods, which incorporate feature weights learning into the training process of a classifier, have attracted much attention. However, traditional embedded methods merely focus on the combinatorial optimality of all selected features. They sometimes select the weakly relevant features with satisfactory combination abilities and leave out some strongly relevant features, thereby degrading the generalization performance. To address this issue, we propose a novel embedded framework for feature selection, termed feature selection boosted by unselected features (FSBUF). Specifically, we introduce an extra classifier for unselected features into the traditional embedded model and jointly learn the feature weights to maximize the classification loss of unselected features. As a result, the extra classifier recycles the unselected strongly relevant features to replace the weakly relevant features in the selected feature subset. Our final objective can be formulated as a minimax optimization problem, and we design an effective gradient-based algorithm to solve it. Furthermore, we theoretically prove that the proposed FSBUF is able to improve the generalization ability of traditional embedded feature selection methods. Extensive experiments on synthetic and real-world data sets exhibit the comprehensibility and superior performance of FSBUF.

ACS Style

Wei Zheng; Shuo Chen; Zhenyong Fu; Fa Zhu; Hui Yan; Jian Yang. Feature Selection Boosted by Unselected Features. IEEE Transactions on Neural Networks and Learning Systems 2021, PP, 1-13.

AMA Style

Wei Zheng, Shuo Chen, Zhenyong Fu, Fa Zhu, Hui Yan, Jian Yang. Feature Selection Boosted by Unselected Features. IEEE Transactions on Neural Networks and Learning Systems. 2021; PP(99):1-13.

Chicago/Turabian Style

Wei Zheng; Shuo Chen; Zhenyong Fu; Fa Zhu; Hui Yan; Jian Yang. 2021. "Feature Selection Boosted by Unselected Features." IEEE Transactions on Neural Networks and Learning Systems PP, no. 99: 1-13.

Conference paper
Published: 25 February 2021 in Transactions on Petri Nets and Other Models of Concurrency XV

Partial Label Learning (PLL) aims to train a classifier when each training instance is associated with a set of candidate labels, among which only one is correct but is not accessible during the training phase. The common strategy for dealing with such ambiguous labeling information is to disambiguate the candidate label sets. Nonetheless, existing methods ignore the disambiguation difficulty of instances and adopt a single-trend training mechanism. The former leaves models vulnerable to false positive labels, and the latter may give rise to an error accumulation problem. To remedy these two drawbacks, this paper proposes a novel approach termed "Network Cooperation with Progressive Disambiguation" (NCPD) for PLL. Specifically, we devise a progressive disambiguation strategy in which disambiguation operations are performed on simple instances first and then gradually on more complicated ones. Therefore, the negative impact brought by the false positive labels of complicated instances can be effectively mitigated, as the disambiguation ability of the model has been strengthened by learning from the simple instances. Moreover, by employing artificial neural networks as the backbone, we utilize a network cooperation mechanism that trains two networks collaboratively by letting them interact with each other. As the two networks have different disambiguation abilities, such interaction is beneficial for both networks to reduce their respective disambiguation errors, and is thus much better than existing algorithms with a single-trend training process. Extensive experimental results on various benchmark and practical datasets demonstrate the superiority of our NCPD approach over other state-of-the-art PLL methods.
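The basic disambiguation step, and an entropy-based notion of instance difficulty by which a progressive schedule could order instances, can be sketched as follows (an illustration of the idea; NCPD's actual criteria and update rules may differ):

```python
import numpy as np

def disambiguate(probs, candidate_mask):
    """Re-normalize a classifier's class probabilities over each instance's
    candidate label set (the basic PLL disambiguation step). Returns the label
    confidences and a difficulty score: the entropy over candidates, low for
    'simple' instances that a progressive schedule would disambiguate first."""
    masked = probs * candidate_mask                 # zero out non-candidates
    conf = masked / masked.sum(axis=1, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        plogp = np.where(conf > 0, conf * np.log(conf), 0.0)
    return conf, -plogp.sum(axis=1)

probs = np.array([[0.70, 0.20, 0.10],   # confident prediction -> simple
                  [0.40, 0.35, 0.25]])  # near-uniform -> difficult
mask = np.array([[1, 1, 0],             # candidate label sets
                 [1, 1, 1]], dtype=float)
conf, difficulty = disambiguate(probs, mask)
```

In the cooperative setup, each of the two networks would feed the other such confidences, so that neither network reinforces only its own disambiguation errors.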

ACS Style

Yao Yao; Chen Gong; Jiehui Deng; Jian Yang. Network Cooperation with Progressive Disambiguation for Partial Label Learning. Transactions on Petri Nets and Other Models of Concurrency XV 2021, 471-488.

AMA Style

Yao Yao, Chen Gong, Jiehui Deng, Jian Yang. Network Cooperation with Progressive Disambiguation for Partial Label Learning. Transactions on Petri Nets and Other Models of Concurrency XV. 2021:471-488.

Chicago/Turabian Style

Yao Yao; Chen Gong; Jiehui Deng; Jian Yang. 2021. "Network Cooperation with Progressive Disambiguation for Partial Label Learning." Transactions on Petri Nets and Other Models of Concurrency XV: 471-488.

Journal article
Published: 23 February 2021 in IEEE Transactions on Pattern Analysis and Machine Intelligence

This paper studies instance-dependent Positive and Unlabeled (PU) classification, where whether a positive example will be labeled (indicated by s) is not only related to the class label y, but also depends on the observation x. Therefore, the labeling probability on positive examples is not uniform as previous works assumed, but is biased to some simple or critical data points. To depict the above dependency relationship, a graphical model is built in this paper which further leads to a maximization problem on the induced likelihood function regarding P(s,y|x). By utilizing the well-known EM and Adam optimization techniques, the labeling probability of any positive example P(s=1|y=1,x) as well as the classifier induced by P(y|x) can be acquired. Theoretically, we prove that the critical solution always exists, and is locally unique for linear model if some sufficient conditions are met. Moreover, we upper bound the generalization error for both linear logistic and non-linear network instantiations of our algorithm. Empirically, we compare our method with state-of-the-art instance-independent and instance-dependent PU algorithms on a wide range of synthetic, benchmark and real-world datasets, and the experimental results firmly demonstrate the advantage of the proposed method over the existing PU approaches.

ACS Style

Chen Gong; Qizhou Wang; Tongliang Liu; Bo Han; Jane J. You; Jian Yang; Dacheng Tao. Instance-Dependent Positive and Unlabeled Learning with Labeling Bias Estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence 2021, PP, 1-1.

AMA Style

Chen Gong, Qizhou Wang, Tongliang Liu, Bo Han, Jane J. You, Jian Yang, Dacheng Tao. Instance-Dependent Positive and Unlabeled Learning with Labeling Bias Estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2021; PP(99):1-1.

Chicago/Turabian Style

Chen Gong; Qizhou Wang; Tongliang Liu; Bo Han; Jane J. You; Jian Yang; Dacheng Tao. 2021. "Instance-Dependent Positive and Unlabeled Learning with Labeling Bias Estimation." IEEE Transactions on Pattern Analysis and Machine Intelligence PP, no. 99: 1-1.

Journal article
Published: 26 January 2021 in Information Sciences

Positive and Unlabeled learning (PU learning) aims to train a binary classifier solely based on positively labeled and unlabeled data when negatively labeled data are absent or distributed too diversely. However, none of the existing PU learning methods takes the class imbalance problem into account, which significantly neglects the minority class and is likely to generate a biased classifier. Therefore, this paper proposes a novel algorithm termed “Cost-Sensitive Positive and Unlabeled learning” (CSPU) which imposes different misclassification costs on different classes when conducting PU classification. Specifically, we assign distinct weights to the losses caused by false negative and false positive examples, and employ double hinge loss to build our CSPU algorithm under the framework of empirical risk minimization. Theoretically, we analyze the computational complexity, and also derive a generalization error bound of CSPU which guarantees the good performance of our algorithm on test data. Empirically, we compare CSPU with the state-of-the-art PU learning methods on synthetic dataset, OpenML benchmark datasets, and real-world datasets. The results clearly demonstrate the superiority of the proposed CSPU to other comparators in dealing with class imbalanced tasks.
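The double hinge loss named in the abstract, and a cost-weighted PU risk built from it, can be sketched as follows; the double hinge form is the standard convex surrogate used in unbiased PU risk estimators, but the exact weighting scheme of CSPU is an assumption here:

```python
import numpy as np

def double_hinge(z):
    """Double hinge loss l(z) = max(-z, max(0, (1 - z)/2)), a convex surrogate
    commonly used for unbiased PU risk estimation."""
    return np.maximum(-z, np.maximum(0.0, (1.0 - z) / 2.0))

def cspu_risk(scores_p, scores_u, pi, c_fn=1.0, c_fp=1.0):
    """Sketch of a cost-sensitive PU risk: unlabeled data act as a proxy for
    negatives after subtracting their positive component, with separate weights
    on false-negative (c_fn) and false-positive (c_fp) errors. pi is the class
    prior P(y = +1); CSPU's exact weighting may differ from this sketch."""
    r_p = double_hinge(scores_p).mean()        # positives scored as +1
    r_u = double_hinge(-scores_u).mean()       # unlabeled scored as -1
    r_p_neg = double_hinge(-scores_p).mean()   # positives scored as -1
    return c_fn * pi * r_p + c_fp * (r_u - pi * r_p_neg)

rng = np.random.default_rng(5)
risk = cspu_risk(rng.standard_normal(100) + 1.0,   # toy positive scores
                 rng.standard_normal(200),          # toy unlabeled scores
                 pi=0.4, c_fn=2.0, c_fp=1.0)
```

Setting `c_fn > c_fp`, as above, makes the estimator penalize missed minority-class positives more, which is the paper's remedy for class imbalance.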

ACS Style

Xiuhua Chen; Chen Gong; Jian Yang. Cost-sensitive positive and unlabeled learning. Information Sciences 2021, 558, 229-245.

AMA Style

Xiuhua Chen, Chen Gong, Jian Yang. Cost-sensitive positive and unlabeled learning. Information Sciences. 2021; 558:229-245.

Chicago/Turabian Style

Xiuhua Chen; Chen Gong; Jian Yang. 2021. "Cost-sensitive positive and unlabeled learning." Information Sciences 558: 229-245.

Journal article
Published: 15 December 2020 in IEEE Transactions on Pattern Analysis and Machine Intelligence

In this paper, we propose a general framework termed "Centroid Estimation with Guaranteed Efficiency" (CEGE) for Weakly Supervised Learning (WSL) with incomplete, inexact, and inaccurate supervision. The core of our framework is to devise an unbiased and statistically efficient risk estimator that is applicable to various weak supervision. Specifically, by decomposing the loss function (e.g., the squared loss and hinge loss) into a label-independent term and a label-dependent term, we discover that only the latter is influenced by the weak supervision and is related to the centroid of the entire dataset. Therefore, by constructing two auxiliary pseudo-labeled datasets with synthesized labels, we derive unbiased estimates of centroid based on the two auxiliary datasets, respectively. These two estimates are further linearly combined with a properly decided coefficient which makes the final combined estimate not only unbiased but also statistically efficient. This is better than some existing methods that only care about the unbiasedness of estimation but ignore the statistical efficiency. The good statistical efficiency of the derived estimator is guaranteed as we theoretically prove that it acquires the minimum variance when estimating the centroid. As a result, intensive experimental results on a large number of benchmark datasets demonstrate that our CEGE generally obtains better performance than the existing approaches related to typical WSL problems including semi-supervised learning, positive-unlabeled learning, multiple instance learning, and label noise learning.
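The variance-minimizing linear combination of two unbiased estimates follows classic inverse-variance weighting, sketched below; CEGE derives the variances from its loss decomposition over the pseudo-labeled datasets, which this toy calculation omits:

```python
def combine_unbiased(est1, var1, est2, var2):
    """Combine two unbiased estimates of the same quantity with the
    variance-minimizing coefficient alpha = var2 / (var1 + var2). The result
    stays unbiased, and its variance var1*var2/(var1+var2) never exceeds the
    smaller of the two input variances."""
    alpha = var2 / (var1 + var2)
    combined = alpha * est1 + (1.0 - alpha) * est2
    combined_var = var1 * var2 / (var1 + var2)
    return combined, combined_var

# Two hypothetical unbiased centroid estimates with different variances:
est, var = combine_unbiased(1.2, 4.0, 0.8, 1.0)
```

This is why the combined estimator is "statistically efficient" in the paper's sense: merely averaging the two estimates would also be unbiased, but would not minimize the variance.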

ACS Style

Chen Gong; Jian Yang; Jane J. You; Masashi Sugiyama. Centroid Estimation with Guaranteed Efficiency: A General Framework for Weakly Supervised Learning. IEEE Transactions on Pattern Analysis and Machine Intelligence 2020, PP, 1-1.

AMA Style

Chen Gong, Jian Yang, Jane J. You, Masashi Sugiyama. Centroid Estimation with Guaranteed Efficiency: A General Framework for Weakly Supervised Learning. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2020; PP(99):1-1.

Chicago/Turabian Style

Chen Gong; Jian Yang; Jane J. You; Masashi Sugiyama. 2020. "Centroid Estimation with Guaranteed Efficiency: A General Framework for Weakly Supervised Learning." IEEE Transactions on Pattern Analysis and Machine Intelligence PP, no. 99: 1-1.

Conference paper
Published: 19 November 2020 in Transactions on Petri Nets and Other Models of Concurrency XV

In the past few years, we have witnessed great progress in image super-resolution (SR) thanks to the power of deep learning. However, a major limitation of current image SR approaches is that they assume a pre-determined degradation model or kernel, e.g., bicubic, controls the image degradation process. This makes them fail to generalize in real-world or non-ideal environments, since the degradation model of an unseen image may not obey the pre-determined kernel used when training the SR model. In this work, we introduce a simple yet effective zero-shot image super-resolution model. Our zero-shot SR model learns an image-specific super-resolution network (SRN) from a low-resolution input image alone, without relying on external training sets. To circumvent the difficulty caused by the unknown internal degradation model of an image, we propose to learn an image-specific degradation simulation network (DSN) together with our image-specific SRN. Specifically, we exploit an image's depth information, which naturally indicates the scales of local image patches, to extract an unpaired high/low-resolution patch collection for training our networks. According to the benchmark test on four datasets with depth labels or estimated depth maps, our proposed depth-guided degradation model learning-based image super-resolution (DGDML-SR) achieves visually pleasing results and can outperform the state of the art in perceptual metrics.
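The depth-guided extraction of unpaired patch pools can be sketched without any networks: near patches (small depth, finer detail) serve as a pseudo-high-resolution pool, far patches as a pseudo-low-resolution pool. The patch size and quantile thresholds below are illustrative assumptions, not values from the paper:

```python
import numpy as np

def depth_guided_patch_pools(image, depth, patch=8, near_q=0.3, far_q=0.7):
    """Split an image into two unpaired patch pools guided by depth.

    Patches whose mean depth is small (near the camera, finer detail)
    go to the pseudo-HR pool; patches with large mean depth (far away,
    coarser detail) go to the pseudo-LR pool.
    """
    h, w = depth.shape
    near_t, far_t = np.quantile(depth, [near_q, far_q])
    hr_pool, lr_pool = [], []
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            d = depth[i:i + patch, j:j + patch].mean()
            p = image[i:i + patch, j:j + patch]
            if d <= near_t:
                hr_pool.append(p)   # near: treat as high-resolution examples
            elif d >= far_t:
                lr_pool.append(p)   # far: treat as low-resolution examples
    return hr_pool, lr_pool

# Toy image whose depth increases from left to right.
rng = np.random.default_rng(1)
img = rng.random((32, 32))
dep = np.tile(np.linspace(0.0, 10.0, 32), (32, 1))
hr, lr = depth_guided_patch_pools(img, dep)
```

The two pools are unpaired by construction, which is what allows training a degradation simulator instead of assuming a bicubic kernel.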

ACS Style

Xi Cheng; Zhenyong Fu; Jian Yang. Zero-Shot Image Super-Resolution with Depth Guided Internal Degradation Learning. Transactions on Petri Nets and Other Models of Concurrency XV 2020, 265-280.

AMA Style

Xi Cheng, Zhenyong Fu, Jian Yang. Zero-Shot Image Super-Resolution with Depth Guided Internal Degradation Learning. Transactions on Petri Nets and Other Models of Concurrency XV. 2020; 265-280.

Chicago/Turabian Style

Xi Cheng; Zhenyong Fu; Jian Yang. 2020. "Zero-Shot Image Super-Resolution with Depth Guided Internal Degradation Learning." Transactions on Petri Nets and Other Models of Concurrency XV: 265-280.

Journal article
Published: 18 November 2020 in IEEE Transactions on Cybernetics

Block-diagonal representation (BDR) is an effective subspace clustering method. Existing BDR methods usually obtain a self-expression coefficient matrix from the original features with a shallow linear model. However, the underlying structure of real-world data is often nonlinear, so those methods cannot faithfully reflect the intrinsic relationships among samples. To address this problem, we propose a novel latent BDR (LBDR) model that performs subspace clustering on a nonlinear structure by jointly learning an autoencoder and a BDR matrix. The autoencoder, which consists of a nonlinear encoder and a linear decoder, plays an important role in learning features from the nonlinear samples. Meanwhile, the learned features are used as a new dictionary for a linear model with block-diagonal regularization, which ensures good performance for spectral clustering. Moreover, we theoretically prove that the learned features lie in a linear space, thus ensuring the effectiveness of the linear self-expression model. Extensive experiments on various real-world datasets verify the superiority of our LBDR over state-of-the-art subspace clustering approaches.
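The linear self-expression step that BDR methods build on has a closed form when the block-diagonal regularizer is replaced by a simple ridge penalty. The sketch below uses that substitution (the ridge stand-in and toy data are illustrative, not the paper's model): when the samples really lie in separate subspaces, the resulting affinity is itself nearly block-diagonal.

```python
import numpy as np

def self_expression(X, lam=0.1):
    """Solve min_C ||X - XC||_F^2 + lam*||C||_F^2 in closed form.

    X is d x n (columns are samples). C expresses each sample as a
    linear combination of the others; LBDR would instead use a
    block-diagonal regularizer and learn the dictionary with an
    autoencoder, but the self-expression principle is the same.
    """
    n = X.shape[1]
    G = X.T @ X
    return np.linalg.solve(G + lam * np.eye(n), G)

def affinity(C):
    """Symmetric affinity matrix for spectral clustering."""
    A = np.abs(C)
    return 0.5 * (A + A.T)

# Two well-separated 1-D subspaces in R^3 (toy data): three samples each.
u, v = np.array([1.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0])
X = np.column_stack([u * t for t in (1, 2, 3)] + [v * t for t in (1, 2, 3)])
C = self_expression(X, lam=0.01)
A = affinity(C)
```

Cross-subspace affinities vanish, so spectral clustering on `A` recovers the two groups.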

ACS Style

Yesong Xu; Shuo Chen; Jun Li; Zongyan Han; Jian Yang. Autoencoder-Based Latent Block-Diagonal Representation for Subspace Clustering. IEEE Transactions on Cybernetics 2020, PP, 1-11.

AMA Style

Yesong Xu, Shuo Chen, Jun Li, Zongyan Han, Jian Yang. Autoencoder-Based Latent Block-Diagonal Representation for Subspace Clustering. IEEE Transactions on Cybernetics. 2020; PP (99):1-11.

Chicago/Turabian Style

Yesong Xu; Shuo Chen; Jun Li; Zongyan Han; Jian Yang. 2020. "Autoencoder-Based Latent Block-Diagonal Representation for Subspace Clustering." IEEE Transactions on Cybernetics PP, no. 99: 1-11.

Conference paper
Published: 16 November 2020 in Transactions on Petri Nets and Other Models of Concurrency XV

In this paper, we propose an effective point cloud generation method, which can generate multi-resolution point clouds of the same shape from a latent vector. Specifically, we develop a novel progressive deconvolution network with learning-based bilateral interpolation. The learning-based bilateral interpolation is performed in both the spatial and feature spaces of point clouds, so that local geometric structure information can be exploited. Starting from low-resolution point clouds, with the bilateral interpolation and max-pooling operations, the deconvolution network progressively outputs high-resolution local and global feature maps. By concatenating the local and global feature maps at different resolutions, we employ a multi-layer perceptron as the generation network to produce multi-resolution point clouds. In order to keep the shapes of the point clouds consistent across resolutions, we propose a shape-preserving adversarial loss to train the point cloud deconvolution generation network. Experimental results on the ShapeNet and ModelNet datasets demonstrate that our proposed method can yield good performance. Our code is available at https://github.com/fpthink/PDGN.
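A learning-free sketch conveys the bilateral interpolation idea: each point's new feature is a weighted average of its spatial neighbors, with weights that decay in both spatial distance and feature distance. PDGN learns these weights; the Gaussian kernels and bandwidths below are illustrative assumptions.

```python
import numpy as np

def bilateral_interpolate(xyz, feats, k=3, sigma_s=1.0, sigma_f=1.0):
    """Bilateral interpolation on a point cloud (fixed-kernel sketch).

    For each point, average the features of its k nearest spatial
    neighbors with weights that decay in BOTH spatial distance (xyz)
    and feature distance, so local geometric structure is respected.
    """
    n = xyz.shape[0]
    d_s = np.linalg.norm(xyz[:, None] - xyz[None, :], axis=-1)     # spatial dist
    d_f = np.linalg.norm(feats[:, None] - feats[None, :], axis=-1)  # feature dist
    out = np.empty_like(feats)
    for i in range(n):
        nbrs = np.argsort(d_s[i])[:k]                 # k nearest in space
        w = np.exp(-d_s[i, nbrs]**2 / (2 * sigma_s**2)) \
          * np.exp(-d_f[i, nbrs]**2 / (2 * sigma_f**2))
        out[i] = (w[:, None] * feats[nbrs]).sum(0) / w.sum()
    return out

rng = np.random.default_rng(3)
pts = rng.random((16, 3))
f = rng.random((16, 4))
g = bilateral_interpolate(pts, f)
```

Because the weights are positive and normalized, each output feature is a convex combination of neighbor features, which keeps the smoothing local and stable.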

ACS Style

Le Hui; Rui Xu; Jin Xie; Jianjun Qian; Jian Yang. Progressive Point Cloud Deconvolution Generation Network. Transactions on Petri Nets and Other Models of Concurrency XV 2020, 397-413.

AMA Style

Le Hui, Rui Xu, Jin Xie, Jianjun Qian, Jian Yang. Progressive Point Cloud Deconvolution Generation Network. Transactions on Petri Nets and Other Models of Concurrency XV. 2020; 397-413.

Chicago/Turabian Style

Le Hui; Rui Xu; Jin Xie; Jianjun Qian; Jian Yang. 2020. "Progressive Point Cloud Deconvolution Generation Network." Transactions on Petri Nets and Other Models of Concurrency XV: 397-413.

Journal article
Published: 10 November 2020 in Neurocomputing

Generalized zero-shot learning suffers from an extreme data imbalance problem: the training data come only from seen classes, while no unseen-class data are available. Recently, a number of feature generation methods based on generative adversarial networks (GAN) have been proposed to address this problem. Existing feature generation methods, however, have never considered the under-constrained problem, and thus can generate unrestricted visual features corresponding to no meaningful object class. In this paper, we propose to equip the feature generation framework with a parallel inference network that projects visual features to the semantic descriptor space, constraining the generator to avoid producing unrestricted visual features. The two-parallel-stream framework (1) enables our method, termed inference-guided feature generation (Inf-FG), to mitigate the under-constrained problem, and (2) makes our Inf-FG applicable to transductive ZSL. Our Inf-FG learns the feature generator and the inference network simultaneously by aligning the joint distribution of visual features and semantic descriptors induced by the feature generator with that induced by the inference network. We evaluate our approach on four benchmark ZSL datasets, including AWA, CUB, SUN, and FLO, on which our method improves over our baselines on generalized zero-shot learning.
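The inference-network constraint can be illustrated with toy linear maps: a generated feature is projected back to the semantic space and penalized if it drifts from the descriptor it was generated from. All matrices, dimensions, and function names below are toy assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(4)
d_sem, d_noise, d_feat = 5, 3, 8

# Toy linear "networks"; Inf-FG uses deep generator/inference networks.
W_gen = rng.normal(size=(d_feat, d_sem + d_noise))  # generator weights
W_inf = np.linalg.pinv(W_gen)[:d_sem]               # inference: feature -> semantic

def generate(a, z):
    """G(a, z): map a semantic descriptor and noise to a visual feature."""
    return W_gen @ np.concatenate([a, z])

def infer(x):
    """E(x): project a visual feature back to the semantic space."""
    return W_inf @ x

def inference_loss(a, z):
    """Penalty keeping generated features consistent with their semantics."""
    return float(np.sum((infer(generate(a, z)) - a) ** 2))

a = rng.normal(size=d_sem)
z = rng.normal(size=d_noise)
loss = inference_loss(a, z)
```

In this consistent toy setup the loss is essentially zero; during adversarial training the term acts as a penalty that rules out features no semantic descriptor can explain.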

ACS Style

Zongyan Han; Zhenyong Fu; Guangyu Li; Jian Yang. Inference guided feature generation for generalized zero-shot learning. Neurocomputing 2020, 430, 150-158.

AMA Style

Zongyan Han, Zhenyong Fu, Guangyu Li, Jian Yang. Inference guided feature generation for generalized zero-shot learning. Neurocomputing. 2020; 430:150-158.

Chicago/Turabian Style

Zongyan Han; Zhenyong Fu; Guangyu Li; Jian Yang. 2020. "Inference guided feature generation for generalized zero-shot learning." Neurocomputing 430: 150-158.

Journal article
Published: 12 August 2020 in IEEE Transactions on Cybernetics

In recent years, studies have shown that generalized iterated shrinkage thresholdings (GISTs) have become commonly used first-order optimization algorithms for sparse learning problems. Nonconvex relaxations of the ℓ₀-norm usually achieve better performance than convex ones (e.g., the ℓ₁-norm), since the former yield a nearly unbiased solver. To increase calculation efficiency, this work further provides an accelerated GIST version, AGIST, via an extrapolation-based acceleration technique, which helps reduce the number of iterations when solving a family of nonconvex sparse learning problems. Besides, we present an algorithmic analysis, including both local and global convergence guarantees, as well as other intermediate results for GIST and AGIST, denoted (A)GIST, by virtue of the Kurdyka-Łojasiewicz (KŁ) property and some milder assumptions. Numerical experiments on both synthetic data and real-world databases demonstrate that the convergence of the objective function accords with the theoretical properties and that nonconvex sparse learning methods can achieve superior performance over some convex ones.
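The extrapolation idea behind AGIST is easiest to verify in the convex ℓ₁ case, where the generalized thresholding operator reduces to soft-thresholding and the scheme coincides with the classical FISTA acceleration; the sketch below uses that simplification rather than the paper's nonconvex operators.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrinkage-thresholding operator for the l1-norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def accelerated_ist(A, b, lam, n_iter=200):
    """Iterated shrinkage thresholding with extrapolation (FISTA-style).

    Solves min_x 0.5*||Ax - b||^2 + lam*||x||_1. AGIST applies the same
    extrapolation idea to nonconvex generalized thresholding operators;
    the l1 case here just makes the sketch easy to verify.
    """
    L = np.linalg.norm(A, 2) ** 2              # Lipschitz constant of gradient
    x = x_prev = np.zeros(A.shape[1])
    t = 1.0
    for _ in range(n_iter):
        t_next = 0.5 * (1 + np.sqrt(1 + 4 * t * t))
        y = x + ((t - 1) / t_next) * (x - x_prev)   # extrapolation step
        grad = A.T @ (A @ y - b)
        x_prev, x = x, soft_threshold(y - grad / L, lam / L)
        t = t_next
    return x

# Noiseless sparse recovery: two active coefficients out of ten.
rng = np.random.default_rng(5)
A = rng.normal(size=(30, 10))
x_true = np.zeros(10)
x_true[[1, 4]] = [2.0, -1.5]
b = A @ x_true
x_hat = accelerated_ist(A, b, lam=0.01)
```

Dropping the extrapolation step (setting `y = x`) recovers plain ISTA, which needs noticeably more iterations on the same problem; that gap is the point of the acceleration.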

ACS Style

Hengmin Zhang; Feng Qian; Fanhua Shang; Wenli Du; Jianjun Qian; Jian Yang. Global Convergence Guarantees of (A)GIST for a Family of Nonconvex Sparse Learning Problems. IEEE Transactions on Cybernetics 2020, PP, 1-13.

AMA Style

Hengmin Zhang, Feng Qian, Fanhua Shang, Wenli Du, Jianjun Qian, Jian Yang. Global Convergence Guarantees of (A)GIST for a Family of Nonconvex Sparse Learning Problems. IEEE Transactions on Cybernetics. 2020; PP (99):1-13.

Chicago/Turabian Style

Hengmin Zhang; Feng Qian; Fanhua Shang; Wenli Du; Jianjun Qian; Jian Yang. 2020. "Global Convergence Guarantees of (A)GIST for a Family of Nonconvex Sparse Learning Problems." IEEE Transactions on Cybernetics PP, no. 99: 1-13.