Yuhai Zhao
Northeastern University, Shenyang, Liaoning, China

Feed

Journal article
Published: 25 February 2021 in IEEE/ACM Transactions on Computational Biology and Bioinformatics

Recently, the compacted de Bruijn graph (cDBG) of complete genome sequences has been used successfully in read mapping because of its ability to handle repeats in genomes. However, current approaches are not flexible enough to support frequently rebuilding the graph with different k-mer lengths. Instead of building the graph directly, how can we build the compacted de Bruijn graph for a longer k-mer from the graph for a shorter k-mer? In this article, we present StLiter, a novel algorithm that builds the compacted de Bruijn graph either directly from genome sequences or indirectly from the graph of a shorter k-mer. For 100 simulated human genomes, StLiter can construct the graph for k-mer lengths 15-18 in 2.5-3.2 hours with a maximum of ~70 GB of memory when the reverse complements of the reference genomes are not considered; it takes 4.5-5.9 hours when the reverse complements are considered. In experiments, we compared StLiter with TwoPaCo, the state-of-the-art method for building the graph, on 4 datasets. For k-mer lengths 15-18, StLiter builds the graph 5-9 times faster than TwoPaCo. For k-mer lengths larger than 18, given the graph of a shorter (k-x)-mer, such as x=1-2, StLiter can also build the graph more efficiently.
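To make the term "compacted de Bruijn graph" concrete, the following is a minimal, naive sketch in Python. It is not StLiter or TwoPaCo: it simply collects all k-mers as vertices, links consecutive k-mers, and merges non-branching paths into single strings (unitigs). The function names `build_dbg` and `compact` are hypothetical, and reverse complements are ignored, which corresponds to the simpler of the two settings mentioned in the abstract.

```python
from collections import defaultdict


def build_dbg(seqs, k):
    """Naive de Bruijn graph: vertices are k-mers, edges link consecutive k-mers."""
    nodes = set()
    succ = defaultdict(set)
    pred = defaultdict(set)
    for s in seqs:
        for i in range(len(s) - k + 1):
            nodes.add(s[i:i + k])
        for i in range(len(s) - k):
            a, b = s[i:i + k], s[i + 1:i + k + 1]
            succ[a].add(b)
            pred[b].add(a)
    return nodes, succ, pred


def compact(nodes, succ, pred):
    """Merge every non-branching path into one string (a unitig)."""

    def mergeable(a, b):
        # Edge a -> b can be collapsed only if a has a single successor
        # and b has a single predecessor.
        return len(succ[a]) == 1 and len(pred[b]) == 1

    def extend(start, visited):
        path, cur = start, start
        visited.add(start)
        while len(succ[cur]) == 1:
            nxt = next(iter(succ[cur]))
            if nxt in visited or not mergeable(cur, nxt):
                break
            path += nxt[-1]  # consecutive k-mers overlap by k-1 characters
            visited.add(nxt)
            cur = nxt
        return path

    unitigs, visited = [], set()
    # First pass: start from k-mers that cannot be merged with a predecessor.
    for v in sorted(nodes):
        ps = pred[v]
        starts_path = not (len(ps) == 1 and mergeable(next(iter(ps)), v))
        if v not in visited and starts_path:
            unitigs.append(extend(v, visited))
    # Second pass: whatever remains lies on isolated cycles.
    for v in sorted(nodes):
        if v not in visited:
            unitigs.append(extend(v, visited))
    return unitigs


nodes, succ, pred = build_dbg(["ACGTACGA"], k=3)
unitigs = compact(nodes, succ, pred)
print(unitigs)
```

In this toy input the k-mer ACG has two successors (CGT and CGA), so it ends a unitig, while the chain CGT, GTA, TAC, ACG collapses into one string. Holding every k-mer in memory like this is exactly what does not scale to many complete genomes, which is the problem the abstract addresses.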

ACS Style

Changong Yu; Keming Mao; Yuhai Zhao; Cheng Chang; Guoren Wang. StLiter: A Novel Algorithm to Iteratively Build the Compacted de Bruijn Graph from Many Complete Genomes. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021, PP, 1-1.

AMA Style

Changong Yu, Keming Mao, Yuhai Zhao, Cheng Chang, Guoren Wang. StLiter: A Novel Algorithm to Iteratively Build the Compacted de Bruijn Graph from Many Complete Genomes. IEEE/ACM Transactions on Computational Biology and Bioinformatics. 2021; PP (99):1-1.

Chicago/Turabian Style

Changong Yu; Keming Mao; Yuhai Zhao; Cheng Chang; Guoren Wang. 2021. "StLiter: A Novel Algorithm to Iteratively Build the Compacted de Bruijn Graph from Many Complete Genomes." IEEE/ACM Transactions on Computational Biology and Bioinformatics PP, no. 99: 1-1.

Journal article
Published: 29 October 2020 in IEEE Transactions on Knowledge and Data Engineering

Density Peaks (DP) clustering organizes data into clusters by finding peaks in dense regions. This involves computing the density (ρ) and distance (σ) of every point. As such, although DP-based schemes have been very effective in producing high-quality clusters, their complexity is $O(N^{2})$, where N is the number of data points. In this paper, we propose a fast distributed density peaks clustering algorithm, FDDP, based on the z-value index. In FDDP, we first employ the z-value index to map multi-dimensional data points into one-dimensional space and then range-partition the data according to the z-value to balance the load across the processing nodes. We ensure a minimal overlapping range to handle computations at the boundary points. We also propose a ρ calculation algorithm, FC, which uses a forward computing strategy to calculate ρ linearly. Additionally, we propose a σ computation algorithm, CB, which uses a caching and efficient searching strategy to calculate σ. Moreover, FDDP is able to reduce the time complexity from $O(N^{2})$ to $O(N \log N)$. We provide a theoretical analysis of FDDP and evaluate it empirically. Our experimental results show that FDDP significantly outperforms the state-of-the-art algorithms.
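The two ingredients named in the abstract can be illustrated with a short Python sketch. This is not FDDP and does not implement FC or CB. It shows (1) a z-value (Morton code) obtained by interleaving coordinate bits, and (2) the naive quadratic computation of ρ and σ that the paper sets out to avoid. The cutoff-kernel definition of density, the parameter `dc`, and the function names are assumptions taken from the usual formulation of density peaks clustering, not from this abstract.

```python
import math


def z_value(coords, bits=8):
    """Interleave the low `bits` bits of non-negative integer coordinates."""
    z = 0
    d = len(coords)
    for b in range(bits):
        for j, c in enumerate(coords):
            z |= ((c >> b) & 1) << (b * d + j)
    return z


def density_peaks_naive(points, dc):
    """O(N^2) baseline: rho = number of neighbours within dc,
    sigma = distance to the nearest point of strictly higher density."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < dc)
           for i in range(n)]
    sigma = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # Convention: a point with no denser neighbour gets its largest distance.
        sigma.append(min(higher) if higher else max(dist[i]))
    return rho, sigma


points = [(0, 0), (0, 1), (1, 0), (10, 10)]
rho, sigma = density_peaks_naive(points, dc=1.5)
# Sorting by z-value gives a one-dimensional order that can be range-partitioned.
order = sorted(points, key=z_value)
print(rho, sigma, order)
```

Points that are close in space tend to receive close z-values, which is why range-partitioning on the z-value keeps most neighbours on the same processing node and only needs a small overlap at partition boundaries.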

ACS Style

Jing Lu; Yuhai Zhao; Kian-Lee Tan; Zhengkui Wang. Distributed Density Peaks Clustering Revisited. IEEE Transactions on Knowledge and Data Engineering 2020, PP, 1-1.

AMA Style

Jing Lu, Yuhai Zhao, Kian-Lee Tan, Zhengkui Wang. Distributed Density Peaks Clustering Revisited. IEEE Transactions on Knowledge and Data Engineering. 2020; PP (99):1-1.

Chicago/Turabian Style

Jing Lu; Yuhai Zhao; Kian-Lee Tan; Zhengkui Wang. 2020. "Distributed Density Peaks Clustering Revisited." IEEE Transactions on Knowledge and Data Engineering PP, no. 99: 1-1.

Journal article
Published: 02 April 2018 in Entropy

Recently, Multi-Graph Learning was proposed as an extension of Multi-Instance Learning and has achieved some success. However, to the best of our knowledge, there is currently no study of Multi-Graph Multi-Label Learning, where each object is represented as a bag containing a number of graphs and each bag is marked with multiple class labels. It is an interesting problem that arises in many applications, such as image classification and medicinal analysis. In this paper, we propose an innovative algorithm to address the problem. First, it uses more precise structures, multiple graphs, instead of instances to represent an image, so that the classification accuracy can be improved. Then, it uses multiple labels as the output to eliminate the semantic ambiguity of the image. Furthermore, it calculates the entropy to mine the informative subgraphs instead of just mining the frequent subgraphs, which enables the selection of more accurate features for the classification. Lastly, since the current algorithms cannot directly deal with graph structures, we degenerate Multi-Graph Multi-Label Learning into Multi-Instance Multi-Label Learning in order to solve it with MIML-ELM (Improving Multi-Instance Multi-Label Learning by Extreme Learning Machine). The performance study shows that our algorithm outperforms the competitors in terms of both effectiveness and efficiency.
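The difference between a frequent subgraph and an informative one can be sketched with Shannon entropy. The abstract does not give the paper's exact scoring criterion, so the following uses plain information gain of a binary "subgraph present" feature with respect to a single label as a stand-in; the function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(present, labels):
    """Reduction in label entropy from knowing whether a subgraph occurs."""
    with_f = [y for p, y in zip(present, labels) if p]
    without = [y for p, y in zip(present, labels) if not p]
    n = len(labels)
    conditional = (len(with_f) / n) * entropy(with_f) \
        + (len(without) / n) * entropy(without)
    return entropy(labels) - conditional


labels = [1, 1, 0, 0]
# Both candidate subgraphs occur in half of the bags (equally frequent) ...
separating = [True, True, False, False]
uninformative = [True, False, True, False]
# ... but only the first one tells the classes apart.
print(information_gain(separating, labels), information_gain(uninformative, labels))
```

Both candidates have the same frequency, yet the first yields a gain of one bit and the second yields none, which is the motivation for ranking subgraphs by an entropy-based score rather than by support alone.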

ACS Style

Zixuan Zhu; Yuhai Zhao. Multi-Graph Multi-Label Learning Based on Entropy. Entropy 2018, 20, 245.

AMA Style

Zixuan Zhu, Yuhai Zhao. Multi-Graph Multi-Label Learning Based on Entropy. Entropy. 2018; 20 (4):245.

Chicago/Turabian Style

Zixuan Zhu; Yuhai Zhao. 2018. "Multi-Graph Multi-Label Learning Based on Entropy." Entropy 20, no. 4: 245.

Journal article
Published: 24 May 2016 in Applied Sciences

Multi-instance multi-label learning is a learning framework in which every object is represented by a bag of instances and associated with multiple labels simultaneously. The existing degeneration strategy-based methods often suffer from some common drawbacks: (1) the user-specified parameter for the number of clusters may hurt effectiveness; (2) SVM may incur a high computational cost when used as the classifier builder. In this paper, we propose an algorithm, namely multi-instance multi-label (MIML)-extreme learning machine (ELM), to address these problems. To the best of our knowledge, we are the first to utilize ELM in the MIML problem and to compare ELM with SVM on MIML. Extensive experiments have been conducted on real and synthetic datasets. The results show that MIML-ELM tends to achieve better generalization performance at a higher learning speed.
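For context on why ELM can train faster than SVM, the standard single-hidden-layer ELM formulation is summarized below. This is the generic ELM as commonly stated, not necessarily the exact variant used in MIML-ELM. The symbols are assumptions introduced here: $X$ is the input matrix, $W$ and $b$ are randomly drawn and then fixed hidden-layer weights and biases, $g$ is the activation function, $T$ is the target (label) matrix, and $\beta$ is the output weight matrix.

$$
H = g(XW + b), \qquad \hat{\beta} = \arg\min_{\beta} \lVert H\beta - T \rVert^{2} = H^{\dagger} T
$$

Here $H^{\dagger}$ is the Moore-Penrose pseudoinverse of $H$. Because $W$ and $b$ are never tuned, training reduces to a single linear least-squares solve rather than an iterative optimization.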

ACS Style

Ying Yin; Yuhai Zhao; Chengguang Li; Bin Zhang. Improving Multi-Instance Multi-Label Learning by Extreme Learning Machine. Applied Sciences 2016, 6, 160.

AMA Style

Ying Yin, Yuhai Zhao, Chengguang Li, Bin Zhang. Improving Multi-Instance Multi-Label Learning by Extreme Learning Machine. Applied Sciences. 2016; 6 (6):160.

Chicago/Turabian Style

Ying Yin; Yuhai Zhao; Chengguang Li; Bin Zhang. 2016. "Improving Multi-Instance Multi-Label Learning by Extreme Learning Machine." Applied Sciences 6, no. 6: 160.