Guo-Wei Wei
Department of Mathematics, Michigan State University, MI 48824, USA

Feed

Journal article
Published: 14 July 2021 in Journal of Molecular Biology

The ongoing massive vaccination and the development of effective intervention offer the long-awaited hope to end the global rage of the COVID-19 pandemic. However, the rapidly growing SARS-CoV-2 variants might compromise existing vaccines and monoclonal antibody (mAb) therapies. Although there are valuable experimental studies about the potential threats from emerging variants, the results are limited to a handful of mutations and to mAbs from Eli Lilly and Regeneron. The potential threats from frequently occurring mutations on the SARS-CoV-2 spike (S) protein receptor-binding domain (RBD) to many mAbs in clinical trials are largely unknown. We fill the gap by developing a topology-based deep learning strategy that is validated with tens of thousands of experimental data points. We analyze 796,759 genome isolates from patients to identify 606 non-degenerate RBD mutations and investigate their impacts on 16 mAbs in clinical trials. Our findings, which are highly consistent with existing experimental results about the Alpha, Beta, Gamma, Delta, Epsilon, and Kappa variants, shed light on potential threats of the 100 most observed mutations to mAbs in clinical trials, not only from Eli Lilly and Regeneron but also from Celltrion and Rockefeller University. We unveil, for the first time, that high-frequency mutations R346K/S, N439K, G446V, L455F, V483F/A, F486L, F490L/S, Q493L, and S494P might compromise some of the mAbs in clinical trials. Our study gives rise to a general perspective about how mutations will affect current vaccines.
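
The mutation shorthand in the abstract above (for example R346K/S) packs several substitutions at one residue position into a single token. As a reading aid, the following minimal sketch expands such tokens into individual substitutions. The function name `expand_mutations` and the regular expression are illustrative assumptions, not part of the authors' pipeline.

```python
import re

# Assumed shorthand grammar: wild-type residue, position, then one or more
# mutant residues separated by "/", e.g. "R346K/S".
_SHORTHAND = re.compile(r"^([A-Z])(\d+)([A-Z](?:/[A-Z])*)$")


def expand_mutations(shorthand):
    """Expand e.g. 'R346K/S' into ['R346K', 'R346S'] (hypothetical helper)."""
    match = _SHORTHAND.match(shorthand.strip())
    if match is None:
        raise ValueError(f"unrecognized mutation shorthand: {shorthand!r}")
    wild_type, position, mutants = match.groups()
    return [f"{wild_type}{position}{m}" for m in mutants.split("/")]


# The high-frequency mutations listed in the abstract.
listed = ["R346K/S", "N439K", "G446V", "L455F", "V483F/A",
          "F486L", "F490L/S", "Q493L", "S494P"]
expanded = [m for item in listed for m in expand_mutations(item)]
print(len(expanded), expanded)
```

Read this way, the nine tokens in the abstract correspond to twelve distinct single-residue substitutions on the RBD.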

ACS Style

Jiahui Chen; Kaifu Gao; Rui Wang; Guo-Wei Wei. Revealing the Threat of Emerging SARS-CoV-2 Mutations to Antibody Therapies. Journal of Molecular Biology 2021, 433, 167155 -167155.

AMA Style

Jiahui Chen, Kaifu Gao, Rui Wang, Guo-Wei Wei. Revealing the Threat of Emerging SARS-CoV-2 Mutations to Antibody Therapies. Journal of Molecular Biology. 2021; 433 (18):167155-167155.

Chicago/Turabian Style

Jiahui Chen; Kaifu Gao; Rui Wang; Guo-Wei Wei. 2021. "Revealing the Threat of Emerging SARS-CoV-2 Mutations to Antibody Therapies." Journal of Molecular Biology 433, no. 18: 167155-167155.

Journal article
Published: 10 June 2021 in Nature Communications

The ability to predict molecular properties is of great significance to drug discovery, human health, and environmental protection. Despite considerable efforts, quantitative prediction of various molecular properties remains a challenge. Although some machine learning models, such as bidirectional encoder from transformer, can incorporate massive unlabeled molecular data into molecular representations via a self-supervised learning strategy, they neglect three-dimensional (3D) stereochemical information. Algebraic graphs, specifically element-specific multiscale weighted colored algebraic graphs, embed complementary 3D molecular information into graph invariants. We propose an algebraic graph-assisted bidirectional transformer (AGBT) framework by fusing representations generated by algebraic graph and bidirectional transformer, as well as a variety of machine learning algorithms, including decision trees, multitask learning, and deep neural networks. We validate the proposed AGBT framework on eight molecular datasets, involving quantitative toxicity, physical chemistry, and physiology datasets. Extensive numerical experiments have shown that AGBT is a state-of-the-art framework for molecular property prediction.

ACS Style

Dong Chen; Kaifu Gao; Duc Duy Nguyen; Xin Chen; Yi Jiang; Guo-Wei Wei; Feng Pan. Algebraic graph-assisted bidirectional transformers for molecular property prediction. Nature Communications 2021, 12, 1 -9.

AMA Style

Dong Chen, Kaifu Gao, Duc Duy Nguyen, Xin Chen, Yi Jiang, Guo-Wei Wei, Feng Pan. Algebraic graph-assisted bidirectional transformers for molecular property prediction. Nature Communications. 2021; 12 (1):1-9.

Chicago/Turabian Style

Dong Chen; Kaifu Gao; Duc Duy Nguyen; Xin Chen; Yi Jiang; Guo-Wei Wei; Feng Pan. 2021. "Algebraic graph-assisted bidirectional transformers for molecular property prediction." Nature Communications 12, no. 1: 1-9.

Preprint content
Published: 07 June 2021

Directed evolution (DE), a strategy for protein engineering, optimizes protein properties (i.e., fitness) by expensive and time-consuming screening or selection of a large combinatorial sequence space. Machine learning-assisted directed evolution (MLDE), which screens variant properties in silico, can reduce the experimental burden. However, MLDE that relies on small, experimentally labeled training data from random sampling yields low global maximal fitness hitting rates. This work introduces a cluster learning-assisted directed evolution (CLADE) framework, particularly designed for systems without high-throughput screening assays, that combines sampling through hierarchical unsupervised clustering and supervised learning to guide protein engineering. Based on general biological information, CLADE splits the genetic combinatorial space into various subspaces with heterogeneous evolutionary traits, which guides the selection of experimental sampling sets and the subsequent building up of supervised learning training sets. By virtually screening two four-site combinatorial fitness landscapes from protein G domain B1 (GB1) and PhoQ, our CLADE consistently showed a nearly 3-fold improvement in the global maximal fitness hitting rate over using randomly sampled training data. Our CLADE can be easily applied to various biological systems and customized for systems with different throughput levels to maximize its accuracy and efficiency. It promises a significant impact on protein engineering.
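
To give a sense of scale for the "four-site combinatorial" landscapes and the "global maximal fitness hitting rate" mentioned above, here is a small sketch. It assumes the 20 standard amino acids at each site and reads the hitting rate as the fraction of independent runs whose selected variants include the global-maximum variant; both the helper names and the toy variant strings are hypothetical and are not taken from the CLADE code or data.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def combinatorial_space(n_sites):
    """Yield every variant of an n-site combinatorial library as a string."""
    for combo in product(AMINO_ACIDS, repeat=n_sites):
        yield "".join(combo)


def hitting_rate(selected_sets, global_max_variant):
    """Fraction of runs whose selected variants contain the global maximum.

    This definition is an assumption made for illustration.
    """
    if not selected_sets:
        raise ValueError("need at least one run")
    hits = sum(1 for chosen in selected_sets if global_max_variant in chosen)
    return hits / len(selected_sets)


# Size of a four-site library: 20**4 variants.
space_size = sum(1 for _ in combinatorial_space(4))
print(space_size)

# Toy example with made-up variants: two of the four runs contain "AKLM".
runs = [{"AAAA", "AKLM"}, {"ACDE"}, {"AKLM", "WWWW"}, {"KKKK"}]
rate = hitting_rate(runs, "AKLM")
print(rate)
```

With 160,000 variants in a four-site library, exhaustive screening is rarely practical without a high-throughput assay, which is the setting the abstract targets.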

ACS Style

YuChi Qiu; Jian Hu; Guo-Wei Wei. CLADE: Cluster learning-assisted directed evolution. 2021.

AMA Style

YuChi Qiu, Jian Hu, Guo-Wei Wei. CLADE: Cluster learning-assisted directed evolution. 2021.

Chicago/Turabian Style

YuChi Qiu; Jian Hu; Guo-Wei Wei. 2021. "CLADE: Cluster learning-assisted directed evolution."

Journal article
Published: 15 May 2021 in Genomics

Recently, the SARS-CoV-2 variants from the United Kingdom (UK), South Africa, and Brazil have received much attention for their increased infectivity, potentially high virulence, and possible threats to existing vaccines and antibody therapies. The question remains if there are other more infectious variants transmitted around the world. We carry out a large-scale study of 506,768 SARS-CoV-2 genome isolates from patients to identify many other rapidly growing mutations on the spike (S) protein receptor-binding domain (RBD). We reveal that essentially all 100 most observed mutations strengthen the binding between the RBD and the host angiotensin-converting enzyme 2 (ACE2), indicating the virus evolves toward more infectious variants. In particular, we discover new fast-growing RBD mutations N439K, S477N, S477R, and N501T that also enhance the RBD and ACE2 binding. We further unveil that mutation N501Y involved in United Kingdom (UK), South Africa, and Brazil variants may moderately weaken the binding between the RBD and many known antibodies, while mutations E484K and K417N found in South Africa and Brazilian variants, L452R and E484Q found in India variants, can potentially disrupt the binding between the RBD and many known antibodies. Among these RBD mutations, L452R is also now known as part of the California variant B.1.427. Finally, we hypothesize that RBD mutations that can simultaneously make SARS-CoV-2 more infectious and disrupt the existing antibodies, called vaccine escape mutations, will pose an imminent threat to the current crop of vaccines. A list of most likely vaccine escape mutations is given, including S494P, Q493L, K417N, F490S, F486L, R403K, E484K, L452R, K417T, F490L, E484Q, and A475S. Mutation T478K appears to make the Mexico variant B.1.1.222 the most infectious one. 
Our comprehensive genetic analysis and protein-protein binding study show that the genetic evolution of SARS-CoV-2 on the RBD, which may be regulated by host gene editing, viral proofreading, random genetic drift, and natural selection, gives rise to more infectious variants that will potentially compromise existing vaccines and antibody therapies.

ACS Style

Rui Wang; Jiahui Chen; Kaifu Gao; Guo-Wei Wei. Vaccine-escape and fast-growing mutations in the United Kingdom, the United States, Singapore, Spain, India, and other COVID-19-devastated countries. Genomics 2021, 113, 2158 -2170.

AMA Style

Rui Wang, Jiahui Chen, Kaifu Gao, Guo-Wei Wei. Vaccine-escape and fast-growing mutations in the United Kingdom, the United States, Singapore, Spain, India, and other COVID-19-devastated countries. Genomics. 2021; 113 (4):2158-2170.

Chicago/Turabian Style

Rui Wang; Jiahui Chen; Kaifu Gao; Guo-Wei Wei. 2021. "Vaccine-escape and fast-growing mutations in the United Kingdom, the United States, Singapore, Spain, India, and other COVID-19-devastated countries." Genomics 113, no. 4: 2158-2170.

Journal article
Published: 12 May 2021 in Computers in Biology and Medicine

While automated feature extraction has had tremendous success in many deep learning algorithms for image analysis and natural language processing, it does not work well for data involving complex internal structures, such as molecules. Data representations via advanced mathematics, including algebraic topology, differential geometry, and graph theory, have demonstrated superiority in a variety of biomolecular applications; however, their performance is often dependent on manual parametrization. This work introduces the auto-parametrized weighted element-specific graph neural network, dubbed AweGNN, to overcome the obstacle of this tedious parametrization process while also being a suitable technique for automated feature extraction on these internally complex biomolecular data sets. The AweGNN is a neural network model based on geometric-graph features of element-pair interactions, with its graph parameters being updated throughout the training, which results in what we call a network-enabled automatic representation (NEAR). To enhance the predictions with small data sets, we construct multi-task (MT) AweGNN models in addition to single-task (ST) AweGNN models. The proposed methods are applied to various benchmark data sets, including four data sets for quantitative toxicity analysis and another data set for solvation prediction. Extensive numerical tests show that AweGNN models can achieve state-of-the-art performance in molecular property predictions.

ACS Style

Timothy Szocinski; Duc Duy Nguyen; Guo-Wei Wei. AweGNN: Auto-parametrized weighted element-specific graph neural networks for molecules. Computers in Biology and Medicine 2021, 134, 104460 .

AMA Style

Timothy Szocinski, Duc Duy Nguyen, Guo-Wei Wei. AweGNN: Auto-parametrized weighted element-specific graph neural networks for molecules. Computers in Biology and Medicine. 2021; 134 ():104460.

Chicago/Turabian Style

Timothy Szocinski; Duc Duy Nguyen; Guo-Wei Wei. 2021. "AweGNN: Auto-parametrized weighted element-specific graph neural networks for molecules." Computers in Biology and Medicine 134, no. : 104460.

Review
Published: 06 May 2021 in Annual Review of Biophysics

In the global health emergency caused by coronavirus disease 2019 (COVID-19), efficient and specific therapies are urgently needed. Compared with traditional small-molecular drugs, antibody therapies are relatively easy to develop; they are as specific as vaccines in targeting severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2); and they have thus attracted much attention in the past few months. This article reviews seven existing antibodies for neutralizing SARS-CoV-2 with 3D structures deposited in the Protein Data Bank (PDB). Five 3D antibody structures associated with the SARS-CoV spike (S) protein are also evaluated for their potential in neutralizing SARS-CoV-2. The interactions of these antibodies with the S protein receptor-binding domain (RBD) are compared with those between angiotensin-converting enzyme 2 and RBD complexes. Due to the orders of magnitude in the discrepancies of experimental binding affinities, we introduce topological data analysis, a variety of network models, and deep learning to analyze the binding strength and therapeutic potential of the 14 antibody–antigen complexes. The current COVID-19 antibody clinical trials, which are not limited to the S protein target, are also reviewed.

ACS Style

Jiahui Chen; Kaifu Gao; Rui Wang; Duc Duy Nguyen; Guo-Wei Wei. Review of COVID-19 Antibody Therapies. Annual Review of Biophysics 2021, 50, 1 -30.

AMA Style

Jiahui Chen, Kaifu Gao, Rui Wang, Duc Duy Nguyen, Guo-Wei Wei. Review of COVID-19 Antibody Therapies. Annual Review of Biophysics. 2021; 50 (1):1-30.

Chicago/Turabian Style

Jiahui Chen; Kaifu Gao; Rui Wang; Duc Duy Nguyen; Guo-Wei Wei. 2021. "Review of COVID-19 Antibody Therapies." Annual Review of Biophysics 50, no. 1: 1-30.

Edge article
Published: 13 April 2021 in Chemical Science

Antibody therapeutics and vaccines are among our last resort to end the raging COVID-19 pandemic.

ACS Style

Jiahui Chen; Kaifu Gao; Rui Wang; Guo-Wei Wei. Prediction and mitigation of mutation threats to COVID-19 vaccines and antibody therapies. Chemical Science 2021, 12, 6929 -6948.

AMA Style

Jiahui Chen, Kaifu Gao, Rui Wang, Guo-Wei Wei. Prediction and mitigation of mutation threats to COVID-19 vaccines and antibody therapies. Chemical Science. 2021; 12 (20):6929-6948.

Chicago/Turabian Style

Jiahui Chen; Kaifu Gao; Rui Wang; Guo-Wei Wei. 2021. "Prediction and mitigation of mutation threats to COVID-19 vaccines and antibody therapies." Chemical Science 12, no. 20: 6929-6948.

Preprint content
Published: 12 April 2021

The ongoing massive vaccination and the development of effective intervention offer the long-awaited hope to end the global rage of the COVID-19 pandemic. However, the rapidly growing SARS-CoV-2 variants might compromise existing vaccines and monoclonal antibody (mAb) therapies. Although there are valuable experimental studies about the potential threats from emerging variants, the results are limited to a handful of mutations and to mAbs from Eli Lilly and Regeneron. The potential threats from frequently occurring mutations on the SARS-CoV-2 spike (S) protein receptor-binding domain (RBD) to many mAbs in clinical trials are largely unknown. We fill the gap by developing a topology-based deep learning strategy that is validated with tens of thousands of experimental data points. We analyze 261,348 genome isolates from patients to identify 514 non-degenerate RBD mutations and investigate their impacts on 16 mAbs in clinical trials. Our findings, which are highly consistent with existing experimental results about variants from the UK, South Africa, Brazil, US-California, and Mexico, shed light on potential threats of 95 high-frequency mutations to mAbs in clinical trials, not only from Eli Lilly and Regeneron but also from Celltrion and Rockefeller University. We unveil, for the first time, that high-frequency mutations R346K/S, N439K, G446V, L455F, V483F/A, E484Q/V/A/G/D, F486L, F490L/V/S, Q493L, and S494P/L might compromise some of the mAbs in clinical trials. Our study gives rise to a general perspective about how mutations will affect current vaccines.

ACS Style

Jiahui Chen; Kaifu Gao; Rui Wang; Guo-Wei Wei. Revealing the threat of emerging SARS-CoV-2 mutations to antibody therapies. 2021.

AMA Style

Jiahui Chen, Kaifu Gao, Rui Wang, Guo-Wei Wei. Revealing the threat of emerging SARS-CoV-2 mutations to antibody therapies. 2021.

Chicago/Turabian Style

Jiahui Chen; Kaifu Gao; Rui Wang; Guo-Wei Wei. 2021. "Revealing the threat of emerging SARS-CoV-2 mutations to antibody therapies."

Research article
Published: 15 March 2021 in Journal of Chemical Information and Modeling

Toxicity analysis is a major challenge in drug design and discovery. Recently, significant progress has been made through machine learning due to its accuracy, efficiency, and lower cost. US Toxicology in the 21st Century (Tox21) screened a large library of compounds, including approximately 12 000 environmental chemicals and drugs, for different mechanisms responsible for eliciting toxic effects. The Tox21 Data Challenge offered a platform to evaluate different computational methods for toxicity predictions. Inspired by the success of multiscale weighted colored graph (MWCG) theory in protein–ligand binding affinity predictions, we consider MWCG theory for toxicity analysis. In the present work, we develop a geometric graph learning toxicity (GGL-Tox) model by integrating MWCG features and the gradient boosting decision tree (GBDT) algorithm. The benchmark tests of the Tox21 Data Challenge are employed to demonstrate the utility of the proposed GGL-Tox model. An extensive comparison with other state-of-the-art models indicates that GGL-Tox is an accurate and efficient model for toxicity analysis and prediction.
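
The abstract above does not spell out how an MWCG feature is computed. As a rough illustration only, the sketch below sums a distance-dependent kernel over all atom pairs of two chosen element types, so that each element pair contributes one number to a feature vector that a tree-based learner could consume. The generalized exponential kernel, the parameter names `eta` and `kappa`, their default values, and the toy coordinates are assumptions and are not taken from the GGL-Tox paper.

```python
import math


def exponential_kernel(distance, eta, kappa):
    """Generalized exponential kernel exp(-(r / eta) ** kappa) (assumed form)."""
    return math.exp(-((distance / eta) ** kappa))


def element_pair_feature(atoms, element_a, element_b, eta=2.0, kappa=2.0):
    """Sum kernel weights over ordered (element_a, element_b) atom pairs.

    atoms is a list of (element_symbol, (x, y, z)) tuples. When the two
    elements are identical, each unordered pair is counted twice.
    """
    total = 0.0
    for i, (elem_i, pos_i) in enumerate(atoms):
        for j, (elem_j, pos_j) in enumerate(atoms):
            if i == j:
                continue
            if elem_i == element_a and elem_j == element_b:
                total += exponential_kernel(math.dist(pos_i, pos_j), eta, kappa)
    return total


# Toy molecule with made-up coordinates: one carbon, two oxygens, one hydrogen.
atoms = [
    ("C", (0.0, 0.0, 0.0)),
    ("O", (2.0, 0.0, 0.0)),
    ("O", (0.0, 2.0, 0.0)),
    ("H", (1.0, 0.0, 0.0)),
]
feature_co = element_pair_feature(atoms, "C", "O")
print(feature_co)
```

Repeating the sum for every element pair and for several kernel scales would give a fixed-length, "multiscale" descriptor regardless of molecule size.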

ACS Style

Jian Jiang; Rui Wang; Guo-Wei Wei. GGL-Tox: Geometric Graph Learning for Toxicity Prediction. Journal of Chemical Information and Modeling 2021, 61, 1691 -1700.

AMA Style

Jian Jiang, Rui Wang, Guo-Wei Wei. GGL-Tox: Geometric Graph Learning for Toxicity Prediction. Journal of Chemical Information and Modeling. 2021; 61 (4):1691-1700.

Chicago/Turabian Style

Jian Jiang; Rui Wang; Guo-Wei Wei. 2021. "GGL-Tox: Geometric Graph Learning for Toxicity Prediction." Journal of Chemical Information and Modeling 61, no. 4: 1691-1700.

Author correction
Published: 03 March 2021 in Communications Biology

A Correction to this paper has been published: https://doi.org/10.1038/s42003-021-01867-y

ACS Style

Rui Wang; Jiahui Chen; Kaifu Gao; Yuta Hozumi; Changchuan Yin; Guo-Wei Wei. Author Correction: Analysis of SARS-CoV-2 mutations in the United States suggests presence of four substrains and novel variants. Communications Biology 2021, 4, 1 -1.

AMA Style

Rui Wang, Jiahui Chen, Kaifu Gao, Yuta Hozumi, Changchuan Yin, Guo-Wei Wei. Author Correction: Analysis of SARS-CoV-2 mutations in the United States suggests presence of four substrains and novel variants. Communications Biology. 2021; 4 (1):1-1.

Chicago/Turabian Style

Rui Wang; Jiahui Chen; Kaifu Gao; Yuta Hozumi; Changchuan Yin; Guo-Wei Wei. 2021. "Author Correction: Analysis of SARS-CoV-2 mutations in the United States suggests presence of four substrains and novel variants." Communications Biology 4, no. 1: 1-1.

Journal article
Published: 26 February 2021 in Applied Sciences

Mitochondrial cristae are dynamic invaginations of the inner membrane and play a key role in the mitochondrion's metabolic capacity to produce ATP. Structural alterations caused by either genetic abnormalities or detrimental environmental factors impede mitochondrial metabolic fluxes and lead to a decrease in their ability to meet metabolic energy requirements. While some of the key proteins associated with mitochondrial cristae are known, very little is known about how the inner membrane dynamics are involved in energy metabolism. In this study, we present a computational strategy to understand how cristae are formed using a phase-based separation approach of both the inner membrane space and matrix space, which are explicitly modeled using the Cahn–Hilliard equation. We show that cristae are formed as a consequence of minimizing an energy function associated with phase interactions which are subject to geometric boundary constraints. We then extend the model to explore how the presence of calcium phosphate granules, entities that form in calcium overload conditions, exerts a devastating inner membrane remodeling response that reduces the capacity for mitochondria to produce ATP. This modeling approach can be extended to include arbitrary geometrical constraints, the spatial heterogeneity of enzymes, and electrostatic effects to mechanize the impact of ultrastructural changes on energy metabolism.
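
For orientation, the textbook form of the Cahn–Hilliard model referred to above is sketched here. The symbols (phase field $\phi$, double-well potential $f$, interface width $\epsilon$, mobility $M$, domain $\Omega$) follow the standard presentation and are assumptions; the abstract does not state the specific energy functional or boundary constraints used in the paper.

```latex
\begin{align}
F[\phi] &= \int_{\Omega} \left( f(\phi) + \frac{\epsilon^{2}}{2}\,\lvert \nabla \phi \rvert^{2} \right) \mathrm{d}\mathbf{x}, \\
\mu &= \frac{\delta F}{\delta \phi} = f'(\phi) - \epsilon^{2}\, \nabla^{2} \phi, \\
\frac{\partial \phi}{\partial t} &= \nabla \cdot \left( M\, \nabla \mu \right).
\end{align}
```

A common choice is $f(\phi) = \tfrac{1}{4}(\phi^{2} - 1)^{2}$, whose two minima represent the two phases. The dynamics conserve the total amount of $\phi$ while decreasing $F$, which matches the abstract's description of cristae arising from energy minimization.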

ACS Style

Jasiel Strubbe-Rivera; Jiahui Chen; Benjamin West; Kristin Parent; Guo-Wei Wei; Jason Bazil. Modeling the Effects of Calcium Overload on Mitochondrial Ultrastructural Remodeling. Applied Sciences 2021, 11, 2071 .

AMA Style

Jasiel Strubbe-Rivera, Jiahui Chen, Benjamin West, Kristin Parent, Guo-Wei Wei, Jason Bazil. Modeling the Effects of Calcium Overload on Mitochondrial Ultrastructural Remodeling. Applied Sciences. 2021; 11 (5):2071.

Chicago/Turabian Style

Jasiel Strubbe-Rivera; Jiahui Chen; Benjamin West; Kristin Parent; Guo-Wei Wei; Jason Bazil. 2021. "Modeling the Effects of Calcium Overload on Mitochondrial Ultrastructural Remodeling." Applied Sciences 11, no. 5: 2071.

Journal article
Published: 22 February 2021 in Computers in Biology and Medicine

Coronavirus disease 2019 (COVID-19) caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has had a devastating worldwide effect. Understanding the evolution and transmission of SARS-CoV-2 is of paramount importance for controlling, combating and preventing COVID-19. Due to the rapid growth in both the number of SARS-CoV-2 genome sequences and the number of unique mutations, the phylogenetic analysis of SARS-CoV-2 genome isolates faces an emergent large-data challenge. We introduce a dimension-reduced K-means clustering strategy to tackle this challenge. We examine the performance and effectiveness of three dimension-reduction algorithms: principal component analysis (PCA), t-distributed stochastic neighbor embedding (t-SNE), and uniform manifold approximation and projection (UMAP). By using four benchmark datasets, we found that UMAP is the best-suited technique due to its stable, reliable, and efficient performance, its ability to improve clustering accuracy, especially for large Jaccard distance-based datasets, and its superior clustering visualization. The UMAP-assisted K-means clustering enables us to shed light on increasingly large datasets from SARS-CoV-2 genome isolates.
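
The "Jaccard distance-based datasets" mentioned above can be illustrated with a short sketch, assuming each genome isolate is represented by the set of mutations it carries. The isolate names and mutation labels below are placeholders, and the dimension-reduction and K-means steps themselves are not shown because they rely on third-party libraries; only the pairwise distance computation that would feed them is sketched.

```python
def jaccard_distance(a, b):
    """Jaccard distance 1 - |A & B| / |A | B| between two mutation sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here: two empty sets are identical
    return 1.0 - len(a & b) / len(union)


def distance_matrix(samples):
    """Symmetric matrix of pairwise Jaccard distances between samples."""
    n = len(samples)
    return [[jaccard_distance(samples[i], samples[j]) for j in range(n)]
            for i in range(n)]


# Placeholder isolates, each described by hypothetical mutation labels.
isolates = {
    "isolate_1": {"m1", "m2", "m3"},
    "isolate_2": {"m1", "m3", "m4"},
    "isolate_3": {"m5"},
}
matrix = distance_matrix(list(isolates.values()))
for name, row in zip(isolates, matrix):
    print(name, row)
```

Each row of the matrix is a feature vector whose length equals the number of isolates, which is why the abstract treats dimension reduction as a prerequisite for clustering large datasets.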

ACS Style

Yuta Hozumi; Rui Wang; Changchuan Yin; Guo-Wei Wei. UMAP-assisted K-means clustering of large-scale SARS-CoV-2 mutation datasets. Computers in Biology and Medicine 2021, 131, 104264 -104264.

AMA Style

Yuta Hozumi, Rui Wang, Changchuan Yin, Guo-Wei Wei. UMAP-assisted K-means clustering of large-scale SARS-CoV-2 mutation datasets. Computers in Biology and Medicine. 2021; 131 ():104264-104264.

Chicago/Turabian Style

Yuta Hozumi; Rui Wang; Changchuan Yin; Guo-Wei Wei. 2021. "UMAP-assisted K-means clustering of large-scale SARS-CoV-2 mutation datasets." Computers in Biology and Medicine 131, no. : 104264-104264.

Journal article
Published: 15 February 2021 in Communications Biology

SARS-CoV-2 has been mutating since it was first sequenced in early January 2020. Here, we analyze 45,494 complete SARS-CoV-2 genome sequences from around the world to understand their mutations. Among them, 12,754 sequences are from the United States. Our analysis suggests the presence of four substrains and eleven top mutations in the United States. These eleven top mutations belong to 3 disconnected groups. The first and second groups consisting of 5 and 8 concurrent mutations are prevailing, while the other group with three concurrent mutations gradually fades out. Moreover, we reveal that female immune systems are more active than those of males in responding to SARS-CoV-2 infections. One of the top mutations, 27964C > T-(S24L) on ORF8, has an unusually strong gender dependence. Based on the analysis of all mutations on the spike protein, we uncover that two of four SARS-CoV-2 substrains in the United States become potentially more infectious.

ACS Style

Rui Wang; Jiahui Chen; Kaifu Gao; Yuta Hozumi; Changchuan Yin; Guo-Wei Wei. Analysis of SARS-CoV-2 mutations in the United States suggests presence of four substrains and novel variants. Communications Biology 2021, 4, 1 -14.

AMA Style

Rui Wang, Jiahui Chen, Kaifu Gao, Yuta Hozumi, Changchuan Yin, Guo-Wei Wei. Analysis of SARS-CoV-2 mutations in the United States suggests presence of four substrains and novel variants. Communications Biology. 2021; 4 (1):1-14.

Chicago/Turabian Style

Rui Wang; Jiahui Chen; Kaifu Gao; Yuta Hozumi; Changchuan Yin; Guo-Wei Wei. 2021. "Analysis of SARS-CoV-2 mutations in the United States suggests presence of four substrains and novel variants." Communications Biology 4, no. 1: 1-14.

Journal article
Published: 05 February 2021 in npj Computational Materials

Accurate theoretical predictions of desired properties of materials play an important role in materials research and development. Machine learning (ML) can accelerate the materials design by building a model from input data. For complex datasets, such as those of crystalline compounds, a vital issue is how to construct low-dimensional representations for input crystal structures with chemical insights. In this work, we introduce an algebraic topology-based method, called atom-specific persistent homology (ASPH), as a unique representation of crystal structures. The ASPH can capture both pairwise and many-body interactions and reveal the topology-property relationship of a group of atoms at various scales. Combined with composition-based attributes, ASPH-based ML model provides a highly accurate prediction of the formation energy calculated by density functional theory (DFT). After training with more than 30,000 different structure types and compositions, our model achieves a mean absolute error of 61 meV/atom in cross-validation, which outperforms previous work such as Voronoi tessellations and Coulomb matrix method using the same ML algorithm and datasets. Our results indicate that the proposed topology-based method provides a powerful computational tool for predicting materials properties compared to previous works.
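
As a simplified illustration of the idea behind persistent homology restricted to particular atoms, the sketch below computes only the 0-dimensional barcode (connected components) for atoms of one chosen element. It assumes a Vietoris-Rips style filtration indexed by pairwise distance and uses a union-find structure. Selecting atoms by element is one possible reading of "atom-specific" and is an assumption, as are the function names and the toy coordinates; the paper's ASPH construction, including higher-dimensional features, is not reproduced here.

```python
import math
from itertools import combinations


def h0_barcode(points):
    """Finite death values of the 0-dimensional persistence barcode.

    Every component is born at filtration value 0; two components merge
    (one bar dies) when the distance threshold reaches the length of the
    shortest edge joining them. One bar never dies and is not returned.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            deaths.append(length)
    return deaths


def element_h0_barcode(atoms, element):
    """H0 death values using only atoms of the given element (assumed reading)."""
    selected = [pos for elem, pos in atoms if elem == element]
    return h0_barcode(selected)


# Toy structure with made-up coordinates.
atoms = [
    ("Na", (0.0, 0.0, 0.0)),
    ("Na", (1.0, 0.0, 0.0)),
    ("Na", (3.0, 0.0, 0.0)),
    ("Cl", (0.0, 1.0, 0.0)),
]
na_deaths = element_h0_barcode(atoms, "Na")
print(na_deaths)
```

Summary statistics of such death values (for example their mean, maximum, or count), gathered per element, are one way a barcode could be turned into fixed-length machine learning features.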

ACS Style

Yi Jiang; Dong Chen; Xin Chen; Tangyi Li; Guo-Wei Wei; Feng Pan. Topological representations of crystalline compounds for the machine-learning prediction of materials properties. npj Computational Materials 2021, 7, 1 -8.

AMA Style

Yi Jiang, Dong Chen, Xin Chen, Tangyi Li, Guo-Wei Wei, Feng Pan. Topological representations of crystalline compounds for the machine-learning prediction of materials properties. npj Computational Materials. 2021; 7 (1):1-8.

Chicago/Turabian Style

Yi Jiang; Dong Chen; Xin Chen; Tangyi Li; Guo-Wei Wei; Feng Pan. 2021. "Topological representations of crystalline compounds for the machine-learning prediction of materials properties." npj Computational Materials 7, no. 1: 1-8.

Editorial
Published: 02 February 2021 in Journal of Chemical Information and Modeling
ACS Style

Guo-Wei Wei; Thereza A. Soares; Habibah Wahab; Renxiao Wang. Computational Chemistry in Asia. Journal of Chemical Information and Modeling 2021, 61, 547 -547.

AMA Style

Guo-Wei Wei, Thereza A. Soares, Habibah Wahab, Renxiao Wang. Computational Chemistry in Asia. Journal of Chemical Information and Modeling. 2021; 61 (2):547-547.

Chicago/Turabian Style

Guo-Wei Wei; Thereza A. Soares; Habibah Wahab; Renxiao Wang. 2021. "Computational Chemistry in Asia." Journal of Chemical Information and Modeling 61, no. 2: 547-547.

Preprint content
Published: 29 January 2021

The ability to predict molecular properties quantitatively is of great significance to drug discovery, human health, and environmental protection. Despite considerable efforts, quantitative prediction of various molecular properties remains a challenge. Although some machine learning models, such as bidirectional encoder from transformer, can incorporate massive unlabeled molecular data into molecular representations via a self-supervised learning strategy, they neglect three-dimensional (3D) stereochemical information. Algebraic graphs, specifically element-specific multiscale weighted colored algebraic graphs, embed complementary 3D molecular information into graph invariants. We propose an algebraic graph-assisted bidirectional transformer (AGBT) model by fusing representations generated by algebraic graph and bidirectional transformer, as well as a variety of machine learning algorithms, including decision trees, multitask learning, and deep neural networks. We validate the proposed AGBT model on five benchmark molecular datasets, involving quantitative toxicity and partition coefficient. Extensive numerical experiments suggest that AGBT outperforms all other existing methods for all these molecular predictions.

ACS Style

Dong Chen; Kaifu Gao; Duc Nguyen; Xin Chen; Yi Jiang; Guowei Wei; Feng Pan. Algebraic Graph-assisted Bidirectional Transformers for Molecular Prediction. 2021.

AMA Style

Dong Chen, Kaifu Gao, Duc Nguyen, Xin Chen, Yi Jiang, Guowei Wei, Feng Pan. Algebraic Graph-assisted Bidirectional Transformers for Molecular Prediction. 2021.

Chicago/Turabian Style

Dong Chen; Kaifu Gao; Duc Nguyen; Xin Chen; Yi Jiang; Guowei Wei; Feng Pan. 2021. "Algebraic Graph-assisted Bidirectional Transformers for Molecular Prediction."

Journal article
Published: 01 January 2021 in Foundations of Data Science
ACS Style

Rui Wang; Rundong Zhao; Emily Ribando-Gros; Jiahui Chen; Yiying Tong; Guo-Wei Wei. HERMES: Persistent spectral graph software. Foundations of Data Science 2021, 3, 67 .

AMA Style

Rui Wang, Rundong Zhao, Emily Ribando-Gros, Jiahui Chen, Yiying Tong, Guo-Wei Wei. HERMES: Persistent spectral graph software. Foundations of Data Science. 2021; 3 (1):67.

Chicago/Turabian Style

Rui Wang; Rundong Zhao; Emily Ribando-Gros; Jiahui Chen; Yiying Tong; Guo-Wei Wei. 2021. "HERMES: Persistent spectral graph software." Foundations of Data Science 3, no. 1: 67.

Journal article
Published: 01 January 2021 in Discrete & Continuous Dynamical Systems - B
ACS Style

Jiahui Chen; Rundong Zhao; Yiying Tong; Guo-Wei Wei. Evolutionary de Rham-Hodge method. Discrete & Continuous Dynamical Systems - B 2021, 26, 3785 .

AMA Style

Jiahui Chen, Rundong Zhao, Yiying Tong, Guo-Wei Wei. Evolutionary de Rham-Hodge method. Discrete & Continuous Dynamical Systems - B. 2021; 26 (7):3785.

Chicago/Turabian Style

Jiahui Chen; Rundong Zhao; Yiying Tong; Guo-Wei Wei. 2021. "Evolutionary de Rham-Hodge method." Discrete & Continuous Dynamical Systems - B 26, no. 7: 3785.

Journal article
Published: 01 January 2021 in Communications in Information and Systems
ACS Style

Jiahui Chen; Rui Wang; Guo-Wei Wei. SARS-CoV-2 becoming more infectious as revealed by algebraic topology and deep learning. Communications in Information and Systems 2021, 21, 31 -36.

AMA Style

Jiahui Chen, Rui Wang, Guo-Wei Wei. SARS-CoV-2 becoming more infectious as revealed by algebraic topology and deep learning. Communications in Information and Systems. 2021; 21 (1):31-36.

Chicago/Turabian Style

Jiahui Chen; Rui Wang; Guo-Wei Wei. 2021. "SARS-CoV-2 becoming more infectious as revealed by algebraic topology and deep learning." Communications in Information and Systems 21, no. 1: 31-36.

Journal article
Published: 01 January 2021 in Foundations of Data Science

The $p$-persistent $q$-combinatorial Laplacian defined for a pair of simplicial complexes is a generalization of the $q$-combinatorial Laplacian. Given a filtration, the spectra of persistent combinatorial Laplacians not only recover the persistent Betti numbers of persistent homology but also provide extra multiscale geometrical information of the data. Paired with machine learning algorithms, the persistent Laplacian has many potential applications in data science. Seeking different ways to find the spectrum of an operator is an active research topic, and it becomes especially interesting when ideas originate from multiple fields. In this work, we explore an alternative approach for the spectrum of persistent Laplacians. As the eigenvalues of a persistent Laplacian matrix are the roots of its characteristic polynomial, one may attempt to find the roots of the characteristic polynomial by homotopy continuation, thus resolving the spectrum of the corresponding persistent Laplacian. We consider a set of simple polytopes and small molecules to prove the principle that algebraic topology, combinatorial graph, and algebraic geometry can be integrated to understand the shape of data.
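
The link between a Laplacian's eigenvalues and the roots of its characteristic polynomial can be made concrete with a small sketch. It builds the ordinary graph Laplacian (the $q = 0$ combinatorial Laplacian, without persistence) of a three-vertex path and obtains the characteristic polynomial with the Faddeev-LeVerrier recursion in exact arithmetic. This is not the homotopy continuation solver of the paper; it only shows the polynomial whose roots such a solver would track. The function names and the example graph are assumptions.

```python
from fractions import Fraction


def graph_laplacian(n_vertices, edges):
    """Graph Laplacian: degree matrix minus adjacency matrix."""
    lap = [[0] * n_vertices for _ in range(n_vertices)]
    for u, v in edges:
        lap[u][u] += 1
        lap[v][v] += 1
        lap[u][v] -= 1
        lap[v][u] -= 1
    return lap


def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly(a):
    """Coefficients [c_0, ..., c_n] of det(x*I - a), Faddeev-LeVerrier."""
    n = len(a)
    coeffs = [Fraction(0)] * (n + 1)
    coeffs[n] = Fraction(1)
    m = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        am = mat_mul(a, m)
        # M_k = A * M_{k-1} + c_{n-k+1} * I
        m = [[am[i][j] + (coeffs[n - k + 1] if i == j else 0)
              for j in range(n)] for i in range(n)]
        a_m = mat_mul(a, m)
        trace = sum(a_m[i][i] for i in range(n))
        coeffs[n - k] = -trace / k
    return coeffs


def evaluate(coeffs, x):
    """Evaluate the polynomial at x with Horner's rule."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * x + c
    return result


path = graph_laplacian(3, [(0, 1), (1, 2)])
coeffs = char_poly(path)
print([int(c) for c in coeffs])  # x^3 - 4x^2 + 3x, lowest degree first
print([evaluate(coeffs, root) == 0 for root in (0, 1, 3)])
```

For this path graph the polynomial factors as $x(x-1)(x-3)$. The single zero root reflects the one connected component (Betti number $\beta_0 = 1$), while the nonzero roots carry the additional geometric information the abstract refers to.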

ACS Style

Xiaoqi Wei; Guo-Wei Wei. Homotopy continuation for the spectra of persistent Laplacians. Foundations of Data Science 2021.

AMA Style

Xiaoqi Wei, Guo-Wei Wei. Homotopy continuation for the spectra of persistent Laplacians. Foundations of Data Science. 2021.

Chicago/Turabian Style

Xiaoqi Wei; Guo-Wei Wei. 2021. "Homotopy continuation for the spectra of persistent Laplacians." Foundations of Data Science.