Asif Khan
School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100811, China

Feed

Journal article
Published: 24 May 2021 in Sustainability

Neural relation extraction (NRE) models are the backbone of various machine learning tasks, including knowledge base enrichment, information extraction, and document summarization. Despite the vast popularity of these models, their vulnerabilities remain unknown; this is of high concern given their growing use in security-sensitive applications such as question answering and machine translation in the aspects of sustainability. In this study, we demonstrate that NRE models are inherently vulnerable to adversarially crafted text that contains imperceptible modifications of the original but can mislead the target NRE model. Specifically, we propose a novel sustainable term frequency-inverse document frequency (TFIDF) based black-box adversarial attack to evaluate the robustness of state-of-the-art CNN, CGN, LSTM, and BERT-based models on two benchmark RE datasets. Compared with white-box adversarial attacks, black-box attacks impose further constraints on the query budget; thus, efficient black-box attacks remain an open problem. By applying TFIDF to the correctly classified sentences of each class label in the test set, the proposed query-efficient method achieves a reduction of up to 70% in the number of queries to the target model for identifying important text items. Based on these items, we design both character- and word-level perturbations to generate adversarial examples. The proposed attack successfully reduces the accuracy of six representative models from an average F1 score of 80% to below 20%. The generated adversarial examples were evaluated by humans and are considered semantically similar. Moreover, we discuss defense strategies that mitigate such attacks, and the potential countermeasures that could be deployed in order to improve sustainability of the proposed scheme.
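The ranking and perturbation steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that all sentences of one class label are pooled into a single "document" for TF-IDF, and it uses a simple swap of two inner characters as a stand-in for the character-level perturbation. The function names, the example sentences, and the label names are hypothetical. Note that the ranking itself sends no queries to the target model, which is the kind of saving the abstract refers to.

```python
import math
from collections import Counter


def class_tfidf(sentences_by_label):
    """Score each token per class label, treating a class's sentences as one document."""
    # Term counts per class "document" (an assumption of this sketch).
    counts = {
        label: Counter(tok for s in sents for tok in s.lower().split())
        for label, sents in sentences_by_label.items()
    }
    n_docs = len(counts)
    # Document frequency: in how many classes does the token occur?
    df = Counter(tok for c in counts.values() for tok in c)
    scores = {}
    for label, c in counts.items():
        total = sum(c.values())
        scores[label] = {
            tok: (n / total) * math.log(n_docs / df[tok]) for tok, n in c.items()
        }
    return scores


def top_items(scores, label, k=3):
    """Return the k highest-scoring tokens for a label (ties broken alphabetically)."""
    ranked = sorted(scores[label].items(), key=lambda kv: (-kv[1], kv[0]))
    return [tok for tok, _ in ranked[:k]]


def swap_inner_chars(word):
    """Hypothetical character-level perturbation: swap the second and third characters."""
    if len(word) < 4:
        return word
    return word[0] + word[2] + word[1] + word[3:]


def perturb(sentence, important):
    """Perturb only the tokens flagged as important for the sentence's class."""
    return " ".join(
        swap_inner_chars(t) if t.lower() in important else t for t in sentence.split()
    )


# Hypothetical, correctly classified test sentences grouped by relation label.
data = {
    "founded_by": ["Jobs founded Apple", "Gates founded Microsoft"],
    "born_in": ["Jobs was born in California", "Gates was born in Seattle"],
}
scores = class_tfidf(data)
important = set(top_items(scores, "founded_by", k=1))
adversarial = perturb("Jobs founded Apple", important)
```

In this toy data, tokens shared by both classes (such as the entity names) receive a score of zero, so the perturbation budget is spent on the class-specific trigger word.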

ACS Style

Ijaz Haq; Zahid Khan; Arshad Ahmad; Bashir Hayat; Asif Khan; Ye-Eun Lee; Ki-Il Kim. Evaluating and Enhancing the Robustness of Sustainable Neural Relationship Classifiers Using Query-Efficient Black-Box Adversarial Attacks. Sustainability 2021, 13, 5892.

AMA Style

Ijaz Haq, Zahid Khan, Arshad Ahmad, Bashir Hayat, Asif Khan, Ye-Eun Lee, Ki-Il Kim. Evaluating and Enhancing the Robustness of Sustainable Neural Relationship Classifiers Using Query-Efficient Black-Box Adversarial Attacks. Sustainability. 2021; 13 (11):5892.

Chicago/Turabian Style

Ijaz Haq; Zahid Khan; Arshad Ahmad; Bashir Hayat; Asif Khan; Ye-Eun Lee; Ki-Il Kim. 2021. "Evaluating and Enhancing the Robustness of Sustainable Neural Relationship Classifiers Using Query-Efficient Black-Box Adversarial Attacks." Sustainability 13, no. 11: 5892.

Review article
Published: 07 April 2021 in Complexity

Context. Social media platforms such as Facebook and Twitter carry a large volume of people’s opinions about politics and leaders, which makes them a good source of information for researchers working on tasks such as election prediction. Objective. Identify, categorize, and present a comprehensive overview of the approaches, techniques, and tools used in election predictions on Twitter. Method. We conducted a systematic mapping study (SMS) on election predictions on Twitter and provided empirical evidence for the work published between January 2010 and January 2021. Results. This research identified 787 studies related to election predictions on Twitter. 98 primary studies were selected after defining and implementing several inclusion/exclusion criteria. The results show that most of the studies implemented sentiment analysis (SA), followed by volume-based and social network analysis (SNA) approaches. The majority of the studies employed supervised learning techniques, followed by lexicon-based SA, volume-based approaches, and unsupervised learning. Besides this, 18 types of dictionaries were identified. Elections of 28 countries were analyzed, mainly US (28%) and Indian (25%) elections. Furthermore, the results revealed that 50% of the primary studies used English tweets. The demographic data showed that academic organizations and conference venues are the most active. Conclusion. The evolution of the work published in the past 11 years shows that most of the studies employed SA. SNA techniques are implemented less often than SA. Appropriately labelled political datasets are not available, especially in languages other than English. Deep learning needs to be employed in this domain to get better predictions.
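To make the difference between the volume-based and lexicon-based SA approaches named in this abstract concrete, here is a small sketch. It is only an illustration under stated assumptions: the two word lists are a hypothetical mini-lexicon, the tweets are invented, and each tweet is assumed to be already attributed to one candidate. None of it is taken from the surveyed studies.

```python
from collections import Counter

# Hypothetical mini-lexicon; real studies use far larger dictionaries.
POSITIVE = {"great", "win", "support"}
NEGATIVE = {"bad", "lose", "corrupt"}


def lexicon_polarity(text):
    """Return +1, -1, or 0 from counts of positive and negative lexicon words."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return (score > 0) - (score < 0)


def volume_share(tweets):
    """Volume-based estimate: each candidate's share of all mentions."""
    counts = Counter(candidate for candidate, _ in tweets)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}


def positive_share(tweets):
    """Sentiment-based estimate: each candidate's share of the positive tweets."""
    counts = Counter(c for c, text in tweets if lexicon_polarity(text) > 0)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()} if total else {}


# Invented (candidate, tweet text) pairs for illustration only.
tweets = [
    ("A", "great rally we will win"),
    ("A", "corrupt and bad"),
    ("A", "we support this plan"),
    ("B", "they will lose"),
    ("B", "great speech"),
]
by_volume = volume_share(tweets)
by_sentiment = positive_share(tweets)
```

The two estimates can disagree: a candidate may dominate the mention volume while a smaller share of those mentions is favourable.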

ACS Style

Asif Khan; Huaping Zhang; Nada Boudjellal; Arshad Ahmad; Jianyun Shang; Lin Dai; Bashir Hayat. Election Prediction on Twitter: A Systematic Mapping Study. Complexity 2021, 2021, 1-27.

AMA Style

Asif Khan, Huaping Zhang, Nada Boudjellal, Arshad Ahmad, Jianyun Shang, Lin Dai, Bashir Hayat. Election Prediction on Twitter: A Systematic Mapping Study. Complexity. 2021; 2021:1-27.

Chicago/Turabian Style

Asif Khan; Huaping Zhang; Nada Boudjellal; Arshad Ahmad; Jianyun Shang; Lin Dai; Bashir Hayat. 2021. "Election Prediction on Twitter: A Systematic Mapping Study." Complexity 2021: 1-27.

Research article
Published: 13 March 2021 in Complexity

The web is loaded daily with a huge volume of data, mainly unstructured textual data, which significantly increases the need for information extraction and NLP systems. The named-entity recognition task is a key step towards efficiently understanding text data and saving time and effort. As a globally used language, English dominates the research conducted in this field, especially in the biomedical domain. By contrast, Arabic suffers from a lack of resources. This work presents a BERT-based model to identify biomedical named entities in Arabic text data (specifically disease and treatment named entities). It investigates how effectively pretraining a monolingual BERT model with a small-scale biomedical dataset enhances the model's understanding of Arabic biomedical text. The model's performance was compared with two state-of-the-art models (namely, AraBERT and multilingual BERT cased), and it outperformed both models with an 85% F1-score.
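The F1-score reported in this abstract can be computed in several ways, and the abstract does not say which one was used. The sketch below shows one common choice, entity-level F1 over BIO tag sequences, where a predicted entity counts only if its boundaries and type both match. The tag names `Disease` and `Treatment` and the example sequences are hypothetical.

```python
def bio_spans(tags):
    """Extract (start, end, type) spans from a BIO tag sequence; end is exclusive."""
    spans, start, kind = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a trailing span
        inside = tag.startswith("I-") and kind == tag[2:]
        if start is not None and not inside:
            spans.add((start, i, kind))
            start, kind = None, None
        if tag.startswith("B-"):
            start, kind = i, tag[2:]
    return spans


def entity_f1(gold_tags, pred_tags):
    """Entity-level F1: a span is correct only on an exact boundary and type match."""
    gold, pred = bio_spans(gold_tags), bio_spans(pred_tags)
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical gold and predicted tags for a four-token sentence.
gold = ["B-Disease", "I-Disease", "O", "B-Treatment"]
pred = ["B-Disease", "I-Disease", "O", "O"]
score = entity_f1(gold, pred)
```

Here the model finds one of the two gold entities and predicts nothing spurious, so precision is 1, recall is 0.5, and F1 is 2/3.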

ACS Style

Nada Boudjellal; Huaping Zhang; Asif Khan; Arshad Ahmad; Rashid Naseem; Jianyun Shang; Lin Dai. ABioNER: A BERT-Based Model for Arabic Biomedical Named-Entity Recognition. Complexity 2021, 2021, 1-6.

AMA Style

Nada Boudjellal, Huaping Zhang, Asif Khan, Arshad Ahmad, Rashid Naseem, Jianyun Shang, Lin Dai. ABioNER: A BERT-Based Model for Arabic Biomedical Named-Entity Recognition. Complexity. 2021; 2021:1-6.

Chicago/Turabian Style

Nada Boudjellal; Huaping Zhang; Asif Khan; Arshad Ahmad; Rashid Naseem; Jianyun Shang; Lin Dai. 2021. "ABioNER: A BERT-Based Model for Arabic Biomedical Named-Entity Recognition." Complexity 2021: 1-6.

Research article
Published: 09 October 2020 in Complexity

The rapidly growing data in many areas, including the biomedical domain, require the assistance of information extraction systems to acquire much-needed knowledge about specific entities such as proteins, drugs, or diseases within a short time. Annotated corpora facilitate the process of building NLP systems. While a colossal amount of work has been done in this area for the English language, other languages like Arabic lack these resources, especially in the healthcare area. Therefore, in this work, we present a method to develop a silver standard medical corpus for the Arabic language with a dictionary as a minimal supervision tool. The corpus contains 49,856 sentences tagged with 13 entity types corresponding to a subset of UMLS (Unified Medical Language System) concept types. Evaluation of a subset of the corpus showed that the annotation method is effective, with 90% accuracy.
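Dictionary-driven annotation of the kind described in this abstract can be sketched as a greedy longest-match lookup that emits BIO tags. This is an assumption about how such a tagger might work, not the authors' pipeline: the dictionary entries, the entity type names, and the English example sentence (used here only for readability in place of Arabic text) are all hypothetical.

```python
def annotate(tokens, dictionary, max_len=3):
    """Tag tokens with BIO labels by greedy longest-match dictionary lookup."""
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        match = None
        # Try the longest candidate phrase first so multiword terms win.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in dictionary:
                match = (n, dictionary[phrase])
                break
        if match is None:
            i += 1
            continue
        n, label = match
        tags[i] = "B-" + label
        for j in range(i + 1, i + n):
            tags[j] = "I-" + label
        i += n
    return tags


# Hypothetical dictionary mapping surface forms to entity types.
dictionary = {
    "diabetes": "Disease",
    "type 2 diabetes": "Disease",
    "insulin": "Drug",
}
tokens = "patients with type 2 diabetes receive insulin".split()
tags = annotate(tokens, dictionary)
```

Because the labels come from lookup rather than human annotators, a corpus built this way is "silver" rather than "gold" standard, which is why a manually checked subset is needed to estimate accuracy.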

ACS Style

Nada Boudjellal; Huaping Zhang; Asif Khan; Arshad Ahmad; Rashid Naseem; Lin Dai. A Silver Standard Biomedical Corpus for Arabic Language. Complexity 2020, 2020, 1-7.

AMA Style

Nada Boudjellal, Huaping Zhang, Asif Khan, Arshad Ahmad, Rashid Naseem, Lin Dai. A Silver Standard Biomedical Corpus for Arabic Language. Complexity. 2020; 2020:1-7.

Chicago/Turabian Style

Nada Boudjellal; Huaping Zhang; Asif Khan; Arshad Ahmad; Rashid Naseem; Lin Dai. 2020. "A Silver Standard Biomedical Corpus for Arabic Language." Complexity 2020: 1-7.

Research article
Published: 01 September 2020 in Scientific Programming

Politics is one of the hottest and most commonly mentioned and viewed topics on social media networks nowadays. Microblogging platforms like Twitter and Weibo are widely used by many politicians who have a huge number of followers and supporters on those platforms. It is essential to study the supporters’ network of political leaders because it can help in decision making when predicting their political futures. This study focuses on the supporters’ network of three famous political leaders of Pakistan, namely, Imran Khan (IK), Maryam Nawaz Sharif (MNS), and Bilawal Bhutto Zardari (BBZ). This is done using social network analysis and semantic analysis. The proposed method (1) detects and removes fake supporter(s), (2) mines communities in the politicians’ social network(s), (3) investigates the supporters’ reply network for conversations between supporters about each leader, and, finally, (4) analyses the retweet network for information diffusion of each political leader. Furthermore, sentiment analysis of the supporters of politicians is done using machine learning techniques, which ultimately predicted and revealed the strongest supporter network(s) among the three political leaders. Analysis of this data reveals that, as of October 2017, (1) IK was the most renowned of the three politicians and had the strongest supporters’ community while using Twitter in a very controlled manner, (2) BBZ had the weakest supporters’ network on Twitter, and (3) the supporters of the political leaders in Pakistan are flexible on Twitter, communicating with each other, so that any group of supporters has a low level of isolation.
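The retweet-network step in this abstract can be illustrated with two simple measures: how often each account is retweeted, and what fraction of retweets cross supporter groups. The second measure is only a rough, assumed proxy for the "low level of isolation" the abstract reports; the study's actual metrics are not given here. Account names and group labels are hypothetical.

```python
from collections import Counter


def retweet_in_degree(edges):
    """Count how many times each account was retweeted; edges are (retweeter, author)."""
    return Counter(author for _, author in edges)


def cross_group_share(edges, group_of):
    """Fraction of edges whose endpoints belong to different supporter groups."""
    known = [(u, v) for u, v in edges if u in group_of and v in group_of]
    if not known:
        return 0.0
    crossing = sum(1 for u, v in known if group_of[u] != group_of[v])
    return crossing / len(known)


# Hypothetical retweet edges and supporter-group assignments.
edges = [
    ("user_a", "leader_1"),
    ("user_b", "leader_1"),
    ("user_c", "leader_2"),
    ("user_a", "user_c"),
]
group_of = {
    "user_a": "group_1",
    "user_b": "group_1",
    "leader_1": "group_1",
    "user_c": "group_2",
    "leader_2": "group_2",
}
in_degree = retweet_in_degree(edges)
share = cross_group_share(edges, group_of)
```

A higher cross-group share means supporters of different leaders interact more, which would correspond to less isolated communities.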

ACS Style

Asif Khan; Huaping Zhang; Jianyun Shang; Nada Boudjellal; Arshad Ahmad; Asmat Ali; Lin Dai. Predicting Politician’s Supporters’ Network on Twitter Using Social Network Analysis and Semantic Analysis. Scientific Programming 2020, 2020, 1-17.

AMA Style

Asif Khan, Huaping Zhang, Jianyun Shang, Nada Boudjellal, Arshad Ahmad, Asmat Ali, Lin Dai. Predicting Politician’s Supporters’ Network on Twitter Using Social Network Analysis and Semantic Analysis. Scientific Programming. 2020; 2020:1-17.

Chicago/Turabian Style

Asif Khan; Huaping Zhang; Jianyun Shang; Nada Boudjellal; Arshad Ahmad; Asmat Ali; Lin Dai. 2020. "Predicting Politician’s Supporters’ Network on Twitter Using Social Network Analysis and Semantic Analysis." Scientific Programming 2020: 1-17.

Review
Published: 15 July 2020 in Security and Communication Networks

Context. Over the last couple of decades, improvements in requirements engineering (RE) processes and methods have been accompanied by a rapid rise in the effective use of diverse machine learning (ML) techniques to resolve multifaceted RE issues. One such challenging issue is the effective identification and classification of software requirements on Stack Overflow (SO) for building quality systems. ML-based techniques have produced substantial results on this issue, much more effective than those produced by the usual natural language processing (NLP) techniques. Nonetheless, a complete, systematic, and detailed understanding of these ML-based techniques is scarce. Objective. To identify and classify the kinds of ML algorithms used for software requirements identification, primarily on SO. Method. This paper reports a systematic literature review (SLR) collecting empirical evidence published up to May 2020. Results. This SLR study found 2,484 published papers related to RE and SO. The data extraction process of the SLR showed that (1) Latent Dirichlet Allocation (LDA) topic modeling is among the most widely used ML algorithms in the selected studies and (2) precision and recall are among the most commonly used evaluation measures for the performance of these ML algorithms. Conclusion. Our SLR study revealed that while ML algorithms have phenomenal capabilities for identifying software requirements on SO, they still face various open problems that will eventually limit their practical application and performance. Our SLR study calls for close collaboration between the RE and ML communities to handle the open issues confronted in the development of real-world ML-based quality systems.

ACS Style

Arshad Ahmad; Chong Feng; Muzammil Khan; Asif Khan; Ayaz Ullah; Shah Nazir; Adnan Tahir. A Systematic Literature Review on Using Machine Learning Algorithms for Software Requirements Identification on Stack Overflow. Security and Communication Networks 2020, 2020, 1-19.

AMA Style

Arshad Ahmad, Chong Feng, Muzammil Khan, Asif Khan, Ayaz Ullah, Shah Nazir, Adnan Tahir. A Systematic Literature Review on Using Machine Learning Algorithms for Software Requirements Identification on Stack Overflow. Security and Communication Networks. 2020; 2020:1-19.

Chicago/Turabian Style

Arshad Ahmad; Chong Feng; Muzammil Khan; Asif Khan; Ayaz Ullah; Shah Nazir; Adnan Tahir. 2020. "A Systematic Literature Review on Using Machine Learning Algorithms for Software Requirements Identification on Stack Overflow." Security and Communication Networks 2020: 1-19.

Review article
Published: 16 June 2020 in Scientific Programming

With the accelerating growth of big data, especially in the healthcare area, information extraction is needed now more than ever, for it can convert unstructured information into easily interpretable structured data. Relation extraction is the second of the two important tasks of information extraction. This study presents an overview of relation extraction using distant supervision, providing a generalized architecture of this task based on the state-of-the-art work that proposed this method. In addition, it surveys the methods used in the literature on this topic and describes the knowledge bases and corpora used in the process, which can be helpful for beginner practitioners seeking knowledge on this subject. Moreover, the limitations of the proposed approaches and future challenges are highlighted, and possible solutions are proposed.
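The core idea of distant supervision mentioned in this abstract is to label sentences automatically by aligning them with facts from a knowledge base: any sentence that mentions both entities of a known fact is assumed to express that relation. The sketch below shows this alignment with plain substring matching. The knowledge-base entries, relation names, and sentences are hypothetical, and real systems would use entity linking rather than substring search.

```python
def distant_label(sentences, kb):
    """Label a sentence with a KB relation when it mentions both entities of a fact."""
    examples = []
    for sentence in sentences:
        text = sentence.lower()
        for (head, tail), relation in kb.items():
            if head.lower() in text and tail.lower() in text:
                examples.append((sentence, head, tail, relation))
    return examples


# Hypothetical knowledge base of (head entity, tail entity) -> relation.
kb = {
    ("aspirin", "headache"): "treats",
    ("smoking", "lung cancer"): "causes",
}
sentences = [
    "Aspirin is commonly used to relieve headache.",
    "Aspirin does not cause headache in most patients.",
    "Smoking is a risk factor for heart disease.",
]
training = distant_label(sentences, kb)
```

The second sentence receives the label `treats` even though it does not express that relation. This kind of noisy label is a well-known limitation of the distant supervision assumption.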

ACS Style

Nada Boudjellal; Huaping Zhang; Asif Khan; Arshad Ahmad. Biomedical Relation Extraction Using Distant Supervision. Scientific Programming 2020, 2020, 1-9.

AMA Style

Nada Boudjellal, Huaping Zhang, Asif Khan, Arshad Ahmad. Biomedical Relation Extraction Using Distant Supervision. Scientific Programming. 2020; 2020:1-9.

Chicago/Turabian Style

Nada Boudjellal; Huaping Zhang; Asif Khan; Arshad Ahmad. 2020. "Biomedical Relation Extraction Using Distant Supervision." Scientific Programming 2020: 1-9.

Conference paper
Published: 01 October 2019 in 2019 IEEE 10th International Conference on Software Engineering and Service Science (ICSESS)

Context: Developments in requirements engineering (RE) methods over the last decade or two have seen a rise in the use of different machine-learning (ML) algorithms to solve some complex RE problems. One such problem is identifying and classifying software requirements on Stack Overflow (SO). ML-based techniques have shown convincing results on this problem, much better than those generated by some traditional natural language processing (NLP) techniques. Nevertheless, a comprehensive and systematic understanding of these ML-based techniques is still lacking. Objective: To identify and classify the types of ML algorithms used for identifying software requirements on SO. Method: This article reports a systematic literature review (SLR) gathering evidence published up to August 2019. Results: This study identified 1073 published papers related to RE and SO. Only 12 primary papers were selected. The data extraction process revealed that (1) Latent Dirichlet Allocation (LDA) topic modeling is the most widely used ML algorithm in the selected studies, and (2) precision and recall are the most commonly used evaluation measures for the performance of these ML algorithms. Conclusion: The SLR finds that while ML algorithms have great potential for identifying software requirements on SO, they face some open issues that will ultimately affect their performance and practical application. The SLR calls for collaboration between RE and ML researchers to tackle the open issues facing the development of real-world ML systems.

ACS Style

Arshad Ahmad; Chong Feng; Adnan Tahir; Asif Khan; Muhammad Waqas; Sadique Ahmad; Ayaz Ullah. An Empirical Evaluation of Machine Learning Algorithms for Identifying Software Requirements on Stack Overflow: Initial Results. 2019 IEEE 10th International Conference on Software Engineering and Service Science (ICSESS) 2019, 689-693.

AMA Style

Arshad Ahmad, Chong Feng, Adnan Tahir, Asif Khan, Muhammad Waqas, Sadique Ahmad, Ayaz Ullah. An Empirical Evaluation of Machine Learning Algorithms for Identifying Software Requirements on Stack Overflow: Initial Results. 2019 IEEE 10th International Conference on Software Engineering and Service Science (ICSESS). 2019:689-693.

Chicago/Turabian Style

Arshad Ahmad; Chong Feng; Adnan Tahir; Asif Khan; Muhammad Waqas; Sadique Ahmad; Ayaz Ullah. 2019. "An Empirical Evaluation of Machine Learning Algorithms for Identifying Software Requirements on Stack Overflow: Initial Results." 2019 IEEE 10th International Conference on Software Engineering and Service Science (ICSESS): 689-693.