
Prof. Jorge Bernardino
Instituto Politécnico de Coimbra, Instituto Superior de Engenharia de Coimbra, Coimbra, Portugal


Research Keywords & Expertise

Data Warehousing
Software
Big data, business intelligence and data science
Data analysis, business analytics, information systems, decision analysis, business informatics
NoSQL Database


Feed

Journal article
Published: 09 August 2021 in Journal of Clinical Medicine

We carried out a retrospective analysis of infertile couple data using several methodologies and data analysis techniques, including the application of a novel data mining approach for analyzing varicocele treatment outcomes. The aim of this work was to characterize embolized varicocele patients by ascertaining the improvement of some of their clinical features, predicting the success of treatment via pregnancy outcomes, and identifying data patterns that can contribute to both ongoing varicocele research and the more effective management of patients treated for varicocele. We retrospectively surveyed the data of 293 consenting couples undergoing infertility treatment with male varicocele embolization over a 10-year period. Sperm samples were collected before and at 3, 6, and 12 months after varicocele embolization treatment and analyzed with World Health Organization parameters; varicocele severity grades were assessed with medical assessment and scrotal ultrasound; patient personal information (e.g., age, lifestyle, and embolization complications) was collected with clinical inquiries; and varicocele embolization success was measured through pregnancy outcomes. Varicocele embolization significantly improved sperm concentration, motility, and morphology mean values, as well as sperm chromatin integrity. Following this study, we can predict that a male patient without a high varicocele severity grade (with grade I or II) has a 70.83% chance of conceiving after embolization treatment if his partner’s age is between 24 and 33, with an accuracy of 70.59%. Furthermore, male patients successful in achieving pregnancy following embolization are mostly characterized by having a normal sperm progressive motility before treatment, a normal sperm concentration after treatment, a moderate to low varicocele severity grade, and not working in a putatively hazardous environment.

ACS Style

Ana Sousa; Judith Santos-Pereira; Maria Freire; Belmiro Parada; Teresa Almeida-Santos; Jorge Bernardino; João Ramalho-Santos. Using Data Mining to Assist in Predicting Reproductive Outcomes Following Varicocele Embolization. Journal of Clinical Medicine 2021, 10, 3503 .

AMA Style

Ana Sousa, Judith Santos-Pereira, Maria Freire, Belmiro Parada, Teresa Almeida-Santos, Jorge Bernardino, João Ramalho-Santos. Using Data Mining to Assist in Predicting Reproductive Outcomes Following Varicocele Embolization. Journal of Clinical Medicine. 2021; 10 (16):3503.

Chicago/Turabian Style

Ana Sousa; Judith Santos-Pereira; Maria Freire; Belmiro Parada; Teresa Almeida-Santos; Jorge Bernardino; João Ramalho-Santos. 2021. "Using Data Mining to Assist in Predicting Reproductive Outcomes Following Varicocele Embolization." Journal of Clinical Medicine 10, no. 16: 3503.

Review article
Published: 08 June 2021 in Journal of King Saud University - Computer and Information Sciences

The healthcare industry has become increasingly challenging, requiring retrieval of knowledge from large amounts of complex data to find the best treatments. Several works have suggested the use of Data Mining tools to overcome the challenges; however, none of them has suggested the best tool to do so. To fill this gap, this paper presents a survey of popular open-source data mining tools in which data mining tool selection criteria based on healthcare application requirements are proposed and the best tools according to those criteria are identified. The following popular open-source data mining tools are assessed: KNIME, R, RapidMiner, Scikit-learn, and Spark. The study shows that KNIME and RapidMiner provide the largest coverage of healthcare data mining requirements.

ACS Style

Judith Santos-Pereira; Le Gruenwald; Jorge Bernardino. Top Data Mining Tools for the Healthcare Industry. Journal of King Saud University - Computer and Information Sciences 2021, 1 .

AMA Style

Judith Santos-Pereira, Le Gruenwald, Jorge Bernardino. Top Data Mining Tools for the Healthcare Industry. Journal of King Saud University - Computer and Information Sciences. 2021; ():1.

Chicago/Turabian Style

Judith Santos-Pereira; Le Gruenwald; Jorge Bernardino. 2021. "Top Data Mining Tools for the Healthcare Industry." Journal of King Saud University - Computer and Information Sciences , no. : 1.

Journal article
Published: 04 May 2021 in IEEE Access

The increase of automated systems in space missions raises concerns about safety and reliability in operations carried out by satellites due to performance degradation. There have been several studies about the automatic planning process, but many approaches generate plans with invalid states. An invalid state can be understood as a prohibited, degraded or risky scenario for the domain. This paper proposes an automated planning process with restrictions that prevents automatic planners from generating plans with invalid states. We implement a validator method for the planner software which proves that plan generation matches the restrictions imposed on the domain. In the experiments, we test an automatic planning process that is specific to the aerospace area, where a knowledge base with invalid states is available in the context of the operation of a satellite. Our proposal to carry out the verification of invalid states in automatic planning can contribute to plans being generated with higher quality, ensuring that the goal of a plan is only achieved through valid intermediate states. It is also expected that the generated plans will be executed with better performance and will require fewer computational resources, since the search space is reduced.

ACS Style

Caio Gustavo Rodrigues da Cruz; Rodrigo Rocha Silva; Mauricio Goncalves Vieira Ferreira; Jorge Bernardino. Automated Planning With Invalid States Prediction. IEEE Access 2021, 9, 68289 -68301.

AMA Style

Caio Gustavo Rodrigues da Cruz, Rodrigo Rocha Silva, Mauricio Goncalves Vieira Ferreira, Jorge Bernardino. Automated Planning With Invalid States Prediction. IEEE Access. 2021; 9 ():68289-68301.

Chicago/Turabian Style

Caio Gustavo Rodrigues da Cruz; Rodrigo Rocha Silva; Mauricio Goncalves Vieira Ferreira; Jorge Bernardino. 2021. "Automated Planning With Invalid States Prediction." IEEE Access 9, no. : 68289-68301.

Journal article
Published: 15 April 2021 in Big Data and Cognitive Computing

Wine is the second most popular alcoholic drink in the world behind beer. With the rise of e-commerce, recommendation systems have become a very important factor in the success of businesses. Recommendation systems analyze metadata to predict if, for example, a user will recommend a product. The metadata consist mostly of former reviews or web traffic from the same user. For this reason, we investigate what would happen if the information analyzed by a recommendation system was insufficient. In this paper, we explore the effects of a new wine ontology in a recommendation system. We created our own wine ontology and then made two sets of tests for each dataset. In both sets of tests, we applied four machine learning clustering algorithms that had the objective of predicting if a user recommends a wine product. The only difference between the two sets of tests is the attributes contained in the dataset. In the first set of tests, the datasets were influenced by the ontology, and in the second set, the only information about a wine product is its name. We compared the two test sets’ results and observed that there was a significant increase in classification accuracy when using a dataset with the proposed ontology. We demonstrate the general applicability of the methodology to other cases, applying our proposal to an Amazon product review dataset.

ACS Style

Luís Oliveira; Rodrigo Rocha Silva; Jorge Bernardino. Wine Ontology Influence in a Recommendation System. Big Data and Cognitive Computing 2021, 5, 16 .

AMA Style

Luís Oliveira, Rodrigo Rocha Silva, Jorge Bernardino. Wine Ontology Influence in a Recommendation System. Big Data and Cognitive Computing. 2021; 5 (2):16.

Chicago/Turabian Style

Luís Oliveira; Rodrigo Rocha Silva; Jorge Bernardino. 2021. "Wine Ontology Influence in a Recommendation System." Big Data and Cognitive Computing 5, no. 2: 16.

Journal article
Published: 02 February 2021 in IEEE Access

REST services are nowadays being used to support many businesses, with most major companies exposing their services via REST interfaces (e.g., Google, Amazon, Instagram, and Slack). In this type of scenario, heterogeneity is prevalent and software is sometimes exposed to unexpected conditions that may activate residual bugs, leading service operations to fail. Such failures may lead to financial or reputation losses (e.g., information disclosure). Although techniques and tools for assessing robustness have been thoroughly studied and applied to a large diversity of domains, REST services still lack practical approaches that specialize in robustness evaluation. In this paper, we present a tool (named bBOXRT) for performing robustness tests over REST services, solely based on minimal information expressed in their interface descriptions. We used bBOXRT to evaluate a heterogeneous set of 52 REST services that comprise 1,351 operations and fit in distinct categories (e.g., public, private, in-house). We were able to disclose several different types of robustness problems, including issues in services with strong reliability requirements and also a few security vulnerabilities. The results show that REST services are being deployed with software defects that harm service integration, and with security vulnerabilities that can be exploited by malicious users.

ACS Style

Nuno Laranjeiro; Joao Agnelo; Jorge Bernardino. A Black Box Tool for Robustness Testing of REST Services. IEEE Access 2021, 9, 24738 -24754.

AMA Style

Nuno Laranjeiro, Joao Agnelo, Jorge Bernardino. A Black Box Tool for Robustness Testing of REST Services. IEEE Access. 2021; 9 ():24738-24754.

Chicago/Turabian Style

Nuno Laranjeiro; Joao Agnelo; Jorge Bernardino. 2021. "A Black Box Tool for Robustness Testing of REST Services." IEEE Access 9, no. : 24738-24754.

Survey article
Published: 18 January 2021 in Computing

The edge computing (EC) paradigm brings computation and storage to the edge of the network where data is both consumed and produced. This variation is necessary to cope with the increasing number of network-connected devices and the growing volume of data transmitted, both of which the launch of the new 5G networks will expand. The aim is to avoid the high latency and traffic bottlenecks associated with the use of Cloud Computing in networks where several devices both access and generate high volumes of data. EC also improves network support for mobility, security, and privacy. This paper provides a discussion around EC and summarizes the definition and fundamental properties of the EC architectures proposed in the literature (Multi-access Edge Computing, Fog Computing, Cloudlet Computing, and Mobile Cloud Computing). Subsequently, this paper examines significant use cases for each EC architecture and discusses some promising future research directions.

ACS Style

Gonçalo Carvalho; Bruno Cabral; Vasco Pereira; Jorge Bernardino. Edge computing: current trends, research challenges and future directions. Computing 2021, 103, 993 -1023.

AMA Style

Gonçalo Carvalho, Bruno Cabral, Vasco Pereira, Jorge Bernardino. Edge computing: current trends, research challenges and future directions. Computing. 2021; 103 (5):993-1023.

Chicago/Turabian Style

Gonçalo Carvalho; Bruno Cabral; Vasco Pereira; Jorge Bernardino. 2021. "Edge computing: current trends, research challenges and future directions." Computing 103, no. 5: 993-1023.

Chapter
Published: 01 January 2021 in Advances in Human and Social Aspects of Technology

The amount of data in our world has been exploding, and big data represents a fundamental shift in business decision-making. Analyzing such so-called big data is today a keystone of competition and the success of organizations depends on fast and well-founded decisions taken by relevant people in their specific area of responsibility. Business analytics (BA) represents a merger between data strategy and a collection of decision support technologies and mechanisms for enterprises aimed at enabling knowledge workers such as executives, managers, and analysts to make better and faster decisions. The authors review the concept of BA as an open innovation strategy and address the importance of BA in revolutionizing knowledge towards economics and business sustainability. Using big data with open source business analytics systems generates the greatest opportunities to increase competitiveness and differentiation in organizations. In this chapter, the authors describe and analyze business intelligence and analytics (BI&A) and four popular open source systems – BIRT, Jaspersoft, Pentaho, and SpagoBI.

ACS Style

Pedro Caldeira Neves; Jorge Rodrigues Bernardino. The Role of Big Data and Business Analytics in Decision Making. Advances in Human and Social Aspects of Technology 2021, 226 -257.

AMA Style

Pedro Caldeira Neves, Jorge Rodrigues Bernardino. The Role of Big Data and Business Analytics in Decision Making. Advances in Human and Social Aspects of Technology. 2021; ():226-257.

Chicago/Turabian Style

Pedro Caldeira Neves; Jorge Rodrigues Bernardino. 2021. "The Role of Big Data and Business Analytics in Decision Making." Advances in Human and Social Aspects of Technology , no. : 226-257.

Journal article
Published: 27 November 2020 in Information

The ability to keep a record of geospatial information, knowing how it changed over time, is crucial for landscape analysis and territorial government. Land management is still a problem. Many governmental databases are incomplete, and there is a lack of reliable information. Good land management implies having a tool that can keep track of all the information available about a certain property and its changes over time. In this paper, we propose a land management tool where managers access all the information on a certain parcel of land: its boundaries, the land registration, a map which verifies the landcover, and the history of updates of territorial limits. With the proposed tool, it is possible to edit the information of any property, whether it is active or not; that is, the user can also edit properties that no longer exist today but to which information must be added for legal or other reasons. Keeping track of the revision history of property data is groundbreaking because it is not well developed in existing tools. We will look at Brazil as a use case, where land management is a critical problem.

ACS Style

Bernardo Carvalhinho; Rodrigo Silva; Jorge Bernardino. A Tool for Better Land Management. Information 2020, 11, 554 .

AMA Style

Bernardo Carvalhinho, Rodrigo Silva, Jorge Bernardino. A Tool for Better Land Management. Information. 2020; 11 (12):554.

Chicago/Turabian Style

Bernardo Carvalhinho; Rodrigo Silva; Jorge Bernardino. 2020. "A Tool for Better Land Management." Information 11, no. 12: 554.

Journal article
Published: 20 November 2020 in Future Internet

Cyber-Physical Systems (CPS) are a prominent component of the modern digital transformation, which combines the dynamics of the physical processes with those of software and networks. Critical infrastructures have built-in CPS, and assessing their risk is crucial to avoid significant losses, both economic and social. As CPS are increasingly attached to the world’s main industries, these systems’ criticality depends not only on software efficiency and availability but also on cyber-security awareness. Given this, and because Failure Mode and Effect Analysis (FMEA) is one of the most effective methods to assess critical infrastructures’ risk, in this paper, we show how this method performs in the analysis of CPS threats, also exposing the main drawbacks concerning CPS risk assessment. We first propose a risk prevention analysis of the Communications-Based Train Control (CBTC) system, which involves exploiting cyber vulnerabilities, and we introduce a novel approach to the failure modes’ Risk Priority Number (RPN) estimation. We also propose how to adapt the FMEA method to the requirements of CPS risk evaluation. We applied the proposed procedure to the CBTC system use case since it is a CPS with a substantial cyber component and network data transfer.

ACS Style

João Oliveira; Gonçalo Carvalho; Bruno Cabral; Jorge Bernardino. Failure Mode and Effect Analysis for Cyber-Physical Systems. Future Internet 2020, 12, 205 .

AMA Style

João Oliveira, Gonçalo Carvalho, Bruno Cabral, Jorge Bernardino. Failure Mode and Effect Analysis for Cyber-Physical Systems. Future Internet. 2020; 12 (11):205.

Chicago/Turabian Style

João Oliveira; Gonçalo Carvalho; Bruno Cabral; Jorge Bernardino. 2020. "Failure Mode and Effect Analysis for Cyber-Physical Systems." Future Internet 12, no. 11: 205.

Journal article
Published: 16 November 2020 in Entropy

The dependability of systems and networks has been the target of research for many years now. In the 1970s, what is now known as the top conference on dependability, the IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), emerged, gathering international researchers and sparking the interest of the scientific community. Although it started in niche systems, nowadays dependability is viewed as highly important in most computer systems. The goal of this work is to analyze the research published in the proceedings of well-established dependability conferences (i.e., DSN, International Symposium on Software Reliability Engineering (ISSRE), International Symposium on Reliable Distributed Systems (SRDS), European Dependable Computing Conference (EDCC), Latin-American Symposium on Dependable Computing (LADC), Pacific Rim International Symposium on Dependable Computing (PRDC)), using Natural Language Processing (NLP), namely the Latent Dirichlet Allocation (LDA) algorithm, to identify active, collapsing, ephemeral, and new lines of research in the dependability field. Results show a strong emphasis on terms like ‘security’, despite the general focus of the conferences on dependability, and new trends that are related to ‘machine learning’ and ‘blockchain’. We used the PRDC conference as a use case, which showed similarity with the overall set of conferences, although we also found specific terms, like ‘cyber-physical’, being popular at PRDC and not in the overall dataset.

ACS Style

Miriam Carnot; Jorge Bernardino; Nuno Laranjeiro; Hugo Gonçalo Oliveira. Applying Text Analytics for Studying Research Trends in Dependability. Entropy 2020, 22, 1303 .

AMA Style

Miriam Carnot, Jorge Bernardino, Nuno Laranjeiro, Hugo Gonçalo Oliveira. Applying Text Analytics for Studying Research Trends in Dependability. Entropy. 2020; 22 (11):1303.

Chicago/Turabian Style

Miriam Carnot; Jorge Bernardino; Nuno Laranjeiro; Hugo Gonçalo Oliveira. 2020. "Applying Text Analytics for Studying Research Trends in Dependability." Entropy 22, no. 11: 1303.

Conference paper
Published: 30 July 2020 in Transactions on Petri Nets and Other Models of Concurrency XV

Nowadays software architects face new challenges because the Internet has grown to a point where popular websites are accessed by hundreds of millions of people on a daily basis. One powerful machine is no longer economically viable or resilient enough to handle such outstanding traffic, and architectures have since been migrated to horizontal scaling. However, traditional databases, usually associated with a relational design, were not ready for horizontal scaling. Therefore, NoSQL databases have been proposed to fill the gap left by their predecessors. This new paradigm is proposed to better serve currently massive scaled-up Internet usage when consistency is no longer a top priority and a highly available service is preferable. Cassandra is a NoSQL database based on the Amazon Dynamo design. Dynamo-based databases are designed to run in a cluster while offering high availability and eventual consistency to clients when subject to network partition events. Therefore, the main goal of this work is to propose CBench-Dynamo, the first consistency benchmark for NoSQL databases. Our proposed benchmark correlates properties, such as performance, consistency, and availability, in different consistency configurations while subjecting the System Under Test to network partition events.

ACS Style

Miguel Diogo; Bruno Cabral; Jorge Bernardino. CBench-Dynamo: A Consistency Benchmark for NoSQL Database Systems. Transactions on Petri Nets and Other Models of Concurrency XV 2020, 84 -98.

AMA Style

Miguel Diogo, Bruno Cabral, Jorge Bernardino. CBench-Dynamo: A Consistency Benchmark for NoSQL Database Systems. Transactions on Petri Nets and Other Models of Concurrency XV. 2020; ():84-98.

Chicago/Turabian Style

Miguel Diogo; Bruno Cabral; Jorge Bernardino. 2020. "CBench-Dynamo: A Consistency Benchmark for NoSQL Database Systems." Transactions on Petri Nets and Other Models of Concurrency XV , no. : 84-98.

Journal article
Published: 29 July 2020 in Engineering Applications of Artificial Intelligence

Edge Computing (EC) is a recent architectural paradigm that brings computation close to end-users with the aim of reducing latency and bandwidth bottlenecks, which 5G technologies are committed to further reduce, while also achieving higher reliability. EC enables computation offloading from end devices to edge nodes. Deciding whether a task should be offloaded, or not, is not trivial. Moreover, deciding when and where to offload a task makes things even harder, and making inadequate or off-time decisions can undermine the EC approach. Recently, Artificial Intelligence (AI) techniques, such as Machine Learning (ML), have been used to help EC systems cope with this problem. AI promises accurate decisions, higher adaptability and portability, thus diminishing the cost of decision-making and the probability of error. In this work, we perform a literature review on computation offloading in EC systems with and without AI techniques. We analyze several AI techniques, especially ML-based, that display promising results, overcoming the shortcomings of current approaches for computation offloading coordination. We sort the ML algorithms into classes for better analysis and provide an in-depth analysis of the use of AI for offloading, in particular in the use case of offloading in Vehicular Edge Computing Networks, a technology that has gained relevance in recent years, enabling a vast number of solutions for computation and data offloading. We also discuss the main advantages and limitations of offloading, with and without the use of AI techniques.

ACS Style

Gonçalo Carvalho; Bruno Cabral; Vasco Pereira; Jorge Bernardino. Computation offloading in Edge Computing environments using Artificial Intelligence techniques. Engineering Applications of Artificial Intelligence 2020, 95, 103840 .

AMA Style

Gonçalo Carvalho, Bruno Cabral, Vasco Pereira, Jorge Bernardino. Computation offloading in Edge Computing environments using Artificial Intelligence techniques. Engineering Applications of Artificial Intelligence. 2020; 95 ():103840.

Chicago/Turabian Style

Gonçalo Carvalho; Bruno Cabral; Vasco Pereira; Jorge Bernardino. 2020. "Computation offloading in Edge Computing environments using Artificial Intelligence techniques." Engineering Applications of Artificial Intelligence 95, no. : 103840.

Journal article
Published: 01 April 2020 in International Journal of Information Security and Privacy

Databases are widely used by organizations to store business-critical information, which makes them one of the most attractive targets for security attacks. SQL Injection is the most common attack on webpages with dynamic content. To mitigate it, organizations use Intrusion Detection Systems (IDS) as part of the security infrastructure to detect this type of attack. However, the authors observe a gap between the comprehensive state-of-the-art in detecting SQL Injection attacks and the state-of-practice regarding existing tools capable of detecting such attacks. The majority of IDS implementations provide little or no protection against SQL Injection attacks, with exceptions like the tools Bro and ModSecurity. In this article, the authors compare these tools using the CSIC dataset in order to examine the state-of-practice in database protection from SQL Injection attacks, identifying the main characteristics and implementation details needed for IDSs to successfully detect such attacks. The experiments indicate that signature-based IDSs provide the greatest coverage against SQL Injection.

ACS Style

Rui Filipe Silva; Raul Barbosa; Jorge Bernardino. Intrusion Detection Systems for Mitigating SQL Injection Attacks. International Journal of Information Security and Privacy 2020, 14, 20 -40.

AMA Style

Rui Filipe Silva, Raul Barbosa, Jorge Bernardino. Intrusion Detection Systems for Mitigating SQL Injection Attacks. International Journal of Information Security and Privacy. 2020; 14 (2):20-40.

Chicago/Turabian Style

Rui Filipe Silva; Raul Barbosa; Jorge Bernardino. 2020. "Intrusion Detection Systems for Mitigating SQL Injection Attacks." International Journal of Information Security and Privacy 14, no. 2: 20-40.

Journal article
Published: 08 November 2019 in Algorithms

The growth of the Internet has increased the amount of data and information available to any person at any time. Recommendation Systems help users find the items that meet their preferences, among the large number of items available. Techniques such as collaborative filtering and content-based recommenders have played an important role in the implementation of recommendation systems. In the last few years, other techniques, such as ontology-based recommenders, have gained significance for providing better recommendations to the active user; however, building an ontology-based recommender is an expensive process, which requires considerable skills in Knowledge Engineering. This paper presents a new hybrid approach that combines the simplicity of collaborative filtering with the efficiency of the ontology-based recommenders. The experimental evaluation demonstrates that the proposed approach presents higher quality recommendations when compared to collaborative filtering. The main improvement is seen in the results for products which, in spite of belonging to categories unknown to the users, still match their preferences and are recommended.

ACS Style

Márcio Guia; Rodrigo Rocha Silva; Jorge Bernardino. A Hybrid Ontology-Based Recommendation System in e-Commerce. Algorithms 2019, 12, 239 .

AMA Style

Márcio Guia, Rodrigo Rocha Silva, Jorge Bernardino. A Hybrid Ontology-Based Recommendation System in e-Commerce. Algorithms. 2019; 12 (11):239.

Chicago/Turabian Style

Márcio Guia; Rodrigo Rocha Silva; Jorge Bernardino. 2019. "A Hybrid Ontology-Based Recommendation System in e-Commerce." Algorithms 12, no. 11: 239.

Journal article
Published: 23 October 2019 in Journal of Systems and Software

NoSQL databases are increasingly used for storing and managing data in business-critical Big Data systems. The presence of software defects (i.e., bugs) in these databases can bring severe consequences to the NoSQL services being offered, such as data loss or service unavailability. Thus, it is essential to understand the types of defects that frequently affect these databases, allowing developers to take action in an informed manner (e.g., redirect testing efforts). In this paper, we use Orthogonal Defect Classification (ODC) to classify a total of 4096 software defects from three of the most popular NoSQL databases: MongoDB, Cassandra, and HBase. The results show great similarity for the defects across the three different NoSQL systems and, at the same time, show the differences and heterogeneity regarding research carried out in other domains and types of applications, emphasizing the need for possessing such information. Our results expose the defect distributions in NoSQL databases, provide a foundation for selecting representative defects for NoSQL systems, and, overall, can be useful for developers for verifying and building more reliable NoSQL database systems.

ACS Style

João Agnelo; Nuno Laranjeiro; Jorge Bernardino. Using Orthogonal Defect Classification to characterize NoSQL database defects. Journal of Systems and Software 2019, 159, 110451 .

AMA Style

João Agnelo, Nuno Laranjeiro, Jorge Bernardino. Using Orthogonal Defect Classification to characterize NoSQL database defects. Journal of Systems and Software. 2019; 159 ():110451.

Chicago/Turabian Style

João Agnelo; Nuno Laranjeiro; Jorge Bernardino. 2019. "Using Orthogonal Defect Classification to characterize NoSQL database defects." Journal of Systems and Software 159, no. : 110451.

Journal article
Published: 12 September 2019 in Future Generation Computer Systems

Software systems are increasingly being used in business or mission critical scenarios, where the presence of certain types of software defects, i.e., bugs, may result in catastrophic consequences (e.g., financial losses or even the loss of human lives). To deploy systems on which we can rely, it is vital to understand the types of defects that tend to affect such systems. This allows developers to take proper action, such as adapting the development process or redirecting testing efforts (e.g., using a certain set of testing techniques, or focusing on certain parts of the system). Orthogonal Defect Classification (ODC) has emerged as a popular method for classifying software defects, but it requires one or more experts to categorize each defect in a quite complex and time-consuming process. In this paper, we evaluate the use of machine learning algorithms (k-Nearest Neighbors, Support Vector Machines, Naïve Bayes, Nearest Centroid, Random Forest and Recurrent Neural Networks) for automatic classification of software defects using ODC, based on unstructured textual bug reports. Experimental results reveal the difficulties in automatically classifying certain ODC attributes solely using reports, but also suggest that the overall classification accuracy may be improved in most of the cases, if larger datasets are used.

ACS Style

Fábio Lopes; João Agnelo; César A. Teixeira; Nuno Laranjeiro; Jorge Bernardino. Automating orthogonal defect classification using machine learning algorithms. Future Generation Computer Systems 2019, 102, 932 -947.

AMA Style

Fábio Lopes, João Agnelo, César A. Teixeira, Nuno Laranjeiro, Jorge Bernardino. Automating orthogonal defect classification using machine learning algorithms. Future Generation Computer Systems. 2019; 102 ():932-947.

Chicago/Turabian Style

Fábio Lopes; João Agnelo; César A. Teixeira; Nuno Laranjeiro; Jorge Bernardino. 2019. "Automating orthogonal defect classification using machine learning algorithms." Future Generation Computer Systems 102, no. : 932-947.

Conference paper
Published: 01 June 2019 in 2019 14th Iberian Conference on Information Systems and Technologies (CISTI)
ACS Style

Sofia Alves; João Costa; Jorge Bernardino. Information Extraction Applications for Clinical Trials: A Survey. 2019 14th Iberian Conference on Information Systems and Technologies (CISTI) 2019, 1 .

AMA Style

Sofia Alves, João Costa, Jorge Bernardino. Information Extraction Applications for Clinical Trials: A Survey. 2019 14th Iberian Conference on Information Systems and Technologies (CISTI). 2019; ():1.

Chicago/Turabian Style

Sofia Alves; João Costa; Jorge Bernardino. 2019. "Information Extraction Applications for Clinical Trials: A Survey." 2019 14th Iberian Conference on Information Systems and Technologies (CISTI) , no. : 1.

Conference paper
Published: 01 June 2019 in 2019 14th Iberian Conference on Information Systems and Technologies (CISTI)
ACS Style

Nuno Leite; Isabel Pedrosa; Jorge Bernardino. Open Source Business Intelligence on a SME: A Case Study using Pentaho. 2019 14th Iberian Conference on Information Systems and Technologies (CISTI) 2019, 1 .

AMA Style

Nuno Leite, Isabel Pedrosa, Jorge Bernardino. Open Source Business Intelligence on a SME: A Case Study using Pentaho. 2019 14th Iberian Conference on Information Systems and Technologies (CISTI). 2019; ():1.

Chicago/Turabian Style

Nuno Leite; Isabel Pedrosa; Jorge Bernardino. 2019. "Open Source Business Intelligence on a SME: A Case Study using Pentaho." 2019 14th Iberian Conference on Information Systems and Technologies (CISTI) , no. : 1.

Conference paper
Published: 01 June 2019 in 2019 14th Iberian Conference on Information Systems and Technologies (CISTI)
ACS Style

Tania Ferreira; Isabel Pedrosa; Jorge Bernardino. Integration of Business Intelligence with e-commerce. 2019 14th Iberian Conference on Information Systems and Technologies (CISTI) 2019, 1 .

AMA Style

Tania Ferreira, Isabel Pedrosa, Jorge Bernardino. Integration of Business Intelligence with e-commerce. 2019 14th Iberian Conference on Information Systems and Technologies (CISTI). 2019; ():1.

Chicago/Turabian Style

Tania Ferreira; Isabel Pedrosa; Jorge Bernardino. 2019. "Integration of Business Intelligence with e-commerce." 2019 14th Iberian Conference on Information Systems and Technologies (CISTI) , no. : 1.

Conference paper
Published: 01 June 2019 in 2019 14th Iberian Conference on Information Systems and Technologies (CISTI)
ACS Style

Hugo Brito; Alvaro Santos; Jorge Bernardino; Anabela Gomes. Learning analysis of mobile JavaScript frameworks. 2019 14th Iberian Conference on Information Systems and Technologies (CISTI) 2019, 1 .

AMA Style

Hugo Brito, Alvaro Santos, Jorge Bernardino, Anabela Gomes. Learning analysis of mobile JavaScript frameworks. 2019 14th Iberian Conference on Information Systems and Technologies (CISTI). 2019; ():1.

Chicago/Turabian Style

Hugo Brito; Alvaro Santos; Jorge Bernardino; Anabela Gomes. 2019. "Learning analysis of mobile JavaScript frameworks." 2019 14th Iberian Conference on Information Systems and Technologies (CISTI) , no. : 1.