Adaptive neighborhood rough set model for hybrid data processing: a case study on Parkinson’s disease behavioral analysis

Open access. Language: English.
Raza, Imran; Jamal, Muhammad Hasan; Qureshi, Rizwan; Shahid, Abdul Karim; Rojas Vistorte, Angel Olider (angel.rojas@uneatlantico.es); Samad, Md Abdus; Ashraf, Imran (2024) Adaptive neighborhood rough set model for hybrid data processing: a case study on Parkinson’s disease behavioral analysis. Scientific Reports, 14 (1). ISSN 2045-2322

Text: s41598-024-57547-4.pdf (1MB)
Available under License Creative Commons Attribution.

Abstract

Extracting knowledge from hybrid data, comprising both categorical and numerical data, poses significant challenges due to the inherent difficulty in preserving information and practical meanings during the conversion process. To address this challenge, hybrid data processing methods, combining complementary rough sets, have emerged as a promising approach for handling uncertainty. However, selecting an appropriate model and effectively utilizing it in data mining requires a thorough qualitative and quantitative comparison of existing hybrid data processing models. This research aims to contribute to the analysis of hybrid data processing models based on neighborhood rough sets by investigating the inherent relationships among these models. We propose a generic neighborhood rough set-based hybrid model specifically designed for processing hybrid data, thereby enhancing the efficacy of the data mining process without resorting to discretization and avoiding information loss or practical meaning degradation in datasets. The proposed scheme dynamically adapts the threshold value for the neighborhood approximation space according to the characteristics of the given datasets, ensuring optimal performance without sacrificing accuracy. To evaluate the effectiveness of the proposed scheme, we develop a testbed tailored for Parkinson’s patients, a domain where hybrid data processing is particularly relevant. The experimental results demonstrate that the proposed scheme consistently outperforms existing schemes in adaptively handling both numerical and categorical data, achieving an impressive accuracy of 95% on the Parkinson’s dataset. Overall, this research contributes to advancing hybrid data processing techniques by providing a robust and adaptive solution that addresses the challenges associated with handling hybrid data, particularly in the context of Parkinson’s disease analysis.
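
The abstract does not spell out the model, so the following is only a minimal sketch of a generic neighborhood rough set on hybrid data, not the authors' scheme. It assumes that two samples are neighbors when every numerical attribute differs by at most a threshold `delta` and every categorical attribute is equal. The rule `delta = std / lam` used to adapt the threshold to the data is a common heuristic assumed here purely for illustration; the function names, the `lam` parameter, and the toy records are all hypothetical.

```python
from statistics import pstdev

def adaptive_deltas(rows, numeric_cols, lam=2.0):
    """Assumed heuristic: delta_j = population std of column j divided by lam."""
    return {j: pstdev([r[j] for r in rows]) / lam for j in numeric_cols}

def neighborhood(i, rows, numeric_cols, deltas):
    """Indices of samples that are neighbors of sample i on every attribute."""
    result = []
    for k, other in enumerate(rows):
        close = True
        for j, value in enumerate(rows[i]):
            if j in numeric_cols:
                # Numerical attribute: compare within the threshold.
                close = abs(value - other[j]) <= deltas[j]
            else:
                # Categorical attribute: require an exact match, no discretization.
                close = value == other[j]
            if not close:
                break
        if close:
            result.append(k)
    return result

def approximations(rows, labels, target, numeric_cols, deltas):
    """Lower and upper approximations of the class `target`."""
    target_set = {k for k, y in enumerate(labels) if y == target}
    lower, upper = set(), set()
    for i in range(len(rows)):
        nb = set(neighborhood(i, rows, numeric_cols, deltas))
        if nb <= target_set:   # neighborhood lies entirely inside the class
            lower.add(i)
        if nb & target_set:    # neighborhood touches the class
            upper.add(i)
    return lower, upper

# Made-up records: (numeric score, categorical attribute).
rows = [(0.10, "M"), (0.15, "M"), (0.80, "M"), (0.85, "F"), (0.20, "F")]
labels = ["healthy", "pd", "pd", "pd", "healthy"]
numeric_cols = {0}
deltas = adaptive_deltas(rows, numeric_cols)
lower, upper = approximations(rows, labels, "pd", numeric_cols, deltas)
print(sorted(lower), sorted(upper))
```

Samples in `upper` but not in `lower` form the boundary region, where the neighborhood mixes classes; a smaller `delta` shrinks neighborhoods and therefore tends to shrink that boundary.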

Item Type: Article
Uncontrolled Keywords: Computational biology and bioinformatics; Machine learning
Subjects: Subjects > Engineering
Divisions: Europe University of Atlantic > Research > Scientific Production
Ibero-american International University > Research > Scientific Production
Universidad Internacional do Cuanza > Research > Articles and books
Date Deposited: 11 Apr 2024 23:30
Last Modified: 11 Apr 2024 23:30
URI: https://repositorio.unic.co.ao/id/eprint/11642


Open access. Language: English.

Influence of E-learning training on the acquisition of competences in basketball coaches in Cantabria

The main aim of this study was to analyse the influence of e-learning training on the acquisition of competences in basketball coaches in Cantabria. The current landscape of basketball coach training shows an increasing demand for innovative training models and emerging pedagogies, including e-learning-based methodologies. The study sample consisted of fifty students from these courses, all above 16 years of age (36 males, 14 females). Among them, 16% resided outside the autonomous community of Cantabria, 10% resided more than 50 km from the city of Santander, 36% between 10 and 50 km, 14% less than 10 km, and 24% resided within Santander city. Data were collected through a Google Forms survey distributed by the Cantabrian Basketball Federation to training course students. Participation was voluntary and anonymous. The survey, consisting of 56 questions, was validated by two sports and health doctors and two senior basketball coaches. The collected data were processed and analysed using Microsoft® Excel version 16.74, and the results were expressed in percentages. The analysis revealed that 24.60% of the students trained through the e-learning methodology considered themselves fully qualified as basketball coaches, contrasting with 10.98% of those trained via traditional face-to-face methodology. The results of the study provide insights into important characteristics that can be adjusted and improved within the investigated educational process. Moreover, the study concludes that e-learning training effectively qualifies basketball coaches in Cantabria.

Scientific Production

Josep Alemany Iturriaga (josep.alemany@uneatlantico.es), Álvaro Velarde-Sotres (alvaro.velarde@uneatlantico.es), Javier Jorge, Kamil Giglio

Open access. Language: English.

Ultra-Wide Band Radar Empowered Driver Drowsiness Detection with Convolutional Spatial Feature Engineering and Artificial Intelligence

Driving while drowsy poses significant risks, including reduced cognitive function and the potential for accidents, which can lead to severe consequences such as trauma, economic losses, injuries, or death. Artificial intelligence can enable effective detection of driver drowsiness, helping to prevent accidents and enhance driver performance. This research addresses the need for real-time, accurate drowsiness detection to mitigate the impact of fatigue-related accidents. Ultra-wideband radar data collected over five minutes were segmented into one-minute chunks and transformed into grayscale images. Spatial features were extracted from the images using a two-dimensional Convolutional Neural Network and then used to train and test multiple machine learning classifiers. The ensemble classifier RF-XGB-SVM, which combines Random Forest, XGBoost, and Support Vector Machine using a hard voting criterion, achieved an accuracy of 96.6%. The proposed approach was also validated with k-fold cross-validation, yielding a score of 97% with a standard deviation of 0.018. Augmenting the dataset using Generative Adversarial Networks improved the accuracy of all models; among them, the RF-XGB-SVM model performed best, with an accuracy of 99.58%.
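
Under a hard voting criterion, each base classifier casts one vote per sample and the majority label wins. The sketch below illustrates only that combination step using the standard library; the per-model predictions are made up, the function name is hypothetical, and it does not reproduce the paper's models or features.

```python
from collections import Counter

def hard_vote(predictions_per_model):
    """Majority vote across models for each sample.

    predictions_per_model: list of equal-length label lists, one per model.
    With an odd number of models and two classes, ties cannot occur.
    """
    voted = []
    for sample_votes in zip(*predictions_per_model):
        counts = Counter(sample_votes)
        voted.append(counts.most_common(1)[0][0])
    return voted

# Hypothetical predictions from three base classifiers on four samples.
rf_pred = ["drowsy", "awake", "awake", "drowsy"]
xgb_pred = ["drowsy", "drowsy", "awake", "awake"]
svm_pred = ["awake", "drowsy", "awake", "drowsy"]

ensemble_pred = hard_vote([rf_pred, xgb_pred, svm_pred])
print(ensemble_pred)
```

Hard voting uses only the predicted labels, so it works even when the base models do not produce comparable probability scores.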

Scientific Production

Hafeez Ur Rehman Siddiqui, Ambreen Akmal, Muhammad Iqbal, Adil Ali Saleem, Muhammad Amjad Raza, Kainat Zafar, Aqsa Zaib, Sandra Dudley, Jon Arambarri (jon.arambarri@uneatlantico.es), Ángel Gabriel Kuc Castilla, Furqan Rustam

Open access. Language: English.

From by-products to new application opportunities: the enhancement of the leaves deriving from the fruit plants for new potential healthy products

In the last decades, the world population and the demand for every kind of product have grown exponentially. The rate of production needed to satisfy this demand has become unsustainable, and the concept of the linear economy, introduced after the Industrial Revolution, has been replaced by a new economic approach, the circular economy. In this new economic model, the concept of “the end of life” is replaced by the concept of restoration, giving a new life to many industrial wastes. Leaves are a by-product of several agricultural cultivations. In recent years, scientific interest in leaf biochemical composition has grown, and studies report that plant leaves may be considered an alternative source of bioactive substances. The main bioactive compounds of plant leaves are similar to those in fruits, i.e., phenolic acids and esters, flavonols, anthocyanins, and procyanidins. Bioactive compounds can positively influence human health; indeed, leaves were used by our ancestors as a natural remedy for various pathological conditions. Therefore, leaves can be exploited to manufacture many products in the food industry (e.g., incorporated in food formulations as natural antioxidants, or used to create edible coatings or films for food packaging) and in the cosmetic and pharmaceutical industries (e.g., as promising ingredients in anti-aging cosmetics such as oils, serums, dermatological creams, bath gels, and other products). This review focuses on the main bioactive compounds of leaves and their beneficial health effects, and surveys their applications to date, in order to add value to leaves as a harvesting by-product and to highlight their possible reuse in new potential healthy products.

Scientific Production

Lucia Regolo, Francesca Giampieri (francesca.giampieri@uneatlantico.es), Maurizio Battino (maurizio.battino@uneatlantico.es), Yasmany Armas Diaz, Bruno Mezzetti, Maria Elexpuru Zabaleta (maria.elexpuru@uneatlantico.es), Cristina Mazas Pérez-Oleaga (cristina.mazas@uneatlantico.es), Kilian Tutusaus (kilian.tutusaus@uneatlantico.es), Luca Mazzoni

Open access. Language: English.

Efficient deep learning-based approach for malaria detection using red blood cell smears

Malaria is an extremely malignant disease caused by the bites of infected female mosquitoes. The disease is infectious not only among humans but among animals as well. Malaria causes mild symptoms such as fever, headache, sweating, vomiting, and muscle discomfort; severe symptoms include coma, seizures, and kidney failure. The timely identification of malaria parasites is a challenging and chaotic endeavor for health staff: an expert technician must examine blood smears of infected red blood cells through a microscope. The conventional methods for identifying malaria are not efficient. Machine learning approaches are effective for simple classification challenges but not for complex tasks, and they require rigorous feature engineering to train the model and detect patterns in the features. Deep learning, on the other hand, works well with complex tasks and automatically extracts low- and high-level features from the images to detect disease. This paper proposes EfficientNet, a deep learning-based approach, for detecting malaria from red blood cell images. Experiments are carried out and performance is compared with pre-trained deep learning models. In addition, k-fold cross-validation is used to substantiate the results of the proposed approach. Experiments show that the proposed approach is 97.57% accurate in detecting malaria from red blood cell images and can be of practical benefit to medical healthcare staff.
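
In k-fold cross-validation, the data are split into k folds, and each fold serves once as the held-out test set while the remaining folds are used for training. The sketch below shows only the index bookkeeping, assuming contiguous, unshuffled folds for simplicity; the abstract does not say how the folds were formed, and the function name is hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; fold sizes differ by at most one."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds each take one additional sample.
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

folds = list(k_fold_indices(10, 3))
for train_idx, test_idx in folds:
    print(len(train_idx), test_idx)
```

In practice the sample order is usually shuffled, and often stratified by class, before splitting; the per-fold scores are then averaged to give the reported cross-validation result.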

Scientific Production

Muhammad Mujahid, Furqan Rustam, Rahman Shafique, Elizabeth Caro Montero (elizabeth.caro@uneatlantico.es), Eduardo René Silva Alvarado (eduardo.silva@funiber.org), Isabel de la Torre Diez, Imran Ashraf

Open access. Language: English.

Feature group partitioning: an approach for depression severity prediction with class balancing using machine learning algorithms

In contemporary society, depression has emerged as a prominent mental disorder that exhibits exponential growth and exerts a substantial influence on premature mortality. Although numerous studies have applied machine learning methods to forecast signs of depression, only a limited number have treated the severity level as a multiclass variable. Moreover, an equal distribution of data across all classes rarely occurs in practice, so the inevitable class imbalance for multiple variables is considered a substantial challenge in this domain. This research therefore emphasizes the significance of addressing class imbalance in the context of multiple classes. We introduce a new approach, Feature group partitioning (FGP), in the data preprocessing phase, which effectively reduces the dimensionality of features to a minimum. This study utilized synthetic oversampling techniques, specifically Synthetic Minority Over-sampling Technique (SMOTE) and Adaptive Synthetic (ADASYN), for class balancing. The dataset used in this research was collected from university students by administering the Burn Depression Checklist (BDC). For methodological modifications, we implemented heterogeneous ensemble learning stacking, homogeneous ensemble bagging, and five distinct supervised machine learning algorithms. The issue of overfitting was mitigated by evaluating the accuracy on the training, validation, and testing datasets. To justify the effectiveness of the prediction models, balanced accuracy, sensitivity, specificity, precision, and F1-score indices are used. Overall, comprehensive analysis demonstrates the distinction between the Conventional Depression Screening (CDS) and FGP approaches. In summary, the results show that the stacking classifier for FGP with the SMOTE approach yields the highest balanced accuracy, at 92.81%.
The empirical evidence demonstrates that the FGP approach, when combined with SMOTE, is able to produce better performance in predicting the severity of depression. Most importantly, the optimization of the training time of the FGP approach for all of the classifiers is a significant achievement of this research.
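
Balanced accuracy is commonly defined as the mean of the per-class recall values, which keeps a majority class from dominating the score on imbalanced data. The sketch below assumes that definition, which the abstract does not state explicitly, and uses made-up severity labels; the function name is hypothetical.

```python
from collections import defaultdict

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)

# Made-up, imbalanced three-class example.
y_true = ["none"] * 6 + ["mild"] * 2 + ["severe"] * 2
y_pred = ["none"] * 6 + ["mild", "none"] + ["none", "none"]

plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
balanced = balanced_accuracy(y_true, y_pred)
print(plain, balanced)
```

Here plain accuracy rewards getting the majority class right, while balanced accuracy drops because the minority classes are mostly missed; this gap is what class balancing techniques such as SMOTE aim to close.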

Scientific Production

Tumpa Rani Shaha, Momotaz Begum, Jia Uddin, Vanessa Yélamos Torres (vanessa.yelamos@funiber.org), Josep Alemany Iturriaga (josep.alemany@uneatlantico.es), Imran Ashraf, Md. Abdus Samad