Feature group partitioning: an approach for depression severity prediction with class balancing using machine learning algorithms

Open access. Language: English.

Shaha, Tumpa Rani; Begum, Momotaz; Uddin, Jia; Yélamos Torres, Vanessa; Alemany Iturriaga, Josep; Ashraf, Imran and Samad, Md. Abdus (2024) Feature group partitioning: an approach for depression severity prediction with class balancing using machine learning algorithms. BMC Medical Research Methodology, 24 (1). ISSN 1471-2288

Contact: vanessa.yelamos@funiber.org (Yélamos Torres), josep.alemany@uneatlantico.es (Alemany Iturriaga); no email address is specified for the other authors.

Text: s12874-024-02249-8.pdf (2MB)
Available under License Creative Commons Attribution.

Abstract

In contemporary society, depression has emerged as a prominent mental disorder that exhibits exponential growth and exerts a substantial influence on premature mortality. Although numerous studies have applied machine learning methods to forecast signs of depression, only a limited number have treated the severity level as a multiclass variable. Moreover, data are rarely distributed equally among all classes in real-world populations, so class imbalance across multiple classes is a substantial challenge in this domain, and this research emphasizes the significance of addressing it. We introduce a new approach, Feature Group Partitioning (FGP), in the data preprocessing phase, which effectively reduces the dimensionality of the features to a minimum. This study used synthetic oversampling techniques, specifically the Synthetic Minority Over-sampling Technique (SMOTE) and Adaptive Synthetic (ADASYN) sampling, for class balancing. The dataset was collected from university students by administering the Burn Depression Checklist (BDC). For the modelling phase, we implemented heterogeneous ensemble learning (stacking), homogeneous ensemble learning (bagging), and five distinct supervised machine learning algorithms. The issue of overfitting was mitigated by evaluating accuracy on the training, validation, and testing datasets. To assess the effectiveness of the prediction models, balanced accuracy, sensitivity, specificity, precision, and F1-score were used. A comprehensive analysis demonstrates the difference between the Conventional Depression Screening (CDS) and FGP approaches. The results show that the stacking classifier for FGP with SMOTE yields the highest balanced accuracy, 92.81%. The empirical evidence demonstrates that the FGP approach, when combined with SMOTE, is able to produce better performance in predicting the severity of depression. Most importantly, the optimization of the training time of the FGP approach for all of the classifiers is a significant achievement of this research.
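The two ideas the abstract leans on most, synthetic oversampling and balanced accuracy, can be illustrated with a minimal sketch. This is not the authors' pipeline. The function names, the toy data, and the class labels are hypothetical, and the oversampler is a deliberate simplification: real SMOTE interpolates toward one of a sample's k nearest same-class neighbours, whereas this sketch picks any same-class partner.

```python
import random
from collections import Counter


def smote_like_oversample(X, y, seed=0):
    """Grow every minority class to the majority-class count.

    Each synthetic point lies on the segment between two samples of the
    same class. Simplification (assumption): the partner is drawn at
    random from the class, not from the k nearest neighbours as in SMOTE.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            a = rng.choice(pool)
            b = rng.choice(pool)
            t = rng.random()  # interpolation weight in [0, 1)
            X_out.append([ai + t * (bi - ai) for ai, bi in zip(a, b)])
            y_out.append(label)
    return X_out, y_out


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    recalls = []
    for label in sorted(set(y_true)):
        idx = [i for i, lab in enumerate(y_true) if lab == label]
        hits = sum(1 for i in idx if y_pred[i] == label)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


if __name__ == "__main__":
    # Hypothetical toy data: three severity classes with 4, 2, and 1 samples.
    X = [[0.0, 0.0], [1.0, 1.0], [0.2, 0.1], [0.9, 0.8],
         [5.0, 5.0], [6.0, 6.0], [9.0, 9.0]]
    y = ["none", "none", "none", "none", "mild", "mild", "severe"]
    X_bal, y_bal = smote_like_oversample(X, y, seed=1)
    print(Counter(y_bal))
    # A classifier that mostly predicts the majority class gets 5 of 7
    # right, but its per-class recalls are 1.0, 0.5, and 0.0.
    print(balanced_accuracy([0, 0, 0, 0, 1, 1, 2], [0, 0, 0, 0, 1, 0, 0]))
```

In the toy example, plain accuracy would be 5/7 while balanced accuracy is 0.5, which is why balanced accuracy is a more informative headline metric when severity classes are unevenly represented.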

Document Type: Article
Keywords: Machine learning; Depression prediction; Class balancing; Oversampling; SMOTE; ADASYN; Stratified cross validation; Burn depression checklist; Feature group partitioning
Subject classification: Subjects > Engineering
Divisions: Universidad Europea del Atlántico > Research > Scientific Production
Universidad Internacional Iberoamericana México > Research > Scientific Production
Universidad Internacional Iberoamericana Puerto Rico > Research > Scientific Production
Universidad Internacional do Cuanza > Research > Articles and books
Universidad de La Romana > Research > Scientific Production
Deposited: 17 Jun 2024 23:30
Last Modified: 17 Jun 2024 23:30
URI: https://repositorio.unic.co.ao/id/eprint/12751

English, closed access

Single-cell omics for nutrition research: an emerging opportunity for human-centric investigations

Understanding how dietary compounds affect human health is challenged by their molecular complexity and cell-type–specific effects. Conventional multi-cell type (bulk) analyses obscure cellular heterogeneity, while animal and standard in vitro models often fail to replicate human physiology. Single-cell omics technologies—such as single-cell RNA sequencing, as well as single-cell–resolved proteomic and metabolomic approaches—enable high-resolution investigation of nutrient–cell interactions and reveal mechanisms at a single-cell resolution. When combined with advanced human-derived in vitro systems like organoids and organ-on-chip platforms, they support mechanistic studies in physiologically relevant contexts. This review outlines emerging applications of single-cell omics in nutrition research, emphasizing their potential to uncover cell-specific dietary responses, identify nutrient-sensitive pathways, and capture interindividual variability. It also discusses key challenges—including technical limitations, model selection, and institutional biases—and identifies strategic directions to facilitate broader adoption in the field. Collectively, single-cell omics offer a transformative framework to advance human-centric nutrition research.

Scientific Production

Manuela Cassotta (manucassotta@gmail.com), Yasmany Armas Diaz, Danila Cianciosi, Bei Yang, Zexiu Qi, Ge Chen, Santos Gracia Villar (santos.gracia@uneatlantico.es), Luis Alonso Dzul López (luis.dzul@uneatlantico.es), Giuseppe Grosso, José L. Quiles, Jianbo Xiao, Maurizio Battino (maurizio.battino@uneatlantico.es), Francesca Giampieri (francesca.giampieri@uneatlantico.es)


Text: /17862/1/sensors-25-06419.pdf
English, open access

Edge-Based Autonomous Fire and Smoke Detection Using MobileNetV2

Forest fires pose significant threats to ecosystems, human life, and the global climate, necessitating rapid and reliable detection systems. Traditional fire detection approaches, including sensor networks, satellite monitoring, and centralized image analysis, often suffer from delayed response, high false positives, and limited deployment in remote areas. Recent deep learning-based methods offer high classification accuracy but are typically computationally intensive and unsuitable for low-power, real-time edge devices. This study presents an autonomous, edge-based forest fire and smoke detection system using a lightweight MobileNetV2 convolutional neural network. The model is trained on a balanced dataset of fire, smoke, and non-fire images and optimized for deployment on resource-constrained edge devices. The system performs near real-time inference, achieving a test accuracy of 97.98% with an average end-to-end prediction latency of 0.77 s per frame (approximately 1.3 FPS) on the Raspberry Pi 5 edge device. Predictions include the class label, confidence score, and timestamp, all generated locally without reliance on cloud connectivity, thereby enhancing security and robustness against potential cyber threats. Experimental results demonstrate that the proposed solution maintains high predictive performance comparable to state-of-the-art methods while providing efficient, offline operation suitable for real-world environmental monitoring and early wildfire mitigation. This approach enables cost-effective, scalable deployment in remote forest regions, combining accuracy, speed, and autonomous edge processing for timely fire and smoke detection.

Scientific Production

Dilshod Sharobiddinov, Hafeez Ur Rehman Siddiqui, Adil Ali Saleem, Gerardo Méndez Mezquita, Debora L. Ramírez-Vargas (debora.ramirez@unini.edu.mx), Isabel de la Torre Díez


Text: /17863/1/v16p4316.pdf
English, open access

Divulging Patterns: An Analytical Review for Machine Learning Methodologies for Breast Cancer Detection

Breast cancer is a lethal carcinoma impacting a considerable number of women across the globe. While preventive measures are limited, early detection remains the most effective strategy. Accurate classification of breast tumors into benign and malignant categories is important because it may help physicians diagnose the disease faster. This survey investigates emerging trends and approaches in machine learning (ML) for the diagnosis of breast cancer, focusing on classification techniques based on both segmentation and feature selection. Datasets such as the Wisconsin Diagnostic Breast Cancer Dataset (WDBC), Wisconsin Breast Cancer Dataset Original (WBCD), Wisconsin Prognostic Breast Cancer Dataset (WPBC), BreakHis, and others are evaluated to show their influence on the performance of diagnostic tools and on the accuracy of models such as support vector machines, Convolutional Neural Networks (CNNs), and ensemble approaches. The main shortcomings and research gaps, such as dataset bias, limited generalizability, and interpretation challenges, are highlighted. This research emphasizes the importance of hybrid methodologies, cross-dataset validation, and the engineering of explainable AI to narrow these gaps and enhance the overall clinical acceptance of ML-based detection tools.

Scientific Production

Alveena Saleem, Muhammad Umair, Muhammad Tahir Naseem, Muhammad Zubair, Silvia Aparicio Obregón (silvia.aparicio@uneatlantico.es), Rubén Calderón Iglesias (ruben.calderon@uneatlantico.es), Shoaib Hassan, Imran Ashraf


Text: /17849/1/1-s2.0-S2590005625001043-main.pdf
English, open access

Ultra Wideband radar-based gait analysis for gender classification using artificial intelligence

Gender classification plays a vital role in various applications, particularly in security and healthcare. While several biometric methods such as facial recognition, voice analysis, activity monitoring, and gait recognition are commonly used, their accuracy and reliability often suffer due to challenges like body part occlusion, high computational costs, and recognition errors. This study investigates gender classification using gait data captured by Ultra-Wideband radar, offering a non-intrusive and occlusion-resilient alternative to traditional biometric methods. A dataset comprising 163 participants was collected, and the radar signals underwent preprocessing, including clutter suppression and peak detection, to isolate meaningful gait cycles. Spectral features extracted from these cycles were transformed using a novel integration of Feedforward Artificial Neural Networks and Random Forests, enhancing discriminative power. Among the models evaluated, the Random Forest classifier demonstrated superior performance, achieving 94.68% accuracy and a cross-validation score of 0.93. The study highlights the effectiveness of Ultra-Wideband radar and the proposed transformation framework in advancing robust gender classification.

Scientific Production

Adil Ali Saleem, Hafeez Ur Rehman Siddiqui, Muhammad Amjad Raza, Sandra Dudley, Julio César Martínez Espinosa (ulio.martinez@unini.edu.mx), Luis Alonso Dzul López (luis.dzul@uneatlantico.es), Isabel de la Torre Díez


Text: /17856/1/fpubh-13-1654645.pdf
English, open access

Children's and adolescents' lifestyle factors associated with physical activity in five Mediterranean countries: the DELICIOUS project

Background: Physical activity in children and adolescents is one of the most important lifestyle factors determining current and future health.
Aim: The aim of the study is to assess the lifestyle and dietary factors linked to physical activity in younger populations across five countries in the Mediterranean region.
Design: A total of 2,011 parents of children and adolescents (age range 6–17 years) participating in a preliminary survey of the DELICIOUS project were investigated to determine the children's adequate physical activity level (identified using the short form of the International Physical Activity Questionnaire) as well as diet quality parameters [measured as the Youth-Healthy Eating Index (Y-HEI)] and eating and lifestyle factors (i.e., meal habits, sleep duration, screen time, etc.). Logistic regression analyses were performed to assess the odds ratios (ORs) and 95% confidence intervals (CIs) for the associations between variables of interest.
Results: Younger children of younger, currently working parents had higher rates and a higher probability of adequate physical activity. Multivariate analysis showed that children and adolescents who had breakfast (OR = 1.88, 95% CI: 1.38, 2.56) and often ate with their family (OR = 1.80, 95% CI: 0.90, 3.61) were more likely to have an adequate level of physical activity. Children and adolescents who reported a sleep duration (8–10 h) closest to the recommended one were significantly more likely to achieve adequate levels of physical activity (OR = 1.88, 95% CI: 1.38, 2.56). Conversely, those with more than 4 h of daily screen time were less likely to engage in adequate physical activity (OR = 0.77, 95% CI: 0.54, 1.10). Furthermore, children and adolescents in the highest tertile of Y-HEI scores showed a 60% greater likelihood of engaging in adequate physical activity (OR = 1.60, 95% CI: 1.27, 2.01).
Conclusion: These results emphasize the importance of promoting healthy diet and lifestyle habits, including structured and high-quality shared meals, sufficient sleep, and screen time moderation, as key strategies to support active behaviors in younger populations. Future interventions should focus on reinforcing these behaviors through parental guidance and community-based initiatives to foster lifelong healthy habits.
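For readers less familiar with how logistic regression output becomes the ORs and CIs quoted above, here is a minimal sketch. It assumes a Wald-type interval (z = 1.96), which the abstract does not confirm, and the function names are hypothetical. An OR of 1.60 corresponds to (1.60 - 1) x 100 = 60% higher odds, which is the sense of the "60% greater likelihood" reported for the highest Y-HEI tertile.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Turn a logistic-regression coefficient and its standard error
    into (OR, lower, upper), assuming a Wald interval on the log scale."""
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))


def coefficient_from_ci(or_value, lower, upper, z=1.96):
    """Recover (beta, se) from a reported OR and its confidence interval,
    under the same Wald assumption."""
    beta = math.log(or_value)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


if __name__ == "__main__":
    # Reported values for the highest Y-HEI tertile: OR = 1.60 (1.27, 2.01).
    beta, se = coefficient_from_ci(1.60, 1.27, 2.01)
    print(beta, se)
    # Round trip: close to the reported interval, up to rounding.
    print(odds_ratio_with_ci(beta, se))
```

An interval that contains 1, such as the one reported for screen time (0.54, 1.10), is compatible with no association at the 5% level.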

Scientific Production

Alice Rosi, Francesca Scazzina, Maria Antonieta Touriz Bonifaz, Francesca Giampieri (francesca.giampieri@uneatlantico.es), Achraf Ammar, Khaled Trabelsi, Osama Abdelkarim, Mohamed Aly, Evelyn Frias-Toral, Juancho Pons, Laura Vázquez-Araújo, Josep Alemany Iturriaga (josep.alemany@uneatlantico.es), Lorenzo Monasta, Nunzia Decembrino, Ana Mata, Adrián Chacón, Pablo Busó, Giuseppe Grosso
