Depression Intensity Classification from Tweets Using FastText Based Weighted Soft Voting Ensemble

Open access. Language: English.

Rizwan, Muhammad; Mushtaq, Muhammad Faheem; Rafiq, Maryam; Mehmood, Arif; Diez, Isabel de la Torre; Gracia Villar, Mónica; Garay, Helena and Ashraf, Imran (2024) Depression Intensity Classification from Tweets Using FastText Based Weighted Soft Voting Ensemble. Computers, Materials & Continua, 78 (2). pp. 2047-2066. ISSN 1546-2226

Contact: monica.gracia@uneatlantico.es, helena.garay@uneatlantico.es

Text: TSP_CMC_37347.pdf (861 kB)
Available under License Creative Commons Attribution.

Abstract

Predicting depression intensity from microblogs and social media posts has numerous benefits and applications, including the early prediction of psychological disorders and stress in individuals or the general public. Two challenges stand out: existing studies perform only binary classification of depression rather than predicting its intensity, and noisy data makes it difficult to identify true depression in social media text. This study begins by collecting relevant tweets and generating a corpus of 210,000 public tweets using Twitter's public application programming interfaces (APIs). To reduce noise in the corpus, a list of relevant hashtags is created and used to retain only depression-related tweets. Furthermore, an algorithm is developed to annotate the data into three depression classes, ‘Mild,’ ‘Moderate,’ and ‘Severe,’ based on the International Classification of Diseases-10 (ICD-10) depression diagnostic criteria. Different baseline classifiers are applied to the annotated dataset to get a preliminary idea of classification performance on the corpus. Next, a FastText-based model is applied and tuned using different preprocessing techniques and hyperparameter settings; the tuned model raises depression classification performance to an 84% F1 score and 90% accuracy, a significant improvement over the baselines. Finally, a FastText-based weighted soft voting ensemble (WSVE) is proposed to boost performance by combining FastText with several other classifiers and weighting each model according to its individual performance. The proposed WSVE outperformed all baselines as well as FastText alone, with an F1 of 89% (5 percentage points higher than FastText alone) and an accuracy of 93% (3 percentage points higher). The proposed model better captures the contextual features of the relatively small sample class and aids the early detection of depression intensity from tweets.
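The weighted soft voting idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the weights are derived or which classifiers join FastText, so everything below is an assumption made for illustration: the model names, the validation F1 scores, the per-tweet probabilities, and the choice to make each weight proportional to a model's validation F1.

```python
from typing import Dict, List, Sequence

# Class order follows the three labels named in the abstract.
CLASSES = ["Mild", "Moderate", "Severe"]


def normalize_weights(scores: Dict[str, float]) -> Dict[str, float]:
    """Turn per-model validation scores into weights that sum to 1.

    Assumption: weight is proportional to the score; the paper may
    use a different rule.
    """
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("at least one model needs a positive score")
    return {name: score / total for name, score in scores.items()}


def weighted_soft_vote(
    probabilities: Dict[str, Sequence[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Return the weighted average of every model's class probabilities."""
    combined = [0.0] * len(CLASSES)
    for name, probs in probabilities.items():
        if len(probs) != len(CLASSES):
            raise ValueError(f"{name} must return {len(CLASSES)} probabilities")
        for k, p in enumerate(probs):
            combined[k] += weights[name] * p
    return combined


def predict(
    probabilities: Dict[str, Sequence[float]],
    weights: Dict[str, float],
) -> str:
    """Pick the class with the highest combined probability."""
    combined = weighted_soft_vote(probabilities, weights)
    best = max(range(len(CLASSES)), key=lambda k: combined[k])
    return CLASSES[best]


# Hypothetical validation F1 scores (not taken from the paper).
val_f1 = {"fasttext": 0.8, "logreg": 0.4, "svm": 0.4}
weights = normalize_weights(val_f1)

# Hypothetical class probabilities for one tweet, ordered as CLASSES.
tweet_probs = {
    "fasttext": [0.1, 0.2, 0.7],
    "logreg": [0.6, 0.2, 0.2],
    "svm": [0.5, 0.3, 0.2],
}
label = predict(tweet_probs, weights)
```

With these made-up numbers the stronger model's opinion dominates: the weighted vote selects 'Severe', whereas giving all three models equal weight would select 'Mild'. That difference is the motivation for weighting models by their individual performance instead of averaging them uniformly.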

Document Type: Article
Keywords: Depression classification; deep learning; FastText; machine learning
Subject Classification: Subjects > Engineering
Subjects > Psychology
Divisions: Universidad Europea del Atlántico > Research > Scientific Production
Fundación Universitaria Internacional de Colombia > Research > Scientific Production
Universidad Internacional Iberoamericana México > Research > Scientific Production
Universidad Internacional Iberoamericana Puerto Rico > Research > Scientific Production
Universidad Internacional do Cuanza > Research > Articles and books
Deposited: 14 Mar 2024 23:30
Last Modified: 14 Mar 2024 23:30
URI: https://repositorio.unic.co.ao/id/eprint/11264



Language: English. Access: open.

Detecting hate in diversity: a survey of multilingual code-mixed image and video analysis

The proliferation of damaging content on social media in today’s digital environment has increased the need for efficient hate speech identification systems. A thorough examination of hate speech detection methods in a variety of settings, such as code-mixed, multilingual, visual, audio, and textual scenarios, is presented in this paper. Unlike previous research focusing on single modalities, our study thoroughly examines hate speech identification across multiple forms. We classify the numerous types of hate speech, showing how it appears on different platforms and emphasizing the unique difficulties in multi-modal and multilingual settings. We fill research gaps by assessing a variety of methods, including deep learning, machine learning, and natural language processing, especially for complicated data like code-mixed and cross-lingual text. Additionally, we offer key technique comparisons, suggesting future research avenues that prioritize multi-modal analysis and ethical data handling, while acknowledging its benefits and drawbacks. This study attempts to promote scholarly research and real-world applications on social media platforms by acting as an essential resource for improving hate speech identification across various data sources.

Scientific Production

Authors: Hafiz Muhammad Raza Ur Rehman, Mahpara Saleem, Muhammad Zeeshan Jhandir, Eduardo René Silva Alvarado (eduardo.silva@funiber.org), Helena Garay (helena.garay@uneatlantico.es), Imran Ashraf


Language: English. Access: open.

Evaluating the impact of deep learning approaches on solar and photovoltaic power forecasting: A systematic review

Accurate solar and photovoltaic (PV) power forecasting is essential for optimizing grid integration, managing energy storage, and maximizing the efficiency of solar power systems. Deep learning (DL) models have shown promise in this area due to their ability to learn complex, non-linear relationships within large datasets. This study presents a systematic literature review (SLR) of deep learning applications for solar PV forecasting, addressing a gap in the existing literature, which often focuses on traditional ML or broader renewable energy applications. This review specifically aims to identify the DL architectures employed, preprocessing and feature engineering techniques used, the input features leveraged, evaluation metrics applied, and the persistent challenges in this field. Through a rigorous analysis of 26 selected papers from an initial set of 155 articles retrieved from the Web of Science database, we found that Long Short-Term Memory (LSTM) networks were the most frequently used algorithm (appearing in 32.69% of the papers), closely followed by Convolutional Neural Networks (CNNs) at 28.85%. Furthermore, Wavelet Transform (WT) was found to be the most prominent data decomposition technique, while Pearson Correlation was the most used for feature selection. We also found that ambient temperature, pressure, and humidity are the most common input features. Our systematic evaluation provides critical insights into state-of-the-art DL-based solar forecasting and identifies key areas for upcoming research. Future research should prioritize the development of more robust and interpretable models, as well as explore the integration of multi-source data to further enhance forecasting accuracy. Such advancements are crucial for the effective integration of solar energy into future power grids.

Scientific Production

Authors: Oussama Khouili, Mohamed Hanine, Mohamed Louzazni, Miguel Ángel López Flores (miguelangel.lopez@uneatlantico.es), Eduardo García Villena (eduardo.garcia@uneatlantico.es), Imran Ashraf

Language: English. Access: closed.

Measurement of chest muscle mass in COVID-19 patients on mechanical ventilation using tomography

Background: Sarcopenia, characterized by a reduction in skeletal muscle mass and function, is a prevalent complication in the Intensive Care Unit (ICU) and is related to increased mortality. This study aims to determine whether muscle and fat mass measurements at the T12 and L1 vertebrae using chest tomography can predict mortality among critically ill COVID-19 patients requiring invasive mechanical ventilation (MV). Methods: Fifty-one critically ill COVID-19 patients on MV underwent chest tomography within 72 h of ICU admission. Muscle mass was measured using the Core Slicer program. Results: After adjustment for potential confounding factors related to background and clinical parameters, a 1-unit increase in muscle mass, subcutaneous, and intra-abdominal fat mass at the L1 level was associated with approximately 1–2% lower odds of negative outcomes and in-hospital mortality. No significant association was found between muscle mass at the T12 level and patient outcomes. Furthermore, no significant results were observed when considering a 1-standard deviation increase as the exposure variable. Conclusion: Measuring muscle mass using chest tomography at the T12 level does not effectively predict outcomes for ICU patients. However, muscle and fat mass at the L1 level may be associated with a lower risk of negative outcomes. Additional studies should explore other potential markers or methods to improve prognostic accuracy in this critically ill population.

Scientific Production

Authors: Natalia Daniela Llobera, Evelyn Frias-Toral, Mariel Aquino, María Jimena Reberendo, Laura Cardona Díaz, Adriana García, Martha Montalván, Álvaro Velarde Sotres (alvaro.velarde@uneatlantico.es), Sebastián Chapela


Language: English. Access: open.

Client engagement solution for post implementation issues in software industry using blockchain

In the rapidly evolving information technology industry, adequate client engagement plays a critical role: it is important to understand the client’s concerns and requirements, to keep the records, authorizations, and go-aheads for previously agreed requirements, and to provide a feasible solution accordingly. Multiple solutions have previously been proposed to enhance the efficiency of client engagement, but they lack traceability, trust, and transparency, and they are prone to conflicts over agreements in previous contracts. Because of these shortcomings, client requirements are delayed, which causes client escalations, integrity issues, project failure, and penalties. In this study, we propose the UniferCollab framework, which uses blockchain technology to overcome issues with collaboration between teams, transparency, the record of client authorizations, and the go-ahead on previous developments. In the proposed approach, we store the data on a permissible network. This allows us to compile all the requirements and information shared by clients on a permissible blockchain, securing a large amount of data and enhancing the traceability of all the requirements. All authorizations from the client generate push notifications for any changes in their current system, executed through smart contracts. This removes ambiguity between development teams when the client has shared a requirement with only one team. The data is stored in a decentralized network from which information is gathered, which resolves the traceability, transparency, and trust issues. Lastly, evaluations involved a total of 800 hypertext transfer protocol (HTTP) requests tested using Postman, with blockchain block sizes ranging from 0.568 KB to 550 KB; an average size increase of 280 KB was observed as new blocks were added. The longest chain in the network was observed during 800 repetitions of blockchain operations. Latency analysis across various experiments revealed that delays in processing HTTP requests were influenced by decentralized node processing, local machine response times, and internet bandwidth. Results show that the proposed framework resolves all client engagement issues in implementation between all stakeholders, which enhances trust and transparency, improves client experience, and helps manage disputes effectively.

Scientific Production

Authors: Muhammad Shoaib Farooq, Khurram Irshad, Danish Riaz, Nagwan Abdel Samee, Ernesto Bautista Thompson (ernesto.bautista@unini.edu.mx), Daniel Gavilanes Aray (daniel.gavilanes@uneatlantico.es), Imran Ashraf


Language: English. Access: open.

Diet, Eating Habits, and Lifestyle Factors Associated with Adequate Sleep Duration in Children and Adolescents Living in 5 Mediterranean Countries: The DELICIOUS Project

Background/Objectives: Sleep is a fundamental physiological function that plays a crucial role in maintaining health and well-being. The aim of this study was to assess dietary and lifestyle factors associated with adequate sleep duration in children and adolescents living in five Mediterranean countries. Methods: Parents of children and adolescents taking part in an initial survey for the DELICIOUS project were surveyed to assess their children’s dietary and eating habits (i.e., meal routines), as well as other lifestyle behaviors (e.g., physical activity levels and screen time) potentially associated with adequate sleep duration (defined as 8–10 h according to the National Sleep Foundation). The youth healthy eating index (Y-HEI) was used to assess the diet quality of children and adolescents. Multivariate logistic regression analyses were performed to calculate the odds ratios (ORs) and 95% confidence intervals (CIs), indicating the level of association between variables. Results: A total of 2011 individuals participated in the survey. Adolescents and children of younger parents were more likely to report inadequate sleep duration. Among eating behaviors, having breakfast (OR = 2.23, 95% CI: 1.62, 3.08) and eating at school (OR = 1.33, 95% CI: 1.01, 1.74) were associated with adequate sleep duration. In contrast, eating alone, screen time, and eating outside the home were associated with a lower likelihood of adequate sleep duration, although these findings were only significant in the unadjusted model. After adjusting for covariates, a better diet quality (OR = 1.63, 95% CI: 1.24, 2.16), including higher intake of fruits, meat, fish, and whole grains, was associated with adequate sleep duration. Conclusions: Adequate sleep duration seems to be highly influenced by factors related to individual lifestyles, family and school eating behaviors, as well as diet quality.

Scientific Production

Authors: Justyna Godos, Alice Rosi, Francesca Scazzina, Maria Antonieta Touriz Bonifaz, Francesca Giampieri (francesca.giampieri@uneatlantico.es), Osama Abdelkarim, Achraf Ammar, Mohamed Aly, Evelyn Frias-Toral, Juancho Pons, Laura Vázquez-Araújo, Josep Alemany Iturriaga (josep.alemany@uneatlantico.es), Lorenzo Monasta, Ana Mata, Adrián Chacón, Pablo Busó, Giuseppe Grosso