Real Word Spelling Error Detection and Correction for Urdu Language

Open access. Language: English.
Authors: Aziz, Romila; Anwar, Muhammad Waqas; Jamal, Muhammad Hasan; Bajwa, Usama Ijaz; Kuc Castilla, Ángel Gabriel; Uc-Rios, Carlos (carlos.uc@unini.edu.mx); Bautista Thompson, Ernesto (ernesto.bautista@unini.edu.mx); and Ashraf, Imran
Citation: (2023) Real Word Spelling Error Detection and Correction for Urdu Language. IEEE Access. p. 1. ISSN 2169-3536

Full text: Real_Word_Spelling_Error_Detection_and_Correction_for_Urdu_Language.pdf (3MB)
Available under License Creative Commons Attribution Non-commercial No Derivatives.

Abstract

Spelling errors are generally of two types: non-word errors and real-word errors. Non-word errors are misspelled words that do not exist in the lexicon, while real-word errors are misspelled words that exist in the lexicon but are used out of context in a sentence. The lexicon-based lookup approach is widely used for non-word errors, but it cannot handle real-word errors because they require contextual information. In contrast to English, real-word error detection and correction for low-resource languages like Urdu is an unexplored area. This paper presents a real-word spelling error detection and correction approach for the Urdu language. We develop an extensive lexicon of 593,738 words and use it to build a dataset for real-word errors comprising 125,562 sentences and 2,552,735 words. Based on the developed lexicon and dataset, we then develop a contextual spell checker that detects and corrects real-word errors. For the real-word error detection phase, word-gram features are used along with five machine learning classifiers, achieving a precision, recall, and F1-score of 0.84, 0.79, and 0.81, respectively. We also test the proposed approach with a 40% error density. For real-word error correction, the Damerau-Levenshtein distance is used along with an n-gram model for further ranking of the suggested candidate words, achieving an accuracy of up to 83.67%.
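The correction step described in the abstract (candidate generation by Damerau-Levenshtein distance, then ranking with an n-gram model) can be illustrated with a minimal sketch. This is not the authors' implementation. The toy English corpus, the distance threshold `max_dist`, the use of bigrams with add-one smoothing, and the function names `damerau_levenshtein` and `rank_candidates` are all assumptions made for illustration, and the distance shown is the restricted "optimal string alignment" variant.

```python
from collections import Counter


def damerau_levenshtein(a: str, b: str) -> int:
    """Edit distance with insert, delete, substitute, and adjacent transposition
    (the restricted "optimal string alignment" variant)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]


def rank_candidates(prev_word, flagged, unigrams, bigrams, max_dist=1):
    """Return lexicon words near `flagged`, best bigram fit with `prev_word` first."""
    lexicon = sorted(unigrams)
    candidates = [
        w for w in lexicon
        if w != flagged and damerau_levenshtein(flagged, w) <= max_dist
    ]

    def score(word):
        # Add-one smoothed estimate of P(word | prev_word); an assumed choice.
        return (bigrams[(prev_word, word)] + 1) / (unigrams[prev_word] + len(lexicon))

    return sorted(candidates, key=score, reverse=True)


# Hypothetical toy corpus standing in for the Urdu dataset.
corpus = [
    "he came from home",
    "she came from work",
    "fill the form",
    "the farm is big",
    "this is for you",
]
unigrams, bigrams = Counter(), Counter()
for sentence in corpus:
    words = sentence.split()
    unigrams.update(words)
    bigrams.update(zip(words, words[1:]))

# "form" is a valid word but out of context in "he came form home".
suggestions = rank_candidates("came", "form", unigrams, bigrams)
print(suggestions)
```

In this toy setting, "form" is in the lexicon, so a lookup-based checker would accept it. The candidates within distance 1 are "farm", "for", and "from", and the bigram counts place "from" first because it is the only one seen after "came".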

Document type: Article
Keywords: Real-word errors, spelling correction, spelling detection, spell checker
Subject classification: Subjects > Engineering
Divisions: Universidad Europea del Atlántico > Research > Scientific Production
Fundación Universitaria Internacional de Colombia > Research > Scientific Production
Universidad Internacional Iberoamericana México > Research > Scientific Production
Universidad Internacional Iberoamericana Puerto Rico > Research > Scientific Production
Universidad Internacional do Cuanza > Research > Articles and Books
Deposited: 14 Sep 2023 23:30
Last modified: 14 Sep 2023 23:30
URI: https://repositorio.unic.co.ao/id/eprint/8800


Single-cell omics for nutrition research: an emerging opportunity for human-centric investigations

Understanding how dietary compounds affect human health is challenged by their molecular complexity and cell-type–specific effects. Conventional multi-cell type (bulk) analyses obscure cellular heterogeneity, while animal and standard in vitro models often fail to replicate human physiology. Single-cell omics technologies—such as single-cell RNA sequencing, as well as single-cell–resolved proteomic and metabolomic approaches—enable high-resolution investigation of nutrient–cell interactions and reveal mechanisms at a single-cell resolution. When combined with advanced human-derived in vitro systems like organoids and organ-on-chip platforms, they support mechanistic studies in physiologically relevant contexts. This review outlines emerging applications of single-cell omics in nutrition research, emphasizing their potential to uncover cell-specific dietary responses, identify nutrient-sensitive pathways, and capture interindividual variability. It also discusses key challenges—including technical limitations, model selection, and institutional biases—and identifies strategic directions to facilitate broader adoption in the field. Collectively, single-cell omics offer a transformative framework to advance human-centric nutrition research.


Authors: Manuela Cassotta (manucassotta@gmail.com), Yasmany Armas Diaz, Danila Cianciosi, Bei Yang, Zexiu Qi, Ge Chen, Santos Gracia Villar (santos.gracia@uneatlantico.es), Luis Alonso Dzul López (luis.dzul@uneatlantico.es), Giuseppe Grosso, José L. Quiles, Jianbo Xiao, Maurizio Battino (maurizio.battino@uneatlantico.es), Francesca Giampieri (francesca.giampieri@uneatlantico.es)


Shoulder ligamentoplasty, arthroscopic Latarjet, dynamic anterior stabilization, and arthroscopic trillat for the treatment of shoulder instability: a systematic review of original studies on surgical techniques

Background: Anterior shoulder instability is a common condition, especially among young and active individuals, often associated with both osseous and soft tissue injuries. Recent innovations have introduced various surgical options for managing critical and subcritical instability. Therefore, the primary objective of this systematic review was to collect, synthesize, and integrate international research published across multiple scientific databases on shoulder ligamentoplasty, arthroscopic Latarjet, dynamic anterior stabilization (DAS), and arthroscopic Trillat techniques used in the treatment of shoulder instability. Methods: A structured search was conducted following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines and the PICOS model, up to January 30, 2025, in the MEDLINE/PubMed, Web of Science (WOS), ScienceDirect, Cochrane Library, SciELO, EMBASE, SPORTDiscus, and Scopus databases. The risk of bias was evaluated, and the PEDro scale was used to assess methodological quality. Results: The initial search yielded a total of 964 articles. After applying the inclusion and exclusion criteria, the final sample consisted of 25 articles. These studies demonstrated a high standard of methodological quality. The review summarized the effects of ligamentoplasty, arthroscopic Latarjet, dynamic anterior stabilization, and arthroscopic Trillat techniques in treating shoulder instability, detailing the sample population, immobilization period, frequency of instability episodes (including recurrent dislocations and subluxations), surgical methods, study designs, assessed variables, main findings, and reported outcomes. Conclusions: Arthroscopic ligamentoplasty is advantageous in preserving the patient’s native anatomy, maintaining joint integrity, and allowing for alternative interventions in case of failure. The arthroscopic Trillat technique offers a minimally invasive solution for anterior instability without significant bone loss. The DAS technique utilizes the biceps tendon to provide dynamic stabilization, aiming to generate a sling effect over the subscapularis muscle. The Latarjet procedure remains the gold standard for managing anterior glenoid bone loss greater than 20%. Each surgical option for anterior shoulder instability carries specific implications, and treatment decisions should be tailored based on bone loss severity, capsuloligamentous quality, and the patient’s functional needs.


Authors: Carlos Galindo-Rubín, Yehinson Barajas Ramón, Fernando Maniega Legarda, Álvaro Velarde-Sotres (alvaro.velarde@uneatlantico.es)


Edge-Based Autonomous Fire and Smoke Detection Using MobileNetV2

Forest fires pose significant threats to ecosystems, human life, and the global climate, necessitating rapid and reliable detection systems. Traditional fire detection approaches, including sensor networks, satellite monitoring, and centralized image analysis, often suffer from delayed response, high false positives, and limited deployment in remote areas. Recent deep learning-based methods offer high classification accuracy but are typically computationally intensive and unsuitable for low-power, real-time edge devices. This study presents an autonomous, edge-based forest fire and smoke detection system using a lightweight MobileNetV2 convolutional neural network. The model is trained on a balanced dataset of fire, smoke, and non-fire images and optimized for deployment on resource-constrained edge devices. The system performs near real-time inference, achieving a test accuracy of 97.98% with an average end-to-end prediction latency of 0.77 s per frame (approximately 1.3 FPS) on the Raspberry Pi 5 edge device. Predictions include the class label, confidence score, and timestamp, all generated locally without reliance on cloud connectivity, thereby enhancing security and robustness against potential cyber threats. Experimental results demonstrate that the proposed solution maintains high predictive performance comparable to state-of-the-art methods while providing efficient, offline operation suitable for real-world environmental monitoring and early wildfire mitigation. This approach enables cost-effective, scalable deployment in remote forest regions, combining accuracy, speed, and autonomous edge processing for timely fire and smoke detection.
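The per-frame output described above (class label, confidence score, and timestamp, produced locally) and the relation between latency and frame rate can be sketched as follows. This is an illustrative sketch only, not the system from the study. The class order in `CLASSES`, the raw-logit input, and the helper names `softmax`, `make_prediction`, and `frames_per_second` are assumptions, and the model itself is omitted.

```python
import math
from datetime import datetime, timezone

# Assumed class order; the study only states that the three classes exist.
CLASSES = ("fire", "smoke", "non-fire")


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def make_prediction(logits, now=None):
    """Build a local prediction record: label, confidence, timestamp."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    return {"label": CLASSES[best], "confidence": probs[best], "timestamp": stamp}


def frames_per_second(latency_s):
    """Throughput implied by a fixed end-to-end latency per frame."""
    return 1.0 / latency_s


# Hypothetical logits standing in for one frame's model output.
record = make_prediction([2.0, 0.5, 0.1])
print(record["label"], round(record["confidence"], 2), record["timestamp"])
print(round(frames_per_second(0.77), 1))  # reported latency of 0.77 s per frame
```

The last line reproduces the abstract's arithmetic: a latency of 0.77 s per frame corresponds to 1 / 0.77, or roughly 1.3 frames per second.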


Authors: Dilshod Sharobiddinov, Hafeez Ur Rehman Siddiqui, Adil Ali Saleem, Gerardo Méndez Mezquita, Debora L. Ramírez-Vargas (debora.ramirez@unini.edu.mx), Isabel de la Torre Díez


Divulging Patterns: An Analytical Review for Machine Learning Methodologies for Breast Cancer Detection

Breast cancer is a lethal carcinoma affecting a considerable number of women across the globe. While preventive measures are limited, early detection remains the most effective strategy. Accurate classification of breast tumors into benign and malignant categories is important, as it may help physicians diagnose the disease faster. This survey investigates emerging trends and approaches in machine learning (ML) for the diagnosis of breast cancer, highlighting classification techniques based on both segmentation and feature selection. Datasets such as the Wisconsin Diagnostic Breast Cancer Dataset (WDBC), Wisconsin Breast Cancer Dataset Original (WBCD), Wisconsin Prognostic Breast Cancer Dataset (WPBC), BreakHis, and others are evaluated to show their influence on the performance of diagnostic tools and on the accuracy of models such as support vector machines, Convolutional Neural Networks (CNNs), and ensemble approaches. The main shortcomings and research gaps, such as dataset bias, limited generalizability, and interpretation challenges, are highlighted. This research emphasizes the importance of hybrid methodologies, cross-dataset validation, and the engineering of explainable AI to narrow these gaps and enhance the overall clinical acceptance of ML-based detection tools.


Authors: Alveena Saleem, Muhammad Umair, Muhammad Tahir Naseem, Muhammad Zubair, Silvia Aparicio Obregón (silvia.aparicio@uneatlantico.es), Rubén Calderón Iglesias (ruben.calderon@uneatlantico.es), Shoaib Hassan, Imran Ashraf


Unhealthy Ultra-Processed Food Consumption in Children and Adolescents Living in the Mediterranean Area: The DELICIOUS Project

Objectives: This study addressed the consumption of ultra-processed foods (UPFs) formulated with an excess of energy/fats/sugars (hence deemed unhealthy) and the factors associated with it in children and adolescents living in 5 Mediterranean countries participating in the DELICIOUS (UnDErstanding consumer food choices & promotion of healthy and sustainable Mediterranean diet and LIfestyle in Children and adolescents through behavIOUral change actionS) project. Methods: A total of 2011 parents of children and adolescents (6–17 years) participated in a survey exploring their children’s frequency of consumption of unhealthy UPFs and their demographic, eating, and lifestyle habits. Results: Most children consumed unhealthy UPFs daily; higher intake was associated with being older and with obesity, as well as higher parental education and younger age. Children who ate out of home more frequently and had a higher number of meals were also more likely to consume unhealthy UPFs. Moreover, more screen time and a lower healthy lifestyle score were associated with higher unhealthy UPF consumption. Conclusion: Consumption of unhealthy UPFs seems to be preeminent in children and adolescents living in the Mediterranean area and is associated with an overall unhealthy lifestyle.


Authors: Alice Rosi, Francesca Giampieri (francesca.giampieri@uneatlantico.es), Osama Abdelkarim, Mohamed Aly, Achraf Ammar, Evelyn Frias-Toral, Juancho Pons, Laura Vázquez-Araújo, Alessandro Scuderi, Nunzia Decembrino, Alice Leonardi, Fernando Maniega Legarda, Lorenzo Monasta, Ana Mata, Adrián Chacón, Pablo Busó, Giuseppe Grosso
