Scientific Publications

Migration of RDBs into ORDBs and XML Data

Journal Article

Abstract: XML and relational databases are two of the most important mechanisms for storing and transferring data, so a reliable and flexible way of moving data between them is highly desirable. The two models store data very differently, which makes translation difficult. To abstract some of these differences away, a low-level common data model can be used to move data from one model to the other. A way of describing the schema is also needed. To the best of our knowledge, there is no widely accepted way of doing this for XML.

Recently, XML Schema has taken on this role. On the one hand, this paper takes XML conforming to XML Schema definitions and transforms it into a relational database via the low-level modeling language HDM. On the other hand, a relational database is transformed into an XML Schema document and an XML instance document containing the data from the database. The transformations are performed within the AutoMed framework, which provides a sound theoretical basis for the work. A visual tool that represents the XML Schema as a tree structure and allows some manipulation of the schema is also described.
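As a rough illustration of the relational-to-XML direction described in this abstract, the sketch below serializes the rows of one table as an XML instance document. It is not the AutoMed/HDM transformation from the paper: the `table_to_xml` helper, the `author` table, and the `<row>` element name are assumptions made for the example, and no XML Schema document is generated.

```python
import sqlite3
import xml.etree.ElementTree as ET


def table_to_xml(conn, table):
    """Serialize every row of `table` as a <row> element with one child per column.

    Hypothetical helper for illustration; `table` is assumed to be a trusted name.
    """
    cur = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cur.description]
    root = ET.Element(table)
    for record in cur:
        row = ET.SubElement(root, "row")
        for name, value in zip(columns, record):
            cell = ET.SubElement(row, name)
            # NULL columns become empty elements; everything else is stored as text.
            cell.text = "" if value is None else str(value)
    return root


# Small in-memory database standing in for the source relational schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO author VALUES (?, ?)", [(1, "Ada"), (2, "Alan")])

xml_text = ET.tostring(table_to_xml(conn, "author"), encoding="unicode")
print(xml_text)
```

A fuller treatment would also emit the schema (column types, keys) alongside the instance data, which is the part the paper handles through the common data model.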

Ali Sayeh Ahmed Elbekai, Abduelbaset Mustafa Alia Goweder, (04-2018), Faculty of Science, University of Tripoli: THE LIBYAN JOURNAL OF SCIENCE (An International Journal), 4 (21), 57-63


Using Access Control List against Denial of service attacks

Journal Article

Hadya Soliman Hadya Hawedi, (12-2017), Journal of Economics and Political Science: Faculty of Economics and Commerce / Al-Asmarya Islamic University, 1 (10), 261-274


Irregular Arabic Plural without Stemming.

Conference paper

Abstract: With the growth of digital Arabic documents, especially in information retrieval (IR) and natural language processing (NLP) applications, identifying irregular plurals, commonly called broken plurals (BP), in Modern Standard Arabic has become an urgent issue. Broken plurals are formed by imposing interdigitating patterns on stems, so singular words cannot be recovered by standard affix-stripping stemming techniques. Identifying broken plurals is therefore an important and difficult problem. In information retrieval, deriving singulars from plurals is referred to as stemming, which is achieved by removing the attached affixes from a given word. To the best of our knowledge, all existing Arabic stemmers are unreliable and still under research. Consequently, this paper proposes an approach that identifies broken plurals without performing stemming on any given word. The well-known decision tree system (WEKA J48) is applied to build a classifier (model) on a very large Arabic corpus, which is pre-processed and prepared as training data as part of this work. The resulting classifier is evaluated using an unseen test set. The results indicate that a very promising broken plural recognizer could be designed and implemented for NLP applications.

Abduelbaset Mustafa Alia Goweder, (11-2016), Hammamet, Tunisia.: Proceedings of CEIT 2016, 1-6


Detection of the Offensive Language in Multilingual Communication

Technical Report


Despite the large number of studies that deal with abuse on social media platforms, the vast majority focus on the English language. With the increasing number of Arabs actively using social media, there is a paucity of studies addressing the problems specific to the Arabic language.

AZALDEN ABULQASEM MOHAMED ALAKROT, (03-2016), The University of Limerick, Ireland.


INFORMATION TECHNOLOGY CAPABILITY AS PREDICTOR OF ORGANIZATIONAL INTELLIGENCE IN LIBYAN OIL AND GAS COMPANIES

Journal Article

Hadya Soliman Hadya Hawedi, (12-2015), ARPN Journal of Engineering and Applied Sciences: ARPN, 10 (23), 18220-18227


A survey of text mining approaches to cyberbullying detection in online communication flows

Technical Report

The exponential growth of various social media platforms in recent years has created the opportunity for people to interact and communicate with each other to a degree unprecedented before the invention of the Web. This development is without doubt beneficial for society; however, it has also been associated with an escalation of cyberbullying activities with unacceptable consequences. The goal of this study is to develop data mining and information visualisation techniques which can form the basis of a visual analytics solution for effective detection of cyberbullying. In this paper, we summarise the main algorithms for text mining with a focus on cyberbullying detection.

AZALDEN ABULQASEM MOHAMED ALAKROT, (03-2015), The University of Galwa, Ireland.


The Similarity Thesaurus for Expanding Arabic Queries

Conference paper

Abstract: Query expansion is the process of adding terms to the original query to improve information retrieval (IR) performance. For heavily inflectional languages such as Arabic, query expansion is considered a difficult task. In this paper, the well-known similarity thesaurus approach is applied to Arabic. First, the datasets (three collections of Arabic documents) are pre-processed to create inverted-index vocabularies for the documents, and the normal indexing process is carried out. The thesaurus method is then applied to create a modified (expanded) version of the original query, and the target collection is indexed once more. To gauge the improvement in retrieval, the results of normal indexing and those of the thesaurus approach are compared using precision and recall measures. The results show that the thesaurus method considerably enhances the performance of the Arabic Information Retrieval (AIR) system. Performance improves as the number of expansion terms increases up to a certain limit (35 terms); beyond this limit, performance is unaffected or improves only insignificantly.
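The general idea of thesaurus-based expansion can be sketched as follows. This is a simplified illustration, not the paper's implementation: each term is represented as a vector over the documents it occurs in, weighted by raw frequency (an assumption; published similarity-thesaurus formulations use more elaborate weights), and the candidate terms most similar to the query terms are appended. The function names and the toy English collection are hypothetical.

```python
import math
from collections import Counter, defaultdict


def term_vectors(docs):
    """Represent each term as a unit-length vector over document ids."""
    vectors = defaultdict(dict)
    for doc_id, tokens in enumerate(docs):
        for term, freq in Counter(tokens).items():
            vectors[term][doc_id] = float(freq)
    for vec in vectors.values():
        norm = math.sqrt(sum(w * w for w in vec.values()))
        for doc_id in vec:
            vec[doc_id] /= norm
    return vectors


def similarity(u, v):
    """Dot product of two sparse unit vectors, i.e. their cosine similarity."""
    return sum(w * v.get(doc_id, 0.0) for doc_id, w in u.items())


def expand_query(query, docs, k=2):
    """Append the k non-query terms most similar to the query terms as a whole."""
    vectors = term_vectors(docs)
    scores = {}
    for candidate, vec in vectors.items():
        if candidate in query:
            continue
        scores[candidate] = sum(
            similarity(vectors[q], vec) for q in query if q in vectors
        )
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return list(query) + ranked[:k]


docs = [
    ["oil", "gas", "export"],
    ["oil", "gas", "price"],
    ["football", "match", "goal"],
]
expanded = expand_query(["oil"], docs, k=2)
print(expanded)
```

Here `k` plays the role of the number of expansion terms, the parameter whose effect the abstract reports as levelling off at around 35 terms.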

Abduelbaset Mustafa Alia Goweder, (08-2014), University of Selcuk, Antalya, Turkey: Proceedings of ICAT 2014, 876-882


XMLSchema-Driven Mapping of Architecture Components for Generating New Data.

Conference paper

Abstract: This paper introduces the XMLSchema-driven mapping of architecture components for generating new data formats and investigates how an XMLSchema can be stored in different ways. In general, any application that works with XML documents needs to display the structure of its data in different formats for particular occasions, because it operates in a heterogeneous environment. Accordingly, documents must be mapped from one data structure to another. Such a mapping process is essential, especially when dealing with XMLSchema. When data are to be translated between XML and a database, some mapping must be formulated before the data can be transferred either to the database or to the document. Most techniques use object-relational mapping to transform data between XML and the database. In this paper, we present different types of XMLSchema mapping, such as tree-to-tree mapping (from one XMLSchema to another XMLSchema) and XMLSchema to XHTML. Other mappings are XMLSchema to relational, XMLSchema to object-relational, and XMLSchema to relational algebra. We also introduce general algorithms for many of the mapping types. The algorithms and techniques show how XMLSchema drives the mapping of architecture components to generate a new data structure.

Ali Sayeh Ahmed Elbekai, Abduelbaset Mustafa Alia Goweder, (08-2014), University of Selcuk, Antalya, Turkey.: Proceedings of ICAT 2014, 889-895


CENTROID-BASED ARABIC CLASSIFIER

Conference paper

Abstract: Nowadays, an enormous amount of textual information is accessible on the Internet. Automatic text classification, the process of assigning a document to predefined categories based on its content, is considered an important application in natural language processing. In this paper, the well-known Centroid-based technique developed for text classification is applied to Arabic text. Arabic is highly inflectional and derivational, which makes text processing a complex and challenging task. In the proposed work, the Centroid-based algorithm is adapted to classify Arabic documents. The implemented algorithm is evaluated using a corpus of Arabic documents. The experimental results on a dataset of 1400 Arabic text documents covering seven distinct categories show that the adapted Centroid-based algorithm is applicable to classifying Arabic documents. The implemented Arabic classifier achieved approximately 90.7%, 87.1%, 88.9%, 94.8%, and 5.2% for micro-averaged recall, precision, F-measure, accuracy, and error rate, respectively.
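A minimal sketch of centroid-based classification, under stated assumptions, may help make the technique concrete. Documents are assumed to be already tokenized, term weights are plain normalized term frequencies (the paper's weighting and Arabic pre-processing are not reproduced here), and the function names and toy training data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def normalize(vec):
    """Scale a sparse term-weight vector to unit length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def train_centroids(labelled_docs):
    """Build one centroid per category by summing its normalized document vectors."""
    sums = defaultdict(Counter)
    for label, tokens in labelled_docs:
        for term, w in normalize(Counter(tokens)).items():
            sums[label][term] += w
    return {label: normalize(total) for label, total in sums.items()}


def classify(tokens, centroids):
    """Assign the category whose centroid has the highest cosine similarity."""
    vec = normalize(Counter(tokens))

    def cosine(centroid):
        return sum(w * centroid.get(t, 0.0) for t, w in vec.items())

    return max(centroids, key=lambda label: cosine(centroids[label]))


train = [
    ("sport", ["match", "goal", "team"]),
    ("sport", ["team", "player", "goal"]),
    ("economy", ["oil", "price", "market"]),
    ("economy", ["market", "export", "oil"]),
]
centroids = train_centroids(train)
print(classify(["oil", "export", "price"], centroids))
print(classify(["goal", "player"], centroids))
```

Training reduces each category to a single vector, so classification costs one similarity computation per category, which is the main appeal of the approach.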

Abduelbaset Mustafa Alia Goweder, (12-2013), Sudan University of Science and Technology, Khartoum, Sudan: Proceedings of ACIT 2013, 13-21


The Pseudo Relevance Feedback for Expanding Arabic Queries

Conference paper

Abstract: With the explosive growth of the World Wide Web, Information Retrieval Systems (IRS) have recently become a focus of research. Query expansion is defined as the process of adding terms or phrases to the original query to improve information retrieval performance. Arabic is a highly inflectional and derivational language, which makes query expansion a hard task. In this paper, the well-known Pseudo Relevance Feedback (PRF) approach is applied to Arabic. First, the datasets (three collections of Arabic documents) are pre-processed to create inverted-index vocabularies for the documents, and the normal indexing process is carried out. PRF is then applied to create a modified (expanded) version of the original query, and the target collection is indexed once more. To judge the improvement in retrieval, the results of normal indexing and those of PRF are compared using precision and recall measures. The results show that the PRF method significantly enhances the performance of the Arabic Information Retrieval (AIR) system. Performance improves as the number of expansion terms increases up to a certain limit (35 terms); beyond this limit, performance is unaffected or improves only insignificantly.
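The PRF loop can be illustrated with a small sketch. It assumes a deliberately naive retrieval function (counting query-term occurrences) in place of the paper's indexing system, treats the top-ranked documents as pseudo-relevant, and picks expansion terms by raw frequency. All names, parameters, and the toy collection are hypothetical.

```python
from collections import Counter


def rank(query, docs):
    """Return document indices ordered by how many query-term occurrences they contain."""
    q = set(query)
    scores = [sum(1 for tok in doc if tok in q) for doc in docs]
    return sorted(range(len(docs)), key=lambda i: (-scores[i], i))


def prf_expand(query, docs, top_docs=2, n_terms=2):
    """Expand the query with the most frequent new terms from the top-ranked documents."""
    feedback = rank(query, docs)[:top_docs]  # assumed relevant without user judgement
    counts = Counter(
        tok for i in feedback for tok in docs[i] if tok not in query
    )
    # Most frequent first; ties are broken alphabetically for determinism.
    ranked = sorted(counts, key=lambda t: (-counts[t], t))
    return list(query) + ranked[:n_terms]


docs = [
    ["oil", "price", "market", "oil"],
    ["oil", "export", "market"],
    ["football", "match", "goal"],
    ["gas", "export"],
]
expanded = prf_expand(["oil"], docs)
print(expanded)
print(rank(expanded, docs))
```

In this toy collection the expanded query also matches the fourth document, which shares no term with the original query; this is the kind of recall gain that expansion aims for.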

Abduelbaset Mustafa Alia Goweder, (12-2013), Poznan, Poland.: Proceedings of 6th Language and Technology Conference, (LTC), 359-365
