Journal Issues

Volume 22 (2022): Issue 3 (September 2022)

Volume 22 (2022): Issue 2 (June 2022)

Volume 22 (2022): Issue 1 (March 2022)

Volume 21 (2021): Issue 4 (December 2021)

Volume 21 (2021): Issue 3 (September 2021)

Volume 21 (2021): Issue 2 (June 2021)

Volume 21 (2021): Issue 1 (March 2021)

Volume 20 (2020): Issue 6 (December 2020)
Special Issue on New Developments in Scalable Computing

Volume 20 (2020): Issue 5 (December 2020)
Special Issue on Innovations in Intelligent Systems and Applications

Volume 20 (2020): Issue 4 (November 2020)

Volume 20 (2020): Issue 3 (September 2020)

Volume 20 (2020): Issue 2 (June 2020)

Volume 20 (2020): Issue 1 (March 2020)

Volume 19 (2019): Issue 4 (November 2019)

Volume 19 (2019): Issue 3 (September 2019)

Volume 19 (2019): Issue 2 (June 2019)

Volume 19 (2019): Issue 1 (March 2019)

Volume 18 (2018): Issue 5 (May 2018)
Special Thematic Issue on Optimal Codes and Related Topics

Volume 18 (2018): Issue 4 (November 2018)

Volume 18 (2018): Issue 3 (September 2018)

Volume 18 (2018): Issue 2 (June 2018)

Volume 18 (2018): Issue 1 (March 2018)

Volume 17 (2017): Issue 5 (December 2017)
Special Issue with Selected Papers from the Workshop “Two Years Avitohol: Advanced High Performance Computing Applications 2017”

Volume 17 (2017): Issue 4 (November 2017)

Volume 17 (2017): Issue 3 (September 2017)

Volume 17 (2017): Issue 2 (June 2017)

Volume 17 (2017): Issue 1 (March 2017)

Volume 16 (2016): Issue 6 (December 2016)
Special Issue with a selection of extended papers from the 6th International Conference on Logistics, Informatics and Service Science LISS’2016

Volume 16 (2016): Issue 5 (October 2016)
Special Issue on Application of Advanced Computing and Simulation in Information Systems

Volume 16 (2016): Issue 4 (December 2016)

Volume 16 (2016): Issue 3 (September 2016)

Volume 16 (2016): Issue 2 (June 2016)

Volume 16 (2016): Issue 1 (March 2016)

Volume 15 (2015): Issue 7 (December 2015)
Special Issue on Information Fusion

Volume 15 (2015): Issue 6 (December 2015)
Special Issue on Logistics, Informatics and Service Science

Volume 15 (2015): Issue 5 (April 2015)
Special Issue on Control in Transportation Systems

Volume 15 (2015): Issue 4 (November 2015)

Volume 15 (2015): Issue 3 (September 2015)

Volume 15 (2015): Issue 2 (June 2015)

Volume 15 (2015): Issue 1 (March 2015)

Volume 14 (2014): Issue 5 (December 2014)
Special Issue

Volume 14 (2014): Issue 4 (December 2014)

Volume 14 (2014): Issue 3 (September 2014)

Volume 14 (2014): Issue 2 (June 2014)

Volume 14 (2014): Issue 1 (March 2014)

Volume 13 (2013): Special Issue (December 2013)

Volume 13 (2013): Issue 4 (December 2013)
The publication of the present issue (Volume 13, No 4, 2013) of the journal “Cybernetics and Information Technologies” is financially supported by the FP7 project “Advanced Computing for Innovation” (ACOMIN), grant agreement 316087 of Call FP7 REGPOT-2012-2013-1.

Volume 13 (2013): Issue 3 (September 2013)

Volume 13 (2013): Issue 2 (June 2013)

Volume 13 (2013): Issue 1 (March 2013)

Volume 12 (2012): Issue 4 (December 2012)

Volume 12 (2012): Issue 3 (September 2012)

Volume 12 (2012): Issue 2 (June 2012)

Volume 12 (2012): Issue 1 (March 2012)

Journal Details
Format
Journal
eISSN
1314-4081
First Published
13 Mar 2012
Publication Frequency
4 times per year
Languages
English

Volume 18 (2018): Issue 1 (March 2018)

14 Articles
Open Access

A Unique Computational Method for Constructing Intervals in Fuzzy Time Series Forecasting

Publication date: 30 Mar 2018
Page range: 3 - 10

Abstract

This research article proposes a computational method for constructing fuzzy sets in the absence of expert knowledge. The method draws on the statistical concepts of central tendency and spread, namely the mean and the variance. The study addresses a critical issue in the design of fuzzy systems: the number of fuzzy sets. The proposed method helps find intervals, and thereby fuzzy sets, for fuzzy time series forecasting. It is applied to the enrolment data of the University of Alabama, which is considered a benchmark problem in the field of fuzzy time series. The forecasted values are compared with the results of other methods to demonstrate its advantage. Combined with a Gaussian membership function, the proposed method gave promising results compared with other fuzzy time series methods on this benchmark data.

Keywords

  • Fuzzy logic
  • central tendencies
  • membership function
  • prediction
  • fuzzy time series
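
As a rough illustration of the idea in this abstract, the sketch below places fuzzy-set centres around the sample mean and assigns Gaussian membership degrees. The abstract does not give the paper's actual construction, so the function names, the choice of spacing the centres one standard deviation apart, and the sample series are all assumptions made for illustration.

```python
import math
import statistics


def build_centres(values, n_sets=5):
    """Place n_sets fuzzy-set centres symmetrically around the mean,
    one standard deviation apart (an illustrative assumption)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # must be non-zero for the membership below
    half = (n_sets - 1) / 2
    return [mean + (i - half) * std for i in range(n_sets)], std


def gaussian_membership(x, centre, sigma):
    """Degree (0..1) to which x belongs to the fuzzy set centred at `centre`."""
    return math.exp(-((x - centre) ** 2) / (2 * sigma ** 2))


def fuzzify(x, centres, sigma):
    """Return the index of the fuzzy set with the highest membership for x."""
    degrees = [gaussian_membership(x, c, sigma) for c in centres]
    return max(range(len(centres)), key=degrees.__getitem__)


# Hypothetical enrolment-like series, not the benchmark data.
series = [13000, 13500, 14000, 15000, 15500, 16000, 17000, 18000, 19000]
centres, sigma = build_centres(series)
labels = [fuzzify(x, centres, sigma) for x in series]
```

Each observation is mapped to the set whose centre lies closest to it. A forecasting method would then model transitions between these labels, which is outside the scope of this sketch.
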
Open Access

Malicious URLs Detection Using Decision Tree Classifiers and Majority Voting Technique

Publication date: 30 Mar 2018
Page range: 11 - 29

Abstract

Researchers all over the world have provided significant and effective solutions for detecting malicious URLs. Still, due to the ever-changing nature of cyberattacks, many open issues remain. In this paper, we provide an effective hybrid methodology with new features to deal with this problem. To evaluate our approach, we use state-of-the-art supervised decision tree learning classification models. We perform our experiments on a balanced dataset. The experimental results show that, with the inclusion of the new features, all the decision tree learning classifiers work well on our labeled dataset, achieving 98-99% detection accuracy with very low False Positive Rate (FPR) and False Negative Rate (FNR). We also achieve 99.29% detection accuracy with very low FPR and FNR using a majority voting technique, which is better than well-known anti-virus and anti-malware solutions.

Keywords

  • Static and dynamic analysis
  • feature extraction
  • decision tree learning
  • malicious URLs
  • Web security
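
The majority voting step mentioned in this abstract can be sketched as follows. The three rule-based functions are hypothetical stand-ins for trained decision tree classifiers. The paper's features and models are not described in the abstract, so everything named here is an assumption for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by most classifiers for one sample.
    Ties go to the label seen first (a simplifying assumption)."""
    return Counter(predictions).most_common(1)[0][0]


def ensemble_predict(classifiers, url):
    """Ask every classifier for a label and combine them by majority vote."""
    return majority_vote([clf(url) for clf in classifiers])


# Three toy rules standing in for trained decision trees (hypothetical).
def has_ip_host(url):
    host = url.split("//")[-1].split("/")[0]
    return "malicious" if host.replace(".", "").isdigit() else "benign"


def is_very_long(url):
    return "malicious" if len(url) > 60 else "benign"


def has_at_sign(url):
    return "malicious" if "@" in url else "benign"


classifiers = [has_ip_host, is_very_long, has_at_sign]
verdict = ensemble_predict(classifiers, "http://192.168.0.1/login@secure")
```

Using an odd number of voters avoids ties for a two-class problem, which is one reason such ensembles are often built from three or five models.
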
Open Access

Packet-Level Link Capacity Evaluation for IP Networks

Publication date: 30 Mar 2018
Page range: 30 - 40

Abstract

In recent times, with their many applications, IP networks have become the most powerful tool for sharing information. Best-effort IP interconnected networks deliver data according to the available resources, without any assurance of throughput, delay bounds, or reliability. As a result, their performance is highly variable and cannot be guaranteed. In IP networks, ensuring proper link capacity at the packet level is a challenging problem. This article suggests a method for evaluating the link capacity of IP networks at the packet level, based on a single server delay system with state-dependent arrival and departure processes. The dependence of the carried traffic on the queue length and on the defined waiting time is shown. The presented graphical dependencies make it possible to determine the carried traffic of the links for a defined quality of service, namely the probability of packet loss and the admissible delay.

Keywords

  • Link capacity
  • packet level
  • generalised single server delay queue
  • state-dependent arrival process
  • throughput
  • peaked flow
  • queue length
  • carried traffic
  • network congestion
  • best-effort IP network
  • overload regime
  • congestion
  • packet loss
  • packet delay
  • demand-capacity-performance relation
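
To make the relation between queue length, packet loss and carried traffic concrete, the sketch below uses the classical M/M/1/K queue. This is a deliberately simpler stand-in: the article itself works with a generalised single server queue with state-dependent arrival and departure processes, which the abstract does not specify. The function names are illustrative.

```python
def mm1k_loss_probability(rho, k):
    """Probability that an arriving packet finds the system full.

    rho: offered load (arrival rate / service rate)
    k:   maximum number of packets in the system (queue plus server)
    """
    if rho == 1.0:
        return 1.0 / (k + 1)
    return (1.0 - rho) * rho ** k / (1.0 - rho ** (k + 1))


def carried_load(rho, k):
    """Portion of the offered load that is actually served."""
    return rho * (1.0 - mm1k_loss_probability(rho, k))


def smallest_buffer(rho, max_loss, k_limit=1000):
    """Smallest k whose loss probability does not exceed max_loss."""
    if not 0.0 < rho < 1.0:
        raise ValueError("this sketch only searches for 0 < rho < 1")
    for k in range(1, k_limit + 1):
        if mm1k_loss_probability(rho, k) <= max_loss:
            return k
    return None  # target not reachable within k_limit
```

For example, `smallest_buffer(0.5, 0.01)` searches for the shortest queue that keeps loss at or below 1% when the link is half loaded.
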
Open Access

Comparison of Software Decision Support Systems for Solving a Multicriteria Optimization Problem

Publication date: 30 Mar 2018
Page range: 41 - 50

Abstract

This article describes in detail how multicriteria optimization can be applied to solve a typical business problem of resource planning and manufacturing process optimization in a battery factory. We solve the problem using WebOptim, an interactive software decision support system developed at the Institute of Information and Communication Technologies. The entire problem-solving process is described step by step in order to point out the problem-specific features and to demonstrate the capabilities of the WebOptim software system. For comparison, we solved the same problem with another popular decision support system, WWW NIMBUS, and both solutions are analyzed and discussed.

Keywords

  • Multicriteria optimization
  • decision support systems
  • resource planning
Open Access

Adaptation of Symmetric Positive Semi-Definite Matrices for the Analysis of Textured Images

Publication date: 30 Mar 2018
Page range: 51 - 68

Abstract

This paper addresses the analysis of textured images using symmetric positive semi-definite matrices. In particular, a field of symmetric positive semi-definite matrices is used to estimate the structural information represented by the local orientation and the degree of anisotropy in structured and sinusoid-like textured images. To ensure faithful local structure estimation, an adaptive algorithm is proposed for regularizing the extent of gradient field smoothing. Results obtained on different texture samples show the strength of the proposed method in accurately representing the local variation of orientations in the underlying textured images, which paves the way towards an accurate analysis of texture structures.

Keywords

  • Binary Image
  • Coherence Factor
  • Gaussian Kernel
  • Gradient Field
  • Image Structure
  • Orientation
  • Symmetric Positive Semi-Definite Matrix
  • Textured Images
Open Access

Digital Image Steganography Using Bit Flipping

Publication date: 30 Mar 2018
Page range: 69 - 80

Abstract

This article proposes a bit flipping method for concealing secret data in an original image. A block consists of 2 pixels, and one or two LSBs of the pixels are flipped to hide secret information in it. The method exists in two variants. Variant-1 and Variant-2 both use the 7th and 8th bits of a pixel to conceal the secret data. Variant-1 hides 3 bits per pair of pixels, and Variant-2 hides 4 bits per pair of pixels. The proposed method notably raises the capacity, as well as the number of bits per pixel that can be hidden in the image, compared to the existing bit flipping method. The image steganographic parameters of the proposed techniques, such as Peak Signal to Noise Ratio (PSNR), hiding capacity, and Quality Index (Q.I.), have been compared with the results of the existing bit flipping technique and of some state-of-the-art articles.

Keywords

  • Steganography
  • Least Significant Bit (LSB) substitution
  • bit flipping
  • capacity
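
For orientation, the sketch below shows plain 2-LSB substitution that stores 4 secret bits per pair of pixels, the same capacity the abstract reports for Variant-2. It is not the paper's bit flipping rule, which the abstract does not spell out. It also assumes that the "7th and 8th bit" are the two least significant bits of an 8-bit pixel, and the function names are illustrative.

```python
def embed_pair(pixel_a, pixel_b, nibble):
    """Hide a 4-bit value in the two least significant bits of two pixels."""
    if not 0 <= nibble <= 0b1111:
        raise ValueError("nibble must fit in 4 bits")
    stego_a = (pixel_a & 0b11111100) | (nibble >> 2)    # upper 2 secret bits
    stego_b = (pixel_b & 0b11111100) | (nibble & 0b11)  # lower 2 secret bits
    return stego_a, stego_b


def extract_pair(stego_a, stego_b):
    """Recover the 4-bit value from a pair of stego pixels."""
    return ((stego_a & 0b11) << 2) | (stego_b & 0b11)


def embed(pixels, data):
    """Embed bytes into a flat list of 8-bit grey values, one nibble per pair."""
    nibbles = [n for byte in data for n in (byte >> 4, byte & 0x0F)]
    if 2 * len(nibbles) > len(pixels):
        raise ValueError("cover image too small")
    out = list(pixels)
    for i, nib in enumerate(nibbles):
        out[2 * i], out[2 * i + 1] = embed_pair(out[2 * i], out[2 * i + 1], nib)
    return out


def extract(pixels, n_bytes):
    """Read n_bytes back out of the stego pixels."""
    nibbles = [extract_pair(pixels[2 * i], pixels[2 * i + 1])
               for i in range(2 * n_bytes)]
    return bytes((nibbles[2 * j] << 4) | nibbles[2 * j + 1]
                 for j in range(n_bytes))


cover = list(range(100, 116))   # hypothetical 16-pixel grey-scale cover
stego = embed(cover, b"Hi")
recovered = extract(stego, 2)
```

Because only the two lowest bits of each pixel change, no pixel value moves by more than 3 grey levels, which is what keeps PSNR high for this family of methods.
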
Open Access

Classification of Mental Tasks from EEG Signals Using Spectral Analysis, PCA and SVM

Publication date: 30 Mar 2018
Page range: 81 - 92

Abstract

Signals provided by ElectroEncephaloGraphy (EEG) are widely used in Brain-Computer Interface (BCI) applications. They can be further analyzed and used for thinking activity recognition. In this paper we propose an algorithm that is able to recognize five mental tasks using 6-channel EEG data. The main idea is to separate the raw EEG signals into several frames and compute their spectra. Next, a second-order derivative of Gaussian is applied to extract features, and a grid search for the optimum Gaussian kernel parameters is performed with the help of cross-validation. The extracted features are further reduced by Principal Component Analysis. The processed data is used to train an SVM classifier, which is then used for mental task recognition. The performance of the algorithm is estimated on a publicly available dataset. With 5-fold cross-validation we obtained an average recognition rate (accuracy) of 82.7%. Additional experiments were conducted using leave-one-out cross-validation, where 67.2% correct classification was reported. Comparison with several state-of-the-art methods reveals the advantages of the proposed algorithm.

Keywords

  • ElectroEncephaloGraphy (EEG)
  • Brain Computer Interface (BCI)
  • Fast Fourier Transform (FFT)
  • Principal Component Analysis (PCA)
  • Support Vector Machine (SVM)
Open Access

Special Thematic Section on Semantic Models for Natural Language Processing (Preface)

Publication date: 30 Mar 2018
Page range: 93 - 94

Abstract

With the availability of large language data online, cross-linked lexical resources (such as BabelNet, Predicate Matrix and UBY) and semantically annotated corpora (SemCor, OntoNotes, etc.), more and more applications in Natural Language Processing (NLP) have started to exploit various semantic models. The semantic models have been created on the basis of LSA, clustering, word embeddings, deep learning, neural networks, etc., and of abstract logical forms such as Minimal Recursion Semantics (MRS) or Abstract Meaning Representation (AMR).

Additionally, the Linguistic Linked Open Data Cloud (LLOD Cloud) has been initiated, which interlinks linguistic data to improve NLP tasks. This cloud has been expanding enormously over the last four to five years. It includes corpora, lexicons, thesauri and knowledge bases of various kinds, organized around appropriate ontologies such as LEMON. The semantic models behind the data organization, as well as the representation of the semantic resources themselves, are a challenge to the NLP community.

The NLP applications that rely extensively on the models discussed above include Machine Translation, Information Extraction, Question Answering, Text Simplification, etc.

Open Access

An Automatically Generated Annotated Corpus for Albanian Named Entity Recognition

Publication date: 30 Mar 2018
Page range: 95 - 108

Abstract

Named Entity Recognition (NER) is an important task in many NLP pipelines. It has become especially important for the knowledge bases that power many of today's information retrieval systems. To cope with the high demand for annotated training corpora for supervised NER systems, automatic generation approaches have been proposed. In this paper we report on the first automatically generated NE-annotated corpus for Albanian. News articles from Albanian news media were used as the document source. They were automatically tagged using a custom gazetteer generated from the Albanian Wikipedia. Our evaluation results show that this corpus can be used as a baseline corpus for human-annotated ones or as a training corpus where no other is available.

Keywords

  • Named entity recognition
  • natural language processing
  • language corpora
  • semi-automatic annotation
  • information extraction
Open Access

Linking Datasets Using Semantic Textual Similarity

Publication date: 30 Mar 2018
Page range: 109 - 123

Abstract

Linked data has been widely recognized as an important paradigm for representing data, and one of the most important aspects of supporting its use is the discovery of links between datasets. Many datasets contain a significant amount of textual information in the form of labels, descriptions and documentation about their elements, and the foundation of precise linking lies in applying semantic textual similarity to this text. However, most linking tools so far rely only on simple string similarity metrics such as Jaccard scores. We present an evaluation of some metrics that have performed well in recent semantic textual similarity evaluations and apply them to linking existing datasets.

Keywords

  • Linked data
  • link discovery
  • ontology alignment
  • semantic textual similarity
  • structural similarity
  • NLP architectures
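
The Jaccard score named in this abstract as a typical baseline can be sketched in a few lines. The whitespace tokenisation, the threshold value and the function names are assumptions chosen for illustration, and the semantic metrics that the article evaluates are not reproduced here.

```python
def tokens(label):
    """Lower-case a label and split it into a set of word tokens."""
    return set(label.lower().split())


def jaccard(a, b):
    """Jaccard score: size of the token intersection over size of the union."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def link(source_labels, target_labels, threshold=0.5):
    """Pair each source label with its best-scoring target above a threshold."""
    if not target_labels:
        return []
    links = []
    for s in source_labels:
        best = max(target_labels, key=lambda t: jaccard(s, t))
        score = jaccard(s, best)
        if score >= threshold:
            links.append((s, best, score))
    return links
```

The limitation that motivates semantic similarity is easy to see: `jaccard("heart attack", "myocardial infarction")` is 0.0 even though the two labels name the same concept.
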
Open Access

2L-APD: A Two-Level Plagiarism Detection System for Arabic Documents

Publication date: 30 Mar 2018
Page range: 124 - 138

Abstract

Measuring the amount of shared information between two documents is key to addressing a number of Natural Language Processing (NLP) challenges such as Information Retrieval (IR), Semantic Textual Similarity (STS), Sentiment Analysis (SA) and Plagiarism Detection (PD). In this paper, we report a plagiarism detection system based on two layers of assessment: 1) Fingerprinting, which simply compares the documents' fingerprints to detect verbatim reproduction; 2) Word embedding, which uses the semantic and syntactic properties of words to detect much more complicated reproductions. Moreover, Word Alignment (WA), Inverse Document Frequency (IDF) and Part-of-Speech (POS) weighting are applied to the examined documents to support the identification of the words that are most descriptive in each textual unit. In the present work, we focused on Arabic documents and evaluated the performance of the system on a dataset containing three types of plagiarism: 1) Simple reproduction (copy and paste); 2) Word and phrase shuffling; 3) Intelligent plagiarism, including synonym substitution, diacritics insertion and paraphrasing. The results show a recall of 88% and a precision of 86%. Our system outperforms all the systems that participated in the Arabic Plagiarism Detection Shared Task 2015, with a plagiarism detection score (Plagdet) of 83%.

Keywords

  • Plagiarism detection
  • intelligent plagiarism
  • fingerprinting
  • word embedding
  • Arabic language
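
The first layer described in this abstract, fingerprinting, can be illustrated with hashed word k-grams. This is a generic sketch rather than the 2L-APD implementation: the k-gram size, the use of SHA-256 and the containment measure are assumptions, and the example text is English only for readability.

```python
import hashlib


def fingerprints(text, k=3):
    """Hash every run of k consecutive words into a set of fingerprints."""
    words = text.lower().split()
    grams = (" ".join(words[i:i + k]) for i in range(len(words) - k + 1))
    return {hashlib.sha256(g.encode("utf-8")).hexdigest() for g in grams}


def containment(suspicious, source, k=3):
    """Share of the suspicious document's fingerprints found in the source."""
    fs, fo = fingerprints(suspicious, k), fingerprints(source, k)
    return len(fs & fo) / len(fs) if fs else 0.0


source = "the quick brown fox jumps over the lazy dog"
copied = "quick brown fox jumps"      # verbatim reproduction
shuffled = "fox brown quick jumps"    # same words, different order
```

Here `containment(copied, source)` is 1.0 while `containment(shuffled, source)` is 0.0, which shows why fingerprinting alone catches copy and paste but needs a second, semantic layer for shuffled or paraphrased text.
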
Open Access

Neural Network Models for Word Sense Disambiguation: An Overview

Publication date: 30 Mar 2018
Page range: 139 - 151

Abstract

This article presents an overview of the use of artificial neural networks for the task of Word Sense Disambiguation (WSD). More specifically, it surveys the advances in neural language models in recent years that have resulted in methods for the effective distributed representation of linguistic units. Such representations (word embeddings, context embeddings, sense embeddings) can be effectively applied for WSD purposes, as they encode rich semantic information, especially in conjunction with recurrent neural networks, which are able to capture long-distance relations encoded in word order, syntax and information structuring.

Keywords

  • Word sense disambiguation
  • neural networks
  • long short-term memory cells
  • word embeddings
  • sense embeddings
  • context representation
Open Access

Graph-Based Complex Representation in Inter-Sentence Relation Recognition in Polish Texts

Publication date: 30 Mar 2018
Page range: 152 - 170

Abstract

This paper presents a supervised approach to the recognition of Cross-document Structure Theory (CST) relations in Polish texts. Its core is a graph-based representation constructed for sentences. Graphs are built on the basis of lexicalised syntactic-semantic relations extracted from text. Similarity between sentences is calculated as similarity between their graphs, and the values are used as features to train the classifiers. Several different graph configurations, as well as graph similarity methods, were analysed for this task. The approach was evaluated on a large open corpus manually annotated with 17 types of selected CST relations. The experimental setup was similar to those known from SEMEVAL, and we obtained very promising results.

Keywords

  • Cross-document structure theory
  • CST
  • supervised learning
  • graph-based representation
  • logistic model tree
  • LMT
  • support vector machine
  • SVM
Open Access

A Semantic Multi-Field Clinical Search for Patient Medical Records

Publication date: 30 Mar 2018
Page range: 171 - 182

Abstract

A semantic search engine for clinical data would be a substantial aid for hospitals in supporting clinical practitioners. Since patients' electronic medical records contain a variety of information, there is a need to extract meaningful patterns from the Patient Medical Records (PMR). The proposed work matches patients to relevant clinical practice guidelines (CPGs) by matching their medical records with the CPGs. However, in both PMR and CPG, the information pertaining to symptoms, diseases, diagnosis procedures and medicines is not structured, and it needs to be pre-processed and indexed in a meaningful way. To reduce the manual effort of matching records to the clinical guidelines, this work automatically extracts the clinical guidelines from PDF documents using a set of regular expression rules and indexes them with a multi-field index using Lucene. We have attempted a multi-field Lucene search and an ontology-based advanced search, in which the PMR is mapped to the SNOMED core subset to find the important concepts. We found that the ontology-based search engine gave more meaningful results for specific queries than term-based search.

Keywords

  • Semantic similarity
  • application to NLP
  • SNOMED ontology
  • information extraction and text simplification
14 Artykułów
Otwarty dostęp

A Unique Computational Method for Constructing Intervals in Fuzzy Time Series Forecasting

Data publikacji: 30 Mar 2018
Zakres stron: 3 - 10

Abstrakt

Abstract

This research article suggests a computational method for constructing fuzzy sets in absence of expert knowledge. This method uses concepts of central tendencies mean and variance. This study gives a solution to the critical issue in designing of fuzzy systems, number of fuzzy sets. Proposed computational method helps in finding intervals and thereby fuzzy sets for fuzzy time series forecasting. Proposed computational method is implemented on the authentic data for the enrolments of University of Alabama, which is considered as benchmark problem in the field of fuzzy time series. The forecasted values are compared with the results of other methods to state its supremacy. Projected computational method along with Gaussian membership function gave promising results over other methods for fuzzy time series for the above said benchmark data.

Słowa kluczowe

  • Fuzzy logic
  • central tendencies
  • membership function
  • prediction
  • fuzzy time series
Otwarty dostęp

Malicious URLs Detection Using Decision Tree Classifiers and Majority Voting Technique

Data publikacji: 30 Mar 2018
Zakres stron: 11 - 29

Abstrakt

Abstract

Researchers all over the world have provided significant and effective solutions to detect malicious URLs. Still due to the ever changing nature of cyberattacks, there are many open issues. In this paper, we have provided an effective hybrid methodology with new features to deal with this problem. To evaluate our approach, we have used state-of-the-arts supervised decision tree learning classifications models. We have performed our experiments on the balanced dataset. The experimental results show that, by inclusion of new features all the decision tree learning classifiers work well on our labeled dataset, achieving 98-99% detection accuracy with very low False Positive Rate (FPR) and False Negative Rate (FNR). Also we have achieved 99.29% detection accuracy with very low FPR and FNR using majority voting technique, which is better than the wellknown anti-virus and anti-malware solutions.

Słowa kluczowe

  • Static and dynamic analysis
  • feature extraction
  • decision tree learning
  • malicious URLs
  • Web security
Otwarty dostęp

Packet-Level Link Capacity Evaluation for IP Networks

Data publikacji: 30 Mar 2018
Zakres stron: 30 - 40

Abstrakt

Abstract

In recent times, with many applications, the IP networks have become the most powerful tool for sharing information. Best-effort IP interconnected networks deliver data according to the available resources, without any assurance of throughput, delay bounds, or reliability requirements. As a result, their performance is highly variable and cannot be guaranteed. In IP networks, ensuring proper link capacity at the packet level is a challenging problem. In this article, a method to evaluate the link capacity of IP networks at the packet level based on a single server delay system with state-dependent arrival and departure processes is suggested. The dependence of the traffic being carried on the queue length and on the defined waiting time is shown. Presented graphic dependencies allow for defined quality of service, namely the probability of packet loss and admissible delays, to determine the carried traffic of the links.

Słowa kluczowe

  • Link capacity
  • packet level
  • generalised single server delay queue
  • state-dependent arrival process
  • throughput
  • peaked flow
  • queue length
  • carried traffic
  • network congestion
  • best-effort IP network
  • overload regime
  • congestion
  • packet loss
  • packet delay
  • demand-capacity-performance relation
Otwarty dostęp

Comparison of Software Decision Support Systems for Solving a Multicriteria Optimization Problem

Data publikacji: 30 Mar 2018
Zakres stron: 41 - 50

Abstrakt

Abstract

This article describes in details how multicriteria optimization can be applied to solve a typical business problem for resources planning and manufacturing process optimization in a battery factory. We solve the problem by using an interactive software decision support system WebOptim developed at the Institute of information and communication technologies. The entire problem solving process is described step by step in order to point out the problem specific features as well as to demonstrate the capabilities of the WebOptim software system. For comparison, we have solved the same problem by means of another popular decision support system WWW NIMBUS and both solutions are analyzed and discussed.

Słowa kluczowe

  • Multicriteria optimization
  • decision support systems
  • resource planning
Otwarty dostęp

Adaptation of Symmetric Positive Semi-Definite Matrices for the Analysis of Textured Images

Data publikacji: 30 Mar 2018
Zakres stron: 51 - 68

Abstrakt

Abstract

This paper addresses the analysis of textured images using the symmetric positive semi-definite matrix. In particular, a field of symmetric positive semi-definite matrices is used to estimate the structural information represented by the local orientation and the degree of anisotropy in structured and sinusoid-like textured images. In order to ensure faithful local structure estimation, an adaptive algorithm for the regularization of the extent of gradient fields smoothing is proposed. Results obtained on different texture samples show the strength of the proposed method in accurately representing the local variation of orientations in the underlying textured images, which paves the way towards an accurate analysis of the texture structures.

Słowa kluczowe

  • Binary Image
  • Coherence Factor
  • Gaussian Kernel
  • Gradient Field
  • Image Structure
  • Orientation
  • Symmetric Positive Semi-Definite Matrix
  • Textured Images
Otwarty dostęp

Digital Image Steganography Using Bit Flipping

Data publikacji: 30 Mar 2018
Zakres stron: 69 - 80

Abstrakt

Abstract

This article proposes bit flipping method to conceal secret data in the original image. Here a block consists of 2 pixels and thereby flipping one or two LSBs of the pixels to hide secret information in it. It exists in two variants. Variant-1 and Variant-2 both use 7th and 8th bit of a pixel to conceal the secret data. Variant-1 hides 3 bits per a pair of pixels and the Variant-2 hides 4 bits per a pair of pixels. Our proposed method notably raises the capacity as well as bits per pixel that can be hidden in the image compared to existing bit flipping method. The image steganographic parameters such as, Peak Signal to Noise Ratio (PSNR), hiding capacity, and the Quality Index (Q.I) of the proposed techniques has been compared with the results of the existing bit flipping technique and some of the state of art article.

Słowa kluczowe

  • Steganography
  • Least Significant Bit (LSB) substitution
  • bit flipping
  • capacity
Otwarty dostęp

Classification of Mental Tasks from EEG Signals Using Spectral Analysis, PCA and SVM

Data publikacji: 30 Mar 2018
Zakres stron: 81 - 92

Abstrakt

Abstract

Signals provided by the ElectroEncephaloGraphy (EEG) are widely used in Brain-Computer Interface (BCI) applications. They can be further analyzed and used for thinking activity recognition. In this paper we proposed an algorithm that is able to recognize five mental tasks using 6 channel EEG data. The main idea is to separate the raw EEG signals into several frames and compute their spectrums. Next, a second-order derivative of Gaussian is applied to extract features and an optimum Gaussian kernel parameters grid search is performed with the help of cross-validation. The extracted features are further reduced by Principal Component Analysis. The processed data is utilized to train SVM classifier which is used for mental tasks recognition afterwards. The performance of the algorithm is estimated on publically available dataset. In terms of 5 folds cross-validation we obtained an average of 82.7% recognition rate (accuracy). Additional experiments were conducted using leave-one-out cross-validation where 67.2% correct classification was reported. Comparison to several state-of-the art methods reveals the advantages of the proposed algorithm.

Słowa kluczowe

  • ElectroEncephaloGraphy (EEG)
  • Brain Computer Interface (BCI)
  • Fast Fourier Transform (FFT)
  • Principal Component Analysis (PCA)
  • Support Vector Machine (SVM)
Otwarty dostęp

Special Thematic Section on Semantic Models for Natural Language Processing (Preface)

Data publikacji: 30 Mar 2018
Zakres stron: 93 - 94

Abstrakt

Abstract

With the availability of large language data online, cross-linked lexical resources (such as BabelNet, Predicate Matrix and UBY) and semantically annotated corpora (SemCor, OntoNotes, etc.), more and more applications in Natural Language Processing (NLP) have started to exploit various semantic models. The semantic models have been created on the base of LSA, clustering, word embeddings, deep learning, neural networks, etc., and abstract logical forms, such as Minimal Recursion Semantics (MRS) or Abstract Meaning Representation (AMR), etc.

Additionally, the Linguistic Linked Open Data Cloud has been initiated (LLOD Cloud) which interlinks linguistic data for improving the tasks of NLP. This cloud has been expanding enormously for the last four-five years. It includes corpora, lexicons, thesauri, knowledge bases of various kinds, organized around appropriate ontologies, such as LEMON. The semantic models behind the data organization as well as the representation of the semantic resources themselves are a challenge to the NLP community.

The NLP applications that extensively rely on the above discussed models include Machine Translation, Information Extraction, Question Answering, Text Simplification, etc.

Otwarty dostęp

An Automatically Generated Annotated Corpus for Albanian Named Entity Recognition

Data publikacji: 30 Mar 2018
Zakres stron: 95 - 108

Abstrakt

Abstract

Named Entity Recognition (NER) is an important task in many NLP pipelines. It has become especially important for knowledge bases that power many of the nowadays information retrieval systems. In order to cope with the high demand for annotated training corpora for supervised NER systems, automatic generation approaches have been proposed. In this paper we report on the first automatically generated NE annotated corpus for Albanian. News articles from Albanian news media were used as a document source. They were automatically tagged using a custom generated gazetteer from the Albanian Wikipedia. Our evaluation results show that this corpus can be used as a baseline corpus for human annotated ones or as a training corpus where no other is available.

Słowa kluczowe

  • Named entity recognition
  • natural language processing
  • language corpora
  • semi-automatic annotation
  • information extraction
Otwarty dostęp

Linking Datasets Using Semantic Textual Similarity

Data publikacji: 30 Mar 2018
Zakres stron: 109 - 123

Abstrakt

Abstract

Linked data has been widely recognized as an important paradigm for representing data and one of the most important aspects of supporting its use is discovery of links between datasets. For many datasets, there is a significant amount of textual information in the form of labels, descriptions and documentation about the elements of the dataset and the fundament of a precise linking is in the application of semantic textual similarity to link these datasets. However, most linking tools so far rely on only simple string similarity metrics such as Jaccard scores. We present an evaluation of some metrics that have performed well in recent semantic textual similarity evaluations and apply these to linking existing datasets.

Keywords

  • Linked data
  • link discovery
  • ontology alignment
  • semantic textual similarity
  • structural similarity
  • NLP architectures

Open access

2L-APD: A Two-Level Plagiarism Detection System for Arabic Documents

Published: 30 Mar 2018
Pages: 124-138

Abstract

Measuring the amount of shared information between two documents is key to addressing a number of Natural Language Processing (NLP) challenges such as Information Retrieval (IR), Semantic Textual Similarity (STS), Sentiment Analysis (SA) and Plagiarism Detection (PD). In this paper, we report a plagiarism detection system based on two layers of assessment: 1) fingerprinting, which simply compares the documents' fingerprints to detect verbatim reproduction; 2) word embedding, which uses the semantic and syntactic properties of words to detect much more complicated reproductions. Moreover, Word Alignment (WA), Inverse Document Frequency (IDF) and Part-of-Speech (POS) weighting are applied to the examined documents to help identify the words that are most descriptive in each textual unit. In the present work, we focused on Arabic documents and evaluated the performance of the system on a data set containing three types of plagiarism: 1) simple reproduction (copy and paste); 2) word and phrase shuffling; 3) intelligent plagiarism, including synonym substitution, diacritics insertion and paraphrasing. The results show a recall of 88% and a precision of 86%. Compared to the results obtained by the systems participating in the Arabic Plagiarism Detection Shared Task 2015, our system outperforms all of them with a plagiarism detection score (Plagdet) of 83%.
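
The first level can be pictured with a generic fingerprinting sketch: hash every run of k consecutive words and measure how many of the suspicious document's hashes also occur in the source. The shingle size, the hash function and the names `fingerprints` and `containment` are assumptions; the abstract does not specify the system's actual fingerprinting scheme.

```python
import hashlib
import re


def fingerprints(text, k=3):
    """Hash every run of k consecutive words (a word k-gram, or shingle)."""
    words = re.findall(r"\w+", text.lower())
    shingles = (" ".join(words[i:i + k]) for i in range(len(words) - k + 1))
    return {hashlib.sha256(s.encode("utf-8")).hexdigest()[:16] for s in shingles}


def containment(suspicious, source, k=3):
    """Fraction of the suspicious text's fingerprints also found in the source."""
    fs, fo = fingerprints(suspicious, k), fingerprints(source, k)
    if not fs:
        return 0.0  # text shorter than k words yields no fingerprints
    return len(fs & fo) / len(fs)
```

A verbatim excerpt gets a containment of 1.0, while the same words in shuffled order can drop to 0.0, which suggests why a second, embedding-based level is needed for the harder cases.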

Keywords

  • Plagiarism detection
  • intelligent plagiarism
  • fingerprinting
  • word embedding
  • Arabic language

Open access

Neural Network Models for Word Sense Disambiguation: An Overview

Published: 30 Mar 2018
Pages: 139-151

Abstract

This article presents an overview of the use of artificial neural networks for the task of Word Sense Disambiguation (WSD). More specifically, it surveys the advances in neural language models in recent years that have resulted in methods for the effective distributed representation of linguistic units. Such representations (word embeddings, context embeddings, sense embeddings) can be effectively applied for WSD purposes, as they encode rich semantic information, especially in conjunction with recurrent neural networks, which are able to capture long-distance relations encoded in word order, syntax, and information structuring.

Keywords

  • Word sense disambiguation
  • neural networks
  • long short-term memory cells
  • word embeddings
  • sense embeddings
  • context representation

Open access

Graph-Based Complex Representation in Inter-Sentence Relation Recognition in Polish Texts

Published: 30 Mar 2018
Pages: 152-170

Abstract

This paper presents a supervised approach to the recognition of Cross-document Structure Theory (CST) relations in Polish texts. Its core is a graph-based representation constructed for sentences. Graphs are built on the basis of lexicalised syntactic-semantic relations extracted from text. Similarity between sentences is calculated as similarity between their graphs, and the values are used as features to train the classifiers. Several different graph configurations, as well as graph similarity methods, were analysed for this task. The approach was evaluated on a large open corpus annotated manually with 17 types of selected CST relations. The configuration of experiments was similar to those known from SEMEVAL, and we obtained very promising results.

Keywords

  • Cross-document structure theory
  • CST
  • supervised learning
  • graph-based representation
  • logistic model tree
  • LMT
  • support vector machine
  • SVM

Open access

A Semantic Multi-Field Clinical Search for Patient Medical Records

Published: 30 Mar 2018
Pages: 171-182

Abstract

A semantic search engine for clinical data would substantially help hospitals support clinical practitioners. Since patients' electronic medical records contain a variety of information, there is a need to extract meaningful patterns from Patient Medical Records (PMRs). The proposed work matches patients to relevant clinical practice guidelines (CPGs) by matching their medical records with the CPGs. However, in both PMRs and CPGs, the information pertaining to symptoms, diseases, diagnostic procedures and medicines is not structured, so it must be pre-processed and indexed in a meaningful way. To reduce the manual effort of matching against the clinical guidelines, this work automatically extracts the guidelines from PDF documents using a set of regular expression rules and indexes them with a multi-field index using Lucene. We have attempted a multi-field Lucene search and an ontology-based advanced search, where the PMR is mapped to the SNOMED core subset to find the important concepts. We found that the ontology-based search engine gave more meaningful results for specific queries than term-based search.
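
The idea of regex-based extraction feeding a multi-field index can be sketched in plain Python as follows. This is a toy stand-in, not Lucene and not the authors' rule set: the section headings (Symptoms, Diagnosis, Treatment), the document ids and the function names are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical section headings; real guideline documents will differ.
FIELD_PATTERN = re.compile(
    r"^(Symptoms|Diagnosis|Treatment):[ \t]*(.+)$",
    re.IGNORECASE | re.MULTILINE,
)


def extract_fields(text):
    """Map each recognised heading (lower-cased) to the text on its line."""
    return {name.lower(): body.strip() for name, body in FIELD_PATTERN.findall(text)}


def build_index(docs):
    """Build an inverted index of the form field -> term -> set of doc ids."""
    index = defaultdict(lambda: defaultdict(set))
    for doc_id, text in docs.items():
        for field, body in extract_fields(text).items():
            for term in re.findall(r"\w+", body.lower()):
                index[field][term].add(doc_id)
    return index


def search(index, field, term):
    """Return the ids of documents whose given field contains the term."""
    return index.get(field, {}).get(term.lower(), set())
```

Keeping one posting list per field lets a query for a term in the symptoms field ignore guidelines that mention the same term only under treatment.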

Keywords

  • Semantic similarity
  • application to NLP
  • SNOMED ontology
  • information extraction and text simplification
