Journal Details
Format
Journal
eISSN
2449-6499
First Published
30 Dec 2014
Publication timeframe
4 times per year
Languages
English

Volume 10 (2020): Issue 4 (October 2020)

5 Articles
Open Access

Browser Fingerprint Coding Methods Increasing the Effectiveness of User Identification in the Web Traffic

Published Online: 15 Jun 2020
Page range: 243 - 253

Abstract

A web-based browser fingerprint (or device fingerprint) is a tool used to identify and track user activity in web traffic. It is also used to identify computers that abuse online advertising and to prevent credit card fraud. A device fingerprint is created by extracting multiple parameter values from a browser API (e.g. operating system type or browser version). The acquired parameter values are then passed through a hash function to create a hash. The disadvantage of this method is its excessive susceptibility to small, normally occurring changes (e.g. a change in the browser version number or screen resolution). Minor changes in the input values generate a completely different fingerprint hash, making it impossible to find similar ones in the database. On the other hand, omitting these unstable values when creating a hash significantly limits the ability of the fingerprint to distinguish between devices. This weak point is commonly exploited by fraudsters who knowingly evade this form of protection by deliberately changing the values of device parameters. The paper presents methods that significantly limit this type of activity. New algorithms for coding and comparing fingerprints are presented, in which the values of parameters with low stability and low entropy are given particular attention. The fingerprint generation methods are based on the popular MinHash, LSH, and autoencoder methods. The effectiveness of coding and comparison for each of the presented methods was also examined against the currently used hash generation method. Authentic data on the devices and browsers of users visiting 186 different websites were collected for the research.
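The contrast between an exact hash and a similarity-preserving code can be illustrated with a minimal sketch. It is not the authors' implementation: the parameter names, the example values, and the helper functions `exact_hash`, `minhash_signature`, and `estimated_similarity` are assumptions made for illustration, and the MinHash variant shown is the generic textbook one built on salted SHA-256.

```python
import hashlib


def exact_hash(params: dict) -> str:
    """Conventional fingerprint: one cryptographic hash over all parameter values."""
    canonical = "|".join(f"{key}={params[key]}" for key in sorted(params))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def minhash_signature(params: dict, num_hashes: int = 128) -> list:
    """Generic MinHash signature over the set of 'name=value' tokens."""
    tokens = [f"{key}={value}".encode("utf-8") for key, value in params.items()]
    signature = []
    for seed in range(num_hashes):
        prefix = str(seed).encode("utf-8") + b":"
        # Salting SHA-256 with the seed simulates a family of hash functions;
        # the minimum hashed token is kept for each of them.
        signature.append(min(
            int.from_bytes(hashlib.sha256(prefix + token).digest()[:8], "big")
            for token in tokens
        ))
    return signature


def estimated_similarity(sig_a: list, sig_b: list) -> float:
    """Fraction of matching positions, an estimate of the Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


if __name__ == "__main__":
    # Hypothetical device parameters; only the browser version differs.
    before = {"os": "Windows 10", "browser": "Firefox", "version": "76.0",
              "screen": "1920x1080", "timezone": "UTC+1", "language": "pl-PL",
              "cores": 8, "touch": False}
    after = dict(before, version="77.0")

    print(exact_hash(before) == exact_hash(after))
    print(estimated_similarity(minhash_signature(before), minhash_signature(after)))
```

With one of eight parameters changed, the exact hashes no longer match, while the two MinHash signatures still agree in most positions, so a near-duplicate lookup remains possible.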

Keywords

  • browser fingerprint
  • device fingerprint
  • LSH algorithm
  • autoencoder
Open Access

Data-Driven Temporal-Spatial Model for the Prediction of AQI in Nanjing

Published Online: 15 Jun 2020
Page range: 255 - 270

Abstract

Air quality prediction in urban areas is of great significance for controlling air pollution and protecting public health. The prediction of air quality at monitoring stations is well studied in existing research. However, air quality monitoring stations are insufficient in most cities, and air quality varies dramatically from one place to another due to complex factors. A novel model is established in this paper to estimate and predict the Air Quality Index (AQI) of areas without monitoring stations in Nanjing. The proposed model predicts AQI in a non-monitoring area in both the temporal and the spatial dimension. The temporal model is presented first; it is based on an enhanced k-Nearest Neighbor (KNN) algorithm that predicts the AQI values among monitoring stations, and the acceptability of the results reaches 92% for one-hour prediction. Meanwhile, to forecast the evolution of air quality in the spatial dimension, a Back Propagation (BP) neural network that considers geographical distance is used. Furthermore, to improve the accuracy and adaptability of the spatial model, the similarity of topological structure is introduced. Finally, the temporal-spatial model is built and its adaptability is tested on a specific non-monitoring site, the Jiulonghu Campus of Southeast University. The result demonstrates that the acceptability reaches 73.8% on average. The paper provides strong evidence that the proposed non-parametric, data-driven approach to air quality forecasting yields promising results.
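The idea of a nearest-neighbour forecast in the temporal dimension can be sketched as follows. This is plain kNN over sliding windows, not the enhanced variant the abstract refers to; the function name `knn_forecast`, its parameters, and the sample readings are hypothetical.

```python
def knn_forecast(series, window=3, k=2):
    """Predict the next value as the mean of the values that followed
    the k past windows most similar to the latest window."""
    if len(series) <= window:
        raise ValueError("series must be longer than the window")
    query = series[-window:]
    candidates = []
    # Only windows that have a known successor value are considered,
    # which also excludes the query window itself.
    for start in range(len(series) - window):
        past = series[start:start + window]
        successor = series[start + window]
        distance = sum((a - b) ** 2 for a, b in zip(past, query)) ** 0.5
        candidates.append((distance, successor))
    candidates.sort(key=lambda pair: pair[0])
    nearest = candidates[:k]
    return sum(value for _, value in nearest) / len(nearest)


if __name__ == "__main__":
    # Made-up hourly AQI readings with a repeating pattern.
    hourly_aqi = [50, 60, 70, 50, 60, 70, 50, 60]
    print(knn_forecast(hourly_aqi, window=2, k=2))
```

Because the method stores past observations and compares against them directly, it needs no parametric model of how air quality evolves, which is the sense in which such an approach is non-parametric and data-driven.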

Keywords

  • Air quality prediction
  • k-Nearest Neighbor
  • BP neural network
  • Non-monitoring stations
Open Access

Triangular Fuzzy-Rough Set Based Fuzzification of Fuzzy Rule-Based Systems

Published Online: 15 Jun 2020
Page range: 271 - 285

Abstract

In real-world approximation problems, precise input data are costly to obtain. Therefore, fuzzy methods devoted to uncertain data are a focus of current research. Consequently, a method based on fuzzy-rough sets for fuzzification of inputs in a rule-based fuzzy system is discussed in this paper. A triangular membership function is applied to describe the nature of imprecision in data. Firstly, triangular fuzzy partitions are introduced to approximate common antecedent fuzzy rule sets. As a consequence of the proposed method, we obtain a structure of a general (non-interval) type-2 fuzzy logic system in which secondary membership functions are cropped triangular. Then, the possibility of applying so-called regular triangular norms is discussed. Finally, an experimental system constructed on precise data, which is then transformed and verified for uncertain data, is provided to demonstrate its basic properties.
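For readers unfamiliar with the term, a triangular membership function can be sketched in a few lines. This shows only the standard definition with feet `a`, `c` and peak `b`; the function name and parameters are assumptions, and the fuzzy-rough construction and cropped secondary membership functions of the paper are not reproduced here.

```python
def triangular(x, a, b, c):
    """Membership degree of x in a triangular fuzzy set that is 0 at a and c
    and reaches 1 at the peak b."""
    if not a < b < c:
        raise ValueError("expected a < b < c")
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)  # rising edge
    return (c - x) / (c - b)      # falling edge


if __name__ == "__main__":
    # An imprecise measurement "about 5" modelled on the interval [0, 10].
    for reading in (0, 2.5, 5, 7.5, 10):
        print(reading, triangular(reading, 0, 5, 10))
```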

Keywords

  • general type-2 fuzzy logic systems
  • fuzzy-rough fuzzification
  • regular type-2 t-norms
  • cropped triangular secondary membership functions
Open Access

A Novel Drift Detection Algorithm Based on Features’ Importance Analysis in a Data Streams Environment

Published Online: 15 Jun 2020
Page range: 287 - 298

Abstract

The training set consists of many features that influence the classifier to different degrees. Choosing the most important features and rejecting those that do not carry relevant information is of great importance to the operation of the learned model. In the case of data streams, the importance of the features may additionally change over time. Such changes affect the performance of the classifier but can also be an important indicator of concept drift. In this work, we propose a new algorithm for data stream classification, called Random Forest with Features Importance (RFFI), which uses the measure of feature importance as a drift detector. The RFFI algorithm adapts solutions inspired by the Random Forest algorithm to data stream scenarios. The proposed algorithm combines the ability of ensemble methods to handle slow changes in a data stream with a new method for detecting concept drift. The work contains an experimental analysis of the proposed algorithm, carried out on synthetic and real data.

Keywords

  • data stream mining
  • random forest
  • features importance
Open Access

Local Levenberg-Marquardt Algorithm for Learning Feedforward Neural Networks

Published Online: 15 Jun 2020
Page range: 299 - 316

Abstract

This paper presents a local modification of the Levenberg-Marquardt algorithm (LM). First, the mathematical basics of the classic LM method are shown. The classic LM algorithm is very efficient for learning small neural networks. For bigger neural networks, however, the computational complexity grows significantly, which makes the method practically inefficient. To overcome this limitation, a local modification of the LM is introduced in this paper. The main goal of this paper is to develop a more computationally efficient modification of the LM method by using local computation. The introduced modification has been tested on the following benchmarks: function approximation and classification problems. The obtained results have been compared to the performance of the classic LM method. The paper shows that the local modification of the LM method significantly improves the algorithm's performance for bigger networks. Several proposals for future work are suggested.
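As background, the classic LM weight update is commonly written as below. The notation is an assumption and may differ from the paper's: $\mathbf{w}$ is the vector of all $n$ network weights, $\mathbf{e}$ is the error vector (network output minus target), $\mathbf{J} = \partial \mathbf{e} / \partial \mathbf{w}$ is its Jacobian, and $\mu > 0$ is the damping factor.

```latex
\Delta \mathbf{w} = -\left(\mathbf{J}^{\top}\mathbf{J} + \mu \mathbf{I}\right)^{-1} \mathbf{J}^{\top}\mathbf{e}
```

The matrix $\mathbf{J}^{\top}\mathbf{J} + \mu \mathbf{I}$ has size $n \times n$, so solving the system with a dense method scales roughly cubically in the number of weights. This is why the classic method becomes costly for bigger networks.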

Keywords

  • feed-forward neural network
  • neural network learning algorithm
  • optimization problem
  • Levenberg-Marquardt algorithm
  • QR decomposition
  • Givens rotation