<abstract xmlns="http://www.w3.org/1999/xhtml">

Outlier detection aims to find a data sample that is significantly different from other data samples. Various outlier detection methods have been proposed and have been shown to detect anomalies in many practical problems. However, in high-dimensional data, conventional outlier detection methods often behave unexpectedly due to a phenomenon called the curse of dimensionality. In this paper, we compare and analyze outlier detection performance in various experimental settings, focusing on text data with dimensions typically in the tens of thousands. Experimental setups were simulated to compare the performance of outlier detection methods in unsupervised versus semi-supervised mode and on uni-modal versus multi-modal data distributions. The performance of outlier detection methods based on dimension reduction is compared, and a discussion on using k-NN distance in high-dimensional data is also provided. Analysis through experimental comparison in various environments can provide insights into the application of outlier detection methods in high-dimensional data.
</abstract>

A Comparative Study for Outlier Detection Methods in High Dimensional Text Data
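
The abstract above mentions using k-NN distance as an outlier score. As a rough illustration only, the sketch below scores each point by the Euclidean distance to its k-th nearest neighbour. The function name `knn_distance_scores`, the choice of Euclidean distance, and the toy 2-D points are assumptions made for this example; they are not taken from the paper, which works with text data in tens of thousands of dimensions.

```python
import math


def knn_distance_scores(points, k):
    """Score each point by the Euclidean distance to its k-th nearest neighbour.

    Hypothetical helper for illustration; larger scores suggest outliers.
    """
    if not 1 <= k < len(points):
        raise ValueError("k must satisfy 1 <= k < number of points")
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Toy data (assumed): four clustered points and one distant point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
scores = knn_distance_scores(points, k=2)
outlier_index = max(range(len(points)), key=scores.__getitem__)
```

In this toy setting the distant point receives the largest score. In very high-dimensional data, distances tend to concentrate, which is one reason the abstract treats k-NN distance as a topic for discussion rather than a settled choice.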

<abstract xmlns="http://www.w3.org/1999/xhtml">

Nowadays, applied computer-oriented and information digitalization technologies are developing very dynamically and are widely used in various industries. One of the highest priority sectors of the economy of Ukraine and other countries around the world, the needs of which require intensive implementation of high-performance information technologies, is agriculture. The purpose of the article is to synthesise scientific and practical provisions to improve the information technology of the comprehensive monitoring and control of microclimate in industrial greenhouses. The object of research is non-stationary processes of aggregation and transformation of measurement data on soil and climatic conditions of the greenhouse microclimate. The subject of research is methods and models of computer-oriented analysis of measurement data on the soil and climatic state of the greenhouse microclimate. The main scientific and practical effect of the article is the development of the theory of intelligent information technologies for monitoring and control of greenhouse microclimate through the development of methods and models of distributed aggregation and intellectualised transformation of measurement data based on fuzzy logic.
</abstract>

Department of Software of Computer Systems

SHEI ’Donetsk National Technical University’ of the Ministry of Education and Science of Ukraine

Information Technology for Comprehensive Monitoring and Control of the Microclimate in Industrial Greenhouses Based on Fuzzy Logic

<abstract xmlns="http://www.w3.org/1999/xhtml">

Nowadays, textual information grows exponentially on the Internet. Text summarization (TS) plays a crucial role in handling this massive amount of textual content. Manual TS is time-consuming and impractical in some applications with a huge amount of textual information. Automatic text summarization (ATS) is an essential technology to overcome the aforementioned challenges. Non-negative matrix factorization (NMF) is a useful tool for extracting semantic contents from textual data. Existing NMF approaches focus only on how factorized matrices should be modeled, and neglect the relationships among sentences. These relationships provide better factorization for TS. This paper suggests a novel non-negative matrix factorization for text summarization (NMFTS). The proposed ATS model places regularization on pairwise sentence vectors. A new cost function based on the Frobenius norm is designed, and an algorithm is developed to minimize this function by proposing iterative updating rules. The proposed NMFTS extracts semantic content by reducing the size of documents and mapping the same sentences closely together in the latent topic space. Compared with the basic NMF, the convergence time of the proposed method does not grow. The convergence proof of the NMFTS and empirical results on the benchmark data sets show that the suggested updating rules converge fast and achieve superior results compared to other methods.
</abstract>

Automatic Extractive and Generic Document Summarization Based on NMF
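
For readers unfamiliar with the "basic NMF" that the abstract above uses as its baseline, the sketch below shows the standard multiplicative update rules that minimize the Frobenius norm of V - WH. It is not the proposed NMFTS: the sentence-pair regularization is omitted, and the function names, the toy term-by-sentence matrix, the iteration count, and the small `eps` guard are all assumptions made for this example.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Basic NMF with multiplicative updates (no sentence regularization)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


# Toy term-by-sentence matrix (assumed): rows are terms, columns are sentences.
v = [[1, 1, 0, 0],
     [1, 1, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 1, 1]]
w, h = nmf(v, rank=2)
error = frobenius_error(v, w, h)
```

Each column of `h` gives a sentence's weights over the latent topics. A regularized variant such as the one the abstract describes would add a term to the cost function and change the update rules accordingly.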

<abstract xmlns="http://www.w3.org/1999/xhtml">

Introducing variation in the training dataset through data augmentation has been a popular technique to make Convolutional Neural Networks (CNNs) spatially invariant, but it leads to increased dataset volume and computation cost. Instead of data augmentation, augmentation of feature maps is proposed to introduce variations in the features extracted by a CNN. To achieve this, a rotation transformer layer called Rotation Invariance Transformer (RiT) is developed, which applies rotation transformation to augment CNN features. The RiT layer can be used to augment output features from any convolution layer within a CNN. However, its maximum effectiveness is shown when placed at the output end of the final convolution layer. We test RiT in the application of scale-invariance, where we attempt to classify scaled images from benchmark datasets. Our results show promising improvements in the network's ability to be scale invariant whilst keeping the model computation cost low.
</abstract>

Feature Map Augmentation to Improve Scale Invariance in Convolutional Neural Networks

<abstract xmlns="http://www.w3.org/1999/xhtml">

Real-life applications of deep learning (DL) are often limited by the lack of expert-labeled data required to effectively train DL models. Creation of such data usually requires a substantial amount of time for manual categorization, which is costly and is considered to be one of the major impediments to the development of DL methods in many areas. This work proposes a classification approach which completely removes the need for costly expert-labeled data and utilizes noisy web data created by users who are not subject matter experts. The experiments are performed with two well-known Convolutional Neural Network (CNN) architectures, VGG16 and ResNet50, trained on three randomly collected Instagram-based sets of images from three distinct domains: metropolitan cities, popular food and common objects. The last two sets were compiled by the authors and made freely available to the research community. The dataset containing common objects is a webly counterpart of the PascalVOC2007 set. It is demonstrated that despite a significant amount of label noise in the training data, application of the proposed approach paired with a standard CNN training protocol leads to high classification accuracy on representative data in all three above-mentioned domains. Additionally, two straightforward procedures for automatic cleaning of the data, before its use in the training process, are proposed. Apparently, data cleaning does not lead to improvement of results, which suggests that the presence of noise in webly data is actually helpful in learning meaningful and robust class representations. Manual inspection of a subset of web-based test data shows that labels assigned to many images are ambiguous even for humans. It is our conclusion that for the datasets and CNN architectures used in this paper, in case of training with webly data, a major factor contributing to the final classification accuracy is representativeness of test data rather than application of data cleaning procedures.
</abstract>

Training CNN Classifiers Solely on Webly Data

AHEAD OF PRINT

Volume 14 (2024): Issue 3 (June 2024)

Volume 14 (2024): Issue 2 (March 2024)

Volume 14 (2024): Issue 1 (January 2024)

Volume 13 (2023): Issue 4 (October 2023)

Volume 13 (2023): Issue 3 (June 2023)

Volume 13 (2023): Issue 2 (March 2023)

Volume 13 (2023): Issue 1 (January 2023)

Volume 12 (2022): Issue 4 (October 2022)

Volume 12 (2022): Issue 3 (July 2022)

Volume 12 (2022): Issue 2 (April 2022)

Volume 12 (2022): Issue 1 (January 2022)

Volume 11 (2021): Issue 4 (October 2021)

Volume 11 (2021): Issue 3 (July 2021)

Volume 11 (2021): Issue 2 (April 2021)

Volume 11 (2021): Issue 1 (January 2021)

Volume 10 (2020): Issue 4 (October 2020)

Volume 10 (2020): Issue 3 (July 2020)

Volume 10 (2020): Issue 2 (April 2020)

Volume 10 (2020): Issue 1 (January 2020)

Volume 9 (2019): Issue 4 (October 2019)

Volume 9 (2019): Issue 3 (July 2019)

Volume 9 (2019): Issue 2 (April 2019)

Volume 9 (2019): Issue 1 (January 2019)

Volume 8 (2018): Issue 4 (October 2018)

Volume 8 (2018): Issue 3 (July 2018)

Volume 8 (2018): Issue 2 (April 2018)

Volume 8 (2018): Issue 1 (January 2018)

Volume 7 (2017): Issue 4 (October 2017)

Volume 7 (2017): Issue 3 (July 2017)

Volume 7 (2017): Issue 2 (April 2017)

Volume 7 (2017): Issue 1 (January 2017)

Volume 6 (2016): Issue 4 (October 2016)

Volume 6 (2016): Issue 3 (July 2016)

Volume 6 (2016): Issue 2 (April 2016)

Volume 6 (2016): Issue 1 (January 2016)

Volume 5 (2015): Issue 4 (October 2015)

Volume 5 (2015): Issue 3 (July 2015)

Volume 5 (2015): Issue 2 (April 2015)

Volume 5 (2015): Issue 1 (January 2015)

Volume 4 (2014): Issue 4 (October 2014)

Volume 4 (2014): Issue 3 (July 2014)

Volume 4 (2014): Issue 2 (April 2014)

Volume 4 (2014): Issue 1 (January 2014)

Volume 3 (2013): Issue 4 (October 2013)

Volume 3 (2013): Issue 3 (July 2013)

Volume 3 (2013): Issue 2 (April 2013)

Volume 3 (2013): Issue 1 (January 2013)

Journal of Artificial Intelligence and Soft Computing Research

Print version published by: The Journal of the Polish Neural Network Society and the University of Social Sciences in Lodz

Editor-in-Chief: Prof. Leszek Rutkowski, IEEE Fellow

Frequency of publishing: quarterly

Months of publication: January, April, July, October

ISSN (print): 2083-2567

ISSN (on-line): 2449-6499

Journal of Artificial Intelligence and Soft Computing Research is a dynamically developing international journal focused on the latest scientific results and methods constituting traditional artificial intelligence methods and soft computing techniques. Our goal is to bring together scientists representing both approaches and various research communities.

Why JAISCR?

Premier source of high quality research
We have practice in publishing innovative articles dealing with various aspects of artificial intelligence methods and soft computing
Extensive scope
We are focused on permanent development of our journal

JAISCR research areas

JAISCR publishes high quality, innovative research results in various areas of artificial intelligence and soft computing. These areas include, but are not limited to:

AI in Modelling and Simulation
AI in Scheduling and Optimization
Automated Reasoning and Inference
Bioinformatics
Cognitive Aspects of AI
Computer Vision and Speech Understanding
Data Mining
Distributed Intelligent Processing
Evolutionary Design
Expert Systems
Fuzzy Modelling and Control
Hardware Implementations
Heuristic Search
Hybrid Models
Information Retrieval
Intelligent Database Systems
Knowledge Engineering
Mechatronics
Multi-agent Systems
Natural Language Processing
Neural Network Theory and Architectures
Neuro-informatics and Bio-inspired Models
Pattern Recognition
Robotics and Related Fields
Rough Sets Theory: Foundations and Applications
Supervised and Unsupervised Learning
Swarm Intelligence and Systems
Web Intelligence Applications & Search
Various Applications

Submitting to JAISCR - a great advantage for the authors

Why?

Fair and thorough peer review (at least three referees are required for the acceptance of each article)
Fast and efficient publication process
Polishing the structure and technical format of each submission
Open access (all articles published in JAISCR are open access - freely available on Sciendo platform)
Permanence (because of the open access form, permanent accessibility is assured)
High visibility within the field (the author's work is freely accessible to a global audience. In addition, articles are available through Sciendo)
Promotion of the published papers in the scientific community

Archiving

Sciendo archives the contents of this journal in Portico - digital long-term preservation service of scholarly books, journals and collections.

Plagiarism Policy

The editorial board is participating in a growing community of Similarity Check System's users in order to ensure that the content published is original and trustworthy. Similarity Check is a medium that allows for comprehensive manuscripts screening, aimed to eliminate plagiarism and provide a high standard and quality peer-review process.