Sentiment analysis of reviews on Cappadocia: The land of beautiful horses in the eyes of tourists

The Cappadocia region is one of the most popular tourist destinations in Turkey, and its tourism sector has a significant share in the Turkish economy. In this study, we scraped TripAdvisor reviews of visitors of the Cappadocia region with the Python programming language and used them to analyse public sentiment using various supervised machine learning algorithms. The main purpose of the study is to help create competitive intelligence on both regional and global scales using social media data. For this, we applied Random Forest, Naïve Bayes, and Support Vector Machine methods to classify 4,770 reviews and get insights about the visitors’ perspectives. Results show that the majority of the tourists (90%) had a positive experience during their visit. Most of the complaints focused on the attitudes of staff members. In addition, all three supervised machine learning methods achieved high accuracy in their classification of the reviews. This study is significant in terms of providing a meaningful database for understanding visitor comments, the most important data for the development of tourism in the region, through state-of-the-art machine learning methods, and to direct improvements accordingly.


Introduction
Social media has an important role as a source of information for travellers around the world. Social networks have led to widespread changes in "business-to-business", "business-to-customer" and "customer-to-customer" communications. The Internet has evolved from a broadcast medium to a participatory platform that allows people to themselves be the "media" for collaboration and information sharing (Leung et al., 2013). According to a 2021 Statista report, there were 859 million reviews on TripAdvisor as of 2019 (Statista, 2021b). In July 2019 alone, 224 million people visited the TripAdvisor website. The website Booking.com, another popular and well-known travel platform, was visited 697 million times in the same period (Condor Ferries, 2021). In addition, Twitter had an average of 330 million active users per month in 2019 (Statista, 2021a).
These statistics clearly show that the dominance of social media in our lives has increased. With this increase, corporate communication has become democratized (Kietzmann et al., 2011). The ability to share information through social media has created significant changes in the bargaining power of consumers and decreased information asymmetries. Stories and experiences shared by people across many online platforms (blogs, microblogs, Twitter, Facebook, TripAdvisor, Booking.com, YouTube, etc.) have made social media into a megatrend, significantly affecting the tourism sector (Leung et al., 2013).
The tourism industry, which has more stakeholders than many other industries, is the largest job-creating sector on the planet. Developing technology and the internet play a significant role for tourism organizations and destinations in increasing their market share. Advances in search engines and social networks have increased the number of people who plan and experience their travels around the world. These developments have also affected the efficiency of tourism organizations and their business models, as well as the way consumers communicate with these organizations. Many new players have entered the tourism sector since these developments, and the importance of tourism for a growing number of national and regional economies has been increasingly recognized (UNWTO, 2001; Buhalis & Law, 2008). In short, the internet and social networks have significantly changed the distribution channels of tourism-related information, as well as travel planning and consumption habits (Xiang & Gretzel, 2010).
Tourism is both a service and a knowledge-intensive industry. Therefore, it is essential to understand how changes in technologies and consumer behaviour are affecting the distribution and availability of travel-related information. Technology and social media are needed to create an avant-garde tourism experience. Information and communication technologies are powerful tools for reshaping tourism as they create totally new products, communication networks, business models, industry structures, and types of companies. Therefore, they are an important resource for travel companies seeking an enhanced competitive advantage and maximum profit in the global market (Sheldon, 2006).
There is extensive literature on social media analytics that combines web scraping, machine learning applications with recent software, and statistical techniques to collect, clean, analyse, and understand large amounts of data. With these methods, opinions and beliefs about products can be examined for commercial purposes, such as tracking and identifying trending topics and popular sentiment (Xiang et al., 2017). In particular, with social media playing an increasingly important role in both travellers' decisions and tourism activities and management, studies are increasingly focusing on the effects of user ratings and comments on tourism destinations and hotel management practices (see Vermeulen
Turkey is in the Mediterranean climate zone, and its clean beaches and coves, historical sites, and natural beauty all endow it with great potential for tourism. Turkey has hosted many civilizations throughout history and offers a wide range of options to visitors of all preferences and tastes. Each one of its seven regions has its own unique natural sites. Therefore, the development of the tourism sector could play an important role in solving socio-economic problems in Turkey, such as unemployment, inflation, and migration (Bulut, 2018). Turkey, which welcomed 51.2 million tourists in 2019, is ranked 6th in the world in terms of number of arrivals. However, the country ranks 13th in the world in terms of tourism revenues, which stood at 29.8 billion dollars in 2019 (Republic of Turkey Ministry of Culture and Tourism, 2020).
The Cappadocia region, which encompasses the Nevsehir, Aksaray, Nigde, Kayseri, and Kirsehir provinces, is one of Turkey's most important tourism destinations. The Rocky Cappadocia region, which is a narrower area, consists of Uchisar, Urgup, Avanos, Goreme, Derinkuyu, Kaymakli, Ihlara, and their surroundings. Analysing social media data in the tourism industry will allow both local businesses and policymakers to understand the decision-making behaviours of visitors and to be aware of the opportunities and pitfalls in the industry. In this context, we investigated the social media data of the Rocky Cappadocia region through sentiment analysis and various machine-learning methods.
In the next part of the study, we briefly introduce social media and sentiment analysis methods. Later, we explain sentiment classification and machine learning algorithms. In the fourth section, we discuss the data set and findings, and finally, we present the conclusion and evaluations.

Social Media and Sentiment Analysis
When using social media, people interact with each other and share information, opinions, and many other things freely. At the same time, companies and organizations may reach customers more easily and enjoy advantages such as offering them personalised options. According to Appel et al. (2019), social media sites should be seen as "digital places" where people manage important parts of their lives, rather than just platforms that offer digital media and technology services. When evaluated from this perspective, social media will mean "less about specific technologies or platforms and more about what people do in these environments" (Appel et al., 2019, p. 80).
On the other hand, reviews made by social media users cannot be treated as the single source of truth. False information can spread very quickly on social media. Thus, users need to pay attention to this kind of information and always double-check the information they find on social media.
Sentiment analysis is a method for analysing emotions by evaluating posts on internet platforms. Making inferences from these platforms about the feelings and opinions of people provides an understanding of human behaviour, with implications for efficiency in areas such as international public influence, business decisions, and policy development (Alamoodi et al., 2020). Sentiment analysis also has applications in education, health, and tourism management, among many other areas (Balahadia et al. For these reasons, it is an essential tool for understanding social trends.

Methodology
To analyse the general public opinion about the Cappadocia region, we used sentiment analysis, an effective and frequently-used method in big-data analytics. In sentiment analysis, the main idea in a text is classified by applying natural-language processing and text analytics. Sentiment analysis aims to understand the attitude of the author by detecting the emotional polarity of a text and classifying it as positive, negative, or neutral (Luo et al., 2013, pp. 53-54). For this, a dictionary-based emotion score is determined for each word in the text using flexible and open-source programming languages such as R, Python, and related packages. Later, this score, determined on the basis of the selected words, is calculated for the whole text. As a result, the entire text can be classified as positive, negative, or neutral (Luo et al., 2013).
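The dictionary-based scoring described above can be illustrated with a minimal sketch. The lexicon, its scores, and the function names `sentiment_score` and `classify` are hypothetical and chosen only for illustration; they are not the lexicon or code used in this study.

```python
import re

# Hypothetical toy lexicon: word -> polarity score (an assumption, not study data).
LEXICON = {
    "beautiful": 1.0, "magnificent": 1.0, "great": 1.0, "good": 0.5,
    "disappointing": -1.0, "expensive": -0.5, "closed": -0.5, "rude": -1.0,
}

def sentiment_score(text, lexicon=LEXICON):
    """Sum the lexicon scores of all words found in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(lexicon.get(word, 0.0) for word in words)

def classify(text, lexicon=LEXICON):
    """Map the text-level score to positive, negative, or neutral."""
    score = sentiment_score(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(classify("A magnificent open air park, great for exploring"))
print(classify("Very disappointing and expensive"))
```

Words that are absent from the lexicon contribute nothing, so a text without any lexicon words is classified as neutral.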
Sentiment analysis studies are generally performed with three different methods: the lexicon-based method, the machine learning method, and the hybrid approach. The lexicon-based approach "detects the sentiment based on a sentiment lexicon, which includes a collection of known and pre-compiled sentiment terms. It is divided into two, namely dictionary-based approach and corpus-based approaches, using statistical or semantic methods to find sentiment polarity" (Medhat et al., 2014, p. 1098). Machine learning approaches apply machine learning algorithms by taking advantage of syntactic and/or linguistic features, including sentiment lexicons (Maynard & Funk, 2011). The hybrid approach combines both machine learning and a lexicon-based method with manually written linguistic rules. The classifiers in this approach are applied in sequence. Thus, if one method cannot classify a document, the algorithm passes to the next method until the document is classified or all methods are tried (Prabowo & Thelwall, 2009; Manda, 2019). In this study, the opinions of tourists about the Cappadocia region, which is one of the most popular tourist destinations in Turkey, are examined using machine learning algorithms.

Machine Learning
Machine learning techniques for sentiment classification have gained attention because of the possibility of modelling many features while also capturing the context, adapting easily to changing inputs, and measuring the degree of uncertainty with which a classification is made (Boiy & Moens, 2008). There are two basic kinds of machine learning algorithms: supervised and unsupervised learning. "A supervised learning algorithm takes a known set of input data (training set) and known responses to data (output) and trains a model to generate reasonable predictions for response to new input data" (MathWorks, 2016, p. 2). Unsupervised learning involves pattern recognition without the involvement of a target feature. In other words, all variables are used as features (Alloghani et al., 2019). Three different supervised machine learning algorithms are employed in the study. These classification algorithms are Naïve Bayes (NB), Support Vector Machines (SVM), and Random Forest (RF).

Naïve Bayes (NB)
Naïve Bayes, the simplest example of a probabilistic classifier, estimates the probability P(C | d) that a document d belongs to a class C. The Naïve Bayes classifier does not take the possible dependencies between inputs into account and reduces a multivariate problem to a set of univariate problems. With supervised training, Naïve Bayes learns patterns by examining a set of well-categorized training documents, comparing the contents of all categories through a word list and the occurrences of its words. Such word-occurrence lists are used to classify new documents into the correct categories according to the highest posterior probability (Islam et al., 2007; Ting et al., 2011).
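As a rough illustration of this idea, the sketch below implements a multinomial Naïve Bayes classifier with add-one (Laplace) smoothing in plain Python. The toy documents, the whitespace tokenisation, and the function names `train_nb` and `predict_nb` are assumptions made for the example and do not reproduce the implementation used in this study.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for doc, label in zip(docs, labels):
        tokens = doc.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict_nb(model, doc):
    """Return the class with the highest log posterior for the document."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_logp = None, -math.inf
    for label, n_docs in class_counts.items():
        logp = math.log(n_docs / total_docs)  # log prior P(C)
        total_words = sum(word_counts[label].values())
        for token in doc.lower().split():
            if token not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing; words are treated as independent given the class.
            logp += math.log((word_counts[label][token] + 1) / (total_words + len(vocab)))
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label

# Hypothetical toy training data (not reviews from the study).
docs = ["great view amazing church", "amazing balloon ride",
        "rude staff", "rude staff long queue"]
labels = ["positive", "positive", "negative", "negative"]
model = train_nb(docs, labels)
print(predict_nb(model, "amazing church"))
print(predict_nb(model, "rude staff"))
```

Working in log space avoids numerical underflow when many word probabilities are multiplied together.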

Support Vector Machine (SVM)
"Support Vector Machine is a supervised machine algorithm that uses statistical learning theories. It works as separating the classes in the dataset with an optimal hyperplane that maximizes the margin between classes" (Yu et

Random Forest (RF)
Random forests are an important modification of bagging that builds a collection of de-correlated trees and then averages them. Random forests improve the variance reduction of bagging by reducing the correlation among trees without increasing the variance too much. This is done by randomly selecting input variables during the tree-growing process. "On many problems, the performance of random forests is very similar to boosting, and they are simpler to train and tune" (Hastie et al., 2009, pp. 587-588). "It gives good results in data sets that contain categorical variables with a large number of variables and class labels, have missing data or exhibit an uneven distribution" (Aydın, 2018, p. 172).

Performance Evaluation
In the current study, the success of the classification algorithms is assessed using the confusion matrix, mean absolute error (MAE), and root mean square error (RMSE) criteria. The confusion matrix gives the actual and predicted classification data computed from a classification system. The performance of these kinds of systems is often determined using the data in the matrix. A confusion matrix for a two-class classification problem is shown in Table 1.
Accuracy is the statistic that determines the success of the machine learning model. As seen in Equation (1), the accuracy rate is the ratio of correctly classified data to all data:
$$\text{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN} \tag{1}$$
Precision, Recall, and F1-score are three other criteria that can be used to evaluate the outcome of sentiment analysis. Considering an exemplary sentiment analysis system that has two classes and using the notation given in Table 1, Precision is calculated as seen in Equation (2).
"The mean absolute error expresses the mean of the verification sample of the absolute values of the differences between the estimated and observed," as is seen in Equation (5) below (Yi & Liu, 2020, p. 632):
$$\text{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|d_i - \hat{d}_i\right| \tag{5}$$
where $d$ is the observed value, $\hat{d}$ is the estimated value, and $n$ is the sample size. A smaller MAE value indicates better prediction accuracy. RMSE is a quadratic scoring rule that measures the mean error size. Each difference between the estimated and the observed values is squared and then averaged over the sample, and the square root of this average is taken, as seen in Equation (6) below (Aydın, 2018):
$$\text{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(d_i - \hat{d}_i\right)^2} \tag{6}$$
RMSE is always equal to or greater than MAE. The larger the variance among the individual errors in the sample, the larger the gap between RMSE and MAE. On the other hand, if MAE is equal to RMSE, all the errors have the same size.
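The evaluation measures discussed in this section can be computed from the confusion matrix counts with a few lines of Python. The sketch below uses the standard definitions of Accuracy, Precision, Recall, F1-score, MAE, and RMSE; the label vectors and function names are hypothetical and serve only as a worked example.

```python
import math

def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn

def classification_metrics(tp, fp, fn, tn):
    """Accuracy, Precision, Recall, and F1-score of the positive class."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

def mae(observed, estimated):
    """Mean of the absolute differences between observed and estimated values."""
    return sum(abs(d - d_hat) for d, d_hat in zip(observed, estimated)) / len(observed)

def rmse(observed, estimated):
    """Square root of the mean squared difference."""
    return math.sqrt(sum((d - d_hat) ** 2 for d, d_hat in zip(observed, estimated)) / len(observed))

# Hypothetical labels: 1 = positive review, 0 = negative review.
y_true = [1, 1, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 1, 1, 0]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy, precision, recall, f1 = classification_metrics(tp, fp, fn, tn)
print(tp, fp, fn, tn, accuracy, precision, recall, f1)
print(mae(y_true, y_pred), rmse(y_true, y_pred))
```

With 0/1 labels every error has size 0 or 1, so MAE equals the misclassification rate and RMSE equals its square root, which is never smaller than MAE.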

Data Set
In this study, we collected social media reviews on Göreme National Park and Göreme Open Air Museum in the Rocky Cappadocia region from the TripAdvisor website. The collection of reviews, also known as data scraping, was done using the Python programming language. A total of 4,770 English-language reviews were collected, covering the period from April 2011 to January 2021. In order to analyse the reviews, we first converted the text into numbers. To do that, we used the "term frequency-inverse document frequency" (TF-IDF) method, which treats a document as just a bag of words. The relative importance of any word in the document can therefore be calculated, that is, vectorized, by considering the frequency of the word in the document and its popularity in the whole collection (Kim et al., 2019, p. 17). We retained only the 2,000 most common words in the dataset, as less common words do not play a crucial role in classification. In addition, a word needed to appear in at least five reviews and in at most 70% of all the reviews. The rationale behind choosing 70% as a threshold was that words that occur in more than 70% of documents are very common and unlikely to play any role in classifying sentiment. Finally, we created a term document matrix using the CountVectorizer module.
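The vocabulary filtering and TF-IDF weighting described above can be sketched in plain Python as follows. This is an illustration under stated assumptions: the weighting uses one common variant (relative term frequency multiplied by log(N/df)), which may differ from the variant in the library used for the study, and the function names and toy documents are hypothetical. The thresholds are reduced in the example so that they have an effect on four short documents.

```python
import math
from collections import Counter

def build_vocabulary(tokenized_docs, min_df=5, max_df=0.7, max_features=2000):
    """Keep terms within the document-frequency bounds, most frequent first."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter()   # number of documents containing each term
    term_freq = Counter()  # total occurrences of each term
    for tokens in tokenized_docs:
        doc_freq.update(set(tokens))
        term_freq.update(tokens)
    kept = [t for t in doc_freq
            if doc_freq[t] >= min_df and doc_freq[t] / n_docs <= max_df]
    kept.sort(key=lambda t: (-term_freq[t], t))
    return kept[:max_features], doc_freq

def tfidf_vector(tokens, vocabulary, doc_freq, n_docs):
    """TF-IDF weights of one document over the vocabulary (assumed variant)."""
    counts = Counter(tokens)
    length = len(tokens) or 1
    return [(counts[t] / length) * math.log(n_docs / doc_freq[t]) for t in vocabulary]

# Hypothetical tokenized reviews.
corpus = [
    ["dark", "church", "fresco"],
    ["dark", "church", "queue"],
    ["balloon", "ride", "church"],
    ["balloon", "sunrise"],
]
vocabulary, doc_freq = build_vocabulary(corpus, min_df=2, max_df=0.7, max_features=2)
vector = tfidf_vector(corpus[0], vocabulary, doc_freq, len(corpus))
print(vocabulary, vector)
```

In this toy corpus "church" is dropped because it appears in 75% of the documents, mirroring the reasoning behind the 70% upper threshold.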
We applied data cleaning before creating the term document matrix. First, we converted all characters in the reviews to lowercase and removed punctuation and numeric characters. Then, we removed the words that do not significantly affect the meaning of the sentence using the StopWords application and lemmatized the text to map the various forms of the words to the root form using WordNet Lemmatizer. Two classes, positive and negative, were defined according to the ratings of the reviews. We labelled reviews that have ratings of 1, 2 and 3 as negative, and 4 and 5 as positive. Seventy percent of the data set was used as the training set, and the remaining part was used as the test set.
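A minimal sketch of these preparation steps is given below. It covers lowercasing, removal of punctuation and digits, stop-word removal, rating-based labelling, and a 70/30 split. The stop-word list is a deliberately tiny hypothetical stand-in for a full list, lemmatization is omitted because it requires an external library, and the function names and random seed are assumptions for the example.

```python
import random
import re

# Hypothetical, deliberately small stop-word list (a full list would be used in practice).
STOP_WORDS = {"the", "a", "an", "and", "is", "was", "of", "to", "in", "it", "for"}

def clean(text):
    """Lowercase, replace punctuation and digits with spaces, drop stop words."""
    text = re.sub(r"[\W\d_]+", " ", text.lower())
    return [word for word in text.split() if word not in STOP_WORDS]

def label(rating):
    """Ratings of 4 and 5 are positive; ratings of 1, 2 and 3 are negative."""
    return "positive" if rating >= 4 else "negative"

def train_test_split(items, train_fraction=0.7, seed=0):
    """Shuffle a copy of the items and split it into training and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

tokens = clean("The Dark Church was AMAZING in 2019!!")
train, test = train_test_split(list(range(10)))
print(tokens, label(3), label(4), len(train), len(test))
```

Fixing the seed makes the split reproducible, which is convenient when several classifiers are compared on the same training and test sets.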

Findings
In this part of the study, we present the results of the machine learning algorithms and interpret the sentiment polarity of tourists who visited Göreme Open Air Museum and Göreme National Park. In this context, first, the distribution of reviews by rating is shown in Figure 1. According to Figure 1, the tourist reviews of Cappadocia generally show satisfaction with the experience that they have had. In the rating distribution of the reviews, the comments with ratings of 4 and 5 constitute approximately 90% of the total reviews, while the remaining ratings compose 10%.
Figure 2 shows the length of the reviews according to the rating. 5-star reviews have the lowest median size, while the highest median sizes are for 1- and 2-star reviews. In addition, considering the minimum number of characters, it can be observed that 1- and 2-star comments are longer than those with other ratings. The greater length of the negative reviews is visible in the figure. In Table 2, some of the negative and positive reviews of the Cappadocia Region are given. As can be seen from the table, positive reviews are generally shorter than negative ones. On the other hand, staff is the aspect of the location most frequently complained about in negative reviews. This indicates that the businesses in the region should take steps to increase the quality of their service by taking these reviews into account.
The most frequently mentioned bigrams in the reviews appear in Figure 3. In fact, the most common bigrams in the text are "open air," "national park," and "goreme air." However, since the reviews of Göreme Open Air Museum and Göreme National Park are being examined within the scope of the study, these words, which are likely to be seen in every review, have been removed from the chart. Apart from these, the Dark Church, one of the most important buildings of Göreme Open Air Museum, is the most mentioned bigram in the reviews. Rocks composed of lava and ash are also frequently mentioned in the reviews.
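Counting bigrams and trigrams of this kind can be sketched with the standard library alone. The tokenized reviews below are hypothetical, and the `exclude` argument is an assumed way of dropping phrases such as "open air" that are expected in nearly every review.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return all runs of n consecutive tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def top_ngrams(tokenized_reviews, n, k, exclude=()):
    """Return the k most frequent n-grams, skipping the excluded phrases."""
    counts = Counter()
    for tokens in tokenized_reviews:
        counts.update(ngrams(tokens, n))
    for phrase in exclude:
        counts.pop(phrase, None)
    return counts.most_common(k)

# Hypothetical cleaned reviews.
reviews = [
    ["dark", "church", "fresco"],
    ["open", "air", "museum", "dark", "church"],
    ["open", "air", "museum"],
]
print(top_ngrams(reviews, 2, 1, exclude=("open air", "air museum")))
print(ngrams(["hot", "air", "balloon"], 3))
```

Calling the same function with n set to 3 gives trigram counts, so one helper covers both charts.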
Finally, Figure 4 shows the frequency of the 20 most common trigrams in the reviews. According to the chart, tourists like the Cappadocia Region, and the Dark Church is frequently mentioned in trigrams as well. The popularity of the hot air balloon ride, one of the important activities in the region, is also reflected in the reviews, and the local people living in caves were also the focus of tourists' interest.
Performance evaluation of the classification algorithms is examined using the micro-average values of the F1-score, Recall, and Precision statistics, and the outcome is presented in Table 3. While the macro-average treats all classes equally, the micro-average favours larger classes. Thus, in cases where there are class imbalances, the micro-average value is preferred (Sokolova & Lapalme, 2009; Nohh et al., 2019). Table 3 also shows the RMSE and MAE results. The degree of success achieved by each of the three algorithms is very similar. However, the Naïve Bayes algorithm performed the classification process with the highest accuracy rate. At the same time, the lowest RMSE and MAE values belonged to the Naïve Bayes algorithm. Based on these results, we found the Naïve Bayes algorithm to be preferable for classifying the reviews.
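The difference between the two averaging schemes can be made concrete with a small sketch. The labels below are hypothetical and deliberately imbalanced (nine positive reviews, one negative), and the function names are assumptions for the example; Precision is used for illustration, and Recall and F1-score are averaged in the same way.

```python
def per_class_counts(y_true, y_pred, labels):
    """Return {label: (TP, FP, FN)} treating each label in turn as the positive class."""
    pairs = list(zip(y_true, y_pred))
    counts = {}
    for c in labels:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        counts[c] = (tp, fp, fn)
    return counts

def macro_precision(counts):
    """Average the per-class precision values, giving every class equal weight."""
    values = [tp / (tp + fp) if tp + fp else 0.0 for tp, fp, _ in counts.values()]
    return sum(values) / len(values)

def micro_precision(counts):
    """Pool TP and FP over all classes before dividing, so large classes dominate."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    return tp / (tp + fp) if tp + fp else 0.0

# Hypothetical imbalanced example: the classifier predicts "pos" for every review.
y_true = ["pos"] * 9 + ["neg"]
y_pred = ["pos"] * 10
counts = per_class_counts(y_true, y_pred, ["pos", "neg"])
print(macro_precision(counts), micro_precision(counts))
```

In this example the macro-average is pulled down by the class that is never predicted, while the micro-average follows the majority class; for single-label classification the micro-averaged Precision coincides with Accuracy.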

Discussion
In this study, we analysed sentiment about the Cappadocia region, which is one of Turkey's most important tourism destinations. We employed various machine learning algorithms and visualization methods for this purpose. We collected 4,770 English reviews for Göreme Open Air Museum and Göreme National Park in the Cappadocia region. For classification purposes, we applied the Random Forest, Naïve Bayes, and Support Vector Machine supervised machine learning methods. We used accuracy, RMSE and MAE criteria for performance evaluation. Although the classification algorithms performed very similarly, we determined that the Naïve Bayes algorithm was the most successful among them.
Most of the reviews (90%) indicated that their writers had positive feelings about the region. Tourism revenues in both the regional and the national economy are one of the most important components of Turkey's GDP. Cappadocia, with its fairy chimneys, valleys, underground cities, churches, inns, and cultural structures, carries the traces of a civilization thousands of years old. As can be seen from the charts in the findings section, churches are mentioned in most of the reviews, meaning religious tourism is popular. The Dark Church, in particular, was at the centre of tourists' attention. Strategic plans should be made to attract more visitors from various parts of the world by preserving the spirit and potential of the Cappadocia Region. The number and focus of negative reviews show that it is easy to eliminate the negative factors with some small improvements in customer service. The success of the tourism sector will eventually contribute to the income level of the locals as well.

Conclusion
Language is a powerful mechanism that enables people to express themselves. Studies on sentiment analysis, also known as opinion mining, are one of the most interesting and rapidly-growing research areas of natural language processing and text mining. A wide range of sentiment analyses can be made for every field through social media reviews, which are a crucial artifact of the information age. Social media allows individuals and organizations to reach larger audiences at much less cost than surveys, interviews, and similar methods. In addition, on social platforms, people share their feelings and thoughts in line with their own wishes. This causes these opinions to be more candid than others gathered by more conventional methods. It is thought that sentiment analysis on the tourism sector using social media reviews will help countries and businesses make better strategic decisions for the future. In addition, it will help prevent the information-asymmetry problem in global competition, and thus enhance awareness of opportunities and threats in the market. This study reveals the overall opinion of the visitors about the Cappadocia region and thus serves as a mirror for service providers in the region. To improve the quality perceptions of the region, locals, service providers and decision makers should take these reviews into account and keep increasing service quality accordingly. A significant limitation of this study, however, is that only a certain part of the Cappadocia Region is investigated. Thus, as a suggestion for further investigations, expanding the research to the whole region, or to other tourist destinations, may further reveal and contribute to the tourism potential of Turkey.
The Rocky Cappadocia region is an important tourism destination of both natural and historical interest and has been listed as a UNESCO World Heritage Site since 1985. Natural and cultural tourism is popular in the region, encompassing experiences such as thermal baths, ballooning, horseback riding, and pottery, just to name a few (Ahiler Development Agency Plans for Future, 2015; Republic of Turkey Ministry of Culture and Tourism, 2021). Common activities include hiking and horseback riding in Ihlara; visiting the Red and Rose Valleys, or the Göreme or Zelve open-air museums; and taking a hot air balloon tour over the region.
al., 2012, p. 232). In other words, SVM maximizes the distance between support vectors belonging to different classes. "The interesting property of SVM is that it is an approximate implementation to the structure risk minimization (SRM) in statistical learning theory, rather than the empirical risk minimization method from which the classification function is derived by minimizing mean square error (MSE)" (Song et al., 2002, p. 440). As a powerful machine learning algorithm, SVM works well in text classification, as it has good generalization ability in a high-dimensional feature space (Joachims, 1998; Wei et al., 2012; Nohh et al., 2019).
The Precision of the positive class is calculated by dividing the correctly classified positive samples (True Positives) by the total number of samples predicted to be positive, as seen in Equation (2). The Recall, on the other hand, is calculated by dividing correctly classified positives (True Positives) by the sum of correctly classified positives (True Positives) and incorrectly classified negatives (False Negatives), as seen in Equation (3). Finally, the F1-score is the harmonic mean of Precision and Recall, as seen in Equation (4). According to these definitions, the Precision, Recall and F1 values of the positive class are calculated as follows (Ribeiro et al., 2016; Alaei et al., 2017):
$$\text{Precision} = \frac{TP}{TP + FP} \tag{2}$$
$$\text{Recall} = \frac{TP}{TP + FN} \tag{3}$$
$$F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{4}$$
Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) statistics are also used in this study to evaluate the performance of classification algorithms.

Figure 1: Distribution of Reviews According to Rating. Source: Authors' computation.

Figure 2: Review Length by Rating. Source: Authors' computation.

Table 1: The Confusion Matrix for Two-Class Classification Problem
"TP is True Positive, FP is False Positive, FN is False Negative, and TN is True Negative."
Source: Visa et al. (2011)

Table 2: Reviews and Rating
"We visited Zelve Open Air Museum the day before, which is why the Goreme Open Air Museum was very disappointing for us. By comparison, Goreme OAM was 3x more expensive than Zelve, most of the sites inside Goreme OAM was closed, too many people were around…"
"Göreme Open Air Museum is a magnificent open air park, ideal for a guided tour. Many caves are great for exploring. It is good to see 'churches' in some of these caves." Rating: 5
Source: TripAdvisor

Table 3: Performance Evaluation