Comparison of feature extraction methods by color analysis and image recognition for photos on tourism websites

Many people go sightseeing based on the information available on tourism websites. Much information is available for famous tourist destinations, but little is available for lesser-known ones. Foreign travelers in particular find it difficult to discover destinations that match their interests, so we proposed a personal adaptive tourism recommendation system in previous research. In developing the system, the method of tourism feature extraction is a key issue. Via a questionnaire, we showed the importance of photo information on tourism websites. As the first step of tourism feature extraction from photos on tourism websites, we propose two methods of analysis: color analysis and image recognition. Comparing the two methods through experiments, we confirmed that each method had different characteristics and that the combination of the two methods exhibited the best accuracy in estimating the ratio of artificial and natural objects in the photos.


INTRODUCTION
In recent years, the number of people traveling from Japan to foreign countries has significantly increased (JTB tourism research & consulting, 2020). In particular, the number of people traveling as foreign independent travelers (FITs) exceeds the number of package tourists (Japanese travel trade news (Official), 2017). FITs often find tourist attractions tailored to their interests and make travel plans themselves. Tourism websites play an important role in this trend because they often provide comprehensive and credible information on tourism. However, the information on these websites is generic, and it is difficult to obtain useful information from them efficiently. Recently, tourism websites and social network services have started to provide various modalities of tourism information, and internet data mining has become one of the most common methods of tourism analysis. Compared with social network services, which provide streamlined information, tourism websites usually gather a larger amount of more diverse information, including formal introductions, comments from tourists, and photos.
For example, TripAdvisor is the most popular website for Japanese independent travelers to obtain tourist information when traveling abroad. This website contains a large amount of tourist information, including details of hotel accommodation and restaurants, and the descriptions and rankings of major attractions in local areas. A large amount of information often means that the classification of tourist attractions is general and has not been subdivided. Furthermore, in the case of less famous areas, the information described in Japanese is not sufficient. Therefore, it is difficult for Japanese FIT travelers to find tourist attractions suited to their interests in less famous areas.
The aim of this study is to develop a personal adaptive tourism recommendation system (PATRS) to solve the above problems. The remainder of this paper is organized as follows. We discuss related work in section 2 and explain the concept of PATRS in section 3. In the system development, the method of tourism feature extraction is a crucial issue. Using the questionnaire results shown in section 4, we confirm the importance of photo information on tourism websites for foreign tourists. As the first step of tourism feature extraction from photos on tourism websites, we propose in section 5 two methods to estimate the ratio of artificial and natural objects in the photos. Section 6 presents the experimental results, compares the two methods, and discusses the findings. We conclude the paper in section 7.

RELATED WORK
Several researchers have used computer data mining technology to solve a number of problems in the development of tourism, including the discovery and classification of points of interest (POIs) (Bao, 2021; Maeda et al., 2018). Bao (2021) proposed a machine learning architecture that can identify the semantic structure of tourist information on websites and thus extract features. Some researchers have used social network services, such as Twitter or Flickr, to find and classify POIs. Maeda et al. (2018) proposed a method for extracting tourist attraction locations by analyzing data obtained from Twitter and Foursquare. They evaluated the attractiveness and originality of each location based on TF-IDF (term frequency, inverse document frequency). They successfully extracted POIs with both high attractiveness and high originality and considered them locations of tourist attractions. Their study contributed to the location extraction of high-quality POIs, but the features of those POIs were not extracted and classified.
Photo analysis has also received increased attention in the field of tourism analysis (Zhang and Dong, 2021; Taecharungroj and Mathayomchan, 2020; Cao et al., 2010; Chen et al., 2009; Wang et al., 2020; Samany, 2019). Zhang and Dong (2021) used data mining technology to analyze the image monitoring and management of famous tourist destinations. Taecharungroj and Mathayomchan (2020) detected labels and extracted topics from photos by using Google Cloud Vision. Cao et al. (2010) clustered photos based on location information and selected the most representative photo and tags from each cluster to recommend locations according to the travelers' needs. Chen et al. (2009) analyzed the similar features of a large number of photographs and outlined these features to generate icons of tourist spots. Wang et al. (2020) used artificial intelligence (AI) to recognize travel photos and classified the photos by travel scene. Samany (2019) also used location data to extract landmarks. These studies achieved a global classification of tourism resources based on a large number of photos, but their methods are not applicable to the feature extraction and classification of a local and specific area. Moreover, many studies have focused on the photos taken by tourists (Zhang et al., 2019; Yang et al., 2017; Sun et al., 2019; Giglio et al., 2019). Zhang et al. (2019) identified scenes from tourists' photos and compared the behaviors and perceptions of tourists from different continents and countries. Yang et al. (2017) collected geotagged photos from Flickr and extracted tourists' trajectories to detect tourist mobility patterns. Sun et al. (2019) developed a personalized recommendation method that provides attraction recommendations matching users' preferences by using a geotagged photo collection from Flickr. Giglio et al. (2019) collected photos posted on Flickr and analyzed their contents. Their study successfully extracted the photographed subjects by using image analysis, but it aimed to investigate tourist behavior in order to identify the attractiveness of POIs, not to extract and classify tourism features.
Although various methods for discovering and classifying POIs have been developed, methods for extracting and classifying tourism features from tourism websites in a form that can be adopted in a tourism recommendation system have not been reported. In our previous study, we proposed a method for extracting general feature words for POIs specifically for Japanese FIT travelers, using comments from a major tourism website. We collected comments on POIs and extracted their features by using two methods for extracting feature words: counting word frequency (WF) and calculating a score (SC) using TF-IDF. The results of the experiments confirmed that the SC method could be effective when many comments are available, and that the WF method has potential when only a few comments are available, as in less well-known tourism areas. However, we did not consider the use of photo information on tourism websites.
Fig. 1 shows the concept of PATRS. Travelers input the travel area and their purpose and experience to the system, and the input processing module generates keywords and conditions from the input data and searches the POI database (DB). The POI DB stores the characteristics data provided by the POI characteristics analyzing module, which analyzes comment data and photo data on tourism websites for each POI in the user's travel area. The system sends some POI candidates to the adaptation calculation module, and the recommendation priority level of each POI candidate is calculated. The system sends the result to the visualization module and displays the results to the users through a human interface, such as a map. The focus area of this study is the POI characteristics analyzing module. For the construction of the POI characteristics analyzing module, we focus on the tourism feature extraction and classification of POIs. This study can be beneficial by attaching the features to the POIs and making

IMPORTANCE OF PHOTO INFORMATION ON A TOURISM WEBSITE
Travelers generally obtain information about POIs from tourism magazines and websites. The POI information from such media is limited, and it may not be entirely correct or fair because the media often provide biased information contributed by the writers and editors of media companies. Therefore, we propose the use of photos and comments collected from tourism websites or social network services (SNSs) and attempt to extract the features of POIs. Although the original language of most of the comments is not Japanese, the website provides a Japanese version of the comments through machine translation. We assumed that the comments were written correctly and fairly by general tourists who actually visited the POIs.
To investigate the effectiveness of the tourism information provided on the tourism information website and the accuracy of the translated Japanese comments, we conducted a survey. A total of 25 respondents whose native language was Japanese were interviewed for this survey. All respondents were over twenty years of age. A total of ten POIs were randomly selected in a less famous tourist area, with one translated Japanese comment for each POI. We provided the basic information and the comment for each POI to the respondents and asked them to read it and answer the following questions: Is the text strange? Is the meaning of the translation understandable? Is the translation helpful for the trip? Do you want to visit the tourist spot? We then presented a photo for each POI together with the same information and translated comment and asked the same questions. We collected the answers and divided them into eight groups according to the question and the presence of photos. Each group consisted of 250 answers (25 respondents for each of the 10 POIs). From the survey results, we found that inaccurate translation of the comments might give the respondents a negative impression of the tourist spots. For the questions without photos, a high proportion of negative answers was received (e.g., the text is quite strange, it is not understandable, and the respondent does not want to visit). However, compared with the questions without photos, the number of negative answers to the questions with photos decreased. In other words, presenting the photos reduced the negative impression, although the translations were strange and difficult to understand. The respondents could understand the meaning of the strange translations, and they came to want to visit the spots. From this survey, we confirmed the limitation of analyzing only comments and the importance of photos for tourism recommendations.

FEATURE EXTRACTION METHODS OF PHOTOS ON TOURISM WEBSITES
The purpose of this paper is to find a way to extract the characteristics of tourism destinations, so that travelers can search for natural or artificial tourism spots according to their preferences. This study aims to investigate the optimal feature extraction method for the photos on tourism websites. To implement it in the tourism recommendation system, we need a simple method that can process small amounts of data for the tourism feature extraction of photos posted on a tourism website. Generally, tourism purposes are divided into two types: nature observation and visits to man-made objects such as buildings and museums. These are often considered distinct and comparable categories of attractions (Tourism Management Tutorial, 2021). We considered that it would be convenient for tourists to know whether the photos of a tourist spot indicate nature observation or visits to artificial objects, and that it is important to know the ratio of the two from the photos. Thus, distinguishing between natural and artificial features is expected to help tourists find their favorite tourist destinations more efficiently. In this paper, as the first phase, we propose two methods for estimating the ratio of artificial and natural objects in the photos on tourism websites.

Color Analysis Method
We considered using the color analysis method to distinguish the ratio of natural and artificial objects for each photo. Color analysis is an important method in image classification and is widely used in many fields for identification and classification (Gowda and Yuan, 2018;Bambil et al., 2020;Akhloufi et al., 2008;Milotta et al., 2018;Szummer and Picard, 1998;Vailaya et al., 2001). In addition, Tominaga (1992) reported the possibility of color analysis in the classification of natural images.
In this study, we selected the RGB color model based on its superior performance over other color models (Akhloufi et al., 2007). Each photo contains several colors, each of which consists of RGB components corresponding to the red, green, and blue image components. We extracted the colors from a photo and separated each color into its RGB components. Table 1 shows the extracted colors and their RGB components for a specific photo. The photo includes 12 colors, corresponding to the color codes from #2040C0 to #200000, and the ratios of the colors in this photo are from 95% to 13%, respectively. We can extract the RGB components of each color as well. In color analysis, hue, brightness, and contrast are essential characteristics. We chose hue to discriminate whether the composition of the photo is artificial or natural. To investigate whether a color represents artificial or natural composition, we focused on the RGB components of each color. We varied the combination of red (R), green (G), and blue (B) and found that Formula (1) was capable of distinguishing between artificial and natural objects.
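As an illustration of the decomposition described above, the following minimal Python sketch splits a color code into its R, G, and B components and computes the share of each color in a photo. The function names and the sample pixel list are hypothetical, and the sketch assumes that the photo has already been reduced to a small set of quantized color codes; it is not the extraction tool used in this study.

```python
from collections import Counter


def hex_to_rgb(code):
    """Split a color code such as '#2040C0' into its (R, G, B) components."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


def color_ratios(pixel_codes):
    """Return the share of each color code among the given pixels."""
    counts = Counter(pixel_codes)
    total = sum(counts.values())
    return {code: count / total for code, count in counts.items()}


# Hypothetical photo reduced to four pixels with two quantized colors.
pixels = ["#2040C0", "#2040C0", "#2040C0", "#200000"]
ratios = color_ratios(pixels)
components = {code: hex_to_rgb(code) for code in ratios}
```

For the color code #2040C0, the sketch yields the components R = 32, G = 64, and B = 192.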
According to the values of Hu, we could obtain the composition of artificial and natural components of the photos. In this study, we focused our analysis on the artificial ratio in each photo. We defined "Ra" (artificial probability) as shown in Formula (4). In Formula (4), "i" indexes the color components in a photo, and "Rco" is the ratio of each color component in the photo. "Ju" is a variable used to judge whether each color component is artificial or natural: if the color is artificial, Ju is "1", and if the color is natural, Ju is "0". Table 2 shows an example of the calculation results for the same photo shown in Table 1. In this table, the value of 50% for "Ju" means that 50% of this photo is artificial.
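One plausible reading of these definitions is that Ra is the sum, over the color components i of a photo, of Rco multiplied by Ju. The sketch below computes this sum in Python. The form of the sum is an assumption inferred from the text, and the sample ratios and Ju values are hypothetical rather than taken from Table 2.

```python
def artificial_ratio(colors):
    """Sum the ratios of the colors judged artificial (assumed form of Ra).

    colors: list of (rco, ju) pairs, where rco is the share of the color
    in the photo (0 to 1) and ju is 1 for artificial, 0 for natural.
    """
    return sum(rco * ju for rco, ju in colors)


# Hypothetical photo: two artificial colors and two natural colors.
sample = [(0.30, 1), (0.20, 1), (0.35, 0), (0.15, 0)]
ra = artificial_ratio(sample)
```

With these hypothetical values, only the two colors with Ju = 1 contribute, so the result corresponds to an artificial ratio of 50%.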

Image Recognition Method
We used the image recognition method provided by Google Cloud Vision for extracting photographic subjects. Google Cloud Vision is a recent technology that enables the extraction of information from visual images and has been used in many recent studies (Taecharungroj and Mathayomchan, 2020; Hosseini et al., 2017; Mulfari et al., 2016; Bisong, 2019; Richards and Tunçer, 2018; Khalil Maad et al., 2020). However, to our knowledge, there is no report on the use of Google Cloud Vision for photos on tourism websites. Therefore, applying Google Cloud Vision to such photos is a new application in tourism analysis. The Google Cloud Vision API (Application Programming Interface) is designed to help developers integrate vision detection features based on machine learning and to efficiently obtain labels for photographic subjects. Fig. 6 shows an example of subject extraction using the Google Cloud Vision API. According to the API specifications, a maximum of 10 labels can be extracted for each photo (some labels may not be available if the contents of the image are unclear). The "topicality" of each label is a score that indicates the relevance of the label to the image content. From these labels and their topicality, we can see the photographic objects and their weights in the photo.
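The label extraction described above can be sketched with the Python client library for the Google Cloud Vision API. This is a minimal illustration, not the script used in this study: it assumes that the google-cloud-vision package is installed and that credentials are configured, and the helper names detect_labels and top_labels as well as the sample labels are hypothetical.

```python
def detect_labels(image_path):
    """Return (description, topicality) pairs for one photo.

    Requires the google-cloud-vision package and valid credentials;
    the import is kept local so the helper below works without them.
    """
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()
    with open(image_path, "rb") as image_file:
        image = vision.Image(content=image_file.read())
    response = client.label_detection(image=image)
    return [(label.description, label.topicality)
            for label in response.label_annotations]


def top_labels(labels, limit=10):
    """Keep at most `limit` labels, highest topicality first."""
    return sorted(labels, key=lambda item: item[1], reverse=True)[:limit]


# Hypothetical labels, as they might be returned for a photo of a church.
example = [("Sky", 0.93), ("Building", 0.97), ("Tree", 0.71)]
ranked = top_labels(example, limit=2)
```

Only top_labels is exercised in the example, because detect_labels needs network access and an actual photo file.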
The extracted results often include subjects unrelated to tourism. To calculate the artificial ratio of the photos, we filtered the labels in advance, dividing them into artificial subject words, non-artificial subject words, and other words that are neither artificial nor non-artificial. Excluding the "other words", we used only the topicality of the artificial and non-artificial subjects as the weights in the calculation.
Formula (5) expresses the method used to calculate the artificial ratio of a photo by means of label analysis. We named it Ra' to distinguish it from the artificial ratio Ra calculated by color analysis. Wai and Wni indicate the weights of the artificial and non-artificial components, respectively, for each subject. Table 3 shows an example of the artificial ratio calculation. The resulting Ra' of this photo was 74.7%, which means that 74.7% of this photo is judged to be artificial.
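A minimal Python sketch of this label-based calculation is given below. It assumes that Ra' is the total weight of the artificial labels divided by the total weight of the artificial and non-artificial labels together, which is one plausible reading of the description above and not necessarily the exact form of Formula (5). The word lists and the sample labels are hypothetical.

```python
# Hypothetical word lists; the label classification used in the study
# is not reproduced here.
ARTIFICIAL = {"building", "architecture", "church", "tower"}
NON_ARTIFICIAL = {"sky", "tree", "water", "cloud"}


def artificial_ratio_from_labels(labels):
    """Compute the share of artificial topicality among classified labels.

    labels: list of (description, topicality) pairs. Labels found in
    neither word list are treated as "other words" and ignored.
    Returns None when no label is artificial or non-artificial.
    """
    w_a = sum(t for d, t in labels if d.lower() in ARTIFICIAL)
    w_n = sum(t for d, t in labels if d.lower() in NON_ARTIFICIAL)
    if w_a + w_n == 0:
        return None
    return w_a / (w_a + w_n)


labels = [("Building", 0.9), ("Tower", 0.6), ("Sky", 0.5), ("Font", 0.4)]
ra_prime = artificial_ratio_from_labels(labels)
```

In this hypothetical example, "Font" is ignored as an "other word", so the ratio depends only on the two artificial labels and the one non-artificial label.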

EXPERIMENTS
To test the feasibility of the proposed methods, we conducted experiments in three types of tourism areas. This section presents the experimental process and results, compares the two methods, and discusses the findings. Table 4 shows the conditions of the experiments. We selected "TripAdvisor Japan", the most popular tourism website in Japan, and three target analysis areas. Area "S" is St. Petersburg in Russia as a major sightseeing area, area "G" is Graz in Austria as an intermediately famous sightseeing area, and area "L" is Lappeenranta in Finland as a not famous area. We selected one photo each from the 1st to 15th recommended POIs in each tourism area, for a total of 45 photos. Fig. 7 to Fig. 9 show the target analysis photos for each area.

Table 4. Experimental conditions
Target analysis area (3 areas):
   St. Petersburg (Russia): area "S", famous sightseeing area
   Graz (Austria): area "G", intermediately famous sightseeing area
   Lappeenranta (Finland): area "L", not famous sightseeing area
Target analysis photo:
   Top 15 POIs in each tourism area recommended by the tourism website. Total number of photos is 45.

Fig. 7. The target analysis photos for area "S"
Fig. 8. The target analysis photos for area "G"
Fig. 9. The target analysis photos for area "L"

We extracted the artificial ratio for each target analysis photo using the two proposed methods. The extraction results, Ra by color analysis and Ra' by image recognition, for each analysis area, "S", "G", and "L", are shown in Table 5.
From the comparison, we can conclude that the artificial ratios extracted using the two methods are very different; in some cases, they even appear to be opposite. Accordingly, we considered that the two proposed methods (color analysis and image recognition) have different emphases, which created apparent differences between their results and the actual situation. Through the experiment, we confirmed the respective limitations and characteristics of the two methods. The two proposed methods are complementary: for some patterns, one method has strengths where the other has limitations. Table 6 summarizes the suitable conditions for each proposed method based on their respective strengths and limitations.
In particular, we obtained the following:
1. For photos taken indoors, both methods work well.
2. The image recognition method is less suited to recognizing subjects photographed from a distance. In contrast, the color analysis method is unaffected by distance.
3. The appearance of artificial objects had a significant impact on the analysis of the artificial ratio.
   a. If there is a painted building with a natural color (e.g., L13 and G5), the color analysis method could not extract the artificial ratio correctly; however, the image recognition method was unaffected and could compensate for this defect.
   b. The appearance of artificial objects influenced both methods. If the surface of a building was reflective (e.g., S7), the extracted result was determined by the reflected objects.
   c. For some unusual photographic subjects, such as sculptures with special shapes, misrecognition may occur, as with the animal sculptures in G5.
4. The weather conditions also affected the analysis of the artificial ratio.
   a. For color analysis, because blue sky is counted as a natural object, the ratio of artificial objects decreased. In contrast, the image recognition method ignored the proportion of blue sky, and the ratio of artificial objects increased.
   b. Cloudy, rainy, and snowy weather had a severe effect on the color analysis result. However, the image recognition method was not significantly affected.

Comparison of the Visual Evaluation of the Two Methods
We conducted a visual evaluation of all photos (Figs. 7, 8, and 9) and compared the result with each method. For comparison, we defined the visual value (expressed as Vv) as the artificial ratio of each photo evaluated visually by the author of this paper. In this experiment, we assume that the visual value correctly reflects the actual artificial ratio of each photo. The concordance rate Cr is defined in Formula (6); it expresses how closely the artificial ratio Ra (or Ra') calculated by each of the two methods corresponds to Vv.
In Formula (6), n is the number of POIs (points of interest), and the difference term is the difference between Ra (or Ra') and Vv for each POI. Table 7 shows the concordance rates between Ra and Vv and between Ra' and Vv. Each of the concordance rates was above 80%, which indicates that the methods were concordant with the visual evaluation result to a certain degree. In addition, the average values of Ra and Ra' were calculated and compared with the visual value Vv, as shown in Table 7. From the evaluated results shown in Table 7, we confirmed that averaging the results of the two methods yielded the highest accuracy. Compared with using either color analysis or image recognition alone, the combined method using the average values of the two methods could reduce the differences caused by the different characteristics of each method.
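The effect of averaging can be illustrated with a short Python sketch. It assumes that the concordance rate is 100% minus the mean absolute difference between the estimated ratio and Vv over the n POIs; this is one plausible form consistent with the definitions above and is not necessarily the exact Formula (6). All values are hypothetical and are not taken from Table 5 or Table 7.

```python
def concordance_rate(estimates, visual_values):
    """Return 100 * (1 - mean absolute difference), with ratios in [0, 1]."""
    n = len(estimates)
    total_diff = sum(abs(e - v) for e, v in zip(estimates, visual_values))
    return 100.0 * (1.0 - total_diff / n)


def combine(ra, ra_prime):
    """Average the artificial ratios of the two methods photo by photo."""
    return [(a + b) / 2 for a, b in zip(ra, ra_prime)]


# Hypothetical values for three POIs.
ra = [0.2, 0.6, 0.9]        # color analysis
ra_prime = [0.6, 0.8, 0.5]  # image recognition
vv = [0.4, 0.7, 0.7]        # visual evaluation

cr_color = concordance_rate(ra, vv)
cr_image = concordance_rate(ra_prime, vv)
cr_combined = concordance_rate(combine(ra, ra_prime), vv)
```

The hypothetical values are chosen so that the two methods err in opposite directions for each POI; in that situation the averaged estimate lies closer to Vv than either method alone, which mirrors the complementarity discussed above.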

CONCLUSION
This study aimed to develop a PATRS to support foreign independent travelers (FITs) from Japan. Finding and classifying POIs are important steps in developing the target system, and feature extraction and classification are required. This paper described our method of extracting tourism features from the photos posted on tourism websites.
In this study, we proposed two methods: the color analysis method and the image recognition (feature word extraction) method. The experimental comparison of the two methods confirmed that each method had different characteristics. By combining the two methods, we could obtain good results for the ratio of artificial and natural objects in the photos.
In the future, we will consider combining the proposed methods with feature extraction methods that use comments on tourism websites and social network services (SNSs). Furthermore, we will construct more effective and useful feature extraction methods based on both photos and comments. Subsequently, we will implement these methods in the development of the PATRS.