|Isti Surjandari||Department of Industrial Engineering, Faculty of Engineering, Universitas Indonesia, Kampus UI Depok, Depok 16424, Indonesia|
|Reggia Aldiana Wayasti||Department of Industrial Engineering, Faculty of Engineering, Universitas Indonesia, Kampus UI Depok, Depok 16424, Indonesia|
|Zulkarnain||Department of Industrial Engineering, Faculty of Engineering, Universitas Indonesia, Kampus UI Depok, Depok 16424, Indonesia|
|Annisa Marlin Masbar Rus|
|Irfan Prawiradinata||The Boston Consulting Group (BCG), Indonesia Sampoerna Strategic Square, North Tower Level 19, DKI Jakarta, 12930|
Ride-hailing services are attracting much attention as a solution to current transportation problems. Because of their benefits and convenience, many people use them in their everyday lives and discuss them on social media. As a result, ride-hailing service providers use social media to capture customers’ opinions and to market their services. By analyzing these opinions and comments, service providers can obtain feedback with which to evaluate their services and improve customer satisfaction. This study combines a text mining approach, in the form of aspect-based sentiment analysis that identifies the topics in customer opinions and their sentiments, with a scoring of ride-hailing service providers, both overall and by topic and sentiment. The study analyzes customers’ opinions on Twitter about three ride-hailing service providers. Text data were classified into six topics derived from the topic modeling process, along with the sentiments expressed about them. The three providers were scored on the number of positive and negative comments on each topic, as well as on overall comments. The results of the study can be used as input to evaluate and improve the service in Indonesia, so that customer satisfaction and loyalty can be maintained and improved.
Aspect-based sentiment analysis; Latent Dirichlet Allocation; Net Reputation Score; Ride-hailing service; Support Vector Machine; Text mining
Social media continues to develop along with technological advancements. Beyond communicating and socializing, it is now used for various other purposes, such as seeking entertainment and information (He et al., 2013). This is because social media provides an easy way to create and exchange user-generated content (UGC). Users can actively create and share content in the form of text messages, photos and videos, among others, and other users can access and respond to this content instantly. The volume of shared content grows rapidly over time, resulting in high-dimensional data (Tang & Liu, 2014).
The amount of information available enables social media to play an important role in the electronic word of mouth (e-WOM) process. Content that includes opinions on or reviews of products or services has an important influence in shaping public perceptions, building product or service reputation, and helping customers make purchasing decisions, all of which lead to increased sales and profitability (Philander & Zhong, 2016). Because content is distributed quickly, a product or service can gain recognition, and the responses and comments from the public can be accessed and monitored over time.
Among the existing social media platforms, Twitter is one of the most popular microblogs. Twitter users can share content in a tweet and interact with fellow users. More than half a billion tweets are posted every day. The speed with which UGC is shared and exchanged on Twitter makes the e-WOM process effective (Philander & Zhong, 2016). Customers can obtain information about products or services quickly, while companies can monitor and analyze customer comments to ascertain the advantages and disadvantages of their products or services.
Comments from customers on social media such as Twitter can generate insight for companies to develop further strategies for their products or services. To handle and extract information from a large number of posts written in various styles, the text mining approach can be applied. One text mining application is sentiment analysis, which processes opinionated posts and groups them based on their sentiment (Surjandari et al., 2015). A related technique is aspect-based sentiment analysis, which first identifies the topics related to the object being reviewed and then groups the sentiments for each topic (Marrese-Taylor et al., 2014). This technique produces a more detailed grouping scheme and is more helpful to users, because the features that receive positive and negative sentiments can be identified.
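The two-step grouping described above can be sketched as a toy example. This is only an illustration under stated assumptions: the keyword lists, the aspect names and the function names are hypothetical, and simple keyword matching stands in for the Latent Dirichlet Allocation topic model and Support Vector Machine classifier named in the keywords.

```python
from collections import defaultdict

# Hypothetical keyword lists for illustration only; they stand in for
# topics learned by a topic model and sentiments predicted by a classifier.
ASPECT_KEYWORDS = {
    "fare": {"fare", "price", "expensive", "cheap"},
    "app": {"app", "map", "login", "crash"},
    "service": {"driver", "pickup", "late", "friendly"},
}
POSITIVE_WORDS = {"cheap", "friendly", "fast", "great"}
NEGATIVE_WORDS = {"expensive", "late", "crash", "rude"}


def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    return [word.strip(".,!?").lower() for word in text.split()]


def classify(text):
    """Return (aspects, sentiment) for one post; sentiment is None on a tie."""
    tokens = set(tokenize(text))
    aspects = [name for name, words in ASPECT_KEYWORDS.items() if tokens & words]
    positive_hits = len(tokens & POSITIVE_WORDS)
    negative_hits = len(tokens & NEGATIVE_WORDS)
    if positive_hits == negative_hits:
        return aspects, None
    sentiment = "positive" if positive_hits > negative_hits else "negative"
    return aspects, sentiment


def group_by_aspect(posts):
    """Count positive and negative posts per aspect, skipping ties."""
    counts = defaultdict(lambda: {"positive": 0, "negative": 0})
    for post in posts:
        aspects, sentiment = classify(post)
        if sentiment is None:
            continue
        for aspect in aspects:
            counts[aspect][sentiment] += 1
    return dict(counts)


example_posts = [
    "The driver was late again",
    "Great app, fast login",
    "Fare is too expensive!",
]
aspect_counts = group_by_aspect(example_posts)
```

The resulting per-aspect counts show which features attract positive or negative sentiment, which is the extra detail that plain sentiment analysis does not provide.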
This study aims to analyze customers’ opinions on Twitter by defining the topics discussed and their respective sentiments. Ride-hailing service providers in Indonesia were chosen as the object of the study because a growing number of people use these services daily. Many users share their experiences, compliments and complaints about the services on Twitter. Service providers can therefore use such tweets to develop improvement strategies, so that they can continue to provide the best service and increase customer loyalty. The study also applies a scoring scheme to the ride-hailing service providers, based on the sentiment analysis results, to help them set improvement priorities.
Ride-hailing services have many positive impacts on urban life. It is therefore not surprising that they are increasingly being used and discussed on social media such as Twitter. The opinions and complaints of users on Twitter can serve as input for service providers to evaluate and measure the quality of their service from the customers’ point of view. Because the volume of text data on Twitter is large, a text mining approach in the form of sentiment analysis can be used to analyze customers’ tweets. This study combined aspect-based sentiment analysis, which identifies the topics in customer opinions and their sentiments, with an assessment of the ride-hailing service providers, both overall and by topic and sentiment.
The topic modeling stage generated six topics regarding the services, apps and fares of all three ride-hailing service providers. Once the topics had been decided, the data were classified by topic and by positive or negative sentiment. The classification model yielded an accuracy of 86% for the first service provider, 91% for the second, and 87% for the third. The model was then used to classify new data, yielding the number of positive and negative tweets on each topic, which served as the basis for assessing the three providers. All of the resulting scores were negative, because negative tweets outnumbered positive ones. Nevertheless, the scores indicate that the first service provider had a better reputation than the other two, as it received positive comments on all the topics.
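The scoring step can be illustrated with a short sketch. The formula is not stated in this section, so the sketch assumes a common formulation of the Net Reputation Score, (positive − negative) / (positive + negative) × 100, with neutral mentions excluded. The function names and the tweet counts are hypothetical and are not the study's data.

```python
def net_reputation_score(positive, negative):
    """Assumed formulation: (positive - negative) / (positive + negative) * 100.

    Returns None when there are no positive or negative mentions at all.
    """
    total = positive + negative
    if total == 0:
        return None
    return 100.0 * (positive - negative) / total


def score_provider(counts_by_topic):
    """Return (per-topic scores, overall score) for one provider."""
    per_topic = {
        topic: net_reputation_score(counts["positive"], counts["negative"])
        for topic, counts in counts_by_topic.items()
    }
    total_positive = sum(c["positive"] for c in counts_by_topic.values())
    total_negative = sum(c["negative"] for c in counts_by_topic.values())
    return per_topic, net_reputation_score(total_positive, total_negative)


# Hypothetical tweet counts for one provider, for illustration only.
example_counts = {
    "fare": {"positive": 20, "negative": 80},
    "app": {"positive": 10, "negative": 30},
}
topic_scores, overall_score = score_provider(example_counts)
```

Under this formulation a score is negative whenever negative tweets outnumber positive ones, which is consistent with the entirely negative scores reported here, and a topic with no positive tweets at all scores −100.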
While the results could benefit the ride-hailing service providers, the text pre-processing phase of the approach warranted more time than it was given. The text contains an immense variety of acronyms, spellings and even local languages, while pre-processing software for Indonesian is still limited.
This research could be extended by including opinions and complaints from drivers, so that their needs and aspirations can also be fulfilled. In terms of the social media used, further research could add data from the comment sections of other platforms such as Facebook or Instagram, or from reviews on Google Play Store or the App Store. Research could also compare user opinions on the services of the second service provider before and after its acquisition of the third provider. In terms of the algorithms used, further research could apply other topic modeling techniques such as Latent Semantic Indexing (LSI) or Probabilistic Latent Semantic Analysis (PLSA), and other classification algorithms such as Decision Tree, Naïve Bayes or neural networks. Finally, further research could compare online ride-hailing services with public transportation, so that the advantages and disadvantages of each can be defined.
Blei, D.M., 2012. Probabilistic Topic Models. Communications of the ACM, Volume 55(4), pp. 77–84
Chakraborty, G., Pagolu, M., Garla, S., 2013. Text Mining and Analysis: Practical Methods, Examples, and Case Studies using SAS. Cary, North Carolina: SAS Publishing
Colace, F., Casaburi, L., De Santo, M., Greco, L., 2015. Sentiment Detection in Social Networks and in Collaborative Learning Environments. Computers in Human Behavior, Volume 51(B), pp. 1061–1067
Duan, J., Ai, Y., Li, X., 2015. LDA Topic Model for Microblog Recommendation. In: International Conference on Asian Language Processing, Suzhou
Gruen, B., Hornik, K., 2011. topicmodels: An R Package for Fitting Topic Models. Journal of Statistical Software, Volume 40(13), pp. 1–30
Han, J., Kamber, M., Pei, J., 2012. Data Mining: Concepts and Techniques. 3rd Edition. San Francisco: Morgan Kaufmann Publishers
He, W., Zha, S., Li, L., 2013. Social Media Competitive Analysis and Text Mining: A Case Study in the Pizza Industry. International Journal of Information Management, Volume 33(3), pp. 464–472
Hsu, C.W., Lin, C.J., 2002. A Comparison of Methods for Multiclass Support Vector Machines. IEEE Transactions on Neural Networks, Volume 13(2), pp. 415–425
Lim, K.W., Buntine, W., 2012. Twitter Opinion Topic Model: Extracting Product Opinions from Tweets by Leveraging Hashtags and Sentiment Lexicon. In: The 21st ACM International Conference on Information and Knowledge Management, Maui
Liu, Y., Bi, J.W., Fan, Z.P., 2017. Ranking Products Through Online Reviews: A Method based on Sentiment Analysis Technique and Intuitionistic Fuzzy Set Theory. Information Fusion, Volume 36, pp. 149–161
Marrese-Taylor, E., Velásquez, J.D., Bravo-Marquez, F., 2014. A Novel Deterministic Approach for Aspect-based Opinion Mining in Tourism Products Reviews. Expert Systems with Applications, Volume 41, pp. 7764–7775
Miner, G., Elder IV, J., Fast, A., Hill, T., Nisbet, R., Delen, D., 2012. Practical Text Mining and Statistical Analysis for Non-structured Text Data Applications. Oxford: Elsevier
Philander, K., Zhong, Y., 2016. Twitter Sentiment Analysis: Capturing Sentiment from Integrated Resort Tweets. International Journal of Hospitality Management, Volume 55, pp. 16–24
Prameswari, P., Surjandari, I., Zulkarnain, Laoh, E., 2017. Mining Online Reviews in Indonesia’s Priority Tourist Destinations using Sentiment Analysis and Text Summarization Approach. In: IEEE 8th International Conference on Awareness Science and Technology, Kaohsiung
Putri, I.R., Kusumaningrum, R., 2017. Latent Dirichlet Allocation (LDA) for Sentiment Analysis Toward Tourism Review in Indonesia. Journal of Physics: Conference Series, Volume 801(1), pp. 1–6
Rokach, L., Maimon, O., 2015. Data Mining with Decision Trees: Theory and Applications. 2nd Edition. Singapore: World Scientific Publishing
Social Meteor, 2017. Net Reputation Score: Dropping Neutral Mentions Means More. Available Online at: http://www.socialmeteor.com/2017/03/16/net-reputation-score-dropping-neutral-means/, Accessed on March 16th, 2017
Surjandari, I., Megawati, C., Dhini, A., Hardaya, I.B.N.S., 2016. Application of Text Mining for Classification of Textual Reports: A Study of Indonesia’s National Complaint Handling System. In: Sixth International Conference on Industrial Engineering and Operations Management, Kuala Lumpur
Surjandari, I., Naffisah, M.S., Prawiradinata, M.I., 2015. Text Mining of Twitter Data for Public Sentiment Analysis of Staple Foods Price Changes. Journal of Industrial and Intelligent Information, Volume 3(3), pp. 253–257
Tang, J., Liu, H., 2014. Feature Selection for Social Media Data. ACM Transactions on Knowledge Discovery from Data (TKDD), Volume 8(4), pp. 1–27
Tong, Z., Zhang, H., 2016. A Text Mining Research Based on LDA Topic Modelling. In: The Sixth International Conference on Computer Science, Engineering and Information Technology, Vienna
Vidya, N.A., Fanany, M.I., Budi, I., 2015. Twitter Sentiment to Analyze Net Brand Reputation of Mobile Phone Providers. Procedia Computer Science, Volume 72, pp. 519–526
Wang, Z., Xue, X., 2014. Multi-class Support Vector Machines. Support Vector Machines Applications. Basel: Springer International Publishing, pp. 23–49