International Journal of Applied Science and Engineering
Published by Chaoyang University of Technology

Subha Jyoti Das¹, Riki Murakami¹, Basabi Chakraborty²*

¹ Graduate School of Software and Information Science, Iwate Prefectural University, Iwate, Japan
² Faculty of Software and Information Science, Iwate Prefectural University, Iwate, Japan




ABSTRACT


Summarizing customers' online reviews is a popular way to evaluate products or services. As reviews accumulate, their volume and unstructured nature hinder manual summarization. Automatically categorizing whole reviews into only positive and negative groups cannot give a clear picture. An aspect-based automatic summarization technique can provide better visualization. However, automatically extracting the proper aspects from the large body of reviews for a product is not easy. Some research has been done in this direction, but no definitive method has yet emerged. In this work, a two-step technique based on Latent Dirichlet Allocation (LDA), a method widely used for topic modelling, has been developed for efficient aspect extraction. The method has been evaluated by simulation experiments on Amazon product reviews and on Yelp restaurant and hotel reviews. The results were found to agree well with human-annotated results.


Keywords: Opinion analysis, Aspect extraction, Review summarization, Two-step LDA.





ARTICLE INFORMATION


Received: 2020-06-02
Revised: 2020-08-27
Accepted: 2020-12-16
Available Online: 2021-03-01


Cite this article:

Das, S.J., Murakami, R., Chakraborty, B. 2021. Development of a two-step LDA based aspect extraction technique for review summarization. International Journal of Applied Science and Engineering, 18, 2020120. https://doi.org/10.6703/IJASE.202103_18(1).003

  Copyright The Author(s). This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are cited.