Sentiment analysis of Indonesian reviews using fine-tuning IndoBERT and R-CNN


Herlina Jayadianti(1*); Wilis Kaswidjanti(2); Agung Tri Utomo(3); Shoffan Saifullah(4); Felix Andika Dwiyanto(5); Rafal Drezewski(6)

(1) Universitas Pembangunan Nasional Veteran Yogyakarta
(2) Universitas Pembangunan Nasional Veteran Yogyakarta
(3) Universitas Pembangunan Nasional Veteran Yogyakarta
(4) Universitas Pembangunan Nasional Veteran Yogyakarta, AGH University of Science and Technology
(5) AGH University of Science and Technology, Universitas Negeri Malang
(6) AGH University of Science and Technology
(*) Corresponding Author

  

Abstract


Reviews record users' experience with a product or service and help potential consumers decide whether to buy, use, or consume it. Business entities can also use them to learn public opinion about their products and how those products perform. Processing review data manually is difficult and time-consuming, so automated sentiment analysis can be used to extract polarity information from existing reviews. In this study, IndoBERT combined with a Recurrent Convolutional Neural Network (RCNN) was used to automate sentiment analysis of Indonesian reviews. The data was the sentiment analysis dataset from IndoNLU, labeled with three classes: negative, neutral, and positive. Test results showed that IndoBERT with RCNN outperformed the IndoBERT base model, achieving 95.16% accuracy, 94.05% precision, 92.74% recall, and a 93.27% F1 score.
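The four reported figures can be illustrated with a minimal sketch of how accuracy, precision, recall, and F1 are commonly computed for a three-class task. The abstract does not state the averaging scheme, so macro-averaging (the unweighted mean of per-class scores) is an assumption here, and the function name `macro_metrics` and the toy labels are hypothetical. One observation that is at least consistent with per-class averaging: the harmonic mean of the reported precision and recall would be about 93.39%, not the reported 93.27%.

```python
from collections import Counter

# The three sentiment classes named in the abstract.
LABELS = ("negative", "neutral", "positive")


def macro_metrics(y_true, y_pred, labels=LABELS):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro-averaging is an assumption; the abstract does not say which
    averaging scheme produced the reported scores.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy data, not taken from the IndoNLU dataset.
y_true = ["negative", "negative", "neutral", "positive", "positive", "positive"]
y_pred = ["negative", "neutral", "neutral", "positive", "positive", "negative"]
scores = macro_metrics(y_true, y_pred)
print({name: round(value, 4) for name, value in scores.items()})
```

With macro-averaging, the F1 score is the mean of the per-class F1 values, so it generally differs from the harmonic mean of the macro precision and macro recall.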


Keywords


Sentiment Analysis; IndoBERT; Recurrent Convolutional Neural Network; Pretrained Language Models

  
  

     

Digital Object Identifier

https://doi.org/10.33096/ilkom.v14i3.1505.348-354
  





Copyright (c) 2022 Shoffan Saifullah, Herlina Jayadianti, Wilis Kaswidjanti, Agung Tri Utomo

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.