Sentiment analysis on public opinion of electric vehicles usage in Indonesia using support vector machine algorithms

Naufal Avilandi Poedjimartojo, Dita Pramesti, Riska Yanu Fa’rifah

Abstract


The automotive industry has seen significant technological progress in recent years. Many electric vehicles are now being produced as an environmentally friendly alternative to conventional vehicles. The use of electric vehicles has become a widely discussed topic in society, giving rise to various responses and opinions on Twitter. This research aims to analyze the sentiment of the Indonesian public toward electric vehicle use, based on data collected from Twitter. Sentiment analysis is carried out using a machine-learning approach: a Support Vector Machine (SVM), regarded here as a strong method for pattern recognition problems, sorts each comment into positive or negative sentiment. SVM classification performance was measured using a confusion matrix. The Synthetic Minority Over-Sampling Technique (SMOTE) and Random Undersampling (RUS) were applied to address class imbalance. After model building and performance evaluation, the best model was the baseline SVM with a 70:30 data split and no imbalance handling. This model achieved an accuracy of 94.8%, a precision of 95.5%, a recall of 99.1%, and an F1-score of 97.2%.
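As an illustration only (this is not the authors' code), the sketch below shows in plain Python how random undersampling can balance a labeled set and how accuracy, precision, recall, and F1-score follow from the four confusion-matrix counts. The function names `random_undersample` and `confusion_metrics` are hypothetical, and the toy labels are invented for the example; they are not the study's data. SMOTE and the SVM itself are not shown.

```python
import random
from collections import Counter


def random_undersample(samples, labels, seed=0):
    """Randomly keep only as many items per class as the smallest class has."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = min(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        for sample in rng.sample(items, target):
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


def confusion_metrics(y_true, y_pred, positive="positive"):
    """Derive the usual scores from binary confusion-matrix counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy, made-up labels: 8 positive and 2 negative comments (imbalanced).
y_true = ["positive"] * 8 + ["negative"] * 2
# Hypothetical classifier output: one positive missed, one negative misread.
y_pred = ["positive"] * 7 + ["negative"] + ["positive"] + ["negative"]

metrics = confusion_metrics(y_true, y_pred)

# Balance the toy set by undersampling the majority (positive) class.
texts = [f"tweet {i}" for i in range(len(y_true))]
balanced_texts, balanced_labels = random_undersample(texts, y_true)
balanced_counts = Counter(balanced_labels)
```

Random undersampling discards majority-class examples, which is one plausible reason a baseline model trained on all the data can still score higher, as reported above.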

Keywords


Electric Vehicle; Twitter; Sentiment Analysis; Support Vector Machine; Oversampling; Undersampling; SMOTE; RUS


DOI: http://dx.doi.org/10.36055/tjst.v19i2.21967


Copyright (c) 2023 Teknika: Jurnal Sains dan Teknologi

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Teknika: Jurnal Sains dan Teknologi is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.