A comparative study on classification models for stock rating prediction





Stock Rating, Machine Learning, Classification, Web Scraping, S&P 500


The digital transformation of the stockbroking industry has led to a significant increase in retail investors, who often lack the expertise to analyse stocks thoroughly. This research addresses that challenge by proposing a classification model that predicts stock ratings ("Reduce", "Hold", "Moderate Buy", and "Buy") to help retail investors make informed decisions. The data are collected for stocks in the S&P 500 index through web scraping with Beautiful Soup, and the resulting dataset is used to train and test the classification models. Popular stock indicators serve as the attributes for predicting a stock's rating: Exchange, Price, Volume, Market Cap, ROE, ROA, P/E Ratio, EPS, Annual Sales, Net Income, Net Margins, and PB Ratio. The models compared are K-Nearest Neighbors (k-NN), Gaussian Naive Bayes, Support Vector Machine (SVM), Decision Tree, and Random Forest. GridSearch is employed to tune each algorithm's parameters for optimal performance. Results indicate that the k-NN model outperforms the others, achieving the highest accuracy (0.618644) and weighted F1-score (0.605011). However, all models exhibit relatively low accuracy, which suggests that stock ratings are difficult to predict because they depend on external factors not considered in this study.
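The two metrics reported in the abstract can be illustrated with a minimal sketch. It assumes the usual definitions: accuracy is the share of correct predictions, and the weighted F1-score is the average of the per-class F1 scores, each weighted by the number of true samples in that class. The function names and the ten toy labels below are hypothetical and are not taken from the study's dataset or code.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true rating."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's support."""
    support = Counter(y_true)  # number of true samples per rating
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = n - tp
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0  # class never predicted correctly
        score += f1 * n / total
    return score


# Hypothetical toy labels, for illustration only.
y_true = ["Buy", "Buy", "Buy", "Buy", "Hold", "Hold", "Hold",
          "Moderate Buy", "Moderate Buy", "Reduce"]
y_pred = ["Buy", "Buy", "Buy", "Hold", "Hold", "Hold", "Buy",
          "Moderate Buy", "Hold", "Hold"]

print(f"accuracy    = {accuracy(y_true, y_pred):.4f}")
print(f"weighted F1 = {weighted_f1(y_true, y_pred):.4f}")
```

Because the weights follow class frequency, a rare rating such as "Reduce" contributes little to the weighted score even when it is never predicted correctly, which is one reason accuracy and weighted F1 can sit close together on imbalanced rating data.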



Author Biographies

Justin Yap, School of Information Technology, Universitas Ciputra Surabaya





Trianggoro Wiradinata, School of Information Technology, Universitas Ciputra Surabaya










How to Cite

J. Yap and T. Wiradinata, “A comparative study on classification models for stock rating prediction”, AITI, vol. 21, no. 1, pp. 140–151, Apr. 2024.