Islamic Chatbot Based on Reinforcement Learning Using Q-Learning Algorithm

Authors

  • Aria Octavian Hamza Department of Informatics, UIN Sunan Gunung Djati Bandung
  • Denis Firmansyah Department of Informatics, UIN Sunan Gunung Djati Bandung
  • Abidzar Giffari Department of Informatics, UIN Sunan Gunung Djati Bandung
  • Aisyah Muthmainnah Department of Informatics, UIN Sunan Gunung Djati Bandung
  • Adil Zukhruf Firdaus Department of Informatics, UIN Sunan Gunung Djati Bandung

DOI:

https://doi.org/10.15575/kjrt.v3i2.1731

Keywords:

Chatbot, Q-Learning, Reinforcement Learning, sentence-transformers

Abstract

This study develops a retrieval-based chatbot that combines a Reinforcement Learning (RL) approach, using the Q-Learning algorithm, with a Sentence-Transformer (SBERT) model to capture the semantic context of user questions. The system maps questions into vector representations, computes semantic similarity against the training data, and selects answers based on learned Q-values. The dataset consists of question-answer pairs in JSON format. Training takes place in a simulated question-and-answer environment, where the RL agent is rewarded according to the suitability of the selected answer. Test results show that the chatbot provides relevant, contextual responses even when the sentence structure differs from the training data. To improve accessibility, the system is packaged as a REST API using Flask and integrated into a Flutter-based mobile application as the user interface. The approach proves efficient and computationally lightweight, offering a promising alternative for developing responsive retrieval-based educational chatbots.
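The pipeline the abstract describes can be sketched in miniature as follows. This is not the authors' implementation: the embeddings, questions, answers, hyperparameters, and reward scheme below are all toy assumptions standing in for the paper's SBERT vectors and JSON question-answer dataset. It only illustrates the stated idea of tabular Q-learning where states are training questions, actions are candidate answers, and an incoming question is mapped to the nearest known question by cosine similarity.

```python
import math
import random

# Hypothetical 3-dimensional vectors standing in for SBERT sentence
# embeddings; the paper obtains these from a sentence-transformers model.
EMBEDDINGS = {
    "what is zakat": [0.90, 0.10, 0.00],
    "explain zakat": [0.85, 0.20, 0.05],  # paraphrase of the first question
    "what is sholat": [0.10, 0.90, 0.10],
}
ANSWERS = ["Zakat is obligatory almsgiving.", "Sholat is the daily prayer."]
# Ground-truth answer index per training question (toy supervision signal).
CORRECT = {"what is zakat": 0, "explain zakat": 0, "what is sholat": 1}

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def nearest_state(query_vec):
    """Map an incoming question vector to the most similar known question."""
    return max(EMBEDDINGS, key=lambda s: cosine(query_vec, EMBEDDINGS[s]))

# Tabular Q-learning: states = training questions, actions = answer indices.
# gamma = 0 because each Q&A exchange is a single-step episode.
ALPHA, GAMMA, EPSILON = 0.5, 0.0, 0.2
q_table = {s: [0.0] * len(ANSWERS) for s in EMBEDDINGS}

random.seed(0)
for _ in range(200):  # simulated question-and-answer environment
    state = random.choice(list(EMBEDDINGS))
    if random.random() < EPSILON:                       # explore
        action = random.randrange(len(ANSWERS))
    else:                                               # exploit
        action = q_table[state].index(max(q_table[state]))
    # Reward the agent when the selected answer suits the question.
    reward = 1.0 if action == CORRECT[state] else -1.0
    q_table[state][action] += ALPHA * (reward - q_table[state][action])

def answer(query_vec):
    """Retrieve the answer with the highest learned Q-value."""
    state = nearest_state(query_vec)
    return ANSWERS[q_table[state].index(max(q_table[state]))]

# A vector resembling a paraphrase of the zakat questions still retrieves
# the zakat answer, mirroring the paper's robustness-to-rephrasing claim.
print(answer([0.88, 0.15, 0.02]))
```

In the full system, `answer` would sit behind a Flask route (e.g. a POST endpoint that encodes the request text with SBERT before the lookup), which the Flutter client calls over HTTP; that wiring is omitted here to keep the sketch self-contained.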


Published

2026-03-08

Issue

Section

Articles