VENKATA SIVA PRASAD BHARATHULA. The Role of Reward Models and Reinforcement Learning in LLM Fine-tuning. International Journal of Scientific Research in Computer Science, Engineering and Information Technology, [S. l.], v. 11, n. 2, p. 471–477, 2025. DOI: 10.32628/CSEIT25112381. Available at: https://www.ijsrcseit.com/index.php/home/article/view/CSEIT25112381. Accessed: 18 Sep. 2025.