Author: Mohamed, Nesma Mostafa Ashraf./ Title: An efficient deep reinforcement learning model for video games /


Title

An efficient deep reinforcement learning model for video games /

Author

Mohamed, Nesma Mostafa Ashraf.

Preparation Committee

Researcher / نسمة مصطفي أشرف محمد

Supervisor / مجدي زكريا رشاد

Supervisor / ريهام رضا مصطفي

Supervisor / رشا حسن صقر

Examiner / مجدي زكريا رشاد

Subject

Video games - Design. Mass media and children. Television and children.

Publication Date

2021.

Number of Pages

online resource (110 pages) :

Language

English

Degree

Master's

Specialization

Computer Science (miscellaneous)

Date of Approval

1/1/2021

Place of Approval

Mansoura University - Faculty of Computers and Information - Computer Science

Index

Only 14 of 110 pages are available for public view

Abstract

Deep Reinforcement Learning (DRL) is attracting the attention of the academic community now more than ever, because of the availability of computational capabilities and the deep neural network revolution, which makes it possible to find compact low-dimensional features of high-dimensional data automatically. DRL agents are considered a promising step towards fully autonomous agents that learn by trial and error with little or no prior knowledge of the environment they are dealing with, since DRL involves creative ways of using neural networks that bring us steps closer to creating AI agents that can deal with the real world. DRL enables agents to make decisions based on a well-designed reward function that suits a particular environment, without any prior knowledge of that environment. The choice of values for the hyperparameters of the learning algorithm can, however, have a major effect on the overall learning process and on the time needed to finish training. These hyperparameters must be accurately defined before training begins. A standard method of selecting them is a manual search for a suitable parameter set, which requires appropriate expertise and experience. An automated search process for hyperparameters provides a significant advantage, as only the optimal hyperparameters allow DRL algorithms to produce optimal results for a given task. This thesis uses a swarm-based optimization algorithm called the Whale Optimization Algorithm (WOA) to optimize the selection of the hyperparameters of the Deep Deterministic Policy Gradient (DDPG) algorithm in order to achieve the optimum control strategy in the autonomous driving field. DDPG is a state-of-the-art DRL algorithm that is capable of handling complex environments with continuous action spaces.
To test the proposed method, a realistic autonomous driving simulation environment called The Open Racing Car Simulator (TORCS) was chosen as the environment of investigation. A set of appropriate sensor information and rewards from TORCS was carefully selected. In the experimental evaluation, the optimized DDPG hyperparameters were compared with a set of reference hyperparameters suggested by an expert. The experimental results showed that optimizing the DDPG hyperparameters maximizes the total reward over the testing episodes while maintaining a stable driving policy.
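
As a rough illustration of the kind of search the abstract describes, the following minimal sketch runs a basic Whale Optimization Algorithm loop over a box of hyperparameter values. It is not the thesis's implementation. The function names, the three example hyperparameters (log actor learning rate, log critic learning rate, discount factor), their ranges, and the toy objective are all assumptions made for illustration. In a real setting the objective would train DDPG in TORCS and return, for example, the negative total reward. The sketch also uses a scalar coefficient A per whale, which is a common simplification.

```python
import math
import random


def whale_optimization(objective, bounds, n_whales=10, n_iters=50, b=1.0, seed=0):
    """Minimise `objective` over the box `bounds` with a basic WOA loop."""
    rng = random.Random(seed)

    def clip(x):
        # Keep every coordinate inside its (low, high) range.
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    # Random initial population inside the search box.
    whales = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_whales)]
    scores = [objective(w) for w in whales]
    best_idx = min(range(n_whales), key=scores.__getitem__)
    best, best_score = list(whales[best_idx]), scores[best_idx]

    for t in range(n_iters):
        a = 2.0 * (1.0 - t / n_iters)  # decreases linearly from 2 towards 0
        for i in range(n_whales):
            x = whales[i]
            A = 2.0 * a * rng.random() - a  # scalar coefficient in [-a, a]
            C = 2.0 * rng.random()
            if rng.random() < 0.5:
                # Encircle the best whale (|A| < 1) or explore around a random one.
                ref = best if abs(A) < 1.0 else whales[rng.randrange(n_whales)]
                new = [r - A * abs(C * r - v) for r, v in zip(ref, x)]
            else:
                # Spiral (bubble-net) move around the best whale.
                l = rng.uniform(-1.0, 1.0)
                spiral = math.exp(b * l) * math.cos(2.0 * math.pi * l)
                new = [abs(r - v) * spiral + r for r, v in zip(best, x)]
            whales[i] = clip(new)
            score = objective(whales[i])
            # The best whale is updated as soon as a better one is found.
            if score < best_score:
                best, best_score = list(whales[i]), score
    return best, best_score


def toy_negative_reward(params):
    # Hypothetical stand-in for "train DDPG with these values, return minus total reward".
    log_actor_lr, log_critic_lr, gamma = params
    return (log_actor_lr + 4.0) ** 2 + (log_critic_lr + 3.0) ** 2 + 100.0 * (gamma - 0.99) ** 2


# Assumed search ranges: log10 learning rates and the discount factor.
SEARCH_BOX = [(-6.0, -2.0), (-6.0, -2.0), (0.90, 0.999)]
best_params, best_value = whale_optimization(toy_negative_reward, SEARCH_BOX)
```

Because each objective evaluation would correspond to a full training run in practice, the population size and iteration count are the main cost knobs. The small defaults here are placeholders, not values taken from the thesis.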