Emmanouil Tzorakoleftherakis, MathWorks
Design, train, and simulate reinforcement learning agents using a visual, interactive workflow in the Reinforcement Learning Designer app. Use the app to set up a reinforcement learning problem in Reinforcement Learning Toolbox™ without writing MATLAB® code. This video walks through the entire reinforcement learning workflow.
As of MATLAB R2021a, Reinforcement Learning Toolbox lets you interactively design, train, and simulate RL agents with the new Reinforcement Learning Designer app. Open the app from the command line or from the MATLAB toolstrip. First, you need to create the environment object that your agent will train against. Reinforcement Learning Designer lets you import environment objects from the MATLAB workspace, select from several predefined environments, or create your own custom environment. For this example, let's create a predefined cart-pole MATLAB environment with a discrete action space, and let's also import a custom Simulink environment of a four-legged robot with a continuous action space from the MATLAB workspace. You can delete or rename environment objects from the Environments pane as needed, and you can view the dimensions of the observation and action spaces in the Preview pane.

To create an agent, click New in the Agent section on the Reinforcement Learning tab. Based on the selected environment and the nature of the observation and action spaces, the app shows a list of compatible built-in training algorithms. For this demo, we will select the DQN algorithm. The app generates a DQN agent with a default critic architecture. You can adjust some of the default values for the critic as needed before creating the agent. The new agent appears in the Agents pane, and the Agent Editor shows a summary view of the agent and the hyperparameters that can be tuned. For example, let's change the agent's sample time and the critic's learn rate. Here, we can also adjust the exploration strategy of the agent and see how exploration will progress with respect to the number of training steps. To view the default critic network, click View Critic Model on the DQN Agent tab. The Deep Learning Network Analyzer opens and displays the critic structure. You can change the critic neural network by importing a different critic network from the workspace. You can also import a different set of agent options or a different critic representation object altogether.

Click Train to specify training options such as stopping criteria for the agent. Here, let's set the maximum number of episodes to 1000 and leave the rest at their default values. To parallelize training, click the Use Parallel button. Parallelization options include additional settings such as the type of data workers will send back, whether data will be sent synchronously, and more. After setting the training options, you can generate a MATLAB script with the specified settings that you can use outside the app if needed. To start training, click Train. During the training process, the app opens the Training Session tab and displays the training progress. If a visualization of the environment is available, you can also view how the environment responds during training. You can stop training at any time and choose to accept or discard the results. Accepted results show up under the Results pane, and a new trained agent also appears under Agents.

To simulate an agent, go to the Simulate tab and select the appropriate agent and environment object from the drop-down list. For this task, let's import a pretrained agent for the four-legged robot environment we imported at the beginning. Double-click the agent object to open the Agent Editor. You can see that this is a DDPG agent that takes in 44 continuous observations and outputs 8 continuous torques. In the Simulate tab, select the desired number of simulations and the simulation length. If you need to run a large number of simulations, you can run them in parallel. After you click Simulate, the app opens the Simulation Session tab. If available, you can view the visualization of the environment at this stage as well. When the simulations are completed, you will be able to see the reward for each simulation as well as the reward mean and standard deviation. Remember that the reward signal is provided as part of the environment. To analyze the simulation results, click Inspect Simulation Data. In the Simulation Data Inspector, you can view the saved signals for each simulation episode. If you want to keep the simulation results, click Accept.
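The steps above all have command-line counterparts. Here is a rough sketch of the setup stage, assuming a recent toolbox release; exact property paths such as CriticOptimizerOptions have changed between versions, and the sample time and learn rate values are only placeholders:

% Open the Reinforcement Learning Designer app (R2021a or later).
reinforcementLearningDesigner

% Create the predefined cart-pole MATLAB environment with a
% discrete action space, as in the app's environment gallery.
env = rlPredefinedEnv("CartPole-Discrete");

% Dimensions of the observation and action spaces, as shown in
% the app's Preview pane.
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);

% Create a DQN agent with a default critic architecture.
agent = rlDQNAgent(obsInfo, actInfo);

% Tune hyperparameters: sample time and the critic's learn rate
% (placeholder values; property path assumes a recent release).
agent.AgentOptions.SampleTime = 0.1;
agent.AgentOptions.CriticOptimizerOptions.LearnRate = 1e-3;

% Inspect the default critic network, as View Critic Model does.
analyzeNetwork(getModel(getCritic(agent)));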
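Training with the same settings chosen in the app might look like the sketch below; rlTrainingOptions, MaxEpisodes, UseParallel, and train are documented toolbox names, and the commented-out line mirrors the Use Parallel button:

% Training options matching the app settings: stop after at most
% 1000 episodes, everything else at its default value.
trainOpts = rlTrainingOptions('MaxEpisodes', 1000);

% Optional: distribute episodes across parallel workers, as the
% Use Parallel button does (requires Parallel Computing Toolbox).
% trainOpts.UseParallel = true;

% Train the agent; the result holds the per-episode statistics
% that the Training Session tab plots.
trainResults = train(agent, env, trainOpts);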
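Similarly, a hedged sketch of the simulation stage; the number of simulations and the episode length are illustrative, and the reward bookkeeping assumes the experience output format that sim returns for MATLAB environments:

% Simulation options mirroring the Simulate tab (illustrative values).
simOpts = rlSimulationOptions('NumSimulations', 10, 'MaxSteps', 500);

% Run the simulations; each element of the output stores the
% signals logged for one episode.
experiences = sim(env, agent, simOpts);

% Total reward per episode, plus the mean and standard deviation
% reported in the Simulation Session tab.
episodeReward = arrayfun(@(e) sum(e.Reward.Data), experiences);
fprintf('mean %.2f, std %.2f\n', mean(episodeReward), std(episodeReward));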
When you finish your work, you can export any of the agents shown under the Agents pane. For convenience, you can also directly export the underlying actor or critic representations, actor or critic neural networks, and agent options. To save the app session for future use, click Save Session on the Reinforcement Learning tab. For more information, please refer to the documentation of Reinforcement Learning Toolbox.
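For completeness, a minimal sketch of what those exports correspond to at the command line, assuming an agent variable already exported to the MATLAB workspace:

% Pull apart an exported agent with the standard accessors.
critic    = getCritic(agent);     % critic representation/function object
criticNet = getModel(critic);     % underlying deep neural network
agentOpts = agent.AgentOptions;   % agent hyperparameters
% For actor-critic agents such as the DDPG robot agent,
% getActor(agent) returns the actor in the same way.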