Handmade RL -1

2018-05-26

Building a Machine Learning Agent with Unity : First story : Basic structure explained

wonseok Jung


Getting Started with the 3D Balance Ball Environment


Understanding a Unity Environment (3D Balance Ball)

  • An agent observes and interacts with an environment

  • In Unity, an environment contains :

    • Academy
    • One or more Brains
    • Agent objects

1. Academy

  • The Academy object for the scene is placed on the Ball3DAcademy GameObject
  • It has several properties that control how the environment works
    • Training and Inference Configuration : set the graphics and timescale
    • Training Configuration : the Academy uses it during training
    • Inference Configuration : the Academy uses it when not training

1.1 Setting graphics and time

  • Training Configuration : low graphics quality, high time scale
  • Inference Configuration : high graphics quality, time scale 1.0
  • Observing the environment during training
    • Adjust the Inference Configuration : use a larger window and a timescale closer to 1:1
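The two configurations above can be pictured as two settings blocks, with the Academy picking one depending on the mode. The following is a minimal Python sketch of that idea, not the Unity API: EngineConfig, select_config and every numeric value are assumptions chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EngineConfig:
    """Hypothetical stand-in for one configuration block on the Academy."""
    width: int
    height: int
    quality_level: int  # higher means better graphics
    time_scale: float   # 1.0 means real time


# Illustrative values only (assumptions, not taken from the slides).
TRAINING = EngineConfig(width=80, height=80, quality_level=1, time_scale=100.0)
INFERENCE = EngineConfig(width=1280, height=720, quality_level=5, time_scale=1.0)


def select_config(is_training: bool) -> EngineConfig:
    """Return the block that would apply for the current mode."""
    return TRAINING if is_training else INFERENCE
```

Training trades graphics quality for simulation speed, while inference runs in real time so the result can be watched.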

1.2 Three functions

  • There are three functions you can implement
    1. Academy.InitializeAcademy() : Called once when the environment is launched
    2. Academy.AcademyStep() : Called at every simulation step, after the agents collect their observations and before Agent.AgentAction()
    3. Academy.AcademyReset() : Called when the Academy starts or restarts the simulation
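The call order described above can be sketched as a small Python model. This is only an illustration of the ordering, not Unity code: the class below is a hypothetical stand-in that records which callback would fire, with agents represented by their names.

```python
class Academy:
    """Toy model that logs the order of the callbacks (not the Unity class)."""

    def __init__(self, agent_names):
        self.agent_names = list(agent_names)
        self.log = []
        self.initialize_academy()  # once, when the environment is launched
        self.academy_reset()       # when the simulation starts

    def initialize_academy(self):
        self.log.append("InitializeAcademy")

    def academy_reset(self):
        self.log.append("AcademyReset")

    def academy_step(self):
        self.log.append("AcademyStep")

    def step(self):
        """One simulation step: observe, then AcademyStep, then act."""
        for name in self.agent_names:
            self.log.append(f"{name}.CollectObservations")
        self.academy_step()
        for name in self.agent_names:
            self.log.append(f"{name}.AgentAction")


academy = Academy(["agent0"])
academy.step()
print(academy.log)
```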

2. Brain

  • The Brain doesn’t store any information about an agent
  • It routes the agent’s collected observations to the decision-making process and returns the chosen action to the agent
  • All agents can share the same brain and still act differently, because each one sends its own observations
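A minimal Python sketch of a shared brain, assuming the brain is nothing more than a routing layer around a decision function. Brain, Agent, route and tilt_against are hypothetical names for illustration, not ML-Agents identifiers.

```python
class Brain:
    """Holds only a decision function; keeps no per-agent state."""

    def __init__(self, decide):
        self._decide = decide

    def route(self, observation):
        # Pass the observation to the decision process, return its action.
        return self._decide(observation)


class Agent:
    def __init__(self, name, brain, observation):
        self.name = name
        self.brain = brain
        self.observation = observation

    def step(self):
        return self.brain.route(self.observation)


def tilt_against(observation):
    """Hypothetical policy: push opposite to the observed offset."""
    return [-x for x in observation]


shared = Brain(tilt_against)
left = Agent("left", shared, [-0.5])
right = Agent("right", shared, [0.25])
print(left.step(), right.step())
```

Both agents use the same Brain instance, yet their actions differ because their observations differ.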

2.1 Types of Brains

  • Brain Type : how an agent makes its decisions
    • External type : when you train your agents
    • Internal type : when you use the trained model
    • Heuristic brain : allows you to hand-code the agent’s logic
    • Player brain : lets you map keyboard commands to actions

You can also implement your own type of brain
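The brain type only changes where the decision comes from. The Python sketch below illustrates that for the Heuristic and Player types; the External and Internal types are left unimplemented because they need a trainer or a trained model. The enum, the key bindings and the rule are all assumptions made for this example.

```python
from enum import Enum


class BrainType(Enum):
    EXTERNAL = "external"    # decisions come from the training process
    INTERNAL = "internal"    # decisions come from a trained model
    HEURISTIC = "heuristic"  # decisions come from hand-coded logic
    PLAYER = "player"        # decisions come from keyboard input


# Hypothetical key bindings for a Player brain.
KEY_TO_ACTION = {"left": 0, "right": 1}


def heuristic_decide(observation):
    """Hand-coded rule: action 1 if the first value is negative, else 0."""
    return 1 if observation[0] < 0 else 0


def player_decide(pressed_key):
    """Look up the action bound to the pressed key."""
    return KEY_TO_ACTION[pressed_key]


def decide(brain_type, observation=None, pressed_key=None):
    if brain_type is BrainType.HEURISTIC:
        return heuristic_decide(observation)
    if brain_type is BrainType.PLAYER:
        return player_decide(pressed_key)
    raise NotImplementedError(f"{brain_type.name} needs a trainer or a trained model")
```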


3. Vector Observation Space

  • ML-Agents classifies vector observations into two types:

    • Continuous : a vector of floating-point numbers
    • Discrete : an index into a table of states
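To make the two observation types concrete, here is a small Python sketch. The values and the state table are hypothetical and only show the shape of each type.

```python
# Continuous: a vector of floating-point numbers (hypothetical values).
continuous_observation = [0.1, -0.3, 0.0, 1.5, 4.0, -2.0]

# Discrete: a single index into a table of states (hypothetical table).
STATE_TABLE = ["ball_on_platform", "ball_falling", "ball_dropped"]
discrete_observation = 1


def describe(observation):
    """Report which kind of observation was passed in."""
    if isinstance(observation, int):
        return f"discrete: {STATE_TABLE[observation]}"
    return f"continuous: {len(observation)} values"


print(describe(continuous_observation))
print(describe(discrete_observation))
```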

4. Vector Action Space

  • An agent is given instructions from the brain in the form of actions.

  • Two types of action:

    • Continuous : a vector of numbers that can vary continuously (e.g. force, torque)
    • Discrete : the action space defines its actions as a table, and each action is an index into this table
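The difference between the two action types can be sketched in Python as follows. The two-torque layout and the action table are assumptions for illustration, not the actual 3D Balance Ball settings.

```python
def apply_continuous(action):
    """Interpret a continuous action as two torque values (hypothetical layout)."""
    torque_x, torque_z = action
    return {"torque_x": torque_x, "torque_z": torque_z}


# Hypothetical table of discrete actions.
ACTION_TABLE = ["tilt_left", "tilt_right", "tilt_forward", "tilt_back"]


def apply_discrete(action_index):
    """A discrete action is just an index into the table."""
    return ACTION_TABLE[action_index]


print(apply_continuous([0.5, -0.2]))
print(apply_discrete(2))
```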

5. Agent

  • The Agent is the actor that observes and takes actions in the environment
  • An Agent object has a few properties that affect its behavior:
    • Brain - Every agent must have a brain. The brain determines how an agent makes decisions.
    • Visual Observations - Camera objects used by the agent to observe the environment.
    • Max Step - How many simulation steps can occur before the agent decides it is done.
    • Reset On Done - Defines whether an agent starts over when it is finished.

5.1 Agent subclass implementation

  • Agent.AgentReset() - Called when the Agent resets, including at the beginning of a session.
  • Agent.CollectObservations() - Called every simulation step; collects the agent’s observations.
  • Agent.AgentAction() - Called every simulation step; receives the action chosen by the brain.
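The three callbacks, together with Max Step and Reset On Done, can be sketched as a toy Python model. This is not the Unity Agent class: the method names are Python stand-ins, the placeholder observation is invented, and the decide argument plays the role of the brain.

```python
class Agent:
    """Toy model of the agent lifecycle (not the Unity class)."""

    def __init__(self, max_step, reset_on_done=True):
        self.max_step = max_step
        self.reset_on_done = reset_on_done
        self.calls = []
        self.agent_reset()  # also runs at the beginning of a session

    def agent_reset(self):
        self.calls.append("AgentReset")
        self.step_count = 0
        self.done = False

    def collect_observations(self):
        self.calls.append("CollectObservations")
        return [float(self.step_count)]  # placeholder observation

    def agent_action(self, action):
        self.calls.append("AgentAction")

    def step(self, decide):
        if self.done:
            return  # finished and not reset: nothing more happens
        observation = self.collect_observations()
        self.agent_action(decide(observation))
        self.step_count += 1
        if self.step_count >= self.max_step:
            self.done = True
            if self.reset_on_done:
                self.agent_reset()
```

With max_step=2, the third call to step() starts a new episode if reset_on_done is true; otherwise the agent stays done.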