Deep Reinforcement Learning

DRL is for:

- Optimal control
- Decision making

Course Overview

Deep Reinforcement Learning (DRL) is one of the fastest-growing branches of Artificial Intelligence today. Unlike Computer Vision, which focuses on perception, DRL mainly solves decision-making and optimal control problems, and is widely considered a core technology of future Artificial Intelligence.

Some applications of DRL include:

- It improves the accuracy of Computer Vision systems;
- It is used to design and select neural network architectures;
- It is used in autonomous systems such as autonomous driving and robot learning;
- It is used to optimize financial transaction systems and city management systems;
- It is used to optimize manufacturing processes.

The Objectives

Upon completion of the program, students will be able to:

- Understand the concepts and the most popular algorithms in DRL;
- Understand how to choose a DRL algorithm for a given problem;
- Understand how to train a DRL model and the skills associated with it;
- Understand how DRL is used in popular applications such as financial investment, Computer Vision, robot learning, and self-driving vehicles;
- Finally, receive a certificate signed by Dr. Thomas Gold, AAFIE Chairman and professor at the University of California, Berkeley.

The Prerequisites

You are expected to have a basic understanding of Machine Learning prior to taking the program.

The main contents are as follows:

PART 1: FUNDAMENTALS

1. Basic Concepts of Reinforcement Learning

- Model-Free and Model-Based Learning
- Probability Distributions
- Stationary and Non-Stationary
- Policy
- Value Function
- Q-Function

2. Value and Policy Iteration

- Value Iteration (a short sketch follows this list)
- Policy Iteration
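
To make the value-iteration backup concrete, here is a minimal sketch on a made-up two-state, two-action MDP. The transition tensor P, reward matrix R, and discount factor are illustrative assumptions, not course material.

```python
# A minimal value-iteration sketch on a made-up 2-state, 2-action MDP.
# P[s, a, s'] = transition probability, R[s, a] = expected reward (toy values).
import numpy as np

n_states, n_actions, gamma = 2, 2, 0.9
P = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.5, 0.5], [0.0, 1.0]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])

V = np.zeros(n_states)
for _ in range(1000):
    # Bellman optimality backup: V(s) <- max_a [ R(s,a) + gamma * sum_s' P(s'|s,a) V(s') ]
    Q = R + gamma * P @ V            # shape (n_states, n_actions)
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break                        # stop once the backup has converged
    V = V_new

greedy_policy = Q.argmax(axis=1)     # greedy policy w.r.t. the converged values
print("V* ≈", V, "| greedy policy:", greedy_policy)
```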

3. Prediction Problems

- Monte Carlo Learning
- TD Learning (a TD(0) sketch follows this list)
- TD(λ)
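
As a taste of the TD-learning material, below is a minimal TD(0) prediction sketch on a hypothetical five-state random walk under a fixed random policy; the environment and step counts are invented for illustration only.

```python
# A minimal TD(0) prediction sketch on a hypothetical 5-state random walk.
# The policy is fixed (uniformly random), so we only estimate state values.
import random

n_states, alpha, gamma = 5, 0.1, 1.0
V = [0.0] * n_states                  # value estimates for states 0..4

for episode in range(2000):
    s = n_states // 2                 # start each episode in the middle state
    while True:
        s_next = s + random.choice([-1, 1])
        if s_next < 0 or s_next >= n_states:
            # Terminal: stepping off the right end pays +1, the left end pays 0.
            r, v_next, done = (1.0 if s_next >= n_states else 0.0), 0.0, True
        else:
            r, v_next, done = 0.0, V[s_next], False
        # TD(0) update: V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s))
        V[s] += alpha * (r + gamma * v_next - V[s])
        if done:
            break
        s = s_next

print("Estimated state values:", [round(v, 2) for v in V])
```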

4. Policy Control Problems

- Q-Learning (a minimal tabular sketch follows this list)
- Bellman Equation
- Deep Q-Learning (DQN)
- Case Studies
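
The sketch below shows the tabular Q-Learning update, whose target comes from the Bellman optimality equation, on a hypothetical five-state chain; the environment and hyperparameters are assumptions made for illustration.

```python
# A minimal tabular Q-Learning sketch on a hypothetical 5-state chain:
# the agent starts in state 0 and gets reward +1 for reaching state 4.
import random

n_states, n_actions = 5, 2            # actions: 0 = left, 1 = right
alpha, gamma, epsilon = 0.1, 0.95, 0.1
Q = [[0.0] * n_actions for _ in range(n_states)]

def step(s, a):
    s_next = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
    done = s_next == n_states - 1
    return s_next, (1.0 if done else 0.0), done

for episode in range(500):
    s, done = 0, False
    while not done:
        # Epsilon-greedy action selection
        if random.random() < epsilon:
            a = random.randrange(n_actions)
        else:
            a = max(range(n_actions), key=lambda x: Q[s][x])
        s_next, r, done = step(s, a)
        # Q-Learning update from the Bellman optimality equation:
        # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        target = r if done else r + gamma * max(Q[s_next])
        Q[s][a] += alpha * (target - Q[s][a])
        s = s_next

print("Greedy action per state:", [max(range(n_actions), key=lambda a: Q[s][a]) for s in range(n_states)])
```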

5. Improvements to DQN

- Double DQN (its target is sketched after this list)
- Dueling Network Architecture
- Soft Q-Learning
- Recent Papers on Improvements to DQN
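
As a pointer to the Double DQN idea, this sketch contrasts the standard DQN bootstrap target with the Double DQN target; the arrays stand in for the outputs of the online and target networks and are assumptions of this illustration.

```python
# A minimal contrast of the DQN and Double DQN bootstrap targets. The arrays
# q_online_next and q_target_next stand in for network outputs Q(s', .).
import numpy as np

def dqn_target(r, gamma, q_target_next):
    # Standard DQN: the target network both selects and evaluates the action.
    return r + gamma * np.max(q_target_next)

def double_dqn_target(r, gamma, q_online_next, q_target_next):
    # Double DQN: the online network selects the action, the target network
    # evaluates it, which reduces the overestimation bias of the max operator.
    best_action = int(np.argmax(q_online_next))
    return r + gamma * q_target_next[best_action]

q_online_next = np.array([1.0, 2.5, 2.0])   # toy values
q_target_next = np.array([1.2, 1.0, 3.0])
print(dqn_target(0.0, 0.99, q_target_next),
      double_dqn_target(0.0, 0.99, q_online_next, q_target_next))
```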

6. Sarsa

- Sarsa Algorithm
- Q-Learning vs. Sarsa (the two update rules are contrasted below)
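
The difference between the two algorithms compared in this module is easiest to see in their update rules; here is a minimal tabular sketch, assuming Q is a simple list-of-lists table.

```python
# The two tabular update rules side by side, assuming Q is a list of lists.

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99):
    # On-policy: the target uses the action a_next actually taken by the behaviour policy.
    Q[s][a] += alpha * (r + gamma * Q[s_next][a_next] - Q[s][a])

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    # Off-policy: the target uses the greedy (max) action, regardless of what is executed next.
    Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])

Q = [[0.0, 0.0] for _ in range(3)]
sarsa_update(Q, s=0, a=1, r=1.0, s_next=1, a_next=0)
q_learning_update(Q, s=0, a=1, r=1.0, s_next=1)
```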

7. Policy Gradient Methods

- Introduction to Policy Gradient Methods
- Vanilla Policy Gradient
- REINFORCE Algorithm (sketched below on a toy problem)
- Actor-Critic
- A3C
- A2C
- Natural Policy Gradient and TRPO
- Proximal Policy Optimization (PPO)
- Deterministic Policy Gradient (DPG)
- Deep Deterministic Policy Gradient (DDPG)
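
Below is a minimal REINFORCE sketch on a hypothetical three-armed bandit (one-step episodes) with a running-average baseline; the payoff probabilities and learning rate are invented for illustration, and real coursework would typically use a richer environment.

```python
# A minimal REINFORCE sketch on a hypothetical 3-armed bandit (one-step episodes).
import numpy as np

rng = np.random.default_rng(0)
true_reward_prob = np.array([0.2, 0.5, 0.8])     # assumed payoff of each arm
theta = np.zeros(3)                              # softmax policy parameters
lr, baseline = 0.1, 0.0

for episode in range(2000):
    pi = np.exp(theta - theta.max())
    pi /= pi.sum()                               # softmax action distribution
    a = rng.choice(3, p=pi)
    G = float(rng.random() < true_reward_prob[a])    # Monte Carlo return (one step)

    baseline += 0.01 * (G - baseline)            # running-average baseline reduces variance
    # For a softmax policy, the gradient of log pi(a | theta) is one_hot(a) - pi.
    grad_log_pi = -pi
    grad_log_pi[a] += 1.0
    theta += lr * (G - baseline) * grad_log_pi

print("Learned action probabilities:", np.round(pi, 3))
```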

8. Deep Imitation Learning

- Demonstrations
- DAgger
- Few-Shot Imitation Learning
- Policy Aggregation
- Policy Gradient with Demonstrations

9. Soft Actor-Critic and Applications

- Maximum Entropy RL
- Soft Policy and Soft Actor-Critic
- The Optimization Problem
- The Soft Actor-Critic Algorithm
- Applications

10. Meta Learning

- RL²: Fast Reinforcement Learning via Slow Reinforcement Learning
- A Simple Neural Attentive Meta-Learner
- Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks

11. Hierarchical Reinforcement Learning

- Data-Efficient Hierarchical RL
- FeUdal Networks

12. Vision-based Robotic Manipulation

- Imagined Goals
- QT-Opt

PART 2: APPLICATIONS

We will cover each of the following applications in detail:

- DRL for Computer Vision
- DRL for neural network structure learning
- DRL for autonomous driving
- DRL for product design
- DRL for robot learning
- DRL for financial investment