Deep Reinforcement Learning
DRL is for:
Optimal control
Decision making
Course Overview
Deep Reinforcement Learning (DRL) is one of the most active and fastest-growing branches of Artificial Intelligence today. Unlike Computer Vision, DRL mainly addresses decision-making and optimal control problems, and it is widely regarded as a core technology for the future of Artificial Intelligence.
Some applications of DRL:
- It is used to improve the accuracy of Computer Vision models;
- It is used to design/select the best neural network structure;
- It is used in autonomous systems such as autonomous driving and robot learning;
- It is used for optimizing financial transaction systems and city management systems;
- It is used for optimizing manufacturing processes.
The Objectives
Upon completion of the program, students will be able to:
- Understand the concepts and the most popular algorithms in DRL;
- Understand how to choose a DRL algorithm for a problem to be solved;
- Understand how to train a DRL model and the skills associated with it;
- Understand how DRL is used in popular applications such as financial investment, Computer Vision, robot learning, and self-driving vehicles;
- Receive a certificate signed by Dr. Thomas Gold, AAFIE Chairman and professor at the University of California, Berkeley.
The Prerequisites
You are expected to have a basic understanding of Machine Learning prior to taking the program.
The main topics are as follows:
PART 1: FUNDAMENTALS
1. Basic Concepts of Reinforcement Learning
o Model-Free and Model-Based Learning
o Probability Distribution
o Stationary and Non-stationary
o Policy
o Value Function
o Q-Function
2. Value and Policy Iteration
o Value Iteration
o Policy Iteration
3. Prediction Problems
o Monte Carlo Learning
o TD Learning
o TD(λ)
4. Policy Control Problems
o Q-Learning
o Bellman Equation
o Deep Q-Learning (DQN)
o Case Studies
5. Improvements to DQN
o Double DQN
o Dueling Network Architecture
o Soft Q-Learning
o Recent Papers on Improvements of DQN
6. Sarsa
o Sarsa Algorithm
o Q-Learning vs Sarsa
7. Policy Gradient Methods
o Introduction to Policy Gradient Methods
o Vanilla Policy Gradient
o REINFORCE Algorithm
o Actor-Critic
o A3C
o A2C
o Natural Policy Gradient and Trust Region Policy Optimization (TRPO)
o Proximal Policy Optimization (PPO)
o Deterministic Policy Gradient (DPG)
o Deep Deterministic Policy Gradient (DDPG)
8. Deep Imitation Learning
o Demonstration
o DAgger
o Few-shot imitation learning
o Policy aggregation
o Policy gradient with demonstrations
9. Soft Actor-Critic and Applications
o Maximum entropy RL
o Soft policy and soft actor-critic
o The optimization problem
o Soft Actor-Critic algorithm
o Applications
10. Meta Learning
o RL²: Fast Reinforcement Learning via Slow Reinforcement Learning
o A Simple Neural Attentive Meta-Learner
o Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
11. Hierarchical Reinforcement Learning
o Data efficient hierarchical RL
o FeUdal networks
12. Vision-based Robotic Manipulation
o Imagined goals
o QT-Opt
PART 2: APPLICATIONS
We will cover each of the following applications in detail:
o DRL for Computer Vision
o DRL for neural network structure learning
o DRL for autonomous driving
o DRL for product design
o DRL for robot learning
o DRL for financial investment