Update README.md
README.md
CHANGED
@@ -12,6 +12,8 @@ This repo contains the model and the notebook [to this Keras example on Deep Det
Full credits to: [Hemant Singh](https://github.com/amifunny)
## Background Information
Deep Deterministic Policy Gradient (DDPG) is a model-free off-policy algorithm for learning continuous actions.
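
To make "deterministic" and "continuous" concrete, here is a rough sketch of what a DDPG-style actor network can look like in Keras; the layer widths and the `upper_bound` action scaling are illustrative assumptions, not the exact architecture from this repo's notebook.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_actor(num_states, num_actions, upper_bound):
    # Deterministic policy: the network maps a state to one concrete
    # continuous action vector, not to a distribution over actions.
    inputs = layers.Input(shape=(num_states,))
    x = layers.Dense(256, activation="relu")(inputs)
    x = layers.Dense(256, activation="relu")(x)
    # tanh bounds the raw output to [-1, 1]; scaling by upper_bound maps it
    # onto the environment's continuous action range.
    raw_action = layers.Dense(num_actions, activation="tanh")(x)
    outputs = raw_action * upper_bound
    return tf.keras.Model(inputs, outputs)
```
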
@@ -39,4 +41,3 @@ Second, it uses Experience Replay.
We store a list of tuples (state, action, reward, next_state), and instead of learning only from recent experience, we learn by sampling from all of the experience accumulated so far.
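
A minimal sketch of such a replay buffer is shown below; the capacity and batch size are placeholder values, and the actual notebook may store the fields as NumPy arrays rather than a deque.

```python
import random
from collections import deque

class ReplayBuffer:
    """Holds (state, action, reward, next_state) tuples and samples them uniformly."""

    def __init__(self, capacity=100_000):
        # Once full, the oldest experiences are discarded first.
        self.buffer = deque(maxlen=capacity)

    def record(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size=64):
        # Uniform sampling over the whole history breaks the correlation
        # between consecutive transitions and reuses past experience.
        return random.sample(self.buffer, batch_size)
```

After each environment step you would call `record(...)`, and during training draw a batch with `sample(...)` to update the actor and critic.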