Shaked Zychlinski 🎗️
1 min read · Jun 5, 2019


Hi Arka, as I mentioned in the text, you copy the first network's weights into the target network every once in a while. In regular DQN learning, the very first predictions come from purely random weights, yet you still use them for training. It's the same thing here, just over longer periods: the target network's predictions stay fixed between copies instead of changing after every update.
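To make the "every once in a while" part concrete, here is a minimal sketch of the copy schedule only. It is not the article's code: the names `run_sync_schedule` and `sync_every` are hypothetical, the "networks" are plain lists of floats, and the random perturbation is just a stand-in for a real gradient step.

```python
import copy
import random


def init_weights(n, seed):
    """Return n random weights, standing in for a freshly initialised network."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


def run_sync_schedule(total_steps, sync_every, n_weights=4, seed=0):
    """Train a toy 'online' network and copy it into the target periodically.

    Assumption: the update below is a random nudge, not a real DQN loss step.
    """
    online = init_weights(n_weights, seed)
    # The target network also starts from random weights, so the earliest
    # training targets are effectively random, as in plain DQN.
    target = init_weights(n_weights, seed + 1)
    rng = random.Random(seed + 2)
    sync_steps = []

    for step in range(1, total_steps + 1):
        # Stand-in for one training update of the online network.
        online = [w + rng.uniform(-0.01, 0.01) for w in online]

        # Every `sync_every` steps, overwrite the target with the online weights.
        if step % sync_every == 0:
            target = copy.deepcopy(online)
            sync_steps.append(step)

    return online, target, sync_steps


if __name__ == "__main__":
    online, target, sync_steps = run_sync_schedule(total_steps=12, sync_every=5)
    print("copied at steps:", sync_steps)
    print("target lags online:", target != online)
```

Between copies the target keeps whatever weights it last received, so it lags the online network by up to `sync_every` steps.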
