The tools we use influence the research we do, and while there are many good RL tools out there, there are still areas where tools need to be built.
In this post, we’ll talk about the REINFORCE policy gradient algorithm and the Advantage Actor Critic (A2C) algorithm. I’ll be assuming familiarity with at least the basics of RL. If you need a refresher, I have a blog post on this here. For an interactive version of this blog post, see this colab notebook.
Last year I was fortunate enough to attend NeurIPS 2019. It was an amazing experience: I met lots of smart people and learned a ton. This post discusses my time at NeurIPS 2019.
In this post, we’ll get into the weeds with some of the fundamentals of reinforcement learning. Hopefully, this will serve as a thorough overview of the basics for someone who is curious but doesn’t want to invest a significant amount of time in learning all of the underlying math and theory.
I recently migrated my blog from the Svbtle platform to GitHub Pages. This post briefly talks about that experience and why I moved away from Svbtle.