## ChainerRL - Deep Reinforcement Learning Library

ChainerRL, a Chainer-based deep reinforcement learning library, has been released. https://github.com/pfnet/chainerrl

(This post is translated from the original post written by Yasuhiro Fujita.)

### Algorithms

ChainerRL contains Chainer implementations of a set of deep reinforcement learning (DRL) algorithms, all accessible under a unified interface.

### Examples

The ChainerRL library comes with many examples, such as playing Atari 2600 video games using A3C and learning to control a humanoid robot using DDPG.

### How to use

Here is a brief introduction to ChainerRL.

First, the user must provide a definition of the problem to be solved with reinforcement learning, called the “environment”. ChainerRL follows the environment format of OpenAI’s Gym (https://github.com/openai/gym), a benchmark toolkit for reinforcement learning, so it can be used either with Gym environments or with your own environment implementation. At minimum, an environment should have two methods, reset() and step().
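To make the reset()/step() contract concrete, here is a minimal sketch of a hand-written environment in plain Python. The `CorridorEnv` class and its parameters are hypothetical and not part of ChainerRL or Gym. It assumes the classic Gym convention in which step() returns an (observation, reward, done, info) tuple.

```python
class CorridorEnv:
    """Toy environment: walk right along a corridor to reach the last cell.

    Hypothetical example; it only mimics the reset()/step() shape used by Gym.
    """

    def __init__(self, length=5, max_steps=20):
        self.length = length
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self):
        """Start a new episode and return the initial observation."""
        self.position = 0
        self.steps = 0
        return self.position

    def step(self, action):
        """Apply an action (0 = left, 1 = right).

        Returns (observation, reward, done, info), as in the classic Gym API.
        """
        move = 1 if action == 1 else -1
        # Clamp the position so the agent cannot leave the corridor.
        self.position = min(max(self.position + move, 0), self.length - 1)
        self.steps += 1
        reached_goal = self.position == self.length - 1
        done = reached_goal or self.steps >= self.max_steps
        reward = 1.0 if reached_goal else 0.0
        return self.position, reward, done, {}


# Interact with the environment using a fixed "always move right" policy.
env = CorridorEnv()
obs = env.reset()
done = False
total_reward = 0.0
while not done:
    obs, reward, done, info = env.step(1)
    total_reward += reward
```

Any object with these two methods can play the role of the environment in a loop like this, whether it comes from Gym or is written by hand.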

In DRL, neural networks represent either a policy, which determines an action given a state, or a value function (V-function or Q-function), which estimates the value of a state or of an action in a state. The parameters of these neural network models are updated through training. In ChainerRL, policies and value functions are represented as Chainer Link objects that implement the __call__() method.
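The idea of a model as a callable object can be sketched without Chainer. The `LinearQFunction` class below is a hypothetical stand-in, not a Chainer Link: it only illustrates that calling the object with a state returns one Q-value per action, from which an action can be chosen.

```python
class LinearQFunction:
    """Q(s, a) = w[a] . s + b[a] for a small discrete action set.

    Hypothetical stand-in: in ChainerRL this role is played by a Chainer Link
    whose __call__() runs the network's forward pass.
    """

    def __init__(self, weights, biases):
        self.weights = weights  # one weight vector per action
        self.biases = biases    # one bias per action

    def __call__(self, state):
        """Return the estimated value of every action in the given state."""
        return [
            sum(w_i * s_i for w_i, s_i in zip(w, state)) + b
            for w, b in zip(self.weights, self.biases)
        ]


def greedy_action(q_values):
    """Pick the index of the action with the largest Q-value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


q_func = LinearQFunction(weights=[[1.0, 0.0], [0.0, 2.0]], biases=[0.0, -0.5])
q_values = q_func([0.5, 1.0])    # calling the object evaluates the model
action = greedy_action(q_values)  # a greedy policy derived from the Q-values
```

In a real model the weights would be trainable parameters of a neural network rather than fixed lists.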

An “Agent” can then be defined from the model, a Chainer optimizer, and algorithm-specific parameters. The agent trains the model through interactions with the environment.
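The following sketch shows how the three ingredients (model, optimizer, algorithm-specific parameters) might be wired together. All class and method names here (`QTable`, `SimpleSGD`, `QLearningAgent`, `act`, `observe`) are hypothetical and do not reflect ChainerRL's actual agent API. Tabular Q-learning is used in place of a neural network to keep the example dependency-free.

```python
import random


class QTable:
    """Model: stores one value per (state, action) pair."""

    def __init__(self, n_states, n_actions):
        self.values = [[0.0] * n_actions for _ in range(n_states)]

    def __call__(self, state):
        return self.values[state]


class SimpleSGD:
    """Optimizer: moves a stored value a fraction `lr` towards a target."""

    def __init__(self, lr):
        self.lr = lr

    def update(self, model, state, action, target):
        error = target - model.values[state][action]
        model.values[state][action] += self.lr * error


class QLearningAgent:
    """Agent: combines a model, an optimizer, and algorithm parameters."""

    def __init__(self, model, optimizer, gamma, epsilon, seed=0):
        self.model = model
        self.optimizer = optimizer
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration rate
        self.rng = random.Random(seed)

    def act(self, state, explore=True):
        """Epsilon-greedy action selection."""
        values = self.model(state)
        if explore and self.rng.random() < self.epsilon:
            return self.rng.randrange(len(values))
        return max(range(len(values)), key=lambda a: values[a])

    def observe(self, state, action, reward, next_state, done):
        """One Q-learning update from a single transition."""
        target = reward
        if not done:
            target += self.gamma * max(self.model(next_state))
        self.optimizer.update(self.model, state, action, target)


model = QTable(n_states=2, n_actions=2)
agent = QLearningAgent(model, SimpleSGD(lr=0.5), gamma=0.9, epsilon=0.1)
agent.observe(state=0, action=1, reward=1.0, next_state=1, done=True)
greedy = agent.act(0, explore=False)
```

Keeping the model and the optimizer as separate objects means either one can be swapped without touching the agent's update rule.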

After creating the agent, training can be done either with the user’s own training loop or with a pre-defined training function.
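Both training styles can be sketched in plain Python. Everything below is hypothetical and self-contained: `CorridorEnv`, `TabularQAgent`, and the `train_agent` helper are illustrative names, not ChainerRL's classes or its built-in training function. The first loop is written by hand, and the helper wraps the same loop in a reusable function.

```python
import random


class CorridorEnv:
    """Toy task: start at cell 0, reach the last cell (0 = left, 1 = right)."""

    def __init__(self, length=4):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        move = 1 if action == 1 else -1
        self.position = min(max(self.position + move, 0), self.length - 1)
        done = self.position == self.length - 1
        return self.position, (1.0 if done else 0.0), done, {}


class TabularQAgent:
    """Epsilon-greedy tabular Q-learning with random tie-breaking."""

    def __init__(self, n_states, n_actions, lr=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.lr, self.gamma, self.epsilon = lr, gamma, epsilon
        self.rng = random.Random(seed)

    def act(self, state, explore=True):
        values = self.q[state]
        if explore and self.rng.random() < self.epsilon:
            return self.rng.randrange(len(values))
        best = [a for a, v in enumerate(values) if v == max(values)]
        return self.rng.choice(best) if explore else best[0]

    def observe(self, state, action, reward, next_state, done):
        target = reward if done else reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.lr * (target - self.q[state][action])


def train_agent(env, agent, n_episodes, max_episode_len):
    """Hypothetical pre-defined training function; returns per-episode rewards."""
    episode_rewards = []
    for _ in range(n_episodes):
        obs, total = env.reset(), 0.0
        for _ in range(max_episode_len):
            action = agent.act(obs)
            next_obs, reward, done, _ = env.step(action)
            agent.observe(obs, action, reward, next_obs, done)
            obs, total = next_obs, total + reward
            if done:
                break
        episode_rewards.append(total)
    return episode_rewards


env = CorridorEnv(length=4)
agent = TabularQAgent(n_states=4, n_actions=2)

# Option 1: a hand-written training loop.
for episode in range(100):
    obs = env.reset()
    for t in range(50):
        action = agent.act(obs)
        next_obs, reward, done, _ = env.step(action)
        agent.observe(obs, action, reward, next_obs, done)
        obs = next_obs
        if done:
            break

# Option 2: the same loop through the helper function.
training_rewards = train_agent(env, agent, n_episodes=100, max_episode_len=50)

# Evaluate the learned greedy policy (no exploration, no learning).
greedy_policy = [agent.act(s, explore=False) for s in range(3)]
obs, eval_steps, done = env.reset(), 0, False
while not done and eval_steps < 50:
    obs, reward, done, _ = env.step(agent.act(obs, explore=False))
    eval_steps += 1
```

A hand-written loop gives full control over logging and stopping conditions, while a helper function keeps experiment scripts short and consistent.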

We also provide a quickstart guide to start playing with ChainerRL.

ChainerRL is currently in beta, so feedback is highly appreciated if you are interested in reinforcement learning. We plan to keep improving ChainerRL by making it easier to use and by adding new algorithms.