Chainer supports CUDA computation. It only requires a few lines of code to leverage a GPU. It also runs on multiple GPUs with little effort.
Chainer supports various network architectures including feed-forward nets, convnets, recurrent nets and recursive nets. It also supports per-batch architectures.
Forward computation can include any control flow statements of Python without lacking the ability of backpropagation. It makes code intuitive and easy to debug.
Install Chainer:
pip install chainer
Run the MNIST example:
wget https://github.com/chainer/chainer/archive/v7.8.1.tar.gz
tar xzf v7.8.1.tar.gz
python chainer-7.8.1/examples/mnist/train_mnist.py
Learn more from the official documentation.