Released Chainer/CuPy v6.0.0

We have released Chainer and CuPy v6.0.0 today! This is a major release that introduces several new features. Full updates can be found in the release notes: Chainer, CuPy.


The biggest update is the introduction of ChainerX. It is a fast and portable ndarray engine with autograd support written in C++ with a very thin Python wrapper.

We have released the beta version of ChainerX in v6.0.0b1 as we wrote in the previous blog post. Since then, we have been working on improving it in various aspects. In particular, ChainerX in v6.0.0 expands the coverage of various features since v6.0.0b1.

  • Wider op coverage. We have more Chainer functions that directly call ChainerX’s low-overhead implementation. The effort is still on going at the tracking issue with the spreadsheet of op-wise implementation status. We continue to expand the op coverage towards the next v7 release. Contributions are always welcomed!
  • Wider Function coverage. Most users will start using ChainerX through Chainer’s existing interface (just by replacing NumPy/CuPy arrays with ChainerX arrays). When ChainerX does not have an implementation for an operation, Chainer automatically falls back to NumPy/CuPy-based implementation. It basically works without any fix for most functions, but sometimes not. We are fixing such bugs to enlarge the coverage of functions for ChainerX usage. The effort is accompanied by the introduction of a test fixture class for function tests (you can find the tracking issue). Currently, 40% of the functions under chainer.functions are already tested with ChainerX. They cover basic array operations resembling routines in NumPy and operations commonly used in convolutional neural networks such as convolution, deconvolution and pooling. Operations for recurrent neural networks will be addressed in the upcoming releases. We hope the coverage will reach 100% in v7. Contributions are always welcomed here, too!
  • Wider example coverage. Most examples now support ChainerX. By specifying ChainerX’s device names (e.g. native for CPU and cuda:0, cuda:1, … for GPUs), examples run with ChainerX arrays. It also means that the coverage of ChainerX support in Chainer’s features in general is expanding.

You can find the previous blog post for its background and overview, and ChainerX Documentation for the installation guide, tutorial, and reference.

Other updates

This release also includes many features other than ChainerX. We list up notable updates as follows.

  • More mixed precision support. Chainer v6 introduces mixed precision mode and dynamic loss scaling for better support of mixed precision training. Mixed precision mode is enabled by setting CHAINER_DTYPE=mixed16 or chainer.global_config.dtype = chainer.mixed16. In this mode, Chainer automatically chooses either float16 or float32 depending on what is appropriate in terms of a performance-to-precision tradeoff. Dynamic loss scaling, originated from Apex, automatically adjusts the scaling coefficient of backprop to avoid underflow.
  • Device API. We introduce a new device API for better interoperability between backends (including ChainerX). It unifies the way in which devices are specified and data is transferred between devices. In particular, a unified device specifier is introduced. It is based on ChainerX’s device specifier of the format 'backend:id', e.g. 'native:0' and 'cuda:N' (where N is the CUDA device id). For native (CPU), the id part can be omitted (like 'native'). For conventional devices backed by NumPy-like modules, the name is @numpy, @cupy:N, and @intel64. This notation can be used, e.g., in the to_device function. Note that the existing APIs related to devices (e.g. to_cpu and to_gpu) are still available.
  • __array_function__ in CuPy. NumPy’s __array_function__ is an experimental feature for letting NumPy dispatch implementations of almost all functions to third-party duck arrays. CuPy now supports this interface. To use this feature, you need to get NumPy 1.16 and set NUMPY_EXPERIMENTAL_ARRAY_FUNCTION=1 (it will hopefully be the default mode in NumPy 1.17). Then, many NumPy functions that CuPy supports will accept CuPy arrays and automatically call CuPy’s implementation.

We recommend updating to the latest version of Chainer and CuPy. You can find the upgrade guide here. Updating Chainer should be done as usual with the command pip install -U chainer. Note that ChainerX is not built by default; see the installation guide of ChainerX for details. CuPy can be updated with pip as well, but be careful to use the appropriate package name if you are using a wheel package (cupy-cuda NN).

Any feedback to the dev team would be welcomed and appreciated. You can ask questions or leave comments at gitter, Slack, Google Groups, and StackOverflow.

About Chainer

Chainer is a Python-based, standalone open source framework for deep learning models. Chainer provides a flexible, intuitive, and high performance means of implementing a full range of deep learning models, including state-of-the-art models such as recurrent neural networks and variational autoencoders.

Recent Posts