Tuesday, October 4, 2016

Running TensorFlow natively on Windows 10

TensorFlow is a library for evaluating numerical expressions of high-rank arrays (a.k.a. “multi-dimensional arrays” or “tensors”, which sometimes may actually represent tensors in the mathematical sense), a capability that is crucial for many scientific computing tasks. TensorFlow, however, specifically targets machine learning tasks, in particular ‘deep learning,’ whose practical viability critically depends on highly efficient multi-dimensional algebra routines and highly efficient high-dimensional gradient calculation. While TensorFlow’s core is implemented in C++, it comes with a Python API that enables interactive experimentation with the library.

Unfortunately TensorFlow, or rather its build system, hasn’t yet been ported to Windows (but the guys are working on it). Until then, one can get by using Docker containers or running full-blown Linux VMs. With the introduction of WSL (Windows Subsystem for Linux) as part of the Windows 10 Anniversary Update, however, it has become possible to run the Linux version of TensorFlow on Windows in its Ubuntu user space (CPU only, sadly). WSL is still in beta, so some quirks are to be expected. Follow the instructions below to set up a working IPython-TensorFlow environment on Windows 10 Pro:

Step 1: Activate WSL and “Bash on Ubuntu on Windows”

First you need to activate the Linux subsystem and install the Ubuntu userland. To start, open Windows Settings (Windows + I) and click on “Windows Update, recovery, backup” (Fig. 1).

Figure 1: Click on “Update” ...

Click on “Use developer features” and enable “developer mode” (Fig. 2). This might take a while and you may have to restart your machine afterwards.

Figure 2: ... to enable developer mode.

Now we need to activate the Linux subsystem. Open the (classical) control panel and navigate to "Programs" → "Turn Windows features on or off" (Fig. 3 & 4).

Figure 3

Figure 4

Select "Windows Subsystem for Linux (Beta)" in the dialog and click OK. You'll probably have to restart your machine again (Fig. 5).

Figure 5

Now you should be able to run "bash" (either from the start menu or from a cmd prompt), which guides you through the rest of the installation process (cf. WSL Installation Guide). After this procedure, there should be a new start menu entry "Bash on Ubuntu on Windows" (Fig. 6).

Figure 6

Step 2 (optional): Install mintty for WSL

When you use the aforementioned shortcut, a bash shell starts in a conventional cmd.exe console host. While that is perfectly usable, I personally much prefer mintty (Cygwin's default terminal emulator). Luckily, there's already a version for WSL available: Just download the installer, run it, et voilà, ready to go. The installer also adds a handy "WSL in Mintty Here" entry to the Explorer context menu, which opens a bash session in the current path.

Figure 7: mintty hosting a bash shell

Step 3: Install Anaconda

Anaconda from Continuum Analytics is the computational science Python distribution. While we could also simply use the default Python distribution from the Ubuntu repositories, Anaconda comes with Intel's MKL and thus a substantial performance boost (not to mention its potent conda package manager). Start a bash shell, then download and run the Anaconda installer with the following commands:

    $ cd ~
    $ wget https://repo.continuum.io/archive/Anaconda3-4.2.0-Linux-x86_64.sh
    $ chmod +x Anaconda3-4.2.0-Linux-x86_64.sh
    $ ./Anaconda3-4.2.0-Linux-x86_64.sh

Note that this installs Anaconda into your WSL home directory. You could install it "system-wide" using sudo, but as WSL environments are per-Windows-user anyway, there isn't much point in doing so. At some point the installer will ask you whether it should add Anaconda to your Linux PATH, effectively making it the default Python. Confirm by entering "yes" (Fig. 8).

Figure 8: YES!!!

You may have to start a new bash session in order to make the PATH change effective. Alternatively you can "source" (reload/re-execute) .bashrc via

    $ . ~/.bashrc
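
If you want to double-check that the PATH change actually took effect, a small script along these lines can help. It is only a sketch: the helper name `is_under` is made up for this post, and it assumes you accepted the installer's default target directory, ~/anaconda3.

```python
import shutil
from pathlib import Path


def is_under(executable, root):
    """Return True if `executable` lives somewhere below directory `root`."""
    exe = Path(executable).resolve()
    base = Path(root).expanduser().resolve()
    return base in exe.parents


if __name__ == "__main__":
    # Assumption: Anaconda was installed into the default ~/anaconda3.
    anaconda_root = Path.home() / "anaconda3"
    python_path = shutil.which("python")
    if python_path is None:
        print("no 'python' found on PATH")
    elif is_under(python_path, anaconda_root):
        print("PATH resolves python to Anaconda:", python_path)
    else:
        print("PATH still resolves python to:", python_path)
```

If the last branch is hit, the new PATH entry has not been picked up yet and a fresh bash session is probably the easiest fix.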

Now you should be able to run

    $ ipython

And see a message like this:

        Python 3.5.2 |Anaconda 4.2.0 (64-bit)| (default, Jul  2 2016, 17:53:06)
        Type "copyright", "credits" or "license" for more information.

        IPython 5.1.0 -- An enhanced Interactive Python.
        ?         -> Introduction and overview of IPython's features.
        %quickref -> Quick reference.
        help      -> Python's own help system.
        object?   -> Details about 'object', use 'object??' for extra details.
   
        In [1]:

Yet, when you enter the command (IPython magic)

        In [1]: %pylab

Python will throw some PyQt4 error at us:

      --->   31 from .qt_compat import QtCore, QtGui, QtWidgets, _getSaveFileName, __version__
             32 from matplotlib.backends.qt_editor.formsubplottool import UiSubplotTool
             33

        /home/niemeyer/anaconda3/lib/python3.5/site-packages/matplotlib/backends/qt_compat.py in ()
            135     # have been changed in the above if block
            136     if QT_API in [QT_API_PYQT, QT_API_PYQTv2]:  # PyQt4 API
        --> 137         from PyQt4 import QtCore, QtGui
            138
            139         try:

        ImportError: No module named 'PyQt4'

Step 4: Fix Matplotlib PyQt4 Error

The above error is already known to the Anaconda developers. Sadly, the proposed solutions, like explicitly selecting the Qt5Agg backend or downgrading to Qt4, didn't work for me. What did work was switching to the TkAgg backend. For that you need to create a new text file

    $ vi ~/.config/matplotlib/matplotlibrc

(use nano, if you can't handle vi...) and add the following line

    backend : TkAgg
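
If you would rather script this edit than open an editor, something like the following should do. Treat it as a sketch: `set_backend` is a hypothetical helper written for this post (it is not part of matplotlib), and it simply rewrites the `backend` line of whatever rc file you point it at.

```python
from pathlib import Path


def set_backend(rc_path, backend="TkAgg"):
    """Write or replace the `backend` entry in a matplotlibrc file."""
    rc_path = Path(rc_path).expanduser()
    # Create ~/.config/matplotlib (or whatever the parent is) if missing.
    rc_path.parent.mkdir(parents=True, exist_ok=True)
    lines = rc_path.read_text().splitlines() if rc_path.exists() else []
    # Drop any existing backend setting, keep every other line untouched.
    kept = [ln for ln in lines if ln.split(":")[0].strip() != "backend"]
    kept.append("backend : {}".format(backend))
    rc_path.write_text("\n".join(kept) + "\n")
    return rc_path
```

Calling `set_backend("~/.config/matplotlib/matplotlibrc")` from a Python prompt would then produce the same file as the manual edit above.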

When you now start IPython again, executing %pylab should work fine ...

        In [1]: %pylab
        Using matplotlib backend: TkAgg
        Populating the interactive namespace from numpy and matplotlib

... only to run into the next error when trying to create a little test plot:

        In [2]: x = linspace(0, 10, 1000)
        In [3]: plot(x, x**2)
        OMP: Error #100: Fatal system error detected.
        OMP: System error #22: Invalid argument

Step 5: Work Around OpenMP Error

The previous error is also already known, though not yet fixed. To work around this bug(?), edit your .bashrc

    $ vi ~/.bashrc

and add the line

    export KMP_AFFINITY=disabled

to the end of the file. Run

    $ . ~/.bashrc

again and re-try %pylab and plotting. This time, IPython will reward us with a new error message:

        -> 1868         self.tk = _tkinter.create(screenName, baseName, className, interactive, wantobjects, useTk, sync, use)
           1869         if useTk:
           1870             self._loadtk()

        TclError: no display name and no $DISPLAY environment variable

Step 6: Install X11 Server and set $DISPLAY

Matplotlib needs an X server to draw its plot windows. Nowadays I recommend VcXsrv, which is easy to install and just works out of the box. You could use Cygwin/X or Xming, but at least the former requires some fiddling with its settings for it to work with WSL.

After having installed and started your X server of choice, edit your .bashrc again to add the following line

    export DISPLAY=:0.0
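
Both .bashrc additions (KMP_AFFINITY from Step 5 and DISPLAY here) can also be made from a script, which is handy if you set up more than one machine. The following is a minimal sketch; `ensure_line` is a hypothetical helper, not an existing tool, and it only appends a line when that exact line is not in the file yet.

```python
from pathlib import Path


def ensure_line(path, line):
    """Append `line` to the file at `path` unless it is already present.

    Returns True if the file was modified, False otherwise.
    """
    path = Path(path).expanduser()
    text = path.read_text() if path.exists() else ""
    if line in text.splitlines():
        return False
    # Make sure the new line starts on a line of its own.
    if text and not text.endswith("\n"):
        text += "\n"
    path.write_text(text + line + "\n")
    return True
```

For example, `ensure_line("~/.bashrc", "export DISPLAY=:0.0")` would add the line once and leave the file alone on every later call.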

Again, reload .bashrc:

        $ . ~/.bashrc

Now, IPython/matplotlib should finally work.
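
Should the plot window still refuse to show up, it helps to find out whether the X server can be reached at all. Here is a rough check, under the assumption that your X server accepts TCP connections on localhost (X11 uses port 6000 plus the display number). The function names are my own and not part of any library.

```python
import os
import socket


def display_number(display):
    """Extract the display number from a value such as ':0.0' or 'localhost:1'."""
    # Everything after the last ':' has the form '<display>[.<screen>]'.
    after_colon = display.rpartition(":")[2]
    return int(after_colon.split(".")[0])


def x_server_reachable(display, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on the display's port."""
    port = 6000 + display_number(display)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    display = os.environ.get("DISPLAY")
    if not display:
        print("DISPLAY is not set")
    else:
        try:
            ok = x_server_reachable(display)
        except ValueError:
            print("cannot parse DISPLAY={!r}".format(display))
        else:
            state = "reachable" if ok else "not reachable"
            print("X server for DISPLAY={} is {}".format(display, state))
```

A "not reachable" result usually means the X server is not running. Note that an open port only shows that something is listening there, not that it is really an X server.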

Step 7: Install TensorFlow

The TensorFlow installation itself is pretty straightforward: Execute

    $ conda install -c conda-forge tensorflow

Alongside raw TensorFlow, you may also want to install a deep learning library like Keras, which is easily installed via pip

    $ pip install keras

When you now start IPython again and enter

    import keras

you should get a little "Using TensorFlow backend." message, indicating your successful installation of TensorFlow on Windows!
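
To see at a glance which of the packages from this post are importable, and in which version, you can run a quick check like the one below. It is just a sketch: `installed_version` is a made-up helper, and it relies on the `__version__` attribute that these packages conventionally expose.

```python
import importlib


def installed_version(module_name):
    """Import `module_name` and return its version string, or None if the import fails."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    # Not every module defines __version__; fall back to a placeholder.
    return getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    for name in ("numpy", "matplotlib", "tensorflow", "keras"):
        version = installed_version(name)
        print("{:<12} {}".format(name, version or "NOT INSTALLED"))
```

Any line reading NOT INSTALLED points at the step that needs to be repeated.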

Figure 9: When installed correctly, Keras defaults to TensorFlow as its backend
Figure 10: Training a (small) CNN using TensorFlow on WSL