{ "metadata": { "name": "", "signature": "sha256:73854a92945ac61aef6eb758cb6e03a7bc472944a85d6959766fcf3a01710087" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "> This is one of the 100 recipes of the [IPython Cookbook](http://ipython-books.github.io/), the definitive guide to high-performance scientific computing and data science in Python." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 5.10. Interacting with asynchronous parallel tasks in IPython" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You need to start IPython engines (see previous recipe). The simplest option is to launch them from the *Clusters* tab in the notebook dashboard. In this recipe, we use four engines." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "1. Let's import a few modules." ] }, { "cell_type": "code", "collapsed": false, "input": [ "import time\n", "import sys\n", "from IPython import parallel\n", "from IPython.display import clear_output, display\n", "from IPython.html import widgets" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "2. We create a Client." ] }, { "cell_type": "code", "collapsed": false, "input": [ "rc = parallel.Client()" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "3. Now, we create a load balanced view on the IPython engines." ] }, { "cell_type": "code", "collapsed": false, "input": [ "view = rc.load_balanced_view()" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "4. We define a simple function for our parallel tasks." ] }, { "cell_type": "code", "collapsed": false, "input": [ "def f(x):\n", " import time\n", " time.sleep(.1)\n", " return x*x" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "5. We will run this function on 100 integer numbers in parallel." ] }, { "cell_type": "code", "collapsed": false, "input": [ "numbers = list(range(100))" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "6. We execute `f` on our list `numbers` in parallel across all of our engines, using `map_async()`. This function returns immediately an `AsyncResult` object. This object allows us to retrieve interactively information about the tasks." ] }, { "cell_type": "code", "collapsed": false, "input": [ "ar = view.map_async(f, numbers)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "7. This object has a `metadata` attribute, a list of dictionaries for all engines. We can get the date of submission and completion, the status, the standard output and error, and other information." ] }, { "cell_type": "code", "collapsed": false, "input": [ "ar.metadata[0]" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "8. Iterating over the `AsyncResult` instance works normally; the iteration progresses in real-time while the tasks are being completed." ] }, { "cell_type": "code", "collapsed": false, "input": [ "for _ in ar:\n", " print(_, end=', ')" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "9. Now, we create a simple progress bar for our asynchronous tasks. The idea is to create a loop polling for the tasks' status at every second. An `IntProgressWidget` widget is updated in real-time and shows the progress of the tasks." ] }, { "cell_type": "code", "collapsed": false, "input": [ "def progress_bar(ar):\n", " # We create a progress bar.\n", " w = widgets.IntProgressWidget()\n", " # The maximum value is the number of tasks.\n", " w.max = len(ar.msg_ids)\n", " # We display the widget in the output area.\n", " display(w)\n", " # Repeat every second:\n", " while not ar.ready():\n", " # Update the widget's value with the\n", " # number of tasks that have finished\n", " # so far.\n", " w.value = ar.progress\n", " time.sleep(1)\n", " w.value = w.max" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "ar = view.map_async(f, numbers)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "progress_bar(ar)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "10. Finally, it is easy to debug a parallel task on an engine. We can launch a Qt client on the remote kernel by calling `%qtconsole` within a `%%px` cell magic." ] }, { "cell_type": "code", "collapsed": false, "input": [ "%%px -t 0\n", "%qtconsole" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The Qt console allows us to inspect the remote namespace for debugging or analysis purposes." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> You'll find all the explanations, figures, references, and much more in the book (to be released later this summer).\n", "\n", "> [IPython Cookbook](http://ipython-books.github.io/), by [Cyrille Rossant](http://cyrille.rossant.net), Packt Publishing, 2014 (500 pages)." ] } ], "metadata": {} } ] }