{ "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "source": [ "> This is one of the 100 recipes of the [IPython Cookbook](http://ipython-books.github.io/), the definitive guide to high-performance scientific computing and data science in Python.\n" ], "cell_type": "markdown", "metadata": [] }, { "source": [ "# 4.9. Processing huge NumPy arrays with memory mapping" ], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "language": "python", "outputs": [], "collapsed": false, "input": [ "import numpy as np" ], "metadata": {} }, { "source": [ "## Writing a memory-mapped array" ], "cell_type": "markdown", "metadata": {} }, { "source": [ "We create a memory-mapped array with a specific shape. The `w+` mode creates the file (or overwrites an existing one) and opens it for reading and writing. With 1,000,000 x 100 values of type `float32` (4 bytes each), the file takes about 400 MB on disk." ], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "language": "python", "outputs": [], "collapsed": false, "input": [ "nrows, ncols = 1000000, 100" ], "metadata": {} }, { "cell_type": "code", "language": "python", "outputs": [], "collapsed": false, "input": [ "f = np.memmap('memmapped.dat', dtype=np.float32,\n", "              mode='w+', shape=(nrows, ncols))" ], "metadata": {} }, { "source": [ "Let's fill the array with random values, one column at a time, so that only a single column of random values has to be generated in system memory at once." ], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "language": "python", "outputs": [], "collapsed": false, "input": [ "for i in range(ncols):\n", "    f[:,i] = np.random.rand(nrows)" ], "metadata": {} }, { "source": [ "We copy the last column of the array into memory, so that we can compare it later with the values read back from disk." ], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "language": "python", "outputs": [], "collapsed": false, "input": [ "# np.array makes an in-memory copy rather than a view on the mapped file.\n", "x = np.array(f[:,-1])" ], "metadata": {} }, { "source": [ "Now, we delete the object, which flushes the pending changes to disk." 
], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "language": "python", "outputs": [], "collapsed": false, "input": [ "del f" ], "metadata": {} }, { "source": [ "## Reading a memory-mapped file" ], "cell_type": "markdown", "metadata": {} }, { "source": [ "Reading a memory-mapped array from disk uses the same `memmap` function, but with a different file mode. Here we rely on the default mode, `r+`, which opens the existing file for reading and writing. The data type and the shape must be specified again, because this information is not stored in the file." ], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "language": "python", "outputs": [], "collapsed": false, "input": [ "f = np.memmap('memmapped.dat', dtype=np.float32, shape=(nrows, ncols))" ], "metadata": {} }, { "cell_type": "code", "language": "python", "outputs": [], "collapsed": false, "input": [ "np.array_equal(f[:,-1], x)" ], "metadata": {} }, { "cell_type": "code", "language": "python", "outputs": [], "collapsed": false, "input": [ "del f" ], "metadata": {} }, { "source": [ "> You'll find all the explanations, figures, references, and much more in the book (to be released later this summer).\n\n> [IPython Cookbook](http://ipython-books.github.io/), by [Cyrille Rossant](http://cyrille.rossant.net), Packt Publishing, 2014 (500 pages)." ], "cell_type": "markdown", "metadata": {} } ], "metadata": {} } ], "metadata": { "name": "", "signature": "sha256:6c2f964d6e248336692081c284f18e3e7aa2a75206ce92e8de6e42f49d88fe6d" } }