============================ Read, Write and Play Sound ============================ .. sidebar:: Miles Davis .. figure:: /figures/kindofblue.jpeg As a sound file we will use the famous recording of "So What" from the record "Kind of Blue" played by the Miles Davis combo (really: buy that album or stream it). The audio file is available as an :download:`mp3 file` and as an :download:`ogg file `.. We assume that `PyAudio` and `soundfile` are installed. There are basically two ways to read a file: reading all data into memory at once and secondly reading the data in small chunks (buffers or blocks). We will show both of them. .. exec_python:: read_write_sf read_write_play :results: show import numpy as np import soundfile as sf FILENAME = "source/data/sowhat.ogg" print(sf.info(FILENAME)) data, samplerate = sf.read(FILENAME) print() print("data: ", data.shape) print("sample rate: ", samplerate) Please note that the ordering of dimensions in the data array is :code:`(no_frames, no_channels)`, other packages may use the other ordering. In some cases you need to transpose the data (for instance to play it). To play the audio data is dependent on the machine/program you are working with. For instance in an IPython notebook it is relatively easy (using the browser capabilities to deal with sound). .. figure:: /figures/play_audio_jupyter.png **Playing an audio fragment (numpy array) in a Jupyter notebook.** Another way of feeding the audio data to the sound system we can also use the portaudio library. Instead of reading an entire audio file into a numpy array as done above. We could read the file an chunks (buffers or blocks) and feed these one after the other to the audio system. To do so we switch to using the hardware audio system (Alsa on Linux systems) through the portaudio package. Portaudio is a cross platform library to access the audio system of the machine you are working on. In short, an audio signal is chopped up in blocks and those are obtained (using soundfile in case of a file and portaudio in case of for instance a microphone attached to your computer), then optionally the blocks are processed (DSP) and then fed to the audio system in blocks for audio rendering. .. code-block:: python import soundfile as sf import pyaudio import numpy as np pa = pyaudio.PyAudio() FILE = "../../data/sowhat.wav" chunk = 1024 f = sf.SoundFile(FILE) print(f.channels) print(f.samplerate) stream = pa.open(format=pyaudio.paInt32, channels=f.channels, rate=f.samplerate, frames_per_buffer=chunk, output=True) duration = 120 # play the file for the first 2 minutes nblocks = duration * f.samplerate // chunk for i, b in enumerate(f.blocks(blocksize=CHUNK, dtype=np.int32)): if i >= nblocks: break stream.write(b.tobytes()) stream.stop_stream() stream.close() pa.terminate() The logic of the program above is quite simple. * open the audio file * open the default output stream * read the blocks (chunks) sequentially from the file and feed them to the output stream. Note that the :code:`soundfile` package returns numpy arrays whereas the :code:`pyaudio` package works with raw databuffers. That is why we need the :code:`tobytes` method for conversion. Also note that in the loop where we 'feed' the output stream processing other than this audio processing is not simple to do. You can do some processing in the for loop but it is your responsibility to be in time feeding the output stream with new data. This **blocking mode** audio processing is therefore in most cases not the recommended way to do things. It is better to let the output stream 'ask' for data when needed: the **callback mode**. In this mode you pass a function when opening the stream. The callback function then gets called whenever the stream needs data. .. code-block:: python import soundfile as sf import pyaudio import numpy as np import time FILE = "../../data/sowhat.wav" chunk = 1024 pa = pyaudio.PyAudio() f = sf.SoundFile(FILE) def callback(in_data, frame_count, time_info, status): data = f.read(chunk, dtype='int32').tobytes() return (data, pyaudio.paContinue) stream = pa.open(format=pyaudio.paInt32, channels=f.channels, rate=f.samplerate, frames_per_buffer=chunk, output=True, stream_callback=callback) stream.start_stream() start_time = time.time() duration = 120 # play the first 2 minutes while stream.is_active() and time.time() - start_time < duration: time.sleep(0.1) # just waiting for the stream to finish stream.stop_stream() stream.close() pa.terminate() Observe that the while loop is only used to wait either for the entire file to be played or the set duration to be exceeded. In an application that time would be spent on calculations, the user interface etc. On my (RvdB) computer the above program ran fine. If you have a slower computer and you experience hick-ups in the sound rendering you might consider the enlarge the chunk size. Then you will need less callbacks from the audio system and thus less time will be spend in relatively slow Python code. We can also use :code:`soundfile` to write audio data to a file (in several formats, see the documentation).