Overview
In addition to numbers and tables, MATLAB can work directly with common multimedia formats such as images and audio. For a beginner this is very useful, because you can quickly see and hear the results of your computations. This chapter focuses on tasks that are specific to image and audio files. You will learn how to read these files, view or play them, perform very simple manipulations, and write the results back to disk. More advanced image and signal processing topics will be left for later study.
Image Files: Reading into MATLAB
To work with an image in MATLAB, you first need to load it from disk into memory as an array. Most of the time you will use the function imread for this purpose. The basic usage is very simple.
For example, if you have a file peppers.png in your current folder, you can read it as follows.
I = imread('peppers.png');
The variable I will be an array that contains the image data. The exact size and type of this array depend on the kind of image.
For a typical color image, MATLAB stores the data as an array of size m x n x 3. The first two dimensions represent the image height and width in pixels. The third dimension has size 3 for the red, green, and blue color channels. You can confirm this layout by checking the size and class of the array:
size(I)    % returns [m n 3] for a truecolor image
class(I)   % typically 'uint8'
For many common image formats, imread returns an array of type uint8, which represents integer values from 0 to 255. These values represent the intensity levels of each color channel. For grayscale images, imread typically returns a 2D array of size m x n with one value per pixel.
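The shapes described above can be checked without any particular image file. The following is a minimal sketch that builds two tiny synthetic images, writes them to temporary PNG files, and reads them back. The pixel data and the temporary filenames are made up for illustration.

```matlab
% Build a small synthetic truecolor image (hypothetical data, 4 rows x 6 columns).
rgb = zeros(4, 6, 3, 'uint8');
rgb(:, :, 1) = 255;                 % pure red everywhere

% Write it to a temporary PNG file, then read it back with imread.
colorFile = [tempname, '.png'];     % hypothetical temporary filename
imwrite(rgb, colorFile);
I = imread(colorFile);

colorSize  = size(I);               % height, width, channels
colorClass = class(I);              % class of the returned array

% Repeat with a grayscale image, which is a plain 2D array.
gray = uint8(magic(4) * 15);        % values between 15 and 240
grayFile = [tempname, '.png'];      % hypothetical temporary filename
imwrite(gray, grayFile);
G = imread(grayFile);

graySize = size(G);                 % height and width only

% Remove the temporary files.
delete(colorFile);
delete(grayFile);
```

The color image comes back as a three-dimensional uint8 array, while the grayscale image comes back as a two-dimensional one.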
If you read an indexed image, such as some GIF files, imread can return both an index array and a colormap.
[X, map] = imread('example.gif');
Here X contains integer indices into the color map map. For simple work, it is often easier to convert indexed images to truecolor using ind2rgb, but that topic belongs to image processing, which is not the focus here.
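To make the relationship between the index array and the colormap concrete, here is a small sketch with a hand-made indexed image. The values of X and map are hypothetical. Note that an index array of class double is one-based, whereas the integer arrays that imread often returns are zero-based; ind2rgb accounts for both.

```matlab
% A tiny indexed image: each entry of X points at a row of map.
X = [1 2; 3 1];                     % hypothetical index array (class double, one-based)
map = [1 0 0;                       % row 1: red
       0 1 0;                       % row 2: green
       0 0 1];                      % row 3: blue

% Convert to a truecolor array of doubles with values in [0, 1].
RGB = ind2rgb(X, map);

% Look up individual pixels as 1 x 3 color triplets.
topLeft    = squeeze(RGB(1, 1, :))';   % index 1, so the red row of map
bottomLeft = squeeze(RGB(2, 1, :))';   % index 3, so the blue row of map
```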
Displaying Images
Once you have an image array in MATLAB, you usually want to display it to check what it looks like. The simplest function for this is imshow.
For a standard truecolor image I read with imread, you can display it with:
imshow(I);
This opens a figure window and shows the image. If you have multiple images, you can display them in separate figures or use subplot layouts, which you will study in plotting chapters. In its most basic form, imshow interprets arrays with three channels as color images and picks a default display range from the data type, for example 0 to 255 for uint8 and 0 to 1 for double.
If you accidentally use plot(I) instead of imshow(I), MATLAB will treat the data as numeric values and not as pixels, which usually does not give a useful view of the image. imshow is the function that understands image data.
Sometimes you will have an image stored as a numeric array of type double or single instead of uint8. In that case, imshow expects values normalized between 0 and 1, where 0 means black and 1 means maximum intensity. If your double image values fall outside this range, you can scale them or tell imshow how to interpret the range, but that is part of more detailed image handling.
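One way to bring out-of-range double data into the 0 to 1 interval is a linear rescale from the minimum to the maximum value. The sketch below uses a small hypothetical matrix D and assumes that it is not constant, so that the maximum is larger than the minimum.

```matlab
% Hypothetical double-valued image whose values fall outside [0, 1].
D = [-2  0  2;
      4  6  8];

% Linearly rescale so the minimum maps to 0 (black) and the maximum to 1 (white).
% This assumes hi > lo, that is, the image is not constant.
lo = min(D(:));
hi = max(D(:));
Dnorm = (D - lo) / (hi - lo);

% imshow(Dnorm) would now show the full range from black to white.
% imshow(D, []) asks imshow to apply the same min-to-max scaling for display only.
```

Rescaling the array changes the data itself, while the imshow(D, []) form only changes how it is displayed.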
Basic Manipulations of Image Data
Because images are just arrays, you can apply basic array operations to them. Even with little knowledge of image processing, you can perform simple actions specific to image files, such as converting to grayscale or flipping.
One very common task is converting a color image into grayscale. You can use rgb2gray for this. Recent MATLAB releases include it in the base product, while older releases require the Image Processing Toolbox.
I = imread('peppers.png');
Igray = rgb2gray(I);
imshow(Igray);
Here Igray is a 2D array that represents a grayscale version of the original color image. Without the toolbox, you can still work with the color channels directly, for example by selecting just one channel using indexing. Since images are arrays of pixels, you can access particular rows, columns, or channels with standard MATLAB indexing.
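Working with the channels directly can be sketched as follows. The example builds a hypothetical 2 x 2 test image, extracts the red channel, and forms a weighted sum of the three channels. The weights 0.299, 0.587, and 0.114 are commonly used luma coefficients and are an assumption here, not something this chapter prescribes.

```matlab
% Hypothetical 2 x 2 truecolor image of class uint8.
I = zeros(2, 2, 3, 'uint8');
I(1, 1, :) = [255 0 0];             % red pixel
I(1, 2, :) = [0 255 0];             % green pixel
I(2, 1, :) = [0 0 255];             % blue pixel
I(2, 2, :) = [255 255 255];         % white pixel

% Option 1: use a single channel as a grayscale image.
redChannel = I(:, :, 1);            % 2 x 2 uint8 array

% Option 2: weighted sum of the channels.
Id = double(I);                     % convert first to avoid uint8 saturation
grayD = 0.299 * Id(:, :, 1) + 0.587 * Id(:, :, 2) + 0.114 * Id(:, :, 3);
Igray = uint8(round(grayD));        % back to uint8 for display or saving
```

Both results are two-dimensional arrays that imshow displays as grayscale images.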
You can also flip images using flipud and fliplr, which flip arrays upside down or left to right, and rotate them in 90-degree steps using rot90. For instance, to mirror an image horizontally:
I = imread('peppers.png');
I_flipped = fliplr(I);
imshow(I_flipped);
All these manipulations operate on the underlying numeric array. The fact that the data represents an image only affects how you interpret the result.
Writing Image Files
After you modify an image in MATLAB, you often want to save it as an image file again. For this you use the imwrite function. The basic pattern is:
imwrite(I, 'output.png');
Here I is your image array, and 'output.png' is the name of the file to create. MATLAB uses the file extension, such as .png, .jpg, or .tif, to choose the file format.
You can also write different types of images, such as grayscale or indexed images, by passing additional inputs such as a colormap for indexed images. For most simple cases, however, it is enough to pass the array and the filename.
You are not limited to the original format. You can read a JPEG file and then write it as PNG, or the other way round, as long as you choose appropriate options for each format. This allows you to convert formats easily.
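A format conversion can be sketched as below. The test image, the temporary filenames, and the quality setting of 90 are all hypothetical choices. The 'Quality' option applies to JPEG output, and the three-argument form imwrite(X, map, filename) writes an indexed image together with its colormap.

```matlab
% Hypothetical 8 x 8 truecolor test image with a horizontal red gradient.
I = zeros(8, 8, 3, 'uint8');
I(:, :, 1) = repmat(uint8(linspace(0, 255, 8)), 8, 1);

pngFile = [tempname, '.png'];       % hypothetical temporary filenames
jpgFile = [tempname, '.jpg'];
idxFile = [tempname, '.png'];

% PNG is lossless, so reading the file back returns the same values.
imwrite(I, pngFile);
J = imread(pngFile);

% Convert to JPEG. 'Quality' ranges from 0 to 100 and is JPEG-specific.
imwrite(J, jpgFile, 'Quality', 90);
K = imread(jpgFile);                % same size, but values may differ slightly

% Indexed images take the colormap as the second argument.
X = uint8([0 1; 2 0]);              % zero-based indices for uint8 data
map = [1 0 0; 0 1 0; 0 0 1];
imwrite(X, map, idxFile);
[X2, map2] = imread(idxFile);

% Remove the temporary files.
delete(pngFile);
delete(jpgFile);
delete(idxFile);
```

Because JPEG compression is lossy, compare sizes rather than exact pixel values after a conversion to JPEG.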
Audio Files: Reading into MATLAB
Audio files contain sampled sound data, such as speech or music. In MATLAB you typically use the function audioread to bring audio into the workspace.
Assume you have music.wav in your current folder. You can read it as follows.
[y, Fs] = audioread('music.wav');
Here y is an array that contains the audio samples, and Fs is the sampling frequency, measured in Hertz. The sampling frequency tells you how many samples represent one second of audio.
If the file is mono, y is a column vector of size N x 1, where N is the number of samples. If the file is stereo, y is an array of size N x 2, where each column represents one channel, often left and right.
By default, audioread returns data of type double, with values typically in the range from −1 to 1. These numbers represent the amplitude of the sound wave at each sample time.
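The layout of the returned data can be checked with a synthetic file. This sketch generates one second of a hypothetical two-tone stereo signal, writes it to a temporary WAV file, and reads it back. The tone frequencies, the sampling frequency of 8000 Hz, and the filename are made up for illustration.

```matlab
Fs = 8000;                          % hypothetical sampling frequency in Hz
t = (0:Fs-1)' / Fs;                 % one second of sample times, column vector
left  = 0.5 * sin(2*pi*440*t);      % 440 Hz tone for the first channel
right = 0.5 * sin(2*pi*660*t);      % 660 Hz tone for the second channel

wavFile = [tempname, '.wav'];       % hypothetical temporary filename
audiowrite(wavFile, [left, right], Fs);

[y, FsRead] = audioread(wavFile);

nSamples  = size(y, 1);             % number of rows, one per sample
nChannels = size(y, 2);             % number of columns, one per channel
dataClass = class(y);               % default output class of audioread
peak      = max(abs(y(:)));         % largest absolute amplitude

delete(wavFile);                    % remove the temporary file
```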
You can also read a specific portion of an audio file by giving a sample range instead of reading the whole file. This is a useful technique to avoid loading very large files completely into memory.
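Reading a sample range might look like the sketch below, which passes a two-element vector [first last] as the second argument to audioread. The three-second test tone and the temporary filename are hypothetical. The call to audioinfo reads only the file's metadata, which is useful for finding the total length before choosing a range.

```matlab
Fs = 8000;                          % hypothetical sampling frequency in Hz
t = (0:3*Fs-1)' / Fs;               % three seconds of sample times
y = 0.5 * sin(2*pi*440*t);          % mono test tone

wavFile = [tempname, '.wav'];       % hypothetical temporary filename
audiowrite(wavFile, y, Fs);

% Query the metadata without loading any samples.
info = audioinfo(wavFile);
totalSamples = info.TotalSamples;

% Read only the second second of audio: samples Fs+1 through 2*Fs.
range = [Fs + 1, 2 * Fs];
[segment, FsRead] = audioread(wavFile, range);

% For comparison, read the whole file and index the same range.
yFull = audioread(wavFile);
sameData = isequal(segment, yFull(range(1):range(2), :));

delete(wavFile);                    % remove the temporary file
```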
Listening to Audio
To listen to an audio signal stored in MATLAB, you can use the sound or soundsc functions. They send the audio data to your computer's sound hardware.
Once you have the data y and the sampling frequency Fs, you can play it like this:
[y, Fs] = audioread('music.wav');
sound(y, Fs);
The function plays the sound once. Control returns to the Command Window right away while playback continues in the background, and you can stop playback early with clear sound. The duration of the playback depends on the length of the vector y and the sampling frequency Fs.
The function soundsc is similar but automatically scales the signal to use the available dynamic range, which can be helpful if your audio is very quiet.
Since audio is just numerical data, you can also plot it to see the waveform. That is part of general plotting, which is covered elsewhere, but it is common to plot a small segment of y versus time to visualize the signal.
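The time axis for such a plot follows from the sampling frequency: sample number k occurs at time (k - 1) / Fs seconds. A minimal sketch, using a hypothetical one-second test tone instead of a real recording:

```matlab
Fs = 8000;                          % hypothetical sampling frequency in Hz
y = 0.5 * sin(2*pi*440*(0:Fs-1)' / Fs);   % one second of a 440 Hz tone

N = size(y, 1);                     % number of samples
t = (0:N-1)' / Fs;                  % time of each sample in seconds
duration = N / Fs;                  % total length in seconds

% Plot only the first 10 milliseconds so individual cycles are visible.
idx = 1:round(0.01 * Fs);
plot(t(idx), y(idx));
xlabel('Time (s)');
ylabel('Amplitude');
```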
Basic Manipulations of Audio Data
Because audio signals are numeric arrays, you can apply basic array operations to them as well. Even without advanced signal processing, you can do simple manipulations that are specific to audio files.
One simple modification is changing the volume. Since the amplitude corresponds to loudness, you can scale the audio by multiplying the array y by a constant factor.
[y, Fs] = audioread('music.wav');
y_quiet = 0.5 * y; % reduce volume
sound(y_quiet, Fs);
You must be careful not to scale so much that the values exceed the −1 to 1 range, because that can cause distortion when you play or write the audio. If you suspect the values might exceed the range, you can normalize them by dividing by the maximum absolute value.
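The normalization step can be sketched as follows. The test tone and the gain of 3 are hypothetical; the point is that the signal is divided by its peak only when the peak exceeds 1.

```matlab
Fs = 8000;                          % hypothetical sampling frequency in Hz
y = 0.5 * sin(2*pi*440*(0:Fs-1)' / Fs);   % test tone with peak 0.5

y_loud = 3 * y;                     % peak is now about 1.5, outside [-1, 1]

% Normalize only if the signal would otherwise leave the valid range.
peak = max(abs(y_loud(:)));
if peak > 1
    y_safe = y_loud / peak;         % largest absolute value becomes 1
else
    y_safe = y_loud;                % already within range, leave unchanged
end

newPeak = max(abs(y_safe(:)));
```

Using y_loud(:) makes the peak search work for both mono and stereo data, so all channels are scaled by the same factor.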
You can also do simple editing operations, such as selecting a part of the audio by indexing a range of samples. For example, to play only the first two seconds you can compute the corresponding number of samples as 2 * Fs and then index into y.
Stereo data, stored as two columns, can be manipulated per column. For example, you can mute one channel by setting its column to zero.
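Trimming and muting can both be written with ordinary indexing. The sketch below uses a hypothetical three-second stereo signal. The call to min is a small safeguard, added here as an assumption, for clips that are shorter than two seconds.

```matlab
Fs = 8000;                          % hypothetical sampling frequency in Hz
t = (0:3*Fs-1)' / Fs;               % three seconds of sample times
y = [0.5 * sin(2*pi*440*t), 0.5 * sin(2*pi*660*t)];   % N x 2 stereo signal

% Keep only the first two seconds of both channels.
nKeep = min(2 * Fs, size(y, 1));    % guard against clips shorter than two seconds
y_first2 = y(1:nKeep, :);

% Mute the second channel by setting its column to zero.
y_muted = y;
y_muted(:, 2) = 0;
```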
All of these operations treat the audio as numeric arrays. The only reason they are considered specific to audio is that you listen to the result.
Writing Audio Files
To save audio from MATLAB to a file, you use the audiowrite function. The simplest usage is:
audiowrite('output.wav', y, Fs);
Here y is the signal array, and Fs is the sampling frequency. The file extension .wav tells MATLAB to write a WAV file. You can also choose other formats, such as .flac or .m4a, depending on your MATLAB version and platform support.
Before writing, you should ensure that y is in an appropriate numeric range and type. For typical double precision data, you want values in the interval from −1 to 1. If you pass floating-point data outside that range, audiowrite clips it when writing to the common integer sample formats, which distorts the sound.
You can adjust additional properties such as bit depth and compression by providing extra name-value pairs to audiowrite, but those are advanced details. For basic usage, giving the filename, data, and sampling frequency is sufficient.
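As an illustration of a name-value pair, the following sketch writes a hypothetical test tone with a bit depth of 24 instead of the WAV default of 16, then checks the result with audioinfo. The signal and filename are made up for the example.

```matlab
Fs = 8000;                          % hypothetical sampling frequency in Hz
y = 0.5 * sin(2*pi*440*(0:Fs-1)' / Fs);   % one second test tone

wavFile = [tempname, '.wav'];       % hypothetical temporary filename

% Request 24 bits per sample through a name-value pair.
audiowrite(wavFile, y, Fs, 'BitsPerSample', 24);

% Inspect the written file's properties.
info = audioinfo(wavFile);
bits = info.BitsPerSample;
rate = info.SampleRate;

delete(wavFile);                    % remove the temporary file
```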
Working with File Paths for Media
When working with image and audio files, the rules for paths and folders are the same as for other data files. The functions imread, imwrite, audioread, and audiowrite all accept full or relative paths, not only filenames in the current folder.
For example, to read an image in a subfolder:
I = imread('images/peppers.png');
And to save an audio file into a specific folder:
audiowrite('results/output.wav', y, Fs);
You can combine this with functions for file management covered elsewhere, such as dir or fullfile, to build more robust code that processes many media files in batches.
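A batch loop built from dir and fullfile might look like the sketch below. To keep it self-contained, it first creates a temporary input folder with two small synthetic PNG files. The folder names, filenames, and pixel values are all hypothetical.

```matlab
% Create temporary input and output folders (hypothetical locations).
inDir  = tempname;  mkdir(inDir);
outDir = tempname;  mkdir(outDir);

% Populate the input folder with two small synthetic images.
for k = 1:2
    img = uint8(k * 50 * ones(4, 4, 3));
    imwrite(img, fullfile(inDir, sprintf('img%d.png', k)));
end

% List every PNG file in the input folder and process each one.
files = dir(fullfile(inDir, '*.png'));
for k = 1:numel(files)
    inFile  = fullfile(inDir, files(k).name);
    outFile = fullfile(outDir, files(k).name);

    I = imread(inFile);
    I_flipped = fliplr(I);          % mirror horizontally
    imwrite(I_flipped, outFile);
end

outFiles = dir(fullfile(outDir, '*.png'));
nWritten = numel(outFiles);

% Remove the temporary folders and their contents.
rmdir(inDir, 's');
rmdir(outDir, 's');
```

Building paths with fullfile rather than string concatenation keeps the code independent of the platform's path separator.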
Combining Basic Media Operations
Once you understand how image and audio data appear as arrays, you can combine reading, simple manipulation, and writing to create small but useful scripts. For instance, you can read an image, convert it to grayscale, and save the result using just a few lines of code. Or you can read an audio file, lower its volume or trim it, and write an edited version.
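The audio variant of such a script can be sketched in a few lines. The input file is generated on the fly from a hypothetical two-second test tone so that the example runs without an existing recording, and both filenames are temporary.

```matlab
% Create a hypothetical two-second input file.
Fs = 8000;
inFile = [tempname, '.wav'];
audiowrite(inFile, 0.8 * sin(2*pi*440*(0:2*Fs-1)' / Fs), Fs);

% Read, edit, and write: keep the first second at half volume.
[y, Fs] = audioread(inFile);
y_edit = 0.5 * y(1:Fs, :);
outFile = [tempname, '.wav'];
audiowrite(outFile, y_edit, Fs);

% Read the result back to confirm its length and level.
[z, FsOut] = audioread(outFile);
outPeak = max(abs(z(:)));

% Remove the temporary files.
delete(inFile);
delete(outFile);
```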
These tasks introduce you to an important pattern. MATLAB loads external data into arrays, you operate on those arrays using core language features, and then you write the results back out. Images and audio files are special only in how they are visualized or heard. Inside MATLAB they are just arrays that you can inspect, modify, and use like any other data.
Important points to remember:
imread reads image files into arrays. Use imshow to display them, and imwrite to save them back to disk.
Color images usually appear as m x n x 3 arrays, often of type uint8. Grayscale images are 2D arrays.
audioread reads audio files into numeric arrays and returns the sampling frequency. Use sound or soundsc to play the audio, and audiowrite to save it.
Images and audio in MATLAB are just arrays. You can apply standard array operations for simple manipulations before writing new files.