MAT-Files and Importing External Data

Overview

In MATLAB you will often need to keep data between sessions and to bring in data that was created elsewhere. MATLAB offers its own native file format, called MAT-files, and can also import many external text, spreadsheet, and binary formats. This chapter focuses on how to use MAT-files effectively and how to import external data into MATLAB so that you can continue working with it as variables in your workspace.

What Are MAT-Files

A MAT-file is a binary file format used by MATLAB to store variables. When you save your workspace or selected variables using MATLAB commands or the graphical interface, MATLAB writes them into a .mat file.

MAT-files can store many types of variables in a single file. For example, a MAT-file might contain numeric arrays, tables, structures, or cell arrays together. Each variable keeps its name and type, so when you load it back into MATLAB you recover the variables as they were when saved.

MAT-files are especially useful when you work with large data sets or complex collections of variables that you want to reuse later or share with other MATLAB users.

Saving Variables to MAT-Files

You save variables to a MAT-file with the save command. If you call save without any options, MATLAB saves all variables in the current workspace to a file named matlab.mat in the current folder.

To save to a specific file, provide a file name as a string. For example, to save all current variables into results.mat you can write:

matlab
save('results.mat')

Often you do not want to save every variable, only some of them. You can list variable names after the file name. For example, to save only x and y into the file mydata.mat use:

matlab
save('mydata.mat','x','y')

If a file with that name already exists in the current folder, save overwrites it without asking. If you want to be more deliberate, you should first check with the Current Folder browser or with commands like exist before running save. This habit helps avoid losing previous data.
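As a minimal sketch of that habit, the following guards a save call with exist. The variable values and the file name mydata_demo.mat are made up for illustration, and tempdir is used only so the demo does not touch your project folder.

```matlab
% Hypothetical demo data and file name (not from the chapter).
x = 1:5;
y = x.^2;
outFile = fullfile(tempdir, 'mydata_demo.mat');

if exist(outFile, 'file') == 2
    % exist returns 2 when the name refers to an existing file
    warning('File %s already exists; nothing was saved.', outFile);
else
    save(outFile, 'x', 'y');
end
```

Whether to warn, stop with an error, or pick a new file name is a design choice; the point is only that the check happens before save runs.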

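You can also call save in command syntax, without parentheses or quotes. The first word is still the file name, and the remaining words are variable names:

matlab
save mydata x y

This saves x and y to mydata.mat. Note that save x y would save the variable y to a file named x.mat, not x and y to matlab.mat. Command syntax is short but less explicit, so it is usually better avoided in scripts where clarity is important.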

MAT-File Versions and Compatibility

MAT-files have internal versions that control which features and data sizes they support. Unless you change the preference, current MATLAB releases save in version 7 by default, which supports compression and most data types but limits each variable to less than 2 GB.

On some systems you may need to choose a particular MAT-file version, for example to share data with someone who uses an older MATLAB release. You can do this through the -v options of save. For instance:

matlab
save('oldformat.mat','x','y','-v7')

This explicitly requests the version 7 MAT-file format, which MATLAB 7.0 (R14) and later can read. For even older releases, -v6 is available at the cost of features such as compression and Unicode text. The exact options available depend on your MATLAB release, but the key idea is that the version can be chosen when required.

For very large data, MATLAB offers the version 7.3 format, which is based on HDF5. It supports variables of 2 GB or more and allows some interaction with external tools such as HDF5 utilities. It is not the default, so you request it explicitly:

matlab
save('bigdata.mat','bigArray','-v7.3')

You only need this if you are dealing with very large arrays or need interoperability with other applications that read HDF5.

Selective Saving and Appending

Sometimes you need to update an existing MAT-file without replacing everything inside it. MATLAB can append new variables to a MAT-file or update existing ones, while leaving other stored variables unchanged.

To append, use the -append option. Suppose the file results.mat already exists with variables x and y, and now you have computed z. You can add z to the file with:

matlab
save('results.mat','z','-append')

If z already existed in the file, this command replaces it with the current value. Any other variables in the file, such as x and y, remain unchanged.

You can also combine -append with a version option, although you should be careful to use a version that is compatible with the existing file. Changing formats within a single file is not supported.

Selective saving is particularly useful inside longer computations. You can periodically save only the most important results as they are produced, rather than rewriting the entire collection of variables every time.
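The idea can be sketched as a loop that writes a checkpoint every few iterations. The file name checkpoint_demo.mat, the variable names, and the interval of five iterations are assumptions chosen for illustration.

```matlab
% Hypothetical checkpoint file and variables (illustration only).
ckpt = fullfile(tempdir, 'checkpoint_demo.mat');

runningSum = 0;
k = 0;
save(ckpt, 'runningSum', 'k');          % create the file once

for k = 1:10
    runningSum = runningSum + k^2;      % stand-in for a longer computation
    if mod(k, 5) == 0
        % update only these two variables; anything else in the file is kept
        save(ckpt, 'runningSum', 'k', '-append');
    end
end

s = load(ckpt);                         % read the last checkpoint back
```

After the loop, s holds the values written at the final checkpoint, so an interrupted run could resume from s.k instead of starting over.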

Loading Variables from MAT-Files

To bring data from a MAT-file into your workspace, use the load command. If you run load with only a file name, MATLAB reads all variables from the file into the current workspace.

For example, to load everything from results.mat you write:

matlab
load('results.mat')

After this call, any variables stored in the file appear in your workspace. If a variable already exists in your workspace with the same name, the value from the file overwrites it without warning.

If you want to avoid unintentionally replacing variables, you can load specific variables by name. For example, to load only x and y from results.mat write:

matlab
load('results.mat','x','y')

You can also use load without an extension if the file is a MAT-file in the current folder. MATLAB assumes the .mat extension. For example, load results is equivalent to load('results.mat'), but explicitly writing the full name is usually clearer.

Loading into a Structure Instead of the Workspace

When you want to inspect file contents or avoid polluting the base workspace, it can be helpful to load the contents of a MAT-file into a single structure variable. In this mode, each stored variable becomes a field of the structure.

To do this, call load with an output argument:

matlab
data = load('results.mat');

The result, data, is a structure where the field names match the stored variable names. For example, if results.mat contains variables x and y, then data.x and data.y hold those values.

This method protects your current workspace from accidental overwrites, and it helps keep related data grouped together under one structure. It is particularly helpful when you are writing functions, where keeping track of a clean set of variables is important.

If you load only some variables, then only those become fields:

matlab
subset = load('results.mat','x');

In this case subset has a field x, and not y, even if y is present in the file.
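A short round trip illustrates the difference from a plain load. The file name results_demo.mat and the stored values are hypothetical; the sketch saves two variables, clears them, and then loads them into a structure.

```matlab
% Hypothetical demo file and values (illustration only).
demoFile = fullfile(tempdir, 'results_demo.mat');
x = [1 2 3];
y = 'label';
save(demoFile, 'x', 'y');
clear x y                               % the workspace no longer has x or y

data = load(demoFile);                  % variables become fields of data

if isfield(data, 'x')
    xLoaded = data.x;                   % access a stored variable by field name
end
stillNoX = ~exist('x', 'var');          % true: load did not create x itself
```

Checking isfield before using a field is one way to handle files whose contents you do not fully control.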

Inspecting MAT-File Contents

Sometimes you need to know what is inside a MAT-file before loading it. MATLAB provides the whos function with a file argument for this purpose. Instead of running whos on the workspace, you can ask for information about the file contents:

matlab
whos('-file','results.mat')

This command lists the variables stored in results.mat, along with their sizes and types, without adding anything to your workspace. This is useful when you receive a MAT-file from someone else and you want to understand what it contains.

You can also capture this information into a structure array by assigning the output of whos. For example:

matlab
info = whos('-file','results.mat');

Then info contains details about each variable, which you can inspect in the Command Window or Variable Editor.
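One possible use of that structure array is to check whether a variable exists in the file before loading it. The file name whos_demo.mat and the variables below are assumptions made for this sketch.

```matlab
% Hypothetical demo file and variables (illustration only).
demoFile = fullfile(tempdir, 'whos_demo.mat');
x = rand(3, 4);
label = 'run A';
save(demoFile, 'x', 'label');

info = whos('-file', demoFile);         % one struct element per stored variable
names = {info.name};                    % collect the names into a cell array

if ismember('x', names)
    idx = strcmp(names, 'x');
    fprintf('x is a %s array of size %s\n', ...
        info(idx).class, mat2str(info(idx).size));
    s = load(demoFile, 'x');            % load only what was confirmed to exist
end
```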

Importing Text and Delimited Data

External data often arrives as text files, for example plain text with columns separated by spaces, tabs, or commas. MATLAB can read such data using several functions. For text and comma-separated values, modern MATLAB recommends readtable and readmatrix, which offer more control than the older simple readers.

Assume you have a comma separated file data.csv with numeric values in columns. To read it as a numeric array, you can write:

matlab
M = readmatrix('data.csv');

This returns a matrix M where each row corresponds to a line in the file and each column to a separated field. This is convenient for straightforward numeric data.

If the file has a header row with column names or contains mixed types such as numbers and text, readtable is often more suitable. To read the same data.csv as a table, you could write:

matlab
T = readtable('data.csv');

MATLAB then uses the first row as variable names if it looks like a header. Each file column becomes a variable in the table, and you can access it by name.
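To make the difference concrete, this sketch writes a tiny CSV file and reads it both ways. The file name data_demo.csv and the column names time and temperature are invented for the example.

```matlab
% Write a small hypothetical CSV file with a header row.
csvFile = fullfile(tempdir, 'data_demo.csv');
fid = fopen(csvFile, 'w');
fprintf(fid, 'time,temperature\n0,20.5\n1,21.0\n2,21.7\n');
fclose(fid);

T = readtable(csvFile);                 % header row becomes variable names
meanTemp = mean(T.temperature);         % access a column by name

M = readmatrix(csvFile);                % numeric data only, header is skipped
```

The table keeps the column names with the data, while the matrix leaves it to you to remember which column is which.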

For text files that are not comma separated, you can specify the delimiter. For example, if measurements.txt uses semicolons between fields:

matlab
T = readtable('measurements.txt','Delimiter',';');

MATLAB then splits each line at the semicolons.

There are older functions such as dlmread, textread, and csvread. In newer MATLAB versions readmatrix and readtable are recommended, since they are more flexible and consistent.

Handling Headers, Missing Data, and Options

Real world data files often contain descriptive text headers, missing values, or extra columns that you do not need. readtable and related functions allow you to deal with these issues through name value options.

Suppose a CSV file has two lines of header text before the actual column names. You can skip them like this:

matlab
T = readtable('data.csv','NumHeaderLines',2);

If your file uses a particular string to mark missing values, such as NA, you can tell MATLAB to treat those as missing. For example:

matlab
T = readtable('data.csv','TreatAsEmpty','NA');

This option applies to numeric columns. The resulting table contains NaN in place of those entries, which you can handle later using table and missing value tools.

You can also control which rows and columns are read with options such as Range. For example, to read only a block of data from rows 5 to 20 and columns 2 to 4 in an Excel like range syntax:

matlab
T = readtable('data.csv','Range','B5:D20');

The exact options available depend on the reading function, but the pattern is similar. You supply the file name and then specify additional behavior with name value pairs.
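The header and missing-value options can be combined in one call. The sketch below assumes a made-up instrument export, written to the hypothetical file headers_demo.csv, with two descriptive lines, a row of column names, and NA marking a missing reading.

```matlab
% Write a hypothetical file: two header lines, column names, then data.
f = fullfile(tempdir, 'headers_demo.csv');
fid = fopen(f, 'w');
fprintf(fid, '# exported by instrument\n');
fprintf(fid, '# units: seconds, volts\n');
fprintf(fid, 't,v\n0,1.0\n1,NA\n2,3.0\n');
fclose(fid);

% Skip the two descriptive lines and treat NA as an empty numeric entry.
T = readtable(f, 'NumHeaderLines', 2, 'TreatAsEmpty', 'NA');

missingRows = isnan(T.v);               % logical mask of missing readings
Tclean = rmmissing(T);                  % drop rows that contain missing values
```

Dropping rows is only one way to handle missing values; depending on the analysis, filling or interpolating may be more appropriate.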

Importing Excel and Spreadsheet Data

Many data sets are stored in Excel files with extension .xlsx or .xls. MATLAB can read these files directly with readtable and readmatrix. Older code often uses xlsread instead, which is no longer recommended.

To read an Excel sheet into a table, you can write:

matlab
T = readtable('experiment.xlsx');

By default this imports the first sheet and tries to detect variable names and data types. If you need a specific sheet, you can name it:

matlab
T = readtable('experiment.xlsx','Sheet','Sheet2');

You can also use the sheet index:

matlab
T = readtable('experiment.xlsx','Sheet',3);

For purely numeric data, readmatrix may be more direct:

matlab
A = readmatrix('experiment.xlsx','Sheet','Results');

You can combine sheet selection with range selection, exactly as for text files, to import only parts of the data. This helps reduce memory use and simplifies your workspace.
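As a sketch of combining the two options, the following creates a small workbook with writetable and then reads back a single column. The file name experiment_demo.xlsx, the sheet name Results, and the column names are assumptions for this example.

```matlab
% Create a hypothetical workbook with one sheet named Results.
xlsFile = fullfile(tempdir, 'experiment_demo.xlsx');
T0 = table((1:4)', [10; 20; 30; 40], 'VariableNames', {'trial', 'score'});
writetable(T0, xlsFile, 'Sheet', 'Results');   % row 1 holds the column names

% Read only the score values: column B, rows 2 to 5 of that sheet.
A = readmatrix(xlsFile, 'Sheet', 'Results', 'Range', 'B2:B5');
```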

Import Tool and Interactive Import

If you prefer not to write import code immediately, MATLAB includes an interactive Import Tool. When you double click a text or spreadsheet file in the Current Folder browser, MATLAB often opens it in the Import Tool. Alternatively you can start it from the Home tab or by right clicking the file and choosing an import option.

The Import Tool displays the file contents in a grid and attempts to interpret columns as numeric, text, or other types. You can choose which columns to import, rename variables, adjust the detected data type, and define how headers and delimiters should be handled. As you adjust the options, the preview updates.

At the end of the process you can import the data directly into the workspace or generate a function that reproduces the same import settings using code. This generated function is useful because it documents how the import is done, and you can rerun it if the source file changes.

Using the Import Tool is an effective way to explore the file structure and then move to scripted imports once you know what settings you need.

Importing Binary and Other External Formats

Some external data is stored in binary formats that are not text or spreadsheets. These can include custom formats from instruments or other programming languages. MATLAB can read arbitrary binary data using functions such as fopen, fread, and fclose. These functions give you full control over how bytes from a file are interpreted as numeric or other types.

For example, to read 100 double precision numbers from a binary file raw.bin, you might write:

matlab
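fid = fopen('raw.bin','r');              % fopen returns -1 if the file cannot be opened
assert(fid ~= -1, 'Could not open raw.bin');
data = fread(fid,100,'double');          % read up to 100 doubles into a column vector
fclose(fid);

Here fid is a file identifier, fread reads up to 100 values of class double from the file (fewer if the file is shorter), and fclose closes the file when you are done. The details depend on how the data was written, especially on the number type, the byte order, and any file header.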
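A self-contained round trip can help confirm that the read matches the way the data was written. The file name raw_demo.bin and the test values are made up for this sketch, which writes 100 doubles with fwrite and reads them back.

```matlab
% Write 100 hypothetical double values to a binary file.
binFile = fullfile(tempdir, 'raw_demo.bin');
fid = fopen(binFile, 'w');
fwrite(fid, linspace(0, 1, 100), 'double');
fclose(fid);

% Read them back, checking that the file opened and how many values arrived.
fid = fopen(binFile, 'r');
if fid == -1
    error('Could not open %s', binFile);
end
[data, count] = fread(fid, 100, 'double');   % count is the number actually read
fclose(fid);
```

Comparing count with the number of values you expected is a cheap way to detect a truncated file or a wrong number type.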

MATLAB also supports many specialized file formats through dedicated functions, such as image and audio files, which are discussed elsewhere. For those, you usually do not need to think about binary layout directly.

Basic Troubleshooting When Importing Data

Importing external data sometimes fails on the first attempt. Common issues include incorrect delimiters, unexpected header content, inconsistent rows, or mixed data types in the same column.

If a call to readtable or readmatrix reports an error, first verify the file contents with a text editor or with the Import Tool. Check whether the delimiter is what you expect, whether there are header lines that need to be skipped, and whether every row contains the same number of fields.

When data types are mixed, for example some entries are numeric and others are text, MATLAB may represent the whole column as text. In that case you might need to clean the file, adjust import options, or post process the column to convert valid numbers while handling problematic entries separately.
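One way to post process such a column is str2double, which returns NaN for entries that are not valid numbers. The sample values below are invented to stand in for a column that was imported as text.

```matlab
% Hypothetical column imported as text because of one bad entry.
raw = {'1.5'; '2.0'; 'n/a'; '4.25'};

vals = str2double(raw);                 % invalid entries become NaN
bad = isnan(vals);                      % mask of problematic entries
clean = vals(~bad);                     % keep only the valid numbers

fprintf('%d of %d entries could not be converted\n', nnz(bad), numel(raw));
```

Keeping the mask bad lets you inspect or report the problematic entries separately instead of silently discarding them.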

If values appear shifted into the wrong variables, reconsider your choice of delimiter and range. Often a single extra separator in the file can misalign columns. Narrowing the range or adjusting the delimiter argument usually resolves this.

When dealing with MAT-files, if load cannot read a file, confirm that the file is actually a MAT-file and that it is not corrupt. A text file saved with a .mat extension will not load. You can inspect the file by opening it as text to see if it is human readable. Proper MAT-files appear as binary when opened in a text editor.
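In a script, a failed load can be caught rather than stopping the run. This sketch deliberately creates a text file with a .mat extension, under the hypothetical name not_really_demo.mat, and wraps load in try and catch.

```matlab
% Create a hypothetical text file that only pretends to be a MAT-file.
fakeFile = fullfile(tempdir, 'not_really_demo.mat');
fid = fopen(fakeFile, 'w');
fprintf(fid, 'hello');
fclose(fid);

try
    s = load(fakeFile);                 % fails: the content is not a MAT-file
    loadedOk = true;
catch err
    loadedOk = false;
    fprintf('Could not load %s: %s\n', fakeFile, err.message);
end
```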

MAT-files store variables exactly as they exist when saved, including names and types. Loading without care can overwrite variables in your workspace that share the same names. Use selective save and load, and consider loading into a structure when you want to avoid conflicts.
When importing external data, always verify delimiters, headers, and data types. Prefer readtable or readmatrix for text and spreadsheet files, and use the Import Tool to explore unfamiliar files and generate reliable import code.
