
8.2 Creating and Manipulating Strings

Introduction

In MATLAB you will often need to create, combine, and modify text. Text can be stored as character arrays or as string arrays. In this chapter the focus is on practical operations that you perform on strings once you have them, such as concatenating pieces, changing case, extracting parts, and building messages that mix text and numbers.

Creating String Arrays

You create a string scalar with double quotes. For example s = "hello"; creates a single string. A string array uses square brackets with elements separated by commas or spaces. For example names = ["Alice" "Bob" "Charlie"]; creates a row string array with three elements.

A column string array is created with semicolons. For example names = ["Alice"; "Bob"; "Charlie"]; gives a column of strings. MATLAB treats each element of a string array as an independent piece of text.

You can also create an empty string with "". An empty string is different from a missing string. A missing string is created with string(missing), or by placing missing inside a string array, and represents an unknown or unavailable text entry.
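As a quick check of the shapes and values described in this section, the following sketch builds a row array, a column array, an empty string, and a missing string. The variable names are illustrative only.

```matlab
row     = ["Alice" "Bob" "Charlie"];     % 1-by-3 string array
col     = ["Alice"; "Bob"; "Charlie"];   % 3-by-1 string array
blank   = "";                            % string scalar with zero characters
unknown = string(missing);               % missing string scalar

disp(size(row))            % 1 3
disp(size(col))            % 3 1
disp(strlength(blank))     % 0
disp(ismissing(blank))     % 0 (false): empty is not missing
disp(ismissing(unknown))   % 1 (true)
```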

Concatenating and Joining Strings

Combining strings is one of the most common operations. MATLAB supports simple concatenation with the + operator and higher level joining with functions such as strcat, append, and strjoin.

If you have two strings first = "Hello"; and second = "world"; you can concatenate them directly as combined = first + second;. This produces "Helloworld" because there is no space added automatically. To include a space, include it explicitly, for example combined = first + " " + second; which gives "Hello world". Square brackets behave differently for strings than for character vectors: [first second] does not merge the text but creates a 1 by 2 string array ["Hello" "world"].

The append function works similarly and is often clearer when joining multiple pieces. For example msg = append("The value is ", "42");. When you call append with string arrays of compatible sizes it operates elementwise. For example if prefix = "Item "; and nums = ["1" "2" "3"]; then labels = append(prefix, nums); produces ["Item 1" "Item 2" "Item 3"].

The strcat function also concatenates text, but it removes trailing whitespace from character arrays before joining. With string arrays, append is usually easier to understand, and it does not remove spaces.

When you have many strings and you want to create a single string with a separator, use strjoin. For example if names = ["Alice","Bob","Charlie"]; then line = strjoin(names, ", "); returns "Alice, Bob, Charlie". The second argument is the delimiter that appears between elements.
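The differences between these joining approaches are easy to mix up, so here is a small side-by-side sketch. It assumes nothing beyond the functions named in this section; the variable names are illustrative.

```matlab
first  = "Hello";
second = "world";

pair       = [first second];              % 1-by-2 string array, not one string
plusJoined = first + " " + second;        % "Hello world"
appended   = append(first, " ", second);  % "Hello world"

% strcat drops trailing whitespace only for character vector inputs
c = strcat('Hello ', 'world');            % 'Helloworld' (char)
s = strcat("Hello ", "world");            % "Hello world" (string)

names = ["Alice" "Bob" "Charlie"];
line  = strjoin(names, ", ");             % "Alice, Bob, Charlie"
```

Note that square brackets around strings build an array, so they are the right tool for collecting strings, not for merging them.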

Repeating Strings and Creating Patterns

Sometimes you need repeated patterns of text. The repmat function creates repeated blocks of a string array. For example if s = "ha"; then block = repmat(s, 1, 3); creates a 1 by 3 string array ["ha" "ha" "ha"]. If you want a single string that repeats a pattern, you can combine repmat with strjoin. For example:

s = "ha";
parts = repmat(s, 1, 3);
laugh = strjoin(parts, "");

This produces "hahaha".

Another approach is to use compose with a numeric sequence when you need numbered labels. For example:

idx = 1:4;
labels = compose("Trial %d", idx);

This gives ["Trial 1" "Trial 2" "Trial 3" "Trial 4"] as a string array.

Changing Case and Basic Transformations

MATLAB offers simple functions to transform the appearance of strings. The upper function converts letters to uppercase, for example upper("Matlab") returns "MATLAB". The lower function converts to lowercase, so lower("Matlab") returns "matlab".

MATLAB has no built-in capitalize function. To capitalize the first letter of each word and make the rest lowercase, split the lowercased text into words, convert the first character of each word with upper, and join the words again. If you apply upper or lower to a string array, they act elementwise on each string in the array.
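MATLAB does not ship a function named capitalize, so title casing has to be assembled from other string functions. The following is one possible sketch, assuming the words are separated by single spaces and every word has at least one character; the variable names are illustrative.

```matlab
s = "hello WORLD";                               % assumed: single spaces between words
words        = split(lower(s));                  % ["hello"; "world"]
firstLetters = upper(extractBefore(words, 2));   % ["H"; "W"]
rest         = extractAfter(words, 1);           % ["ello"; "orld"]
titled       = join(firstLetters + rest, " ");   % "Hello World"
```

Input with repeated spaces would produce empty words, for which extractBefore(words, 2) fails, so such text would need cleaning first.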

You can also trim whitespace from the start and end of a string. The strtrim function removes leading and trailing whitespace characters, for example strtrim(" padded text ") returns "padded text". For string arrays, strtrim processes each element.

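If you need to pad or strip specific characters rather than just whitespace, use strip and pad. For example strip(str, "x") removes leading and trailing "x" characters, and pad(str, 10) adds spaces on the right until each string has 10 characters. These functions are most useful when preparing data for further processing.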
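The transformations in this section can be combined freely. The following sketch shows them on small inputs; the sample text and variable names are made up for illustration.

```matlab
s = "  Hello World  ";
trimmed = strtrim(s);                          % "Hello World"
shout   = upper(trimmed);                      % "HELLO WORLD"
quiet   = lower(trimmed);                      % "hello world"

stripped = strip("xxdataxx", "x");             % "data"
leftOnly = strip("xxdataxx", "left", "x");     % "dataxx"
padded   = pad("ab", 5);                       % "ab   " (spaces added on the right)
zeroPad  = pad("7", 3, "left", "0");           % "007"
```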

Extracting Substrings by Position

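You often need to extract parts of a string by position. Unlike character vectors, strings are not indexed character by character with parentheses. For a string scalar s = "abcdef"; the expression s(2) is an error, because s has only one element. Use extractBetween with start and end positions instead: extractBetween(s, 2, 2) returns "b" as a string of length 1, and extractBetween(s, 2, 4) returns "bcd".

If you work with a string array, s(k) indexes elements, not characters. For example if s = ["red","green","blue"]; then s(2) is "green". MATLAB does not allow chained indexing such as s(2)(1:3). To extract characters from a specific element, pass that element to an extract function: extractBefore(s(2), 4) takes the second string and returns its first three characters, "gre". Curly braces give access to the underlying character vector, so s{2}(1:3) returns 'gre' as a char.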

When you want to take the first or last few characters of each string in an array, pass a numeric position to extractBefore or extractAfter, for example extractBefore(s, 4) for the first three characters. Both functions also accept a marker substring. For a single string s = "report_2025.txt"; the expression extractBefore(s, ".") returns "report_2025" because it takes everything before the first period. Similarly extractAfter(s, "_") returns "2025.txt".
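Because a string scalar is a single array element, character positions are reached through the extract functions or through curly braces rather than parentheses. A short sketch of the position-based forms, with illustrative variable names:

```matlab
s = "abcdef";
second     = extractBetween(s, 2, 2);             % "b"
middle     = extractBetween(s, 2, 4);             % "bcd"
firstThree = extractBefore(s, 4);                 % "abc" (everything before position 4)
lastTwo    = extractAfter(s, strlength(s) - 2);   % "ef"  (everything after position 4)
asChar     = s{1}(2:4);                           % 'bcd' as a character vector

colors   = ["red" "green" "blue"];
oneElem  = extractBefore(colors(2), 4);           % "gre"
prefixes = extractBefore(colors, 3);              % ["re" "gr" "bl"]
```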

Extracting Substrings with Patterns and Positions

Instead of numeric positions, you can use specified substrings or pattern objects as markers. The function extractBetween takes a start marker and an end marker. For example if s = "Name: Alice, Age: 30"; then:

namePart = extractBetween(s, "Name: ", ",");

returns "Alice" as a string. When you give extractBetween string arrays as input it works elementwise and can return string arrays of substrings.

The extractBefore and extractAfter functions are special cases. For example:

file = "results_final.csv";
prefix = extractBefore(file, "_final");   % "results"
ext    = extractAfter(file, ".");         % "csv"

These operations are useful when parsing filenames, codes, or identifiers that follow predictable formats.

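If the marker substring occurs more than once, extractBefore and extractAfter use its first occurrence, and they have no argument for choosing a later one. To use another occurrence, find the positions with strfind and pass a numeric position instead. For example, with pos = strfind(s, ":"); the call extractBefore(s, pos(2)) uses the second colon as the reference point.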
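When a later occurrence of a marker is needed, one option is to locate all occurrences with strfind and then extract by numeric position. A minimal sketch, assuming the marker occurs at least twice in the text:

```matlab
s = "Name: Alice, Age: 30";
colonPos = strfind(s, ":");                     % [5 17]

beforeSecond = extractBefore(s, colonPos(2));   % "Name: Alice, Age"
afterSecond  = extractAfter(s, colonPos(2));    % " 30"
age          = strtrim(afterSecond);            % "30"
```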

Replacing and Removing Substrings

You can modify existing strings by replacing parts of them. The basic function is replace. For example:

s = "color: red";
t = replace(s, "red", "blue");

This gives "color: blue". If the substring appears multiple times, replace replaces them all. For example replace("aaa", "a", "b") returns "bbb".

To remove a substring, replace it with an empty string. For example clean = replace("a-b-c", "-", ""); gives "abc". For a string array, replace works elementwise. If codes = ["AB-1","CD-2"]; then replace(codes, "-", "") returns ["AB1","CD2"].

If you need more control by position, use replaceBetween with start and end positions. For example:

s = "abcdef";
s = replaceBetween(s, 3, 3, "X");  % "abXdef"

This keeps the first two characters, replaces the third with "X", and keeps the rest starting from the fourth character.

Splitting Strings into Pieces

To divide a string into parts, use split and related functions. For example, if s = "red,green,blue"; then parts = split(s, ","); returns a 3 by 1 column string array ["red"; "green"; "blue"]. You can omit the delimiter when splitting on whitespace. For example words = split("one two three"); treats any whitespace as a separator and produces ["one"; "two"; "three"].

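The split function has no option for limiting the number of pieces. Its third argument selects the dimension along which to split. To split only at the first occurrence of a delimiter, combine extractBefore and extractAfter, which both use the first occurrence. For example:

s = "header: value: extra";
pieces = [extractBefore(s, ":") extractAfter(s, ":")];

returns ["header" " value: extra"].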

When you only care about dividing into lines, use splitlines. For example:

text = compose("first line\nsecond line\r\nthird line");  % compose translates \n and \r\n
lines = splitlines(text);

This returns each line as an element of a string array and automatically handles different newline conventions.
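Two details are worth checking when splitting: the shape of the result, and whether the text really contains newline characters. The sketch below uses the newline function to build the line break; the variable names are illustrative.

```matlab
parts = split("red,green,blue", ",");
disp(size(parts))                    % 3 1: split returns a column for a string scalar
rowParts = parts.';                  % transpose when a row is more convenient

text  = "first line" + newline + "second line";
lines = splitlines(text);            % ["first line"; "second line"]

% A double-quoted literal keeps \n as two ordinary characters, so nothing is split
literal = splitlines("a\nb");        % "a\nb", still one element
```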

Joining Arrays Back into Strings

After splitting, you might need to glue pieces back together. As before, strjoin joins a string array into a single string. For example:

parts = ["2025" "12" "31"];
dateStr = strjoin(parts, "-");  % "2025-12-31"

The delimiter can be any string, not only a single character.

strjoin expects a 1 by N input and has no dimension argument. For multi row string arrays, use join, which acts along a dimension if you specify it. For example join(names, ", ", 2) joins elements across columns for each row.
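For arrays with more than one row, the join function takes a dimension argument. A small sketch on a hypothetical 2 by 3 array:

```matlab
grid = ["a" "b" "c"; "d" "e" "f"];   % 2-by-3 string array

rowsJoined = join(grid, ", ", 2);    % ["a, b, c"; "d, e, f"], one string per row
colsJoined = join(grid, "-", 1);     % ["a-d" "b-e" "c-f"], one string per column
```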

Building Strings from Numbers and Variables

In practice you often build messages that mix text with numeric values. If you add a number to a string with +, MATLAB converts the number to text using its default format, so "n = " + 5 gives "n = 5". Adding a number to a character vector instead does arithmetic on the character codes. To be explicit, or to control the format, convert the numbers yourself or use formatting functions.

The string function converts many types to string form. For example:

x = 3.14159;
sx = string(x);      % "3.1416" (shortened according to display rules)
msg = "pi is about " + sx;

The + operator between strings is equivalent to append.
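The + operator also accepts a number on one side and converts it with the same default formatting that string uses. A brief sketch, with illustrative variable names, that also shows the different behavior of character vectors:

```matlab
count = 5;
msg1 = "Count: " + count;        % "Count: 5"
msg2 = "pi is about " + pi;      % "pi is about 3.1416"

% With a character vector, + works on character codes instead of text
codes = 'abc' + 1;               % [98 99 100]
```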

If you need control over the numeric format, use compose or sprintf. The compose function returns strings directly. For example:

x = 3.14159;
msg = compose("pi is %.3f", x);  % "pi is 3.142"

You can use multiple placeholders. For example:

n = 5;
value = 10.2;
info = compose("n = %d, value = %.2f", n, value);

This produces "n = 5, value = 10.20". For arrays of numbers, compose creates a string array with one element per row of inputs.
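To illustrate how compose handles array inputs, the sketch below formats a column vector and then a matrix in which each row supplies the values for one output string. The data values are made up for illustration.

```matlab
vals  = [1.5; 2.25; 10];
lines = compose("value = %.2f", vals);
% 3-by-1: "value = 1.50", "value = 2.25", "value = 10.00"

pairs   = [1 10.5; 2 20.25];      % each row holds n and value
rowsOut = compose("n = %d, value = %.2f", pairs);
% 2-by-1: "n = 1, value = 10.50", "n = 2, value = 20.25"
```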

You can also use sprintf. Its output type matches the type of the format: a character vector format returns a character vector, and a string format returns a string. If you have a character vector, convert it with string to remain in the string workflow. For example:

infoChar = sprintf('Result: %.2f', value);  % char vector
infoStr  = string(infoChar);               % string scalar

Handling Missing and Empty Strings

When manipulating strings that come from files or user input, some entries can be missing or empty. In a string array, missing values are represented with missing. For example:

s = ["Alice", missing, "Charlie"];

If you apply functions such as upper, the missing elements stay missing.

Empty strings "" are different. They are valid strings with zero characters. For example splitting an empty string on a delimiter can still produce an empty element, and concatenating with empty strings leaves the other part unchanged.

Some functions treat missing and empty differently. For example, ismissing returns true only for missing elements, while a comparison such as s == "" is true only for empty strings, for which strlength returns 0. When writing code, keep track of which kind of absence you want to use. For unavailable or unknown text, use missing. For a deliberate blank entry, use "".
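The two kinds of absence can be told apart with ismissing and a comparison against "". The following sketch uses a made-up array; the replacement text "unknown" is only an example choice.

```matlab
s = ["Alice", missing, "", "Charlie"];

isUnknown = ismissing(s);      % [false true false false]
isBlank   = (s == "");         % [false false true false]
shout     = upper(s);          % the missing element stays missing

cleaned = s;
cleaned(isUnknown) = "unknown";   % replace missing entries with placeholder text
```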

Practical Examples

A common pattern is to process filenames. Suppose you have:

files = ["data_01.csv","data_02.csv","data_03.csv"];
base   = extractBefore(files, ".");      % "data_01", ...
index  = extractAfter(base, "data_");    % "01", "02", "03"
labels = append("Run ", index);          % "Run 01", "Run 02", "Run 03"

Here you combine substring extraction and appending to create human readable labels.

Another example is preparing a simple report line:

name   = "Alice";
score  = 92.5;
report = compose("Student %s scored %.1f%%.", name, score);

The resulting string contains both text and a formatted numeric value. You can store multiple such lines in a string array and write them to a file or display them.

Important points to remember:
Use double quotes for strings and + or functions like append and strjoin to combine them. Square brackets build string arrays.
Use extractBefore, extractAfter, and extractBetween to take substrings without manually counting positions.
Use replace to change or remove parts of a string and split or splitlines to break strings into pieces.
Convert numbers to text with string or compose before concatenating with strings.
Distinguish between "" for an empty string and missing for an unknown or unavailable value.
