Kahibaro

Basic Linux command line usage

Why the Command Line Matters in HPC

On most HPC systems you interact with the machines through a text-based interface called the command line. Graphical desktops are rare on production clusters. To submit jobs, manage files, inspect results, and use software, you type commands into a shell such as bash on a login node. Becoming comfortable with basic command line usage is essential, even if you never plan to be a professional system administrator.

This chapter focuses on the practical, day-to-day commands and patterns you will actually use when working on an HPC cluster. It assumes you are already logged into a Linux system, often via SSH, and have a shell prompt in front of you.

The Shell Prompt and Running Commands

When you log in, you typically see a prompt that looks similar to

user@login-node:~$

or

[user@cluster01 ~]$

The prompt usually encodes your username, the host name of the machine, and your current directory. To run a command, you type its name and press Enter. For example:

hostname

prints the name of the machine you are on. The shell finds the hostname program by searching directories listed in the PATH environment variable.
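As a small illustration of that lookup, the sketch below prints the directories in PATH one per line and asks the shell where it would find ls. The command -v builtin is not introduced above; it is a standard way to ask the shell how it resolves a name. In an interactive shell with aliases, the answer may be an alias definition rather than a path.

```shell
# Show each PATH directory on its own line (entries are separated by ':').
echo "$PATH" | tr ':' '\n'

# Ask the shell which file it would run for the name 'ls'.
ls_path=$(command -v ls)
echo "ls resolves to: $ls_path"
```

The directories are searched in the order printed, so an entry near the top takes precedence over a later one that contains a program with the same name.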

Many commands accept options, which usually start with - or --, and sometimes additional arguments. For example:

ls -l /home/user

applies the -l option to ls and passes /home/user as an argument.

If a command produces no output and no error, it usually means it succeeded quietly. Shell commands also return an exit status, an integer where 0 normally means success and nonzero indicates some kind of problem. You can inspect the exit status of the last command with:

echo $?
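The following minimal sketch shows exit statuses in action. It uses the standard true and false commands, which do nothing except succeed and fail, and a deliberately nonexistent path (/no/such/directory_for_demo is made up for this example).

```shell
# 'true' always succeeds and 'false' always fails, which makes them safe demos.
true
status_ok=$?

# '|| ...' runs only when the command on its left fails, so this line
# records the nonzero exit status of 'false'.
status_fail=0
false || status_fail=$?

echo "true exited with $status_ok, false exited with $status_fail"

# An 'if' statement tests the exit status directly, without reading $?.
if ls /no/such/directory_for_demo > /dev/null 2>&1; then
    echo "directory exists"
else
    echo "ls reported a problem"
fi
```

Note that $? is overwritten by every command, so read it immediately after the command you care about or test the command directly with if.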

Basic Navigation in the Filesystem

Most of your interaction with an HPC system involves moving around directories and manipulating files. The core navigation commands are pwd, ls, and cd.

Use pwd to print your current working directory:

pwd

On login, this is often your home directory, such as /home/user or /users/user.

Use ls to list files and directories:

ls
ls -l
ls -a
ls -lh

The option -l shows detailed information, -a also shows hidden files whose names start with ., and -h, used together with -l, prints sizes in human-readable units such as K, M, and G.

Use cd to change the current directory:

cd /path/to/somewhere
cd                # go to your home directory
cd ~              # also home
cd ..             # go up one level
cd ../..          # go up two levels

The directory . represents the current directory and .. represents the parent directory. Tilde ~ expands to your home directory.

Absolute paths start from /, for example /home/user/data. Relative paths are interpreted from your current directory, for example data/results refers to a subdirectory of where you are now.
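To see that absolute and relative paths can lead to the same place, here is a short sketch that works in a scratch directory created with mktemp -d. The directory names data and results are only illustrative.

```shell
start_dir=$(pwd)                      # remember where we started
demo_dir=$(mktemp -d)                 # scratch directory for the demo
mkdir -p "$demo_dir/data/results"

cd "$demo_dir"
cd data/results                       # relative path, resolved from the current directory
where_relative=$(pwd)

cd "$demo_dir/data/results"           # absolute path, works from anywhere
where_absolute=$(pwd)

cd ../..                              # up two levels, back to the scratch directory
back_at_top=$(pwd)

echo "relative: $where_relative"
echo "absolute: $where_absolute"
echo "after cd ../..: $back_at_top"

cd "$start_dir"                       # return to the starting directory
```

The first two echo lines print the same directory, which is the point: a relative path is just a shorter way of writing the absolute one when you are already nearby.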

Creating, Copying, Moving, and Removing Files

You frequently need to organize code, input data, and job outputs. The main commands are mkdir, cp, mv, and rm.

Create directories with mkdir:

mkdir project
mkdir -p project/run1/output

The -p option creates parent directories as needed.

Copy files or directories with cp:

cp input.txt input_backup.txt
cp -r src/ src_copy/

The -r option makes a recursive copy of directories.

Move or rename files or directories with mv:

mv oldname.txt newname.txt
mv results/* archived_results/

Removing is done using rm. Use it with care, because deletion is immediate and usually not reversible on shared clusters:

rm file.txt
rm -i file.txt       # interactive, asks for confirmation
rm -r old_results/   # remove a directory and its contents

rm -r will permanently delete everything under the specified directory, including subdirectories. On HPC systems this can mean losing days or weeks of simulation output. Double check the path before pressing Enter.

Some clusters provide snapshots or backups, but you should not rely on them in daily work. Avoid using rm -rf unless you fully understand what it does and have verified the path.
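The commands in this section can be tried safely in a scratch directory. The sketch below is one way to do that; the project layout is hypothetical, and it lists the directory with ls before removing it, which is a cheap habit that catches wrong paths.

```shell
start_dir=$(pwd)
work=$(mktemp -d)            # scratch directory, so no real data is touched
cd "$work"

mkdir -p project/run1/output                          # nested directories in one step
echo "sample input" > project/input.txt               # small file to work with
cp project/input.txt project/input_backup.txt         # copy a single file
cp -r project/run1 project/run1_copy                  # recursive copy of a directory
mv project/input_backup.txt project/run1/input.txt    # move and rename at once

# Look at what is about to be deleted, then remove it.
ls -R project/run1_copy
rm -r project/run1_copy

ls -R project                # show the final layout
cd "$start_dir"
```

After this runs, project/run1 contains input.txt and the output directory, and run1_copy is gone.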

Viewing and Editing Text Files

HPC usage involves many text files, such as job scripts, configuration files, and output logs. You often want to look at them quickly from the command line.

Use cat to print the entire file to the terminal:

cat job.out

For large files, cat can flood your screen. Use less to view files page by page:

less job.out

Inside less, use the arrow keys, Page Up, Page Down, q to quit, and /pattern to search forward.

To see the beginning or end of a file, use head or tail:

head job.out
tail job.out
tail -n 50 job.out   # last 50 lines
tail -f job.out      # follow, shows new lines as they are written

tail -f is particularly useful to monitor a running job’s log file in real time.
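To experiment with head and tail without waiting for a real job, you can generate a stand-in log file. This sketch assumes the seq and sed utilities are available, which is the case on typical Linux systems; the "step N" format is invented for the example.

```shell
log=$(mktemp)                               # temporary file standing in for job.out
seq 1 100 | sed 's/^/step /' > "$log"       # 100 lines: "step 1" ... "step 100"

first_lines=$(head -n 3 "$log")             # first three lines
last_lines=$(tail -n 2 "$log")              # last two lines

echo "$first_lines"
echo "..."
echo "$last_lines"
```

Looking at both ends of a log this way is often enough to tell whether a job started correctly and how far it got.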

For editing, HPC users typically use terminal-based editors. Common choices are nano, vim, and emacs. nano is often the easiest for beginners:

nano job_script.slurm

Within nano, hints at the bottom show key combinations. You can move with arrow keys, edit text, then save and exit using the indicated commands. Learning at least one such editor is essential since you cannot rely on graphical editors when connected remotely.

Working with Standard Input and Output

Linux commands read input and produce output as streams. The standard streams are:

  1. Standard input (stdin), usually your keyboard.
  2. Standard output (stdout), usually your terminal.
  3. Standard error (stderr), also usually your terminal.

You can redirect these streams using <, >, and 2>. For example:

my_program < input.txt > output.txt 2> error.log

This runs my_program with input from input.txt, sends standard output to output.txt, and standard error to error.log.

If you use only >, just standard output is redirected; standard error still goes to the terminal:

my_program > output.txt

To append to an existing file instead of overwriting it, use >>:

echo "new line" >> notes.txt

In HPC, many batch job scripts explicitly redirect program output to log files so that you can inspect them after the job completes.
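The sketch below makes the three streams visible. Since my_program is only a placeholder in the text above, it is defined here as a small shell function that reads one line from standard input, writes a result to standard output, and writes a warning to standard error.

```shell
out_dir=$(mktemp -d)

# Hypothetical stand-in for a real program.
my_program() {
    read -r line                          # read one line from stdin
    echo "processed: $line"               # normal result goes to stdout
    echo "warning: demo message" >&2      # diagnostics go to stderr
}

echo "42" > "$out_dir/input.txt"

# stdin from input.txt, stdout to output.txt, stderr to error.log
my_program < "$out_dir/input.txt" > "$out_dir/output.txt" 2> "$out_dir/error.log"

# >> appends instead of overwriting
echo "new line" >> "$out_dir/output.txt"

echo "--- output.txt ---"; cat "$out_dir/output.txt"
echo "--- error.log ---";  cat "$out_dir/error.log"
```

Keeping standard error in its own file makes it easy to check for problems without scrolling through normal output.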

Using Pipes to Combine Commands

You can connect commands using pipes. The operator | passes standard output of one command directly as standard input of the next command, without using intermediate files.

For example:

ls -l | less

sends the detailed directory listing to less for easy viewing. Another common pattern is searching within output:

ps aux | grep my_program

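Here ps aux lists processes and grep my_program keeps only the lines that contain my_program. The grep command itself often shows up in the result, because its own command line contains the search string.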

Pipes are very powerful for working with large text outputs from log files or tools, which is common in HPC. They let you construct simple one-line workflows without creating temporary files.
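As an example of such a one-line workflow, the sketch below counts error lines in a small made-up log and then finds the most frequent message level. The log contents are invented, and cut, sort, uniq, wc, and awk are standard utilities that this chapter does not otherwise cover.

```shell
log=$(mktemp)
printf '%s\n' "INFO start" "ERROR disk full" "INFO step 1" \
              "ERROR timeout" "INFO done" > "$log"

# Keep only ERROR lines, then count them.
error_count=$(grep "ERROR" "$log" | wc -l)
echo "errors found: $error_count"

# Take the first word of each line, group identical words, count each group,
# sort by count (largest first), and keep the name of the top entry.
top_level=$(cut -d ' ' -f 1 "$log" | sort | uniq -c | sort -rn | head -n 1 | awk '{print $2}')
echo "most frequent level: $top_level"
```

Each stage does one small job, and the pipe symbol chains them together; no intermediate file is written.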

Finding Files and Searching in Files

Clusters often host many files, and your home or project directories can become large. The find and grep commands help locate files and search within them.

Use find to search for files based on name or other properties:

find . -name "*.out"
find /project/user -type f -name "job_*.log"

The first example searches the current directory and all of its subdirectories for names that end in .out. The second adds -type f, which restricts the matches to regular files. The starting path, . in the first example, can be any directory.

Use grep to search inside files:

grep "ERROR" job.out
grep -n "temperature" job_*.log
grep -R "parameter_x" src/

The -n option shows line numbers. -R searches recursively through directories. In HPC, using grep on log files is an efficient way to check quickly for warnings or failures in large runs.
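A typical use is to find which of several runs failed. The sketch below builds a tiny directory tree with two fake runs and searches it; the file names and messages are invented, and the -l option of grep (print only the names of matching files) goes slightly beyond the options described above.

```shell
tree=$(mktemp -d)
mkdir -p "$tree/run1" "$tree/run2"
printf 'step 1 ok\nERROR: diverged\n' > "$tree/run1/job.out"
printf 'step 1 ok\nstep 2 ok\n'       > "$tree/run2/job.out"
echo "notes" > "$tree/README.txt"

# All regular files ending in .out, anywhere below $tree.
out_files=$(find "$tree" -type f -name "*.out" | sort)
echo "$out_files"

# Names of files that contain ERROR, searched recursively.
failed=$(grep -R -l "ERROR" "$tree")
echo "runs with errors: $failed"

# Show the matching line with its line number.
grep -n "ERROR" "$tree/run1/job.out"
```

Only run1/job.out is reported as failed, and the last command points to line 2 of that file.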

Managing Processes and Jobs at the Shell Level

Full job scheduling on clusters is handled by a batch system and will be covered separately. At the shell level, there are commands that help you manage interactive processes.

Use ps to list your processes:

ps
ps -u user           # all processes owned by user, in any terminal

To stop a running command in your current terminal, press Ctrl+C. This sends an interrupt signal to the foreground process.

To suspend a job and return to the prompt, use Ctrl+Z. The process stops temporarily. You can resume it in the background with:

bg

or bring it back to the foreground with:

fg

To run a command directly in the background, end it with &:

long_command &

The shell immediately returns control to you, while long_command runs in the background. You can see background jobs using:

jobs

and kill a process using its process ID with:

kill 12345

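By default kill sends SIGTERM, which asks the process to exit and lets it clean up. If a process ignores that, send SIGKILL, which cannot be caught or ignored. Use it as a last resort, because the process gets no chance to save its state: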

kill -9 12345

On shared HPC systems, you should only manage your own processes and avoid interfering with system services.
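The background and kill commands can be practiced on a harmless process. This sketch uses sleep 30 as a stand-in for long_command. It relies on two shell features not introduced above: the special variable $!, which holds the process ID of the most recent background command, and kill -0, which sends no signal and only checks whether the process exists.

```shell
sleep 30 &                 # start a harmless long-running command in the background
pid=$!                     # process ID of that background command
jobs                       # list background jobs of this shell

# Check that the process is running before we touch it.
if kill -0 "$pid" 2>/dev/null; then running_before=yes; else running_before=no; fi

kill "$pid"                        # default signal is SIGTERM, a polite request to exit
wait "$pid" 2>/dev/null || true    # collect the process; its status is nonzero because it was killed

# Check again after the process has been terminated and collected.
if kill -0 "$pid" 2>/dev/null; then running_after=yes; else running_after=no; fi
echo "running before kill: $running_before, after kill: $running_after"
```

Using the saved process ID avoids having to look it up with ps, and it ensures you only signal a process you started yourself.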

Permissions and Ownership Basics

Shared clusters are multi-user systems, so Linux file permissions are important. These control who can read, write, or execute a file.

Use ls -l to see permissions:

ls -l

You will see lines that start with something like:

-rw-r--r-- 1 user group  1024 Jan  1 10:00 file.txt
drwxr-xr-x 2 user group  4096 Jan  1 10:00 scripts

Each file or directory has three permission sets: for the owner, the group, and others. The letters r, w, and x represent read, write, and execute. A d at the beginning indicates a directory.

You can change permissions with chmod. For example:

chmod u+x run.sh      # give execute to the user
chmod go-rw data.txt  # remove read/write from group and others

Once a script has execute permission, you can run it directly by giving its path:

./run.sh

On HPC clusters, strict permissions help keep project data protected. Avoid making files world writable or executable without a reason.
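The effect of chmod is easy to verify on throwaway files. The sketch below creates a two-line script and a data file in a scratch directory, applies the two chmod commands shown earlier, and inspects the result. The file contents are invented, and the example assumes the scratch directory allows execution; some systems mount temporary space with a noexec option, in which case the script will not run from there.

```shell
dir=$(mktemp -d)
printf '#!/bin/sh\necho "hello from run.sh"\n' > "$dir/run.sh"
echo "private data" > "$dir/data.txt"

chmod u+x "$dir/run.sh"          # owner may now execute the script
chmod go-rw "$dir/data.txt"      # group and others lose read and write access

output=$("$dir/run.sh")          # run the script directly by its path
echo "$output"

# The first ten characters of ls -l are the file type and permission bits.
perms=$(ls -l "$dir/data.txt" | cut -c 1-10)
echo "data.txt permissions: $perms"
```

With a typical default umask, data.txt ends up readable and writable by its owner only.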

Getting Help from the Command Line

Linux has built-in documentation. The man command shows manual pages:

man ls
man grep

Inside man, use arrow keys or Page Up and Page Down to scroll, and q to quit. You can search inside the manual page using /pattern.

Many commands also support --help:

ls --help
grep --help

This prints a brief description of options and usage. Combining man and --help with online documentation is usually enough to learn new commands as you need them.

Shortcuts and Quality of Life Features

Even at the command line, there are conveniences that save time. The shell keeps a history of your commands. Use the up and down arrow keys to navigate recent commands, then edit and rerun them. You can search backward through history by pressing Ctrl+R and then typing part of a previous command.

Tab completion accelerates typing. Start a command or filename, then press Tab. If there is a unique match, the shell completes it. If there are multiple matches, pressing Tab twice often lists them.

Common editing keys in bash include Ctrl+A to move to the beginning of the line, Ctrl+E to move to the end, and Ctrl+U to erase from the cursor to the beginning. These are especially useful when working interactively on remote clusters.

Command Line Usage in the HPC Workflow

On an HPC system, basic Linux command usage appears in almost every step of a workflow. You use navigation and file commands to organize your project directories. You use editors to prepare job scripts. You use redirection and pipes to create log files and filter output. You use tail, less, and grep to inspect and debug the results of jobs that may run for hours or days.

Investing a small amount of time in practicing these basic commands will pay off throughout your work with HPC, since the job scheduler, software environment, and performance tools you encounter later all assume you are already comfortable operating at the Linux command line.
