← home

Computing knowledge base

Shells

Shells are programs that interpret commands. They act as your interface to the system by allowing you to run other programs. When you type on your computer’s command line, you are using a shell in interactive mode. You can also write shell scripts to be batch processed.

There are many different shell command languages and shells that understand them. Most operating systems have multiple options, and you can choose which ones to use for scripting and your interactive shell.

sh is the POSIX-specified shell command language. Nearly every operating system has a shell located at /bin/sh that supports it. Modern shells that interpret languages with more features and better syntax than sh often have compatibility modes to interpret sh scripts.

bash is a command language and corresponding shell implementation. It is derived from sh with a number of extensions that make it nicer to use. It is also extremely widespread, but less so than sh. FreeBSD, for example, does not ship with a bash shell by default. You can find a description of the language and shell in the Bash manual.

zsh is a command language and shell, also derived from sh, that is more modern and friendly than bash. It is present on fewer systems than sh or bash, but it is gaining popularity. It has the best interactive mode of the three. You can find a description of the language and shell in the Zsh manual.

Choosing a shell

I recommend using zsh for your interactive shell, where concerns about cross-platform support don’t apply. When you need to write a script, you should choose the language based on where the script needs to work.

If the script is non-trivial and only needs to work on a small set of machines that you control, I recommend using a real programming language. They are much nicer to use, and all major languages have libraries for doing things that shells normally do, like executing subprocesses and setting up pipelines. These libraries also often support invoking a shell directly. You can simply install an interpreter for your language of choice on all the machines that need to run your script, and you’re off to the races.

If the script needs to work absolutely everywhere, then use sh. Otherwise, bash’s improved syntax is likely worth the reduced compatibility.

In the following notes on shell scripting, I assume the use of bash. For correct information about a particular shell or shell command language, read the appropriate manual.

Environment and shell variables

Like all processes, shells operate with a set of environment variables that are key-value pairs. These environment variables are passed to child processes that the shell creates. Shells also have a set of key-value pairs, known as shell variables, that are not passed to child processes. The set of shell variables is a superset of the set of environment variables. You can control environment variables, shell variables, and how they are passed to child processes using the shell command language.

To set a shell variable that later commands and built-in shell functions will see but child processes will not, you can use the syntax varname=value.

To set an environment variable that will also be seen by child processes, you can use the export builtin: export varname=value.

You can also define variables that are passed to a child process’ environment but not set as shell or environment variables in the shell’s process by defining them and executing a child process on the same line: varname=value program.

Here is an example script that shows all of these variable-setting methods in action:

# creates a shell variable
$ MYSHELLVAR=hello

# creates an environment variable
$ export MYENVVAR=goodbye

# prints "hello" because echo is a builtin
# and can see the shell variable
$ echo ${MYSHELLVAR}

# prints nothing because the shell variable
# is not exported to the bash child process
$ bash -c 'echo ${MYSHELLVAR}'

# prints "goodbye" because echo is a builtin
# and can see the environment variable
$ echo ${MYENVVAR}

# prints "goodbye" because the environment
# variable is exported to the bash child process
$ bash -c 'echo ${MYENVVAR}'

# prints "ciao" because MYCHILDVAR is passed
# to the child process' environment
$ MYCHILDVAR="ciao" bash -c 'echo ${MYCHILDVAR}'

# prints nothing because MYCHILDVAR was only set
# for the previous command's child process
$ echo ${MYCHILDVAR}

You can list all shell variables with the set builtin by typing set at the command line with no arguments. You can list all environment variables passed to a child process with the standalone env program by typing env at the command line with no arguments.

Shell variables are used to store information private to the shell, including options that configure shell behavior and user-defined variables. Environment variables are used by all kinds of programs. Some canonical ones that are useful to know are PATH (the directories searched for executables), HOME (the current user’s home directory), USER (the current username), SHELL (the path of the user’s login shell), TERM (the terminal type, consulted by Terminfo and Termcap), EDITOR (the user’s preferred text editor), and LANG (locale settings).

Commands and paths

The first part of a shell command is typically the name of a program to run. When processing the command, the shell creates a child process and uses it to execute the specified program with the specified arguments, input, and output files.

If the name of the program is given as an absolute or relative path, then the shell executes the program at the specified path. If the program is given as a name alone, then the shell searches the directories in the PATH environment variable, in order, for an executable file with a name that matches the provided one. It executes the first one it finds.

The which program, when given a program name as an argument, searches directories in the PATH variable and prints the absolute path of the first program it finds with a name that matches the provided one. In other words, it tells you which program the shell would execute for a command started with the given program name.
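
For example, you can inspect the search path and ask which executable a bare ls command would run (the exact path printed varies by system):

# PATH is a colon-separated list of directories, searched in order
echo ${PATH}

# print the path of the executable that would run for "ls"
which ls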

The env program allows you to modify environment variables before executing a given program. Like the shell and which, it searches the PATH variable to determine which program to execute. It is often used in shebangs at the top of executable script files as /usr/bin/env <interpreter> so that the script will be executed by the appropriate interpreter without its author having to know the interpreter’s exact path.

Systems aren’t required by POSIX to have env located at /usr/bin/env, but it’s the most portable solution for script shebangs in almost all scenarios. If you have a sh script that you really want to run everywhere, then using /bin/sh might be better.
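
For example, a bash script can use an env shebang so that it runs with whichever bash appears first in the user’s PATH, wherever that happens to be installed:

#!/usr/bin/env bash
# env locates bash via PATH, so this script doesn't
# hard-code the interpreter's location
echo "running under bash ${BASH_VERSION}"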

Input and output

All processes have access to three special files: standard input, standard output, and standard error. By default, when the shell executes a program, it sets up the standard input file to receive keyboard input from the terminal emulator hosting the shell, and it sets up the standard output and error files so that their output is displayed in the terminal emulator.

You can use redirections to change where the input and output for these special files comes from and goes. You can also link multiple processes together using pipes, which hook up the standard output of each program in a pipeline to the standard input of the following one. You can create more sophisticated inter-process communication hookups using the mkfifo command.
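
Here is a sketch of some common redirection and pipeline forms, using placeholder paths:

# redirect standard output to a file, overwriting it
echo hello > /path/to/outfile

# append standard output to the same file
echo world >> /path/to/outfile

# redirect standard error to a separate file
ls /nonexistent 2> /path/to/errfile

# read standard input from a file
sort < /path/to/outfile

# pipeline: each program's standard output feeds the next one's standard input
cat /path/to/logfile | grep error | sort | uniq -c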

You can use braces to group a list of commands together so that their combined output can be redirected as a single unit:

# outfile contains hello and world
{ echo hello; echo world; } > /path/to/outfile

Tips and tricks

Shell utilities

cd
change the working directory
  • to go back to the previous working directory:
    cd -
    
pushd
change the working directory and push the current directory onto the directory stack
popd
pop a directory off the directory stack and change the working directory to it
echo
output arguments simply
printf
output arguments with more control
chsh
change a user’s interactive shell
which
locate a program in the user’s path
env
print environment variables, or modify environment and execute a program
xargs
read whitespace-delimited strings from standard input and execute a program with those strings as arguments
tee
copy standard input to standard output and a list of files

Terminal emulators

Terminfo and Termcap are libraries and corresponding databases that let programs use terminals and terminal emulators in a device-independent way. They look up the capabilities of the terminal they are running on, as described by the TERM environment variable, in the databases, and allow programs to alter their behavior accordingly. You can add or modify entries in the databases to control how programs behave with your terminal.

ANSI escape sequences are standardized in-band byte sequences that control terminal behavior.
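
For example, both of the following print red text and then reset the terminal; the first writes a raw ANSI escape sequence, while the second looks up the equivalent capabilities in the Terminfo database using tput:

# raw ANSI escape sequence: set foreground to red, print, reset
printf '\033[31mhello in red\033[0m\n'

# the same effect via Terminfo
tput setaf 1; echo "hello in red"; tput sgr0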

UNIX tools

In the philosophy of UNIX-like operating systems, programs are meant to be simple tools that do a small set of things well. As the user, you can configure and combine them to achieve your goals. In this section I discuss some important kinds of software in detail and maintain categorized lists of other useful programs.

The system manual is your first port of call for figuring out how software works. You can search the manual for information about a particular tool, system call, or library function by typing man <name> at the command line. Type man man for more information about using the manual.

Some of the below tools are specific to particular operating systems. Check your system’s manual to see whether you have them. You can also use the UNIX Rosetta Stone to translate between OS-level tools across different systems.

Many of these tools aren’t included in default OS installations. You can install them from the command line using your system’s package manager. It’s apt on Ubuntu, brew on macOS, and pkg on FreeBSD.

File basics

On Linux distributions, many software tools for basic file manipulation are from the GNU core utilities project. On flavors of BSD they are maintained and distributed as part of the base system. This means that these tools can have slightly different behaviors across systems.

You may occasionally want to move or remove filenames containing characters that are difficult to type. In these cases, the easiest way to proceed is by opening the directory containing the filename in an interactive editor like vim or emacs. For directories that don’t have many entries, you can also use rm -i -- * to be prompted whether to delete each filename in the directory. You can alternatively use ls -li to determine the filename’s inode number, and then use find . -inum <inode number> with the -delete flag to remove the filename or the -exec flag to otherwise interact with it. However, note that the find command will apply to all filenames in the current directory hierarchy with the specified inode number, so it may have unintended consequences if other filenames share the inode number you are interested in.

touch
update access and modified times; create empty file if it doesn’t exist
mkdir
create a directory
ls
list files in a directory
rm
remove a file or directory
mv
move a file or directory
cp
copy a file or directory
cat
write file contents to standard output
chmod
change file permissions
chown
change file owner
chflags
change file flags
ln
create hard links and symbolic links
  • to create a symbolic link:
    ln -s </path/to/symlink/target> </path/to/new/symlink>
    
rsync
copy files to local or remote destination

File searching, viewing, and editing

lf
terminal file manager
grep
find lines in a file with contents that match a regular expression
  • to find matching lines:
    grep <regular expression> </file/to/search>
    
  • -c prints the number of matching lines
  • -C[=num] prints lines of leading and trailing context
  • -e is useful to specify multiple regular expressions
  • -E enables extended regular expressions (special meanings for characters like ? and +)
  • -n each match is preceded by its line number in the file
  • -r recursively search subdirectories of a directory
  • -v select lines that don’t match the given expressions
rg (ripgrep)
faster and more powerful version of grep
find
search for files in a file hierarchy
  • to search for files with a sh extension:
    find </path/to/directory> -name "*.sh"
    
  • -name search by name
  • -type search by file type
  • -mtime search by modification time
fd
faster and simpler version of find
fzf
fast generic fuzzy finder, good integrations with vim and shell history
tail
display the last part of a file
  • to wait for and display additional data as it is appended to a file:
    tail -f </path/to/file>
    
lsof
list open files
vim
text editor
nvim (neovim)
more modern, mostly backwards-compatible version of vim
xxd
create a hex dump of a binary file, or create a binary file from a hex dump
hexedit
view and edit binary files in hexadecimal and ASCII
open
open a file in the corresponding default application on macOS
xdg-open
open a file in the corresponding default application on Linux or FreeBSD

File processing

tar
create and manipulate archive files
  • to create a tar archive with gzip compression:
    tar -czvf </path/to/output.tar.gz> </path/to/input/files>
    
  • to extract a tar archive with gzip compression, so that the archive contents will be placed into an existing output directory:
    tar -xzvf </path/to/input.tar.gz> -C </path/to/output/directory>
    
ffmpeg
video and audio converter
magick (imagemagick)
convert and edit images
pandoc
universal document converter
sed
stream edit files
awk
pattern-directed file processing
cut
print selected portions of each line of a file
paste
merge corresponding lines of input files
uniq
report or filter out repeated lines in a file
sort
sort files by lines

System administration

passwd
change a user’s password
groups
list user groups
useradd
add a user on Linux
usermod
modify a user on Linux
groupadd
add a group on Linux
pw
manage users and groups on FreeBSD
shutdown
cleanly halt, shutdown, or reboot a machine
exit
exit an interactive shell
mail
send email from a machine
  • to send a simple message:
    mail -s <subject> <someone@example.com>
    
cron
execute commands on a schedule
service
control daemons on Linux and FreeBSD
launchctl
control daemons on macOS

Processes

ps
list processes
kill
send signals to processes
top
interactive display about processes
htop
improved top
strace
trace system calls on Linux
ktrace
trace system calls on macOS and FreeBSD
dtruss
trace system calls on macOS and FreeBSD
vmstat
kernel statistics about processes, virtual memory, traps, and CPU usage on FreeBSD

Networking

curl
transfer data to or from a server; more flexible than wget
  • to download a single file:
    curl -o </path/to/output.file> <http://domain.com/remote.file>
    
wget
download files from a network
  • to download a single file:
    wget -O </path/to/output.file> <http://domain.com/remote.file>
    
telnet
open TCP connections
ping
send ICMP packets to check whether a host is online
traceroute
print the route taken by packets to a host
dig
look up DNS information
ncat (nmap’s version of netcat)
scriptable TCP and UDP toolbox
netstat
show network-related data structures
sockstat
information about open sockets on FreeBSD
tcpdump
capture and print packet contents
ifconfig
list and configure network interfaces on FreeBSD
route
manipulate network routing tables on FreeBSD
ip
manage network interfaces and routing tables on Linux
nftables
configure firewall rules on Linux
ufw
simple front end to nftables
pfctl (packet filter)
configure firewall rules on BSDs
netmap
framework that bypasses the kernel to enable fast packet I/O

Disks

df
show free space for mounted filesystems
du
show disk usage for directories
  • to show the size of a particular directory, where -h means human-readable size and -d is the depth of subdirectory sizes to display:
    du -h -d 0 </path/to/directory>
    
mount
mount filesystems or list mounted filesystems
  • to mount a filesystem:
    mount </path/to/device> </path/to/mount/point>
    
umount
unmount filesystems
gpart
partition disks on FreeBSD
parted
partition disks on Linux
newfs
create UFS filesystems on FreeBSD
zpool
create and manage ZFS storage pools on FreeBSD
mkfs
create filesystems on Linux
makefs
create a file system image from files on FreeBSD
mkimg
combine file system images into a partitioned disk image on FreeBSD
hdiutil
work with disk images on macOS
dd
copy files
  • to write a disk image to a storage device:
    dd if=</path/to/disk.img> of=</path/to/device> bs=8M status=progress
    
iostat
statistics about disk use on FreeBSD
fuse
kernel interface that allows userspace programs to export a virtual filesystem

Peripherals

devinfo
information about peripheral devices on FreeBSD
lspci
list PCI devices on FreeBSD
pciconf
configure PCI devices on FreeBSD
acpidump
analyze ACPI tables on FreeBSD
lsblk
list block devices on Linux
udev
dynamic peripheral management and naming on Linux
devd
dynamic peripheral management and naming on FreeBSD
picocom
terminal emulator for communicating over serial connections
  • to open a terminal session using a serial device:
    picocom -b <baud rate> </path/to/serial/device>
    
bpf (eBPF, extended Berkeley Packet Filter)
write arbitrary programs that run on a virtual machine within the kernel

Security analysis

afl-fuzz (American Fuzzy Lop plus plus)
general-purpose fuzzer
syzkaller
kernel fuzzer
nmap
network scanner
wireshark
network packet analyzer
squid
web proxy
aircrack-ng
wifi security tools
Burp
intercepting web proxy
Frida
dynamic binary instrumentation toolkit
ghidra
binary reverse engineering tool
radare2
binary reverse engineering tool with command-line interface
binwalk
identify files and code in binary firmware images
john (John the Ripper)
password cracker
hashcat
password cracker with good GPU support
auditd
event auditing for Unix-like operating systems

SSH

The Secure Shell Protocol (SSH) is the most common way to get secure remote shell access to a machine. It supports a wide range of use cases, including port forwarding, X display forwarding, and SOCKS proxying. The most popular implementation is OpenSSH, which I describe here.

The primary components of OpenSSH are sshd, the SSH server daemon that runs on the machine you want to access remotely, and ssh, the client application that runs on your local machine. Global configuration files for ssh and sshd can be found in /etc/ssh: /etc/ssh/ssh_config configures ssh and /etc/ssh/sshd_config configures sshd. Per-user configuration lives in the directory ~/.ssh, which must have permissions 700 for the programs to use it. ~/.ssh/config overrides the global /etc/ssh/ssh_config.

Key-based authentication

SSH can use various forms of authentication, including the password for the user on the remote machine, public-private keypairs, and Kerberos. Using passwords exposes you to brute-force attacks. You should configure your servers to accept only key-based authentication by adding the lines PubkeyAuthentication yes and PasswordAuthentication no to sshd_config.

You can generate a public-private keypair using the interactive ssh-keygen command. It puts both a public and private key in the ~/.ssh directory.

The private key, called id_rsa by default, stays on your local machine and is used to prove your identity. It must have permissions 600 for the programs to work correctly. You should protect your private key with a passphrase. Otherwise, someone who obtains your private key or gains access to your local user account automatically gains access to all of the machines you can SSH into.

The public key, called id_rsa.pub by default, is placed onto machines that you want to access using your private key. More specifically, the contents of id_rsa.pub are appended as a line in the file ~/.ssh/authorized_keys in the home directory of the user that you want to log in as on the remote machine. The authorized_keys file must have permissions 600 for the programs to work correctly. You can use the ssh-copy-id program to automatically add your public key to the appropriate authorized_keys file on a remote machine.
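
For example, to generate a keypair and install the public key on a remote machine (the key file names here assume the RSA defaults described above):

# create the keypair; you will be prompted for a passphrase
ssh-keygen

# append the public key to the remote user's authorized_keys file
ssh-copy-id -i ~/.ssh/id_rsa.pub <username>@<remote host>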

With your keypair set up in this way, you can SSH into the remote machine and get an interactive shell without using the remote user’s password:

ssh -i <path/to/private/key> <username>@<remote host>

In your ~/.ssh/config file, you can specify usernames and keys to use with particular hosts:

Host <remote host>
  User <username>
  IdentityFile <path/to/private/key>
  IdentitiesOnly yes # don't try default private keys

Then you can ssh into the machine with the simple command:

ssh <remote host>

SSH agents

If you set a passphrase on your private key, you will be prompted for this passphrase each time you want to use the key. You can use the ssh-agent and ssh-add programs to remember the private key passphrase for a certain amount of time:

eval `ssh-agent`
ssh-add -t <lifetime> <private key>

You can check which keys have been added to the agent as follows:

ssh-add -l

You can configure the command involving ssh-agent to be run every time you log in to your machine, so you only have to run ssh-add to store your private key passphrase. ssh-agent can also be used to implement single-sign-on across multiple remote machines, so that the passphrase for your private key only has to be entered on your local machine. This requires the ForwardAgent option to be enabled in the ssh_config file on clients and the AllowAgentForwarding option to be enabled in the sshd_config file on servers. You can then forward your agent connection with the -A flag:

ssh -A <user>@<remote host>

SOCKS proxying

SSH can be used to create an encrypted SOCKS proxy connection. A SOCKS proxy connection is similar to a virtual private network (VPN) connection. It is an encrypted channel between a local and remote host. The local host sends packets across the channel; the remote host receives the packets and then forwards them to their final destinations. The final destinations can vary across packets and do not need to be specified ahead of time. The below command opens a SOCKS tunnel to the remote host on the specified local port number:

ssh -D <local port> <user>@<remote host>

You can configure your operating system to forward all network traffic over the SOCKS tunnel by specifying the appropriate local port number. You can also configure web browsers to forward web-based traffic.
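
As a quick way to test the tunnel, you can point a single program at it; for example, curl can send a request through a SOCKS5 proxy:

# fetch a page through the tunnel; socks5h also resolves DNS through the proxy
curl --proxy socks5h://localhost:<local port> https://example.com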

Local port forwarding

SSH features local port forwarding, also known as tunneling. It allows you to specify a port on your local machine such that connections to that port are forwarded to a given host and port via the remote machine. Data you send to the local port are passed through the encrypted SSH tunnel to the remote machine and then sent by the remote machine to the destination you specify. This is useful for getting access to a service behind a firewall from your local machine:

ssh -L <local port>:<destination host>:<destination port> <user>@<remote host>

Remote port forwarding

SSH features remote port forwarding, also known as reverse tunneling. It allows you to specify a port on the remote machine such that connections to that port are forwarded to a given host and port via your local machine. Data sent to the remote port are passed through the encrypted SSH tunnel to your local machine and then sent by your local machine to the destination you specify. By default, the remote port is only accessible from the remote host itself. You can open it to the wider Internet by enabling the GatewayPorts option in the sshd_config file on the remote machine. Remote port forwarding can be used on a machine that’s behind a firewall or NAT to enable other machines to access it:

ssh -R <remote port>:<destination host>:<destination port> <user>@<remote host>

X forwarding

SSH can also forward X graphical applications from a remote host to your local machine. The X11Forwarding option must be enabled in the sshd_config file on the remote machine. You must be running an X server on your local machine, and the DISPLAY environment variable must be set correctly in your local machine’s shell.

DISPLAY tells an X application where to send its graphical output. Its format is <hostname>:<display>.<screen>. A display is a collection of monitors that share a mouse and keyboard, and most contemporary computers only have one display. A screen is a particular monitor. The hostname is the name of the host running the X server. It can be omitted, in which case localhost will be used. The display number must always be included, and numbering starts from 0. The screen number can be omitted, in which case screen 0 will be used. For example, a DISPLAY value of :0 means that the X server is running on localhost, and graphical output should be rendered on the first screen of the first display.

You can then run ssh -X to enable X forwarding. SSH should set DISPLAY in your shell session on the remote host to localhost:10.0, and it will tunnel traffic sent there to the X server on your local machine. With the -X flag, SSH will subject X forwarding to security restrictions, which for some default configurations include a timeout after which new X connections cannot be made. One way to bypass these security restrictions is using the -Y flag instead of -X.
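
Putting this together, a minimal session might look like the following, assuming a simple X client such as xeyes is installed on the remote machine:

# on the local machine, connect with X forwarding enabled
ssh -X <user>@<remote host>

# on the remote machine, check the DISPLAY value set by sshd
echo ${DISPLAY}

# run a graphical program; its window appears on the local machine
xeyes &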

File transfers

SSH can transfer files to and from remote machines with the scp and sftp commands. The program sshfs, which is not part of OpenSSH, can mount directories on a remote machine using SSH.
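
For example, to copy files in either direction with scp:

# copy a local file to a remote machine
scp </path/to/local/file> <user>@<remote host>:</path/to/remote/destination>

# copy a remote file to the local machine
scp <user>@<remote host>:</path/to/remote/file> </path/to/local/destination>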

Debugging

To debug issues with SSH, you can run ssh -vvv and sshd -ddd for verbose debugging output. If you are using SSH to access a server and your Internet connection is spotty, dropped connections can be frustrating. One way to address this is by running tmux on the remote machine, so that you can reattach to sessions if you get dropped. If you are mobile or have a truly terrible Internet connection, mosh is a less featureful alternative to SSH that provides a better experience.

Encryption

GPG

GNU Privacy Guard (GPG) is a good way to encrypt and decrypt individual files. It supports symmetric (passphrase-based) and asymmetric (keypair-based) encryption.

For GPG commands that produce binary output, the -a flag encodes the binary output as ASCII text for ease of transmission.

Symmetric encryption

Use the following command for symmetric encryption:

gpg -o <encrypted output file> -c --cipher-algo AES256 <plaintext input file>

You will be prompted to choose a passphrase.

Asymmetric encryption

Asymmetric encryption requires working with GPG keypairs. GPG keypairs are distinct from SSH keypairs.

To create a GPG keypair, run gpg --generate-key, which will prompt you to provide a name and email address to associate with the created keypair and then to enter a passphrase for the private key. You can reference a keypair by its id, name, or email address in GPG commands. You can list keys in GPG’s keyring with gpg --list-keys and edit keys with gpg --edit-key <key>.

To export and import public keys, use gpg -o <output key file> --export <key> and gpg --import <input key file>. The --export-secret-key and --allow-secret-key-import flags do the same thing for private keys.

With asymmetric encryption, you encrypt a file for a given public key that is present in your GPG keyring, and the corresponding private key is required to decrypt it:

gpg -o <encrypted output file> -e -r <recipient key> <plaintext input file>

Decryption

Use the following command to decrypt a file encrypted by GPG:

gpg -o <plaintext output file> -d <encrypted input file>

You will be prompted to enter the passphrase for either the symmetric encryption or the appropriate private key. You can configure gpg-agent to reduce the number of times you have to enter a private key’s passphrase.

rclone

rclone is an excellent way to perform encrypted cloud backups. In ~/.config/rclone/rclone.conf, set up a crypt backend over a backend for your cloud provider. You can then use rclone sync to make an encrypted cloud backup match the contents of a local folder:

rclone sync --links --fast-list <path/to/local/folder> <crypt-backend:>

You can use the rclone bisync command to make an encrypted cloud backup sync bidirectionally with multiple clients.

You can also mount an encrypted cloud drive as a local filesystem:

rclone mount --vfs-cache-mode full <crypt-backend:> <path/to/local/mount/point>

Git

Version control software keeps track of how files change over time and allows multiple people to work on the same set of files concurrently. Git is a popular program for version control. It’s a complicated and flexible tool: some ways of using it make working in teams easy, while others make it painful. This section describes some key Git concepts and suggests a good workflow for collaborative projects.

Concepts

Commits

The files for a Git-controlled project exist in a directory on your filesystem known as the work tree. As version control software, Git keeps track of how files change over time. But changes that you make in the work tree aren’t automatically reflected in Git’s historical record.

Git records changes in terms of commits, which are snapshots of the work tree. Each commit contains a set of modifications that affect one or more files relative to the previous commit. These modifications are colloquially called changes or diffs, for differences.

To create a commit, first make some changes in your work tree. You can run git status to see which files have changed in your work tree and git diff to see exactly what the changes are.

Commits are prepared in the staging area. Run git add <file> to add changes affecting a file in the work tree to the staging area. You can check exactly what’s in staging with git diff --staged.

You can create a commit by running git commit and adding a message when prompted. Changes are then taken from the staging area and added to Git’s historical record as a commit. Each commit is assigned a unique hash identifier. git show <commit identifier> shows the changes associated with a particular commit.

To remove a file from the staging area, run git restore --staged <file>. To discard changes to a file in your work tree, run git restore <file>.
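
Putting these commands together, a typical commit might be created like this:

# see which files have changed and exactly what the changes are
git status
git diff

# stage the changes to one file and check what is staged
git add <file>
git diff --staged

# record the staged changes as a commit, then inspect it
git commit
git show <commit identifier>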

Branches

It’s common to work with multiple versions of the same project. For example, you may want to add a big button to your website, but you aren’t sure whether red or blue would be a better color. You decide to create different versions of the site to check.

Git supports different project versions via branches. A branch is a named sequence of commits, where each commit except the first one points to a parent commit that happened before it. A ref is a human-readable name that references a particular commit. Each branch name is a ref that points to the most recent commit, or tip, of the corresponding branch.

With Git, you are always working on some branch. Most projects start with a branch called main, which holds the main version of the project. When you make a commit, the current branch’s ref is updated to point to the new commit, and the new commit’s parent is set to the previous tip of the branch, i.e. the previous value of the branch’s ref.

You can list branches and determine your current branch with git branch. You can switch branches with git switch <branch name>. HEAD is a special ref that always points to the tip of the current branch. You can run git log to see the sequence of commits that makes up the current branch, starting from the tip. git log -p shows the commits and their corresponding diffs.

To create a new branch and switch to it, run git switch -c <new branch name>. The ref of the new branch then points to the same commit as the ref of the branch you switched from. When you make commits in this new branch, they only change the new branch’s ref. They do not modify the ref of the original branch or the commit sequence that is shared between the new and original branches. The new and original branches are said to diverge at the last commit of that shared sequence.

Returning to our button example, say you evaluate the different colors by creating new branches off of your website’s main branch. One is called red-button and the other blue-button. In each branch, you make a commit that adds the appropriately colored button.
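
In terms of commands, setting up that experiment might look like this:

# create the red-button branch off of main and commit the red button
git switch main
git switch -c red-button
git add <files for the red button>
git commit

# go back to main and repeat for the blue button
git switch main
git switch -c blue-button
git add <files for the blue button>
git commit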

You decide you like red best. Now, you want to get the changes you made in the red-button branch into the website’s main branch. There are two main ways to do so.

Merge

One way to integrate changes from one branch into another is to perform a merge. You can merge changes from the red-button branch into main by switching to main and running git merge red-button. For this discussion, we’ll call the branch that the changes are coming from the source branch and the branch being merged into the target branch.

In a merge, Git takes the changes that have been made in the source branch since it diverged from the target branch and applies them to the target all at once.

If the target branch hasn’t experienced any commits since it diverged from the source branch, then the source branch is just the target branch with additional commits added. In this case, which is known as a fast-forward merge, the merge operation simply sets the target branch’s ref to the source branch’s ref.

If both the target and the source branches have experienced commits since they diverged, then the merge operation adds a new commit to the target branch, known as a merge commit, that contains all of the changes from the source branch since divergence. Merge commits have two parents: the refs of the source and target branches from before the merge.

After a merge, the two branch histories are joined. Commits that were made in the source branch are displayed in the git log of the target branch, interleaved with commits that were made in the target branch according to creation time, even though they do not affect the target branch directly. You can run git log --graph to see a graphical representation of previously merged branches. Running git log --first-parent shows only commits that were made in the target branch.

If a target and source branch have made different changes to the same part of a file since divergence, then the merge may not be able to happen automatically. This is known as a merge conflict. In this case, Git will pause in the middle of creating the merge commit and allow you to manually edit conflicting files to decide which changes should be preserved. When you are done resolving the merge conflict, add the conflicting files to the staging area and run git merge --continue. Running git merge --abort cancels a paused merge.
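
A conflicted merge therefore usually proceeds something like this:

# attempt to merge the source branch into the target branch
git switch main
git merge red-button

# if Git reports conflicts, edit the conflicting files by hand,
# then mark them as resolved and finish the merge
git add <conflicting file>
git merge --continue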

When you run git show on a merge commit, it will only show changes in files that have been modified relative to both parents. This means that it typically only shows files in which you manually merged changes as part of a conflict resolution. It also doesn’t include the context for those conflict resolutions. You can run git show --first-parent to see all changes made by a merge commit relative to the target branch, and git show --remerge-diff to see the context for merge conflict resolutions.

Rebase

An alternative to merging changes from one branch into another is to rebase them. You can rebase changes from the red-button branch onto main by switching to red-button and running git rebase main. For this discussion, we’ll call the branch that the changes are coming from the source branch and the branch that the changes are being rebased onto the target branch.

In a rebase, Git first determines the changes associated with each commit made in the source branch since it diverged from the target branch. It saves these sets of changes to temporary storage. It then sets the ref of the source branch to point to the ref of the target branch. Finally, it creates new commits in the source branch that apply the saved changes one set at a time.

The overall effect of a rebase is that changes made in the source branch are re-applied as if they were made on top of the target branch rather than the point of divergence. Rebases result in a linear history that is simpler than the joined history after a merge. Unlike a merge, which modifies the target branch, a rebase modifies the source branch and leaves the target unchanged.

If a target and source branch have made different changes to the same part of a file since divergence, then the rebase may not be able to happen automatically. This is known as a rebase conflict. In this case, Git will pause in the middle of creating the first commit whose changes do not automatically apply. Like in a merge conflict, you can manually edit conflicting files to decide which changes should be preserved in the commit, add them to the staging area, and run git rebase --continue to continue with the rebase.

Note that resolving a conflict for a particular rebased commit may prevent subsequent changes from being applied automatically. This can result in a painful cascading rebase-conflict scenario that should be avoided. Running git rebase --abort cancels a paused rebase.

Rather than adding new commits to the tip of a branch, rebasing rewrites the branch’s commit history. As discussed further below, this means that rebases should not be used on branches that are being actively worked on by more than one collaborator.

You can perform more complex rebases with the --onto flag. git rebase --onto <new-base> <end-point> <start-point> gathers the changes corresponding to commits that have been made in the branch specified by the <start-point> ref going back until the <end-point> ref, resets <start-point> to <new-base>, and then applies the changes.

The git cherry-pick command is an easy way to take the changes associated with a commit or range of commits and apply them to the tip of the current branch, one at a time. It is effectively shorthand for git rebase --onto HEAD <end-point> <start-point>.
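
For example, using the naming from above, where <end-point> is the older boundary commit and <start-point> is the newer one:

# apply the changes from a single commit to the tip of the current branch
git cherry-pick <commit identifier>

# apply the changes from each commit after <end-point> up to and
# including <start-point>, oldest first
git cherry-pick <end-point>..<start-point>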

Remotes

Git keeps track of work trees, branches, commits, and other project information in a repository. You can create a new git repository on your local filesystem by running git init.

Git is a distributed version control system, which means that there can be many repositories for a single project. These repositories can also be in different locations. For example, a collaborative project might exist in repositories on your local machine, on a server belonging to a Git hosting service, and on the machine of another developer. Working effectively with others requires sharing information between these repositories.

Repositories distinct from the one you are currently working in are called remotes. You can list the names and URLs of remotes for your current repository by running git remote -v, and see information about a particular remote by running git remote show <remote name>. You can add remotes with git remote add <remote name> <remote URL> and rename remotes with git remote rename <old remote name> <new remote name>.

To download data from remote repositories, you can run git fetch <remote name>. This takes branches from the remote repository and creates or updates corresponding remote branches in your local repository. Remote branches are named with the format <remote name>/<branch name>.

You can’t work on remote branches directly, but a local branch can be configured to have a direct relationship with a remote branch. Such local branches are called tracking branches. The remote branch associated with a tracking branch is called its upstream branch.

To create a new tracking branch based on a remote upstream branch, you can run git switch -c <new branch name> <remote name>/<branch name>. You can also set up an existing branch to track a remote upstream branch by switching to the existing branch and running git branch --set-upstream-to <remote name>/<branch name>.

The git pull command integrates changes from a particular remote branch into the current local branch. If you run git pull with no other arguments from a tracking branch, it will automatically use the upstream remote branch. git pull <remote name> <branch name> specifies a particular remote branch to use.

If the local branch ref is an ancestor of the remote branch ref, git pull will fast-forward the local branch to the remote. If the two branches have diverged, git pull will by default try to merge the remote branch into the local one (or, in recent versions of Git, refuse to proceed until you choose a strategy). You generally want it to rebase the local branch on top of the remote one instead, so pass the --rebase option or set pull.rebase to true in your Git config.

The git push command uploads the current local branch to a remote branch. If you run git push with no other arguments from a tracking branch, it will automatically update the upstream remote branch. git push <remote name> <branch name> specifies a particular remote branch to update. If you have created a new local branch, you can use git push --set-upstream <remote name> <current local branch name> to create a remote branch with the same name as your new local branch, update it with the contents of your local branch, and establish a tracking relationship between them. You can delete a remote branch by running git push -d <remote name> <branch name>.

By default, git push will only succeed if the remote branch ref is an ancestor of the local branch ref. In other words, it will only succeed if it can fast-forward the remote branch ref to the local branch ref. If you want to rewrite the history of the remote branch, you can use git push --force-with-lease. However, the --force-with-lease flag will still result in failure if a commit has been made in the remote branch since the last time you pulled it. To overwrite the remote branch with your local branch in all scenarios, you can use git push --force. You should rarely need to use the --force flag, and you should never rewrite the history of a remote branch that other developers may be working on.

When you want to start collaborating on a new project that is already in progress, a common thing to do is clone that project’s repository from some Git hosting service. The command git clone <remote URL> creates a new directory with the name of the remote repository, copies the remote repository into the new directory, sets up the cloned repository as a remote with the name origin, creates remote branches for each branch in origin, and creates a local tracking branch for origin’s primary branch.

Workflow for collaboration

This workflow outlines how to use Git for collaborative projects in the most painless way possible. It describes the process of getting a new feature added into the project from start to finish.

  1. Clone the project’s repository from wherever it is hosted.

  2. Create a feature branch to work in. This branch may be created off of your local copy of the repository’s main branch or a different feature branch.

  3. Do the required development work in your feature branch. While working, you should maintain a small number of commits, often only one, at the tip of the branch. The commits should be semantically distinct and the sets of files they modify should often be disjoint. They should also have short, descriptive commit messages.

    For example, if we wanted to add support for user accounts to a web app, the commits in our feature branch, displayed with git log --oneline, might look like this:

    ff725997b backend: add support for user accounts
    b54b004df frontend: add login page
    f7f7769b4 frontend: main page: display logged in user's name
    

    The following tips will help you to maintain your commits:

    • Use interactive rebasing (git rebase -i) to order and squash commits. git commit --fixup <commit-to-fixup> and the --autosquash argument to git rebase -i are helpful for this.

    • Use git add -p to add individual chunks of files to the staging area.

    • Use git commit --amend to fold changes from the staging area into the previous commit.

    Maintaining a small number of semantically distinct commits at the tip of your feature branch makes your branch easier to maintain, understand, and review. It also makes rebasing easy.

    During the course of development, you often have to incorporate changes from a mainline branch into your feature branch. You should never use a merge in this scenario. For one thing, merging pollutes your feature branch’s history and makes it hard to identify which commits are actually relevant to the feature. But more importantly, merges that involve conflict resolution end up splitting changes to the same region of code across multiple commits: the feature branch commit that originally introduced the changes and the potentially massive merge commits that resolved conflicts. This makes it hard for collaborators and even yourself to work with your branch.

    To incorporate changes from a mainline branch, you should rebase the feature branch onto it. Doing so results in a clean linear history that is easy to understand and work with. And having a small number of semantically distinct commits in the feature branch guarantees you won’t experience cascading rebase conflicts.

  4. When your feature is done and tests are passing locally, push your feature branch to a new corresponding remote branch for review. Address any feedback by making changes to your local branch and using the tips mentioned above to maintain your commits. You likely won’t need to create any new semantically distinct commits in response to reviews. Use git push --force-with-lease to push versions of your feature branch with rewritten history back up to the remote. This is fine to do as long as no other developers are working on the remote copy of your feature branch.

  5. To incorporate your feature branch into the mainline branch, first rebase it on the mainline branch a final time if necessary. Then fast-forward the mainline branch to the feature branch. This results in a clean linear history for the mainline branch as well.

With this workflow, you almost never perform an actual git merge. However, merges are useful in certain scenarios. Imagine that you maintain a fork of some upstream project and want to incorporate changes from a new version of upstream. In this case, rebasing all the changes you’ve made in your fork onto a new upstream version is impractical, and the split of changes between your original commits that introduced them and merge commits for new versions of upstream is useful.

Stacked branches

Sometimes, when working on a large feature, you may want to make the changes in distinct parts that can be reviewed and integrated separately. One way to do this is to create a separate branch for each part of the feature such that each part’s branch is an extension of the previous part’s. More specifically, you would create a part-1 feature branch off of main, a part-2 feature branch off of part-1, and so on. Each part’s branch should contain the tip of the previous part’s branch (or main) in its history, so that the parts of the feature all apply cleanly on top of one another. This scenario is often referred to as having stacked branches.

While stacked branches can make the review process for large features simpler and more effective, they also involve additional management work. For example, when the main branch changes, you have to rebase the entire branch stack on top of a new commit such that part-1 is based on the new main, part-2 is based on the new part-1, and so on. Similarly, when one part of the feature is changed in response to review feedback, the subsequent parts must all be rebased.

Git’s --update-refs argument to the rebase command handles the extra work involved in rebasing stacked branches automatically. It performs the specified rebase command as normal, but for any local branches whose refs point to commits affected by the rebase, those refs are updated to the post-rebase commits. So, if you need to rebase a stack of branches on a new version of main, you can check out the last part in the stack, rebase it on main with the --update-refs argument, and all of the previous parts in the stack will also be rebased on top of each other and the new main automatically. You can then push all of the updated branches to the remote server with a single command.
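
For example, with a stack of three branches part-1, part-2, and part-3, and a remote named origin:

# with the last part checked out, rebase the whole stack onto the new main
git switch part-3
git rebase --update-refs main

# push every rewritten branch in one command; --force-with-lease is needed
# because the rebase rewrote their histories
git push --force-with-lease origin part-1 part-2 part-3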

The --update-refs argument also works with interactive rebases, which is useful for incorporating review feedback. You can check out the last part in the stack, make changes as needed, then perform an interactive rebase with the --update-refs flag to move and squash your commits. There will be update-ref lines in the list of commands that allow you to control exactly where the refs of each branch in the stack will point after the rebase.

Tips

Build tools

Build systems

For languages that require compilation, build systems handle invoking the compiler. They typically let you write configuration files that specify the command line arguments, libraries, input files, and output files that the compiler will use. Many allow you to make these specifications in a cross-platform way, so that your code can be both built on different platforms and built to execute on different platforms.

Build systems may support incremental builds, which only recompile files that have been modified since the previous compilation, and build caches, which store compilation outputs, for efficiency.

The right build system to use depends on the language being compiled and project requirements. Build systems are sometimes integrated into compilers or package management systems. Useful build systems to know about for C and C++ compilation include CMake, GNU Autotools, and Make.
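
As a minimal sketch of the configuration such a tool uses, a hand-written Makefile for a small C program (the file names here are hypothetical) lists each output, the inputs it is built from, and the command that produces it, which is what lets make rebuild only what has changed:

# note: recipe lines must be indented with a tab character
CC = cc
CFLAGS = -Wall -O2

app: main.o util.o
	$(CC) $(CFLAGS) -o app main.o util.o

main.o: main.c util.h
	$(CC) $(CFLAGS) -c main.c

util.o: util.c util.h
	$(CC) $(CFLAGS) -c util.c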

Package managers

Package managers allow you to split a codebase into packages, where a package is a single library or application. They also allow you to manage dependencies, which are the packages that your packages require in order to be built or run. Build-tool package managers typically are specific to a particular programming language and integrate with a build system. Unlike operating-system package managers, which install packages at a system-wide level, build-tool package managers typically only install dependencies in a particular build environment.

A first-party package is a library or application that is a part of your codebase. It is defined by a package manifest file that allows you to specify a package name, a version number, a set of files to include in the package, metadata, and a dependency list that references other packages. You can use a package manager to upload first-party packages to a package registry server.

A third-party package is one that has been defined outside of your codebase and published to a registry. For each first-party package in a codebase, the package manager can automatically download and cache all missing dependencies from the package registry. The downloaded dependencies can then be used during development, compilation, and execution.

To specify a dependency in a first-party package’s manifest file, you must include a range of version numbers to indicate which versions of the dependency your package is compatible with. When you modify one of these ranges or add or remove a dependency, you should use the package manager to generate a lockfile for the first-party package.
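
For illustration only, in an npm-style package.json manifest the dependency list with version ranges might look like the following (the package names here are hypothetical):

{
  "name": "my-first-party-package",
  "version": "1.0.0",
  "dependencies": {
    "some-library": "^2.1.0",
    "another-library": ">=1.4.0 <2.0.0"
  }
}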

A lockfile is a file that names a specific version for each of a package’s dependencies, whether they are listed directly in the package manifest or present in the dependency graph as a dependency of a dependency. In generating a lockfile, the package manager chooses specific dependency versions from the allowed ranges to minimize the overall number of dependencies required. For example, a third-party package that appears twice in the dependency graph with version ranges that overlap can be represented by a single entry in the lockfile. A third-party package that appears twice in the dependency graph with disjoint version ranges must be represented by two entries.

Once you have a lockfile, you can use the package manager to install the dependencies listed in it. A particular lockfile will always install exactly the same versions of exactly the same dependencies. Lockfiles are thus an efficient way to facilitate reproducible builds. Rather than adding all of a package’s dependencies into version control, you can simply add a lockfile.

Package managers should prevent the direct use of phantom dependencies. Phantom dependencies are dependencies of a package’s dependencies; they are present in the package’s dependency graph but not its manifest file. A package’s phantom dependencies can change versions or be added or removed unexpectedly as its explicit dependencies change over time. Because of this, the direct use of phantom dependencies can lead to hard-to-diagnose bugs, and attempts to use phantom dependencies should break a package’s build.

For simple projects, a codebase might contain only a single first-party package. However, more complex projects may have multiple first-party packages in a single codebase. Package managers that support such codebases need to be able to manage several first-party packages, and the dependencies between them, within one repository.

Task runners

Task runners orchestrate shell commands related to your codebase. They can interface with multiple package managers and build systems across different programming languages.

One feature of task runners is that they allow you to invoke arbitrary shell commands via simple shorthands. For example, running taskrunner lint might invoke the linter using a complicated list of configuration arguments.

Another feature is that they keep track of dependencies between commands. For example, if you run taskrunner test, the task runner might run taskrunner build before the test command so that your tests run against the most recent version of your project. This dependency tracking is particularly useful for codebases with multiple packages. In such a codebase, if you have an application that depends on a library, taskrunner build application might have taskrunner build library as a dependency.

Task runners track dependencies via a user-defined task dependency graph. Conceptually, each task comprises a shell command, a set of tasks that it depends on, and optional sets of input and output files. You must define the shell command and dependencies explicitly; the task runner might be able to determine the sets of input and output files by integrating with a build system or package manager. Edges in the task dependency graph connect a task to its dependencies. When you run a task, the runner makes sure that all of its dependencies are run first.

A task’s input and output file sets allow the task runner to perform caching. The runner can hash the contents of all of the input files and check that hash against a cache. If the hash hits, then the input files haven’t been changed, and so the output files and logs from a previous shell command run can be restored directly from the cache without running the shell command again. If the hash misses, then the task runner runs the task’s shell command and stores the logs and output files in the cache. The cache can be stored locally or on a remote server shared by multiple users.

For this caching to work correctly, a task must always yield the same output when given the same input. Such tasks are called hermetic; hermetic tasks are amenable to parallelization as well as caching. To promote hermeticity, some task runners accept as explicit inputs all of the binaries and libraries used by a task, including version numbers, so that the task will run the same way regardless of the host system executing it. Some task runners also restrict the types of commands tasks can perform to limit their ability to violate hermeticity.

Caching task runners are particularly useful in codebases with multiple packages. For example, consider a codebase that contains libraries as well as applications that use subsets of them. No matter how many source files in the codebase you edit, when you execute the task to build a particular application, it rebuilds only the libraries that the application depends on and whose input files have been modified since the last time they were built. Libraries that are not dependencies of the target application will not be built, and libraries whose input files have not been modified will be restored from the cache. Without a task runner, you would either have to manually determine exactly which libraries need to be rebuilt, which is slow and error-prone, or rebuild all of the packages in the codebase, which is inefficient.

Task runners can also be used to identify the set of tasks that have been affected by a set of changed files. Specifically, the set of affected tasks contains each task that has a changed file in its input set and each task that has such a task as a dependency. Identifying affected tasks is useful for eliminating unnecessary work in codebases with multiple packages. For example, imagine you have pulled a new version of your codebase and want to make sure that the changes didn’t break any tests. You can use the set of files changed by the pull to determine which tasks were affected. Then, executing only the test-running tasks that were affected, rather than the test-running tasks for all packages in the codebase, is sufficient to validate the changes.

Task runners might allow you to implement special handling for changes to particular types of files. For example, when a package manager’s lockfile changes, the task runner might invoke the package manager to determine which packages have been affected by the change, and only consider the lockfile to be changed in the input sets for tasks that correspond to the affected packages.

Current popular task runners include moonrepo and Turborepo.

Virtual environments

In the course of developing software, you’ll run into a lot of virtual environments. Below I describe the main kinds, how they’re useful, and some tools that can help you use them effectively.

Virtualization

Virtualization is when a piece of software known as a hypervisor uses features of real computer hardware to create and manage a set of virtual machines that can run guest operating systems as if each guest OS were running on an isolated instance of the underlying hardware. Sometimes a guest OS is modified to cooperate with the hypervisor through special interfaces, which is known as paravirtualization; otherwise, the guest runs unmodified, as it would on physical hardware.

Virtualization is useful for setting up isolated development environments without polluting your primary operating system. It’s also useful for testing software against a wide range of operating systems or getting access to software that isn’t available on your primary OS.

A virtual machine is represented by disk-image files, which contain the guest OS, together with hypervisor configuration files. Instantiating VMs from identical images and configuration files produces identical machines. These files can be distributed to share development environments, to package pre-configured applications with their dependencies, and to guarantee that networked applications are tested and deployed in identical environments.

Because virtualization uses hardware directly, it’s fast relative to emulation. However, it comes with the limitation that each virtual machine has the same architecture as the underlying physical hardware.

Popular hypervisors are kvm, bhyve, Hyper-V, the macOS Virtualization Framework, and Xen.

Emulation

Emulation is when a software emulator mimics the behavior of hardware. It’s broadly similar to virtualization, except that because an emulator works purely in software, it can emulate any kind of hardware. Operating purely in software also makes emulation slower than virtualization.

Emulation can be used for the same things as virtualization, but its worse performance makes it less likely to be used for distributing or running production applications. It is most useful for testing software across a wide range of hardware architectures, peripheral devices, and operating systems. You can also run nearly any piece of software, even ancient relics, using emulation.

The most popular general-purpose emulator is QEMU.

Simulation

The distinction between emulators and simulators is subtle. While emulators emulate the behavior of an entity, simulators simulate the entity’s behavior as well as some aspect of its internal operation. For example, an emulator for a particular CPU could take a set of instructions and execute them in any way, as long as the externally observable results are the same as they would be for the CPU being emulated. On the other hand, a simulator might execute the set of instructions in the same way that the real CPU would, taking the same number of simulated cycles and using simulated versions of its microarchitectural components.

In some sense, whether a piece of software is an emulator or simulator depends on the level of detail you’re interested in. Generally, though, simulators are much slower than emulators. They are typically used to do high-fidelity modeling of hardware before investing the resources to produce a physical version. A popular hardware simulator is gem5.

Containerization

Containers are lightweight virtual environments within a particular operating system. They isolate applications or groups of applications by limiting unnecessary access to system resources.

Some containerization systems represent containers as image files that can be instantiated by a container runtime. These kinds of containers are similar to virtual machines, but because they work within a particular operating system, they are both more efficient and less flexible. They can be used to package and distribute individual applications, networked or non-networked, with all of their dependencies or to set up isolated development environments.

Image-based containers also help to ensure applications are tested and used in identical environments, and they make it easy to deploy and scale networked applications. Production networked applications should always be run in some kind of container to limit damage in case of compromise.

Popular containerization systems include FreeBSD’s jails and Docker on Linux. Using chroot on UNIX-like systems changes the apparent root directory for a process, but it is not a containerization system.

Compatibility layers

Compatibility layers are interfaces that allow programs designed for some target system to run on a different host system. There are many kinds of compatibility layers, but typical ones work by implementing target system library function calls in terms of functions that are available on the host system. Some compatibility layers require recompiling the program, and others work on unmodified target system binaries.

Notable compatibility layers include Wine, which runs unmodified Windows binaries on Unix-like systems; WSL 1, which implements Linux system calls on top of the Windows kernel; FreeBSD’s Linux binary compatibility layer; and Cygwin, which provides a POSIX API that programs can be recompiled against to run on Windows.

Docker

Docker is a tool for creating and running containers. Containers are instantiated from images, and you can instantiate multiple disposable containers from a single image. You can specify how an image should be built, including which packages and files to include, in a Dockerfile. Base images can be pulled from Docker Hub or custom-made.
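As a minimal sketch (the Python base image and app.py are illustrative), a Dockerfile and the commands to build and run it might look like this:

# Write a minimal Dockerfile describing the image.
cat > Dockerfile <<'EOF'
FROM python:3.12-slim
WORKDIR /app
COPY app.py .
CMD ["python", "app.py"]
EOF

docker build -t myapp .    # build an image tagged "myapp" from the Dockerfile
docker run --rm myapp      # instantiate a disposable container from the image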

Docker builds images in terms of layers. Each command line in a Dockerfile creates a new layer in the final image, and created layers are stored in a build cache. For command lines whose inputs, including all previous commands in the Dockerfile, have been seen before, the corresponding layers are simply restored from the build cache. When a command line’s input does change, its corresponding layer and the layers of all subsequent command lines will be rebuilt. You should organize your Dockerfiles to use the build cache efficiently and be aware of situations in which you might have to clear the build cache to force a layer to be rebuilt.
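For example, copying dependency manifests and installing dependencies before copying the rest of the source keeps the expensive install layer cached across ordinary source edits. The sketch below assumes a Python project with a requirements.txt:

cat > Dockerfile <<'EOF'
FROM python:3.12-slim
WORKDIR /app
# Dependency layers: rebuilt only when requirements.txt changes.
COPY requirements.txt .
RUN pip install -r requirements.txt
# Source layer: rebuilt on every source change, but the layers above stay cached.
COPY . .
CMD ["python", "app.py"]
EOF

# Force a full rebuild, ignoring the build cache:
docker build --no-cache -t myapp .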

Dockerfiles also support multi-stage builds, which allow you to specify multiple build stages based on different base images. They are useful for producing minimal images without unnecessary build-time dependencies and for writing Dockerfiles with good build-cache properties that are relatively easy to read and maintain.
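A sketch of a multi-stage build (the Go toolchain and file names are illustrative): the first stage has the full toolchain and source tree, and the final image contains only the compiled binary:

cat > Dockerfile <<'EOF'
# Build stage: full toolchain and source tree.
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /out/server .

# Final stage: minimal image containing only the compiled binary.
FROM alpine:3.20
COPY --from=build /out/server /usr/local/bin/server
CMD ["/usr/local/bin/server"]
EOF

docker build -t server:minimal .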

Docker creates images in the Open Container Initiative (OCI) format. It uses the containerd daemon to manage images and containers at a high level. To actually instantiate containers, containerd uses runc.

Docker Compose is a relatively straightforward tool for managing multi-container applications, and Kubernetes is a more powerful orchestration framework for deploying containerized applications on a cluster.
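For instance, a two-service application might be described in a compose.yaml like this (service names, images, and ports are illustrative):

cat > compose.yaml <<'EOF'
services:
  web:
    build: .
    ports:
      - "8080:8080"
    depends_on:
      - db
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example
EOF

docker compose up -d    # build, create, and start both containers
docker compose down     # stop and remove them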

Useful commands

docker build
Create a new image from a Dockerfile.
docker commit <container> <image>
Create a new image from a container’s current state.
docker run <image>
Create and start a new container from an image.
docker stop <container>
Stop a container.
docker ps -a
Show all containers, including stopped ones.
docker images
Show local images.
docker rm <container>
Remove a container.
docker rmi <image>
Remove an image.
docker buildx ls
View cross-platform image builders.
docker buildx create --platform <platform>
Create a cross-platform image builder for the given target platform(s) and print its name.
docker buildx use <builder>
Use a cross-platform image builder for future builds.
docker buildx inspect --bootstrap
Initialize and print information about the currently used cross-platform image builder.
docker buildx build --platform <platform>
Create a new image for the target platform(s) from a Dockerfile; the currently used cross-platform image builder must support the specified platform(s). See the example after this list.
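Putting these commands together, a cross-platform build might look like this (builder and image names are illustrative; --load imports the result into the local image store and works for one platform at a time):

docker buildx create --name crossbuilder --platform linux/amd64,linux/arm64
docker buildx use crossbuilder
docker buildx inspect --bootstrap
docker buildx build --platform linux/arm64 -t myapp:arm64 --load .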

QEMU

QEMU is a full-system emulator. It’s useful for testing system-level software across a wide range of hardware platforms and architectures. It has device models that emulate real peripheral devices, and it also supports VirtIO. When QEMU is emulating a target architecture that matches the architecture of the host machine, it can use hypervisors such as kvm on the host machine to achieve performance on par with that of virtualization.
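For example, creating a disk image and booting an installer with kvm acceleration might look like this (file names, memory size, and installer ISO are illustrative):

# Create a 20 GB copy-on-write disk image for the guest.
qemu-img create -f qcow2 disk.qcow2 20G

# Boot the guest with kvm acceleration, VirtIO disk and network devices,
# and an installer ISO attached as a CD-ROM.
qemu-system-x86_64 \
    -enable-kvm \
    -cpu host \
    -m 4G \
    -drive file=disk.qcow2,if=virtio \
    -cdrom installer.iso \
    -nic user,model=virtio-net-pci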

A notable feature of QEMU is user-mode emulation, which is available on Linux and some BSDs. It runs binaries compiled for the same operating system but a different architecture, and it’s lighter weight than full-system emulation. It’s useful for testing and debugging cross-compiled applications.
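For example, on an x86-64 Linux host you might cross-compile a program for aarch64 and run it directly (toolchain and sysroot paths are illustrative and distribution-specific; -L tells QEMU where to find the target's dynamic linker and libraries):

# Cross-compile for aarch64, then run the binary under user-mode emulation.
aarch64-linux-gnu-gcc -o hello hello.c
qemu-aarch64 -L /usr/aarch64-linux-gnu ./hello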