Command Shells

Overview

This is an overview and short tutorial on command shells followed by an exercise to test your skills.

Introduction

Although there are many great GUI-based machine learning applications, it is often more productive to use a command line interface, for example when working with git or python or doing basic file and directory manipulation.

We’ll get you familiar with command shells for Windows and MacOS here. In the next section we’ll cover the Shared Computing Cluster at BU and its Linux operating system.

History and Types of Command Shells

By far, most command shells used in machine learning are descendants of the Bourne shell, sh, which was developed in the 1970s at AT&T’s Bell Labs for the Unix operating system. The Bourne Again shell[1], bash, was developed in the late 1980s by Brian Fox for the GNU Project.

Most machine learning infrastructure is now built on Linux, a largely Unix-compatible operating system whose kernel Linus Torvalds began writing from scratch in the early 1990s.

The zsh shell is a more recent descendant of the Bourne shell. It is the default shell on MacOS. MacOS itself is a descendant of Unix and so the command shell is tightly integrated with the operating system.

Windows Terminal

Warning

The Windows Command Prompt (cmd.exe, which you may run inside the Windows Terminal app) is, for the most part, a completely different command shell from bash and zsh, with different syntax and commands. As we describe below, we recommend using one of the Linux-style shells available on Windows.

Getting to a command shell

The available command shells, and how you start them, vary by operating system. The instructions below cover MacOS first and then Windows.

In MacOS, start the Terminal app.

Starting with MacOS Catalina, the default shell is zsh. If you’ve upgraded from a version of MacOS earlier than Catalina, you might still be using the bash shell. If you are still using bash, there should be a reminder every time you open Terminal explaining how to switch to zsh.

The reason it is important to know which shell you are running is that the shell startup configuration files are particular to the shell. There are some other subtle differences in shell commands and syntax.

You can see which shell you are running with the process status command ps. One or more of the processes listed will include bash or zsh.
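As a minimal sketch, the commands below narrow the ps output down to the shell you are typing in. They assume a ps that accepts the -p and -o options, as on MacOS and most Linux systems; the variable name current_shell is only illustrative.

```shell
# $$ expands to the process ID of the shell you are typing in.
# -p selects that one process; -o comm= prints only its command name, with no header line.
current_shell=$(ps -p $$ -o comm=)
echo "Current shell: $current_shell"
```

The printed name should contain bash or zsh, sometimes with a leading dash or a full path.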

There are several options for command shells on Windows, provided below in increasing order of preference.

Windows Terminal/Command Prompt

As mentioned above, the Command Prompt is a different shell from bash and zsh, and many machine learning command line tools support it less well. But once you have python and git installed, the python and git commands work in it much as they do on Linux.

Windows PowerShell

Windows PowerShell is a more modern command shell than the Command Prompt, and some of its commands resemble those of zsh and bash. In recent versions of Windows it is the default profile in Windows Terminal. But its command syntax is not the same as that of bash or zsh.

Windows Git Bash

When you install Git on Windows, it also installs a command shell called Git Bash. This is a port of the bash shell to Windows, so the command syntax is much closer to Linux. It also operates in the Windows filesystem, so it is straightforward to access files you use from Windows applications.

Windows Subsystem for Linux (WSL)

Microsoft released Windows Subsystem for Linux (WSL) in 2016. WSL2 is a more modern version of WSL that is more compatible with Linux. It is the recommended way to use Linux on Windows.

Installation instructions are available in Microsoft’s WSL documentation.

Once you set up WSL2, you can install a Linux distribution. Ubuntu is a popular choice, and we recommend it.

Note

When you start the WSL2 shell, you are in your Linux home directory (/home/ followed by your Linux username), which is different from your Windows home directory. You can get to your Windows home directory with cd /mnt/c/Users/your_username. Generally you don’t want to manipulate Windows files from the WSL2 shell, or vice versa, because subtle file format differences (such as line endings) may cause problems. But if needed, you can navigate between the two filesystems.

First Basic Commands

At this point you should have a command shell open and see at least a minimal prompt like:

$

or

%

Your prompt may have some other decorations around the prompt character, and we’ll cover those later.

pwd – print working directory

usage: pwd

It’s very helpful to know where you are in the filesystem and pwd tells you that. When you first open a terminal, you are usually in your home directory.

cd – change directory

usage: cd [directory]

You use cd to change directories.

If you want to change to your home directory you can type cd with no arguments.

cd

Besides typing directory paths such as /path/to/directory, you can also use .. to go up one directory and . to refer to the current directory. The other special directories are ~ for your home directory and - for the previous directory you were in.

cd ..
cd .
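The following short session is a sketch of how pwd and the special directory names fit together. It assumes a /tmp directory exists, as it does on MacOS and Linux; the exact paths printed depend on your own account.

```shell
# Start from the home directory and confirm the location.
cd ~
pwd

# Move to the system temporary directory (assumed to exist at /tmp).
cd /tmp
pwd

# "cd -" returns to the previous directory, which is the home directory here.
cd -
pwd

# ".." is the parent of the current directory.
cd ..
pwd
```

Running pwd after each cd is a cheap way to confirm you ended up where you intended.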

ls – list directory contents

Once you are in a directory, you can use ls to list its contents.

usage: ls [-al1] [directory]

If you type ls with no arguments, it will list the contents of the current directory.

By default, files whose names start with a dot are not listed. If you want to list all the contents of the current directory, including these hidden dot files, you can use the -a option, ls -a.
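To see the effect of -a in isolation, here is a small sketch that works in a throwaway directory. The mktemp and touch commands and the file names are assumptions made for the demo and are not covered elsewhere in this tutorial.

```shell
# Work in a throwaway directory so the listing is predictable.
demo_dir=$(mktemp -d)
cd "$demo_dir"

# touch creates empty files; one name starts with a dot.
touch notes.txt .hidden_config

ls        # prints only: notes.txt
ls -a     # also lists . and .. and .hidden_config
ls -1a    # the same entries, one per line
```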

You can see more information about the files in your directory with the -l option, ls -l. It will show something like this from MacOS:

drwxr-xr-x  14 tomg  staff  448 Aug 15 15:43 courses
drwxr-xr-x@ 16 tomg  staff  512 Aug 26  2023 coursework
File Permissions  Number of Links  Owner  Group  Size  Date Modified  Filename
drwxr-xr-x        14               tomg   staff  448   Aug 15 15:43   courses
drwxr-xr-x@       16               tomg   staff  512   Aug 26 2023    coursework

The file permissions are a string of 10 characters. The first character is the file type. The next 9 characters are the file permissions in triplets corresponding to the user, group, and other permissions.

In each triplet, the first character is the read permission, the second character is the write permission, and the third character is the execute permission. A - in any position means that permission is not granted.

If there is a @ at the end of the permission string, it means the file has extended attributes.

             group permissions
        file type    |    extended attributes
                |   ---   |
                drwxr-xr-x@
                 ---   ---
                  |     |
                  |       other/world permissions
                  user permissions
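The same breakdown can be sketched in the shell by slicing the sample permission string by character position. The cut command and the variable names are assumptions introduced for illustration only.

```shell
# Sample permission string copied from the listing above.
perms='drwxr-xr-x@'

# cut -c selects characters by position (counting from 1).
file_type=$(echo "$perms" | cut -c1)       # d
user_perms=$(echo "$perms" | cut -c2-4)    # rwx
group_perms=$(echo "$perms" | cut -c5-7)   # r-x
other_perms=$(echo "$perms" | cut -c8-10)  # r-x

echo "type=$file_type user=$user_perms group=$group_perms other=$other_perms"
```

Here d marks a directory, the owner can read, write, and enter it, and everyone else can read and enter it but not change it.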

Understanding file permissions is important. For command shells on personal computers, the permissions are not as important. However, on shared computing clusters, understanding file permissions is critical. We’ll talk more about this in the SCC section.

The “Owner” and “Group” columns are also important. The “Owner” is usually the user who created the file. The “Group” is the group assigned to the file, typically one that the owner belongs to. Again, on personal computers, these are not as important. However, on shared computing clusters, these are critical.

rm – remove files or directories

usage: rm [-fivr] file...

cp – copy files or directories

usage: cp [-R] source destination

mv – move or rename files or directories

usage: mv [-fiv] source destination

mkdir – make directories

usage: mkdir [-pv] directory...
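The usage lines for rm, cp, mv, and mkdir are easier to remember after trying them once. The sketch below does so in a throwaway directory; mktemp, the echo redirection used to create a file, and all file names are assumptions made for the demo.

```shell
# Work in a throwaway directory so nothing important is touched.
scratch=$(mktemp -d)
cd "$scratch"

# -p creates missing parent directories in one step.
mkdir -p project/data

# Create a small file to practice on.
echo "hello" > project/notes.txt

# Copy a single file, then copy a whole directory recursively with -R.
cp project/notes.txt project/notes_backup.txt
cp -R project project_copy

# mv renames (or moves) a file.
mv project/notes_backup.txt project/old_notes.txt

# rm deletes a file; -r deletes a directory and everything inside it.
rm project/old_notes.txt
rm -r project_copy

# List what is left, recursively.
ls -R
```

Note that rm does not use a trash folder, so it is worth practicing in a scratch directory like this before using -r on real work.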

cat – display file contents

usage: cat [-bEevnst] [file...]

more – display file contents one page at a time

usage: more [file...]

head – display the first 10 lines of a file

usage: head [-n count] [file...]

tail – display the last 10 lines of a file

usage: tail [-f] [-n number] [file...]
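As a quick sketch of cat, head, and tail, the commands below build a 25-line sample file and view different parts of it. The sample file, the loop that fills it, and mktemp are assumptions for the demo; more and tail -f are left out because they are interactive.

```shell
# Build a sample file containing "line 1" through "line 25".
sample=$(mktemp)
i=1
while [ "$i" -le 25 ]; do
  echo "line $i"
  i=$((i + 1))
done > "$sample"

cat "$sample"         # prints all 25 lines
head "$sample"        # prints the first 10 lines
head -n 3 "$sample"   # prints line 1 through line 3
tail -n 2 "$sample"   # prints line 24 and line 25
```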

man – display the manual page for a command

usage: man [-k] command...

Note that for most of the “builtin” commands shown above, there might not be detailed command help, which is the case for MacOS zsh. SCC’s Linux bash does have more detailed help on builtin commands.

Other commands

There are many more commands and aspects of command shells that are helpful, such as pipes, background jobs, redirection, and more.
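To give a first taste of redirection, pipes, and background jobs, here is a minimal sketch. The sort, sleep, and wait commands and the file names are assumptions introduced for the demo and are not covered elsewhere in this tutorial.

```shell
# Work in a throwaway directory.
work=$(mktemp -d)
cd "$work"

# Redirection: > writes a command's output to a file, >> appends to it.
echo "banana" > fruit.txt
echo "apple" >> fruit.txt
echo "cherry" >> fruit.txt

# Pipe: | feeds the output of one command into the next.
sort fruit.txt | head -n 1 > first.txt
cat first.txt    # prints: apple

# Background job: & starts a command without waiting for it;
# wait pauses until background jobs have finished.
sleep 1 &
wait
echo "background job finished"
```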

Command Shell-Based Editors

There are many great GUI-based Integrated Development Environments (IDEs) for machine learning. However, command line editors are still useful, especially when you are working in a shell on a remote machine.

nano – simple text editor

usage: nano [file...]

nano is a simple text editor that is easy to use. It is a good editor to use for beginners. For the most part, you are shown what commands are available to you.

vim – advanced text editor

usage: vim [file...]

vim is a more advanced text editor that is more powerful than nano. It is a good editor to use for more advanced users. It might be worth taking the time to learn it.

See for example the documentation and a Vim Cheat Sheet.

Perhaps the most basic getting started tutorial is to edit a new file to add a line of text to it, then save and exit.

vim myfile.txt

The editor will then occupy your entire terminal window.

An important concept of vim is that you switch between a navigate mode (which vim calls normal mode) and an insert mode. In navigate mode, any keys you press are interpreted as commands. You start in navigate mode. To switch to insert mode, press i. To return to navigate mode, press Esc.

So type ‘i’ to switch to insert mode, type any text you want to add, for example, “Hello, world!”, then press Esc to switch back to navigate mode.

To save and exit, type :wq and press Enter.

Shell Configuration

When you start a new shell, it reads certain files to set its configuration. For zsh, a useful one is .zshrc in your home directory.

If you add the following lines to your .zshrc file, it will add three very helpful decorations to your shell prompt:

  1. The current working directory.
  2. The current git branch name.
  3. A checkmark if the last shell command was successful and a question mark if it was not.

# Find and set branch name var if in git repository.
# From: https://medium.com/pareture/simplest-zsh-prompt-configs-for-git-branch-name-3d01602a6f33
function git_branch_name()
{
  # Take the last path component of the current ref, e.g. refs/heads/main -> main.
  branch=$(git symbolic-ref HEAD 2> /dev/null | awk 'BEGIN{FS="/"} {print $NF}')
  if [[ $branch == "" ]];
  then
    :
  else
    echo '('$branch')'
  fi
}

# Re-evaluate $(git_branch_name) each time the prompt is drawn.
setopt prompt_subst

prompt='%(?.%F{green}√.%F{red}?%?)%f %B%F{240}%1~%f%b %F{red}$(git_branch_name)%f %# '

For example, on MacOS zsh my prompt looks like this:

√ ml-549-course (lectures) %

This tells me that the last command was successful, that I am in the ml-549-course directory, and that the current git branch is lectures.

When bash starts a new interactive shell, it also reads certain files to set its configuration. A useful one is .bashrc in your home directory.

If you add the following lines to your .bashrc file, it will add two very helpful decorations to your shell prompt:

  1. The current working directory.
  2. The current git branch name.

parse_git_branch() {
     git branch 2> /dev/null | sed -e '/^[^*]/d' -e 's/* \(.*\)/(\1)/'
}
export PS1="\[\e[32m\]\w \[\e[91m\]\$(parse_git_branch)\[\e[00m\]$ "

So in a bash command shell, my prompt might look like this:

/users/tomg/ml-549-course (main) $

This tells me I’m in the directory /users/tomg/ml-549-course and I’m on git branch main.

Exercise: Command Shell Scavenger Hunt

Create a shell script called treasure_hunt.sh that performs a series of tasks, demonstrating your understanding of basic command line operations. The script should:

  1. Create a directory called ml_treasure_hunt
  2. Inside that directory, create a text file named clue_1.txt with the content “The treasure is hidden in plain sight”
  3. Create a subdirectory called secret_chamber
  4. In the secret_chamber, create a file called clue_2.txt with the content “Look for a hidden file”
  5. Create a hidden file in the ml_treasure_hunt directory called .treasure_map.txt with the content “Congratulations! You’ve found the treasure!”

Use either nano or vim from a command shell to create the script.

Tip

Shell scripts are just text files. By convention, they have a .sh extension and start with a “shebang” line.

#!/usr/bin/env bash

# Your code here

There are multiple ways to execute the shell script. Perhaps the easiest is

source treasure_hunt.sh
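As a sketch of the other common ways to run a script, the example below uses a tiny stand-in script rather than the exercise solution. The name hello.sh, the mktemp scratch directory, and the chmod command are assumptions introduced for illustration, and it assumes bash is installed.

```shell
# Work in a throwaway directory.
demo=$(mktemp -d)
cd "$demo"

# Write a two-line stand-in script (hello.sh is a made-up name for this demo).
printf '%s\n' '#!/usr/bin/env bash' 'echo "hello from script"' > hello.sh

# 1. Run the lines inside your current shell.
#    "." is the portable spelling of the source command.
. ./hello.sh

# 2. Pass the file to bash, which runs it in a new shell process.
bash hello.sh

# 3. Mark the file executable once, then run it by path;
#    the shebang line selects the interpreter.
chmod +x hello.sh
./hello.sh
```

Because source runs the script inside your current shell, any cd or variable assignment in the script persists after it finishes. With the other two methods those changes stay inside the new shell process.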


Footnotes

  1. The name bash is a pun on the name of the Bourne shell, sh.