Command Shells
Overview
This is an overview and short tutorial on command shells followed by an exercise to test your skills.
Introduction
Although there are a lot of great GUI-based machine learning applications, there are many cases when it is more productive to use command line interfaces, for example when working with git or Python, or when doing basic file and directory manipulation.
We’ll get you familiar with command shells for Windows and MacOS here. In the next section we’ll cover the Shared Computing Cluster at BU and its Linux operating system.
History and Types of Command Shells
By far, most command shells used in machine learning are descendants of the Bourne shell, sh, which was developed in the 1970s at AT&T’s Bell Labs for the Unix operating system. The Bourne Again shell[1], bash, was written by Brian Fox for the GNU Project and first released in 1989.
Most machine learning infrastructure is now built on Linux, a largely Unix-compatible operating system whose kernel Linus Torvalds began writing from scratch in 1991.
The zsh shell is a more recent descendant of the Bourne shell. It is the default shell on MacOS. MacOS itself is a descendant of Unix, so the command shell is tightly integrated with the operating system.
Windows Terminal
Note that the Windows Command Prompt (cmd.exe), which you may run inside the Windows Terminal app, is for the most part a completely different command shell from bash and zsh, with different syntax and commands. As we describe below, we recommend using one of the Linux-style shells available on Windows.
Getting to a command shell
The available command shells, and how you start them, vary by operating system. Follow the instructions for your operating system below.
In MacOS, start the Terminal app.
Starting with MacOS Catalina, the default shell is zsh. If you’ve upgraded from an earlier version of MacOS, you might still be using the bash shell. If so, you should see a reminder every time you open Terminal explaining how to switch to zsh.
The reason it is important to know which shell you are running is that the shell startup configuration files are particular to the shell. There are some other subtle differences in shell commands and syntax.
You can see which shell you are running with the process status command ps. One or more of the processes listed will include bash or zsh.
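As a minimal sketch, the two commands below narrow this down to the shell you are typing in. `$$` is a shell variable holding the process ID of the running shell, and `SHELL` is an environment variable holding the login shell configured for your account, which can differ from the shell you are currently using. The variable name `current_shell` is only illustrative.

```shell
# Show only the process for the shell you are typing in ($$ is its process ID).
current_shell=$(ps -p $$ -o comm=)
echo "Current shell: $current_shell"

# Show the login shell configured for your account (may differ from the above).
echo "Login shell: $SHELL"
```

If the first line reports bash and you expected zsh (or the other way round), you know which startup file the shell will read.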
There are several options for command shells on Windows, provided below in increasing order of preference.
Windows Terminal/Command Prompt
As mentioned above, the Command Prompt (often opened inside the Windows Terminal app) is a different shell from bash and zsh and is generally not as well supported by many machine learning related command line tools. But once you have Python and git installed, the python and git commands work in the Command Prompt much as they do on Linux.
Windows PowerShell
Windows PowerShell is a more modern command shell that ships with Windows 10 and later. It provides aliases for some familiar commands such as ls, cd and pwd, but its command syntax is not that of bash or zsh.
Windows Git Bash
When you install Git on Windows, it also installs a command shell called Git Bash. This is a Windows port of the bash shell, so the command syntax is more like Linux. It also operates in the Windows filesystem, so it is straightforward to access files you use from Windows applications.
Windows Subsystem for Linux (WSL)
Microsoft released Windows Subsystem for Linux (WSL) in 2016. WSL2 is a more modern version of WSL that is more compatible with Linux. It is the recommended way to use Linux on Windows.
Installation instructions are here.
Once you set up WSL2, you can install a Linux distribution. Ubuntu is a popular choice, and we recommend that.
When you start the WSL2 shell, you are in your Linux home directory (/home/user, where user is your Linux username), which is different from your Windows home directory. You can get to your Windows home directory with cd /mnt/c/Users/your_username. Generally you don’t want to manipulate Windows files from the WSL2 shell, or vice versa, as there are subtle differences between the two filesystems (for example, in line endings and file permissions) that may cause problems. But if needed, you can navigate between the two filesystems.
First Basic Commands
At this point you should have a command shell open and see at least a minimal prompt like:
$
or
%
Your prompt may have some other decorations around the prompt character, and we’ll cover those later.
pwd – print working directory
usage: pwd
It’s very helpful to know where you are in the filesystem and pwd tells you that. When you first open a terminal, you are usually in your home directory.
cd – change directory
usage: cd [directory]
You use cd to change directories.
If you want to change to your home directory you can type cd with no arguments.
cd
Besides typing directory paths such as /path/to/directory, you can also use .. to go up one directory and . to refer to the current directory. The other special directories are ~ for your home directory and - for the previous directory you were in.
cd ..
cd .
ls – list directory contents
Once you are in a directory, you can use ls to list its contents.
usage: ls [-al1] [directory]
If you type ls with no arguments, it will list the contents of the current directory.
By default, files whose names start with a dot are not listed. If you want to list all the contents of the current directory, including these hidden dot files, you can use the -a option, ls -a.
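The navigation commands so far can be tried together in a short session. This is a minimal sketch: the mktemp and touch commands, the variable name demo_dir and the file names are assumptions introduced here only to create a throwaway directory with something in it.

```shell
# Work in a throwaway directory so nothing in your real files is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir"

# Create one visible file and one hidden (dot) file to list.
touch notes.txt .hidden_config

pwd        # prints the full path of the throwaway directory
ls         # lists notes.txt only
ls -a      # also lists ., .. and .hidden_config

# Go up one level, then jump back with "cd -" (it prints the directory it returns to).
cd ..
cd -
```

After the last command you are back in the throwaway directory, which you can confirm with pwd.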
You can see more information about the files in your directory with the -l option, ls -l. It will show something like this from MacOS:
drwxr-xr-x 14 tomg staff 448 Aug 15 15:43 courses
drwxr-xr-x@ 16 tomg staff 512 Aug 26 2023 coursework
| File Permissions | Number of Links | Owner | Group | Size | Date Modified | Filename |
|---|---|---|---|---|---|---|
| drwxr-xr-x | 14 | tomg | staff | 448 | Aug 15 15:43 | courses |
| drwxr-xr-x@ | 16 | tomg | staff | 512 | Aug 26 2023 | coursework |
The file permissions are a string of 10 characters. The first character is the file type. The next 9 characters are the file permissions in triplets corresponding to the user, group, and other permissions.
In each triplet, the first character is the read permission (r), the second character is the write permission (w), and the third character is the execute permission (x). If there is a - in place of a letter, it means that permission is not granted.
If there is a @ at the end of the permission string, it means the file has extended attributes.
               group permissions
file type      |    extended attributes
          |   ---   |
          drwxr-xr-x@
           ---   ---
            |     |
            |     other/world permissions
            user permissions
It is important to understand file permissions. For command shells on personal computers, the permissions are not as important. However, on shared computing clusters, understanding file permissions is critical. We’ll talk more about this in the SCC section.
The “Owner” and “Group” are also important. The “Owner” is usually the user who created the file. The “Group” is the group assigned to the file, usually one that the owner belongs to. Again, on personal computers, these are not as important. However, on shared computing clusters, these are critical.
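To see the permission string change, here is a minimal sketch that sets permissions on a scratch file and lists it. The chmod command is not covered above and is introduced here as an assumption about what you would use to change permissions; mktemp, touch and the file name report.txt are likewise only for illustration.

```shell
# Create a scratch file in a throwaway directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/report.txt"

# Set permissions explicitly: owner read/write, group read, others nothing.
chmod u=rw,g=r,o= "$demo_dir/report.txt"

# The first column of "ls -l" now starts with -rw-r----- :
#   "-" regular file, "rw-" user, "r--" group, "---" other.
ls -l "$demo_dir/report.txt"
```

Reading the string in triplets, as in the diagram above, tells you who can read, write or execute the file.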
rm – remove files or directories
usage: rm [-fivr] file...
cp – copy files or directories
usage: cp [-R] source destination
mv – move or rename files or directories
usage: mv [-fiv] source destination
mkdir – make directories
usage: mkdir [-pv] directory...
cat – display file contents
usage: cat [-bEevnst] [file...]
more – display file contents one page at a time
usage: more [file...]
head – display the first lines of a file (10 by default)
usage: head [-n count] [file...]
tail – display the last lines of a file (10 by default)
usage: tail [-f] [-n count] [file...]
man – display the manual page for a command
usage: man [-k] command...
Note that for most of the “builtin” commands shown above, there might not be detailed command help, which is the case for MacOS zsh. SCC’s Linux bash does have more detailed help on builtin commands.
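The file commands above can be combined into one short session. This is a sketch under a few assumptions: mktemp creates a throwaway directory so the rm commands cannot touch real files, seq and the > redirection are used only to generate sample data, and the directory and file names are made up for illustration.

```shell
# Work in a throwaway directory so the rm commands cannot touch real files.
demo_dir=$(mktemp -d)
cd "$demo_dir"

# mkdir -p creates parent directories as needed.
mkdir -p project/data

# Create a 20-line file of sample data (the numbers 1 to 20, one per line).
seq 1 20 > project/data/numbers.txt

# Copy the file, then rename the copy with mv.
cp project/data/numbers.txt project/data/copy.txt
mv project/data/copy.txt project/data/backup.txt

# head and tail show 10 lines by default; -n changes the count.
head -n 3 project/data/numbers.txt   # prints 1, 2 and 3 on separate lines
tail -n 2 project/data/numbers.txt   # prints 19 and 20 on separate lines

# cat prints the whole file.
cat project/data/backup.txt

# rm removes a file; rm -r removes a directory and everything inside it.
rm project/data/backup.txt
rm -r project
```

Be careful with rm -r: there is no undo, so check with pwd and ls before running it on real directories.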
Other commands
There are many more commands and aspects of command shells that are helpful, such as pipes, background jobs, redirection, and more.
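As a brief taste of three of these, the sketch below shows redirection, a pipe and a background job. The echo, wc, sleep and wait commands are not covered above and are used here only as simple, widely available examples; the file and variable names are made up.

```shell
# Work in a throwaway directory.
demo_dir=$(mktemp -d)
cd "$demo_dir"

# Redirection: ">" writes a command's output to a file, ">>" appends to it.
echo "first line" > log.txt
echo "second line" >> log.txt

# Pipe: "|" feeds the output of one command into the next.
# wc -l counts lines, so this counts the lines in log.txt.
line_count=$(cat log.txt | wc -l)
echo "log.txt has $line_count lines"

# Background job: "&" starts a command without waiting for it to finish,
# and "wait" pauses until all background jobs are done.
sleep 1 &
wait
echo "background job finished"
```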
Command Shell-Based Editors
There are many great GUI-based Integrated Development Environments (IDEs) for machine learning. However, text editors that run inside the command shell are still useful.
nano – simple text editor
usage: nano [file...]
nano is a simple text editor that is easy to use. It is a good editor to use for beginners. For the most part, you are shown what commands are available to you.
vim – advanced text editor
usage: vim [file...]
vim is a more advanced text editor that is more powerful than nano. It is a good editor to use for more advanced users. It might be worth taking the time to learn it.
See for example the documentation and a Vim Cheat Sheet.
Perhaps the most basic getting started tutorial is to edit a new file to add a line of text to it, then save and exit.
vim myfile.txt
The editor will then occupy your entire terminal window.
An important concept of vim is that you switch between a navigate mode (which Vim’s documentation calls normal mode) and an insert mode. When in navigate mode, any keys you press will be interpreted as commands. You start in navigate mode. To switch to insert mode, you press i.
So type ‘i’ to switch to insert mode, type any text you want to add, for example, “Hello, world!”, then press Esc to switch back to navigate mode.
To save and exit, type :wq and press Enter.
Shell Configuration
When you start a new shell, it will read certain files to set configuration. One useful one is .zshrc in your home directory.
If you add the following lines to your .zshrc file, it will add three very helpful decorations to your shell prompt:
- The current working directory.
- The current git branch name.
- A checkmark if the last shell command was successful and a question mark if it was not.
# Find and set branch name var if in git repository.
# From: https://medium.com/pareture/simplest-zsh-prompt-configs-for-git-branch-name-3d01602a6f33
function git_branch_name()
{
  branch=$(git symbolic-ref HEAD 2> /dev/null | awk 'BEGIN{FS="/"} {print $NF}')
  if [[ $branch == "" ]];
  then
    :
  else
    echo '('$branch')'
  fi
}
# Enable command substitution in the prompt so that $(git_branch_name) is re-evaluated each time.
setopt prompt_subst
prompt='%(?.%F{green}√.%F{red}?%?)%f %B%F{240}%1~%f%b %F{red}$(git_branch_name)%f %# '
For example, on MacOS zsh my prompt looks like this:
√ ml-549-course (lectures) %
This tells me that the last command was successful, I am in the ml-549-course directory, and the current git branch is lectures.
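To see how the branch name is extracted, the sketch below runs the same awk step on a hard-coded reference instead of calling git. The value refs/heads/lectures is an assumed example of what git symbolic-ref HEAD prints on a branch named lectures, and the variable names are only illustrative.

```shell
# Simulate the output of "git symbolic-ref HEAD" on a branch named lectures.
ref="refs/heads/lectures"

# Split on "/" and print the last field ($NF), as git_branch_name does.
branch=$(echo "$ref" | awk 'BEGIN{FS="/"} {print $NF}')
echo "($branch)"   # prints (lectures)
```

One consequence of keeping only the last field is that a branch name containing a slash, such as feature/login, would be shown as login.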
If you use bash rather than zsh, the equivalent configuration file is .bashrc in your home directory.
If you add the following lines to your .bashrc file, it will add two very helpful decorations to your shell prompt:
- The current working directory.
- The current git branch name.
parse_git_branch() {
  git branch 2> /dev/null | sed -e '/^[^*]/d' -e 's/* \(.*\)/(\1)/'
}
export PS1="\[\e[32m\]\w \[\e[91m\]\$(parse_git_branch)\[\e[00m\]$ "
So in a bash command shell, my prompt may look like this:
/users/tomg/ml-549-course (main) $
Which tells me I’m in the directory /users/tomg/ml-549-course and I’m on git branch main.
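To see what the sed expressions in parse_git_branch do, the sketch below feeds them a hard-coded listing instead of calling git. The two-line input is an assumed example of git branch output, in which the current branch is marked with an asterisk; the variable name is only illustrative.

```shell
# Simulate "git branch" output: two branches, with main as the current one.
# The first sed expression deletes lines that do not start with "*";
# the second rewrites "* main" as "(main)".
branch=$(printf '  dev\n* main\n' | sed -e '/^[^*]/d' -e 's/* \(.*\)/(\1)/')
echo "$branch"   # prints (main)
```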
Exercise: Command Shell Scavenger Hunt
Create a shell script called treasure_hunt.sh that performs a series of tasks, demonstrating your understanding of basic command line operations. The script should:
- Create a directory called ml_treasure_hunt
- Inside that directory, create a text file named clue_1.txt with the content “The treasure is hidden in plain sight”
- Create a subdirectory called secret_chamber
- In the secret_chamber, create a file called clue_2.txt with the content “Look for a hidden file”
- Create a hidden file in the ml_treasure_hunt directory called .treasure_map.txt with the content “Congratulations! You’ve found the treasure!”
Use either nano or vim from a command shell to create the script.
Shell scripts are just text files. By convention, they have a .sh extension and start with a “shebang” line.
#!/usr/bin/env bash
# Your code here
There are multiple ways to execute the shell script. Perhaps the easiest is
source treasure_hunt.sh
Footnotes
[1] The name bash is a pun on the name of the Bourne shell, sh.