8 Linux and the Command Line

8.1 Introduction to UNIX and its siblings

UNIX
Originally developed at AT&T Bell Labs circa 1970. It has since followed a long, multi-branched evolutionary path.
POSIX (Portable Operating System Interface)
a set of specifications of what an OS needs to qualify as “a Unix”, to enhance interoperability among all the “Unix” variants

8.1.1 Various Unices

The unix family tree

8.1.2 Some Unix hallmarks

  • Supports multiple users and multiple processes
  • Highly modular: many small tools that do one thing well, and can be combined
  • Culture of text files and streams
  • Primary OS on HPC (High Performance Computing) systems
  • Main OS on which the Internet was built

8.2 The Command Line Interface (CLI)

The CLI provides a direct way to interact with the Operating System, by typing in commands.

8.2.1 Why the CLI is worth learning

  • Typically much more extensive access to features, commands, options
  • Command statements can be written down, saved (scripts!)
  • Easier automation
  • Much “cheaper” to do work on a remote system (no need to transmit all the graphical stuff over the network)

8.2.2 Connecting to a remote server via ssh

From Git Bash (MS Windows) or the Terminal (Mac), type:

You will be prompted for your username and password.

aurora_ssh

You can also directly add your username:

In this case, you will only be asked for your password, as you have already specified which user you want to connect as.
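
The two forms described above can be sketched as follows. The host name aurora.example.org and the user name jdoe are placeholders invented for this example, not the real server address or account. The commands are printed rather than run, since running them needs a real server.

```shell
# Hypothetical values: replace them with your own server and account
host="aurora.example.org"
user="jdoe"

# Form 1: host only
cmd_host_only="ssh $host"
# Form 2: username and host together, so only the password is requested
cmd_with_user="ssh $user@$host"

# Print the commands instead of running them, since they need a real server
echo "$cmd_host_only"
echo "$cmd_with_user"
```

To connect for real, type the printed command at the prompt with your own values filled in.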

You can also use the terminal from RStudio!

8.4 General command syntax

  • $ command [options] [arguments]

where command must be an executable file on your PATH

  • echo $PATH

and options can usually take two forms

  • short form: -a
  • long form: --all

You can combine the options:
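
As a small illustration, the sketch below checks where a command lives on the PATH, then runs ls with two short options given separately and combined. The -l and -a options are standard ls flags; the scratch directory and file names are made up for the example.

```shell
# Where does the shell find the ls executable?
command -v ls

# Work in a throwaway directory so the listing is predictable
demo_dir=$(mktemp -d)
touch "$demo_dir/visible.txt" "$demo_dir/.hidden"

# Two short options given separately...
separate=$(ls -l -a "$demo_dir")
# ...and the same two options combined behind a single dash
combined=$(ls -la "$demo_dir")
# GNU ls also accepts the long form of -a: ls -l --all

echo "$combined"
```

Both forms should produce the same listing, including the dot-file that a plain ls would leave out.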

What do these options do?

8.4.1 find

Show me my Rmarkdown files!

Which files are larger than 1GB?

With more details about the files:
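
The three searches above might be written as follows. This is a sketch: it builds a small scratch directory with made-up file names so the commands have something to find, and it assumes a find that accepts size suffixes such as G (GNU and BSD find both do).

```shell
# Scratch directory with hypothetical files, so the searches are repeatable
demo_dir=$(mktemp -d)
mkdir "$demo_dir/reports"
touch "$demo_dir/analysis.Rmd" "$demo_dir/reports/summary.Rmd" "$demo_dir/notes.txt"

# Show me my Rmarkdown files (this directory and everything below it)
rmd_files=$(find "$demo_dir" -name "*.Rmd")
echo "$rmd_files"

# Which files are larger than 1GB? (none in this scratch directory)
big_files=$(find "$demo_dir" -type f -size +1G)
echo "$big_files"

# With more details about the files: run ls -lh on each match
find "$demo_dir" -name "*.Rmd" -exec ls -lh {} \;
```

In everyday use you would replace "$demo_dir" with the folder to search, for example . for the current directory or ~ for your home directory.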

8.5 Getting things done

8.5.1 Some useful, special commands using the Control key

  • Cancel (abort) a command: Ctrl-c
  • Stop (suspend) a command: Ctrl-z
  • Ctrl-z can be used to suspend, then background a process

8.5.2 Process management

  • Like Windows Task Manager, OSX Activity Monitor
  • top, ps, jobs (hit q to get out!)
  • kill to terminate an unwanted job or process
  • Foreground and background: append & to run a command in the background; fg brings it back to the foreground
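
A minimal sketch of the background-and-kill workflow, using sleep as a stand-in for any long-running command. It uses the process ID stored in $! so that it also works inside a script; at an interactive prompt you could use the job number shown by jobs instead (for example kill %1).

```shell
# Start a long-running command in the background with a trailing &
sleep 300 &
pid=$!            # $! holds the process ID of the most recent background command

# List the background jobs started from this shell
jobs

# kill sends a termination signal; wait collects the finished process
kill "$pid"
wait "$pid" 2>/dev/null || true
echo "process $pid stopped"
```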

8.5.3 What about “space”

  • How much storage is available on this system? df -h
  • How much storage am “I” using overall? du -hs <folder>
  • How much storage am “I” using, by subdirectory? du -h <folder>
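
The three questions above can be tried on a small scratch folder. The folder layout and file names below are invented for the example; the reported sizes depend on the file system, so none are shown here.

```shell
# Hypothetical folder with a ~2 MB file in a subdirectory and a small file beside it
demo_dir=$(mktemp -d)
mkdir "$demo_dir/raw"
head -c 2000000 /dev/zero > "$demo_dir/raw/big.dat"
head -c 1000 /dev/zero > "$demo_dir/small.dat"

# Free and used space on the file system that holds the folder
df -h "$demo_dir"

# One total for the whole folder
total=$(du -hs "$demo_dir")
echo "$total"

# One line per subdirectory, plus the folder itself
per_dir=$(du -h "$demo_dir")
echo "$per_dir"
```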

8.6 Uploading Files

You have several options for uploading files to the server. Some, like the RStudio interface, are more convenient when you have only a few files; others, like dedicated file-transfer software and (you guessed it) the CLI, are better suited to uploading many files at once :)

8.6.1 RStudio

You can only upload one file at a time (you can zip a folder to trick it):

8.6.2 SFTP Software

An efficient protocol for uploading files is FTP (File Transfer Protocol). SFTP is its secure counterpart: a separate protocol that runs over an SSH connection. Any software supporting SFTP will work to transfer files.

We recommend the following free software:

8.6.3 scp

The scp command is another convenient way to transfer a single file or directory using the CLI. You can run it from Aurora or from your local computer. Here is the basic syntax:

scp </source/path> <hostname:/path/to/destination/>

Here is an example of uploading the file 10min-loop.R from my laptop to Aurora. The destination directory on Aurora is /home/brun/github_com/NCEAS/nceas-training/materials/files:

If you want to upload an entire folder, you can add the -r option to the command. The general syntax is:

Here is an example uploading all the images in the myplot folder
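
As a sketch of both forms, the block below builds one scp command for a single file and one for a whole folder. The user jdoe, the host aurora.example.org, and the destination directory are placeholders invented for this example. The commands are printed rather than run, since running them needs a real server.

```shell
# Hypothetical values: replace them with your own account, server, and paths
user="jdoe"
host="aurora.example.org"
dest="/home/$user/uploads/"

# Single file: scp </source/path> <hostname:/path/to/destination/>
single_file_cmd="scp 10min-loop.R $user@$host:$dest"

# Entire folder: the -r option copies the directory recursively
folder_cmd="scp -r myplot $user@$host:$dest"

# Print the commands instead of running them, since they need a real server
echo "$single_file_cmd"
echo "$folder_cmd"
```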

8.7 Advanced Topics

8.7.1 Unix systems are multi-user

  • Who else is logged into this machine? who
  • Who is logged into “this shell”? whoami

8.7.2 A sampling of simple commands for dealing with files

  • wc count lines, words, and/or characters
  • diff compare two files for differences
  • sort sort lines in a file
  • uniq report or filter out repeated lines in a file
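
A short sketch of these four commands working on two small text files. The species lists are invented for the example.

```shell
demo_dir=$(mktemp -d)
# Two small hypothetical species lists, one name per line
printf 'otter\nheron\notter\nsalmon\nheron\notter\n' > "$demo_dir/site_a.txt"
printf 'otter\nheron\nkelp\n' > "$demo_dir/site_b.txt"

# wc -l counts lines
line_count=$(wc -l < "$demo_dir/site_a.txt" | tr -d ' ')
echo "$line_count"

# uniq only collapses adjacent duplicates, so sort first; -c adds a count
counts=$(sort "$demo_dir/site_a.txt" | uniq -c)
echo "$counts"

# diff exits with status 1 when the files differ, which is not an error here
sort -u "$demo_dir/site_a.txt" > "$demo_dir/a_sorted.txt"
sort -u "$demo_dir/site_b.txt" > "$demo_dir/b_sorted.txt"
differences=$(diff "$demo_dir/a_sorted.txt" "$demo_dir/b_sorted.txt" || true)
echo "$differences"
```

In the diff output, lines starting with < appear only in the first file and lines starting with > appear only in the second.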

8.7.3 All files have permissions and ownership

  • Change permissions: chmod
  • Change ownership: chown
  • List files showing ownership and permissions: ls -l

          schild@aurora:~/postdoc-training/data$ ls -l
          total 1136
          -rw----r-- 1 schild scientist 1062050 May 29  2007 AT_85_to_89.csv
          -rwxrwxr-x 1 schild scientist   16200 Jun 26 11:20 env.csv
          -rwxr-xr-x 1 schild scientist   23358 Jun 26 11:20 locale.csv
          -rwxrwx--- 1 schild scientist    7543 Jun 26 11:20 refrens.csv
          -rwx------ 1 schild scientist   46653 Jun 26 11:20 sample.csv       
  • Clear contents in terminal window: clear
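
To read the permission strings in the listing above: the first character is the file type (- for a regular file, d for a directory), followed by three groups of rwx for the owner, the group, and everyone else. The sketch below sets permissions with chmod in both its symbolic and numeric forms and reads them back with ls -l. The file name is made up for the example.

```shell
demo_dir=$(mktemp -d)
data_file="$demo_dir/sample.csv"   # hypothetical file name
touch "$data_file"

# Symbolic form: owner may read and write, group may read, others get nothing
chmod u=rw,g=r,o= "$data_file"
perms=$(ls -l "$data_file" | cut -c1-10)
echo "$perms"

# Numeric form: each digit is one group of bits; 7 = rwx, 5 = r-x, 0 = ---
chmod 750 "$data_file"
perms_numeric=$(ls -l "$data_file" | cut -c1-10)
echo "$perms_numeric"
```

chown is left out of the sketch because changing a file's owner generally requires administrator rights.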

8.7.4 Getting help

  • <command> -h, <command> --help
  • man, info, apropos, whereis
  • Search the web!

8.7.5 History

  • See your command history: history
  • Re-run last command: !! (pronounced “bang-bang”)
  • Re-run command number 32 in your history: !32
  • Re-run 5th from last command: !-5
  • Re-run last command that started with ‘c’: !c

8.7.6 Get into the flow, with pipes

stdin, stdout, stderr

  • * is a wildcard that matches zero or more characters (same on Mac and Windows); % is the equivalent wildcard in SQL queries
  • ? matches a single character; _ is the SQL equivalent
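
A minimal sketch of the three streams and the two wildcards, run in a scratch directory with made-up file names. The pipe | connects the stdout of one command to the stdin of the next, while 2> redirects stderr separately.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir"
touch a.csv b.csv ab.csv notes.txt   # hypothetical files

# Pipe: stdout of ls becomes stdin of wc; * matches zero or more characters
csv_count=$(ls *.csv | wc -l | tr -d ' ')
echo "$csv_count"

# ? matches exactly one character, so ab.csv is left out
single_char=$(ls ?.csv | tr '\n' ' ')
echo "$single_char"

# stderr is a separate stream: 2> sends the error message to a file
# instead of the terminal, and leaves stdout untouched
ls no_such_file 2> errors.log || true
cat errors.log
```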

8.7.7 Text editing

8.7.7.1 Some editors

  • vim
  • emacs
  • nano
$ nano .bashrc

8.7.7.2 Let’s look at our text file

  • cat print file(s)
  • head print first few lines of file(s)
  • tail print last few lines of file(s)
  • less “pager” – view file interactively (type q to quit)
  • od -t <type> “octal dump” – view a file’s underlying binary/octal/hexadecimal/ASCII format (for example, od -c shows characters)
  • od is especially useful in searching for hidden characters in your data
  • watch for carriage return \r and new line \n
  • dos2unix and unix2dos
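
The sketch below shows how od exposes hidden carriage returns in a file with Windows line endings. The file contents are invented, and tr is used as a stand-in for dos2unix, which may not be installed on every system.

```shell
demo_dir=$(mktemp -d)
# Hypothetical file saved with Windows line endings (\r\n)
printf 'site,count\r\nA,3\r\nB,5\r\n' > "$demo_dir/dos.csv"

# head and tail show the first or last lines; -n sets how many
head -n 1 "$demo_dir/dos.csv"
tail -n 1 "$demo_dir/dos.csv"

# od -c prints each byte as a character, exposing the hidden \r before each \n
od -c "$demo_dir/dos.csv"

# Stand-in for dos2unix: tr deletes every \r
tr -d '\r' < "$demo_dir/dos.csv" > "$demo_dir/unix.csv"
dos_bytes=$(wc -c < "$demo_dir/dos.csv" | tr -d ' ')
unix_bytes=$(wc -c < "$demo_dir/unix.csv" | tr -d ' ')
echo "$dos_bytes bytes before, $unix_bytes bytes after"
```

The converted file is three bytes shorter, one for each carriage return removed.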

8.7.8 Create custom commands with “alias”

alias lwc='ls *.jpg | wc -l'

You can create a number of custom aliases that are available whenever you log in, by putting commands such as the above in your shell start-up file, e.g. .bashrc
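
A sketch of defining and using the alias in a scratch directory with made-up file names. One assumption to note: bash only expands aliases in scripts when the expand_aliases option is set, so the first line turns it on. At an interactive prompt that line is not needed.

```shell
# Scripts run by bash ignore aliases unless this option is set;
# interactive shells expand them by default
shopt -s expand_aliases 2>/dev/null || true

demo_dir=$(mktemp -d)
cd "$demo_dir"
touch one.jpg two.jpg readme.txt   # hypothetical files

# Define the alias with plain (straight) single quotes
alias lwc='ls *.jpg | wc -l'

# Use it like any other command; here the count is saved to a file
lwc > jpg_count.txt
cat jpg_count.txt
```

Typing alias with no arguments lists the aliases currently defined, and unalias lwc removes this one.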

8.7.9 A sampling of more advanced utilities

  • grep search files for text
  • sed filter and transform text
  • find advanced search for files/directories

8.7.9.1 grep

Show all lines containing “bug” in my R scripts

Now count the number of occurrences per file

Print the names of files that contain bug

Print the lines that don’t contain bug

Print “hidden” dot-files in current directory

$ ls -a | grep '^\.'   
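
The first four grep tasks above might look like this. The two R scripts are invented so the commands have something to search. Note that -c counts matching lines, not individual occurrences within a line.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir"
# Two hypothetical R scripts
printf '# fix the bug in the loop\nx <- 1\n# another bug here\n' > clean.R
printf 'y <- 2\nprint(y)\n' > plot.R

# Show all lines containing "bug" in my R scripts
grep bug *.R

# Count the matching lines per file
per_file=$(grep -c bug *.R)
echo "$per_file"

# Print only the names of files that contain "bug"
with_bug=$(grep -l bug *.R)
echo "$with_bug"

# Print the lines that do not contain "bug" (-v inverts the match)
without_bug=$(grep -v bug clean.R)
echo "$without_bug"
```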

8.7.9.2 sed

Remove all lines containing “bug”!

Call them buglets, not bugs!

Actually, only do this on lines starting with #
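
The three sed tasks above can be sketched as follows, using a made-up R script. Each command prints the edited text to stdout and leaves the original file unchanged.

```shell
demo_dir=$(mktemp -d)
script="$demo_dir/analysis.R"   # hypothetical R script
printf '# known bugs: none\nbugs <- 0\nx <- 1\n' > "$script"

# Remove all lines containing "bug" (d deletes each matching line)
no_bug_lines=$(sed '/bug/d' "$script")
echo "$no_bug_lines"

# Call them buglets, not bugs (s substitutes; g means every match on a line)
all_renamed=$(sed 's/bug/buglet/g' "$script")
echo "$all_renamed"

# Only do this on lines starting with # (an address in front of the command)
comments_renamed=$(sed '/^#/s/bug/buglet/g' "$script")
echo "$comments_renamed"
```

To keep the result, redirect the output to a new file, for example sed 's/bug/buglet/g' analysis.R > analysis_fixed.R.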

8.8 Online resources

Above are just a few of the most useful Linux & Unix commands, based on our experience. There are many more, and together they form a rich toolkit that will serve you for years. They can be used in combination and run from scripts, and they can empower you when using high-end analytical servers or doing repetitive tasks!