R for Biologists, March 8, 2017

We will be using red and green sticky notes today. If you run into problems or have questions, please place the red sticky note on the back of your computer screen and a helper will assist you.

 

If you have not already requested an HCC account under the rcourse998 group, please do so here: https://hcc.unl.edu/new-user-request

If you already have an HCC account and need to be added to the rcourse998 group, please let us know.

If you have not previously set up Duo Authentication, please ask for assistance.

 

Setup Instructions:

Windows:

On Windows, we will use two third-party applications, PuTTY and WinSCP, for demonstration.

PuTTY:   <putty-0.62-installer.exe or http://www.putty.org/>

WinSCP: <winscp515setup.exe or http://winscp.net/eng/download.php>

Mac/Linux:

Mac and Linux users will need to download and install Cyberduck. Detailed information for downloading and setting up Cyberduck can be found here: For Mac/Linux Users

 

R core and RStudio:

We will write scripts locally in RStudio and then upload them to the cluster to run them. This lesson assumes you have R core and RStudio installed. If you do not, you can download them here:

R core: https://cloud.r-project.org/

RStudio: https://www.rstudio.com/products/rstudio/download/

 

Required Packages:

We will also be using the dplyr, ggplot2, and maps packages. If you do not have these installed, please install them now. You can do so by running the following commands in the RStudio console:

install.packages("dplyr")

install.packages("ggplot2")

install.packages("maps")

 

What is a cluster:

(picture courtesy of: http://training.h3abionet.org/technical_workshop_2015/?page_id=403)

 

Linux Commands Reference List:

Command | What it does | Example Uses
ls list: Lists the files and directories located in the current directory
  • ls
  • ls -a
    • shows all the files in the directory, including hidden ones
  • ls -l
    • shows contents in a list format including information such as file size, file permissions and date the file was modified
  • ls *.txt
    • shows all files in the current directory which end with .txt
cd change directory: allows users to navigate into or out of directories
  • cd <folder path>
  • cd folder_name
    • navigates into directory "folder_name" located in the current directory
  • cd ..
    • navigates out of a directory and into the parent directory
  • cd $HOME (or $WORK)
    • navigates to a user's home (or work) directory
mv move: used to move a file or directory to another location
  • mv <current file(s)> <target file(s)>
  • mv * ../
    • moves all files from the current directory into the parent directory
  • mv old_filename new_filename
    • renames the file "old_filename" to "new_filename"
man manual: displays documentation for commands
Note: Use the up and down arrows to scroll through the text. To exit the manual display, press 'q'.
  • man <command name>
  • man ls
    • displays documentation for the ls command
mkdir make directory: creates a directory with the specified name
  • mkdir <new_folder>
    • creates the directory "new_folder" within the current directory
rmdir remove directory: deletes a directory with the specified name
Note: rmdir only works on empty directories.
  • rmdir <folder_name>
    • removes the directory "folder_name" if the directory is empty
  • rmdir *
    • removes all empty directories within the current directory
rm remove: deletes the file or files with the specified name(s)
  • rm <file_name>
    • deletes the file "file_name"
  • rm *
    • deletes all files in the current directory

nano nano text editor: opens the nano text editor
Note: In the menu options shown at the bottom of the editor, ^ indicates the Control (CTRL) key.
  • nano
    • opens the text editor in a blank file
  • nano <file_name>
    • opens the text editor with "file_name" open. If "file_name" does not exist, it will be created when the file is saved
clear clear: clears the screen of all input/output
  • clear
less less: opens an extended view of a file
Note: Use the up and down arrows to scroll through the text. To exit the extended view, press 'q'.
  • less <file_name>
    • opens an extended view of the file "file_name"
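
To see how the commands in the reference list fit together, here is a minimal practice sketch. It is not part of the tutorial files: the directory name "r_workshop_demo" and all file names are made up for illustration, and it uses three helpers that are not in the list above (mktemp, touch, and rm -r). Everything happens inside a temporary directory, so nothing in $HOME or $WORK is changed.

```shell
# Practice run of ls, cd, mv, mkdir, rmdir and rm in a throwaway directory.
# All directory and file names here are hypothetical examples.

# mktemp -d (not in the list above) creates an empty temporary directory.
sandbox="$(mktemp -d)"
cd "$sandbox"

# mkdir: create a directory, then cd into it by name.
mkdir r_workshop_demo
cd r_workshop_demo

# touch (not in the list above) creates empty files to practice on.
touch notes.txt data.txt script.R .hidden_file

# ls: plain listing, listing with hidden files, and long format.
ls
ls -a
ls -l

# ls *.txt: only the files whose names end with .txt.
txt_files="$(ls *.txt)"
echo "$txt_files"

# mv: rename a file, then move it into a new subdirectory.
mv notes.txt lesson_notes.txt
mkdir archive
mv lesson_notes.txt archive/

# cd ..: step out to the parent directory, then back in by name.
cd ..
cd r_workshop_demo

# rm: delete a single file.
rm data.txt

# rmdir only works on empty directories, so empty "archive" first.
rm archive/lesson_notes.txt
rmdir archive

# Only script.R and .hidden_file should be left (plus . and ..).
remaining="$(ls -a)"
echo "$remaining"

# Clean up: leave the sandbox and delete it.
# rm -r (not in the list above) removes a directory and its contents.
cd /
rm -r "$sandbox"
```

You can type the commands one at a time at the prompt, or save them to a file and run it with bash. Be careful with rm and rm -r outside of a practice directory: deleted files do not go to a trash folder.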

 

To download the tutorial files:

 

Take Home Exercise:

Data Analysis in R - Please note that at the bottom of page three, a parenthesis is missing at the end of the last command.

The final code chunk should read:

# Calculate flight age using birthmonth

age <- data.frame(names(acStart), acStart, stringsAsFactors=FALSE)

colnames(age) <- c("TailNum", "acStart")

flights <- left_join(flights, age, by="TailNum")

flights <- mutate(flights, Age = (flights$Year * 12) + flights$Month - flights$acStart)

Attachments:

Big_Data_Example.pdf (application/pdf)
cluster_small2.png (image/png)