WebR Charlie Data Example

Loading Libraries and Data

Libraries and Packages

First, the words “package” and “library” refer to basically the same thing and are used interchangeably.

R comes with built-in functions, but you can access many, many more by loading libraries. Since R is open source, libraries can be written by anyone and can contain very specific functions. To load a library, use the function library(), with the name of the library you want inside the parentheses. However, a library must be installed before you can load it, and to install it you use the function install.packages().

If you are using the server version of R (the web-based version), you should not need to install most packages; you can just load them. If you’re using your own version of R, you may need to install packages first. Think of installing packages like buying a book and loading libraries like getting the book off the shelf. You only need to buy it once (install.packages()), but every time you want to use it, you need to get it off the shelf (library()).

We’ll load the {tidyverse} package, which automatically loads many other libraries for us, including {readr} to read in files and {ggplot2} to make graphs.

One annoying thing to note is that with install.packages() the name of the library has to be in quotes, but then with library() the name is not in quotes. If you need to install a library first, the code would look like:

install.packages("tidyverse")
library(tidyverse)
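
If you rerun a script often, you may not want to "buy the book" every time. The sketch below installs {tidyverse} only when it is missing and then loads it. The helper name needs_install is made up for this example and is not part of R; requireNamespace(), install.packages(), and library() are the real functions.

```r
# Hypothetical helper (not part of R): TRUE when a package is not installed yet.
needs_install <- function(pkg) {
  !requireNamespace(pkg, quietly = TRUE)
}

# Buy the book only if it is not already on the shelf.
if (needs_install("tidyverse")) {
  install.packages("tidyverse")
}

# Take the book off the shelf for this session.
library(tidyverse)
```

On the server version this should skip the install step, assuming {tidyverse} is already installed there.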

It is typical practice in R to put all the necessary library() commands at the top of the script. Even if you don’t need a library until midway through the script, add it at the top anyway. When loading multiple libraries, put each one on its own line as its own command, for example:

library(tidyverse) 
library(readxl) 
library(stats)

# rest of script goes after

Loading Data

There are a number of ways to load data, but they all boil down to knowing where your file is and telling R exactly where to look for it. This means finding the file path.

File Paths

A file path is like the address of where a file lives. It lists each folder that the file is nested in, separated by a / (or a \ on Windows). For example, if your home folder was named Charlie and you had a file called “data.xlsx” in a folder called “project” inside your Documents folder, the file path would be:

Mac/Linux: "/Users/Charlie/Documents/project/data.xlsx"
Windows: "C:\Users\Charlie\Documents\project\data.xlsx"

This shows the broadest information first and then goes through each nested folder until it reaches the file. If you’re unsure where a file is, on a Mac you can right click on the file and choose “Get Info” to see the full file path, and on Windows you can right click and go to “Properties”.

Important to note is that R is case sensitive, so the capitalization in the file name does matter.
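
To see how a path is put together, here is a small sketch that builds the Mac/Linux example path piece by piece and then takes it apart again. It only uses base R functions (file.path(), dirname(), basename()). It also shows two ways to type the Windows example in R code, because R treats a single \ inside quotes as the start of a special character, so a Windows path copied straight from “Properties” will usually cause an error.

```r
# Build a path from its pieces; file.path() joins them with "/".
path <- file.path("/Users", "Charlie", "Documents", "project", "data.xlsx")
path

# Split the path back into the folder and the file name.
dirname(path)
basename(path)

# Two ways to write the Windows example path in R code:
win_forward <- "C:/Users/Charlie/Documents/project/data.xlsx"       # forward slashes
win_doubled <- "C:\\Users\\Charlie\\Documents\\project\\data.xlsx"  # doubled backslashes

# Both spell the same location once the separators are made consistent.
identical(gsub("\\\\", "/", win_doubled), win_forward)
```

Forward slashes are usually the easier choice, since they work in R on Windows, Mac, and Linux alike.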

Also, a nice shortcut is that on Mac/Linux you can use ~ in place of the home folder, so the following two file paths mean the same thing to your computer and to R.

"/Users/Charlie/Documents/project/data.xlsx"
"~/Documents/project/data.xlsx"

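If you are curious what ~ stands for on the machine you are using, base R’s path.expand() will spell it out. This is just a quick check, and the result depends on your computer, so no expected output is shown here.

```r
# The short version of the path, using ~ for the home folder.
short_path <- "~/Documents/project/data.xlsx"

# path.expand() replaces ~ with the full path of the home folder.
full_path <- path.expand(short_path)
full_path
```
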
Different Methods

Depending on which version of R you are using, you will load in the data differently:

If you’re using the Reed server to access R (rstudio.reed.edu), go to Loading to Server. If you’re using your own desktop version of R (meaning you downloaded R and RStudio), go to Loading from Desktop. After either of these, complete the Reading in Data from a File section.

If you are just following along with this document, you don’t need to download anything and can skip to Reading in Data from a URL.

Loading to Server

To add a file to the server, you have to download it to your computer, then upload it to the server. This may seem counter-intuitive or tedious, and there are other ways to load files, but this will likely be the most common way you add files. If you need a file from Moodle or Google Drive, it may not be directly accessible by the server, but if you first download it to your computer, then you’ll be able to add it to the server.

Once you have downloaded the file, take note of where it is on your computer. Maybe you downloaded it to a Downloads folder, maybe not; you can search for the name “charlie_treats.csv” to see where it is. In this example, I have it on my Desktop. Now we’ll move it to the server. (This can be a little confusing because the organization of files on the server has a similar format to that on your computer: both have a home folder that can contain sub-folders.)

To add it from your computer to the server, click the “Files” tab in the lower right pane to see all the files that you have on the server and press the “Upload” button. (If you don’t see an Upload button, press the gear icon and you’ll see more options.) Press “Choose File” to find the csv file on your computer. “Target directory” is where the file will end up on the server. Press “Browse” and then “New Folder” at the bottom right. Name this folder “Workshop”, press “Choose”, then press “OK”.

The file path for this file is now "~/Workshop/charlie_treats.csv".

Loading from Desktop

Once you have downloaded the file, take note of where it is on your computer. Maybe you downloaded it to a Downloads folder, maybe not; you can search for the name “charlie_treats.csv” to see where it is. Move the file to whatever folder you would like it to be in. In this example, I created a folder called “Workshop” on my Desktop.

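The file path for this file is now "~/Desktop/Workshop/charlie_treats.csv" for Mac/Linux and "C:\Users\<YourUsername>\Desktop\Workshop\charlie_treats.csv" for Windows, where you would replace <YourUsername> with the actual name of your home folder. When you type the Windows path into R, use forward slashes ("C:/Users/<YourUsername>/Desktop/Workshop/charlie_treats.csv"), because R treats a single \ inside quotes as the start of a special character.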
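
Before trying to read the file, it can help to ask R whether it can actually see it. The sketch below uses the Mac/Linux example location from above; swap in your own path (for example "~/Workshop/charlie_treats.csv" on the server). file.exists() and list.files() are base R functions.

```r
# The example location from above; replace with your own path if it differs.
treats_path <- "~/Desktop/Workshop/charlie_treats.csv"

# TRUE means R can find the file; FALSE usually means a typo,
# wrong capitalization, or a file that lives in a different folder.
file.exists(treats_path)

# List everything in the folder to check the exact spelling of the file name.
list.files("~/Desktop/Workshop")
```

If file.exists() returns FALSE, compare the names printed by list.files() with what you typed, paying attention to capitalization.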

Reading in Data from a File

Now that we know the file path that tells us where the file is located, we can use the command read_csv() to load it in and store it as an object named ‘treats’. Be sure to include the quotes and the .csv extension.

# load the data into this doc by pasting your file path, in quotes, inside the parentheses
# example: treats <- read_csv("~/Desktop/Workshop/charlie_treats.csv")

treats <- read_csv()

If you read it in successfully, you will see some output in the console like in the picture below. It is just a brief overview of what was loaded, so as long as it doesn’t say “Error” you can ignore it.
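
If you want to try read_csv() without the workshop file, the sketch below writes a tiny stand-in CSV to a temporary location and reads it back. The column names and values are made up for illustration and are not the real contents of charlie_treats.csv; the object is called demo so it does not overwrite treats.

```r
library(readr)

# Made-up stand-in data: NOT the real contents of charlie_treats.csv.
demo_path <- tempfile(fileext = ".csv")
writeLines(c("treat,count", "biscuit,3", "carrot,5"), demo_path)

# Same pattern as above: a quoted path ending in .csv goes inside read_csv().
demo <- read_csv(demo_path)

# Look at what was loaded.
demo
nrow(demo)
names(demo)
```

The message that read_csv() prints lists each column and the type it guessed. In recent versions of {readr} you can silence it by adding show_col_types = FALSE inside the parentheses.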

Reading in Data from a URL

If you are only following along with this document and not using RStudio, you can read the data in from a Github repository by using the url() command inside the read_csv() command. (Make sure you have loaded your library first!)
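
As a rough sketch of that pattern, the function below wraps url() inside read_csv(). The function name read_treats_from_url is made up for this example, and the address in the commented-out line is only a placeholder, not the real location of the workshop data, so replace it with the link you were given.

```r
library(readr)

# Hypothetical helper (not part of readr): read a CSV from a web address.
read_treats_from_url <- function(csv_url) {
  # url() opens a connection to the address; read_csv() reads from it.
  read_csv(url(csv_url))
}

# Placeholder address only; swap in the real link before running this line.
# treats <- read_treats_from_url("https://example.com/path/to/charlie_treats.csv")
```

read_csv() will also accept a web address in quotes directly, so wrapping it in url() is a matter of preference.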

Note: If you read it in successfully, you will see some red output in the console like in the picture below. This is actually okay; I don’t know why they chose to show it in red. It is just a brief overview of what was loaded, so as long as it doesn’t say “Error” you can ignore it.

After reading in the data, you should see an object called treats appear in your Variable Environment.
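
You can also check from the console instead of looking for it by eye. This is a minimal sketch using base R’s exists() and ls(); it runs whether or not the data has been loaded yet.

```r
# TRUE once the read_csv() step has run in this session, FALSE otherwise.
exists("treats")

# Names of every object currently in your environment.
ls()

# If treats is there, peek at its first few rows.
if (exists("treats")) {
  head(treats)
}
```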