In week 1 we talked about getting started in R, the role it can play for psychology, and made our first attempt to learn how to use the language. We went through these slides and sections, and the “homework” exercise was to try to make sure we all have a basic understanding of these sections:
So the place we’ll start this week is with “revision” (which in my experience is a terrible name to describe something important… which isn’t just about revisiting something you already know, but also a mechanism for talking about stuff that didn’t make sense the first time). Here are a few exercises that I’d like you to try:
Write a script that does the following
Save your script to a file like week2_ex1.R
Write a new script called week2_ex2.R (or whatever) that does the following
names
ages
Make sure that the TurtleGraphics package is installed on your machine, by typing library(TurtleGraphics) at the console. If it works, great! Move on to Exercise 4. If it does not, here are the commands we need:
install.packages("devtools")
library(devtools)
install_github("djnavarro/TurtleGraphics")
library(TurtleGraphics)
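If you'd rather not reinstall the package every time, one option is to check whether it is already available first. This is only a sketch: `is_installed()` and `ensure_turtle()` are made-up helper names (they are not part of any package), and it assumes `devtools` is already installed.

```r
# Hypothetical helper: TRUE if the package can be loaded, FALSE otherwise.
# requireNamespace() does not attach the package and does not throw an error.
is_installed <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

# Hypothetical helper: install TurtleGraphics from GitHub only if it is missing.
# Assumes devtools is installed; uses the same repository as the commands above.
ensure_turtle <- function() {
  if (!is_installed("TurtleGraphics")) {
    devtools::install_github("djnavarro/TurtleGraphics")
  }
  invisible(is_installed("TurtleGraphics"))
}

# ensure_turtle()   # uncomment to run the check (may need internet access)
```

The install step is wrapped in a function so that nothing is downloaded until you call it yourself.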
Create a new script called week2_turtle.R. It should do this:
TurtleGraphics package
turtle_init()
A question you should consider: why did I ask you to include line 1, given that you’ve already done exercise 3???
The main goal this week is to cover some key R concepts and programming ideas. We’ll go through these sections in turn, each of which ends with some exercises.
At the end of this, we reach the point that the turtle draws a pretty picture. Your main exercise here is to try to modify what picture it draws!
I’m not sure how this will all work out time-wise, but if we do have time we’ll follow this up by getting started on the “working with data” section of the notes. We’ll start at the prelude, and talk briefly about some of the data types.
Here are some additional exercises:
library(tidyverse)
Reading a data frame from an online CSV file. We’re taking a tidyverse approach so strictly speaking we have a tibble rather than a pure data frame!
books <- read_csv(file = "http://psyr.djnavarro.net/data/booksales.csv")
##
## ── Column specification ────────────────────────────────────────
## cols(
## Month = col_character(),
## Days = col_double(),
## Sales = col_double(),
## Stock.Levels = col_character()
## )
class(books)
## [1] "spec_tbl_df" "tbl_df" "tbl" "data.frame"
head(books)
## # A tibble: 6 x 4
## Month Days Sales Stock.Levels
## <chr> <dbl> <dbl> <chr>
## 1 January 31 0 high
## 2 February 28 100 high
## 3 March 31 200 low
## 4 April 30 50 out
## 5 May 31 0 out
## 6 June 30 0 high
“Inside” a data frame are just regular vectors:
print(books$Sales)
## [1] 0 100 200 50 0 0 0 0 0 0 0 0
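Because a column is just a vector, anything you can do with a vector works on it. Here's a small sketch using only the first six rows shown above, typed in by hand so that it runs without an internet connection. `books_small` is a name made up for this example, and it uses base R's `data.frame()` rather than a tibble.

```r
# First six rows of Month and Sales, copied from the head(books) output above
books_small <- data.frame(
  Month = c("January", "February", "March", "April", "May", "June"),
  Sales = c(0, 100, 200, 50, 0, 0),
  stringsAsFactors = FALSE
)

# Pull out one column; books_small[["Sales"]] gives the same vector
sales_vec <- books_small$Sales

is.numeric(sales_vec)             # the column is an ordinary numeric vector
sum(sales_vec)                    # total sales over these six months
books_small$Month[sales_vec > 0]  # months in which anything was sold
```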
To see how data frames are just regular vectors bound together, create one:
names <- c("Granny","Nanny","Magrat")
ages <- c(70, 70, 30)
family <- tibble(names, ages)
print(family)
## # A tibble: 3 x 2
## names ages
## <chr> <dbl>
## 1 Granny 70
## 2 Nanny 70
## 3 Magrat 30
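To convince yourself that nothing magical happened to the vectors, you can pull them back out and compare them with the originals. This sketch uses base R's `data.frame()` so it runs without the tidyverse loaded; as far as these checks go, a tibble made with `tibble(names, ages)` should behave the same way. The `ages_next_year` column is just an invented example.

```r
names <- c("Granny", "Nanny", "Magrat")
ages <- c(70, 70, 30)

# Bind the two vectors together as columns
family <- data.frame(names, ages, stringsAsFactors = FALSE)

# The columns that come back out are the same vectors that went in
identical(family$names, names)
identical(family$ages, ages)

# One row per element, one column per vector
nrow(family)
ncol(family)

# Adding a column is just assigning another vector of the same length
family$ages_next_year <- family$ages + 1
```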
A data manipulation exercise: we have data from multiple people, but some of the files might be missing cases! First read one data set to take a look:
subj1 <- read_csv(file = "http://psyr.djnavarro.net/data/subj1.csv")
##
## ── Column specification ────────────────────────────────────────
## cols(
## response = col_double(),
## word = col_character()
## )
print(subj1)
## # A tibble: 10 x 2
## response word
## <dbl> <chr>
## 1 1 blah
## 2 2 blah
## 3 3 blah
## 4 4 blah
## 5 5 blah
## 6 6 blah
## 7 7 blah
## 8 8 blah
## 9 9 blah
## 10 10 blah
Next, define a function to “check” whether a data frame has the correct number of cases:
check_file <- function(dataset) {
  n_cases <- nrow(dataset)   # number of cases (rows) in the data frame
  is_okay <- n_cases == 10   # file is okay if it has 10 observations
  return(is_okay)
}
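Before pointing the function at real files, it's worth trying it on data frames where you already know the answer. The two data frames below are invented for this test, and `check_cases()` is a hypothetical variant (not part of the original exercise) that takes the expected number of cases as an argument instead of hard-coding 10.

```r
# Same logic as the function above: TRUE when there are exactly 10 rows
check_file <- function(dataset) {
  n_cases <- nrow(dataset)
  is_okay <- n_cases == 10
  return(is_okay)
}

# Made-up test data: one complete data set, one with missing cases
complete <- data.frame(response = 1:10, word = "blah")
incomplete <- data.frame(response = 1:7, word = "blah")

check_file(complete)     # should be TRUE
check_file(incomplete)   # should be FALSE

# Hypothetical variant: the expected number of cases is an argument
check_cases <- function(dataset, expected = 10) {
  nrow(dataset) == expected
}

check_cases(incomplete, expected = 7)
```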
Create a vector listing the files we want to check:
file_list <- c(
  "http://psyr.djnavarro.net/data/subj1.csv",
  "http://psyr.djnavarro.net/data/subj2.csv",
  "http://psyr.djnavarro.net/data/subj3.csv",
  "http://psyr.djnavarro.net/data/subj4.csv"
)
Write a loop that checks the files one at a time:
for(file in file_list) {
  dat <- read_csv(file)
  is_okay <- check_file(dat)
  if( !is_okay ) {
    print(file)
  }
}
## [1] "http://psyr.djnavarro.net/data/subj3.csv"
Make sure all three code fragments are in a single file and run it!
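Printing the bad file names is fine for four files, but you might prefer to keep them in a vector so you can use them later. Here is one possible variation. To make it runnable offline it writes three small invented CSV files to a temporary folder (the second one is deliberately too short) and reads them with base R's `read.csv()` instead of `read_csv()`; `bad_files` is a name chosen for this sketch.

```r
check_file <- function(dataset) {
  nrow(dataset) == 10
}

# Create three made-up data files in a temporary folder
tmp_dir <- tempfile("subj_demo")
dir.create(tmp_dir)
sizes <- c(10, 8, 10)   # the second file is missing two cases
file_list <- file.path(tmp_dir, paste0("subj", 1:3, ".csv"))
for (i in seq_along(file_list)) {
  dat <- data.frame(response = seq_len(sizes[i]), word = "blah")
  write.csv(dat, file_list[i], row.names = FALSE)
}

# Same loop as before, but collect the problem files instead of printing them
bad_files <- character(0)
for (file in file_list) {
  dat <- read.csv(file)
  if (!check_file(dat)) {
    bad_files <- c(bad_files, file)
  }
}

basename(bad_files)   # file names only, without the temporary folder path
```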
Eek, what if there are 400 files? (There actually are!) I refuse to type all of that into a long list, so use text manipulation to build the file names instead. The code below checks the first 20:
file_list <- paste0("http://psyr.djnavarro.net/data/subj", 1:20, ".csv")
for(file in file_list) {
  dat <- read_csv(file)
  is_okay <- check_file(dat)
  if( !is_okay ) {
    print(file)
  } else {
    print("ok")
  }
}
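If the `paste0()` line looks mysterious, it helps to look at what it produces before using it in a loop. The trick is that `paste0()` is vectorised: the single-string pieces are recycled against the vector `1:20`, giving one file name per number. The `base_url` variable and the `sprintf()` alternative are additions for this sketch, and nothing here touches the internet.

```r
# Hypothetical variable holding the part of the URL that never changes
base_url <- "http://psyr.djnavarro.net/data/subj"

# One string per number: subj1.csv, subj2.csv, ..., subj20.csv
file_list <- paste0(base_url, 1:20, ".csv")

length(file_list)   # how many file names were built
file_list[1]        # the first one
file_list[20]       # the last one

# sprintf() is another way to build the same names
sprintf("%s%d.csv", base_url, 1:3)
```

Changing `1:20` to `1:400` would build the full list in exactly the same way.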