Hello, beautiful creature!
When I have a bad day, I fantasise about going back to doing my data analysis in Excel 🤦♀️.
Please don’t tell my supervisor.
While planning my experimental design, I was a passionate filter user. Sometimes I needed to know which nest boxes I should start monitoring if a female had already laid eggs. Sometimes I needed to check which nests I should visit first to collect my samples. For that, I needed to know whether and when the eggs had hatched 🐣.
It was nice and easy to check it in Excel.
Until it wasn’t. At some point, some nest boxes were empty, some had eggs, whereas parents were already raising chicks in others. Note-taking started consuming too much of my time. I also didn’t work alone, so I needed to document my thought process step-by-step.
That is why it may be worth making friends with some ⭐logical operators⭐.
Let’s start by grabbing the dataset here.
First, install the packages if you don’t have them already:
install.packages("readr")
install.packages("dplyr")
And load them:
library(dplyr) # please note the (lack of) quotation marks
library(readr)
Then read in the dataset we will be working with:
blue_tits <- read_csv("paste_the_path_to_the_file_here")
One thing you need to know is that R doesn’t like single backslashes in file paths, because it treats the backslash as an escape character inside quoted text. If you are copying an absolute path on a Windows machine, e.g. "C:\Users\DarwinianLass\Documents\Coding\data\blue_tits.csv", you will have to change it to "C:/Users/DarwinianLass/Documents/Coding/data/blue_tits.csv".
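If you are curious, here is a small sketch of the two spellings R accepts for that same path: forward slashes, or doubled backslashes. The variable names are made up for this illustration, and nothing here touches a real file.

```r
# Two ways to write the same Windows path in an R string
path_forward <- "C:/Users/DarwinianLass/Documents/Coding/data/blue_tits.csv"
path_doubled <- "C:\\Users\\DarwinianLass\\Documents\\Coding\\data\\blue_tits.csv"

# cat() prints the string as R stores it, i.e. with single backslashes
cat(path_doubled, "\n")

# Swapping every backslash for a forward slash gives the first version
path_converted <- gsub("\\", "/", path_doubled, fixed = TRUE)
identical(path_converted, path_forward)
```

Either spelling can be handed to read_csv(); forward slashes are simply less typing.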
To get the gist of what the dataset looks like, you can type:
head(blue_tits)
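If you don’t have the file at hand, a tiny made-up table is enough to follow along. This is only a sketch: toy_tits, the Nest_box column and every value in it are hypothetical, and only the Laying_day and Hatching_day column names come from the real dataset.

```r
library(dplyr)

# Hypothetical values for illustration only; not the real blue tit data
toy_tits <- tibble(
  Nest_box     = c("A1", "A2", "A3", "A4", "A5"),  # made-up column
  Laying_day   = c(10, 13, 15, 18, 20),            # 1 = the first day of April
  Hatching_day = c(35, NA, 40, 43, NA)             # NA = no chicks (yet)
)

head(toy_tits, 3)   # first three rows only
glimpse(toy_tits)   # column names and types at a glance
nrow(toy_tits)      # how many nests there are
```

head() shows the first six rows by default; the second argument changes that number.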
Here, we will start playing with some useful logical operators, such as AND, OR and IS NOT.
As I said, in my work, I sometimes want to know how many birds laid eggs after or before a given date. To do so, I type:
eggs_after13 <- filter(blue_tits, Laying_day > 13)
Since 1 in my dataset represents the first day of April, the eggs_after13 data frame will only contain rows with nests where eggs appeared after the 13th of April.
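The same pattern works for “before” and for “on or after”. Here is a minimal sketch on a made-up table (toy_tits, Nest_box and all the values are hypothetical) showing how >, >= and < differ around day 13.

```r
library(dplyr)

# Hypothetical values for illustration only
toy_tits <- tibble(
  Nest_box   = c("A1", "A2", "A3", "A4", "A5"),
  Laying_day = c(10, 13, 15, 18, 20)   # 1 = the first day of April
)

after_13  <- filter(toy_tits, Laying_day > 13)    # strictly after the 13th
from_13   <- filter(toy_tits, Laying_day >= 13)   # the 13th itself is included
before_13 <- filter(toy_tits, Laying_day < 13)    # strictly before the 13th

nrow(after_13)    # 3 nests: days 15, 18 and 20
nrow(from_13)     # 4 nests: day 13 joins in
nrow(before_13)   # 1 nest: day 10
```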
If you want to check for nest boxes where birds laid eggs on the 13th OR 18th day of April, try:
eggs_13or18 <- filter(blue_tits, Laying_day == 13 | Laying_day == 18)
The funny thing is, “equal to” is written in this context as a double “=” (that is, “==”), and OR as “|”.
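Two things tend to trip people up with OR, so here is a short sketch on a made-up table (toy_tits, Nest_box and the values are hypothetical). First, the column name has to be repeated on both sides of “|”. Second, when the list of days gets long, %in% does the same job with less typing.

```r
library(dplyr)

# Hypothetical values for illustration only
toy_tits <- tibble(
  Nest_box   = c("A1", "A2", "A3", "A4", "A5"),
  Laying_day = c(10, 13, 15, 18, 20)
)

# Spelled out with OR
either_or <- filter(toy_tits, Laying_day == 13 | Laying_day == 18)

# Same question asked with %in%
either_in <- filter(toy_tits, Laying_day %in% c(13, 18))
identical(either_or$Nest_box, either_in$Nest_box)

# A common slip: "| 18" on its own counts as TRUE, so every row is kept
oops <- filter(toy_tits, Laying_day == 13 | 18)
nrow(oops)   # 5, not 2
```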
If you would like to see the list of nest boxes where birds laid eggs after the 13th of April AND their chicks hatched (in other words, the hatching day IS NOT “NA”), type:
chicks_after13 <- filter(blue_tits, Laying_day > 13 & Hatching_day != "NA")
Note that “!=” (typed together, with no space in between) basically means “≠”.
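One caveat worth knowing: by default, read_csv() turns empty cells and the text NA into true missing values rather than the word "NA". The filter above still drops those rows, because comparing a missing value with anything gives NA and filter() only keeps rows where the condition is TRUE. A more explicit way to ask “is it missing?” is is.na(). The sketch below uses a made-up table (toy_tits, Nest_box and the values are hypothetical).

```r
library(dplyr)

# Hypothetical values for illustration only
toy_tits <- tibble(
  Nest_box     = c("A1", "A2", "A3", "A4", "A5"),
  Laying_day   = c(10, 13, 15, 18, 20),
  Hatching_day = c(35, NA, 40, 43, NA)   # NA = no chicks (yet)
)

# Comparing with "NA" gives NA (not FALSE) wherever the value is missing
toy_tits$Hatching_day != "NA"

# is.na() answers TRUE or FALSE, and ! flips it to "is NOT missing"
hatched_after13 <- filter(toy_tits, Laying_day > 13 & !is.na(Hatching_day))
hatched_after13$Nest_box   # "A3" "A4"

# Nests with eggs laid after the 13th that have no hatching day yet
waiting_after13 <- filter(toy_tits, Laying_day > 13 & is.na(Hatching_day))
waiting_after13$Nest_box   # "A5"
```

Separating conditions with a comma inside filter() works the same way as “&”, if you find that easier to read.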
That’s all for today!
Or almost all.
If you reached the end of this tutorial, you deserve to know that RStudio in fact has some built-in filters you can use to quickly view (but not save!) things 😉. To do this, you have to click here in RStudio:
And then click here:
See you soon!
Aga
PS: Here is the survey in which you can tell me what R topic you find particularly confusing and why you want to learn it so that we can shape this space together!