LM 1 & 2: Single Predictor
...with solutions

Practical 09

Published

April 12, 2021

Welcome to LM!

This week, and for the rest of this term (and next year!), we’ll be primarily working with the Linear Model. This week we’re drawing on what we’ve learned in the lectures and practicals over the last two weeks to start creating linear models ourselves.

Setup

Task 1 (punk!)

Set up as usual:

I Wanna Be the Very Best

The theme for today’s task is inspired by a weird Internet thing that happened in the spring of 2014, called Twitch Plays Pokemon. A programmer in Australia hooked up the classic Pokemon Red game to the chat room on video game streaming site Twitch. By typing commands in the chat, viewers could control what the character in the game did - but on the scale of thousands or tens of thousands of participants at once. The game turned into a massive social experiment and even spawned a minor cult before the cumulative 1.16 million viewers beat the game in about two and a half weeks.[1]

To Catch Them Is My Real Test, To Train Them Is My Cause

If you aren’t really familiar with Pokemon, all you need to know for today is that it’s a series of video games (later TV show, graphic novels, etc. etc.) that take place in an alternate universe with magical animals called Pokemon. Pokemon battling (sort of like magical dogfighting?) is a core part of this universe, with children setting out at a young age to travel the world, capture many different types of Pokemon, and compete in massive championships, with the winner being crowned a Pokemon Master. It’s a bit ethically murky because at least some Pokemon seem to be sentient, and they can all talk, but only to say their own species name. You know, as I’m writing this description, it just keeps sounding weirder…

Figure 1: A Pokemon called Pikachu (a mouselike yellow fantasy creature). You may have heard of him.

For this week’s stats practical, we’re doing a Twitch-Plays-Pokemon-style walkthrough of a dataset of Pokemon characteristics to practice the linear model. You don’t need to know anything about Pokemon to do this practical besides what’s in the box above, but if you weren’t in the live practical session, you might want to watch the recording before you get started.

Task 2 (punk!)

Read in the pokemon dataset at the following link, which is a copy of this Pokemon dataset.

Link: https://and.netlify.app/docs/pokemon.csv
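One way to do this, as a minimal sketch: it assumes you’re using the tidyverse (so read_csv() is available), and the object name pokemon is just my choice, not something the task requires.

```r
library(tidyverse)

# URL as given in the task; `pokemon` is an arbitrary object name
pokemon <- read_csv("https://and.netlify.app/docs/pokemon.csv")

# Quick check of what came in: column names and types
glimpse(pokemon)
```

It’s worth checking the glimpse() output for the columns you plan to use. The rest of these solutions assume the dataset has columns called attack and hp.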

Making Predictions

First we need to choose a research question to investigate.

As future Pokemon Masters, we want to work out which Pokemon is the strongest. So, we’ll keep using attack as our outcome variable, which quantifies a Pokemon’s offensive capabilities.

Task 3 (punk!)

For the predictor, choose from one of the following:

Choose a different predictor from the one used in the guided part of the practical. For the purposes of the solutions in this worksheet, I’ll use hp.

Task 4 (punk!)

In your RMarkdown file, write down your prediction about the relationship between your chosen predictor and the outcome, attack.

Visualising the Relationship

Next up, we should have a look at our data.

Task 5 (Prog-rocK)

Create a scatterplot of your chosen predictor and attack as the outcome.

Task 5.1

If you like, label the axes better and clean up the plot with a theme.

Task 5.2

Stop and have a look at your plot. How would you draw the line of best fit? Is this the direction of relationship you expected?

Task 5.3

Add a line of best fit to your plot. Is this what you expected?

Task 5.4

Optionally, add another line to your plot that represents the null model.

Hint

Have a look at geom_hline, or the code for the plots in the lecture!
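Here’s a rough sketch of how Tasks 5 to 5.4 could fit together in one plot, using hp as the predictor. The axis labels, theme, transparency, and object names are my own choices rather than the “official” answer, and the code assumes the columns are called hp and attack.

```r
library(tidyverse)

pokemon <- read_csv("https://and.netlify.app/docs/pokemon.csv")

poke_plot <- pokemon %>%
  ggplot(aes(x = hp, y = attack)) +
  # Task 5: one point per Pokemon (alpha makes overlapping points easier to see)
  geom_point(alpha = 0.4) +
  # Task 5.3: line of best fit from a linear model
  geom_smooth(method = "lm", formula = y ~ x) +
  # Task 5.4: the null model is a flat line at the mean of the outcome
  geom_hline(yintercept = mean(pokemon$attack, na.rm = TRUE),
             linetype = "dashed") +
  # Task 5.1: tidier labels and theme
  labs(x = "HP", y = "Attack") +
  theme_minimal()

poke_plot
```

The dashed line ignores the predictor entirely, so the further the fitted line tilts away from it, the more the predictor is doing.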

Creating the Model

Now that we have some idea of what to expect, we can create our linear model!

Task 6 (punk!)

Use the lm() function to run your analysis and save the model in a new object called poke_lm.
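A minimal sketch with hp as the predictor, assuming the dataset was read into an object called pokemon (my name for it) and that the columns are called attack and hp:

```r
library(tidyverse)

pokemon <- read_csv("https://and.netlify.app/docs/pokemon.csv")

# Formula: outcome on the left of ~, predictor on the right
poke_lm <- lm(attack ~ hp, data = pokemon)

# Printing the object shows the call and the two coefficients
poke_lm
```

If you chose a different predictor, swap it in for hp in the formula.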

Task 7 (punk!)

Call this poke_lm object in the Console to view it, then write out the linear model from the output.
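As a template for writing it out (assuming hp is the predictor), the model has the general shape below. The number printed under (Intercept) in the output goes in for b0, and the number printed under hp goes in for b1. I’ve left the values out here since they depend on your own output.

```latex
\widehat{\text{attack}}_i = b_0 + b_1 \times \text{hp}_i
```

The hat over attack is just a reminder that this is the predicted value for Pokemon i, not the observed one.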

Task 8 (Prog-rocK)

How can you interpret the value of b₁ for this model? Write down your thoughts in your RMarkdown.
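If it helps to see it rather than just read it, here is a small sketch showing that the slope is the change in predicted attack for a one-unit increase in the predictor. The hp values of 50 and 51 are arbitrary, made-up inputs chosen only because they differ by exactly 1, and the object names are mine.

```r
library(tidyverse)

pokemon <- read_csv("https://and.netlify.app/docs/pokemon.csv")
poke_lm <- lm(attack ~ hp, data = pokemon)

# Predicted attack for two hypothetical Pokemon whose hp differs by exactly 1
preds <- predict(poke_lm, newdata = tibble(hp = c(50, 51)))

# The gap between the two predictions...
slope_from_preds <- unname(preds[2] - preds[1])

# ...is the same number as the coefficient for hp
b1 <- coef(poke_lm)[["hp"]]

slope_from_preds
b1
```

Any other pair of values one unit apart would give the same gap, because the model is a straight line.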

Task 9 (punk!)

Using your equation, what attack value would you predict a Pokemon with 86 HP to have?
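You can do this with a calculator, but here is a sketch of two ways to get the same answer in R: plugging 86 into the equation by hand, and asking predict() to do it. The object names are my own, and the code assumes the model was fitted with hp as the predictor.

```r
library(tidyverse)

pokemon <- read_csv("https://and.netlify.app/docs/pokemon.csv")
poke_lm <- lm(attack ~ hp, data = pokemon)

# Named vector of coefficients: "(Intercept)" is b0, "hp" is b1
b <- coef(poke_lm)

# By hand: b0 + b1 * 86
by_hand <- b[["(Intercept)"]] + b[["hp"]] * 86

# Same thing via predict(), which needs a data frame with a column named hp
by_predict <- predict(poke_lm, newdata = tibble(hp = 86))

by_hand
by_predict
```

The two numbers should match; if they don’t, check that the column name in newdata is exactly the predictor name used in the formula.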

Evaluating the Model

Now we have the model parameters, but we don’t want to just describe the line - we want to be able to say something about the population, not just our sample. For this, we need some more info!

Significance Testing

Task 10 (Prog-rocK)

Use broom::tidy() to get p-values for your bs. Is your predictor significant?

Task 11 (punk!)

Add the conf.int = TRUE argument to broom::tidy() to get confidence intervals. How can you interpret these? Do they “agree” with the p-value?
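A sketch covering both Task 10 and Task 11, assuming hp as the predictor. The two extra columns, excludes_zero and sig, are helpers I’ve made up to put the “agreement” side by side; they aren’t part of broom’s output.

```r
library(tidyverse)

pokemon <- read_csv("https://and.netlify.app/docs/pokemon.csv")
poke_lm <- lm(attack ~ hp, data = pokemon)

poke_tidy <- broom::tidy(poke_lm, conf.int = TRUE) %>%
  mutate(
    # Does the 95% CI (the default level) sit entirely on one side of 0?
    excludes_zero = conf.low > 0 | conf.high < 0,
    # Is the p-value below the conventional .05 threshold?
    sig = p.value < .05
  )

poke_tidy
```

For each row, excludes_zero and sig should tell the same story: a 95% confidence interval that doesn’t contain 0 goes with a p-value below .05, and vice versa.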

Goodness of Fit

Task 12 (punk!)

Use broom::glance() to get R² and adjusted R² for this model. How much of the variance of the outcome is explained by the model for this sample? What would we expect this to be in the population?
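A minimal sketch, again assuming hp as the predictor; poke_fit is just my name for the result. glance() returns a single row of model-level statistics, so I’ve picked out the two columns the task asks about.

```r
library(tidyverse)

pokemon <- read_csv("https://and.netlify.app/docs/pokemon.csv")
poke_lm <- lm(attack ~ hp, data = pokemon)

# One row of model-level fit statistics
poke_fit <- broom::glance(poke_lm)

# r.squared describes this sample; adj.r.squared is the estimate for the population
poke_fit %>%
  select(r.squared, adj.r.squared)
```

Multiply either value by 100 to read it as a percentage of variance in attack. Adjusted R² will come out a little smaller than R².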

Reading the Summary

Finally, we can get all this same information - except for CIs - from the summary() function. I (Jennifer) like summary() because you can get a good overview of a lot of information quickly, but it’s very inconvenient for reporting, so it’s good to know how to use the broom functions as well.

Task 13 (punk!)

Get a summary of your model. What do the stars mean?
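A short sketch, assuming the same hp model as in the rest of these solutions (poke_summary is my own object name):

```r
library(tidyverse)

pokemon <- read_csv("https://and.netlify.app/docs/pokemon.csv")
poke_lm <- lm(attack ~ hp, data = pokemon)

# Coefficients, standard errors, t-values, p-values, and R-squared in one printout
poke_summary <- summary(poke_lm)
poke_summary
```

The stars are a shorthand for the size of each p-value, and the “Signif. codes” line at the bottom of the coefficients table is the legend: three stars for p < .001, two for p < .01, one for p < .05, and a dot for p < .1.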

I Know It’s My Destiny

That’s as far as we’ll go today - we’ll practice reporting models after the break, when we have multiple predictors to explore.

The Linear Model is all of our destinies for the next year or so, so it’s important to get comfortable working with it. Feel free to explore this dataset further if you’d like more practice over the holiday!

Footnotes

  1. I (Jennifer) was doing my first year of PhD in Edinburgh, actually sat at the desk next to our own Milan Valasek (the universe is weird like that!). Because I spent a lot of my time on the Internet instead of, y’know, doing my work, I watched a lot of this unfold in real time. It was even stranger than this makes it sound.