When the response variable is right skewed, many assume regression becomes difficult, and skewed data is generally regarded as problematic. However, the GLM framework provides two options for right-skewed response variables: the gamma and inverse Gaussian distributions. With these two families, a right-skewed response variable is actually helpful.

The critical step is being able to spot a gamma distribution when you see one. The theoretical skewness is \(\frac{2}{\sqrt{shape}}\). If shape is small, the gamma distribution is strongly right skewed. As shape increases, the gamma becomes more symmetrical.

```
library(GlmSimulatoR)
library(ggplot2)
library(dplyr)
library(stats)
set.seed(1)
#Very right skewed. Skewness 2
Gamma <- rgamma(1000, shape = 1, scale = 1)
temp <- tibble(gamma = Gamma)
ggplot(temp, aes(x=gamma)) +
geom_histogram(bins = 30)
```

```
#Very right skewed and spread out more. Skewness 2
Gamma <- rgamma(1000, shape = 1, scale = 5)
temp <- tibble(gamma = Gamma)
ggplot(temp, aes(x=gamma)) +
geom_histogram(bins = 30)
```

```
#Hump moves slightly towards the middle. Skewness 1.414214
Gamma <- rgamma(1000, shape = 2, scale = 1)
temp <- tibble(gamma = Gamma)
ggplot(temp, aes(x=gamma)) +
geom_histogram(bins = 30)
```

```
#Hump moves slightly more towards the middle. Skewness 1.154701
Gamma <- rgamma(1000, shape = 3, scale = 1)
temp <- tibble(gamma = Gamma)
ggplot(temp, aes(x=gamma)) +
geom_histogram(bins = 30)
```
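The skewness values quoted in the comments above can be checked numerically. The following is a minimal sketch in base R and is not part of the original walkthrough: `sample_skewness` is a hypothetical helper defined here, and the sample size of 100,000 is an arbitrary choice made so the estimates are reasonably stable.

```r
set.seed(1)

# Hypothetical helper: moment-based sample skewness
sample_skewness <- function(x) {
  centered <- x - mean(x)
  mean(centered^3) / mean(centered^2)^1.5
}

shapes <- c(1, 2, 3)

# Theoretical skewness of a gamma distribution: 2 / sqrt(shape)
theoretical <- 2 / sqrt(shapes)

# Empirical skewness from a large simulated sample for each shape
empirical <- sapply(shapes, function(s) {
  sample_skewness(rgamma(100000, shape = s, scale = 1))
})

skew_check <- data.frame(shape = shapes, theoretical, empirical)
print(skew_check)
```

Scale does not appear in the formula, which is why the first two histograms share a skewness of 2 even though one is far more spread out.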

To show that the generalized linear model can handle skewness, let's make some data and train a model, then calculate the mean squared error.
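One way this could look is sketched below using only base R. The predictors, coefficients, log link, shape of 2, and 80/20 train/test split are all assumptions chosen for illustration; they are not taken from the original analysis, which may instead have relied on simulation helpers from GlmSimulatoR.

```r
set.seed(1)
n <- 10000

# Assumed predictors and true coefficients (illustrative only)
x1 <- runif(n, 1, 2)
x2 <- runif(n, 1, 2)
mu <- exp(0.5 + 1 * x1 + 0.5 * x2)  # log link: log(mu) is linear in x

# Gamma response with mean mu; mean = shape * scale
shape <- 2
y <- rgamma(n, shape = shape, scale = mu / shape)
sim_gamma <- data.frame(y, x1, x2)

# Hold out 20% of the rows for testing
train_idx <- sample(n, size = 0.8 * n)
train <- sim_gamma[train_idx, ]
test <- sim_gamma[-train_idx, ]

# Fit a gamma GLM with the same link used to simulate the data
fit_gamma <- glm(y ~ x1 + x2, data = train, family = Gamma(link = "log"))

# Mean squared error on held-out data, next to a mean-only baseline
pred <- predict(fit_gamma, newdata = test, type = "response")
mse_glm <- mean((test$y - pred)^2)
mse_baseline <- mean((test$y - mean(train$y))^2)

print(coef(fit_gamma))
print(c(glm = mse_glm, baseline = mse_baseline))
```

Because a gamma response has variance proportional to the square of its mean, the mean squared error stays large in absolute terms even for a well-specified model, so comparing it against the mean-only baseline is more informative than reading it in isolation.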

Above, we saw the gamma distribution take on many different shapes. The inverse Gaussian distribution is not as flexible. It tends to maintain its skewness for a variety of parameters.

```
library(statmod)
set.seed(1)
Invgauss <- rinvgauss(1000, mean = 1, shape = .2)
temp <- tibble(Invgauss = Invgauss)
ggplot(temp, aes(x=Invgauss)) +
geom_histogram(bins = 30)
```

```
Invgauss <- rinvgauss(1000, mean = 1, shape = 1)
temp <- tibble(Invgauss = Invgauss)
ggplot(temp, aes(x=Invgauss)) +
geom_histogram(bins = 30)
```

```
Invgauss <- rinvgauss(1000, mean = 1, shape = 3)
temp <- tibble(Invgauss = Invgauss)
ggplot(temp, aes(x=Invgauss)) +
geom_histogram(bins = 30)
```

```
Invgauss <- rinvgauss(1000, mean = 10, shape = .2)
temp <- tibble(Invgauss = Invgauss)
ggplot(temp, aes(x=Invgauss)) +
geom_histogram(bins = 30)
```
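The persistence of skewness can be made concrete with the standard result that an inverse Gaussian distribution with a given mean and shape has skewness \(3\sqrt{mean / shape}\). That formula is not stated in the text above and is brought in here as an assumption; the sketch simply evaluates it for the four parameter settings plotted.

```r
# Parameter settings used in the four histograms above
params <- data.frame(
  mean  = c(1, 1, 1, 10),
  shape = c(0.2, 1, 3, 0.2)
)

# Theoretical inverse Gaussian skewness: 3 * sqrt(mean / shape)
params$skewness <- 3 * sqrt(params$mean / params$shape)
print(params)
```

Even the most symmetric of these settings (mean 1, shape 3) has a skewness above 1.7, whereas a gamma with shape 3 is already down to roughly 1.15.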

Similar to above, let's create data and train a model, then calculate the mean squared error.
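A minimal sketch of that workflow follows, using `rinvgauss` from statmod to draw the response and base R's `glm` with the `inverse.gaussian` family to fit it. The single predictor, its coefficients, the log link, the shape of 10, and the 80/20 split are assumptions made for illustration and do not come from the original analysis.

```r
library(statmod)  # provides rinvgauss
set.seed(1)
n <- 10000

# Assumed predictor and true coefficients (illustrative only)
x1 <- runif(n, 0, 1)
mu <- exp(0.2 + 1 * x1)  # log link: log(mu) is linear in x1

# Inverse Gaussian response with observation-specific means
y <- rinvgauss(n, mean = mu, shape = 10)
sim_ig <- data.frame(y, x1)

# Hold out 20% of the rows for testing
train_idx <- sample(n, size = 0.8 * n)
train <- sim_ig[train_idx, ]
test <- sim_ig[-train_idx, ]

# Fit an inverse Gaussian GLM with the same link used to simulate
fit_ig <- glm(y ~ x1, data = train,
              family = inverse.gaussian(link = "log"))

# Mean squared error on held-out data, next to a mean-only baseline
pred <- predict(fit_ig, newdata = test, type = "response")
mse_glm <- mean((test$y - pred)^2)
mse_baseline <- mean((test$y - mean(train$y))^2)

print(coef(fit_ig))
print(c(glm = mse_glm, baseline = mse_baseline))
```

The log link is used here rather than the family's default so that the fitted means stay positive regardless of the predictor values.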