# Algebra

Vectors and matrices can be used to compute new vectors (matrices) with simple and intuitive algebraic expressions.

## Vector algebra

We have two vectors, `a` and `b`.

```
a <- 1:5
b <- 6:10
```

Multiplication works element by element. That is, `a[1] * b[1]`, `a[2] * b[2]`, etc.

```
d <- a * b
a
## [1] 1 2 3 4 5
b
## [1] 6 7 8 9 10
d
## [1] 6 14 24 36 50
```

The examples above illustrate a special feature of *R* not found in most
other programming languages. This is that you do not need to ‘loop’ over
elements in an array (vector in this case) to compute new values. It is
important to use this feature as much as possible. In other programming
languages you would need to write a *for-loop* to achieve the above
(for-loops do exist in *R*. They are very important and are discussed in
a later chapter).
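
To make the contrast concrete, here is a minimal sketch of the same multiplication written once as a loop and once in vectorized form. The names `d_loop` and `d_vec` are only illustrative and do not appear elsewhere in this chapter.

```r
a <- 1:5
b <- 6:10

# Loop version: pre-allocate the result, then fill it element by element
d_loop <- numeric(length(a))
for (i in seq_along(a)) {
  d_loop[i] <- a[i] * b[i]
}

# Vectorized version: one expression, no explicit loop
d_vec <- a * b

# Both approaches give the same values
all(d_loop == d_vec)
```

The vectorized form is shorter and, for long vectors, typically much faster.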

You can also multiply a vector with a single number.

```
a * 3
## [1] 3 6 9 12 15
```

In the examples above the computations used either vectors of the same
length, or one of the vectors had length 1. You can use algebraic
computations with vectors of different lengths, as the shorter ones will
be “recycled”. *R* only issues a warning if the length of the longer
vector is not a multiple of the length of the shorter object. This is a
great feature when you need it, but it may also make you overlook errors
when your data are not what you think they are.

```
a + c(1,10)
## Warning in a + c(1, 10): longer object length is not a multiple of shorter
## object length
## [1] 2 12 4 14 6
```

No warning here:

```
1:6 + c(0,10)
## [1] 1 12 3 14 5 16
```
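
One way to picture recycling is that the shorter vector is repeated until it is as long as the longer one. The sketch below makes that explicit with `rep`, and adds a simple length check you could use to guard against accidental recycling. The names `long`, `short` and `short_recycled` are assumptions made for this illustration.

```r
long <- 1:6
short <- c(0, 10)

# Repeat the shorter vector until it matches the length of the longer one
short_recycled <- rep(short, length.out = length(long))

# Adding the explicitly repeated vector gives the same result as recycling
identical(long + short, long + short_recycled)

# A guard: TRUE only when the shorter length divides the longer one evenly,
# which is the case in which R recycles without a warning
length(long) %% length(short) == 0
```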

### Logical comparisons

It is very common in computer programs to test for (in)equality or whether a value is greater or smaller than another value.

Recall that `==` is used to test for equality:

```
a <- 1:5
b <- 6:10
a == 2
## [1] FALSE TRUE FALSE FALSE FALSE
```

And inequality is evaluated with `!=`

```
a != 2
## [1] TRUE FALSE TRUE TRUE TRUE
```

“Less than or equal” is `<=`, and “more than or equal” is `>=`.

```
a < 3
## [1] TRUE TRUE FALSE FALSE FALSE
b >= 9
## [1] FALSE FALSE FALSE TRUE TRUE
```

`&` is Boolean “AND”, and `|` is Boolean “OR”.

```
a
## [1] 1 2 3 4 5
b
## [1] 6 7 8 9 10
b > 6 & b < 8
## [1] FALSE TRUE FALSE FALSE FALSE
# combining a and b
b > 9 | a <= 2
## [1] TRUE TRUE FALSE FALSE TRUE
```
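
The result of a comparison is itself a vector, so it can be passed to other functions. The sketch below shows a few common uses with the same `a` and `b`; the name `big` is only illustrative.

```r
a <- 1:5
b <- 6:10

# Logical vector marking the elements of a that are greater than 2
big <- a > 2

# TRUE counts as 1 and FALSE as 0, so sum() counts the matches
sum(big)

# Use the logical vector to select the matching elements of a
a[big]

# any() and all() reduce a logical vector to a single TRUE or FALSE
any(b > 9)
all(b > 5)
```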

### Functions

There are many functions that allow us to do vectorized algebra. For example:

```
sqrt(a)
## [1] 1.000000 1.414214 1.732051 2.000000 2.236068
exp(a)
## [1] 2.718282 7.389056 20.085537 54.598150 148.413159
```

Not all functions return a vector of the same length. The following functions return just one or two numbers:

```
min(a)
## [1] 1
max(a)
## [1] 5
range(a)
## [1] 1 5
sum(a)
## [1] 15
mean(a)
## [1] 3
median(a)
## [1] 3
prod(a)
## [1] 120
sd(a)
## [1] 1.581139
```

If you cannot guess what `prod` and `sd` do, look them up in the help
files (e.g. `?sd`).
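
As a check on your reading of the help files, the sketch below computes both quantities by hand and compares them with the built-in functions. The names `prod_by_hand` and `sd_by_hand` are made up for this example.

```r
a <- 1:5

# prod() multiplies all elements together: 1 * 2 * 3 * 4 * 5
prod_by_hand <- 1
for (x in a) {
  prod_by_hand <- prod_by_hand * x
}

# sd() is the sample standard deviation: the square root of the sum of
# squared deviations from the mean, divided by (n - 1)
sd_by_hand <- sqrt(sum((a - mean(a))^2) / (length(a) - 1))

# Compare with the built-in functions
prod_by_hand == prod(a)
all.equal(sd_by_hand, sd(a))
```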

### Random numbers

It is common to create a vector of random numbers in data analysis, and also to create example data to demonstrate how a procedure works. To get 10 numbers sampled from the uniform distribution between 0 and 1 you can do

```
r <- runif(10)
r
## [1] 0.3506256 0.3939491 0.9509510 0.1066483 0.9347601 0.3461621 0.5330606
## [8] 0.5387943 0.7147179 0.4057905
```

For Normally distributed numbers, use `rnorm`

```
r <- rnorm(10, mean=10, sd=2)
r
## [1] 7.950903 10.646013 12.087225 11.466181 9.091726 8.688436 9.928155
## [8] 12.138323 9.032050 9.757980
```

If you run the functions above, you will get different numbers than the
ones shown here. After all, they are random numbers! Modern data
analysis methods use a lot of randomization. This can make it a
challenge to exactly reproduce results. To allow for exact reproduction
of examples or real data analysis, we often want to ensure that we take
exactly the *same* random sample each time we run our code. To do that
we use `set.seed`. This function initializes the random number
generator (to a specific point in a very long but fixed sequence of
numbers). This is illustrated below.

```
set.seed(12)
runif(2)
## [1] 0.06936092 0.81777520
runif(3)
## [1] 0.9426217 0.2693819 0.1693481
runif(4)
## [1] 0.03389562 0.17878500 0.64166537 0.02287774
set.seed(12)
runif(1)
## [1] 0.06936092
runif(2)
## [1] 0.8177752 0.9426217
set.seed(12)
runif(3)
## [1] 0.06936092 0.81777520 0.94262173
runif(5)
## [1] 0.26938188 0.16934812 0.03389562 0.17878500 0.64166537
```

Note that each time after `set.seed` is called, the same sequence of
random numbers is generated. This is a very important feature, as it
allows us to exactly reproduce results that involve random sampling. The
seed number is arbitrary; a different seed number will give a different
sequence.

```
set.seed(999)
runif(3)
## [1] 0.38907138 0.58306072 0.09466569
runif(5)
## [1] 0.85263123 0.78674676 0.11934226 0.60644699 0.08095691
```

Setting the seed removes the small variation between runs, so you can be sure that everything still works as before. You may wonder how to choose the value of the seed. You could take the date (e.g. “20210329”), but it should not really matter. If you notice that your data analysis gives materially different results based on your choice of the seed, then you need to reconsider what you are doing, as your results are not stable (or you may need to run the analysis many times and summarize across runs).
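
The two points above, exact reproduction and stability across seeds, can be sketched as follows. The seeds 1 to 20, the sample size of 1000 and the names `first`, `second`, `seeds` and `means` are arbitrary choices for this illustration, not recommendations.

```r
# Same seed, same draws: identical() confirms an exact match
set.seed(12)
first <- runif(5)
set.seed(12)
second <- runif(5)
identical(first, second)

# A rough stability check: repeat the same computation under several seeds
seeds <- 1:20
means <- sapply(seeds, function(s) {
  set.seed(s)
  mean(rnorm(1000, mean = 10, sd = 2))
})

# If the analysis is stable, the results should span a narrow range
range(means)
```

Here the quantity of interest is just a sample mean, so the range should be narrow. A wide range for your own analysis would suggest that the result depends too much on the particular random sample.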

## Matrices

Computation with matrices is also ‘vectorized’. For example, with matrix
`m` you can do `m * 5` to multiply all values of `m` by 5, or do
`m^2` or `m * m` to square the values of `m`.

```
# set up an example matrix
m <- matrix(1:6, ncol=3, nrow=2, byrow=TRUE)
m
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
m * 2
##      [,1] [,2] [,3]
## [1,]    2    4    6
## [2,]    8   10   12
m^2
##      [,1] [,2] [,3]
## [1,]    1    4    9
## [2,]   16   25   36
```

We can also do math with a matrix and a vector. Note, again, that
computation with matrices in *R* is column-wise, and that shorter
vectors are recycled.

```
m * 1:2
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    8   10   12
```
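
To see why the result looks like this, it may help to spell out the recycling. The sketch below expands `1:2` into a matrix of the same shape as `m`, filled column by column; the name `v` is only illustrative.

```r
m <- matrix(1:6, ncol = 3, nrow = 2, byrow = TRUE)

# as.vector() shows the order in which R walks through the values:
# down the first column, then the second, and so on
as.vector(m)
## [1] 1 4 2 5 3 6

# Recycle 1:2 to the length of the matrix, then fill a matrix column-wise
v <- matrix(rep(1:2, length.out = length(m)), nrow = nrow(m))
v

# Multiplying by the expanded matrix matches m * 1:2
identical(m * 1:2, m * v)
```

Every column of `v` is `1, 2`, so the first row of `m` is multiplied by 1 and the second row by 2.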

Can you predict the result of this multiplication?

```
m * 1:4
```

You can multiply two matrices.

```
m * m
##      [,1] [,2] [,3]
## [1,]    1    4    9
## [2,]   16   25   36
```

Note that this is “cell by cell” multiplication. For ‘matrix
multiplication’ in the mathematical sense, you need to use the `%*%`
operator.

```
m %*% t(m)
##      [,1] [,2]
## [1,]   14   32
## [2,]   32   77
```
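
As a small worked check of what `%*%` computes, the sketch below reproduces one element of the product by hand. The name `p` is made up for this example.

```r
m <- matrix(1:6, ncol = 3, nrow = 2, byrow = TRUE)

# The inner dimensions must match: (2 x 3) %*% (3 x 2) gives a 2 x 2 matrix
dim(m)
dim(t(m))
p <- m %*% t(m)
dim(p)

# Element [1, 2] is the sum of the products of row 1 of m and
# column 2 of t(m): 1 * 4 + 2 * 5 + 3 * 6
sum(m[1, ] * t(m)[, 2])
p[1, 2]
```

Both of the last two expressions give 32, the value in row 1, column 2 of the product.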