# Functions

We have now used many functions that come with *R*, for example `c`, `matrix`, `read.csv`, and `sum`. Functions are always used (‘called’) by typing their name, followed by parentheses. In most, but not all, cases you supply ‘arguments’ within the parentheses. If you do not type the parentheses, the function is not called. Instead, either the function definition or some type of reference to it is shown.

## Existing functions

To see the content of a function, type its name:

```
nrow
## function (x)
## dim(x)[1L]
## <bytecode: 0x000000000cf1ebc8>
## <environment: namespace:base>
```

We see that `nrow` has a single argument called `x`. It calls another function, `dim`, to which it provides the same argument (`x`), and returns the first element of the result (`[1L]`; recall that adding `L` to a number is a way to create an integer). Can you guess how `ncol` is implemented? (See for yourself if you are right!) Now, let’s see what `dim` looks like.

```
dim
## function (x) .Primitive("dim")
```

It is a ‘primitive’ (low-level) *R* function that we cannot easily learn more about. You could, by looking at the source code of *R*, but that is well outside the scope of this tutorial.

To run (instead of inspect) `nrow`, we add parentheses:

```
nrow()
## Error in nrow(): argument "x" is missing, with no default
```

But this fails, because the function requires a valid argument, like this:

```
m <- matrix(1:6, nrow=2, ncol=3, byrow=TRUE)
nrow(m)
## [1] 2
```

Note that `nrow(m)` is equivalent to

```
nrow(x=m)
## [1] 2
```

because the first argument of `nrow` is called `x`.
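The same rule applies to functions with several arguments. As a small sketch using `matrix` (whose first three arguments are `data`, `nrow` and `ncol`), the calls below should all produce the same matrix; the names `m1`, `m2` and `m3` are only used for this illustration.

```r
# All three calls build the same 2 x 3 matrix.
m1 <- matrix(1:6, 2, 3)                  # arguments matched by position
m2 <- matrix(data=1:6, nrow=2, ncol=3)   # arguments matched by name
m3 <- matrix(ncol=3, nrow=2, data=1:6)   # named arguments can be in any order
identical(m1, m2)
## [1] TRUE
identical(m1, m3)
## [1] TRUE
```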

## Writing functions

*R* comes with thousands of functions for you to use. Nevertheless, it is often necessary to write your own functions. For example, you may want to write a function to:

- more clearly describe and isolate a particular task in your data analysis workflow.
- re-use code. Rather than repeating the same steps several times (e.g. for each of 200 cases you are analysing), you can write a function that gets called 200 times. This should lead to faster development of scripts and to fewer mistakes. And if there is a mistake it only needs to be fixed in one place.
- create a function that is an argument to another function (!). This is quite commonly done when using ‘apply’ type functions (see next chapter).
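To give an idea of the last point, here is a minimal sketch in which a function is passed as an argument to `sapply`. The function name `square` is made up for this illustration, and the syntax used to define it is explained in the rest of this section.

```r
# a small function that multiplies a number by itself
square <- function(v) {
    v * v
}
# sapply calls square once for each element of 1:3
sapply(1:3, square)
## [1] 1 4 9
# the function can also be written directly in the call, without a name
sapply(1:3, function(v) v + 10)
## [1] 11 12 13
```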

Writing your own functions is not difficult. Below is a very simple function. It is called `f`. This is an entirely arbitrary name; you could also call it `myFirstFunction`. It takes no arguments, and always returns ‘hello’.

```
f <- function() {
    return('hello')
}
```

Look carefully at how we assign a function to the name `f` using the `function` keyword followed by parentheses that enclose the arguments (there are none in this case). The *body* of the function is enclosed in braces (also known as “curly brackets” or “squiggly brackets”).

Now that we have the function, we can inspect it, and use it.

```
#inspect
f
## function() {
## return('hello')
## }
## <environment: 0x000000001057f2b8>
#use 2 times
f()
## [1] "hello"
f()
## [1] "hello"
```

`f` is a very boring function. It takes no arguments and always returns the same result. Let’s make it more interesting.

```
f <- function(name) {
    x <- paste('hello', name)
    return(x)
}
f('Jasmin')
## [1] "hello Jasmin"
```

Note the `return` statement. This indicates that variable `x` (which is only known inside the function) is returned to the caller of the function. Simply typing `x` as the last line would also suffice, and ending the function with `paste('hello', name)` would also do! So the function below is equivalent but shorter, at the expense of being less explicit.

```
f <- function(name) {
    paste('hello', name)
}
f('Sviatoslav')
## [1] "hello Sviatoslav"
```
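That a variable created inside a function is only known inside that function can be checked with a small sketch. The function name `greet` is made up for this illustration; it does the same as `f` above.

```r
x <- 'outside'
greet <- function(name) {
    x <- paste('hello', name)   # this x exists only inside greet
    x                           # the value of the last line is returned
}
greet('Jasmin')
## [1] "hello Jasmin"
# the x that was created outside of the function has not changed
x
## [1] "outside"
```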

Here is a function that returns a random sequence of letters. The length is determined by argument `n`.

```
frs <- function(n) {
    s <- sample(letters, n, replace=TRUE)
    r <- paste0(s, collapse='')
    return(r)
}
```

Because the function uses randomization, I use `set.seed` to always get the same result (as we discussed earlier).

```
set.seed(0)
frs(5)
## [1] "xgjox"
frs(5)
## [1] "fxyrq"
x <- frs(10)
x
## [1] "bferjumszj"
```
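To see what `set.seed` does, the sketch below sets the same seed before each of two calls. The two results should then be identical, whatever the actual letters are (these may differ between versions of *R*). The names `a` and `b` are only used for this illustration.

```r
frs <- function(n) {
    s <- sample(letters, n, replace=TRUE)
    paste0(s, collapse='')
}
set.seed(0)
a <- frs(5)
# set the same seed again, so that the same letters are sampled
set.seed(0)
b <- frs(5)
identical(a, b)
## [1] TRUE
```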

Now an example of a function that manipulates numbers. This function squares the sum of two numbers.

```
sumsquare <- function(a, b) {
    d <- a + b
    dd <- d * d
    return(dd)
}
```

We can now use the `sumsquare` function. Note that it is vectorized (each argument can be more than one number).

```
sumsquare(1,2)
## [1] 9
x <- 1:3
y <- 5
sumsquare(x,y)
## [1] 36 49 64
```
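When both arguments are vectors of the same length, the computation is done element by element. A small sketch, with `sumsquare` repeated (in a shorter form) so that the example can be run on its own:

```r
sumsquare <- function(a, b) {
    d <- a + b
    d * d
}
# (1+4)^2, (2+5)^2, (3+6)^2
sumsquare(1:3, 4:6)
## [1] 25 49 81
```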

You can name the arguments when using a function; that often makes your intentions clearer.

```
sumsquare(a=1, b=2)
## [1] 9
```

But the names must match:

```
sumsquare(a=1, d=2)
## Error in sumsquare(a = 1, d = 2): unused argument (d = 2)
```

And both arguments need to be present:

```
sumsquare(1:5)
## Error in sumsquare(1:5): argument "b" is missing, with no default
```

That is, unless we redefine the function with default arguments, which are used if a value for the argument is not provided.

```
sumsquareD <- function(a=0, b=1) {
    d <- a + b
    dd <- d * d
    return(dd)
}
sumsquareD(1:5, 2)
## [1] 9 16 25 36 49
```

As both arguments have a default value, we can call `sumsquareD` without providing arguments:

```
sumsquareD()
## [1] 1
```

Or with a single argument:

```
sumsquareD(5)
## [1] 36
```

Above, the value `5` was assigned to argument `a` because the argument was matched “by position”. If we only want to provide a value for `b`, we need to match “by name”.

```
sumsquareD(b=3)
## [1] 9
```
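Matching by name and by position can also be combined. In *R*, named arguments are matched first, and the remaining values are then assigned by position to the arguments that are still open. A short sketch, with `sumsquareD` repeated (in a shorter form) so that the example can be run on its own:

```r
sumsquareD <- function(a=0, b=1) {
    d <- a + b
    d * d
}
# b is matched by name; the unnamed value 2 goes to a
sumsquareD(b=3, 2)
## [1] 25
# named arguments can be given in any order
sumsquareD(b=2, a=1)
## [1] 9
```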

Just another example, a function to compute the number of unique values in a vector:

```
nunique <- function(x) {
    length(unique(x))
}
data <- c('a', 'b', 'a', 'c', 'b')
nunique(data)
## [1] 3
```

Of course, these were toy examples, but if you understand these, you should be able to write much longer and more useful functions. It can be difficult to “debug” (find errors in) a function. It is often best to first write the sequence of commands that you need outside a function, and only when it all works, wrap that code inside a function block (`function( ) { }`).
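This workflow could look like the sketch below. The task (computing the difference between the largest and smallest value) and the names `v`, `lo`, `hi` and `spread` are made up for this illustration.

```r
# step 1: write and test the commands outside of a function
v <- c(3, 8, 1)
lo <- min(v)
hi <- max(v)
hi - lo
## [1] 7

# step 2: once this works, wrap the same commands in a function
spread <- function(v) {
    lo <- min(v)
    hi <- max(v)
    hi - lo
}
spread(c(3, 8, 1))
## [1] 7
spread(c(10, 2, 5))
## [1] 8
```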

## Ellipses (…)

The ellipsis (`...`) is a special argument of many functions. It allows you to pass optional additional arguments and/or arguments that are passed on to other functions. Consider these two functions (this is a bit advanced).

```
f1 <- function(x, y=10) {
    x * y
}
# f2 calls f1
f2 <- function(x, ...) {
    f1(x, ...)
}
f2(5)
## [1] 50
f2(5, y=5)
## [1] 25
```

Even though `f2` does not have an argument `y`, it can be provided, and it is passed on to `f1`. This call returns an error:

```
f2(5, z=5)
## Error in f1(x, ...): unused argument (z = 5)
```

because `f1` does not have an argument `z`.
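The same mechanism can be used to pass arguments on to functions that come with *R*. The sketch below wraps `mean`; the name `avg` is made up for this illustration, and `na.rm` is an argument of `mean` that removes missing values before computing the result.

```r
avg <- function(x, ...) {
    mean(x, ...)
}
v <- c(1, NA, 3)
# without additional arguments, the missing value leads to NA
avg(v)
## [1] NA
# na.rm is not an argument of avg, but it is passed on to mean
avg(v, na.rm=TRUE)
## [1] 2
```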

## Functions overview

A list of frequently used functions that we discuss in this introduction to *R*:

- `c`, `cbind`, `rbind`
- `length`, `dim`, `nrow`, `ncol`
- `sum`, `mean`, `prod`, `sqrt`
- `apply`, `sapply`, `tapply`, `aggregate`
- `rowSums`, `rowMeans`
- `merge`, `reshape`

Also see this cheatsheet