Flow control

Most programs have two general control-flow features: iteration and alternation. Iteration is done via “loops”, alteration via “if-then-else” branches.

Looping

Loops are typically used to repeat the same code a number of times for a set of cases. For example, compute the average grade for each student.

In R we avoid loops wherever we can, as they tend to be slower than ‘vectorized’ computation. We also generally prefer functions like apply (see the previous chapter), as this is more concise. At first code using for-loops may seem easier to read, but after using R for a while, the reverse is true is most cases. Nevertheless, there are cases where loops are much easier to write and clearer to read than using vectorized approaches.

There are two types of loops: ‘for-loops’ and ‘while-loops’. A ‘for-loop’ repeats the same code a predefined number of times. A ‘while-loop’ continues until a certain condition has been met (and is therefore prone to the dreaded “infinite loop” that never finishes!)

for-loops

Here is a basic for-loop that does not do anything useful. The trick is that in the parenthesis after for, you define a sequence of values. This sets the number of repetitions (the length of the sequence) and potentially provides a value that changes what is done in each loop.

Note that the braces { } are used to open and close a ‘block’ of code.

for (i in 1:3) {
    print('hi')
}
## [1] "hi"
## [1] "hi"
## [1] "hi"

Now let’s do something with i. You normally do not use print inside a loop. I am only doing that here to illustrate what is going on.

j <- 0
for (i in 1:3) {
    print(i)
    j <- j + i
}
## [1] 1
## [1] 2
## [1] 3
j
## [1] 6

The loop above was used to sum the values 1, 2, and 3. Of course, it would have been easier to use sum(1:3).

Another example.

for (i in 1:3) {
    txt <- paste('the square of', i, 'is', i * i)
    print(txt)
}
## [1] "the square of 1 is 1"
## [1] "the square of 2 is 4"
## [1] "the square of 3 is 9"

The example below is a bit more complex. It shows how iterator i is typically used in a loop (to get a single case from a collection of cases and compute a new value for that case).

s <- 0
a <- 1:6
b <- 6:1
# initialization of output variables
res <- vector(length=length(a))
# i goes from 1 to 6 (the length of b)
for (i in 1:length(b)) {
  s <- s + a[i]
  res[i] <- a[i] * b[i]
}
s
## [1] 21
res
## [1]  6 10 12 12 10  6

Again, for this simple problem, it would have been simpler to do s <- sum(a) and res <- a * b.

break and next

Sometimes you want to include a condition to either “break out” of a for loop, or to skip the remainder of the for block and go to the next iteration. How to do that is illustrated below:

for (i in 1:10) {
    if (i %in% c(1,3,5,7)) {
        next
    }
    if (i > 8) {
        break
    }
    print(i)
}
## [1] 2
## [1] 4
## [1] 6
## [1] 8

while-loops

“while-loops” are not nearly as common as “for-loops”. Here is an example.

i <- 0
while (i < 4) {
   print(paste(i, 'and counting ...'))
   i <- i + 1
}
## [1] "0 and counting ..."
## [1] "1 and counting ..."
## [1] "2 and counting ..."
## [1] "3 and counting ..."

And one that is less predictable, as it depends on the value of a random number.

set.seed(1)
i <- 0
while(i < 0.5) {
    i <- runif(1)
    print(i)
}
## [1] 0.2655087
## [1] 0.3721239
## [1] 0.5728534

You can also combine while with break.

set.seed(1)
while(TRUE) {
    i <- runif(1)
    print(i)
    if (i > 0.5) {
        break
    }
}
## [1] 0.2655087
## [1] 0.3721239
## [1] 0.5728534

Branching

Branching is an important mechanism in computer programs. A branch allows you to execute some code if certain conditions are met, and do something else in other cases. This is illustrated below. Note that the braces { } are used to open and close a ‘block’ of code.

We have two variables, x and y

x <- 5
y <- 10

We want to change y, depending on the value of x.

We need to branch our R code using one or more conditional statements (if, then, else) and some boolean logic (a statement that can evaluate to TRUE or FALSE.

if (x == 5) {
    y <- 15
}
y
## [1] 15

We tested for one condition, x==5. If this condition evaluated to TRUE, the code within the block, { y <- 15 } is executed. If it evaluates to FALSE, the code within the block is ignored. Note that the expression within the parenthesis if(), or else() should always evaluate to a single value of either TRUE or FALSE (not to NA or to multiple values).

Here is a more complex example, where we evaluate three cases. x > 20 (x is larger than 20), x >= 5 & x < 10 (x is in between 5 and 10, including 5, but not 10), and all other cases.

if (x > 20) {
    y <- y + 2
} else if (x > 5 & x < 10) {
    y <- y - 1
} else {
    y <- x
}
y
## [1] 5

If we have a boolean variable

b <- TRUE

You can do

if (b == TRUE) {
    print('hello')
}
## [1] "hello"

But it is more elegant to do

if (b) {
    print('hello')
}
## [1] "hello"

Now combining the previous chapter with this one, a for loop with an if/else branch:

a <- 1:5
f <- vector(length=length(a))
for (i in 1:length(a)) {
    if (a[i] > 2) {
        f[i] <- a[i] / 2
    } else {
        f[i] <- a[i] * 2
    }
}
f
## [1] 2.0 4.0 1.5 2.0 2.5