1. Introduction

R is perhaps the most powerful computer environment for data analysis that is currently available. R is both a computer language that allows you to express instructions, and a program that responds to these instructions. R has core functionality to read and write files, manipulate and summarize data, run statistical tests and models, make fancy plots, and much more. This core functionality is extended by thousands of packages (plug-ins). Some of these packages provide more advanced generic functionality; others provide cutting-edge methods that are only used in highly specialized analysis.

Because of its versatility, R has become very popular among data analysts in many fields, from agronomy to bioinformatics, ecology, finance, geography, pharmacology and psychology. You can read about it in this article in Nature or in the New York Times. So you probably should learn R if you want to do modern data analysis, be a successful researcher, collaborate, get a high-paying job, and so on. If you are not that much into data analysis but want to learn programming, I would suggest that you learn Python instead.

This document provides a concise introduction to R. It emphasizes what you need to know to be able to use the language in any context. There is no fancy statistical analysis here; we just present the basics of the R language itself. We do not assume that you have done any computer programming before (but we do assume that you think it is about time you did). Experienced R users need not read this, but the material may be useful if you want to refresh your memory, for example because you have not used R much, or not recently, or if you feel confused.

When going through the material, it is very important to follow Norman Matloff’s advice: “When in doubt, try it out!” That is, copy the examples shown, and then make modifications to test whether you can predict what will happen. Only then will you really understand what is going on. You are learning a language, and you will have to practice a lot to become good at it. You also have to accept that you will be stumbling for a while.
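As a small illustration of this advice, here is a minimal sketch in R. The variable names (`x`, `m`, `y`) and the numbers are made up for illustration and are not taken from the later chapters. Run it as is, then change the values and try to predict the result before you run it again.

```r
# Create a small numeric vector (illustrative values only)
x <- c(2, 4, 6, 8)

# Compute and show its mean
m <- mean(x)
m

# Now modify the example: before running the next lines,
# try to predict what the mean of the doubled values will be
y <- x * 2
mean(y)
```

If the second result is not what you predicted, that is the moment to go back and work out why.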

Before going to the next chapters, we suggest that you first go through the R Code School. This will introduce some of the key concepts in an easy and interactive way in less than an hour. This text will then reinforce some of what you learned and take it further. If you like the interactive online experience, you might also try DataCamp’s introduction to R.

To work with R on your own computer, you need to download and install it. I recommend that you also install RStudio. RStudio is a separate program that ‘wraps around’ R to make it easier to use. Here is a video that shows how to work in RStudio.

If you have trouble with the material presented here, you could consult additional resources to learn R. There are many free resources on the web, including R for Beginners by Emmanuel Paradis and this tutorial. Or consult this brief overview by Ross Ihaka (one of the originators of R) from his Information Visualization course. You can also pick up an introductory R book such as A Beginner’s Guide to R by Zuur, Ieno and Meesters, R in a Nutshell by Joseph Adler, or Norman Matloff’s The Art of R Programming.

There is also a lot of very good material on rstatistics.net.

If you want to take it easy, or perhaps learn while you commute in a packed train, you could watch some Google Developers videos.

If none of this appeals to you, and you are already experienced with R or have done a lot of programming in other languages, skip all of this and have a look at Hadley Wickham’s Advanced R.

You can download this manual as a PDF.