Learn:

- How to get and run R
- R Syntax
- Arithmetic calculations and mathematical functions
- Data types
- Vectors

Dormann, C. (2013). Parametrische Statistik: Verteilungen, maximum likelihood und GLM in R. Springer. (German).

Zuur, A. (2007). Analyzing Ecological Data. Springer.

McElreath, R. (2015). Statistical Rethinking. CRC Press.

Crawley, MJ (2012). The R Book. Wiley.

Wickham, H. (2014). Advanced R. CRC Press. http://adv-r.had.co.nz/

Wickham, H. (2017). R for Data Science. O’Reilly. https://r4ds.had.co.nz

`citation()`

```
##
## To cite R in publications use:
##
## R Core Team (2020). R: A language and environment for statistical
## computing. R Foundation for Statistical Computing, Vienna, Austria.
## URL https://www.R-project.org/.
##
## A BibTeX entry for LaTeX users is
##
## @Manual{,
## title = {R: A Language and Environment for Statistical Computing},
## author = {{R Core Team}},
## organization = {R Foundation for Statistical Computing},
## address = {Vienna, Austria},
## year = {2020},
## url = {https://www.R-project.org/},
## }
##
## We have invested a lot of time and effort in creating R, please cite it
## when using it for data analysis. See also 'citation("pkgname")' for
## citing R packages.
```

R is a statistical programming language that lets you:

- write functions
- analyze data
- apply most available statistical techniques
- create simple and complicated graphs
- write your own library functions and algorithms
- process spatial data
- document your research and make it easier to reproduce

Furthermore, R:

- is supported by a large user group (>1500 **Packages**)
- is often compared to MATLAB and Python
- is open source
- can be linked to other languages (C, Fortran, Python, Stan, etc.)

- You need to write instructions (**Code**)
- R code follows a certain **Syntax** (Grammar)
- R code is executed by the **R interpreter**
- R can interpret code:
  - interactively in the **Console** (command-line)
  - saved in a text file (**Script**) and sent entirely to the R interpreter
  - Several IDEs allow sending individual lines or entire scripts to the console
- Many outputs are displayed in the Console
- Graphical outputs are displayed in a separate window

- R is an expression language with a very simple syntax
- It is **case sensitive**, so **A** and **a** are different symbols and would refer to different variables
- All alphanumeric symbols are allowed in variable names, plus ‘.’ and ‘_’
- However, a name must start with ‘.’ or a letter, and if it starts with ‘.’ the second character must not be a digit
- Names are effectively unlimited in length
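The naming rules above can be tried out with a short sketch. The variable names (`value`, `Value`, `my_var.2`, `.hidden`) are made up for illustration; `make.names()` is a base R function that shows how R would repair a name that breaks the rules.

```r
# Names are case sensitive: these are two separate variables
value <- 1
Value <- 2
value == Value            # FALSE

# Letters, digits, '.' and '_' are allowed
my_var.2 <- 10

# A name may start with '.', as long as no digit follows directly
.hidden <- 3

# make.names() turns an invalid name into a valid one
make.names("2nd_try")     # "X2nd_try"
```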

To follow the code examples, you can download the file `Script_Lab01.R` from Moodle and open it in RStudio.

If an expression is given as a command, it is evaluated, printed (unless specifically made invisible), and the value is lost.

`2 + 5`

`## [1] 7`

An assignment also evaluates an expression and passes the value to a variable, but the result is not automatically printed. The assignment operator is `<-` (“less than” and “minus”).

`a <- 2 + 5`

If you enter the name of an existing variable into the console, its content will be printed to the console output.

`a`

`## [1] 7`
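If you want to assign and print in one step, one option is to wrap the assignment in parentheses. The sketch below uses a hypothetical variable `total`.

```r
# Parentheses around an assignment make R print the assigned value
(total <- 2 + 5)    # prints [1] 7

# invisible() does the opposite: it suppresses automatic printing
invisible(total)    # prints nothing
```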

If you assign a new expression to an already existing variable, this variable will be overwritten.

```
b <- 5
a <- a + b
a
```

`## [1] 12`

The entities that R creates and manipulates are known as **objects**. These may be variables, arrays of numbers, character strings, and functions. The collection of objects currently stored is called the **workspace**.

The function `ls()` can be used to display the names of objects in the workspace:

`ls()`

`## [1] "a" "b"`

The function `rm()` can be used to remove objects from the workspace:

```
rm(b)
ls()
```

`## [1] "a"`

You can see the help for each R function using `?`:

`?is.nan()`

You can even get help for help:

`?help`

Objects can store different types of data, e.g. not only numbers but also text:

`d <- "hello world"`

You can use the function `typeof()` to identify the *data type*:

`typeof(a)`

`## [1] "double"`

`typeof(d)`

`## [1] "character"`

The most important data types are: **character**, **integer**, **double**, and **logical**

R includes functions to set or change the data type:

`as.character(a)`

`## [1] "12"`

`as.integer("3.1")`

`## [1] 3`

`as.double("3.1")`

`## [1] 3.1`
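A few more conversions are worth trying, because the results are not always what one expects. This is a small sketch using only base R conversion functions; the input values are arbitrary.

```r
as.integer(3.9)       # 3: the fractional part is dropped, not rounded
as.integer(-3.9)      # -3: truncation goes toward zero
as.integer(TRUE)      # 1
as.logical("TRUE")    # TRUE
as.double("abc")      # NA, with a warning, because "abc" is not a number
```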

When an element or value is “not available” or a “missing value” in the statistical sense, a place within a vector may be reserved for it by assigning it the special value `NA`.

In general, an operation on an `NA` becomes an `NA`:

`3 == NA`

`## [1] NA`

To evaluate if a variable contains a missing value use `is.na()`:

`is.na(3)`

`## [1] FALSE`

There is a second kind of “missing” value, produced by numerical computation: the so-called Not a Number, `NaN`, values.

`0 / 0`

`## [1] NaN`

`is.na()` is `TRUE` both for `NA` and `NaN` values. To differentiate these, `is.nan()` is only `TRUE` for `NaN`s.
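The difference between the two checks is easiest to see on several values at once. The sketch below uses `c()` to combine values into a vector named `measurements` (a made-up name), and the `na.rm` argument of `mean()` to skip missing values.

```r
# One NA and one NaN among ordinary numbers
measurements <- c(2.5, NA, 4.0, 0 / 0)

is.na(measurements)     # FALSE  TRUE FALSE  TRUE
is.nan(measurements)    # FALSE FALSE FALSE  TRUE

# Missing values propagate into summaries unless they are removed
mean(measurements)                 # a missing value, not a usable number
mean(measurements, na.rm = TRUE)   # 3.25
```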

There are several mathematical operators already implemented in R:

```
a <- 7
b <- 5
c <- a * b + sqrt(a) - b^2 / log(2) * 1.34 * exp(b)
c
```

`## [1] -7135.204`

The elementary arithmetic operators are the usual `+`, `-`, `*`, `/` and `^` for raising to a power.

In addition, all of the common arithmetic functions are available, e.g.:

- `sqrt(x)`: square root of x
- `exp(x)`: antilog of x (e^x)
- `log(x, n)`: log to base n of x (default n is e, natural log)
- `log10(x)`: log to base 10 of x
- `sin(x)`: sine of x in radians
- `cos(x)`: cosine of x in radians
- …and more
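A few calls with easily checked results may help when trying these functions. The input values are chosen only for illustration.

```r
sqrt(16)            # 4
log(exp(1))         # 1: the natural log undoes exp()
log(100, 10)        # 2: same as log10(100)
log(8, base = 2)    # 3: the base can also be passed by name
sin(pi / 2)         # 1: angles are given in radians
cos(pi)             # -1
```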

The **logical** data type can have `TRUE` and `FALSE` values (and `NA` for not available).

The logical data type is a result of evaluating a **condition**, e.g. by using logical operators:

`a == b # is a equal to b ?`

`## [1] FALSE`

`a < b # is a less than b ?`

`## [1] FALSE`

`a > b # is a greater than b ?`

`## [1] TRUE`

You can combine comparisons (`==`, `<`, `<=`, `>`, `>=`, `!=`) or conditions with AND (`&`) or OR (`|`):

`a != b`

`## [1] TRUE`

`a != b & a < c`

`## [1] FALSE`

`a < b | a < c`

`## [1] FALSE`
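When several conditions are combined, negation and operator precedence matter. The following sketch uses two hypothetical variables, `p` and `q`, so that it does not depend on the values defined earlier.

```r
p <- 7
q <- 5

!(p == q)                     # TRUE: '!' negates a condition
p > q & q > 0                 # TRUE: both conditions hold

# '&' is evaluated before '|', so parentheses can change the result
p > q | q == 5 & p > 100      # TRUE:  read as p > q | (q == 5 & p > 100)
(p > q | q == 5) & p > 100    # FALSE: the OR is evaluated first
```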

Multiple data values can be stored in various data structures:

- Homogeneous (of the same type):
  - vector
  - matrix
- Heterogeneous (of mixed types):
  - data frame
  - list

A vector is an ordered collection of values from a single data type.

Use `c()` to combine different values into a vector:

```
x <- c(1, 3, 8, 12, 56, 875, 234, 13)
x
```

`## [1] 1 3 8 12 56 875 234 13`

Use `length()` to determine the number of values in a vector:

`length(x)`

`## [1] 8`

You can construct vectors from each data type:

```
y <- c("a", "b", "c")
typeof(y)
```

`## [1] "character"`

But you cannot mix data types within a vector. If you do, all values are converted to the most flexible data type present (**coercion**):

```
z <- c(1, 4, "b", 8.5, "abc")
typeof(z)
```

`## [1] "character"`

The order, from least to most flexible, is: logical < integer < double < character
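Which type wins can be checked pair by pair with `typeof()`. This is a minimal sketch with arbitrary values; the suffix `L` marks an integer literal in R.

```r
typeof(c(TRUE, 2L))       # "integer":   logical is converted to integer
typeof(c(2L, 3.5))        # "double":    integer is converted to double
typeof(c(3.5, "a"))       # "character": double is converted to character

# Mixing all four types yields a character vector
c(TRUE, 2L, 3.5, "a")     # "TRUE" "2" "3.5" "a"
```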

Vectors can be used in arithmetic expressions, in which case the operations are performed element by element.

`x`

`## [1] 1 3 8 12 56 875 234 13`

`x * 2`

`## [1] 2 6 16 24 112 1750 468 26`

`x + 2`

`## [1] 3 5 10 14 58 877 236 15`

If two vectors have different lengths, the shorter vector is **recycled** as often as needed:

```
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
x
```

`## [1] 1 2 3 4 5 6 7 8`

`x + c(1, 2)`

`## [1] 2 4 4 6 6 8 8 10`

`x + c(1, 5, 1, 3)`

`## [1] 2 7 4 7 6 11 8 11`
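In the examples above, the length of the longer vector is a multiple of the length of the shorter one. When it is not, R still recycles but issues a warning. The sketch below uses a hypothetical vector `v` of length 6.

```r
v <- c(1, 2, 3, 4, 5, 6)

# Length 6 is a multiple of 2: recycled silently
v + c(10, 20)            # 11 22 13 24 15 26

# Length 6 is not a multiple of 4: recycled with a warning
v + c(10, 20, 30, 40)    # 11 22 33 44 15 26
```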

Useful functions for vectors include:

- `max()`
- `min()`
- `sum()`
- `prod()`
- `length()`

`x`

`## [1] 1 2 3 4 5 6 7 8`

`sum(x)`

`## [1] 36`

`x`

`## [1] 1 2 3 4 5 6 7 8`

`x < 4`

`## [1] TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE`

`all(x < 4)`

`## [1] FALSE`

`any(x < 4)`

`## [1] TRUE`

When arithmetic functions are applied to logical vectors, `TRUE` is treated as the number `1` and `FALSE` is treated as the number `0`. This can be very handy when counting the number of true values.

`x`

`## [1] 1 2 3 4 5 6 7 8`

`sum(x < 4)`

`## [1] 3`
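The same idea gives proportions and positions. The sketch below uses a hypothetical vector `scores`; `mean()` and `which()` are base R functions.

```r
scores <- c(1, 2, 3, 4, 5, 6, 7, 8)

sum(scores < 4)      # 3: number of values below 4
mean(scores < 4)     # 0.375: share of values below 4
which(scores < 4)    # 1 2 3: positions of the TRUE values
```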

R includes helpful **functions** for generating sequences:

`1:10`

`## [1] 1 2 3 4 5 6 7 8 9 10`

`15:5`

`## [1] 15 14 13 12 11 10 9 8 7 6 5`

`seq(from = 1, to = 100, by = 10)`

`## [1] 1 11 21 31 41 51 61 71 81 91`

R includes helpful **functions** for generating repeats:

`rep("x", times=10)`

`## [1] "x" "x" "x" "x" "x" "x" "x" "x" "x" "x"`

`rep(c("x", "o"), times=5)`

`## [1] "x" "o" "x" "o" "x" "o" "x" "o" "x" "o"`

`rep(c("x", "o"), each=5)`

`## [1] "x" "x" "x" "x" "x" "o" "o" "o" "o" "o"`
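Both `seq()` and `rep()` accept further arguments. The following sketch shows a few that are often useful; the values are arbitrary.

```r
# Ask for a number of equally spaced values instead of a step size
seq(from = 0, to = 1, length.out = 5)    # 0.00 0.25 0.50 0.75 1.00

# seq_len(n) counts from 1 to n
seq_len(4)                               # 1 2 3 4

# 'times' can be a vector: repeat each element a different number of times
rep(c("x", "o"), times = c(3, 1))        # "x" "x" "x" "o"
```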

You can access the `i`’th value in a vector `x` by using its positional index `x[i]`:

```
x <- c(1, 3, 8, 12, 56, 875, 234, 13)
x[1]
```

`## [1] 1`

`x[c(1, 5)]`

`## [1] 1 56`

`x[c(1:4, 8)]`

`## [1] 1 3 8 12 13`

You can remove values from a vector using negative indices:

`length(x)`

`## [1] 8`

```
x2 <- x[-3]
length(x2)
```

`## [1] 7`
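Several positions can be dropped at once by negating a vector of indices. The sketch below uses a hypothetical vector `vals`; note that indexing returns a new vector and leaves the original unchanged.

```r
vals <- c(1, 3, 8, 12, 56)

vals[-c(1, 2)]         # 8 12 56: drop the first two values
vals[-length(vals)]    # 1 3 8 12: drop the last value
length(vals)           # 5: vals itself has not been modified
```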

You can also overwrite individual values in a vector using indices. Here, `x[1]` denotes the first element in `x`:

```
x[1] <- 5
x
```

`## [1] 5 3 8 12 56 875 234 13`

Instead of using a numeric index pointing to the `i`th position of vector `x`, you can use a logical expression to subset or extract elements of `x` that meet a certain condition. For example, the expression below evaluates, for every element `i` in vector `x`, whether that element is larger than 100. The result is a logical vector of `TRUE` and `FALSE` values that has the same length as `x`. If such a logical vector is used as an index vector, all elements where the index vector is `TRUE` are extracted (or replaced).

`x > 100`

`## [1] FALSE FALSE FALSE FALSE FALSE TRUE TRUE FALSE`

`x[x > 100]`

`## [1] 875 234`

```
x[x > 100] <- 100
x
```

`## [1] 5 3 8 12 56 100 100 13`

Watch out: if the logical vector is shorter than `x`, the recycling rule applies!

`x`

`## [1] 5 3 8 12 56 100 100 13`

`x[c(TRUE, FALSE)]`

`## [1] 5 8 56 100`

`x[c(TRUE, FALSE, TRUE)]`

`## [1] 5 8 12 100 100`
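In practice, the logical index usually comes from a condition on the vector itself, so it has the right length automatically. The sketch below combines two conditions on a hypothetical vector `temps`.

```r
temps <- c(5, 3, 8, 12, 56, 100, 100, 13)

# Keep values strictly between 5 and 50
temps[temps > 5 & temps < 50]    # 8 12 13

# which() returns the positions instead of the values
which(temps > 50)                # 5 6 7
```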

Copyright © 2020 Humboldt-Universität zu Berlin. Department of Geography.

## Comments

Comments can be put almost anywhere, starting with a hashmark (`#`). Everything to the end of the line is a comment.
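A short sketch of both placements, using made-up variables (`radius`, `area`, `label`). A hashmark inside a character string is part of the string, not the start of a comment.

```r
# A comment on its own line
radius <- 2              # a comment after code on the same line
area <- pi * radius^2    # everything after '#' is ignored by R

label <- "item #1"       # the '#' inside the quotes belongs to the string
nchar(label)             # 7
```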