Learn:
How to get and run R
R Syntax
Arithmetic calculations and mathematical functions
Data types
Vectors
Dormann, C. (2013). Parametrische Statistik: Verteilungen, maximum likelihood und GLM in R. Springer. (German).
Zuur, A. (2007). Analyzing Ecological Data. Springer.
McElreath, R. (2015). Statistical Rethinking. CRC Press.
Crawley, M. J. (2012). The R Book. Wiley.
Wickham, H. (2014). Advanced R. CRC Press. http://adv-r.had.co.nz/
Wickham, H. (2017). R for Data Science. O’Reilly. https://r4ds.had.co.nz
You can download R for Windows and other operating systems from R’s webpage: https://www.r-project.org.
To install R on macOS and Linux we recommend reading here: macOS and Linux.
RStudio is an Integrated Development Environment (IDE) that makes writing and interacting with R code easier. You can download RStudio here. Note that RStudio does not include R: you need to install both R and RStudio.
citation()
## To cite R in publications use:
##
## R Core Team (2023). _R: A Language and Environment for Statistical
## Computing_. R Foundation for Statistical Computing, Vienna, Austria.
## <https://www.R-project.org/>.
##
## A BibTeX entry for LaTeX users is
##
## @Manual{,
## title = {R: A Language and Environment for Statistical Computing},
## author = {{R Core Team}},
## organization = {R Foundation for Statistical Computing},
## address = {Vienna, Austria},
## year = {2023},
## url = {https://www.R-project.org/},
## }
##
## We have invested a lot of time and effort in creating R, please cite it
## when using it for data analysis. See also 'citation("pkgname")' for
## citing R packages.
R is a statistical programming language that lets you:
Furthermore, R…
To follow the code examples, you can download the file Script_Lab01.R from Moodle and open it in RStudio.
If an expression is given as a command, it is evaluated, printed (unless specifically made invisible), and the value is lost.
2 + 5
## [1] 7
An assignment also evaluates an expression and passes the value to a variable, but the result is not automatically printed. The assignment operator is <- (“less than” and “minus”).
a <- 2 + 5
If you enter the name of an existing variable into the console, its content will be printed to the console output.
a
## [1] 7
If you assign a new expression to an already existing variable, this variable will be overwritten.
b <- 5
a <- a + b
a
## [1] 12
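The printing rules described above can be tried out with a small sketch. The variable name total is only illustrative. Wrapping an assignment in parentheses prints the assigned value, while invisible() suppresses the printing of an expression.

```r
# A bare expression is evaluated and printed
2 + 5
## [1] 7

# An assignment is evaluated silently
total <- 2 + 5

# Wrapping the assignment in parentheses also prints the value
(total <- total + 5)
## [1] 12

# invisible() suppresses the printing of an expression
invisible(2 + 5)
```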
The entities that R creates and manipulates are known as objects. These may be variables, arrays of numbers, character strings, and functions. The collection of objects currently stored is called the workspace.
The function ls() can be used to display the names of objects in the workspace:
ls()
## [1] "a" "b" "q"
The function rm() can be used to remove objects from the workspace:
rm(b)
ls()
## [1] "a" "q"
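As a minimal sketch of working with the workspace, the following uses two hypothetical objects, tmp1 and tmp2. exists() checks whether an object of a given name is defined, and rm() can remove several objects in one call.

```r
# Create two illustrative objects (the names are hypothetical)
tmp1 <- 1
tmp2 <- "text"

# exists() checks whether an object with the given name is defined
exists("tmp1")
## [1] TRUE

# rm() accepts several objects at once
rm(tmp1, tmp2)
exists("tmp1")
## [1] FALSE

# rm(list = ls()) would remove all objects from the workspace; use with care
```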
You can see the help for each R function using ?:
?is.na()
You can even get help for help:
?help
Objects can store different types of data: not only numbers but also text (character), logical (Boolean) values, and missing values. You can use the function typeof() to identify the data type.
The common numeric data types in R are integer and double. An integer is a whole number (without decimal places). A double is a real number. R treats numeric values as double by default, so you do not have to worry about conversion or losing precision when dividing integers.
a <- 7
typeof(a)
## [1] "double"
You can explicitly define an integer by appending a capital L to the number.
b <- 7L
typeof(b)
## [1] "integer"
R automatically converts the result of the following integer division to type double.
typeof(7L/2L)
## [1] "double"
7L/2L
## [1] 3.5
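If an integer result is actually wanted, base R also provides the operators %/% (integer division) and %% (remainder). The sketch below only illustrates how the types behave; it is not part of the lab script.

```r
# Ordinary division always returns a double, even if the result is whole
typeof(6L / 2L)
## [1] "double"

# Integer division (%/%) and remainder (%%) keep the integer type
7L %/% 2L
## [1] 3
7L %% 2L
## [1] 1
typeof(7L %/% 2L)
## [1] "integer"

# Mixing an integer with a double yields a double
typeof(7L + 1)
## [1] "double"
```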
d <- "hello world"
typeof(d)
## [1] "character"
The logical data type can have two possible values: TRUE and FALSE. Both can be abbreviated as T and F, respectively.
typeof(TRUE)
## [1] "logical"
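One caveat about the abbreviations is worth a short sketch: TRUE and FALSE are reserved words, whereas T and F are ordinary variables that can be overwritten. For that reason, many style guides suggest writing TRUE and FALSE in full.

```r
# T is an ordinary variable, so it can be reassigned (legal, but confusing)
T <- 0
isTRUE(T)
## [1] FALSE

# Removing the user-defined variable restores the built-in abbreviation
rm(T)
isTRUE(T)
## [1] TRUE
```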
When an element or value is “not available” or a “missing value” in the statistical sense, a place within a vector may be reserved for it by assigning it the special value NA.
Most operations on an NA result in an NA:
3 == NA
## [1] NA
To check whether a variable contains a missing value, use is.na():
is.na(3)
## [1] FALSE
There is a second kind of “missing” value that is produced by numerical computation, the so-called Not a Number (NaN) value.
0 / 0
## [1] NaN
is.na() is TRUE for both NA and NaN values. To differentiate between them, use is.nan(), which is TRUE only for NaN values.
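The difference between the two tests can be seen on a small illustrative vector (the name v is an assumption, not part of the lab script). The sketch also shows the na.rm argument, which many summary functions offer to ignore missing values.

```r
v <- c(1, NA, NaN, 4)   # illustrative vector

# is.na() flags both NA and NaN; is.nan() flags only NaN
is.na(v)
## [1] FALSE TRUE TRUE FALSE
is.nan(v)
## [1] FALSE FALSE TRUE FALSE

# A missing value propagates through a summary ...
mean(c(1, NA, 4))
## [1] NA

# ... unless missing values are removed first with na.rm = TRUE
sum(v, na.rm = TRUE)
## [1] 5
```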
R includes functions to set or change the data type:
as.character(a)
## [1] "7"
as.integer("3.1")
## [1] 3
as.double("3.1")
## [1] 3.1
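A few further conversions, sketched here for illustration, show what happens at the edges: as.integer() truncates rather than rounds, and text that cannot be read as a number becomes NA (with a warning, which is suppressed below to keep the output short).

```r
# as.integer() truncates toward zero rather than rounding
as.integer(3.9)
## [1] 3
as.integer(-3.9)
## [1] -3

# Logical values convert to 0 and 1
as.integer(TRUE)
## [1] 1

# Text that cannot be interpreted as a number becomes NA (with a warning)
suppressWarnings(as.double("abc"))
## [1] NA

# The is.*() functions test for a type without changing it
is.character("3.1")
## [1] TRUE
```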
There are several mathematical operators already implemented in R:
a <- 7
b <- 5
c <- a * b + sqrt(a) - b^2 / log(2) * 1.34 * exp(b)
c
## [1] -7135.204
The elementary arithmetic operators are the usual +, -, *, /, and ^ for raising to a power.
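The operators follow the usual precedence rules, which the sketch below illustrates with plain numbers. It also shows that log() uses base e unless a base is given; the base argument is part of base R.

```r
# Multiplication binds more tightly than addition
2 + 3 * 4
## [1] 14
(2 + 3) * 4
## [1] 20

# ^ binds more tightly than the unary minus ...
-2^2
## [1] -4

# ... and is evaluated from right to left: 2^(3^2)
2^3^2
## [1] 512

# log() is the natural logarithm unless a base is given
log(100)
## [1] 4.60517
log(100, base = 10)
## [1] 2
```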
In addition all of the common arithmetic functions are available, e.g.:
sqrt(x): square root of x
exp(x): antilog of x (e^x)
log(x, n): log to base n of x (default n is e, natural log)
log10(x): log to base 10 of x
sin(x): sine of x in radians
cos(x): cosine of x in radians
The logical data type can have TRUE and FALSE values (and NA for not available).
The logical data type is a result of evaluating a condition, e.g. by using logical operators:
a == b # is a equal to b ?
## [1] FALSE
a < b # is a less than b ?
## [1] FALSE
a > b # is a greater than b ?
## [1] TRUE
You can combine comparisons (made with the operators ==, <, <=, >, >=, !=) or other conditions with AND (&) or OR (|):
a != b
## [1] TRUE
a != b & a < c
## [1] FALSE
a < b | a < c
## [1] FALSE
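Two details about combining conditions can be sketched with hypothetical values m and n: the operator ! negates a condition, and & is evaluated before |, so parentheses can change the result. xor() is a further base R function that is TRUE if exactly one of its two conditions is TRUE.

```r
m <- 7   # illustrative values
n <- 5

# ! negates a condition
!(m == n)
## [1] TRUE

# & is evaluated before |, so parentheses can change the result
TRUE | FALSE & FALSE
## [1] TRUE
(TRUE | FALSE) & FALSE
## [1] FALSE

# xor() is TRUE if exactly one of the two conditions is TRUE
xor(m > n, m != n)
## [1] FALSE
```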
Multiple data values can be stored in various data structures:
Homogeneous (of the same type):
vector
matrix
Heterogeneous (of mixed types):
data frame
list
A vector is an ordered collection of values from a single data type.
Use c() to combine different values into a vector:
x <- c(1, 3, 8, 12, 56, 875, 234, 13)
x
## [1] 1 3 8 12 56 875 234 13
Use length() to determine the number of values in a vector:
length(x)
## [1] 8
You can construct vectors from each data type:
y <- c("a", "b", "c")
typeof(y)
## [1] "character"
But you cannot mix data types within a vector. If you do, all values are converted to the most general data type present (coercion):
z <- c(1, 4, "b", 8.5, "abc")
typeof(z)
## [1] "character"
The coercion order, from least to most general, is: logical < integer < double < character
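Coercion can be checked step by step. The sketch below combines pairs of types and inspects the result with typeof(); in each case the more general type (logical, then integer, then double, then character) wins.

```r
# logical + integer -> integer
typeof(c(TRUE, 2L))
## [1] "integer"

# integer + double -> double
typeof(c(2L, 3.5))
## [1] "double"

# double + character -> character
typeof(c(3.5, "a"))
## [1] "character"

# The values themselves are converted, not just the type label
c(TRUE, 2L, 3.5)
## [1] 1.0 2.0 3.5
c(1, TRUE, "a")
## [1] "1" "TRUE" "a"
```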
Vectors can be used in arithmetic expressions, in which case the operations are performed element by element.
x
## [1] 1 3 8 12 56 875 234 13
x * 2
## [1] 2 6 16 24 112 1750 468 26
x + 2
## [1] 3 5 10 14 58 877 236 15
If two vectors have different lengths, the shorter vector is recycled as often as needed:
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
x
## [1] 1 2 3 4 5 6 7 8
x + c(1, 2)
## [1] 2 4 4 6 6 8 8 10
x + c(1, 5, 1, 3)
## [1] 2 7 4 7 6 11 8 11
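Recycling also happens when the longer length is not a multiple of the shorter one, but in that case R issues a warning. A minimal sketch with an illustrative vector v:

```r
v <- c(1, 2, 3, 4, 5, 6)   # illustrative vector of length 6

# Length 6 is a multiple of 3: the shorter vector is recycled silently
v + c(10, 20, 30)
## [1] 11 22 33 14 25 36

# Length 6 is not a multiple of 4: the values are still recycled,
# but R warns that the lengths do not match up
v + c(10, 20, 30, 40)
## [1] 11 22 33 44 15 26
```

Such a warning usually indicates a mistake in the vector lengths rather than intended recycling.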
Commonly used functions that operate on a whole vector include:
max()
min()
sum()
prod()
length()
x
## [1] 1 2 3 4 5 6 7 8
sum(x)
## [1] 36
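The other functions named above (max(), min(), prod(), length()) work the same way as sum(). The sketch below uses a vector v with the same values as x; range() and mean() are further base R functions that are not in the list above.

```r
v <- c(1, 2, 3, 4, 5, 6, 7, 8)   # same values as x

max(v)
## [1] 8
min(v)
## [1] 1
prod(v)
## [1] 40320

# range() returns the minimum and maximum together
range(v)
## [1] 1 8

# mean() is the sum divided by the number of values
mean(v)
## [1] 4.5
sum(v) / length(v)
## [1] 4.5
```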
x
## [1] 1 2 3 4 5 6 7 8
x < 4
## [1] TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE
all(x < 4)
## [1] FALSE
any(x < 4)
## [1] TRUE
When arithmetic functions are applied to logical vectors, TRUE is treated as the number 1 and FALSE is treated as the number 0. This can be very handy when counting the number of true values.
x
## [1] 1 2 3 4 5 6 7 8
sum(x < 4)
## [1] 3
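The same idea gives proportions as well as counts: because TRUE counts as 1, the mean of a logical vector is the share of TRUE values. which() is a related base R function that returns the positions of the TRUE values. A short sketch with a vector v holding the same values as x:

```r
v <- c(1, 2, 3, 4, 5, 6, 7, 8)   # same values as x

# Number of values below 4
sum(v < 4)
## [1] 3

# Proportion of values below 4 (3 out of 8)
mean(v < 4)
## [1] 0.375

# Positions of the values below 4
which(v < 4)
## [1] 1 2 3
```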
R includes helpful functions for generating sequences:
1:10
## [1] 1 2 3 4 5 6 7 8 9 10
15:5
## [1] 15 14 13 12 11 10 9 8 7 6 5
seq(from = 1, to = 100, by = 10)
## [1] 1 11 21 31 41 51 61 71 81 91
R includes helpful functions for generating repeats:
rep("x", times=10)
## [1] "x" "x" "x" "x" "x" "x" "x" "x" "x" "x"
rep(c("x", "o"), times=5)
## [1] "x" "o" "x" "o" "x" "o" "x" "o" "x" "o"
rep(c("x", "o"), each=5)
## [1] "x" "x" "x" "x" "x" "o" "o" "o" "o" "o"
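A few variations on these functions, sketched for illustration: seq() can take the desired number of values instead of a step size, seq_len() and seq_along() generate index sequences, and the times argument of rep() accepts one count per element.

```r
# Five equally spaced values between 0 and 1
seq(from = 0, to = 1, length.out = 5)
## [1] 0.00 0.25 0.50 0.75 1.00

# Index sequences of a given length, or along an existing vector
seq_len(4)
## [1] 1 2 3 4
seq_along(c("a", "b", "c"))
## [1] 1 2 3

# A separate repeat count for each element
rep(c("x", "o"), times = c(3, 1))
## [1] "x" "x" "x" "o"
```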
You can access the i-th value in a vector x by using its positional index x[i]:
x <- c(1, 3, 8, 12, 56, 875, 234, 13)
x[1]
## [1] 1
x[c(1, 5)]
## [1] 1 56
x[c(1:4, 8)]
## [1] 1 3 8 12 13
You can remove values from a vector using negative indices:
length(x)
## [1] 8
x2 <- x[-3]
length(x2)
## [1] 7
You can also overwrite individual values in a vector using indices. Here, x[1] denotes the first element in x:
x[1] <- 5
x
## [1] 5 3 8 12 56 875 234 13
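Some edge cases of positional indexing are sketched below on an illustrative vector v: several negative indices can be combined, length() gives the index of the last element, and reading or writing beyond the end of a vector produces NA values rather than an error.

```r
v <- c(10, 20, 30, 40, 50)   # illustrative vector

# Drop the first two elements
v[-c(1, 2)]
## [1] 30 40 50

# The last element
v[length(v)]
## [1] 50

# Reading beyond the end returns NA
v[7]
## [1] NA

# Assigning beyond the end extends the vector and fills the gap with NA
v[7] <- 70
v
## [1] 10 20 30 40 50 NA 70
```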
Instead of using a numeric index pointing to the i-th position of vector x, you can use a logical expression to subset or extract elements of x that meet a certain condition. For example, the expression below evaluates, for every element in vector x, whether that element is larger than 100. The result is a logical vector of TRUE and FALSE values that has the same length as x. If such a logical vector is used as an index vector, all elements for which the index vector is TRUE are extracted (or replaced).
x > 100
## [1] FALSE FALSE FALSE FALSE FALSE TRUE TRUE FALSE
x[x > 100]
## [1] 875 234
x[x > 100] <- 100
x
## [1] 5 3 8 12 56 100 100 13
Watch out: if the logical vector is shorter than x, the recycling rule applies!
x
## [1] 5 3 8 12 56 100 100 13
x[c(TRUE, FALSE)]
## [1] 5 8 56 100
x[c(TRUE, FALSE, TRUE)]
## [1] 5 8 12 100 100
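Logical indexing combines naturally with the logical operators & and |. The sketch below uses illustrative vectors v and w; it also shows that a missing value in the condition yields an NA in the result, which the base R function which() avoids by returning only the positions that are TRUE.

```r
v <- c(5, 3, 8, 12, 56, 100, 100, 13)   # illustrative vector

# Elements larger than 5 and smaller than 100
v[v > 5 & v < 100]
## [1] 8 12 56 13

# Positions (not values) of the elements equal to 100
which(v == 100)
## [1] 6 7

# A missing value in the condition carries over into the result ...
w <- c(1, NA, 3)
w[w > 1]
## [1] NA 3

# ... while which() keeps only the positions that are TRUE
w[which(w > 1)]
## [1] 3
```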
Copyright © 2024 Humboldt-Universität zu Berlin. Department of Geography.
Comments
Comments can be put almost anywhere, starting with a hash mark (#). Everything from the hash mark to the end of the line is a comment.
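A brief sketch of the placements described above, using the hypothetical variables radius and area:

```r
# A comment on a line of its own
radius <- 2            # a comment after code on the same line
area <- pi * radius^2  # pi is a built-in constant

# area <- 0            # commented-out code is not executed
```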