5+5
[1] 10
1.) Introduction
orientation in RStudio
execution command, basic operators
assignment operator
vectors
comments
2.) Syntax and basic functions
functions, objects, values, syntax
dataframes and vectors
Syntax: object <- value
Human reading: the assignment operator <- assigns the result of the operation on the right to the object on the left.
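A minimal sketch of the assignment operator in use. The object names a and b are hypothetical and chosen only for illustration:

```r
# Store the result of 5 + 5 in an object called a (hypothetical name)
a <- 5 + 5

# An object can be reused on the right-hand side of a later assignment
b <- a * 2

print(a)  # [1] 10
print(b)  # [1] 20
```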
#
Anything written after # on a line is a comment and will not run.
1.) create one vector which contains 10 numbers from 51 to 60
2.) create another vector which contains 10 numbers from 101 to 110
3.) save the first vector as “vect_1” and the second as “vect_2”
4.) subtract vect_1 from vect_2 and save the result as “vect_sub”
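One possible solution to the four steps above, as a sketch. It assumes the colon operator is an acceptable way to create the sequences; c() or seq() would work as well:

```r
# Steps 1 to 3: create the two vectors and save them under the requested names
vect_1 <- 51:60
vect_2 <- 101:110

# Step 4: subtract vect_1 from vect_2, element by element
vect_sub <- vect_2 - vect_1

print(vect_sub)  # [1] 50 50 50 50 50 50 50 50 50 50
```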
()
(parentheses)
Syntax: function_name(argument1 = value1, argument2 = value2, ...)
seq()
[1] 1000 1010 1020 1030 1040 1050 1060 1070 1080 1090 1100 1110 1120 1130 1140
[16] 1150 1160 1170 1180 1190 1200 1210 1220 1230 1240 1250 1260 1270 1280 1290
[31] 1300 1310 1320 1330 1340 1350 1360 1370 1380 1390 1400 1410 1420 1430 1440
[46] 1450 1460 1470 1480 1490 1500 1510 1520 1530 1540 1550 1560 1570 1580 1590
[61] 1600 1610 1620 1630 1640 1650 1660 1670 1680 1690 1700 1710 1720 1730 1740
[76] 1750 1760 1770 1780 1790 1800 1810 1820 1830 1840 1850 1860 1870 1880 1890
[91] 1900 1910 1920 1930 1940 1950 1960 1970 1980 1990 2000
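The call that produced this output is not shown. The output is consistent with a call like the following sketch, where the object name numbers is hypothetical:

```r
# Generate a sequence from 1000 to 2000 in steps of 10
numbers <- seq(from = 1000, to = 2000, by = 10)

# The sequence has 101 elements, matching the index [91] on the last output row
length(numbers)  # [1] 101
```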
,
In some functions, you don't need to name the parameters, but we recommend that you do, at least at the beginning.
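A short sketch of the difference, using seq(). The object names are hypothetical:

```r
# Named arguments: explicit and easy to read
named <- seq(from = 1, to = 10, by = 3)

# Positional arguments: the same call, relying on argument order
positional <- seq(1, 10, 3)

# With names, the order of the arguments no longer matters
reordered <- seq(by = 3, to = 10, from = 1)

print(named)                  # [1]  1  4  7 10
identical(named, positional)  # [1] TRUE
identical(named, reordered)   # [1] TRUE
```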
str()
will quickly tell you what kind of object you have and what kind of values it holds.
Note the difference between numbers and characters:
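For illustration, a sketch with two hypothetical vectors that look alike when printed but are stored differently:

```r
# A numeric vector and a character vector holding the "same" values
numbers <- c(1, 2, 3)
characters <- c("1", "2", "3")

str(numbers)     #  num [1:3] 1 2 3
str(characters)  #  chr [1:3] "1" "2" "3"

# Arithmetic works on numbers only; characters + 1 would raise an error
numbers + 1      # [1] 2 3 4
```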
cbind()
binds vectors into columns, and the function as.data.frame() then converts the result into a dataframe.
Here cbind() is nested inside the function as.data.frame()
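A sketch of the nested call. The column names cisla, stovky and artefacts are taken from the str() output in this section; the object name df and the fifth value of each vector are assumptions, because the original code is not shown:

```r
# Three vectors of equal length (fifth values are assumed for illustration)
cisla <- 1:5
stovky <- 101:105
artefacts <- c("pottery", "dagger", "fibula", "spondylus", "axe")  # "axe" is hypothetical

# cbind() runs first and returns a character matrix,
# then as.data.frame() turns that matrix into a dataframe
df <- as.data.frame(cbind(cisla, stovky, artefacts))
```

Because cbind() has to fit numbers and text into one matrix, every column ends up stored as text.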
Get the basic information about the dataframe with str()
'data.frame': 5 obs. of 3 variables:
$ cisla : chr "1" "2" "3" "4" ...
$ stovky : chr "101" "102" "103" "104" ...
$ artefacts: chr "pottery" "dagger" "fibula" "spondylus" ...
We can see that the columns cisla and stovky are not numbers, but characters. To do mathematical operations on them, we need to convert the values into numbers with the function as.numeric()
Note that we use $
to specify the column we want to change. We will talk more about $
on the next slide.
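A sketch of the conversion on a small three-row dataframe. The object name df and the values are assumptions made for illustration:

```r
# Build a small dataframe whose columns are all stored as characters
df <- as.data.frame(cbind(
  cisla = 1:3,
  stovky = 101:103,
  artefacts = c("pottery", "dagger", "fibula")
))

# Overwrite each column with its numeric version
df$cisla <- as.numeric(df$cisla)
df$stovky <- as.numeric(df$stovky)

# Arithmetic on the columns now works
df$stovky - df$cisla  # [1] 100 100 100
```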
[,]
Syntax: name_of_your_dataframe[row_number, column_number]
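A sketch of square-bracket indexing on a hypothetical three-row dataframe:

```r
# Hypothetical dataframe used only to demonstrate indexing
df <- data.frame(
  id = 1:3,
  name = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

df[2, 1]  # value in row 2, column 1: 2
df[2, ]   # leaving the column empty returns the whole second row
df[, 2]   # leaving the row empty returns the whole second column
```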
$
Syntax: name_of_your_dataframe$name_of_the_column
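A sketch of the $ operator on a hypothetical dataframe. It can read a column, pass it to a function, or create a new column:

```r
# Hypothetical dataframe used only to demonstrate the $ operator
df <- data.frame(id = 1:3, weight = c(10, 20, 30))

df$weight        # [1] 10 20 30
mean(df$weight)  # [1] 20

# Assigning to a name that does not exist yet creates a new column
df$double_weight <- df$weight * 2
```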
Copy and paste this huge piece of code, or open the script “fake_graves.R”
grave_number <- 800:819
dating <- c(
"ne.lin","ne.lin","en.zvo","en.zvo","en.snu","br.une","br.une","br.une","la.a","rstred",
"ne.lin","br.une","en.zvo","en.snu","la.a","br.une","rstred","ne.lin","en.zvo","br.une"
)
sex <- c(
"male","male","male","female","male","female","female","female","male","female",
"male","female","male","female","male","male","female","female","female","male"
)
age <- c(
"31-40","21-30","<11","31-40",">50","31-40",">50","41-50","31-40","<11",
"<11","31-40","21-30",">50","41-50","31-40","21-30","31-40","<11",">50"
)
pottery <- c(
3,4,3,2,5,4,5,3,2,1,
1,6,4,7,3,5,2,5,1,6
)
bronze <- c(
0,0,0,0,0,5,1,2,0,0,
0,3,0,0,0,2,0,0,0,4
)
stone_chipped <- c(
1,1,0,0,2,1,0,0,0,0,
0,1,0,2,0,2,0,1,0,1
)
stone_polished <- c(
2,1,0,0,1,0,0,0,0,0,
0,0,0,1,0,0,0,1,0,0
)
grave_length <- c(
210,160,180,250,300,200,225,250,150,100,
90,230,200,260,210,240,180,220,100,270
)
grave_depth <- c(
50, 40, 70,200,250,100, 80, 70, 40, 30,
25,120, 90,180,100,150, 80,120, 30,200
)
df_grave <- as.data.frame(cbind(grave_number, dating, sex, age, pottery, bronze, stone_chipped, stone_polished, grave_length, grave_depth))
df_grave$pottery <- as.numeric(df_grave$pottery)
df_grave$bronze <- as.numeric(df_grave$bronze)
df_grave$stone_chipped <- as.numeric(df_grave$stone_chipped)
df_grave$stone_polished <- as.numeric(df_grave$stone_polished)
df_grave$grave_length <- as.numeric(df_grave$grave_length)
df_grave$grave_depth <- as.numeric(df_grave$grave_depth)
There are more elegant ways to prepare such a table. But for now, this is enough.
'data.frame': 20 obs. of 10 variables:
$ grave_number : chr "800" "801" "802" "803" ...
$ dating : chr "ne.lin" "ne.lin" "en.zvo" "en.zvo" ...
$ sex : chr "male" "male" "male" "female" ...
$ age : chr "31-40" "21-30" "<11" "31-40" ...
$ pottery : num 3 4 3 2 5 4 5 3 2 1 ...
$ bronze : num 0 0 0 0 0 5 1 2 0 0 ...
$ stone_chipped : num 1 1 0 0 2 1 0 0 0 0 ...
$ stone_polished: num 2 1 0 0 1 0 0 0 0 0 ...
$ grave_length : num 210 160 180 250 300 200 225 250 150 100 ...
$ grave_depth : num 50 40 70 200 250 100 80 70 40 30 ...
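As noted above, there are more elegant ways to prepare such a table. One option, sketched here with only the first three graves and four of the columns, is to call data.frame() directly. The object name df_small is hypothetical:

```r
# data.frame() keeps each column's own type, so no as.numeric() step is needed
df_small <- data.frame(
  grave_number = 800:802,
  dating = c("ne.lin", "ne.lin", "en.zvo"),
  pottery = c(3, 4, 3),
  grave_length = c(210, 160, 180),
  stringsAsFactors = FALSE
)

str(df_small)
```

Unlike the cbind() version, this stores grave_number as a number rather than as text.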
Printing the whole dating column is a bit messy, so let's use unique()
to get just a list of the dating categories present:
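A sketch of the call. To keep the example self-contained, df_grave is rebuilt here with the dating column only:

```r
# The dating values from the fake_graves data
dating <- c(
  "ne.lin","ne.lin","en.zvo","en.zvo","en.snu","br.une","br.une","br.une","la.a","rstred",
  "ne.lin","br.une","en.zvo","en.snu","la.a","br.une","rstred","ne.lin","en.zvo","br.une"
)
df_grave <- data.frame(dating = dating, stringsAsFactors = FALSE)

# unique() returns each value once, in order of first appearance
unique(df_grave$dating)
# [1] "ne.lin" "en.zvo" "en.snu" "br.une" "la.a"   "rstred"
```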
In other words, we want to subset rows with graves dated to “ne.lin” (the Linear Pottery culture)
grave_number dating sex age pottery bronze stone_chipped stone_polished
1 800 ne.lin male 31-40 3 0 1 2
2 801 ne.lin male 21-30 4 0 1 1
11 810 ne.lin male <11 1 0 0 0
18 817 ne.lin female 31-40 5 0 1 1
grave_length grave_depth
1 210 50
2 160 40
11 90 25
18 220 120
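The subsetting call itself is not shown. The output above is consistent with a logical condition in the row position of the square brackets, sketched here on a reduced df_grave with two columns. The object name df_ne_lin is hypothetical:

```r
# Reduced version of df_grave: grave numbers and dating only
grave_number <- 800:819
dating <- c(
  "ne.lin","ne.lin","en.zvo","en.zvo","en.snu","br.une","br.une","br.une","la.a","rstred",
  "ne.lin","br.une","en.zvo","en.snu","la.a","br.une","rstred","ne.lin","en.zvo","br.une"
)
df_grave <- data.frame(grave_number, dating, stringsAsFactors = FALSE)

# Keep the rows where dating equals "ne.lin"; the empty column slot keeps all columns
df_ne_lin <- df_grave[df_grave$dating == "ne.lin", ]

df_ne_lin$grave_number  # [1] 800 801 810 817
```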
Don’t worry, we’ll soon learn a more intuitive way to filter and subset data.
This looks complicated at first sight, but don’t panic:
dating pottery
1 br.une 29
2 en.snu 12
3 en.zvo 10
4 la.a 5
5 ne.lin 13
6 rstred 3
Human reading: “Take the variable pottery and compute its sum for each value of dating in the dataframe df_grave.”
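The call is not shown. The table and the human reading above are consistent with aggregate() used with a formula, sketched here on a reduced df_grave. The object name pottery_by_dating is hypothetical:

```r
# Reduced version of df_grave: dating and pottery counts only
dating <- c(
  "ne.lin","ne.lin","en.zvo","en.zvo","en.snu","br.une","br.une","br.une","la.a","rstred",
  "ne.lin","br.une","en.zvo","en.snu","la.a","br.une","rstred","ne.lin","en.zvo","br.une"
)
pottery <- c(
  3,4,3,2,5,4,5,3,2,1,
  1,6,4,7,3,5,2,5,1,6
)
df_grave <- data.frame(dating, pottery, stringsAsFactors = FALSE)

# pottery ~ dating reads as "pottery grouped by dating"; FUN is applied to each group
pottery_by_dating <- aggregate(pottery ~ dating, data = df_grave, FUN = sum)
print(pottery_by_dating)
```

Replacing sum with another function, such as mean, gives a different summary for each group.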
Save your work, clean your workspace and open the “fake_graves.R” script.
Run the code to create the dataframe “df_grave”.
Answer the following questions:
Which age group is the most represented?
Which category has the longest graves on average?
How many bronze artefacts were found?
Which culture (dating group) had the most bronze tools?
Subset all female graves and create the “df_female_graves” object.
What is the average number of pottery items in female graves?