Intro & Joining tables

Today we will…

  • learn basics of the correspondence analysis
  • explore distances, similarity and correlation
  • transform your data from the long-table to wide-table format
  • finish the exercise from the last lesson

Joining tables

It’s rare that a data analysis involves only a single data frame. Typically you have many data frames, and you must join them together to answer the questions that you’re interested in.

(Hadley Wickham, R for Data Science, https://r4ds.hadley.nz/joins.html#introduction)

Joins

Inner join

  • Only the rows that have a common key in both x and y tables.

Full join

  • All the rows in tables x and y.
  • Missing values (NA) where keys do not match in x or y.

Left join

  • All rows from x.
  • Missing values (NA) where key from x does not match any key in y.

Example

graves
# A tibble: 6 × 3
  grave_id burial     mound_present
  <chr>    <chr>      <lgl>        
1 G-001    missing    FALSE        
2 G-002    inhumation FALSE        
3 G-003    inhumation TRUE         
4 G-004    cremation  FALSE        
5 G-005    cremation  FALSE        
6 G-006    inhumation FALSE        
artefacts
# A tibble: 10 × 3
   artefact_id grave_id artefact_type
   <chr>       <chr>    <chr>        
 1 A-0076      G-001    Whetstone    
 2 A-0135      G-001    Arrowhead    
 3 A-0187      G-002    Quern stone  
 4 A-0040      G-003    Ring         
 5 A-0181      G-003    Needle       
 6 A-0213      G-003    Brooch       
 7 A-0116      G-004    Dagger       
 8 A-0055      G-005    Ring         
 9 A-0428      <NA>     Dagger       
10 A-0429      <NA>     Needle       

Inner join

inner_join(graves, artefacts, join_by(grave_id))
# A tibble: 8 × 5
  grave_id burial     mound_present artefact_id artefact_type
  <chr>    <chr>      <lgl>         <chr>       <chr>        
1 G-001    missing    FALSE         A-0076      Whetstone    
2 G-001    missing    FALSE         A-0135      Arrowhead    
3 G-002    inhumation FALSE         A-0187      Quern stone  
4 G-003    inhumation TRUE          A-0040      Ring         
5 G-003    inhumation TRUE          A-0181      Needle       
6 G-003    inhumation TRUE          A-0213      Brooch       
7 G-004    cremation  FALSE         A-0116      Dagger       
8 G-005    cremation  FALSE         A-0055      Ring         

Full join

full_join(graves, artefacts)
# A tibble: 11 × 5
   grave_id burial     mound_present artefact_id artefact_type
   <chr>    <chr>      <lgl>         <chr>       <chr>        
 1 G-001    missing    FALSE         A-0076      Whetstone    
 2 G-001    missing    FALSE         A-0135      Arrowhead    
 3 G-002    inhumation FALSE         A-0187      Quern stone  
 4 G-003    inhumation TRUE          A-0040      Ring         
 5 G-003    inhumation TRUE          A-0181      Needle       
 6 G-003    inhumation TRUE          A-0213      Brooch       
 7 G-004    cremation  FALSE         A-0116      Dagger       
 8 G-005    cremation  FALSE         A-0055      Ring         
 9 G-006    inhumation FALSE         <NA>        <NA>         
10 <NA>     <NA>       NA            A-0428      Dagger       
11 <NA>     <NA>       NA            A-0429      Needle       

Left join graves -> artefacts

graves |> 
  left_join(artefacts)
# A tibble: 9 × 5
  grave_id burial     mound_present artefact_id artefact_type
  <chr>    <chr>      <lgl>         <chr>       <chr>        
1 G-001    missing    FALSE         A-0076      Whetstone    
2 G-001    missing    FALSE         A-0135      Arrowhead    
3 G-002    inhumation FALSE         A-0187      Quern stone  
4 G-003    inhumation TRUE          A-0040      Ring         
5 G-003    inhumation TRUE          A-0181      Needle       
6 G-003    inhumation TRUE          A-0213      Brooch       
7 G-004    cremation  FALSE         A-0116      Dagger       
8 G-005    cremation  FALSE         A-0055      Ring         
9 G-006    inhumation FALSE         <NA>        <NA>         

Left join artefacts -> graves

artefacts |> 
  left_join(graves)
# A tibble: 10 × 5
   artefact_id grave_id artefact_type burial     mound_present
   <chr>       <chr>    <chr>         <chr>      <lgl>        
 1 A-0076      G-001    Whetstone     missing    FALSE        
 2 A-0135      G-001    Arrowhead     missing    FALSE        
 3 A-0187      G-002    Quern stone   inhumation FALSE        
 4 A-0040      G-003    Ring          inhumation TRUE         
 5 A-0181      G-003    Needle        inhumation TRUE         
 6 A-0213      G-003    Brooch        inhumation TRUE         
 7 A-0116      G-004    Dagger        cremation  FALSE        
 8 A-0055      G-005    Ring          cremation  FALSE        
 9 A-0428      <NA>     Dagger        <NA>       NA           
10 A-0429      <NA>     Needle        <NA>       NA