ARIADNE & Linked Open Data

Outline

  1. ARIADNE Portal
  2. Linked Open Data
    • Triples – building blocks of LOD
    • RDF – way of representing LOD
  3. ARIADNE AO-Cat ontology

ARIADNE

Archaeological data and the ARIADNE Portal

ARIADNE

A figure in Greek mythology

  • Cretan princess, daughter of King Minos.
  • Known for helping Theseus escape from the labyrinth after he kills the Minotaur.

Egisto Sani, CC BY-SA 4.0 https://creativecommons.org/licenses/by-sa/4.0, via Wikimedia Commons

A research infrastructure

  • An acronym for Advanced Research Infrastructure for Archaeological Dataset Networking In Europe.
  • ARIADNE aggregates, integrates, and provides access to various archaeological data resources.

https://www.ariadne-research-infrastructure.eu/

ARIADNE Portal

https://portal.ariadne-infrastructure.eu/

Exercise

  1. How many resources are there in the ARIADNE Portal?
  2. How many records of individual artefacts are there?
  3. Who contributes the most records on artefacts?
  4. How many swords are there?
  5. How many Bronze Age swords are there?
  6. Are there any records from Brno (Czech Republic)?
  7. How many records are there from Africa?

What, where & when

Where

  • Geolocations of places (points, polygons, bounding boxes)

When

  • PeriodO gazetteer (https://perio.do/) – maps periods to absolute dates on a common time scale.
  • Date

What

Linked Open Data

From ARIADNE Portal to ARIADNE Knowledge Base

Let’s discuss…

  • What is data?
  • What is structured data?
  • What is unstructured data?
  • What is a database?
  • What is a relational database?
  • What makes data open?
  • What makes data linked?
  • Have you ever collected/created any data?
  • Have you ever created a database?
  • Have you ever filled in a database designed by someone else?

Linked Open Data

(LOD)

https://5stardata.info/

Linked Open Data

What makes it linked and open?

Semantic Web / Web of Data

The Semantic Web extends the Web of documents to a Web of data. It is about creating links between documents, datasets, etc. that both humans and machines can read and understand. Linked Open Data is at the core of the Semantic Web, providing the tools and best practices for making these links.

  • Open license.
  • Uniform Resource Identifiers (URIs) name and identify individual things.
  • URIs are resolvable using http:// (or https://) protocol
    (it is possible to find information about the things).
  • URIs lead to useful information (data in RDF or SPARQL standards).
  • URIs in the resources lead to other resources, so more things can be discovered following the links.
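The "resolvable URI" principle above can be sketched in a few lines of Python. This is a minimal sketch, assuming the server supports HTTP content negotiation and can return Turtle for the `Accept: text/turtle` header; the function name `build_rdf_request` is hypothetical, and the request is only prepared, not sent.

```python
from urllib.request import Request


def build_rdf_request(uri: str, media_type: str = "text/turtle") -> Request:
    """Prepare (but do not send) a GET request that asks for RDF data."""
    # The Accept header tells the server which representation we prefer.
    return Request(uri, headers={"Accept": media_type})


request = build_rdf_request("http://dbpedia.org/resource/Ariadne")
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))

# To actually fetch the data (needs network access), one could run:
# from urllib.request import urlopen
# turtle_text = urlopen(request).read().decode("utf-8")
```

The same URI thus serves a human (an HTML page in the browser) and a machine (RDF data), depending on what the client asks for.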

URLs, URIs & IRIs

URL (Uniform Resource Locator) – locates and allows retrieval of things on the Web.
URN (Uniform Resource Name) – identifies but does not locate.
URI (Uniform Resource Identifier, formerly Universal Resource Identifier) – identifies things (resources), both abstract and physical, on the Web.
IRI (Internationalized Resource Identifier) – same as a URI, but allows a wider range of characters to accommodate various writing systems.
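The relation between IRIs and URIs can be illustrated with a short sketch: an IRI with non-ASCII characters is mapped to a URI by percent-encoding those characters as UTF-8 bytes. The function name `iri_to_uri` and the example IRI are hypothetical, and the sketch is simplified (it does not handle internationalized host names).

```python
from urllib.parse import quote


def iri_to_uri(iri: str) -> str:
    """Percent-encode non-ASCII characters; keep URI delimiters untouched."""
    # Characters listed in `safe` are reserved URI delimiters (plus "%",
    # so that already escaped sequences are not escaped twice).
    return quote(iri, safe=":/?#[]@!$&'()*+,;=%")


iri = "http://example.org/resource/Αριάδνη"  # hypothetical IRI in Greek script
uri = iri_to_uri(iri)
print(uri)
print(uri.isascii())
```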

Linked Open Data Cloud

https://lod-cloud.net/

LOD knowledge bases:


https://www.wikidata.org/
~12.5 billion triples


http://dbpedia.org/
~9.5 billion triples

The Linked Open Data Cloud from lod-cloud.net (2024-07-04)

Example – URIs

  • Ariadne is a figure in Greek mythology.
  • ARIADNE is a project acronym.
  • Ariadne was a Byzantine empress.
  • Ariadne is a genus of butterflies.
  • Ariadne is a drug…

Far too many things are called Ariadne!

We need unique identifiers!

Wikidata

DBpedia

Egisto Sani, CC BY-SA 4.0 https://creativecommons.org/licenses/by-sa/4.0, via Wikimedia Commons

Triples

How to represent LOD?

[Diagram: subject --predicate (property)--> object]

Example – triples

  • Ariadne is a figure in Greek mythology.

[Diagram: Ariadne --is a--> mythological figure]

Example – triples

  • Ariadne is a figure in Greek mythology.
  • Ariadne is from Crete.
  • Crete is an island.

[Diagram]
Ariadne --is a--> mythological figure
Ariadne --is from--> Crete
Crete --is an--> island

Example – triples

  • URIs in place of subjects and objects

[Diagram]
http://dbpedia.org/resource/Ariadne --is a--> http://dbpedia.org/ontology/MythologicalFigure
http://dbpedia.org/resource/Ariadne --is from--> http://dbpedia.org/resource/Crete
http://dbpedia.org/resource/Crete --is an--> http://dbpedia.org/ontology/Island
http://dbpedia.org/resource/Crete --is the same as--> http://sws.geonames.org/258763

Example – triples

  • URIs in place of subjects and objects
  • Abbreviated URIs

dbo: http://dbpedia.org/ontology/
dbr: http://dbpedia.org/resource/
gn: http://sws.geonames.org/

[Diagram]
dbr:Ariadne --is a--> dbo:MythologicalFigure
dbr:Ariadne --is from--> dbr:Crete
dbr:Crete --is an--> dbo:Island
dbr:Crete --is the same as--> gn:258763
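Abbreviating URIs is plain string substitution, which a short Python sketch can make explicit. The prefix table is the one from the slide; the function names `expand` and `shorten` are hypothetical.

```python
PREFIXES = {
    "dbo": "http://dbpedia.org/ontology/",
    "dbr": "http://dbpedia.org/resource/",
    "gn": "http://sws.geonames.org/",
}


def expand(name: str, prefixes: dict = PREFIXES) -> str:
    """Turn an abbreviated name such as 'dbr:Crete' into a full URI."""
    prefix, _, local = name.partition(":")
    if prefix not in prefixes:
        raise KeyError(f"unknown prefix: {prefix}")
    return prefixes[prefix] + local


def shorten(uri: str, prefixes: dict = PREFIXES) -> str:
    """Abbreviate a full URI if it starts with a known namespace."""
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace):
            return f"{prefix}:{uri[len(namespace):]}"
    return uri  # no matching namespace: leave the URI as it is


print(expand("dbr:Ariadne"))
print(shorten("http://dbpedia.org/ontology/Island"))
```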

Example – triples

  • URIs in place of predicates (properties)

dbo: http://dbpedia.org/ontology/
dbr: http://dbpedia.org/resource/
gn: https://sws.geonames.org/
rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
owl: http://www.w3.org/2002/07/owl#

[Diagram]
dbr:Ariadne --rdf:type--> dbo:MythologicalFigure
dbr:Ariadne --dbo:origin--> dbr:Crete
dbr:Crete --rdf:type--> dbo:Island
dbr:Crete --owl:sameAs--> gn:258763

Example – RDF

  • We can write down the triples using full URIs.
  • Each triple ends with a period (.)
<http://dbpedia.org/resource/Ariadne> 
  <http://www.w3.org/2000/01/rdf-schema#label> 
    "Ariadne"@en .

<http://dbpedia.org/resource/Ariadne> 
  <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
    <http://dbpedia.org/ontology/MythologicalFigure> .

<http://dbpedia.org/resource/Ariadne> 
  <http://dbpedia.org/ontology/origin> 
    <http://dbpedia.org/resource/Crete> .
  
<http://dbpedia.org/resource/Crete> 
  <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
    <http://dbpedia.org/ontology/Island> .

<http://dbpedia.org/resource/Crete> 
  <http://www.w3.org/2002/07/owl#sameAs> 
    <https://sws.geonames.org/258763> .
  • This is perfectly fine for a machine, not so much for a human.
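The full-URI notation above is regular enough to be generated mechanically. The following is a minimal sketch, assuming triples are held as Python tuples; the `Literal` class and the function names are hypothetical, and only quotes and backslashes are escaped in literals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """A text value with an optional language tag (hypothetical helper)."""
    text: str
    lang: str = ""


def term(value) -> str:
    """Write one term: literals in quotes, everything else as <URI>."""
    if isinstance(value, Literal):
        escaped = value.text.replace("\\", "\\\\").replace('"', '\\"')
        suffix = f"@{value.lang}" if value.lang else ""
        return f'"{escaped}"{suffix}'
    return f"<{value}>"


def write_triples(triples) -> str:
    """One triple per line, each ending with a period."""
    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples)


triples = [
    ("http://dbpedia.org/resource/Ariadne",
     "http://www.w3.org/2000/01/rdf-schema#label",
     Literal("Ariadne", "en")),
    ("http://dbpedia.org/resource/Ariadne",
     "http://dbpedia.org/ontology/origin",
     "http://dbpedia.org/resource/Crete"),
]
print(write_triples(triples))
```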

Example – RDF

  • To make the notation more readable, let’s abbreviate the URIs by defining prefixes.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix dbo: <http://dbpedia.org/ontology/> .
@prefix dbr: <http://dbpedia.org/resource/> .

dbr:Ariadne rdfs:label "Ariadne"@en .
dbr:Ariadne rdf:type dbo:MythologicalFigure .
dbr:Ariadne dbo:origin dbr:Crete .
  
dbr:Crete rdf:type dbo:Island .
dbr:Crete owl:sameAs <https://sws.geonames.org/258763> .
  • There is still a lot of repetition, e.g. in the subjects.

Example – RDF

  • To remove repetition, we add different predicates and objects to the same subject using a semicolon (;).
  • The set of triples still ends with a period (.)
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix dbo: <http://dbpedia.org/ontology/> .
@prefix dbr: <http://dbpedia.org/resource/> .

dbr:Ariadne
  rdfs:label "Ariadne"@en ;
  rdf:type dbo:MythologicalFigure ;
  dbo:origin dbr:Crete .
  
dbr:Crete 
  rdf:type dbo:Island ;
  owl:sameAs <https://sws.geonames.org/258763> .
  • This is the Turtle serialization of RDF, one of the common ways of representing LOD.
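The grouping step (same subject, several predicate and object pairs separated by semicolons) can be sketched in Python. This is an illustration only, assuming the terms are already written in Turtle notation; the function name `group_by_subject` is hypothetical and the `@prefix` declarations are left out.

```python
from collections import defaultdict


def group_by_subject(triples) -> str:
    """Write triples as Turtle blocks, one block per subject."""
    by_subject = defaultdict(list)
    for s, p, o in triples:
        by_subject[s].append((p, o))

    blocks = []
    for subject, pairs in by_subject.items():
        # Predicate-object pairs are separated by ";", the block ends with "."
        body = " ;\n".join(f"  {p} {o}" for p, o in pairs)
        blocks.append(f"{subject}\n{body} .")
    return "\n\n".join(blocks)


triples = [
    ("dbr:Ariadne", "rdfs:label", '"Ariadne"@en'),
    ("dbr:Ariadne", "rdf:type", "dbo:MythologicalFigure"),
    ("dbr:Ariadne", "dbo:origin", "dbr:Crete"),
    ("dbr:Crete", "rdf:type", "dbo:Island"),
    ("dbr:Crete", "owl:sameAs", "<https://sws.geonames.org/258763>"),
]
print(group_by_subject(triples))
```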

RDF

Resource Description Framework

  • A standard for LOD defined by the World Wide Web Consortium (W3C).
  • RDF is a data model that describes how data is structured.
  • Uses triples to represent statements.
  • RDF does not exactly tell us how to write the triples.
  • Numerous serializations (how to write things down) of RDF exist.

https://www.w3.org/standards/techs/rdf

RDF serializations

  • Turtle (subset of Notation3 language, superset of N-Triples format)
  • RDF/XML
  • JSON-LD

and many more…

RDF converter: https://www.easyrdf.org/converter

Turtle

  • Stands for Terse RDF Triple Language.
  • File extension .ttl.
  • Triple ends with a period.
  • URIs in angle brackets (<, >).
  • Literal (text or other value) in quotation marks (").
  • Triples with the same subject are separated by a semicolon (;).
  • Triples with the same subject and property are separated by a comma (,).
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix dbo: <http://dbpedia.org/ontology/> .
@prefix dbr: <http://dbpedia.org/resource/> .

dbr:Ariadne
  rdfs:label "Ariadne"@en ;
  rdf:type dbo:MythologicalFigure ;
  dbo:origin dbr:Crete .
  
dbr:Crete 
  rdf:type dbo:Island ;
  owl:sameAs <https://sws.geonames.org/258763> .

RDF can contain…

  • URIs/IRIs:
    • written in angle brackets
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    • shortened using prefixes
      @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
  • Literals (values):
    • written in quotes followed by their datatype URI (typed literals)
    • list of RDF datatypes: https://www.w3.org/TR/rdf11-concepts/#section-Datatypes
    • strings (datatype xsd:string can be omitted)
      "Ariadne"^^xsd:string is the same as "Ariadne"
    • numbers (quotes can be omitted)
      "3.14"^^xsd:decimal is the same as 3.14
    • dates
      "2024-09-17"^^xsd:date
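The literal forms listed above can be produced from ordinary Python values. This is a sketch under the assumption that the `xsd:` prefix is declared in the Turtle file; the function name `turtle_literal` is hypothetical and only a few datatypes are covered.

```python
from datetime import date, datetime
from decimal import Decimal


def turtle_literal(value) -> str:
    """Write a Python value as a Turtle literal."""
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)  # bare integers are read as xsd:integer
    if isinstance(value, Decimal):
        return f'"{format(value, "f")}"^^xsd:decimal'
    if isinstance(value, datetime):  # test datetime first: subclass of date
        return f'"{value.isoformat()}"^^xsd:dateTime'
    if isinstance(value, date):
        return f'"{value.isoformat()}"^^xsd:date'
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'  # plain string, xsd:string is implied
    raise TypeError(f"unsupported type: {type(value).__name__}")


for example in ["Ariadne", 42, Decimal("3.14"), date(2024, 9, 17), True]:
    print(turtle_literal(example))
```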

RDF/XML

<?xml version="1.0" encoding="utf-8" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:dbo="http://dbpedia.org/ontology/">

  <rdf:Description rdf:about="http://dbpedia.org/resource/Ariadne">
    <rdfs:label xml:lang="en">Ariadne</rdfs:label>
    <rdf:type rdf:resource="http://dbpedia.org/ontology/MythologicalFigure"/>
    <dbo:origin>
      <dbo:Island rdf:about="http://dbpedia.org/resource/Crete">
        <owl:sameAs rdf:resource="https://sws.geonames.org/258763"/>
      </dbo:Island>
    </dbo:origin>
  </rdf:Description>
</rdf:RDF>
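Because RDF/XML is ordinary XML, a general XML parser can read simple cases. The following sketch uses Python's standard library on a shortened version of the document above (without the nested dbo:origin element); the function name `simple_triples` is hypothetical, and it handles only flat rdf:Description elements, not the full RDF/XML syntax, for which a dedicated RDF library would be needed.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Description rdf:about="http://dbpedia.org/resource/Ariadne">
    <rdfs:label xml:lang="en">Ariadne</rdfs:label>
    <rdf:type rdf:resource="http://dbpedia.org/ontology/MythologicalFigure"/>
  </rdf:Description>
</rdf:RDF>
"""


def simple_triples(xml_text: str) -> list:
    """Extract (subject, predicate, object) from flat rdf:Description elements."""
    triples = []
    root = ET.fromstring(xml_text)
    for node in root.iter(f"{{{RDF}}}Description"):
        subject = node.get(f"{{{RDF}}}about")
        for child in node:
            # ElementTree writes tags as "{namespace}local"; join them into a URI.
            predicate = child.tag[1:].replace("}", "", 1)
            resource = child.get(f"{{{RDF}}}resource")
            if resource is not None:
                triples.append((subject, predicate, resource))
            elif child.text and child.text.strip():
                triples.append((subject, predicate, child.text.strip()))
    return triples


for triple in simple_triples(RDF_XML):
    print(triple)
```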

Tabular data vs triples

  • Any data organized in a table can be written as triples.

For example a table like this:

ID        human  source  delta15N  delta13C
sample01  TRUE   coll    6.9       -19.2
sample02  FALSE  coll    4.2       -21.83
sample42  NA     coll    12.02     NA

Can become this:

subject   predicate  object
sample01  fromHuman  TRUE
sample01  source     coll
sample01  delta15N   6.9
sample01  delta13C   -19.2
sample02  fromHuman  FALSE
sample42  delta13C   NA
  • Observation (row) IDs become subjects.
  • Observations (values) become objects.
  • Variable (column) names become predicates.
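The three rules above translate directly into code. This is a minimal sketch, assuming the table is available as comma-separated text; the function name `table_to_triples` and its parameters are hypothetical, and column names are used as predicates unchanged. By default missing values (NA) are kept as objects, as on the slide; in RDF one would often leave such triples out, which the `skip_missing` switch illustrates.

```python
import csv
import io

TABLE = """\
ID,human,source,delta15N,delta13C
sample01,TRUE,coll,6.9,-19.2
sample02,FALSE,coll,4.2,-21.83
sample42,NA,coll,12.02,NA
"""


def table_to_triples(text, id_column="ID", missing="NA", skip_missing=False):
    """Row IDs become subjects, column names predicates, cell values objects."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = row[id_column]
        for column, value in row.items():
            if column == id_column:
                continue  # the ID is the subject, not a statement of its own
            if skip_missing and value == missing:
                continue
            triples.append((subject, column, value))
    return triples


for triple in table_to_triples(TABLE):
    print(*triple)
```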

Exercise

Serialize the statements in this diagram as Turtle RDF using DBpedia

[Diagram]
dbr:Phaistos_Disc --rdf:type--> dbr:Artifact_(archaeology)
dbr:Phaistos_Disc --dbp:discoveredDate--> '1908-07-03'
dbr:Phaistos_Disc --dbp:discoveredPlace--> dbr:Crete
dbr:Crete --rdf:type--> dbo:Island
dbr:Crete --owl:sameAs--> gn:258763

Prefixes:
rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
xsd: http://www.w3.org/2001/XMLSchema#
owl: http://www.w3.org/2002/07/owl#
dbo: http://dbpedia.org/ontology/
dbr: http://dbpedia.org/resource/
dbp: http://dbpedia.org/property/
gn: http://sws.geonames.org/

Solution

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix dbo: <http://dbpedia.org/ontology/> .
@prefix dbr: <http://dbpedia.org/resource/> .
@prefix dbp: <http://dbpedia.org/property/> .
@prefix gn:  <http://sws.geonames.org/> .

dbr:Phaistos_Disc 
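  # Parentheses are not allowed unescaped in a prefixed name, so the full IRI is used here.
  rdf:type <http://dbpedia.org/resource/Artifact_(archaeology)> ;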
  dbp:discoveredDate "1908-07-03"^^xsd:date ;
  dbp:discoveredPlace dbr:Crete .
  
dbr:Crete 
  rdf:type dbo:Island ;
  owl:sameAs gn:258763 .

Why is LOD useful?

Let’s brainstorm…

AO-Cat Ontology

Ontologies and the Semantic Web

What is an ontology?

  • An explicit, formal way of modelling relationships between information within a particular domain.
  • An abstraction allowing formal representation of particular knowledge about the world.
  • Ontologies provide precisely defined vocabularies for modelling relationships.

What is in an ontology?

  • Classes – an ontology defines abstract groups, the fundamental categories of objects or concepts within a domain.
  • Relationships (properties) – an ontology limits which kinds of subjects and objects a property can link and states how classes are related.
  • Instances of objects – the individuals being described.

Why is this useful?

  • Logic
  • Automated reasoning
  • Interoperability and data integration

Domain and range

[Diagram: subject (in the domain) --predicate (property)--> object (in the range)]

  • Domain: the subject described by the property is in the class specified by the domain.
  • Range: the object of the statement is in the class specified by the range.

rdfs:domain and rdfs:range properties

Example

[Diagram: dbr:Ariadne (rdf:type dbo:MythologicalFigure, within dbo:Person) --dbo:origin--> dbr:Crete (rdf:type dbo:PopulatedPlace)]

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix schema: <http://schema.org/> .
@prefix dbo: <http://dbpedia.org/ontology/> .
@prefix dbr: <http://dbpedia.org/resource/> .

dbo:MythologicalFigure rdfs:subClassOf dbo:Person .

dbo:origin rdf:type rdf:Property ;
  rdfs:domain dbo:Person ;
  rdfs:range dbo:PopulatedPlace .

dbr:Ariadne
  rdf:type dbo:MythologicalFigure ;
  dbo:origin dbr:Crete .

dbr:Crete 
  rdf:type dbo:PopulatedPlace .
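Domain, range and subclass statements allow a machine to derive new facts. The following is a simplified sketch of such reasoning, not a complete RDFS reasoner; the function name `infer_types` is hypothetical and terms are kept as abbreviated strings. It applies three rules until nothing new appears: the subject of a property belongs to its domain, the object belongs to its range, and an instance of a class is also an instance of its superclass.

```python
def infer_types(triples):
    """Return the rdf:type triples implied by domain, range and subClassOf."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            for s2, p2, o2 in facts:
                if p2 == "rdfs:domain" and s2 == p:
                    new.add((s, "rdf:type", o2))  # subject is in the domain
                elif p2 == "rdfs:range" and s2 == p:
                    new.add((o, "rdf:type", o2))  # object is in the range
                elif p2 == "rdfs:subClassOf" and p == "rdf:type" and s2 == o:
                    new.add((s, "rdf:type", o2))  # instance of the superclass
        if new <= facts:  # nothing new was derived: stop
            return facts - set(triples)
        facts |= new


triples = [
    ("dbo:MythologicalFigure", "rdfs:subClassOf", "dbo:Person"),
    ("dbo:origin", "rdf:type", "rdf:Property"),
    ("dbo:origin", "rdfs:domain", "dbo:Person"),
    ("dbo:origin", "rdfs:range", "dbo:PopulatedPlace"),
    ("dbr:Ariadne", "rdf:type", "dbo:MythologicalFigure"),
    ("dbr:Ariadne", "dbo:origin", "dbr:Crete"),
    ("dbr:Crete", "rdf:type", "dbo:PopulatedPlace"),
]
print(infer_types(triples))
```

Here the only new fact is that dbr:Ariadne is a dbo:Person. If the explicit type of dbr:Crete were left out of the input, it would be derived from the range of dbo:origin.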

AO-Cat Ontology

Felicetti A., Meghini C., Richards J., Theodoridou M. 2023: The AO-Cat Ontology. doi:10.5281/zenodo.7818374.

  • Application profile of CIDOC CRM (most of the classes are mapped to CRM).
  • Namespace IRI: https://ariadne-infrastructure.eu/aocat/
  • Classes prefixed with AO_

Defines:

  • 22 classes
  • 66 properties

AO-Cat Concepts

Resources

[Diagram: AO-Cat resource classes (edges are unlabelled in the figure)]
AO_Entity
  AO_Resource
    AO_Service
    AO_Data_Resource
      AO_Individual_Data_Resource
        AO_Document
      AO_Collection
        AO_Individual_Data_Resource
        AO_Collection

AO-Cat classes

Where

  • AO_Spatial_Region
    • AO_Spatial_Region_Point
    • AO_Spatial_Region_Polygon
    • AO_Spatial_Region_BBox
    • AO_Spatial_Region_StdName

When

  • AO_Temporal_Region
    • From/until given in years
    • PeriodO URIs

What

  • class AO_Concept
  • ARIADNE subjects (property has_ARIADNE_subject)
  • derived subjects (AAT subjects, property has_derived_subject)
  • native subjects (property has_native_subject)

ARIADNE subjects

  • Site/monument – each record is a site/monument.
  • Fieldwork – each record is an individual archaeological investigation (event).
  • Fieldwork report – record of a fieldwork event (link to grey literature report).
  • Fieldwork archive – record of a fieldwork event (link to archive of digital objects).
  • Scientific analysis – any analytical data.
  • Date – each record is a single archaeological date (C14, dendrochronology, etc.).
  • Artefact – each record is a single artefact (except coins).
  • Coin – each record is a single coin.
  • Building survey – specific category of fieldwork report or archive for standing building survey.
  • Maritime – specific category of site/monument for wrecks or fieldwork event (underwater archaeology).
  • Inscription – monuments or artefacts that bear a graphical manifestation of human thought.
  • Rock art – similar to inscriptions.
  • Burial – each record is a burial.