CREATE Datasprint: Linking Cinema Context Data (2020-10-08)
- When: Thursday afternoon 8 October (13:00 - 17:00)
- Where: Online (Zoom + Slack)
- Register: https://forms.gle/BzCAFVni7HkENkTt9 (link for zoom/slack will be provided after registration)
Introduction
Cinema Context is an online database containing places, persons and companies involved in more than 100,000 film screenings since 1895. It provides insight into the 'DNA' of Dutch film and cinema culture. Thanks to a DANS grant for small data projects, the Cinema Context database has been converted into Linked Open Data by Menno den Engelse.
Linked Open Data offers opportunities for broadening and renewing historical and cultural research, because it allows linking data from the database to a range of external data: about buildings, persons, heritage objects and locations. The purpose of this datasprint is to start making these connections, linking entities from the Cinema Context dataset to other Linked Data resources. Nice examples of queries, datastories, or other dataset interconnections created during the datasprint will be showcased in the documentation pages.
The datasprint is conceived as a practical workshop aimed at producing concrete links between datasets; it is not intended as a Linked Data training event. Some knowledge of, or experience with, RDF is therefore preferable. Nonetheless, interested participants without prior experience are welcome to join: we'll find a job for you ;)
We have several interlocking goals for the datasprint:
- Figure out how to actually make links between Cinema Context RDF and external data sources; what are obstacles we encounter; do we need to make modifications to the RDF?
- Demonstrate the potential/value for research of linking to other data sets
- Report on our activities on these documentation pages in the form of brief 'data stories'
Below, we suggest several potential projects but participants are welcome to initiate their own.
Suggested projects
Wikidata 1: expanding the linkage
Links between Cinema Context and Wikidata already exist in the form of a Cinema Context ID for films in Wikidata: https://www.wikidata.org/wiki/Property:P8296. Can we also create connections for other types of entities, such as cinemas, distributors or persons? One option is OpenRefine, which has a service for matching to Wikidata (see tutorials here and here).
Wikidata 2: foreign orientation index
Economic film historian Peter Miskell has proposed what he calls a foreign orientation index to investigate the relative success of Hollywood productions abroad in the post-war reconstruction period. Miskell states that American productions with a relatively high proportion of non-American creative talent and non-American content (origin of story characters, narrative location) fared better at non-American box offices. As an experiment in the usefulness of Linked Data for research, we propose to explore whether the film ID link to Wikidata can be used to establish the foreign orientation index of a film.
Netwerk Oorlogsbronnen
Cinema Context contains a lot of information about the World War II period. Before the war, a high proportion of Dutch cinema entrepreneurs were Jewish, and many of them became victims of Nazi terror. Can we link those persons to the register of war victims that is part of the Netwerk Oorlogsbronnen Open Data Register at https://opendata.oorlogsbronnen.nl/? We will provide an endpoint where the war victim data can be queried; participants can ask us to make other RDF datasets from Netwerk Oorlogsbronnen available for querying with SPARQL.
HISGIS: Location points Amsterdam
Can we link all Amsterdam locations in the Cinema Context dataset to the HISGIS location points that have been made available via the CLARIAH Amsterdam Time Machine project? For the early period, these links have already been produced during a CLARIAH pilot project; can we expand this to the whole set? Once we have the links to the location points, they can be connected to other data (see next project).
Image collections
In Adamlink, Amsterdam location points are connected to heritage data, such as images preserved in the Amsterdam City Archives (see example here). But of course no need to limit oneself to Amsterdam – can we link to other image banks and archives?
Report on activities during the datasprint
Linking data, connecting endpoints
Linked data is all about interconnecting resources. Making these links takes time and curation, but that need not stop us from trying to generate them automatically. During this afternoon, we tried to connect the IISG Knowledge Graph (IISG-KG) to the Cinema Context dataset. Relevant material in the IISG-KG includes, for instance, the collection of photos by Ben van Meerendonk. His work includes photo reports on cinema-related events, such as movie premieres and stars visiting cinemas, mainly in Amsterdam. Through a federated query (i.e. a query that fetches data from an external endpoint or data source), we were able to connect the Cinema Context data that resides in the CREATE endpoint with the IISG dataset.
In an example query, Micon Schorsij demonstrates how one can search the image descriptions and keywords in the IISG dataset. A slight modification of this query, in which we fetch the names of movie theaters (schema:MovieTheater resources in the Cinema Context dataset) from the CREATE endpoint and use them in a string comparison against the image descriptions, yields 22 relevant images from the collection (as of October 2020), each describing a particular theater:
Select query IISG-KG - CinemaContext (MovieTheater)
This query was built to function in a different endpoint than the CREATE endpoint. An up-to-date result can be fetched from druid.
PREFIX iisgv: <https://iisg.amsterdam/vocab/>
PREFIX relator: <http://id.loc.gov/vocabulary/relators/>
PREFIX person: <https://iisg.amsterdam/authority/person/>
PREFIX schema: <http://schema.org/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE {
  {
    SELECT ?bioscoop ?name WHERE {
      SERVICE <https://data.create.humanities.uva.nl/sparql> {
        ?bioscoop a schema:MovieTheater ;
                  schema:name ?name ;
                  schema:location ?place .
        ?place schema:address ?address .
        ?address schema:addressLocality "Amsterdam" .
        FILTER(?bioscoop != <http://www.cinemacontext.nl/id/B000069>)
      }
    }
  }
  {
    SELECT * WHERE {
      ?item schema:name ?label ;
            iisgv:topic <https://iisg.amsterdam/authority/topic/83052> ;
            a <http://purl.org/dc/dcmitype/StillImage> .
    } LIMIT 10000
  }
  FILTER(REGEX(?label, ?name)) # This is the filter that matters!
} LIMIT 10000
Results
bioscoop | name | item | label |
---|---|---|---|
B000008 | City Theater | 1095482 | Drukte bij het uitgaan van het City Theater (Rock around the clock met Bill Haley). |
B000012 | Du Midi | 1162706 | In het vernieuwde Du Midi gaat ‘Spartacus’ in première, Apollolaan. |
B000012 | Du Midi | 1087314 | 1e paal voor nieuwe theater Du Midi. |
B000012 | Du Midi | 1087318 | 1e paal voor nieuwe theater Du Midi. |
B000012 | Du Midi | 1087322 | 1e paal voor nieuwe theater Du Midi. |
B000012 | Du Midi | 1087328 | 1e paal voor nieuwe theater Du Midi. |
B000012 | Du Midi | 1087337 | Maquette nieuwe theater Du Midi. |
B000025 | Cinema Royal | 1212044 | Gevel bioscoop Cinema Royal met reclame voor de film ‘Trapeze’. |
B000018 | Nöggerath | 1090038 | De gevel van bioscoop Nöggerath met reclame voor William Wyler’s ‘Ben Hur’. |
B000018 | Nöggerath | 1090043 | De gevel van bioscoop Nöggerath met reclame voor William Wyler’s ‘Ben Hur’. |
B000018 | Nöggerath | 1074936 | Drukte bij bioscoop Nöggerath bij het uitgaan van de film ‘Eva’ (regie Gustaf Molander). |
B000018 | Nöggerath | 1083636 | Drukte bij bioscoop Nöggerath bij het uitgaan van de film ‘Zij die van de zonde leven’. |
B000036 | Alhambra | 1096001 | Gevel bioscoop Alhambra Theater met reclame voor de film ‘Vrijheid’. |
B000036 | Alhambra | 1087691 | Gevel bioscoop Alhambra Theater met reclame voor de film God’s little acre. |
B000036 | Alhambra | 1087822 | Aankomst van Millie Perkins (Diary of Anne Frank, dir.: George Stevens) bij Alhambra. |
B000036 | Alhambra | 1089548 | Bioscoop Alhambra met reclame voor de film ‘On the beach’. |
B000036 | Alhambra | 1089552 | Première van de film ‘On the beach’ in Alhambra met Mary Dresselhuys (midden) en Ko van Dijk. |
B000027 | Rialto | 1094469 | Amerikaanse filmsterren Lana Turner en Lex Barker in Rialto Theater. |
B000028 | Roxy | 1212056 | Gevel bioscoop Roxy met reclame voor de film ‘Jerry als Assepoetser’. |
B000030 | Tuschinski | 1082890 | Wachtende jongeren bij Tuschinski voor de Beatle-film. |
B000047 | Apollo | 1162706 | In het vernieuwde Du Midi gaat ‘Spartacus’ in première, Apollolaan. |
B000006 | Cineac Damrak | 1089373 | Drukte bij bioscoop Cineac Damrak bij het uitgaan van de film ‘Hunde wollt ihr ewig leben’. |
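One row in the table deserves a second look: cinema B000047 (Apollo) is matched to item 1162706 only because the description mentions the street name "Apollolaan". REGEX(?label, ?name) treats the cinema name as an unanchored pattern, so any substring hit counts, and characters such as "." or "(" in a name would be read as regex syntax. Below is a minimal Python sketch of a stricter comparison, assuming the names and labels have already been fetched from the endpoints. The function name `matches_name` is hypothetical and not part of the datasprint code.

```python
import re


def matches_name(label: str, name: str) -> bool:
    """Return True if `name` occurs in `label` as a whole word or phrase."""
    # re.escape neutralises regex metacharacters in the cinema name.
    # The lookarounds require that the name is not glued to other
    # word characters on either side (so "Apollo" != "Apollolaan").
    pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
    return re.search(pattern, label) is not None


# Label of item 1162706, copied from the results table above.
label = "In het vernieuwde Du Midi gaat ‘Spartacus’ in première, Apollolaan."

print(matches_name(label, "Du Midi"))  # True
print(matches_name(label, "Apollo"))   # False
```

A stricter match like this trades recall for precision: it drops the Apollolaan hit, but it would also miss inflected or abbreviated forms of a cinema name, so manual curation of the result remains advisable.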
The select query above shows the matches between cinemas in Cinema Context and relevant items from the photo collection. Similarly, we can formulate this as a construct query to build a linkset that can be published and loaded into a SPARQL endpoint. The chosen linking property is schema:image:
Construct query IISG-KG - CinemaContext (MovieTheater) linkset
This query was built to function in a different endpoint than the CREATE endpoint. An up-to-date result can be fetched from druid.
PREFIX void: <http://rdfs.org/ns/void#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX iisgv: <https://iisg.amsterdam/vocab/>
PREFIX relator: <http://id.loc.gov/vocabulary/relators/>
PREFIX person: <https://iisg.amsterdam/authority/person/>
PREFIX schema: <http://schema.org/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
CONSTRUCT {
  GRAPH <https://data.create.humanities.uva.nl/id/cinemacontext/linksets/iisgkg/> {
    ?bioscoop schema:image ?item .
  }
  <https://data.create.humanities.uva.nl/id/cinemacontext/linksets/iisgkg/> a schema:Dataset, void:Linkset ;
    schema:name "Linkset CinemaContext - IISG-kg StillImage for Amsterdam Cinemas" ;
    schema:description "This query fetches all Amsterdam cinema names from the Cinemacontext dataset and tries to find these in the label of an item of type StillImage in the IISG-kg." ;
    void:linkPredicate schema:image ;
    void:target <https://data.create.humanities.uva.nl/id/cinemacontext/>, <https://iisg.amsterdam/graph/collection> ;
    schema:isPartOf <https://data.create.humanities.uva.nl/id/cinemacontext/> .
}
WHERE {
  {
    SELECT ?bioscoop ?name ?wkt WHERE {
      SERVICE <https://data.create.humanities.uva.nl/sparql> {
        ?bioscoop a schema:MovieTheater ;
                  schema:name ?name ;
                  schema:location ?place .
        ?place schema:address ?address .
        ?address schema:addressLocality "Amsterdam" .
        FILTER(?bioscoop != <http://www.cinemacontext.nl/id/B000069>) # this name is Bio, too generic.
      }
    }
  }
  {
    SELECT * WHERE {
      ?item schema:name ?label ;
            iisgv:topic <https://iisg.amsterdam/authority/topic/83052> ;
            a <http://purl.org/dc/dcmitype/StillImage> .
    } LIMIT 10000
  }
  FILTER(REGEX(?label, ?name))
} LIMIT 10000
Result
@prefix schema: <http://schema.org/> .
@prefix cc: <http://www.cinemacontext.nl/id/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix void: <http://rdfs.org/ns/void#> .
@prefix iisg: <https://iisg.amsterdam/graph/> .
cc:B000006 schema:image <https://iisg.amsterdam/id/item/1089373> .
cc:B000008 schema:image <https://iisg.amsterdam/id/item/1095482> .
cc:B000012 schema:image <https://iisg.amsterdam/id/item/1087314> ,
    <https://iisg.amsterdam/id/item/1087318> ,
    <https://iisg.amsterdam/id/item/1087337> ,
    <https://iisg.amsterdam/id/item/1087322> ,
    <https://iisg.amsterdam/id/item/1162706> ,
    <https://iisg.amsterdam/id/item/1087328> .
cc:B000018 schema:image <https://iisg.amsterdam/id/item/1090043> ,
    <https://iisg.amsterdam/id/item/1083636> ,
    <https://iisg.amsterdam/id/item/1090038> ,
    <https://iisg.amsterdam/id/item/1074936> .
cc:B000025 schema:image <https://iisg.amsterdam/id/item/1212044> .
cc:B000027 schema:image <https://iisg.amsterdam/id/item/1094469> .
cc:B000028 schema:image <https://iisg.amsterdam/id/item/1212056> .
cc:B000030 schema:image <https://iisg.amsterdam/id/item/1082890> .
cc:B000036 schema:image <https://iisg.amsterdam/id/item/1096001> ,
    <https://iisg.amsterdam/id/item/1089548> ,
    <https://iisg.amsterdam/id/item/1087691> ,
    <https://iisg.amsterdam/id/item/1089552> ,
    <https://iisg.amsterdam/id/item/1087822> .
cc:B000047 schema:image <https://iisg.amsterdam/id/item/1162706> .
<https://data.create.humanities.uva.nl/id/cinemacontext/linksets/iisgkg/> rdf:type void:Linkset ,
    schema:Dataset ;
  schema:description "This query fetches all Amsterdam cinema names from the Cinemacontext dataset and tries to find these in the label of an item of type StillImage in the IISG-kg." ;
  schema:name "Linkset CinemaContext - IISG-kg StillImage for Amsterdam Cinemas" ;
  schema:isPartOf <https://data.create.humanities.uva.nl/id/cinemacontext/> ;
  void:linkPredicate schema:image .
<https://data.create.humanities.uva.nl/id/cinemacontext/linksets/iisgkg/> void:target iisg:collection ,
    <https://data.create.humanities.uva.nl/id/cinemacontext/> .
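For readers who post-process the SELECT results outside the endpoint, the same schema:image links can also be written out by hand. The following is a minimal sketch, assuming the matches are available as (cinema ID, item ID) pairs; the helper `to_ntriples` is hypothetical, and the URI patterns are simply the ones visible in the result above.

```python
# URI patterns taken from the Turtle result above.
CC_BASE = "http://www.cinemacontext.nl/id/"
IISG_ITEM_BASE = "https://iisg.amsterdam/id/item/"
SCHEMA_IMAGE = "http://schema.org/image"


def to_ntriples(pairs):
    """Serialise (cinema_id, item_id) pairs as sorted, deduplicated N-Triples lines."""
    lines = {
        f"<{CC_BASE}{cinema}> <{SCHEMA_IMAGE}> <{IISG_ITEM_BASE}{item}> ."
        for cinema, item in pairs
    }
    return sorted(lines)


# Two matches from the results table, one of them listed twice on purpose.
pairs = [("B000008", "1095482"), ("B000006", "1089373"), ("B000008", "1095482")]
for line in to_ntriples(pairs):
    print(line)
```

N-Triples is used here because each link is one self-contained line, which makes the output easy to diff and to curate by hand before loading it into a triple store.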
Finally, we can visualize this result on a map, using the geographical coordinates of the cinema as marker, and projecting the image in a tooltip.
Map showing relevant items from the IISG-KG connected to movie theater locations. This map can be browsed in the query editor on druid.
Images of movie theaters in the RotterdamsPubliek project
During the Cinema Context datasprint (and some evenings afterwards), images of Rotterdam movie theaters were gathered. These images can be queried at the RotterdamsPubliek SPARQL endpoint. The data is also available for download at the rotterdams-publiek-data repository on GitHub.
The images are linked to Wikidata identifiers. Since we would probably like to have links between images and Cinema Context identifiers, the following query first asks for Wikidata items of the class ‘movie theater’ (wd:Q41253) with a Cinema Context ID, then searches for images in the RotterdamsPubliek data linked to these Wikidata items. The results can be viewed in ‘gallery mode’.
Gallery of images of movie theaters (through the RotterdamsPubliek project)
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT DISTINCT ?bios ?widgetImageLink ?widgetImage ?widgetDescription
  (CONCAT(?bioslabel,' | ',GROUP_CONCAT(?ccid;SEPARATOR=",")) AS ?widgetImageCaption)
  (GROUP_CONCAT(?ccid;SEPARATOR=",") AS ?ccids)
WHERE {
  SERVICE <https://query.wikidata.org/sparql> {
    ?bios wdt:P8296 ?ccid .
    ?bios wdt:P31 wd:Q41253 .
    ?bios rdfs:label ?bioslabel .
    FILTER (lang(?bioslabel) = 'nl') .
  }
  ?widgetImageLink dct:spatial ?bios .
  ?widgetImageLink foaf:depiction ?widgetImage .
  OPTIONAL{
    ?widgetImageLink dc:description ?widgetDescription .
  }
  BIND("<div></div>"^^rdf:HTML as ?widget)
}
GROUP BY ?bios ?bioslabel ?widgetImageLink ?widgetImage ?widgetDescription
ORDER BY ?bioslabel
LIMIT 1000
Gallery overview of images related to a movie theater. Copy and paste this query into the RotterdamsPubliek SPARQL endpoint, or click this link.
Alternatively, a good way to find (and add) images of movie theaters is of course Wikidata itself!
Gallery of images of movie theaters (through Wikidata)
#defaultView:ImageGrid
SELECT ?bios ?ccid ?biosLabel ?afb WHERE {
  ?bios wdt:P8296 ?ccid .
  ?bios wdt:P31 wd:Q41253 .
  ?bios wdt:P18 ?afb .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "nl,en". }
}
Or click this link to open the query results in the Wikidata query service directly.
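The same query can also be sent to the Wikidata Query Service from a script. The sketch below only builds the request URL and makes no network call. It assumes the public endpoint at https://query.wikidata.org/sparql, which takes the query text in a `query` parameter and returns JSON when `format=json` is passed. The helper name `build_wdqs_url` is hypothetical, and the `#defaultView:ImageGrid` line is left out because it only affects the web interface.

```python
from urllib.parse import urlencode

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# The gallery query from above; the wd:, wdt:, wikibase: and bd: prefixes
# are predefined by the Wikidata Query Service.
QUERY = """
SELECT ?bios ?ccid ?biosLabel ?afb WHERE {
  ?bios wdt:P8296 ?ccid .
  ?bios wdt:P31 wd:Q41253 .
  ?bios wdt:P18 ?afb .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "nl,en". }
}
"""


def build_wdqs_url(query: str) -> str:
    """Return a GET URL that asks the endpoint for JSON results."""
    params = urlencode({"query": query.strip(), "format": "json"})
    return WDQS_ENDPOINT + "?" + params


url = build_wdqs_url(QUERY)
print(url)
```

When actually fetching this URL, it is advisable to send a descriptive User-Agent header and to keep the request rate modest.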