Getting started

Are you interested in learning more about the potential of Linked Open Data for sharing and connecting data? To participate in the workshop, you do not need prior experience in working with Linked Open Data.

During the online pre-conference meeting (Wednesday 8 June 15.00-17.00 CET) and the on-site workshop in Rome (Wednesday 6th July 2022, 14.00-16.00), we will guide you through the process. The most fruitful way to participate in the workshop is to bring your own data. On this web page we explain how you can do that.

What data?

In the workshop, we will focus on a type of data that is quite central to research in New Cinema History: programming data. The essence of this type of data boils down to the screening as an event: when was which film screened where?

You can participate with a small sample or with a full data set. The latter is preferable: the larger the pool of data that we can use in the workshop, the easier it will be to find meaningful connections.

When you participate, your data remains yours, under the license of your choosing. We recommend CC BY 4.0, which allows free (re)use of your data but always requires attribution of the creator/owner of the data. We would like to make (parts of) the datasets available for download on this web page, so that they can serve as documentation and examples both during and after the workshop. However, it is not a problem if you don't want this.

Data quality

Since the prime goal of this workshop is to show the potential of Linked Open Data for sharing and connecting data, your dataset does not have to be flawless. You can state this in the description of the dataset if you like.

Preparing your data

We ask you to send in your data set so that we can convert it into Linked Open Data before the workshop in Rome. During the pre-conference meeting we will explain how this conversion works. We will also comment on your data set and ask you to make alterations if needed.

A schematic representation of the basic data model can be found here. It is derived from the Cinema Context data model, which was converted to Linked Open Data in 2020.

If you supply your data in our pre-defined format (a spreadsheet with specific columns, see below), then we are able to convert it to RDF using a small script. Therefore we ask you to structure your data according to the data model, as exemplified below.

External identifiers: IMDb and Wikidata

We will use the external identifiers from the IMDb and/or Wikidata pages to compare screening data across different datasets. We require you to provide at least one of these external identifiers for the movies in your data set.

If you do not have this information in your dataset, then you can try adding Wikidata identifiers using OpenRefine. A list of guides to reconcile your data to Wikidata is available here.
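To illustrate why these identifiers matter, here is a minimal sketch of how films in two datasets could be matched on shared identifiers. It is not the script we use in the workshop. The function names are hypothetical, and the identifier patterns (IMDb: tt followed by at least seven digits; Wikidata: Q followed by a number) are assumptions based on the examples on this page.

```python
import re

# Assumed identifier shapes, based on the examples tt0100802 and Q222018.
IMDB_RE = re.compile(r"^tt\d{7,}$")
WIKIDATA_RE = re.compile(r"^Q[1-9]\d*$")


def film_ids(row):
    """Return the set of valid (scheme, identifier) pairs found in one row."""
    ids = set()
    imdb = (row.get("film_imdb") or "").strip()
    wikidata = (row.get("film_wikidata") or "").strip()
    if IMDB_RE.match(imdb):
        ids.add(("imdb", imdb))
    if WIKIDATA_RE.match(wikidata):
        ids.add(("wikidata", wikidata))
    return ids


def shared_films(rows_a, rows_b):
    """Return the identifiers that occur in both datasets."""
    ids_a = set().union(*(film_ids(row) for row in rows_a))
    ids_b = set().union(*(film_ids(row) for row in rows_b))
    return ids_a & ids_b
```

Note that a film only matches if both datasets use the same kind of identifier for it, which is one reason to provide both IMDb and Wikidata identifiers where you can.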

Required format

We assume that every row in the spreadsheet represents a single screening event that shows a single film in one cinema. We ask you to format your data and fill in the mandatory columns below. This information is needed to be able to make comparisons between different datasets during the workshop.

The template is available for download below as csv and xlsx files.

Screening granularity: days or weeks

Are your screening events single screening events, or do they represent the screening of a movie for an entire week? For instance, Cinema Context generally assumes that a film programme runs for a week and only includes the start date of a screening week. This matters once we combine our datasets and want to make comparisons. You can indicate this in the dataset submission form.
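As an illustration of why granularity matters, the sketch below maps a daily screening date to the first day of the week it falls in, so that daily data could be compared with weekly data. The function name and the first_weekday parameter are hypothetical, and the day on which a programme week starts is an assumption that differs per dataset.

```python
from datetime import date, timedelta


def week_start(screening_date, first_weekday=0):
    """Return the first day of the week that contains screening_date.

    screening_date is a YYYY-MM-DD string.
    first_weekday is an assumption about the dataset: 0 = Monday, ..., 6 = Sunday.
    """
    day = date.fromisoformat(screening_date)
    # Number of days since the most recent first_weekday (0 if it is today).
    offset = (day.weekday() - first_weekday) % 7
    return (day - timedelta(days=offset)).isoformat()
```

For example, week_start("2022-06-08") gives the Monday of that week, and week_start("2022-06-08", first_weekday=4) gives the preceding Friday.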

Mandatory columns

screening_id: unique identifier for the screening. If you do not have a unique identifier, please use a unique number (e.g. 1, 2, 3, ...).
screening_date: date of the screening, formatted as ISO 8601 in YYYY-MM-DD format (e.g. 2022-06-01).
film_name: name of the film, preferably as advertised in the local language and combined with the optional film_name_language tag.
theater_name: name of the theater
theater_city: city where the theater is located
theater_country: country of the theater as a 2-letter ISO 3166-1 code (e.g. NL, DE, US, ...)

And at least one of these columns:

film_imdb: IMDb ID of the film (e.g. tt0100802)
film_wikidata: Wikidata ID of the film (e.g. Q222018)
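If you want to check your file before sending it in, the rules above can be sketched as a small Python check. This is only an illustration under our own assumptions, not the conversion script: the function names are hypothetical, and the country check only tests for two capital letters, not for a valid ISO 3166-1 code.

```python
import re
from datetime import date

MANDATORY = ["screening_id", "screening_date", "film_name",
             "theater_name", "theater_city", "theater_country"]
FILM_ID_COLUMNS = ["film_imdb", "film_wikidata"]


def value(row, column):
    """Return the stripped cell value, or an empty string if the cell is missing."""
    return (row.get(column) or "").strip()


def is_iso_date(text):
    """True if text is a real calendar date written as YYYY-MM-DD."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", text):
        return False
    try:
        date.fromisoformat(text)
    except ValueError:
        return False
    return True


def check_rows(rows):
    """Return a list of (row_number, message) problems; empty if none were found."""
    problems = []
    seen_ids = set()
    for number, row in enumerate(rows, start=1):
        for column in MANDATORY:
            if not value(row, column):
                problems.append((number, f"missing {column}"))
        screening_id = value(row, "screening_id")
        if screening_id in seen_ids:
            problems.append((number, "duplicate screening_id"))
        elif screening_id:
            seen_ids.add(screening_id)
        screening_date = value(row, "screening_date")
        if screening_date and not is_iso_date(screening_date):
            problems.append((number, "screening_date is not YYYY-MM-DD"))
        country = value(row, "theater_country")
        if country and not re.fullmatch(r"[A-Z]{2}", country):
            problems.append((number, "theater_country is not a 2-letter code"))
        if not any(value(row, column) for column in FILM_ID_COLUMNS):
            problems.append((number, "no film_imdb or film_wikidata"))
    return problems
```

The rows can be read with csv.DictReader from a csv file opened with encoding="utf-8". Row numbers count data rows, so row 1 is the first line after the header.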

None of these identifiers available?

If your data contains other (external) identifiers that are worth including, then add a column prefixed with film_ followed by the name of the service/dataset. We will not use this data during the workshop, but it may help us link your data.

Optional columns

film_id: unique identifier for the film (optional, but preferably the identifier that is used in your dataset)
film_name_language: language of the film name as a 2-letter IETF BCP 47 language tag (e.g.: nl, en, de, fr, es, it)
theater_id: unique identifier for the theater (optional, but preferably the identifier that is used in your dataset)
theater_address: address of the theater
theater_coordinates: coordinates of the theater in WGS84 latitude-longitude (e.g. 52.366497,4.894665)
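The theater_coordinates format can be checked with a few lines of Python. This is a minimal sketch with a hypothetical function name. It only tests that the value consists of two numbers in the valid latitude and longitude ranges, so it cannot detect every case where latitude and longitude have been swapped.

```python
def parse_coordinates(text):
    """Parse 'latitude,longitude' into a (latitude, longitude) tuple of floats.

    Raises ValueError if the text is not two numbers in the WGS84 ranges.
    """
    parts = text.split(",")
    if len(parts) != 2:
        raise ValueError(f"expected 'latitude,longitude', got {text!r}")
    latitude, longitude = float(parts[0]), float(parts[1])
    if not (-90 <= latitude <= 90 and -180 <= longitude <= 180):
        raise ValueError(f"coordinates out of range: {text!r}")
    return latitude, longitude
```

A common mistake this catches is a decimal comma, as in 52,366497 4,894665, which does not split into exactly two numbers.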

Table overview

screening_id	screening_date	film_id	film_name	film_name_language	film_imdb	film_wikidata	theater_id	theater_name	theater_address	theater_coordinates	theater_city	theater_country
1	1911-02-04	F031013	De Sheriff	nl	tt0354912	Q20803266	B000071	Americain Bioscoop	Daniël Stalpertstraat 67	52.3563093,4.8903107	Amsterdam	NL

You can also take a look at over 28k screening events from Cinema Context that are structured in this format:

Download cc_homer2022_example.csv

Download template

Download template as xlsx, or csv.

Sending in your data

Please first fill out the form below to provide us with some metadata that helps us with the conversion. Then upload your file using the SurfDrive link. Please send in your file as xlsx, csv (encoded in UTF-8) or a zipped version of these, and mention your dataset name in the filename. As a starting point, we recommend using the templates above.

Form for metadata submission: https://forms.gle/bmKTasNTP3BnqkmD7
File upload: https://surfdrive.surf.nl/files/index.php/s/vJz6iWtDzgnIE1c (anonymous, others cannot see your uploaded files)
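If your data lives in a script or database rather than in a spreadsheet, a UTF-8 encoded csv file in the template's column order can be written and zipped as sketched below. The function name write_submission and the choice to name both files after the dataset are our own assumptions for this example.

```python
import csv
import zipfile
from pathlib import Path

# Column order as in the table overview on this page.
COLUMNS = ["screening_id", "screening_date", "film_id", "film_name",
           "film_name_language", "film_imdb", "film_wikidata", "theater_id",
           "theater_name", "theater_address", "theater_coordinates",
           "theater_city", "theater_country"]


def write_submission(rows, dataset_name, directory="."):
    """Write rows to <dataset_name>.csv in UTF-8, zip it, and return the zip path."""
    directory = Path(directory)
    csv_path = directory / f"{dataset_name}.csv"
    # newline="" lets the csv module control line endings itself.
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        # Missing optional columns are written as empty cells.
        writer = csv.DictWriter(handle, fieldnames=COLUMNS, restval="")
        writer.writeheader()
        writer.writerows(rows)
    zip_path = directory / f"{dataset_name}.zip"
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.write(csv_path, arcname=csv_path.name)
    return zip_path
```

With these settings, csv.DictWriter raises a ValueError for keys that are not in COLUMNS, so extra film_ columns have to be added to the list first.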

Need help?

You can contact us for further help: createlab@uva.nl

Last update: July 5, 2022