### Computational Literary Studies Infrastructure (CLS INFRA)
A Horizon 2020 'Starting Community'
Christof Schöch (TCDH, Trier)

--

## CLS INFRA Overview

---

### Key Data

* Background: COST Action 'Distant Reading' and DARIAH-EU
* Focus: Infrastructure for CLS research
* Project lead: Maciej Eder (IJP-PAN, Kraków)
* Funding: 'Starting Community' programme in Horizon 2020
* Runtime: March 2021 to Feb 2025
* Partners: 13 partners from across Europe

---

### Partners

* **Institute of Polish Language, Kraków, PL (Maciej Eder - project lead)**
* Charles University, CZ (Silvie Cinkova)
* DARIAH-EU, Dublin (Jennifer Edmond; Frank Fischer)
* ENS, Lyon, FR (Serge Heiden)
* GhentCDH, Ghent University, BE (Julie Birkholz, Sally Chambers)
* HU Berlin, DE (Carolin Odebrecht)
* KNAW, Amsterdam, NL (Karina van Dalen-Oskam)
* National University of Ireland, Galway, IRL (Justin Tonra)
* ÖAW, Vienna, AT (Matej Durco)
* Potsdam University, Potsdam, DE (Peer Trilcke)
* Serbian Center for Digital Humanities, Belgrade, SRB (Toma Tasovac)
* TCDH, Trier University, Trier, DE (Christof Schöch)
* UNED, Madrid, ES (Salvador Ros)

---

### High-level ambitions

* Help build an infrastructure for CLS: datasets, tools, knowledge, standards, policy, community
* Connect institutional, regional, national and European infrastructure initiatives for CLS
* Strengthen interactions between expert researchers, researchers new to CLS, collection holders, research infrastructures
* Develop solutions for sustainability of the European CLS infrastructure
* Conceptualize dataset delivery for/with analysis ("programmable corpora")
* Continue the ambitions of the COST Action, DARIAH and CLARIN regarding open, diverse and multilingual CLS

---

### Concrete objectives

* Understand requirements, best practices, emerging trends (research analysis)
* Disseminate best practices and key methodological competencies (training, materials)
* Identify and describe key datasets (data curation)
* Connect datasets with tools (data transformations)
* Make key NLP tools widely available (tool integration)
* Develop a prototype implementation of programmable corpora

---

### Project Overview

---

### Work Packages

| WP | Lead | Topics |
|:--:|:---------:|:-------|
| WP1 | Kraków | Management, Coordination and Innovation Planning |
| WP2 | Galway | Dissemination, Communication and Exploitation |
| WP3 | Trier | Methodological considerations of CLS |
| WP4 | Amsterdam | Training and Skills for CLS |
| WP5 | Berlin | Issues of data curation and selection |
| WP6 | Vienna | Consolidating and preparing data for CLS |
| WP7 | Potsdam | Building the Ecosystem of and for Programmable Corpora |
| WP8 | Ghent | Corpus Enrichment and NLP Toolchains |
| WP9 | Dublin | Transnational Access to key collections and methods |
| WP10 | Kraków | Ethics requirements |

--

## WP 3: Methodological Considerations of CLS ("Methods")

---

### WP 3: Team

* Lead: TCDH, Trier
  * Christof Schöch
  * Evgenia Fileva
  * Julia Dudar
* Partners
  * Maciej Eder (IJP-PAN, Kraków)
  * Karina van Dalen-Oskam (KNAW, Amsterdam)
  * Peer Trilcke (Potsdam University)
  * Salvador Ros (UNED, Madrid)
  * Jennifer Edmond (DARIAH, Dublin)
  * Matej Durco (ÖAW, Vienna)

---

### WP 3: High-level ambitions

* To consolidate the CLS community by establishing user requirements and by documenting and disseminating methodological best practices
* To raise awareness among the wider community of CLS researchers and beyond regarding key issues that might hinder the progressive development and uptake of shared research infrastructures, and to showcase successful areas of application
* To support the adaptation of relevant new methods from other fields into CLS to increase excellence and innovation in CLS research
* To explore the utility of literary data as a research asset beyond the community of CLS research

---

### WP 3: Concrete objectives

* Gather and document current needs and practices in CLS
* Infer requirements for infrastructure from publications
* Gather and document current best practices in CLS research
* Feed best practices into the training programme (WP4)
* Showcase best practices and emerging trends
* Try out cutting-edge research / adapt new methods
* Document it in appealing showcases
* Outline current obstacles to further progress
* Explore use of literary data beyond CLS

---

### WP 3: Deliverables

* D3.1: Report on the methodological baseline for CLS/LS (M12)
* D3.2: Five short survey papers on key methodological concerns (M24)
* D3.3: Four showcases accompanied by explanatory papers (two at M24, another two at M36)
* D3.4: Series of three position papers (released for debate from M15) and two pilot studies (final deliverable at M40) on emerging trends in CLS methodologies
* D3.5: Report on user needs beyond academic research, including user stories/scenarios (M40)

---

### WP 3: Tasks and partners

* T3.1 (baseline): Trier with Amsterdam (WP 4)
* T3.2 (key methods): Trier with Kraków and Amsterdam (WP 4)
  * (a) authorship
  * (b) literary genres
  * (c) literary history
* T3.3 (showcases)
  * (a) ELTeC for stylometry: Trier with Kraków
  * (b) character networks in drama (Potsdam)
  * (c) poetry scansion (Madrid)
  * (d) spatio-temporal mapping with LOD (Vienna)
* T3.4 (pilot studies): Trier with Kraków
  * (a) Deep Learning
  * (b) Linked Open Data
  * (c) tbc.
* T3.5: Users beyond CLS (Dublin)

--

## Conclusion

---

### How to get involved?

* Get updates on Twitter: @CLSinfra
* Watch out for TNA (= research stay) calls
* Participate in workshops as a trainer or trainee
* Contribute corpora / metadata / tools
* Talk to us about your infrastructure needs
* Get involved via national initiatives (like SPP-CLS or Text+ 🤞 in DE)

---