Publications

  1. What do dialect speakers want?
    A survey of attitudes towards language technology for German dialects
    Verena Blaschke, Christoph Purschke, Hinrich Schütze & Barbara Plank
    Preprint | Abstract Cite PDF
  2. MaiBaam: A multi-dialectal Bavarian Universal Dependency treebank
    Verena Blaschke, Barbara Kovačić, Siyao Peng, Hinrich Schütze & Barbara Plank
    LREC–COLING 2024 | Abstract PDF Data Code Annotation guidelines
  3. Sebastian, Basti, Wastl?!
    Recognizing named entities in Bavarian dialectal data
    Siyao Peng, Zihang Sun, Huangyan Shan, Marie Kolm, Verena Blaschke, Ekaterina Artemova & Barbara Plank
    LREC–COLING 2024 | Abstract PDF Data
  4. Exploring the robustness of task-oriented dialogue systems for colloquial German varieties
    Ekaterina Artemova, Verena Blaschke & Barbara Plank
    EACL 2024 | Abstract Cite PDF Slides Poster Code
  5. A survey of corpora for Germanic low-resource languages and dialects
    Verena Blaschke, Hinrich Schütze & Barbara Plank
    NoDaLiDa 2023 | Abstract Cite Website PDF Slides
  6. Does manipulating tokenization aid cross-lingual transfer?
    A study on POS tagging for non-standardized languages
    Verena Blaschke, Hinrich Schütze & Barbara Plank
    VarDial @ EACL 2023 | Abstract Cite PDF Slides Video Poster Code
  7. Navigable atom-rule interactions in PSL models enhanced by rule verbalizations, with an application to etymological inference
    Verena Blaschke, Thora Daneyko, Jekaterina Kaparina, Zhuge Gao & Johannes Dellert
    ILP 2022 | Abstract Cite PDF Slides Poster Code (PSL-Infrastructure) Code (PSL-RAGviewer)
  8. CyberWallE at SemEval-2020 task 11: An analysis of feature engineering for ensemble models for propaganda detection
    Verena Blaschke, Maxim Korniyenko & Sam Tureski
    SemEval @ COLING 2020 | Abstract Cite PDF Poster Code
  9. Tübingen-Oslo Team at the VarDial 2018 evaluation campaign: An analysis of n-gram features in language variety identification
    Çağrı Çöltekin, Taraka Rama & Verena Blaschke
    VarDial @ COLING 2018 | Abstract Cite PDF

Talks

(Excluding paper presentations, which are linked in the Publications section where applicable.)

  1. Large language models and small language varieties:
    Challenges and current methods
    Verena Blaschke & Barbara Plank
    Embracing Variability in Natural Language Processing @ ICLaVE 2024
  2. Natural dialect processing:
    NLP for non-standardized language varieties
    Verena Blaschke & Barbara Plank
    Quantitative approaches in dialectology and variationist sociolinguistics (12/2023) | Slides
  3. Configurable language-specific tokenization for CLDF databases
    Johannes Dellert & Verena Blaschke
    Exploiting standardized cross-linguistic data in historical linguistics @ ICHL 2023 | Abstract
  4. Correlating borrowing events across concepts to derive a data-driven source of evidence for loanword etymologies
    Verena Blaschke & Johannes Dellert
    Model and Evidence in Quantitative Comparative Linguistics @ DGfS 2021 | Abstract Slides Code
  5. Clustering dialect varieties based on historical sound correspondences
    Verena Blaschke
    GSCL Student Award nominee presentations @ KONVENS 2019 | Abstract Summary Slides Code

Theses

  1. Explainable machine learning in linguistics and applied NLP:
    Two case studies of Norwegian dialectometry and sexism detection in French tweets
    Master’s thesis, 2021, supervised by Çağrı Çöltekin & John Nerbonne | Abstract PDF Code
  2. Clustering dialect varieties based on historical sound correspondences
    Bachelor’s thesis, 2018, supervised by Çağrı Çöltekin. Finalist for the GSCL Award for the best Bachelor’s thesis in computational linguistics 2017–2019 | Abstract PDF Summary Slides Code