Publications

  1. What do dialect speakers want?
    A survey of attitudes towards language technology for German dialects

    Verena Blaschke, Christoph Purschke, Hinrich Schütze & Barbara Plank
  2. MaiBaam: A multi-dialectal Bavarian Universal Dependency treebank

    Verena Blaschke, Barbara Kovačić, Siyao Peng, Hinrich Schütze & Barbara Plank
  3. Sebastian, Basti, Wastl?!
    Recognizing named entities in Bavarian dialectal data

    Siyao Peng, Zihang Sun, Huangyan Shan, Marie Kolm, Verena Blaschke, Ekaterina Artemova & Barbara Plank
  4. Exploring the robustness of task-oriented dialogue systems for colloquial German varieties

    Ekaterina Artemova, Verena Blaschke & Barbara Plank
  5. A survey of corpora for Germanic low-resource languages and dialects

    Verena Blaschke, Hinrich Schütze & Barbara Plank
  6. Does manipulating tokenization aid cross-lingual transfer?
    A study on POS tagging for non-standardized languages

    Verena Blaschke, Hinrich Schütze & Barbara Plank
  7. Navigable atom-rule interactions in PSL models enhanced by rule verbalizations, with an application to etymological inference

    Verena Blaschke, Thora Daneyko, Jekaterina Kaparina, Zhuge Gao & Johannes Dellert
  8. CyberWallE at SemEval-2020 task 11: An analysis of feature engineering for ensemble models for propaganda detection

    Verena Blaschke, Maxim Korniyenko & Sam Tureski
  9. Tübingen-Oslo Team at the VarDial 2018 evaluation campaign: An analysis of n-gram features in language variety identification

    Çağrı Çöltekin, Taraka Rama & Verena Blaschke

Talks

(Excluding paper presentations, which are linked in the Publications section where applicable.)

  1. Dialect NLP: How (and why) to process non-standard language varieties

    Verena Blaschke
  2. Large language models and small language varieties:
    Challenges and current methods

    Verena Blaschke & Barbara Plank
  3. Natural dialect processing:
    NLP for non-standardized language varieties

    Verena Blaschke & Barbara Plank
  4. Configurable language-specific tokenization for CLDF databases

    Johannes Dellert & Verena Blaschke
  5. Correlating borrowing events across concepts to derive a data-driven source of evidence for loanword etymologies

    Verena Blaschke & Johannes Dellert
  6. Clustering dialect varieties based on historical sound correspondences

    Verena Blaschke

Theses

  1. Explainable machine learning in linguistics and applied NLP:
    Two case studies of Norwegian dialectometry and sexism detection in French tweets

    Master’s thesis, 2021, supervised by Çağrı Çöltekin & John Nerbonne · Abstract PDF Code
  2. Clustering dialect varieties based on historical sound correspondences

    Bachelor’s thesis, 2018, supervised by Çağrı Çöltekin. Finalist for the GSCL Award for the best Bachelor’s thesis in computational linguistics 2017–2019 · Abstract PDF Summary Slides Code