Publications & preprints

(*Denotes equal contribution.)

  1. Queer NLP: A critical survey on literature gaps, biases and trends

    Sabine Weber, Angelina Wang*, Ankush Gupta*, Arjun Subramonian*, Dennis Ulmer*, Eshaan Tanwar*, Geetanjali Aich*, Hannah Devinney*, Jacob Hobbs*, Jennifer Mickel*, Joshua Tint*, Mae Sosto*, Ray Groshan*, Simone Astarita*, Vagrant Gautam*, Verena Blaschke*, William Agnew*, Wilson Y Lee* & Yanan Long*
    Preprint
  2. Are non-English papers reviewed fairly?
    Language-of-study bias in NLP peer reviews

    Ehsan Barkhordar, Abdulfattah Safa, Verena Blaschke, Erika Lombart, Marie‑Catherine de Marneffe & Gözde Gül Şahin
  3. Standard-to-dialect transfer trends differ across text and speech:
    A case study on intent and topic classification in German dialects

    Verena Blaschke, Miriam Winkler & Barbara Plank
  4. Variation is the norm: Embracing sociolinguistics in NLP

    Anne‑Marie Lutgen, Alistair Plum, Verena Blaschke, Barbara Plank & Christoph Purschke
  5. Indirect question answering in English, German and Bavarian:
    A challenging task for high- and low-resource languages alike

    Miriam Winkler, Verena Blaschke & Barbara Plank
  6. Information asymmetry across language varieties:
    A case study on Cantonese–Mandarin and Bavarian–German QA

    Renhao Pei*, Siyao Peng*, Verena Blaschke, Robert Litschko & Barbara Plank
  7. Make every letter count:
    Building dialect variation dictionaries from monolingual corpora

    Robert Litschko, Verena Blaschke, Diana Burkhardt, Barbara Plank & Diego Frassinelli
  8. DistaLs: A comprehensive collection of language distance measures

    Rob van der Goot, Esther Ploeger, Verena Blaschke & Tanja Samardžić
  9. A multi-dialectal dataset for German dialect ASR and dialect-to-standard speech translation

    Verena Blaschke, Miriam Winkler, Constantin Förster, Gabriele Wenger‑Glemser & Barbara Plank
  10. Analyzing the effect of linguistic similarity on cross-lingual transfer:
    Tasks and experimental setups matter

    Verena Blaschke, Masha Fedzechkina & Maartje ter Hoeve
  11. Methods and resources in Germanic variationist linguistics

    John Nerbonne, Verena Blaschke, Hinrich Schütze & Barbara Plank
  12. Evaluating pixel language models on non-standardized languages

    Alberto Muñoz‑Ortiz, Verena Blaschke & Barbara Plank
  13. Cross-dialect information retrieval:
    Information access in low-resource and high-variance languages

    Robert Litschko, Oliver Kraus, Verena Blaschke & Barbara Plank
  14. Improving dialectal slot and intent detection with auxiliary tasks:
    A multi-dialectal Bavarian case study

    Xaver Maria Krückl*, Verena Blaschke* & Barbara Plank
  15. Add noise, tasks, or layers?
    MaiNLP at the VarDial 2025 shared task on Norwegian dialectal slot and intent detection

    Verena Blaschke*, Felicia Körner* & Barbara Plank
  16. What do dialect speakers want?
    A survey of attitudes towards language technology for German dialects

    Verena Blaschke, Christoph Purschke, Hinrich Schütze & Barbara Plank
  17. MaiBaam: A multi-dialectal Bavarian Universal Dependency treebank

    Verena Blaschke, Barbara Kovačić, Siyao Peng, Hinrich Schütze & Barbara Plank
  18. Sebastian, Basti, Wastl?!
    Recognizing named entities in Bavarian dialectal data

    Siyao Peng, Zihang Sun, Huangyan Shan, Marie Kolm, Verena Blaschke, Ekaterina Artemova & Barbara Plank
  19. Exploring the robustness of task-oriented dialogue systems for colloquial German varieties

    Ekaterina Artemova, Verena Blaschke & Barbara Plank
  20. A survey of corpora for Germanic low-resource languages and dialects

    Verena Blaschke, Hinrich Schütze & Barbara Plank
  21. Does manipulating tokenization aid cross-lingual transfer?
    A study on POS tagging for non-standardized languages

    Verena Blaschke, Hinrich Schütze & Barbara Plank
  22. Navigable atom-rule interactions in PSL models enhanced by rule verbalizations, with an application to etymological inference

    Verena Blaschke, Thora Daneyko, Jekaterina Kaparina, Zhuge Gao & Johannes Dellert
  23. CyberWallE at SemEval-2020 task 11: An analysis of feature engineering for ensemble models for propaganda detection

    Verena Blaschke, Maxim Korniyenko & Sam Tureski
  24. Tübingen-Oslo Team at the VarDial 2018 evaluation campaign: An analysis of n-gram features in language variety identification

    Çağrı Çöltekin, Taraka Rama & Verena Blaschke

Theses

  1. Explainable machine learning in linguistics and applied NLP:
    Two case studies of Norwegian dialectometry and sexism detection in French tweets

    Master’s thesis, 2021, supervised/examined by Çağrı Çöltekin & John Nerbonne
  2. Clustering dialect varieties based on historical sound correspondences

    Bachelor’s thesis, 2018, supervised by Çağrı Çöltekin. Finalist for the GSCL Award for the best Bachelor’s thesis in computational linguistics 2017–2019