Our research focuses on the development and application of algorithmic methods and theories for the social sciences and humanities. These disciplines are confronted with an increasing volume of data about social systems, processes, and phenomena. The work of our group is thus geared towards leveraging the potential of large amounts of data about social systems to address current challenges in the social sciences and humanities. In research and teaching, we develop and transfer (i) methods for mining and modelling textual data and (ii) methods for mining and modelling relational (network) and sequential (procedural) data. This enables the application of computational methods to answer research questions in the social sciences and humanities.

Methods for Understanding Sequence Data

Human behavior is often captured in data sequences, e.g., as sequences of visited places or visited websites. We aim to develop novel data analysis algorithms for understanding such sequences.

Selected Publications:
  • Philipp Singer, Denis Helic, Andreas Hotho, Markus Strohmaier: HypTrails: A Bayesian Approach for Comparing Hypotheses About Human Trails on the Web. WWW 2015: 1003-1013. Best Paper.
  • Florian Lemmerich, Martin Becker, Philipp Singer, Denis Helic, Andreas Hotho, Markus Strohmaier: Mining Subgroups with Exceptional Transition Behavior. KDD 2016: 965-974.
  • Martin Becker, Florian Lemmerich, Philipp Singer, Markus Strohmaier, Andreas Hotho: MixedTrails: Bayesian hypothesis comparison on heterogeneous sequential data. Data Min. Knowl. Discov. 31(5): 1359-1390 (2017)

Identifying Patterns in Data

Big Data is often heterogeneous. We therefore aim to develop new methods that identify interesting (exceptional) subparts of the data using pattern mining approaches, in particular Subgroup Discovery and Exceptional Model Mining.

Selected Publications:
  • Florian Lemmerich, Martin Atzmueller, Frank Puppe: Fast exhaustive subgroup discovery with numerical target concepts. Data Min. Knowl. Discov. 30(3): 711-762 (2016)
  • Florian Lemmerich, Martin Becker, Frank Puppe: Difference-Based Estimates for Generalization-Aware Subgroup Discovery. ECML/PKDD (3) 2013: 288-303. Best Paper.

Analyzing User Behavior

We apply data mining and knowledge discovery methods to study human behavior in online environments.

Selected Publications:
  • Philipp Singer, Florian Lemmerich, Robert West, Leila Zia, Ellery Wulczyn, Markus Strohmaier, Jure Leskovec: Why We Read Wikipedia. WWW 2017: 1591-1600
  • Haiko Lietz, Claudia Wagner, Arnim Bleier, Markus Strohmaier: When Politicians Talk: Assessing Online Conversational Practices of Political Parties on Twitter. ICWSM 2014. Best Paper.
  • Claudia Wagner, Matthew Rowe, Markus Strohmaier, Harith Alani: Ignorance Isn't Bliss: An Empirical Analysis of Attention Patterns in Online Communities. SocialCom/PASSAT 2012: 101-110. Best Paper.

Bias and Social Issues

We analyze data to investigate biases and cultural differences.

Selected Publications:
  • Anna Samoilenko, Florian Lemmerich, Katrin Weller, Maria Zens, Markus Strohmaier: Analysing Timelines of National Histories Across Wikipedia Editions: A Comparative Computational Approach. ICWSM 2017: 210-219
  • Paul Laufer, Claudia Wagner, Fabian Flöck, Markus Strohmaier: Mining cross-cultural relations from Wikipedia: A study of 31 European food cultures. WebSci 2015: 3:1-3:10. Best Paper.
  • Claudia Wagner, David Garcia, Mohsen Jadidi, Markus Strohmaier: It's a Man's Wikipedia? Assessing Gender Inequality in an Online Encyclopedia. ICWSM 2015: 454-463