The corpus contains 1,030 online communication messages, randomly selected from Network News Transfer Protocol (NNTP) newsgroups and from the bug tracking systems Bugzilla and GitHub. NNTP articles and Bugzilla and GitHub comments were selected randomly so that the sample exhibits sim...