Event Date: Tuesday, 21 June 2016
Location: Via Santa Maria, 36, Pisa, PI, Italia [2nd floor seminar room]
Speaker: Malvina Nissim
Title: Analysing Authors
Abstract: As publishing has become increasingly accessible and essentially cost-free, virtually anyone can get their words printed, whether online or on paper. This ease of disseminating content does not necessarily go together with author identifiability. In other words, it is very simple for anyone to write a text publicly, but it is not always equally simple to tell who the author of a text is. Identifying the author of a text can be approached at various levels of detail. For example, in some contexts, possibly in the interest of companies that want to advertise or of legal institutions, it can amount to profiling, namely determining certain characteristics of the author, such as sex and age. In other contexts, in the interest of ancient and contemporary literary or historical studies as well as the aforementioned fields, identifying authors can mean being able to tell whether two texts are likely to have been written by the same person. The latter problem can take more than one form in practice: one could be faced with one unknown text to compare with another written by a known author, or one could be given a large number of unknown texts to be clustered according to authorship.
In this talk, I will introduce the specifics of these tasks and describe a couple of systems that perform author profiling and author identification on different kinds of texts in different languages, experimenting with various kinds of linguistic and structural features. I will also discuss these systems and their performance, not only in comparison with the state of the art but also with regard to how a broad analysis of authors should be designed and performed, especially in terms of problem representation.