This study was coordinated by Dr Magali Duran within the scope of the PropBank project.
A challenging topic in Portuguese language processing is the multifunctional and ambiguous use of the clitic pronoun “se”, which affects Natural Language Processing tasks such as syntactic parsing, semantic role labeling and machine translation. As a step towards the automatic disambiguation of “se”, this study focuses on the identification of pronominal verbs. These verbs exhibit one of the six uses of “se” as a clitic pronoun, the only one in which “se” has neither a syntactic nor a semantic function. For this reason, “se” is considered a constitutive part of the lemma of the verb to which it is bound, forming a multiword unit. Our strategy for identifying such verbs was to analyze the results of a corpus search and rule out all other possible uses of “se”. This process revealed the features that a computational lexicon needs in order to perform the disambiguation task automatically. This lexicon of pronominal verbs will be made available on the web to enable the inclusion of these verbs in broader Portuguese lexical resources.