Better dictionaries through natural language processing

February 23, 2018 at 2:00 PM to 3:30 PM

Location: 5345 Herzberg Laboratories


Word-level knowledge stored in lexicons is essential to high-quality systems for a variety of natural language processing tasks. Similarly, dictionaries are important tools for language learners and translators, as well as valuable cultural artefacts. One problem, however, is that language is constantly changing: new words are coined every day, and new meanings of established words commonly emerge. Lexicons and dictionaries therefore need to be constantly updated, but doing so manually is very expensive, so techniques for keeping them up to date automatically are required. In this talk, I will first present an overview of my research in natural language processing related to this theme, and then discuss my recent research on two topics in more detail:

  1. automatically identifying instances of multiword expressions (e.g., idiomatic expressions such as “spill the beans” and “blow up”) in text; and
  2. modeling word meaning without relying on word senses.

I will conclude with my vision for future research in this area.


Paul Cook is an Assistant Professor in the Faculty of Computer Science at the University of New Brunswick. He received his PhD in Computer Science from the University of Toronto in 2010. Prior to joining the University of New Brunswick, he was a McKenzie Postdoctoral Fellow at the University of Melbourne from 2011 to 2014. Paul’s research in natural language processing focuses primarily on methods for automatically learning lexical knowledge from large text corpora. His specific research interests include lexical semantics, neologisms, multiword expressions, social media text processing, and web corpus construction and analysis. He is particularly interested in applications of natural language processing in lexicography for building better dictionaries and keeping dictionaries up to date.