Natural Language Processing Group

FourIE

FourIE is a neural information extraction system developed by the Natural Language Processing group at the University of Oregon. FourIE annotates text for entity mentions (names, pronouns, nominals), relations, event triggers, and argument roles using the information schema defined in the ACE 2005 dataset. FourIE uses deep learning and graph convolutional networks to jointly perform four information extraction tasks end to end: entity mention detection, relation extraction, event detection, and argument role prediction. Our system achieves state-of-the-art performance for joint information extraction on ACE 2005.

Trankit

Trankit is a lightweight Transformer-based toolkit for multilingual Natural Language Processing (NLP). It provides a trainable pipeline for fundamental NLP tasks in over 100 languages, and 90 pretrained pipelines for 56 languages. Built on a state-of-the-art pretrained language model, Trankit significantly outperforms prior multilingual NLP pipelines on sentence segmentation, part-of-speech tagging, morphological feature tagging, and dependency parsing, while maintaining competitive performance on tokenization, multi-word token expansion, and lemmatization across 90 Universal Dependencies treebanks. Our pipeline also obtains competitive or better named entity recognition (NER) performance compared to existing popular toolkits on 11 public NER datasets covering 8 languages.

Trankit can be easily installed via pip:

pip install trankit
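
Once installed, a pretrained pipeline can be run from Python. The following is a minimal sketch, not an official example: it assumes the `Pipeline` class and the `'english'` pipeline name as described in Trankit's documentation, and it assumes the output is a dictionary with a `sentences` list whose entries each hold a `tokens` list with a `text` field. The helper names `annotate` and `token_texts` are hypothetical and introduced here only for illustration.

```python
def token_texts(doc):
    """Collect token strings, one list per sentence, from a Trankit-style output dict.

    Assumption: `doc` looks like
    {"sentences": [{"tokens": [{"text": ...}, ...]}, ...]}.
    """
    return [[tok["text"] for tok in sent["tokens"]] for sent in doc["sentences"]]


def annotate(text, language="english"):
    """Run a pretrained Trankit pipeline over raw text and return its output."""
    # Imported lazily so token_texts() stays usable without Trankit installed.
    from trankit import Pipeline

    # Assumption: 'english' names a pretrained pipeline. The first call
    # downloads model files, so it needs network access and disk space.
    pipeline = Pipeline(language)
    return pipeline(text)
```

Calling `token_texts(annotate("Hello! This is Trankit."))` should then give the tokens grouped by sentence; consult the documentation page for the full set of annotation fields and supported languages.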

For more information, please visit our GitHub repo, documentation page, and technical paper.