The Semantic Lab at Pratt Institute has unveiled a new Named Entity Recognition (NER) toolchain and demo, in which six NER tools are combined into a single “dockerized” server to “lower the difficulty in leveraging them.”
The tools used are:
- DBpedia Spotlight
- Stanford NLP (using the english.muc.7class.distsim.crf.ser.gz classifier)
- NLTK trained on the Groningen Meaning Bank corpus
- spaCy
- The OpeNER Project
- TensorFlow SyntaxNet: Parsey McParseface (a part-of-speech tagger, used here to extract proper nouns)
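
Unlike the other five tools, the last one is a part-of-speech tagger rather than an NER model, so entity candidates have to be derived from its tags. The following is a minimal sketch of that idea, not the Semantic Lab's actual code. It assumes Penn Treebank tags (`NNP`, `NNPS`) and that consecutive proper nouns form one candidate. The function name `extract_proper_nouns` and the hand-tagged sample are hypothetical, and the sample does not come from SyntaxNet itself.

```python
from typing import List, Tuple

# Assumption: Penn Treebank tags, where NNP/NNPS mark proper nouns.
PROPER_NOUN_TAGS = {"NNP", "NNPS"}


def extract_proper_nouns(tagged: List[Tuple[str, str]]) -> List[str]:
    """Group runs of consecutive proper-noun tokens into candidate entities."""
    entities: List[str] = []
    current: List[str] = []
    for token, tag in tagged:
        if tag in PROPER_NOUN_TAGS:
            current.append(token)
        elif current:
            # A non-proper-noun token closes the current run.
            entities.append(" ".join(current))
            current = []
    if current:
        # Flush a run that reaches the end of the input.
        entities.append(" ".join(current))
    return entities


# Hand-tagged illustration; a real pipeline would take these pairs from the tagger.
sample = [
    ("The", "DT"), ("Semantic", "NNP"), ("Lab", "NNP"), ("at", "IN"),
    ("Pratt", "NNP"), ("Institute", "NNP"), ("released", "VBD"),
    ("a", "DT"), ("toolchain", "NN"), (".", "."),
]
candidates = extract_proper_nouns(sample)
print(candidates)  # ['Semantic Lab', 'Pratt Institute']
```

A proper-noun run carries no entity type (person, place, organization), so in a combined toolchain this kind of output is best treated as untyped candidates alongside the typed results from the dedicated NER tools.
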
The new NER toolchain is part of DADAnalytics, an IMLS-funded project to build a modular tool that performs supervised entity extraction from archival documents in order to generate linked open datasets.