The Semantic Lab at Pratt Institute has unveiled a new Named Entity Recognition (NER) toolchain and demo, in which six NER tools are combined into a single “dockerized” server to “lower the difficulty in leveraging them.”
The tools used are:
- DBpedia Spotlight
- Stanford NLP (using the english.muc.7class.distsim.crf.ser.gz classifier)
- NLTK trained on the Groningen Meaning Bank corpus
- spaCy
- The OpeNER Project
- TensorFlow SyntaxNet: Parsey McParseface (POS tagger used to extract proper nouns)
The new NER toolchain is part of an IMLS-funded project, DADAnalytics, a modular tool that performs supervised entity extraction from archival documents in order to generate linked open datasets.
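To give a rough sense of what combining several NER tools can involve, here is a minimal sketch that merges entity lists from multiple taggers by simple voting. It is not the Semantic Lab's implementation or API: the function name `merge_entities`, the `min_votes` parameter, the `(text, label)` input shape, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict


def merge_entities(tool_outputs, min_votes=2):
    """Merge entities reported by several NER tools (hypothetical helper).

    tool_outputs maps a tool name to a list of (text, label) pairs.
    An entity is kept when at least `min_votes` tools agree on both
    its normalized text and its label.
    """
    votes = defaultdict(set)
    for tool, entities in tool_outputs.items():
        for text, label in entities:
            # Normalize case and whitespace so trivial differences
            # between tools do not split the vote.
            key = (text.strip().lower(), label.strip().upper())
            votes[key].add(tool)

    merged = [
        {"text": text, "label": label, "tools": sorted(tools)}
        for (text, label), tools in votes.items()
        if len(tools) >= min_votes
    ]
    # Most widely agreed-upon entities first, then alphabetical.
    merged.sort(key=lambda e: (-len(e["tools"]), e["text"], e["label"]))
    return merged


# Made-up sample output, standing in for real tool responses.
sample = {
    "spacy": [("Pratt Institute", "ORG"), ("Brooklyn", "LOC")],
    "stanford": [("Pratt Institute", "ORG")],
    "nltk": [("pratt institute", "org"), ("Brooklyn", "PER")],
}

result = merge_entities(sample)
for entity in result:
    print(entity["text"], entity["label"], entity["tools"])
```

In practice each tool uses its own label set and tokenization, so a real pipeline would likely need a label-mapping step before any vote of this kind is meaningful.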
dh+lib Review
This post was produced through a cooperation between Anna Kijas, Lorena O'English, Susanne Pichler, Shilpa Rele, Sandra Sawchuk (Editors-at-large for the week), Roxanne Shirazi (Editor for the week), and Caitlin Christian-Lamb, Nickoal Eichmann-Kalwara, Sarah Melton, and Patrick Williams (dh+lib Review Editors).