PhD & Assoc. Prof., Inalco, Paris, France
"The question whether machines can think is about as relevant as the question whether submarines can swim" (E. Dijkstra)
During my PhD, I designed, experimented with, implemented and evaluated a named entity recognition system (for the moment, for French only) that leverages annotation rules extracted automatically through pattern mining. Besides annotation, this system also makes it possible to view and manipulate the patterns (sequential, hierarchical) that are correlated with so-called markers (currently, XML tags).
The version available for download here annotates named entities in French texts using a model learned from the Etape corpora. If you use a Linux or Mac computer, installation should be quite simple (it is detailed in the README file).
You may download a version of my Data Mining and Named Entity Recognition system on GitHub, here. This is an alpha version that may change a lot in the next few weeks. Don't hesitate to contact me for instructions on how to install and use it!
This system is described in the following publication: Recognizing Named Entities using Automatically Extracted Transduction Rules (Nouvel et al., 2011) [pdf]. Many thanks to Arnaud Soulet for providing the skeleton of this system and to Nathalie Friburger for the resources shared from her CasEN NER system.