How to format data for Named Entity Recognition (NER)
NER dataset shapes:
- Key information file: A file that lists the entities used as key information for the training data.
- Training data: A file (.txt or .tsv) whose columns are separated by a Tab character. One column holds the sentence, and the remaining columns hold the labels for the tokens in that sentence.
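
As a rough illustration of the two shapes above, the following Python sketch reads tab-separated training rows and checks their labels against an entity list. Several details are assumptions made for the example and are not specified by this article: the sentence is taken to be the first column, each remaining column is taken to be the label for one whitespace-separated token, `O` is used as a hypothetical "no entity" label, and the entity names (`PERSON`, `CITY`) and function names are invented for illustration.

```python
import csv
import io


def parse_ner_rows(text, sentence_col=0):
    """Split tab-separated rows into (tokens, labels) pairs.

    Assumes one sentence column; every other column is treated as a label.
    """
    rows = []
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        sentence = fields[sentence_col]
        labels = fields[:sentence_col] + fields[sentence_col + 1:]
        rows.append((sentence.split(), labels))
    return rows


def unknown_labels(rows, entities, outside="O"):
    """Return labels that are neither in the entity list nor the outside tag."""
    seen = {label for _, labels in rows for label in labels}
    return sorted(seen - set(entities) - {outside})


# Hypothetical sample data: entity list and two training rows.
ENTITIES = ["PERSON", "CITY"]
SAMPLE = (
    "Alice visited Paris\tPERSON\tO\tCITY\n"
    "Bob lives here\tPERSON\tO\tPLACE\n"
)

rows = parse_ner_rows(SAMPLE)
for tokens, labels in rows:
    # Under the assumption above, each token should have exactly one label.
    print(list(zip(tokens, labels)), "aligned:", len(tokens) == len(labels))
print("labels missing from entity list:", unknown_labels(rows, ENTITIES))
```

A check like this can catch rows whose label count does not match the token count, or labels that do not appear in the key information file, before the data is used for training.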