Recognition of protein/gene names from text using an ensemble of classifiers

Abstract This paper proposes an ensemble of classifiers for biomedical name recognition in which three classifiers, one Support Vector Machine and two discriminative Hidden Markov Models, are combined effectively using a simple majority voting strategy. In addition, we incorporate three post-processing modules, including an abbreviation resolution module, a protein/gene name refinement module and a simple dictionary matching module, into the system to further improve the performance. Evaluation shows that our system achieves the best performance from among 10 systems with a balanced F-measure of 82.58 on the closed evaluation of the BioCreative protein/gene name recognition task (Task 1A).
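
As an illustration of the simple majority voting idea, the following is a minimal sketch, not the system described in the abstract. It assumes each of the three classifiers emits one label per token (the B/I/O labels here are an assumed tagging scheme), and the function name `majority_vote` and its `fallback` tie-break parameter are hypothetical: the abstract does not say how a three-way disagreement is resolved.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]], fallback: int = 0) -> List[str]:
    """Combine per-token labels from several taggers by simple majority.

    predictions[i][t] is the label that tagger i assigns to token t.
    If no label is chosen by more than half of the taggers, the label
    from tagger `fallback` is used. That tie-break is an assumption
    made for this sketch, not something stated in the abstract.
    """
    if not predictions:
        return []
    n_tokens = len(predictions[0])
    if any(len(p) != n_tokens for p in predictions):
        raise ValueError("all taggers must label the same number of tokens")

    combined = []
    for t in range(n_tokens):
        votes = Counter(p[t] for p in predictions)
        label, count = votes.most_common(1)[0]
        if count * 2 > len(predictions):
            # A strict majority of taggers agree on this label.
            combined.append(label)
        else:
            # No majority: defer to the designated fallback tagger.
            combined.append(predictions[fallback][t])
    return combined


# Toy outputs standing in for one SVM and two HMM taggers (made-up labels).
svm_labels = ["B", "I", "O", "O"]
hmm1_labels = ["B", "O", "O", "B"]
hmm2_labels = ["B", "I", "O", "I"]

print(majority_vote([svm_labels, hmm1_labels, hmm2_labels]))
```

With three voters, any label chosen by two of them wins. The fallback only matters when all three disagree, as on the last token of the toy input.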
