Taxonomic identification of clinical isolates is routinely achieved using matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS). If the species cannot be reliably identified, whole genome sequencing can be applied. The aim of this study was to compare approaches to taxonomic assignment for isolates that are difficult to identify. Fifty-seven isolates were included in the study. The isolates were whole genome sequenced and de novo assembled. Assembly-based classification was performed with the Genome Taxonomy Database Toolkit (GTDB-Tk), BLAST against 16S databases, the Type (Strain) Genome Server (TYGS), and ribosomal MLST (rMLST). Read-based classification was performed with MetaPhlAn and Kraken. Twenty-nine isolates were assigned to the same species by all four classifiers, while the remaining 28 received diverging assignments. For the latter isolates, GTDB-Tk outperformed the other classifiers in terms of which assignments were most likely correct. Of the read-based classifiers, MetaPhlAn performed better than Kraken. Our evaluation identified GTDB-Tk as the strongest tool for taxonomic assignment of isolates that are difficult to identify. Disagreements between classifiers are likely due to database limitations, incorrectly assigned taxonomy, or unreliable 16S-based assignments.