What is BioCollections?
BioCollections is a curated dataset of metadata for culture collections, museums, herbaria and other natural history collections, including Darwin Core institution and collection codes, and URL formulae for mapping specimen ids to web pages at the collection site.
BioCollections stores the acronyms used in “structured vouchers” for sequence entries submitted to the International Nucleotide Sequence Database Collaboration (INSDC), which comprises GenBank, the European Nucleotide Archive (ENA), and the DNA Data Bank of Japan (DDBJ), and for entries submitted to NCBI’s BioSample.
What are the Sources?
Collection metadata was imported from online directories of specimen repositories such as Index Herbariorum, the World Federation for Culture Collections (CCINFO), Insect and Spider Collections of the World, Amphibian Species of the World (AMNH), the collection abbreviations in the Catalog of Fishes (CAS), and other lists of biorepositories published in professional journals. Additional repository records are created for collections whose sequence data are submitted to INSDC. These new collections are validated as curated collections before they are included in the database. Other directories of repositories are reviewed periodically to keep the NCBI BioCollections database up to date.
Which type of “structured voucher” should be used to submit data to INSDC?
Specimen identifiers stored in collections should be annotated using one of the following source qualifiers, chosen according to the type of repository:
-----(specimen_voucher)-----
museum
herbarium
frozen tissue collection
-----(culture_collection)-----
microbial culture collection
cell lines
-----(bio_material)-----
zoo
aquarium
arboretum
botanical garden (live plants)
DNA bank
stock center
germplasm repository
seed bank
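As a rough illustration, the grouping above can be written as a lookup table. This is a minimal Python sketch only; the names QUALIFIER_BY_REPOSITORY_TYPE and qualifier_for are hypothetical and are not part of any NCBI or INSDC tool.

```python
# Repository types grouped by INSDC source qualifier, copied from the list above.
QUALIFIER_BY_REPOSITORY_TYPE = {
    "museum": "specimen_voucher",
    "herbarium": "specimen_voucher",
    "frozen tissue collection": "specimen_voucher",
    "microbial culture collection": "culture_collection",
    "cell lines": "culture_collection",
    "zoo": "bio_material",
    "aquarium": "bio_material",
    "arboretum": "bio_material",
    "botanical garden": "bio_material",
    "dna bank": "bio_material",
    "stock center": "bio_material",
    "germplasm repository": "bio_material",
    "seed bank": "bio_material",
}


def qualifier_for(repository_type: str) -> str:
    """Return the source qualifier for a repository type (hypothetical helper)."""
    key = repository_type.strip().lower()
    if key not in QUALIFIER_BY_REPOSITORY_TYPE:
        raise ValueError(f"unknown repository type: {repository_type!r}")
    return QUALIFIER_BY_REPOSITORY_TYPE[key]
```

For example, qualifier_for("Herbarium") gives specimen_voucher and qualifier_for("DNA bank") gives bio_material.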
What is the proper format for the “structured voucher”?
A “structured voucher” uses the Darwin Core triplet format, in which the collection code is optional:
<institution_code>:<OPTIONAL collection_code>:<specimen_id>
For example:
/specimen_voucher="AMNH:298075"
/specimen_voucher="USNM:MAMM:602070"
/culture_collection="ATCC:26370"
/culture_collection="ISBC:CMF:1866"
/bio_material="K:MWC 3856"
/bio_material="USDA:GRIN:PI 588454-b"
More information:
http://www.ncbi.nlm.nih.gov/books/NBK53701/#gbankquickstart.MuseumReference_Collecti
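The triplet format can be checked with a small parser. The following Python sketch is illustrative only: Voucher and parse_voucher are hypothetical names, and the code assumes that none of the parts contains a colon, which this page does not state.

```python
from typing import NamedTuple, Optional


class Voucher(NamedTuple):
    institution_code: str
    collection_code: Optional[str]  # None when the optional part is omitted
    specimen_id: str


def parse_voucher(value: str) -> Voucher:
    """Split a structured voucher into its two or three colon-separated parts.

    Assumption: no part contains a colon itself.
    """
    parts = value.split(":")
    if len(parts) == 2:
        institution, specimen = parts
        collection = None
    elif len(parts) == 3:
        institution, collection, specimen = parts
    else:
        raise ValueError(f"expected 2 or 3 colon-separated parts: {value!r}")
    if not institution or not specimen or collection == "":
        raise ValueError(f"empty part in structured voucher: {value!r}")
    return Voucher(institution, collection, specimen)
```

With the examples above, parse_voucher("USNM:MAMM:602070") yields the institution code USNM, the collection code MAMM and the specimen id 602070, while parse_voucher("K:MWC 3856") yields the institution code K, no collection code, and the specimen id "MWC 3856".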
What is a collection code?
An institution may hold several collections. A collection code is the acronym for one collection within an institution.
For example, UAM is listed as the institution code for the University of Alaska Museum. The museum holds several collections, such as mammals, fish, and insects, with the collection codes Mamm, Fish, and Ento, respectively.
/specimen_voucher="UAM:Mamm:24119"
/specimen_voucher="UAM:Fish:6144"
/specimen_voucher="UAM:Ento:235327"
How are duplicate institution codes treated?
When more than one institution uses the same acronym for its specimens, the three-letter ISO (International Organization for Standardization) country code is appended to make each institution code unique. The institution whose acronym is already in the database keeps it without a country code, and institutions registered later receive the acronym followed by the three-letter country code in angle brackets. In the examples below, codes for institutions in the USA also carry a state abbreviation after the country code.
For example, all of the following institutions use UAM as an acronym for their collections. To distinguish between the collections, the institution codes are listed as:
UAM - University of Alaska, Museum of the North
UAM<USA-AR> - University of Arkansas at Monticello
UAM<USA-AL> - University of Alabama, Malacology Collection
UAM<ESP> - Universidad Autónoma de Madrid culture collection of cyanobacteria
UAM<VEN> - Universidad de los Andes, Facultad de Ciencias
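Software that reads these codes may need to separate the shared acronym from the suffix in angle brackets. The following Python sketch shows one possible way to do so; split_institution_code is a hypothetical name, and the pattern is inferred only from the examples listed above, not from a published specification.

```python
import re
from typing import Optional, Tuple

# An acronym, optionally followed by a suffix in angle brackets, e.g. UAM<USA-AR>.
_CODE_RE = re.compile(r"^(?P<acronym>[^<>]+?)(?:<(?P<suffix>[^<>]+)>)?$")


def split_institution_code(code: str) -> Tuple[str, Optional[str]]:
    """Return (acronym, suffix); suffix is None for codes without angle brackets."""
    match = _CODE_RE.match(code.strip())
    if match is None:
        raise ValueError(f"unrecognised institution code: {code!r}")
    return match.group("acronym"), match.group("suffix")
```

Here split_institution_code("UAM<USA-AR>") returns the acronym UAM with the suffix USA-AR, and split_institution_code("UAM") returns UAM with no suffix.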
What happens when one institution uses several institution codes?
For various reasons, some institutions use more than one institution code. For example, the University of Maryland uses MARY for its herbarium collection and UMDC for its museum collection. These are listed as separate records.
If an institution changes its institution code and adopts a new one, the old acronym is retained in the database as a synonym. For example:
USNM - National Museum of Natural History, Smithsonian
NMNH - National Museum of Natural History, Smithsonian
NMNH is listed as an Institution synonym for USNM
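Resolving a synonym to the listed institution code amounts to a dictionary lookup. The Python sketch below is illustrative only: it contains just the single synonym given above, the names INSTITUTION_SYNONYMS and canonical_institution_code are hypothetical, and matching is assumed to be case-sensitive.

```python
# Synonym -> listed institution code. Only the example from this page is included.
INSTITUTION_SYNONYMS = {
    "NMNH": "USNM",
}


def canonical_institution_code(code: str) -> str:
    """Return the listed code for a synonym; other codes are returned unchanged."""
    return INSTITUTION_SYNONYMS.get(code, code)
```

With this table, canonical_institution_code("NMNH") returns USNM, and a code that is not a known synonym, such as AMNH, is returned as given.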
Are specimens in personal collections listed in the BioCollections database?
At present, personal collections are not listed in the database. However, specimens from personal and private collections can be annotated in INSDC entries as:
/specimen_voucher="personal:Antonio Machado:AMC 3410"
/specimen_voucher="personal:Dan Janzen:05-SNRP-981"
Contact
To register a collection or for more information, please send an email to gb-admin@ncbi.nlm.nih.gov.