Checklist: MIGS bacteria (MigsBa)
Minimal Information about a Genome Sequence: cultured bacteria/archaea
Terms
MIXS ID | Name | Cardinality and Range | Description |
---|---|---|---|
MIXS:0001107 | samp_name | 1 String | A local identifier or name for the material sample used for extracting n... |
MIXS:0000043 | lib_screen | 0..1 String | Specific enrichment or screening methods applied before and/or after creating... |
MIXS:0000062 | ref_db | 0..1 String | List of database(s) used for ORF annotation, along with version number and re... |
MIXS:0000038 | nucl_acid_amp | 0..1 recommended String | A link to a literature reference, electronic resource or a standard operating... |
MIXS:0000039 | lib_size | 0..1 Integer | Total number of clones in the library prepared for the project |
MIXS:0000057 | assembly_name | 0..1 recommended String | Name/version of the assembly provided by the submitter that is used in the ge... |
MIXS:0000113 | temp | 0..1 recommended String | Temperature of the sample at the time of sampling |
MIXS:0000069 | compl_score | 0..1 String | Completeness score is typically based on either the fraction of markers found... |
MIXS:0000037 | nucl_acid_ext | 0..1 recommended String | A link to a literature reference, electronic resource or a standard operating... |
MIXS:0000001 | samp_size | 0..1 String | The total amount or size (volume (ml), mass (g) or area (m2)) of sample coll... |
MIXS:0000003 | isol_growth_condt | 1 String | Publication reference in the form of pubmed ID (pmid), digital object identif... |
MIXS:0000094 | alt | 0..1 recommended String | Heights of objects such as airplanes, space shuttles, rockets, atmospheric ba... |
MIXS:0000026 | source_mat_id | * recommended String | A unique identifier assigned to a material sample (as defined by http://rs |
MIXS:0000023 | extrachrom_elements | 0..1 recommended Integer | Do plasmids exist of significant phenotypic consequence (e |
MIXS:0000024 | estimated_size | 0..1 String | The estimated size of the genome prior to sequencing |
MIXS:0000111 | samp_vol_we_dna_ext | 0..1 String | Volume (ml) or mass (g) of total collected sample processed for DNA extractio... |
MIXS:0000027 | pathogenicity | 0..1 recommended String | To what is the entity pathogenic |
MIXS:0000040 | lib_reads_seqd | 0..1 Integer | Total number of clones sequenced from the library |
MIXS:0000015 | rel_to_oxygen | 0..1 recommended RelToOxygenEnum | Is this organism an aerobe, anaerobe? Please note that aerobic and anaerobic ... |
MIXS:0000034 | encoded_traits | 0..1 String | Should include key traits like antibiotic resistance or xenobiotic degradatio... |
MIXS:0000002 | samp_collect_device | 0..1 String | The device used to collect an environmental sample |
MIXS:0000060 | number_contig | 1 Integer | Total number of contigs in the cleaned/submitted assembly that makes up a giv... |
MIXS:0000028 | biotic_relationship | 0..1 recommended BioticRelationshipEnum | Description of relationship(s) between the subject organism and other organis... |
MIXS:0000022 | num_replicons | 1 Integer | Reports the number of replicons in a nuclear genome of eukaryotes, in the gen... |
MIXS:0000041 | lib_layout | 0..1 LibLayoutEnum | Specify whether to expect single, paired, or other configuration of reads |
MIXS:0000056 | assembly_qual | 1 AssemblyQualEnum | The assembly quality category is based on sets of criteria outlined for each ... |
MIXS:0000025 | ref_biomaterial | 1 String | Primary publication if isolated before genome publication; otherwise, primary... |
MIXS:0000092 | project_name | 1 String | Name of the project within which the sequencing was organized |
MIXS:0000042 | lib_vector | 0..1 String | Cloning vector type(s) used in construction of libraries |
MIXS:0000030 | host_spec_range | * String | The range and diversity of host species that an organism is capable of infect... |
MIXS:0001321 | neg_cont_type | 0..1 recommended NegContTypeEnum | The substance or equipment used as a negative control in an investigation |
MIXS:0000048 | adapters | 0..1 recommended String | Adapters provide priming sequences for both amplification and sequencing of t... |
MIXS:0000058 | assembly_software | 1 String | Tool(s) used for assembly, including version number and parameters |
MIXS:0000053 | tax_ident | 0..1 recommended TaxIdentEnum | The phylogenetic marker(s) used to assign an organism name to the SAG or MAG |
MIXS:0000059 | annot | 0..1 recommended String | Tool used for annotation, or for cases where annotation was provided by a com... |
MIXS:0000032 | trophic_level | 0..1 recommended TrophicLevelEnum | Trophic levels are the feeding position in a food chain |
MIXS:0001322 | pos_cont_type | 0..1 recommended String | The substance, mixture, product, or apparatus used to verify that a process w... |
MIXS:0000020 | subspecf_gen_lin | 0..1 recommended String | Information about the genetic distinctness of the sequenced organism below th... |
MIXS:0000061 | feat_pred | 0..1 String | Method used to predict UViGs features such as ORFs, integration site, etc |
MIXS:0000013 | env_local_scale | 1 String | Report the entity or entities which are in the sample or specimen's local vic... |
MIXS:0000070 | compl_software | 0..1 String | Tools used for completion estimate, i |
MIXS:0000016 | samp_mat_process | 0..1 String | A brief description of any processing applied to the sample during or after r... |
MIXS:0000063 | sim_search_meth | 0..1 String | Tool used to compare ORFs with database, along with version and cutoffs used |
MIXS:0000031 | host_disease_stat | 0..1 recommended String | List of diseases with which the host has been diagnosed; can include multiple... |
MIXS:0000018 | depth | 0..1 recommended String | The vertical distance below local surface |
MIXS:0001225 | samp_collect_method | 0..1 String | The method employed for collecting the sample |
MIXS:0000029 | specific_host | 0..1 recommended String | Report the host's taxonomic name and/or NCBI taxonomy ID |
MIXS:0000014 | env_medium | 1 String | Report the environmental material(s) immediately surrounding the sample or sp... |
MIXS:0001320 | samp_taxon_id | 1 String | NCBI taxon id of the sample |
MIXS:0000010 | geo_loc_name | 1 String | The geographical origin of the sample as defined by the country or sea name f... |
MIXS:0000011 | collection_date | 1 Datetime | The time of sampling, either as an instance (single point in time) or interva... |
MIXS:0000050 | seq_meth | 1 String | Sequencing machine used |
MIXS:0000009 | lat_lon | 1 String | The geographical origin of the sample as defined by latitude and longitude |
MIXS:0000093 | elev | 0..1 recommended String | Elevation of the sampling site is its height above a fixed reference point, m... |
MIXS:0000012 | env_broad_scale | 1 String | Report the major environmental system the sample or specimen came from |
MIXS:0000064 | tax_class | 0..1 String | Method used for taxonomic classification, along with reference database used,... |
MIXS:0000008 | experimental_factor | * String | Variable aspects of an experiment design that can be used to describe an expe... |
MIXS:0000091 | associated_resource | * recommended String | A related resource that is referenced, cited, or otherwise associated to the ... |
MIXS:0000090 | sop | * recommended String | Standard operating procedures used in assembly and/or annotation of genomes, ... |
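The cardinalities and ranges in the table above can be read as simple checks on a record: slots with cardinality 1 must be present, Integer slots must hold integers, and slots with cardinality * hold lists. The following is a minimal Python sketch of that reading, not the LinkML validator. The helper name check_record and the example record are hypothetical; only the slot names, cardinalities, and ranges are taken from the table, and enum ranges, patterns, and the Datetime range are deliberately not checked.

```python
# Slot names, cardinalities and ranges are copied from the Terms table.
# check_record is a hypothetical helper, not part of MIxS or LinkML.

# Slots listed with cardinality 1 (required).
REQUIRED_SLOTS = (
    "samp_name", "isol_growth_condt", "number_contig", "num_replicons",
    "assembly_qual", "ref_biomaterial", "project_name", "assembly_software",
    "env_local_scale", "env_medium", "samp_taxon_id", "geo_loc_name",
    "collection_date", "seq_meth", "lat_lon", "env_broad_scale",
)

# Slots whose range is Integer.
INTEGER_SLOTS = (
    "lib_size", "extrachrom_elements", "lib_reads_seqd",
    "number_contig", "num_replicons",
)

# Slots listed with cardinality * (multivalued).
MULTIVALUED_SLOTS = (
    "source_mat_id", "host_spec_range", "experimental_factor",
    "associated_resource", "sop",
)


def check_record(record):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for slot in REQUIRED_SLOTS:
        if record.get(slot) in (None, "", []):
            problems.append(f"missing required slot: {slot}")
    for slot in INTEGER_SLOTS:
        value = record.get(slot)
        # bool is a subclass of int in Python, so reject it explicitly.
        if value is not None and (isinstance(value, bool) or not isinstance(value, int)):
            problems.append(f"{slot} should be an integer, got {value!r}")
    for slot in MULTIVALUED_SLOTS:
        value = record.get(slot)
        if value is not None and not isinstance(value, list):
            problems.append(f"{slot} is multivalued and should be a list")
    return problems


# Illustrative record: every required slot gets a placeholder string, then a
# few slots are overwritten with the example values shown on this page.
example = {slot: "placeholder" for slot in REQUIRED_SLOTS}
example.update(
    samp_name="ISDsoil1",
    isol_growth_condt="doi:10.1016/j.syapm.2018.01.009",
    number_contig=40,
    num_replicons=2,
    lib_size="50",  # a string, so the Integer check reports it
)
for problem in check_record(example):
    print(problem)
```

Keeping the three slot groups as plain tuples makes it easy to compare them against the table by eye. For real submissions, a schema-aware validator driven by the LinkML source below would be the more reliable route.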
Aliases
- migs_ba
LinkML Source
Direct
name: MigsBa
description: 'Minimal Information about a Genome Sequence: cultured bacteria/archaea'
title: MIGS bacteria
from_schema: https://w3id.org/mixs
aliases:
- migs_ba
is_a: Checklist
mixin: true
slots:
- samp_name
- lib_screen
- ref_db
- nucl_acid_amp
- lib_size
- assembly_name
- temp
- compl_score
- nucl_acid_ext
- samp_size
- isol_growth_condt
- alt
- source_mat_id
- extrachrom_elements
- estimated_size
- samp_vol_we_dna_ext
- pathogenicity
- lib_reads_seqd
- rel_to_oxygen
- encoded_traits
- samp_collect_device
- number_contig
- biotic_relationship
- num_replicons
- lib_layout
- assembly_qual
- ref_biomaterial
- project_name
- lib_vector
- host_spec_range
- neg_cont_type
- adapters
- assembly_software
- tax_ident
- annot
- trophic_level
- pos_cont_type
- subspecf_gen_lin
- feat_pred
- env_local_scale
- compl_software
- samp_mat_process
- sim_search_meth
- host_disease_stat
- depth
- samp_collect_method
- specific_host
- env_medium
- samp_taxon_id
- geo_loc_name
- collection_date
- seq_meth
- lat_lon
- elev
- env_broad_scale
- tax_class
- experimental_factor
- associated_resource
- sop
slot_usage:
adapters:
name: adapters
recommended: true
alt:
name: alt
recommended: true
annot:
name: annot
recommended: true
assembly_name:
name: assembly_name
recommended: true
assembly_qual:
name: assembly_qual
required: true
assembly_software:
name: assembly_software
required: true
biotic_relationship:
name: biotic_relationship
recommended: true
depth:
name: depth
examples:
- value: 10 meter
recommended: true
elev:
name: elev
recommended: true
extrachrom_elements:
name: extrachrom_elements
recommended: true
host_disease_stat:
name: host_disease_stat
examples:
- value: rabies [DOID:11260]
recommended: true
isol_growth_condt:
name: isol_growth_condt
required: true
nucl_acid_amp:
name: nucl_acid_amp
recommended: true
nucl_acid_ext:
name: nucl_acid_ext
recommended: true
num_replicons:
name: num_replicons
required: true
number_contig:
name: number_contig
required: true
pathogenicity:
name: pathogenicity
recommended: true
ref_biomaterial:
name: ref_biomaterial
required: true
rel_to_oxygen:
name: rel_to_oxygen
recommended: true
samp_collect_device:
name: samp_collect_device
examples:
- value: swab, biopsy, niskin bottle, push core, drag swab [GENEPIO:0002713]
samp_collect_method:
name: samp_collect_method
examples:
- value: swabbing
sop:
name: sop
recommended: true
source_mat_id:
name: source_mat_id
recommended: true
specific_host:
name: specific_host
recommended: true
subspecf_gen_lin:
name: subspecf_gen_lin
recommended: true
tax_ident:
name: tax_ident
recommended: true
temp:
name: temp
recommended: true
trophic_level:
name: trophic_level
recommended: true
class_uri: MIXS:0010003
Induced
name: MigsBa
description: 'Minimal Information about a Genome Sequence: cultured bacteria/archaea'
title: MIGS bacteria
from_schema: https://w3id.org/mixs
aliases:
- migs_ba
is_a: Checklist
mixin: true
slot_usage:
adapters:
name: adapters
recommended: true
alt:
name: alt
recommended: true
annot:
name: annot
recommended: true
assembly_name:
name: assembly_name
recommended: true
assembly_qual:
name: assembly_qual
required: true
assembly_software:
name: assembly_software
required: true
biotic_relationship:
name: biotic_relationship
recommended: true
depth:
name: depth
examples:
- value: 10 meter
recommended: true
elev:
name: elev
recommended: true
extrachrom_elements:
name: extrachrom_elements
recommended: true
host_disease_stat:
name: host_disease_stat
examples:
- value: rabies [DOID:11260]
recommended: true
isol_growth_condt:
name: isol_growth_condt
required: true
nucl_acid_amp:
name: nucl_acid_amp
recommended: true
nucl_acid_ext:
name: nucl_acid_ext
recommended: true
num_replicons:
name: num_replicons
required: true
number_contig:
name: number_contig
required: true
pathogenicity:
name: pathogenicity
recommended: true
ref_biomaterial:
name: ref_biomaterial
required: true
rel_to_oxygen:
name: rel_to_oxygen
recommended: true
samp_collect_device:
name: samp_collect_device
examples:
- value: swab, biopsy, niskin bottle, push core, drag swab [GENEPIO:0002713]
samp_collect_method:
name: samp_collect_method
examples:
- value: swabbing
sop:
name: sop
recommended: true
source_mat_id:
name: source_mat_id
recommended: true
specific_host:
name: specific_host
recommended: true
subspecf_gen_lin:
name: subspecf_gen_lin
recommended: true
tax_ident:
name: tax_ident
recommended: true
temp:
name: temp
recommended: true
trophic_level:
name: trophic_level
recommended: true
attributes:
samp_name:
name: samp_name
annotations:
Preferred_unit:
tag: Preferred_unit
value: ''
description: A local identifier or name for the material sample used for
extracting nucleic acids, and subsequent sequencing. It can refer either to
the original material collected or to any derived sub-samples. It can have any
format, but we suggest that you make it concise, unique and consistent within
your lab, and as informative as possible. INSDC requires every sample name from
a single Submitter to be unique. Use of a globally unique identifier for the
field source_mat_id is recommended in addition to sample_name
title: sample name
examples:
- value: ISDsoil1
in_subset:
- investigation
from_schema: https://w3id.org/mixs
keywords:
- sample
slot_uri: MIXS:0001107
alias: samp_name
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksC
- MimarksS
- Mims
- Misag
- Miuvig
- Air
- BuiltEnvironment
- FoodAnimalAndAnimalFeed
- FoodFarmEnvironment
- FoodFoodProductionFacility
- FoodHumanFoods
- HostAssociated
- HumanAssociated
- HumanGut
- HumanOral
- HumanSkin
- HumanVaginal
- HydrocarbonResourcesCores
- HydrocarbonResourcesFluidsSwabs
- MicrobialMatBiofilm
- MiscellaneousNaturalOrArtificialEnvironment
- PlantAssociated
- Sediment
- Soil
- SymbiontAssociated
- WastewaterSludge
- Water
range: string
required: true
lib_screen:
name: lib_screen
annotations:
Expected_value:
tag: Expected_value
value: screening strategy name
description: Specific enrichment or screening methods applied before and/or after
creating libraries
title: library screening strategy
examples:
- value: enriched, screened, normalized
in_subset:
- sequencing
from_schema: https://w3id.org/mixs
keywords:
- library
slot_uri: MIXS:0000043
alias: lib_screen
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksS
- Mims
- Misag
- Miuvig
- Agriculture
range: string
ref_db:
name: ref_db
annotations:
Expected_value:
tag: Expected_value
value: names, versions, and references of databases
description: List of database(s) used for ORF annotation, along with version number
and reference to website or publication
title: reference database(s)
examples:
- value: pVOGs;5;http://dmk-brain.ecn.uiowa.edu/pVOGs/ Grazziotin et al. 2017
doi:10.1093/nar/gkw975
in_subset:
- sequencing
from_schema: https://w3id.org/mixs
keywords:
- database
string_serialization: '{database};{version};{reference}'
slot_uri: MIXS:0000062
alias: ref_db
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- Mims
- Misag
- Miuvig
range: string
nucl_acid_amp:
name: nucl_acid_amp
description: A link to a literature reference, electronic resource or a standard
operating procedure (SOP), that describes the enzymatic amplification (PCR,
TMA, NASBA) of specific nucleic acids
title: nucleic acid amplification
examples:
- value: https://phylogenomics.me/protocols/16s-pcr-protocol/
in_subset:
- sequencing
from_schema: https://w3id.org/mixs
slot_uri: MIXS:0000038
alias: nucl_acid_amp
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksC
- MimarksS
- Mims
- Misag
- Miuvig
- Agriculture
range: string
recommended: true
pattern: ^^PMID:\d+$|^doi:10.\d{2,9}/.*$|^https?:\/\/(?:www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b(?:[-a-zA-Z0-9()@:%_\+.~#?&\/=]*)$$
structured_pattern:
syntax: ^{PMID}|{DOI}|{URL}$
interpolated: true
partial_match: true
lib_size:
name: lib_size
description: Total number of clones in the library prepared for the project
title: library size
examples:
- value: '50'
in_subset:
- sequencing
from_schema: https://w3id.org/mixs
keywords:
- library
- size
slot_uri: MIXS:0000039
alias: lib_size
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksS
- Mims
- Misag
- Miuvig
- Agriculture
range: integer
assembly_name:
name: assembly_name
annotations:
Expected_value:
tag: Expected_value
value: name and version of assembly
description: Name/version of the assembly provided by the submitter that is used
in the genome browsers and in the community
title: assembly name
examples:
- value: HuRef, JCVI_ISG_i3_1.0
in_subset:
- sequencing
from_schema: https://w3id.org/mixs
string_serialization: '{text} {text}'
slot_uri: MIXS:0000057
alias: assembly_name
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- Mims
- Misag
- Miuvig
- Agriculture
range: string
recommended: true
temp:
name: temp
annotations:
Preferred_unit:
tag: Preferred_unit
value: degree Celsius
description: Temperature of the sample at the time of sampling
title: temperature
examples:
- value: 25 degree Celsius
in_subset:
- environment
from_schema: https://w3id.org/mixs
keywords:
- temperature
slot_uri: MIXS:0000113
alias: temp
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksC
- MimarksS
- Mims
- Misag
- Miuvig
- Agriculture
- Air
- FoodAnimalAndAnimalFeed
- FoodFarmEnvironment
- FoodHumanFoods
- HostAssociated
- HumanAssociated
- HumanGut
- HumanOral
- HumanSkin
- HumanVaginal
- HydrocarbonResourcesCores
- HydrocarbonResourcesFluidsSwabs
- MicrobialMatBiofilm
- MiscellaneousNaturalOrArtificialEnvironment
- PlantAssociated
- Sediment
- Soil
- SymbiontAssociated
- WastewaterSludge
- Water
range: string
recommended: true
pattern: ^[-+]?[0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?( *- *[-+]?[0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?)?
*([^\s-]{1,2}|[^\s-]+.+[^\s-]+)$
structured_pattern:
syntax: ^{scientific_float}( *- *{scientific_float})? *{text}$
interpolated: true
partial_match: true
compl_score:
name: compl_score
annotations:
Expected_value:
tag: Expected_value
value: quality;percent completeness
description: 'Completeness score is typically based on either the fraction of
markers found as compared to a database or the percent of a genome found as
compared to a closely related reference genome. High Quality Draft: >90%, Medium
Quality Draft: >50%, and Low Quality Draft: < 50% should have the indicated
completeness scores'
title: completeness score
examples:
- value: med;60%
in_subset:
- sequencing
from_schema: https://w3id.org/mixs
keywords:
- score
string_serialization: '[high|med|low];{percentage}'
slot_uri: MIXS:0000069
alias: compl_score
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- Misag
- Miuvig
range: string
nucl_acid_ext:
name: nucl_acid_ext
description: A link to a literature reference, electronic resource or a standard
operating procedure (SOP), that describes the material separation to recover
the nucleic acid fraction from a sample
title: nucleic acid extraction
examples:
- value: https://mobio.com/media/wysiwyg/pdfs/protocols/12888.pdf
in_subset:
- sequencing
from_schema: https://w3id.org/mixs
slot_uri: MIXS:0000037
alias: nucl_acid_ext
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksC
- MimarksS
- Mims
- Misag
- Miuvig
- Agriculture
- FoodAnimalAndAnimalFeed
- FoodFarmEnvironment
- FoodFoodProductionFacility
- FoodHumanFoods
range: string
recommended: true
pattern: ^^PMID:\d+$|^doi:10.\d{2,9}/.*$|^https?:\/\/(?:www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b(?:[-a-zA-Z0-9()@:%_\+.~#?&\/=]*)$$
structured_pattern:
syntax: ^{PMID}|{DOI}|{URL}$
interpolated: true
partial_match: true
samp_size:
name: samp_size
description: The total amount or size (volume (ml), mass (g) or area (m2)) of
sample collected
title: amount or size of sample collected
examples:
- value: 5 liter
in_subset:
- nucleic acid sequence source
from_schema: https://w3id.org/mixs
keywords:
- sample
- size
slot_uri: MIXS:0000001
alias: samp_size
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksC
- MimarksS
- Mims
- Misag
- Miuvig
- Agriculture
- FoodAnimalAndAnimalFeed
- FoodFarmEnvironment
- FoodFoodProductionFacility
- FoodHumanFoods
range: string
pattern: ^[-+]?[0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?( *- *[-+]?[0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?)?
*([^\s-]{1,2}|[^\s-]+.+[^\s-]+)$
structured_pattern:
syntax: ^{scientific_float}( *- *{scientific_float})? *{text}$
interpolated: true
partial_match: true
isol_growth_condt:
name: isol_growth_condt
description: Publication reference in the form of pubmed ID (pmid), digital object
identifier (doi) or url for isolation and growth condition specifications of
the organism/material
title: isolation and growth condition
examples:
- value: doi:10.1016/j.syapm.2018.01.009
in_subset:
- nucleic acid sequence source
from_schema: https://w3id.org/mixs
keywords:
- condition
- growth
- isolation
slot_uri: MIXS:0000003
alias: isol_growth_condt
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- MimarksC
- Agriculture
range: string
required: true
pattern: ^^PMID:\d+$|^doi:10.\d{2,9}/.*$|^https?:\/\/(?:www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b(?:[-a-zA-Z0-9()@:%_\+.~#?&\/=]*)$$
structured_pattern:
syntax: ^{PMID}|{DOI}|{URL}$
interpolated: true
partial_match: true
alt:
name: alt
annotations:
Preferred_unit:
tag: Preferred_unit
value: meter
description: Heights of objects such as airplanes, space shuttles, rockets, atmospheric
balloons and heights of places such as atmospheric layers and clouds. It is
used to measure the height of an object which is above the earth's surface.
In this context, the altitude measurement is the vertical distance between the
earth's surface above sea level and the sampled position in the air
title: altitude
examples:
- value: 100 meter
in_subset:
- environment
from_schema: https://w3id.org/mixs
slot_uri: MIXS:0000094
alias: alt
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksC
- MimarksS
- Mims
- Misag
- Miuvig
- Air
- HostAssociated
- MiscellaneousNaturalOrArtificialEnvironment
- SymbiontAssociated
range: string
recommended: true
pattern: ^[-+]?[0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?( *- *[-+]?[0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?)?
*([^\s-]{1,2}|[^\s-]+.+[^\s-]+)$
structured_pattern:
syntax: ^{scientific_float}( *- *{scientific_float})? *{text}$
interpolated: true
partial_match: true
source_mat_id:
name: source_mat_id
annotations:
Expected_value:
tag: Expected_value
value: 'for cultures of microorganisms: identifiers for two culture collections;
for other material a unique arbitrary identifier'
description: A unique identifier assigned to a material sample (as defined by
http://rs.tdwg.org/dwc/terms/materialSampleID, and as opposed to a particular
digital record of a material sample) used for extracting nucleic acids, and
subsequent sequencing. The identifier can refer either to the original material
collected or to any derived sub-samples. The INSDC qualifiers /specimen_voucher,
/bio_material, or /culture_collection may or may not share the same value as
the source_mat_id field. For instance, the /specimen_voucher qualifier and source_mat_id
may both contain 'UAM:Herps:14', referring to both the specimen voucher and
sampled tissue with the same identifier. However, the /culture_collection qualifier
may refer to a value from an initial culture (e.g. ATCC:11775) while source_mat_id
would refer to an identifier from some derived culture from which the nucleic
acids were extracted (e.g. xatc123 or ark:/2154/R2)
title: source material identifiers
examples:
- value: MPI012345
in_subset:
- nucleic acid sequence source
from_schema: https://w3id.org/mixs
keywords:
- identifier
- material
- source
slot_uri: MIXS:0000026
alias: source_mat_id
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksC
- MimarksS
- Mims
- Misag
- Miuvig
- Agriculture
- SymbiontAssociated
range: string
recommended: true
multivalued: true
extrachrom_elements:
name: extrachrom_elements
description: Do plasmids exist of significant phenotypic consequence (e.g. ones
that determine virulence or antibiotic resistance). Megaplasmids? Other plasmids
(borrelia has 15+ plasmids)
title: extrachromosomal elements
examples:
- value: '5'
in_subset:
- nucleic acid sequence source
from_schema: https://w3id.org/mixs
slot_uri: MIXS:0000023
alias: extrachrom_elements
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MimarksC
range: integer
recommended: true
estimated_size:
name: estimated_size
annotations:
Expected_value:
tag: Expected_value
value: number of base pairs
description: The estimated size of the genome prior to sequencing. Of particular
importance in the sequencing of (eukaryotic) genome which could remain in draft
form for a long or unspecified period
title: estimated size
examples:
- value: 300000 bp
in_subset:
- nucleic acid sequence source
from_schema: https://w3id.org/mixs
keywords:
- size
string_serialization: '{integer} bp'
slot_uri: MIXS:0000024
alias: estimated_size
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Miuvig
range: string
samp_vol_we_dna_ext:
name: samp_vol_we_dna_ext
annotations:
Preferred_unit:
tag: Preferred_unit
value: milliliter, gram, milligram, square centimeter
description: 'Volume (ml) or mass (g) of total collected sample processed for
DNA extraction. Note: total sample collected should be entered under the term
Sample Size (MIXS:0000001)'
title: sample volume or weight for DNA extraction
examples:
- value: 1500 milliliter
in_subset:
- nucleic acid sequence source
from_schema: https://w3id.org/mixs
keywords:
- dna
- sample
- volume
- weight
slot_uri: MIXS:0000111
alias: samp_vol_we_dna_ext
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksC
- MimarksS
- Mims
- Misag
- Miuvig
- Agriculture
- Air
- FoodAnimalAndAnimalFeed
- FoodFarmEnvironment
- FoodFoodProductionFacility
- FoodHumanFoods
- HostAssociated
- HumanAssociated
- HumanGut
- HumanOral
- HumanSkin
- HumanVaginal
- HydrocarbonResourcesCores
- HydrocarbonResourcesFluidsSwabs
- MicrobialMatBiofilm
- MiscellaneousNaturalOrArtificialEnvironment
- PlantAssociated
- Sediment
- Soil
- SymbiontAssociated
- WastewaterSludge
- Water
range: string
pattern: ^[-+]?[0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?( *- *[-+]?[0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?)?
*([^\s-]{1,2}|[^\s-]+.+[^\s-]+)$
structured_pattern:
syntax: ^{scientific_float}( *- *{scientific_float})? *{text}$
interpolated: true
partial_match: true
pathogenicity:
name: pathogenicity
annotations:
Expected_value:
tag: Expected_value
value: names of organisms that the entity is pathogenic to
description: To what is the entity pathogenic
title: known pathogenicity
examples:
- value: human, animal, plant, fungi, bacteria
in_subset:
- nucleic acid sequence source
from_schema: https://w3id.org/mixs
slot_uri: MIXS:0000027
alias: pathogenicity
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsVi
- Miuvig
- Agriculture
range: string
recommended: true
lib_reads_seqd:
name: lib_reads_seqd
description: Total number of clones sequenced from the library
title: library reads sequenced
examples:
- value: '20'
in_subset:
- sequencing
from_schema: https://w3id.org/mixs
keywords:
- library
slot_uri: MIXS:0000040
alias: lib_reads_seqd
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksS
- Mims
- Misag
- Miuvig
- Agriculture
range: integer
rel_to_oxygen:
name: rel_to_oxygen
description: Is this organism an aerobe, anaerobe? Please note that aerobic and
anaerobic are valid descriptors for microbial environments
title: relationship to oxygen
examples:
- value: aerobe
in_subset:
- nucleic acid sequence source
from_schema: https://w3id.org/mixs
keywords:
- oxygen
- relationship
slot_uri: MIXS:0000015
alias: rel_to_oxygen
owner: MigsBa
domain_of:
- MigsBa
- Mimag
- MimarksC
- MimarksS
- Mims
- Misag
range: RelToOxygenEnum
recommended: true
encoded_traits:
name: encoded_traits
annotations:
Expected_value:
tag: Expected_value
value: 'for plasmid: antibiotic resistance; for phage: converting genes'
description: Should include key traits like antibiotic resistance or xenobiotic
degradation phenotypes for plasmids, converting genes for phage
title: encoded traits
examples:
- value: beta-lactamase class A
in_subset:
- nucleic acid sequence source
from_schema: https://w3id.org/mixs
slot_uri: MIXS:0000034
alias: encoded_traits
owner: MigsBa
domain_of:
- MigsBa
- MigsPl
- MigsVi
range: string
samp_collect_device:
name: samp_collect_device
annotations:
Expected_value:
tag: Expected_value
value: device name
description: The device used to collect an environmental sample. This field accepts
terms listed under environmental sampling device (http://purl.obolibrary.org/obo/ENVO).
This field also accepts terms listed under specimen collection device (http://purl.obolibrary.org/obo/GENEPIO_0002094)
title: sample collection device
examples:
- value: swab, biopsy, niskin bottle, push core, drag swab [GENEPIO:0002713]
in_subset:
- nucleic acid sequence source
from_schema: https://w3id.org/mixs
keywords:
- device
- sample
string_serialization: '{termLabel} [{termID}]|{text}'
slot_uri: MIXS:0000002
alias: samp_collect_device
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksC
- MimarksS
- Mims
- Misag
- Miuvig
- Agriculture
- FoodAnimalAndAnimalFeed
- FoodFarmEnvironment
- FoodFoodProductionFacility
- FoodHumanFoods
range: string
number_contig:
name: number_contig
description: Total number of contigs in the cleaned/submitted assembly that makes
up a given genome, SAG, MAG, or UViG
title: number of contigs
examples:
- value: '40'
in_subset:
- sequencing
from_schema: https://w3id.org/mixs
keywords:
- number
slot_uri: MIXS:0000060
alias: number_contig
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- Mims
- Misag
- Miuvig
range: integer
required: true
biotic_relationship:
name: biotic_relationship
description: Description of relationship(s) between the subject organism and other
organism(s) it is associated with. E.g., parasite on species X; mutualist with
species Y. The target organism is the subject of the relationship, and the other
organism(s) is the object
title: observed biotic relationship
examples:
- value: free living
in_subset:
- nucleic acid sequence source
from_schema: https://w3id.org/mixs
keywords:
- observed
- relationship
slot_uri: MIXS:0000028
alias: biotic_relationship
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsVi
- MimarksC
- Miuvig
- Agriculture
range: BioticRelationshipEnum
recommended: true
num_replicons:
name: num_replicons
annotations:
Expected_value:
tag: Expected_value
value: 'for eukaryotes and bacteria: chromosomes (haploid count); for viruses:
segments'
description: Reports the number of replicons in a nuclear genome of eukaryotes,
in the genome of a bacterium or archaea or the number of segments in a segmented
virus. Always applied to the haploid chromosome count of a eukaryote
title: number of replicons
examples:
- value: '2'
in_subset:
- nucleic acid sequence source
from_schema: https://w3id.org/mixs
keywords:
- number
string_serialization: '{integer}'
slot_uri: MIXS:0000022
alias: num_replicons
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsVi
range: integer
required: true
lib_layout:
name: lib_layout
description: Specify whether to expect single, paired, or other configuration
of reads
title: library layout
examples:
- value: paired
in_subset:
- sequencing
from_schema: https://w3id.org/mixs
keywords:
- library
slot_uri: MIXS:0000041
alias: lib_layout
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksS
- Mims
- Misag
- Miuvig
- Agriculture
range: LibLayoutEnum
assembly_qual:
name: assembly_qual
description: 'The assembly quality category is based on sets of criteria outlined
for each assembly quality category. For MISAG/MIMAG; Finished: Single, validated,
contiguous sequence per replicon without gaps or ambiguities with a consensus
error rate equivalent to Q50 or better. High Quality Draft: Multiple fragments
where gaps span repetitive regions. Presence of the large subunit (LSU) RNA,
small subunit (SSU) and the presence of 5.8S rRNA or 5S rRNA depending on whether
it is a eukaryotic or prokaryotic genome, respectively. Medium Quality Draft: Many
fragments with little to no review of assembly other than reporting of standard
assembly statistics. Low Quality Draft: Many fragments with little to no review
of assembly other than reporting of standard assembly statistics. Assembly statistics
include, but are not limited to total assembly size, number of contigs, contig
N50/L50, and maximum contig length. For MIUVIG; Finished: Single, validated,
contiguous sequence per replicon without gaps or ambiguities, with extensive
manual review and editing to annotate putative gene functions and transcriptional
units. High-quality draft genome: One or multiple fragments, totaling ≥ 90%
of the expected genome or replicon sequence or predicted complete. Genome fragment(s):
One or multiple fragments, totaling < 90% of the expected genome or replicon
sequence, or for which no genome size could be estimated'
title: assembly quality
examples:
- value: High-quality draft genome
in_subset:
- sequencing
from_schema: https://w3id.org/mixs
keywords:
- quality
slot_uri: MIXS:0000056
alias: assembly_qual
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- Mims
- Misag
- Miuvig
- Agriculture
range: AssemblyQualEnum
required: true
ref_biomaterial:
name: ref_biomaterial
description: Primary publication if isolated before genome publication; otherwise,
primary genome report
title: reference for biomaterial
examples:
- value: doi:10.1016/j.syapm.2018.01.009
in_subset:
- nucleic acid sequence source
from_schema: https://w3id.org/mixs
slot_uri: MIXS:0000025
alias: ref_biomaterial
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- Mims
- Misag
- Miuvig
range: string
required: true
pattern: ^^PMID:\d+$|^doi:10.\d{2,9}/.*$|^https?:\/\/(?:www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b(?:[-a-zA-Z0-9()@:%_\+.~#?&\/=]*)$$
structured_pattern:
syntax: ^{PMID}|{DOI}|{URL}$
interpolated: true
partial_match: true
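# Editorial illustration, not part of the MIxS schema: values of the three forms
# named in the structured pattern ({PMID}, {DOI}, {URL}). The PMID number and the
# URL below are hypothetical placeholders; the DOI is the slot's own example.
#   PMID:12345678
#   doi:10.1016/j.syapm.2018.01.009
#   https://example.org/strain-report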
project_name:
name: project_name
description: Name of the project within which the sequencing was organized
title: project name
examples:
- value: Forest soil metagenome
in_subset:
- investigation
from_schema: https://w3id.org/mixs
keywords:
- project
slot_uri: MIXS:0000092
alias: project_name
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksC
- MimarksS
- Mims
- Misag
- Miuvig
- Air
- BuiltEnvironment
- FoodAnimalAndAnimalFeed
- FoodFarmEnvironment
- FoodFoodProductionFacility
- FoodHumanFoods
- HostAssociated
- HumanAssociated
- HumanGut
- HumanOral
- HumanSkin
- HumanVaginal
- HydrocarbonResourcesCores
- HydrocarbonResourcesFluidsSwabs
- MicrobialMatBiofilm
- MiscellaneousNaturalOrArtificialEnvironment
- PlantAssociated
- Sediment
- Soil
- SymbiontAssociated
- WastewaterSludge
- Water
range: string
required: true
lib_vector:
name: lib_vector
annotations:
Expected_value:
tag: Expected_value
value: vector
description: Cloning vector type(s) used in construction of libraries
title: library vector
examples:
- value: Bacteriophage P1
in_subset:
- sequencing
from_schema: https://w3id.org/mixs
keywords:
- library
slot_uri: MIXS:0000042
alias: lib_vector
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksS
- Mims
- Misag
- Miuvig
- Agriculture
range: string
host_spec_range:
name: host_spec_range
annotations:
Expected_value:
tag: Expected_value
value: NCBI taxid
description: The range and diversity of host species that an organism is capable
of infecting, defined by NCBI taxonomy identifier
title: host specificity or range
examples:
- value: '9606'
in_subset:
- nucleic acid sequence source
from_schema: https://w3id.org/mixs
keywords:
- host
- host.
- range
string_serialization: '{integer}'
slot_uri: MIXS:0000030
alias: host_spec_range
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsPl
- MigsVi
- Miuvig
- Agriculture
range: string
multivalued: true
neg_cont_type:
name: neg_cont_type
annotations:
Expected_value:
tag: Expected_value
value: enumeration or text
description: The substance or equipment used as a negative control in an investigation
title: negative control type
in_subset:
- investigation
from_schema: https://w3id.org/mixs
keywords:
- type
slot_uri: MIXS:0001321
alias: neg_cont_type
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksC
- MimarksS
- Mims
- Misag
- Miuvig
range: NegContTypeEnum
recommended: true
adapters:
name: adapters
description: Adapters provide priming sequences for both amplification and sequencing
of the sample-library fragments. Both adapters should be reported, in uppercase
letters
title: adapters
examples:
- value: AATGATACGGCGACCACCGAGATCTACACGCT;CAAGCAGAAGACGGCATACGAGAT
in_subset:
- sequencing
from_schema: https://w3id.org/mixs
slot_uri: MIXS:0000048
alias: adapters
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksS
- Mims
- Misag
- Miuvig
- Agriculture
range: string
recommended: true
structured_pattern:
syntax: ^{adapter_A_DNA_sequence};{adapter_B_DNA_sequence}$
interpolated: true
partial_match: true
assembly_software:
name: assembly_software
description: Tool(s) used for assembly, including version number and parameters
title: assembly software
examples:
- value: metaSPAdes;3.11.0;kmer set 21,33,55,77,99,121, default parameters otherwise
in_subset:
- sequencing
from_schema: https://w3id.org/mixs
keywords:
- software
slot_uri: MIXS:0000058
alias: assembly_software
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksS
- Mims
- Misag
- Miuvig
- Agriculture
range: string
required: true
pattern: ^([^\s-]{1,2}|[^\s-]+.+[^\s-]+);([^\s-]{1,2}|[^\s-]+.+[^\s-]+);([^\s-]{1,2}|[^\s-]+.+[^\s-]+)$
structured_pattern:
syntax: ^{software};{version};{parameters}$
interpolated: true
partial_match: true
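# Editorial illustration, not part of the MIxS schema: the pattern expects three
# semicolon-separated fields, {software};{version};{parameters}, as in the slot's
# example value:
#   metaSPAdes;3.11.0;kmer set 21,33,55,77,99,121, default parameters otherwise
# A value without the two semicolons, such as "metaSPAdes 3.11.0", would not match.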
tax_ident:
name: tax_ident
description: The phylogenetic marker(s) used to assign an organism name to the
SAG or MAG
title: taxonomic identity marker
examples:
- value: other
description: was other <colon> rpoB gene
in_subset:
- sequencing
from_schema: https://w3id.org/mixs
keywords:
- identifier
- marker
- taxon
slot_uri: MIXS:0000053
alias: tax_ident
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- Misag
- Miuvig
range: TaxIdentEnum
recommended: true
annot:
name: annot
annotations:
Expected_value:
tag: Expected_value
value: name of tool or pipeline used, or annotation source description
description: Tool used for annotation or, where annotation was provided by a
community jamboree or model organism database rather than by a specific submitter,
a description of that source
title: annotation
examples:
- value: prokka
in_subset:
- sequencing
from_schema: https://w3id.org/mixs
slot_uri: MIXS:0000059
alias: annot
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- Mims
- Misag
- Miuvig
- Agriculture
range: string
recommended: true
trophic_level:
name: trophic_level
description: Trophic levels are the feeding position in a food chain. Microbes
can be a range of producers (e.g. chemolithotroph)
title: trophic level
examples:
- value: heterotroph
in_subset:
- nucleic acid sequence source
from_schema: https://w3id.org/mixs
keywords:
- level
slot_uri: MIXS:0000032
alias: trophic_level
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MimarksC
- Agriculture
range: TrophicLevelEnum
recommended: true
pos_cont_type:
name: pos_cont_type
description: The substance, mixture, product, or apparatus used to verify that
a process which is part of an investigation delivers a true positive
title: positive control type
in_subset:
- investigation
from_schema: https://w3id.org/mixs
keywords:
- type
string_serialization: '{term} or {text}'
slot_uri: MIXS:0001322
alias: pos_cont_type
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksC
- MimarksS
- Mims
- Misag
- Miuvig
range: string
recommended: true
subspecf_gen_lin:
name: subspecf_gen_lin
annotations:
Expected_value:
tag: Expected_value
value: Genetic lineage below lowest rank of NCBI taxonomy, which is subspecies,
e.g. serovar, biotype, ecotype, variety, cultivar
description: Information about the genetic distinctness of the sequenced organism
below the subspecies level, e.g., serovar, serotype, biotype, ecotype, or any
relevant genetic typing schemes like Group I plasmid. Subspecies should not
be recorded in this term, but in the NCBI taxonomy. Supply both the lineage
name and the lineage rank separated by a colon, e.g., biovar:abc123
title: subspecific genetic lineage
examples:
- value: serovar:Newport
in_subset:
- nucleic acid sequence source
from_schema: https://w3id.org/mixs
keywords:
- lineage
string_serialization: '{rank name}:{text}'
slot_uri: MIXS:0000020
alias: subspecf_gen_lin
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- MimarksC
- FoodFoodProductionFacility
range: string
recommended: true
feat_pred:
name: feat_pred
description: Method used to predict UViGs features such as ORFs, integration site,
etc
title: feature prediction
examples:
- value: Prodigal;2.6.3;default parameters
in_subset:
- sequencing
from_schema: https://w3id.org/mixs
keywords:
- feature
- predict
slot_uri: MIXS:0000061
alias: feat_pred
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- Mims
- Misag
- Miuvig
range: string
pattern: ^([^\s-]{1,2}|[^\s-]+.+[^\s-]+);([^\s-]{1,2}|[^\s-]+.+[^\s-]+);([^\s-]{1,2}|[^\s-]+.+[^\s-]+)$
structured_pattern:
syntax: ^{software};{version};{parameters}$
interpolated: true
partial_match: true
env_local_scale:
name: env_local_scale
annotations:
Expected_value:
tag: Expected_value
value: Environmental entities having causal influences upon the entity at
time of sampling
description: 'Report the entity or entities which are in the sample or specimen''s
local vicinity and which you believe have significant causal influences on
your sample or specimen. We recommend using EnvO terms which are of smaller
spatial grain than your entry for env_broad_scale. Terms, such as anatomical
sites, from other OBO Library ontologies which interoperate with EnvO (e.g.
UBERON) are accepted in this field. EnvO documentation about how to use the
field: https://github.com/EnvironmentOntology/envo/wiki/Using-ENVO-with-MIxS'
title: local environmental context
examples:
- value: hillside [ENVO:01000333]
in_subset:
- environment
from_schema: https://w3id.org/mixs
keywords:
- context
- environmental
slot_uri: MIXS:0000013
alias: env_local_scale
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksC
- MimarksS
- Mims
- Misag
- Miuvig
range: string
required: true
structured_pattern:
syntax: ^{termLabel} \[{termID}\]$
interpolated: true
partial_match: true
compl_software:
name: compl_software
annotations:
Expected_value:
tag: Expected_value
value: names and versions of software(s) used
description: Tools used for completion estimate, e.g. checkm, anvi'o, busco
title: completeness software
examples:
- value: checkm
in_subset:
- sequencing
from_schema: https://w3id.org/mixs
keywords:
- software
string_serialization: '{software};{version}'
slot_uri: MIXS:0000070
alias: compl_software
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- Misag
- Miuvig
range: string
samp_mat_process:
name: samp_mat_process
description: A brief description of any processing applied to the sample during
or after retrieving the sample from the environment, or a link to the relevant protocol(s)
performed
title: sample material processing
examples:
- value: filtering of seawater, storing samples in ethanol
in_subset:
- nucleic acid sequence source
from_schema: https://w3id.org/mixs
keywords:
- material
- process
- sample
slot_uri: MIXS:0000016
alias: samp_mat_process
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksC
- MimarksS
- Mims
- Misag
- Miuvig
- Agriculture
range: string
sim_search_meth:
name: sim_search_meth
description: Tool used to compare ORFs with database, along with version and cutoffs
used
title: similarity search method
examples:
- value: HMMER3;3.1b2;hmmsearch, cutoff of 50 on score
in_subset:
- sequencing
from_schema: https://w3id.org/mixs
keywords:
- method
slot_uri: MIXS:0000063
alias: sim_search_meth
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- Mims
- Misag
- Miuvig
range: string
pattern: ^([^\s-]{1,2}|[^\s-]+.+[^\s-]+);([^\s-]{1,2}|[^\s-]+.+[^\s-]+);([^\s-]{1,2}|[^\s-]+.+[^\s-]+)$
structured_pattern:
syntax: ^{software};{version};{parameters}$
interpolated: true
partial_match: true
host_disease_stat:
name: host_disease_stat
annotations:
Expected_value:
tag: Expected_value
value: disease name or Disease Ontology term
description: List of diseases with which the host has been diagnosed; can include
multiple diagnoses. The value of the field depends on host; for humans the terms
should be chosen from the DO (Human Disease Ontology) at https://www.disease-ontology.org;
non-human host diseases are free text
title: host disease status
examples:
- value: rabies [DOID:11260]
in_subset:
- nucleic acid sequence source
from_schema: https://w3id.org/mixs
keywords:
- disease
- host
- host.
- status
string_serialization: '{termLabel} [{termID}]|{text}'
slot_uri: MIXS:0000031
alias: host_disease_stat
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsVi
- Miuvig
- Agriculture
- FoodFarmEnvironment
- HostAssociated
- HumanAssociated
- HumanGut
- HumanOral
- HumanSkin
- HumanVaginal
- PlantAssociated
range: string
recommended: true
depth:
name: depth
annotations:
Preferred_unit:
tag: Preferred_unit
value: meter
description: The vertical distance below local surface. For sediment or soil samples,
depth is measured from sediment or soil surface, respectively. Depth can be
reported as an interval for subsurface samples
title: depth
examples:
- value: 10 meter
in_subset:
- environment
from_schema: https://w3id.org/mixs
keywords:
- depth
slot_uri: MIXS:0000018
alias: depth
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksC
- MimarksS
- Mims
- Misag
- Miuvig
- Agriculture
- FoodFarmEnvironment
- HostAssociated
- MicrobialMatBiofilm
- MiscellaneousNaturalOrArtificialEnvironment
- PlantAssociated
- Sediment
- Soil
- SymbiontAssociated
- WastewaterSludge
- Water
range: string
recommended: true
pattern: ^[-+]?[0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?( *- *[-+]?[0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?)?
*([^\s-]{1,2}|[^\s-]+.+[^\s-]+)$
structured_pattern:
syntax: ^{scientific_float}( *- *{scientific_float})? *{text}$
interpolated: true
partial_match: true
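# Editorial illustration, not part of the MIxS schema: the pattern allows a single
# number or a "low - high" interval, followed by a unit. Both values below should
# match it; the interval is a hypothetical example.
#   10 meter
#   0.1 - 0.3 meter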
samp_collect_method:
name: samp_collect_method
description: The method employed for collecting the sample
title: sample collection method
examples:
- value: swabbing
in_subset:
- nucleic acid sequence source
from_schema: https://w3id.org/mixs
keywords:
- method
- sample
slot_uri: MIXS:0001225
alias: samp_collect_method
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksC
- MimarksS
- Mims
- Misag
- Miuvig
- Agriculture
- FoodAnimalAndAnimalFeed
- FoodFoodProductionFacility
- FoodHumanFoods
range: string
pattern: ^^PMID:\d+$|^doi:10.\d{2,9}/.*$|^https?:\/\/(?:www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b(?:[-a-zA-Z0-9()@:%_\+.~#?&\/=]*)$|([^\s-]{1,2}|[^\s-]+.+[^\s-]+)$
structured_pattern:
syntax: ^{PMID}|{DOI}|{URL}|{text}$
interpolated: true
partial_match: true
specific_host:
name: specific_host
annotations:
Expected_value:
tag: Expected_value
value: host scientific name, taxonomy ID
description: Report the host's taxonomic name and/or NCBI taxonomy ID
title: host scientific name
examples:
- value: Homo sapiens and/or 9606
in_subset:
- nucleic acid sequence source
from_schema: https://w3id.org/mixs
keywords:
- host
- host.
string_serialization: '{text}|{NCBI taxid}'
slot_uri: MIXS:0000029
alias: specific_host
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsPl
- MigsVi
- Miuvig
- Agriculture
range: string
recommended: true
env_medium:
name: env_medium
description: 'Report the environmental material(s) immediately surrounding the
sample or specimen at the time of sampling. We recommend using subclasses of
''environmental material'' (http://purl.obolibrary.org/obo/ENVO_00010483). EnvO
documentation about how to use the field: https://github.com/EnvironmentOntology/envo/wiki/Using-ENVO-with-MIxS
. Terms from other OBO ontologies are permissible as long as they reference
mass/volume nouns (e.g. air, water, blood) and not discrete, countable entities
(e.g. a tree, a leaf, a table top)'
title: environmental medium
examples:
- value: bluegrass field soil [ENVO:00005789]
in_subset:
- environment
from_schema: https://w3id.org/mixs
keywords:
- environmental
slot_uri: MIXS:0000014
alias: env_medium
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksC
- MimarksS
- Mims
- Misag
- Miuvig
range: string
required: true
pattern: ^([^\s-]{1,2}|[^\s-]+.+[^\s-]+) \[[a-zA-Z]{2,}:[a-zA-Z0-9]\d+\]$
structured_pattern:
syntax: ^{termLabel} \[{termID}\]$
interpolated: true
partial_match: true
samp_taxon_id:
name: samp_taxon_id
description: NCBI taxon id of the sample. May be a single taxon or mixed taxa
sample. Use 'synthetic metagenome' for mock community/positive controls, or
'blank sample' for negative controls
title: taxonomy ID of DNA sample
examples:
- value: Gut Metagenome [NCBITaxon:749906]
in_subset:
- investigation
from_schema: https://w3id.org/mixs
keywords:
- dna
- identifier
- sample
- taxon
slot_uri: MIXS:0001320
alias: samp_taxon_id
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksC
- MimarksS
- Mims
- Misag
- Miuvig
range: string
required: true
pattern: ^([^\s-]{1,2}|[^\s-]+.+[^\s-]+) \[NCBITaxon:\d+\]$
structured_pattern:
syntax: ^{text} \[{NCBItaxon_id}\]$
interpolated: true
partial_match: true
geo_loc_name:
name: geo_loc_name
description: The geographical origin of the sample as defined by the country or
sea name followed by specific region name. Country or sea names should be chosen
from the INSDC country list (http://insdc.org/country.html), or the GAZ ontology
(http://purl.bioontology.org/ontology/GAZ)
title: geographic location (country and/or sea,region)
examples:
- value: 'USA: Maryland, Bethesda'
in_subset:
- environment
from_schema: https://w3id.org/mixs
keywords:
- geographic
- location
slot_uri: MIXS:0000010
alias: geo_loc_name
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksC
- MimarksS
- Mims
- Misag
- Miuvig
- FoodAnimalAndAnimalFeed
- FoodFarmEnvironment
- FoodFoodProductionFacility
- FoodHumanFoods
- SymbiontAssociated
range: string
required: true
pattern: '^([^\s-]{1,2}|[^\s-]+.+[^\s-]+): ([^\s-]{1,2}|[^\s-]+.+[^\s-]+), ([^\s-]{1,2}|[^\s-]+.+[^\s-]+)$'
structured_pattern:
syntax: '^{text}: {text}, {text}$'
interpolated: true
partial_match: true
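# Editorial illustration, not part of the MIxS schema: as written, the pattern
# requires all three parts, "<country or sea>: <region>, <locality>", as in the
# slot's example:
#   USA: Maryland, Bethesda
# A country-only value such as "USA" would therefore not match this pattern.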
collection_date:
name: collection_date
description: 'The time of sampling, either as an instant (single point in time)
or interval. In case no exact time is available, the date/time can be right
truncated, i.e. all of these are valid times: 2008-01-23T19:23:10+00:00; 2008-01-23T19:23:10;
2008-01-23; 2008-01; 2008. Except for 2008-01 and 2008, all are ISO8601 compliant'
title: collection date
examples:
- value: '2013-03-25T12:42:31+01:00'
in_subset:
- environment
from_schema: https://w3id.org/mixs
keywords:
- date
slot_uri: MIXS:0000011
alias: collection_date
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksC
- MimarksS
- Mims
- Misag
- Miuvig
- FoodAnimalAndAnimalFeed
- FoodFarmEnvironment
- FoodFoodProductionFacility
- FoodHumanFoods
- SymbiontAssociated
range: datetime
required: true
seq_meth:
name: seq_meth
description: Sequencing machine used. Where possible the term should be taken
from the OBI list of DNA sequencers (http://purl.obolibrary.org/obo/OBI_0400103)
title: sequencing method
examples:
- value: 454 Genome Sequencer FLX [OBI:0000702]
in_subset:
- sequencing
from_schema: https://w3id.org/mixs
keywords:
- method
slot_uri: MIXS:0000050
alias: seq_meth
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksC
- MimarksS
- Mims
- Misag
- Miuvig
- Agriculture
- FoodAnimalAndAnimalFeed
- FoodFarmEnvironment
- FoodFoodProductionFacility
- FoodHumanFoods
range: string
required: true
pattern: ^([^\s-]{1,2}|[^\s-]+.+[^\s-]+)|(([^\s-]{1,2}|[^\s-]+.+[^\s-]+) \[[a-zA-Z]{2,}:[a-zA-Z0-9]\d+\])$
structured_pattern:
syntax: ^{text}|({termLabel} \[{termID}\])$
interpolated: true
partial_match: true
lat_lon:
name: lat_lon
description: The geographical origin of the sample as defined by latitude and
longitude. The values should be reported in decimal degrees, limited to 8 decimal
places, and in the WGS84 system
title: geographic location (latitude and longitude)
examples:
- value: 50.586825 6.408977
in_subset:
- environment
from_schema: https://w3id.org/mixs
keywords:
- geographic
- location
slot_uri: MIXS:0000009
alias: lat_lon
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksC
- MimarksS
- Mims
- Misag
- Miuvig
- FoodAnimalAndAnimalFeed
- FoodFarmEnvironment
- FoodFoodProductionFacility
- FoodHumanFoods
- SymbiontAssociated
range: string
required: true
pattern: ^(-?((?:[0-8]?[0-9](?:\.\d{0,8})?)|90)) -?[0-9]+(?:\.[0-9]{0,8})?$|^-?(1[0-7]{1,2})$
structured_pattern:
syntax: ^{lat} {lon}$
interpolated: true
partial_match: true
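# Editorial illustration, not part of the MIxS schema: latitude and longitude are
# written as two decimal-degree numbers separated by a single space, as in the
# slot's example:
#   50.586825 6.408977
# A comma-separated value such as "50.586825, 6.408977" would not match.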
elev:
name: elev
annotations:
Preferred_unit:
tag: Preferred_unit
value: meter
description: Elevation of the sampling site is its height above a fixed reference
point, most commonly the mean sea level. Elevation is mainly used when referring
to points on the earth's surface, while altitude is used for points above the
surface, such as an aircraft in flight or a spacecraft in orbit
title: elevation
examples:
- value: 100 meter
in_subset:
- environment
from_schema: https://w3id.org/mixs
keywords:
- elevation
slot_uri: MIXS:0000093
alias: elev
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksC
- MimarksS
- Mims
- Misag
- Miuvig
- Agriculture
- Air
- HostAssociated
- HydrocarbonResourcesCores
- MicrobialMatBiofilm
- MiscellaneousNaturalOrArtificialEnvironment
- PlantAssociated
- Sediment
- Soil
- SymbiontAssociated
- Water
range: string
recommended: true
pattern: ^[-+]?[0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?( *- *[-+]?[0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?)?
*([^\s-]{1,2}|[^\s-]+.+[^\s-]+)$
structured_pattern:
syntax: ^{scientific_float}( *- *{scientific_float})? *{text}$
interpolated: true
partial_match: true
env_broad_scale:
name: env_broad_scale
description: 'Report the major environmental system the sample or specimen came
from. The system(s) identified should have a coarse spatial grain, to provide
the general environmental context of where the sampling was done (e.g. in the
desert or a rainforest). We recommend using subclasses of EnvO''s biome class: http://purl.obolibrary.org/obo/ENVO_00000428.
EnvO documentation about how to use the field: https://github.com/EnvironmentOntology/envo/wiki/Using-ENVO-with-MIxS'
title: broad-scale environmental context
examples:
- value: rangeland biome [ENVO:01000247]
in_subset:
- environment
from_schema: https://w3id.org/mixs
keywords:
- context
- environmental
slot_uri: MIXS:0000012
alias: env_broad_scale
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksC
- MimarksS
- Mims
- Misag
- Miuvig
range: string
required: true
pattern: ^([^\s-]{1,2}|[^\s-]+.+[^\s-]+) \[[a-zA-Z]{2,}:[a-zA-Z0-9]\d+\]$
structured_pattern:
syntax: ^{termLabel} \[{termID}\]$
interpolated: true
partial_match: true
tax_class:
name: tax_class
description: Method used for taxonomic classification, along with reference database
used, classification rank, and thresholds used to classify new genomes
title: taxonomic classification
examples:
- value: vConTACT vContact2 (references from NCBI RefSeq v83, genus rank classification,
default parameters)
in_subset:
- sequencing
from_schema: https://w3id.org/mixs
keywords:
- classification
- taxon
slot_uri: MIXS:0000064
alias: tax_class
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- Mims
- Misag
- Miuvig
range: string
experimental_factor:
name: experimental_factor
annotations:
Expected_value:
tag: Expected_value
value: text or EFO and/or OBI
description: Variable aspects of an experiment design that can be used to describe
an experiment, or set of experiments, in an increasingly detailed manner. This
field accepts ontology terms from Experimental Factor Ontology (EFO) and/or
Ontology for Biomedical Investigations (OBI)
title: experimental factor
examples:
- value: time series design [EFO:0001779]
in_subset:
- investigation
from_schema: https://w3id.org/mixs
keywords:
- experimental
- factor
string_serialization: '{termLabel} [{termID}]|{text}'
slot_uri: MIXS:0000008
alias: experimental_factor
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksC
- MimarksS
- Mims
- Misag
- Miuvig
- FoodAnimalAndAnimalFeed
- FoodFoodProductionFacility
- FoodHumanFoods
range: string
multivalued: true
pattern: ^\S+.*\S+ \[[a-zA-Z]{2,}:\d+\]$
associated_resource:
name: associated_resource
annotations:
Expected_value:
tag: Expected_value
value: reference to resource
description: A related resource that is referenced, cited, or otherwise associated
with the sequence
title: relevant electronic resources
examples:
- value: http://www.earthmicrobiome.org/
in_subset:
- sequencing
from_schema: https://w3id.org/mixs
keywords:
- resource
string_serialization: '{PMID}|{DOI}|{URL}'
slot_uri: MIXS:0000091
alias: associated_resource
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksC
- MimarksS
- Mims
- Misag
- Miuvig
- Agriculture
range: string
recommended: true
multivalued: true
sop:
name: sop
annotations:
Expected_value:
tag: Expected_value
value: reference to SOP
description: Standard operating procedures used in assembly and/or annotation
of genomes, metagenomes or environmental sequences
title: relevant standard operating procedures
examples:
- value: http://press.igsb.anl.gov/earthmicrobiome/protocols-and-standards/its/
in_subset:
- sequencing
from_schema: https://w3id.org/mixs
keywords:
- procedures
string_serialization: '{PMID}|{DOI}|{URL}'
slot_uri: MIXS:0000090
alias: sop
owner: MigsBa
domain_of:
- MigsBa
- MigsEu
- MigsOrg
- MigsPl
- MigsVi
- Mimag
- MimarksC
- MimarksS
- Mims
- Misag
- Miuvig
- Agriculture
range: string
recommended: true
multivalued: true
class_uri: MIXS:0010003