Experiment metadata
Experiment metadata describes the conditions of your sequencing experiment. If you use different experimental settings for your samples, you must also document the connections between experiments and samples. Please note that experiment metadata is only required for FASTQ and BAM/CRAM submissions.
Metadata fields
The table below lists the fields that can be specified for an experiment. An asterisk (*) marks a required field.
Field name | Description |
---|---|
Alias* | The submitter’s designated name for the experiment. The name must be unique within the submission account. |
Design name* | A name for the design (e.g. “Whole genome sequencing - genomic library”) |
Instrument model* | The sequencing platform according to a controlled vocabulary (see permitted values) |
Library name | The submitter’s name for the library (e.g. “Solexa-32824”) |
Library layout* | Whether to expect single or paired reads (“PAIRED” or “SINGLE”) |
Library source* | The type of source material that is being sequenced according to a controlled vocabulary (see permitted values) |
Library selection* | Method used to enrich the target in the sequence library preparation according to a controlled vocabulary (see permitted values) |
Library construction protocol | Free form text describing the protocol by which the sequencing library was constructed |
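The required-field rules in the table above can be checked before a submission is prepared. The following is a minimal sketch, not an official validator: the snake_case keys and the function names `check_experiment` and `duplicate_aliases` are hypothetical, and the real column names should be taken from the metadata template you use.

```python
# Required fields, taken from the asterisked rows of the table above.
# The snake_case keys are hypothetical; adapt them to your template's columns.
REQUIRED_FIELDS = (
    "alias",
    "design_name",
    "instrument_model",
    "library_layout",
    "library_source",
    "library_selection",
)
LIBRARY_LAYOUTS = {"PAIRED", "SINGLE"}


def check_experiment(record):
    """Return a list of problems found in one experiment record (a dict)."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            problems.append(f"missing required field: {field}")
    layout = record.get("library_layout")
    if layout and layout not in LIBRARY_LAYOUTS:
        problems.append(f"library_layout must be PAIRED or SINGLE, got {layout!r}")
    return problems


def duplicate_aliases(records):
    """Return the aliases that occur more than once in a batch of records."""
    seen, duplicates = set(), set()
    for record in records:
        alias = record.get("alias")
        if alias is None:
            continue
        if alias in seen:
            duplicates.add(alias)
        seen.add(alias)
    return sorted(duplicates)
```

Note that `duplicate_aliases` only detects clashes within the batch it is given; uniqueness across the whole submission account cannot be checked locally.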
Experiment metadata templates
Templates for experiment metadata are available in the following formats:
File format | Download | Description |
---|---|---|
Text CSV | experiment-metadata.csv | Plain text template without validation |
ODF Spreadsheet | experiment-metadata.ods | LibreOffice spreadsheet with validation |
Office Open XML Spreadsheet | experiment-metadata.xlsx | Microsoft Excel spreadsheet with validation |
Controlled vocabularies
The following vocabularies define the permitted values for experiment metadata.
Permitted values for platform
The EGA Submitter Portal does not ask for this information, but the platform provides the context for the instrument (see the instrument vocabulary below). The table below has been compiled from ENA’s schema specification on GitHub.
Platform | Description |
---|---|
LS454 | 454 technology uses 1-color sequential flows |
ILLUMINA | Illumina is 4-channel flowgram with 1-to-1 mapping between basecalls and flows |
HELICOS | Helicos is similar to 454 technology: it uses 1-color sequential flows |
ABI_SOLID | ABI is 4-channel flowgram with 1-to-1 mapping between basecalls and flows |
COMPLETE_GENOMICS | CompleteGenomics platform type. At present there is no instrument model. |
BGISEQ | |
OXFORD_NANOPORE | Oxford Nanopore platform type. Nanopore-based electronic single molecule analysis. |
PACBIO_SMRT | PacificBiosciences platform type for the single molecule real time (SMRT) technology. |
ION_TORRENT | Ion Torrent Personal Genome Machine (PGM) from Life Technologies. |
CAPILLARY | Sequencers based on capillary electrophoresis technology manufactured by LifeTech (formerly Applied BioSciences). |
DNBSEQ | Sequencers based on DNBSEQ by MGI Tech. |
Permitted values for instrument
Use this vocabulary to specify the kind of instrument that was used. Select one of the instrument models below. The table below has been compiled from ENA’s schema specification on GitHub. For explanations of the values in the platform column, see Permitted values for platform above.
Instrument model | Platform | Remarks |
---|---|---|
454 GS | LS454 | |
454 GS 20 | LS454 | |
454 GS FLX | LS454 | |
454 GS FLX+ | LS454 | |
454 GS FLX Titanium | LS454 | |
454 GS Junior | LS454 | |
unspecified | LS454 | |
HiSeq X Five | ILLUMINA | |
HiSeq X Ten | ILLUMINA | |
Illumina Genome Analyzer | ILLUMINA | |
Illumina Genome Analyzer II | ILLUMINA | |
Illumina Genome Analyzer IIx | ILLUMINA | |
Illumina HiScanSQ | ILLUMINA | |
Illumina HiSeq 1000 | ILLUMINA | |
Illumina HiSeq 1500 | ILLUMINA | |
Illumina HiSeq 2000 | ILLUMINA | |
Illumina HiSeq 2500 | ILLUMINA | |
Illumina HiSeq 3000 | ILLUMINA | |
Illumina HiSeq 4000 | ILLUMINA | |
Illumina HiSeq X | ILLUMINA | |
Illumina iSeq 100 | ILLUMINA | |
Illumina MiSeq | ILLUMINA | |
Illumina MiniSeq | ILLUMINA | |
Illumina NovaSeq 6000 | ILLUMINA | |
NextSeq 500 | ILLUMINA | |
NextSeq 550 | ILLUMINA | |
NextSeq 1000 | ILLUMINA | |
NextSeq 2000 | ILLUMINA | |
unspecified | ILLUMINA | |
Helicos HeliScope | HELICOS | |
unspecified | HELICOS | |
AB SOLiD System | ABI_SOLID | Undifferentiated early AB SOLiD system |
AB SOLiD System 2.0 | ABI_SOLID | |
AB SOLiD System 3.0 | ABI_SOLID | |
AB SOLiD 3 Plus System | ABI_SOLID | |
AB SOLiD 4 System | ABI_SOLID | |
AB SOLiD 4hq System | ABI_SOLID | |
AB SOLiD PI System | ABI_SOLID | |
AB 5500 Genetic Analyzer | ABI_SOLID | |
AB 5500xl Genetic Analyzer | ABI_SOLID | |
AB 5500xl-W Genetic Analysis System | ABI_SOLID | |
unspecified | ABI_SOLID | |
Complete Genomics | COMPLETE_GENOMICS | |
unspecified | COMPLETE_GENOMICS | |
BGISEQ-50 | BGISEQ | |
BGISEQ-500 | BGISEQ | |
MGISEQ-2000RS | BGISEQ | |
PacBio RS | PACBIO_SMRT | |
PacBio RS II | PACBIO_SMRT | |
Sequel | PACBIO_SMRT | |
Sequel II | PACBIO_SMRT | |
Sequel IIe | PACBIO_SMRT | |
unspecified | PACBIO_SMRT | |
Ion Torrent PGM | ION_TORRENT | |
Ion Torrent Proton | ION_TORRENT | |
Ion Torrent S5 | ION_TORRENT | |
Ion Torrent S5 XL | ION_TORRENT | |
Ion Torrent Genexus | ION_TORRENT | |
Ion GeneStudio S5 | ION_TORRENT | |
Ion GeneStudio S5 Prime | ION_TORRENT | |
Ion GeneStudio S5 Plus | ION_TORRENT | |
unspecified | ION_TORRENT | |
AB 3730xL Genetic Analyzer | CAPILLARY | |
AB 3730 Genetic Analyzer | CAPILLARY | |
AB 3500xL Genetic Analyzer | CAPILLARY | |
AB 3500 Genetic Analyzer | CAPILLARY | |
AB 3130xL Genetic Analyzer | CAPILLARY | |
AB 3130 Genetic Analyzer | CAPILLARY | |
AB 310 Genetic Analyzer | CAPILLARY | |
unspecified | CAPILLARY | |
DNBSEQ-T7 | DNBSEQ | |
DNBSEQ-G400 | DNBSEQ | |
DNBSEQ-G50 | DNBSEQ | |
DNBSEQ-G400 FAST | DNBSEQ | |
unspecified | DNBSEQ | |
MinION | OXFORD_NANOPORE | |
GridION | OXFORD_NANOPORE | |
PromethION | OXFORD_NANOPORE | |
unspecified | OXFORD_NANOPORE | |
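Because the platform gives the context for each instrument model, it can be derived from the instrument table rather than entered separately. The sketch below assumes a hand-copied excerpt of that table; the names `INSTRUMENT_PLATFORM` and `platform_for` are hypothetical, and the dictionary would need the remaining rows before real use.

```python
# Excerpt of the instrument table above; extend it with the remaining rows.
INSTRUMENT_PLATFORM = {
    "Illumina NovaSeq 6000": "ILLUMINA",
    "NextSeq 2000": "ILLUMINA",
    "Sequel II": "PACBIO_SMRT",
    "PromethION": "OXFORD_NANOPORE",
    "DNBSEQ-T7": "DNBSEQ",
    "Ion Torrent S5": "ION_TORRENT",
}


def platform_for(instrument_model):
    """Look up the platform for an instrument model listed in the excerpt."""
    if instrument_model == "unspecified":
        # "unspecified" appears under several platforms in the table,
        # so the platform cannot be inferred from it.
        raise ValueError("'unspecified' does not identify a platform")
    try:
        return INSTRUMENT_PLATFORM[instrument_model]
    except KeyError:
        raise ValueError(
            f"instrument model not in the (partial) lookup: {instrument_model!r}"
        ) from None
```

The value `unspecified` is handled separately because it is the only instrument value in the table that maps to more than one platform.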
Permitted values for library source
Use this vocabulary to specify the type of source material that was sequenced. Select one of the values below. The table below has been compiled from ENA’s schema specification on GitHub.
Library source | Description |
---|---|
GENOMIC | Genomic DNA (includes PCR products from genomic DNA). |
GENOMIC SINGLE CELL | |
TRANSCRIPTOMIC | Transcription products or non genomic DNA (EST, cDNA, RT-PCR, screened libraries). |
TRANSCRIPTOMIC SINGLE CELL | |
METAGENOMIC | Mixed material from metagenome. |
METATRANSCRIPTOMIC | Transcription products from community targets |
SYNTHETIC | Synthetic DNA. |
VIRAL RNA | Viral RNA. |
OTHER | Other, unspecified, or unknown library source material. |
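A library source value can be checked against this vocabulary, with a suggestion offered for near misses. This is a minimal sketch using Python's standard `difflib` module; the function name `check_library_source` is hypothetical, and the list is copied from the table above.

```python
from difflib import get_close_matches

# Values copied from the library source table above.
LIBRARY_SOURCES = [
    "GENOMIC",
    "GENOMIC SINGLE CELL",
    "TRANSCRIPTOMIC",
    "TRANSCRIPTOMIC SINGLE CELL",
    "METAGENOMIC",
    "METATRANSCRIPTOMIC",
    "SYNTHETIC",
    "VIRAL RNA",
    "OTHER",
]


def check_library_source(value):
    """Return (is_valid, suggestion).

    suggestion is the closest permitted value for an invalid input,
    or None when the input is valid or nothing is similar enough.
    """
    if value in LIBRARY_SOURCES:
        return True, None
    matches = get_close_matches(value.upper(), LIBRARY_SOURCES, n=1)
    return False, (matches[0] if matches else None)
```

Only an exact match counts as valid here; a lowercase or misspelled value is reported as invalid together with the suggested spelling, so the correction stays a deliberate choice.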
Permitted values for library selection
Use this vocabulary to specify how the target was enriched in the sequence library preparation. Select one of the values below. The table below has been compiled from ENA’s schema specification on GitHub.
Library selection | Description |
---|---|
RANDOM | No Selection or Random selection |
PCR | target enrichment via PCR |
RANDOM PCR | Source material was selected by randomly generated primers. |
RT-PCR | target enrichment via reverse transcription PCR |
HMPR | Hypo-methylated partial restriction digest |
MF | Methyl Filtrated |
repeat fractionation | Selection for less repetitive (and more gene rich) sequence through Cot filtration (CF) or other fractionation techniques based on DNA kinetics. |
size fractionation | Physical selection of size appropriate targets. |
MSLL | Methylation Spanning Linking Library |
cDNA | PolyA selection or enrichment for messenger RNA (mRNA); synonymize with PolyA |
cDNA_randomPriming | |
cDNA_oligo_dT | |
PolyA | PolyA selection or enrichment for messenger RNA (mRNA); should replace cDNA enumeration. |
Oligo-dT | enrichment of messenger RNA (mRNA) by hybridization to Oligo-dT. |
Inverse rRNA | depletion of ribosomal RNA by oligo hybridization. |
Inverse rRNA selection | depletion of ribosomal RNA by inverse oligo hybridization. |
ChIP | Chromatin immunoprecipitation |
ChIP-Seq | Chromatin immunoprecipitation; reveals binding sites of specific proteins, typically transcription factors (TFs), using antibodies to extract DNA fragments bound to the target protein. |
MNase | Identifies well-positioned nucleosomes. Micrococcal nuclease (MNase) is an endo-exonuclease that processively digests DNA until an obstruction, such as a nucleosome, is reached. |
DNase | DNase I endonuclease digestion and size selection reveals regions of chromatin where the DNA is highly sensitive to DNase I. |
Hybrid Selection | Selection by hybridization in array or solution. |
Reduced Representation | Reproducible genomic subsets, often generated by restriction fragment size selection, containing a manageable number of loci to facilitate re-sampling. |
Restriction Digest | DNA fractionation using restriction enzymes. |
5-methylcytidine antibody | Selection of methylated DNA fragments using an antibody raised against 5-methylcytosine or 5-methylcytidine (m5C). |
MBD2 protein methyl-CpG binding domain | Enrichment by methyl-CpG binding domain. |
CAGE | Cap-analysis gene expression. |
RACE | Rapid Amplification of cDNA Ends. |
MDA | Multiple Displacement Amplification, a non-PCR based DNA amplification technique that amplifies minute quantities of DNA to levels suitable for genomic analysis. |
padlock probes capture method | Targeted sequence capture protocol covering an arbitrary set of nonrepetitive genomics targets. An example is capture bisulfite sequencing using padlock probes (BSPP). |
other | Other library enrichment, screening, or selection process. |
unspecified | Library enrichment, screening, or selection is not specified. |