ISO 20397-2:2021 pdf download

ISO 20397-2:2021 pdf download – Biotechnology — Massively parallel sequencing — Part 2: Quality evaluation of sequencing data

3.13 deletion
loss of one (or more) nucleotide base pair(s) from a nucleic acid sequence compared to its reference sequence

3.14 duplication level
number of identical repeats for every sequence in a library
Note 1 to entry: The duplication level is usually displayed in a plot showing the relative number of sequences with different degrees of duplication.

3.15 GC content
percentage of guanine and cytosine in one or more nucleic acid sequence(s)
Note 1 to entry: The amount of guanine and cytosine in a polynucleic acid is usually expressed as a mole fraction (or percentage) of total nitrogenous bases. Total nitrogenous bases comprise the total number of nucleotide bases of reads from one or more MPS runs.

3.16 gene
sequence of nucleotides in DNA or RNA encoding either an RNA or a protein product
Note 1 to entry: Genes are recognized as the basic unit of heredity.
Note 2 to entry: A gene can consist of non-contiguous nucleic acid segments that are rearranged through a nuclear processing step.
Note 3 to entry: A gene may include or be part of an operon that includes elements for gene expression.

3.17 indel
insertion (3.18) and/or deletion (3.13) of nucleotides in genomic DNA
Note 1 to entry: Indels are less than 1 000 bases in length.

3.18 insertion
addition of one (or more) nucleotide base pair(s) into a nucleic acid sequence
[SOURCE: ISO/TS 20428:2017, 3.19, modified – DNA was replaced by nucleic acid.]

3.19 sequencing
determining the order and the content of nucleotide bases (adenine, guanine, cytosine, thymine, and uracil) of a nucleic acid molecule
Note 1 to entry: A sequence is generally described from the 5' to 3' end.
[SOURCE: ISO/TS 17822-1:2020, 3.19, modified – DNA was deleted in the term; DNA was replaced by nucleic acid, and uracil was added in the definition.]
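
The definitions of duplication level (3.14) and GC content (3.15) translate directly into simple counts over a set of reads. The following is a minimal sketch, not a method prescribed by the standard. The function names `gc_content` and `duplication_levels` are hypothetical, reads are assumed to be plain strings, and ambiguous bases such as N are assumed to count toward the total number of bases.

```python
from collections import Counter


def gc_content(reads):
    """Percentage of G and C among all bases of the given reads.

    Assumption: every character (including N) counts toward the total.
    """
    total = sum(len(r) for r in reads)
    if total == 0:
        return 0.0
    gc = sum(r.upper().count("G") + r.upper().count("C") for r in reads)
    return 100.0 * gc / total


def duplication_levels(reads):
    """Map each duplication level (copies of an identical sequence)
    to the number of distinct sequences observed at that level."""
    copies = Counter(reads)                 # sequence -> number of copies
    return dict(Counter(copies.values()))   # level -> number of sequences


reads = ["ACGT", "ACGT", "GGCC", "ATAT"]
print(gc_content(reads))          # 8 G/C bases out of 16 -> 50.0
print(duplication_levels(reads))  # one sequence seen twice, two seen once
```

The dictionary returned by `duplication_levels` is the raw material for the kind of plot mentioned in Note 1 to 3.14: normalising its values by the number of distinct sequences gives the relative number of sequences at each degree of duplication.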
3.21 raw data
primary sequencing data produced by a sequencer without involving any software-based pre-filtering for analysis purposes

3.22 RNA
ribonucleic acid
polymer of ribonucleotides occurring in a double-stranded or single-stranded form
Note 1 to entry: Synthesis of proteins in cells is directed by genetic information carried in the sequence of nucleotides in a class of RNA known as messenger RNA (mRNA).

3.23 ribonucleotide
nucleotide containing ribose as its pentose component forming the basic building blocks for RNA
Note 1 to entry: The ribonucleotides consist of adenylate (AMP), guanylate (GMP), cytidylate (CMP), or uridylate (UMP).

3.24 read
sequence read
nucleotide sequence generated by a sequencing device
Note 1 to entry: A read is a deduced sequence of nucleic acid base pairs (or base pair probabilities) corresponding to all (or part of) a single nucleic acid fragment. Read can be used to refer to those sequences obtained from MPS experiments.

3.25 read type
category of sequence that depends on how the sequence reading experiment is designed and conducted
EXAMPLE Read type can be single-end, paired-end, mate-paired end, continuous long read, circular consensus.

3.26 reference sequence
nucleic acid sequence used either to align by mapping sequence reads or as the basis for annotations such as genes and sequence variations

3.27 demultiplexing
computational reverse of the multiplexing process, i.e. of mixing two or more samples together such that they can be sequenced in a single run on an MPS instrument
Note 1 to entry: Samples that are to be combined need to be barcoded/indexed prior to being mixed together.
Note 2 to entry: Demultiplexing is a computational algorithm that separates a pool of reads according to their original sample based on the barcode.
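
Note 2 to 3.27 describes demultiplexing as separating a pool of reads by barcode. A minimal sketch of that idea follows. It assumes a fixed-length inline barcode at the start of each read and exact matching only, whereas real instruments often use separate index reads and tolerate mismatches. The function name `demultiplex`, the sample names, and the barcode sequences are all hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample, barcode_length):
    """Group reads by sample using an exact match on the leading barcode.

    Assumptions: the barcode occupies the first `barcode_length` bases of
    each read, and unmatched reads are collected under "undetermined".
    """
    bins = defaultdict(list)
    for read in reads:
        barcode = read[:barcode_length]
        insert = read[barcode_length:]  # read with the barcode trimmed off
        sample = barcode_to_sample.get(barcode, "undetermined")
        bins[sample].append(insert)
    return dict(bins)


barcodes = {"AAGG": "sample_A", "CCTT": "sample_B"}
pool = ["AAGGACGTAC", "CCTTGGGTCA", "AAGGTTTTGG", "GGGGACACAC"]

for sample, sample_reads in demultiplex(pool, barcodes, 4).items():
    print(sample, sample_reads)
```

Keeping an explicit "undetermined" bin rather than discarding unmatched reads makes it possible to see how much of the pool could not be assigned to any sample.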
4 Raw data

4.1 General
Each nucleotide in a sequence should be assigned a numerical value (base quality score) that correlates to the inferred accuracy of the base calling process, if applicable.

4.2 Raw data file
Generation of sequence read files should use instrument-specific software and/or instrument-specific pipelines. Monitored physical parameters such as signal-to-noise ratio shall be documented. These physical parameters should be monitored during each sequencing experiment.
Sequence read files should be configured in the appropriate file format, containing the compilation of individual sequence reads, each with its own identifier, and an associated base quality score for each nucleotide.
NOTE FASTQ format (or convertible to FASTQ format) can be used as a de facto standard format for downstream analysis of the quality of MPS data sets. FASTQ is widely accepted as a cross-platform interchange file format.
The output files generated after a sequencing run, and associated quality metrics, should be analysed in the downstream bioinformatics pipeline using appropriate software.
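
As an illustration of a read file that carries an identifier and a per-base quality score for each read, the sketch below parses FASTQ text and decodes the quality string. It is not part of the standard and rests on several assumptions: records are exactly four lines long, quality characters use the common Phred+33 ASCII encoding, and a score Q relates to the base-calling error probability P by Q = -10 log10(P). The function names are hypothetical.

```python
import io


def parse_fastq(handle):
    """Yield (identifier, sequence, quality_string) from four-line records.

    Stops at end of input or at the first blank line.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # separator line starting with '+'
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(sequence) != len(quality):
            raise ValueError(f"malformed FASTQ record: {header!r}")
        yield header[1:], sequence, quality


def phred_scores(quality, offset=33):
    """Decode a quality string into integer scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def error_probability(q):
    """Error probability implied by a Phred score: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


example = "@read1\nACGT\n+\nII5!\n"
for name, seq, qual in parse_fastq(io.StringIO(example)):
    scores = phred_scores(qual)
    print(name, seq, scores, sum(scores) / len(scores))
```

For real data sets, an established parser and quality-control tool is usually preferable to hand-written code; the sketch is only meant to make the structure of a record and its quality scores concrete.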
