Explaining DNA vs. RNA sequencing

The process of establishing the precise order of nucleotides within an RNA or DNA molecule is referred to as sequencing. Researchers have spent the past 56 years developing new methods and technologies to determine the nucleic acid sequences in biological materials more accurately. The precision with which we can now sequence DNA and RNA has had a significant impact on a variety of scientific subfields and on clinical practice. This article surveys the main types of DNA and RNA sequencing and their current uses.

 

Figure 1. Flow diagram indicating possible sequencing strategies for different sample types.

RNA-Seq vs. DNA-Sequencing

In general, data generated by DNA sequencing reflect a snapshot of genome information: inherited gene mutations, acquired single nucleotide polymorphisms (SNPs), genome reorganization events such as insertions/deletions, and others. RNA sequencing, by contrast, provides a dynamic picture of intracellular events such as gene expression and miRNA expression profiles.

In clinical samples of glioblastoma, single-cell RNA sequencing showed that individual tumor cells vary in terms of their degree of stemness-related gene expression from extremely stem-like to differentiated states. Additionally, the existence of cancer stem cells that continuously differentiate into astrocyte- and oligodendrocyte-like cells has been demonstrated in oligodendrogliomas by single-cell RNA sequencing. Single-cell DNA sequencing has also been applied to breast cancer samples to evaluate intra-tumoral heterogeneity originating in genomic DNA, leading to the suggestion of stepwise/sweepstake or gradual evolution of cancer cells from single-nucleotide variation (SNV) data. However, these types of DNA sequence analyses highlighting respective evolutionary mechanisms are based on snapshot data at one time point. Therefore, in terms of oncology studies, it is highly important and necessary to deliver both RNA and DNA sequencing data over time for the full elucidation of tumor evolutionary dynamics and the progress of malignant transformations.

Next-generation sequencing (NGS) is a massively parallel sequencing technology that offers ultra-high throughput, scalability, and speed. The technology is used to determine the order of nucleotides in entire genomes or targeted regions of DNA or RNA. NGS has revolutionized the biological sciences, allowing labs to perform a wide variety of applications and study biological systems at a level never before possible. A detailed discussion of different NGS platforms is beyond the scope of this article and can be found elsewhere.

The two basic approaches used by NGS technology are short-read sequencing and long-read sequencing, each with its own set of advantages and disadvantages (Table 1).

 

Table 1. The advantages and limitations of short-read sequencing versus long-read sequencing.

 

The enormous potential of NGS in clinical and scientific settings is the fundamental motivation for investing in its development. In clinical settings, NGS is used to diagnose diseases by detecting germline and somatic mutations. Given the method's efficacy and continually dropping costs, its use in clinical practice is justified. NGS has also proven to be an invaluable resource in metagenomic research and in infectious disease diagnosis, monitoring, and management. NGS approaches were vital in identifying the SARS-CoV-2 genome in 2020 and continue to make significant contributions to tracking the COVID-19 pandemic.

Types of DNA Sequencing

In the most common type of DNA sequencing, sequencing by synthesis, the nucleotides are chemically tagged with a fluorescent label. The resulting sequence information can be used to identify genetic variations/mutations or disease-causing changes (substitution, deletion, or addition) of base pairs.
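To make the three kinds of change concrete, here is a minimal sketch that labels a variant from its reference and alternate alleles. It assumes the alleles are written VCF-style (the shared anchor base is included for deletions and additions); the function name `classify_variant` is made up for this illustration.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Label a variant by comparing reference and alternate allele lengths.

    Assumes VCF-style alleles, e.g. ref="ATG", alt="A" for a 2-base deletion.
    """
    if ref == alt:
        return "no change"
    if len(ref) == len(alt):
        return "substitution"
    # A shorter alternate allele means bases were lost; a longer one means bases were added.
    return "deletion" if len(ref) > len(alt) else "addition"


# Hypothetical example alleles, not taken from any real sample.
for ref, alt in [("A", "G"), ("ATG", "A"), ("C", "CTT")]:
    print(f"{ref} -> {alt}: {classify_variant(ref, alt)}")
```

Real variant callers also weigh read depth and base quality before reporting a change; this sketch only covers the final labeling step.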

1. Whole genome sequencing (WGS)

Whole genome sequencing is used to first establish an organism's genome sequence. Genome-wide association studies (GWAS) can also employ WGS to estimate the frequency of variants (mutations) in populations of organisms and link genetic variations to disease. In a GWAS, WGS is performed on two populations, and trait and genetic differences are compared in order to link observed traits to identified variants. As its cost drops, WGS is increasingly used as a diagnostic tool.
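The two-population comparison can be sketched with simple allele counts. The example below assumes one variant, a case group and a control group, and made-up counts; it computes the variant allele frequency in each group and an odds ratio as a basic measure of association. Real GWAS pipelines add statistical testing and multiple-testing correction, which are left out here.

```python
def allele_frequency(alt_count: int, ref_count: int) -> float:
    """Fraction of observed alleles that carry the variant."""
    return alt_count / (alt_count + ref_count)


def odds_ratio(case_alt: int, case_ref: int, control_alt: int, control_ref: int) -> float:
    """Odds of carrying the variant allele in cases relative to controls."""
    return (case_alt * control_ref) / (case_ref * control_alt)


# Hypothetical allele counts for a single variant (illustration only).
case_alt, case_ref = 300, 700
control_alt, control_ref = 200, 800

print("case frequency:   ", allele_frequency(case_alt, case_ref))
print("control frequency:", allele_frequency(control_alt, control_ref))
print("odds ratio:       ", round(odds_ratio(case_alt, case_ref, control_alt, control_ref), 3))
```

An odds ratio above 1 means the variant is more common in the group with the trait, which is the kind of signal a GWAS looks for across many variants at once.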

2. Whole exome sequencing (WES)

Whole exome sequencing covers all the protein-coding regions (exons) of the genome. Focusing on protein-coding exons (and excluding other regions of the genome) can lower the cost and time of sequencing, as exons make up only about 1% of the genome. Variants in protein-coding exons are responsible for many diseases, so this level of sequencing is often sufficient for diagnostic applications. WES is a more practical method for mapping variants that are rare in the population to elucidate complex disorders. It is also a feasible option for discovery science. WES is particularly useful in oncology research and is currently used for cancer diagnostics. Information gained from WES can provide insight into prognoses and personalized treatment options. WES is most often carried out with hybridization probes rather than amplicons.
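A rough calculation shows why a smaller target lowers sequencing cost. The sketch below uses the standard relation coverage = (number of reads x read length) / target size. The human genome size of about 3.1 billion bases, the 150-base read length, and the 30x and 100x depths are assumptions chosen for illustration, and off-target reads during capture are ignored.

```python
def reads_needed(coverage: int, target_size_bp: int, read_length_bp: int) -> int:
    """Reads required to reach a mean coverage, rounded up (integer arithmetic)."""
    total_bases = coverage * target_size_bp
    return -(-total_bases // read_length_bp)  # ceiling division


GENOME_BP = 3_100_000_000    # assumed human genome size
EXOME_BP = GENOME_BP // 100  # "about 1% of the genome", per the text above
READ_LEN = 150               # assumed short-read length

wgs_reads = reads_needed(30, GENOME_BP, READ_LEN)   # assumed 30x WGS
wes_reads = reads_needed(100, EXOME_BP, READ_LEN)   # assumed 100x WES

print(f"WGS at 30x:  {wgs_reads:,} reads")
print(f"WES at 100x: {wes_reads:,} reads")
print(f"WGS needs about {wgs_reads / wes_reads:.0f} times more reads")
```

Even at a much higher depth, the exome needs far fewer reads under these assumptions, which is where the cost and time savings come from.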

3. Targeted sequencing

Targeted NGS makes it easier and more affordable than whole genome sequencing to sequence specific regions of the genome for in-depth studies. Targeted sequencing finds both new and known variants in the region of interest. This technique typically generates less data than WGS, which simplifies analysis. There are various approaches to targeted sequencing, each suited to particular uses. The most commonly used techniques include molecular inversion probes (MIPs), amplicon sequencing, and hybridization capture. See the summary below for a more thorough comparison of amplicon sequencing and hybridization capture.

3A. Hybridization capture

Prior to hybridization capture, samples are converted into sequencing libraries. Regions of interest in this library are then captured using long oligonucleotide biotinylated baits (Fig. 2). Because the DNA was randomly sheared during library preparation, captured fragments are overlapping and unique. Baits can be tiled, overlapped, and positioned to overcome challenges of repetitive sequences, etc. With advanced design, capture can be made very uniform. Hybridization capture is often used for targeted exome sequencing. Other applications include genotyping, rare variant detection, and oncology diagnostics.
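Bait tiling can be pictured as laying overlapping windows across a target region. The sketch below is only an illustration of that idea: the 120-base bait length, the 60-base step, and the function name `tile_baits` are assumptions, and real bait design also accounts for GC content and repetitive sequence.

```python
def tile_baits(start: int, end: int, bait_len: int = 120, step: int = 60):
    """Return half-open (start, end) bait coordinates that cover [start, end).

    Baits overlap whenever step < bait_len; the last bait sits flush with
    the end of the target so that no base is left uncovered.
    """
    if end - start <= bait_len:
        return [(start, start + bait_len)]  # one bait covers a short target
    baits = []
    pos = start
    while pos + bait_len < end:
        baits.append((pos, pos + bait_len))
        pos += step
    baits.append((end - bait_len, end))
    return baits


# Hypothetical 300-base target region.
for bait in tile_baits(1000, 1300):
    print(bait)
```

With a step of half the bait length, every interior base is covered by two baits, which is one simple way to make capture more uniform.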

 

Figure 2. Samples through hybridization converted into sequencing libraries.

3B. Amplicon sequencing

Amplicon sequencing is a highly specific method for investigating variation in particular genomic regions. This technique uses PCR to amplify DNA and produce amplicons, which are then indexed and sequenced (Fig. 2). Amplicon sequencing is often used in diagnostics and in the discovery of disease-associated variants. It can also be used for genotyping and for confirming CRISPR genome edits; rhAmpSeq targeted sequencing, for example, can be used to analyze CRISPR-Cas9 modifications quickly and precisely.
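Indexing lets many samples share one sequencing run, because each read can be traced back to its sample afterwards. The sketch below assumes reads arrive as (index, sequence) pairs and that a sample sheet maps each index to a sample name; the index sequences and sample names are invented for the example.

```python
from collections import defaultdict


def demultiplex(reads, index_to_sample):
    """Group read sequences by sample using each read's index sequence.

    Reads whose index is not in the sample sheet go to "undetermined".
    """
    bins = defaultdict(list)
    for index, sequence in reads:
        sample = index_to_sample.get(index, "undetermined")
        bins[sample].append(sequence)
    return dict(bins)


# Hypothetical sample sheet and reads.
sample_sheet = {"ACGT": "patient_1", "TTAG": "patient_2"}
reads = [
    ("ACGT", "GATTACAGATTACA"),
    ("TTAG", "CCGGAATTCCGGAA"),
    ("ACGT", "GATTACAGATTGCA"),
    ("NNNN", "TTTTTTTTTTTTTT"),
]

for sample, seqs in demultiplex(reads, sample_sheet).items():
    print(sample, len(seqs))
```

Production demultiplexers usually tolerate a mismatch or two in the index; exact matching keeps this sketch short.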

Types of RNA Sequencing

RNA-seq is similar to DNA sequencing but with an added step. Instead of isolating DNA, RNA is extracted from a sample and then reverse transcribed to produce cDNA. From there, the cDNA is fragmented and run through a high-throughput next-generation sequencing system.
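At the sequence level, reverse transcription produces a DNA strand complementary to the RNA template. The sketch below models only that base-pairing step (it says nothing about primers or enzyme behavior), and the function name `first_strand_cdna` is made up for the example.

```python
# Base pairing between an RNA template and the DNA strand copied from it.
RNA_TO_DNA_PAIR = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(rna: str) -> str:
    """Return the first-strand cDNA, written 5'->3', for an RNA given 5'->3'.

    The cDNA is the reverse complement of the RNA, with T in place of U.
    """
    return "".join(RNA_TO_DNA_PAIR[base] for base in reversed(rna.upper()))


# Hypothetical six-base RNA fragment.
print(first_strand_cdna("AUGGCC"))
```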

High-throughput sequencing of RNA (RNA-Seq) has revolutionized biological research by providing a new means for quantitative measurement of transcription, a highly dynamic process that controls many cellular functions. A transcriptome encompasses the full range of coding RNA and non-coding RNA transcripts expressed by an organism, also referred to as total RNA (Fig. 3). The term “transcriptome” can also be used to describe the array of mRNA transcripts produced in a particular cell or tissue type.

In contrast with the genome, the transcriptome actively changes due to many factors, including the organism’s developmental stage and environmental conditions. RNA-Seq can be used to simultaneously measure expression in thousands of genes under one condition or compare it across multiple conditions; the latter is known as differential gene expression analysis.
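A comparison between two conditions can be sketched with read counts per gene. The example below scales counts to counts per million (CPM) so that libraries of different sizes are comparable, then reports a log2 fold change per gene. The gene names, the counts, and the pseudocount of 1 are assumptions for illustration; dedicated tools such as DESeq2 or edgeR use replicate-aware statistical models instead of this simple ratio.

```python
import math


def counts_per_million(counts: dict) -> dict:
    """Scale raw read counts so that each library sums to one million."""
    total = sum(counts.values())
    return {gene: count * 1_000_000 / total for gene, count in counts.items()}


def log2_fold_change(condition_a: dict, condition_b: dict, pseudocount: float = 1.0) -> dict:
    """log2(B / A) per gene on CPM values; the pseudocount avoids division by zero."""
    cpm_a = counts_per_million(condition_a)
    cpm_b = counts_per_million(condition_b)
    return {
        gene: math.log2((cpm_b[gene] + pseudocount) / (cpm_a[gene] + pseudocount))
        for gene in cpm_a
    }


# Hypothetical read counts for two genes under two conditions.
control = {"geneA": 100, "geneB": 900}
treated = {"geneA": 400, "geneB": 600}

for gene, lfc in log2_fold_change(control, treated).items():
    print(f"{gene}: log2 fold change = {lfc:+.2f}")
```

A positive value means the gene is more highly expressed in the second condition, a negative value the opposite.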

 

Figure 3. Circular chart showing rRNA, tRNA, mRNA, lncRNA, miRNA, and other small RNA.

Coding RNA: also called messenger RNA (mRNA), an RNA molecule that is translated into protein.

Non-coding RNA: an RNA molecule that is not translated into a protein. Non-coding RNA typically includes, but is not limited to, rRNA, tRNA, lncRNA, miRNA, piRNA, and snRNA.

1. Whole transcriptome sequencing (WTS) or RNA-seq

Because RNA expression levels can vary with cell type and disease state, understanding the transcriptome can provide valuable insight. Sequencing the whole transcriptome yields the most comprehensive data, since it captures both coding and noncoding RNA and both known and novel variants. Stranded RNA-seq is a variation of whole transcriptome sequencing that retains information about the strand from which each transcript was transcribed. This makes it possible to identify novel transcripts, including antisense RNAs, and RNA-seq is therefore often used for discovery science.

Typically, RNA-seq is used to evaluate messenger RNA (mRNA) expression levels and can also be used to evaluate changes in gene expression over time, gene fusions, single nucleotide polymorphisms (SNPs), alternative splicing, and RNA modifications.

2. Targeted gene expression with RNA-Sequencing

Like DNA, RNA can be targeted for sequencing using hybridization capture or amplicon sequencing technologies. Specific populations of RNA can be targeted. Most often, coding transcripts are targeted, but other populations of RNA such as tRNA (transfer RNA) or small RNA may be enriched.

3. Ribosomal RNA depletion

Ribosomal RNA (rRNA) makes up 80-95% of the cell’s RNA; however, since its expression is constant, it is rarely of interest. It can therefore be advantageous to avoid sequencing these molecules and to focus sequencing capacity on more informative transcripts. rRNA may be removed from samples by one of two methods:

1) Biotinylated probes can be used to bind the rRNA and thus remove it from the RNA sample.

2) DNA probes that bind the rRNA can be used in conjunction with RNase H to degrade the rRNA in the sample before library prep.
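The benefit of depletion is easy to estimate with a little arithmetic. The sketch below assumes a sample in which 90% of the RNA is rRNA (within the 80-95% range above) and a depletion step that removes 99% of it; the 99% efficiency is a made-up figure for illustration, not a property of either method.

```python
def informative_fraction(rrna_fraction: float, depletion_efficiency: float) -> float:
    """Share of the remaining RNA that is not rRNA after depletion.

    rrna_fraction: starting share of rRNA in the sample (0 to 1).
    depletion_efficiency: share of rRNA removed by depletion (0 to 1).
    """
    other_rna = 1.0 - rrna_fraction
    remaining_rrna = rrna_fraction * (1.0 - depletion_efficiency)
    return other_rna / (other_rna + remaining_rrna)


before = informative_fraction(0.90, 0.0)   # no depletion
after = informative_fraction(0.90, 0.99)   # assumed 99% of rRNA removed

print(f"non-rRNA share before depletion: {before:.1%}")
print(f"non-rRNA share after depletion:  {after:.1%}")
```

Under these assumptions, most sequencing reads shift from rRNA to the transcripts of interest.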

Epigenomics

1. ChIP-Seq

Chromatin immunoprecipitation (ChIP) is used to evaluate protein interactions with DNA. Proteins bound to DNA can regulate genes and thereby affect their expression. This type of regulation can be influenced by the environment and can change over time. ChIP identifies sites in the DNA sequence where a protein is bound. An antibody is used to bind the protein of interest, which allows immunoprecipitation of the DNA bound to that protein.

When this type of identification is done using an array, it is called ChIP-chip. Evaluating protein interactions of the whole genome by sequencing is called ChIP-Seq. ChIP-Seq experiments usually focus on transcription factors, histones, and histone modifications so they can reveal information about gene regulation, cell proliferation, and disease progression.

2. Methyl-seq

Methylation, another mechanism by which DNA is regulated, is also influenced by the environment and can change over time. Methyl groups are added to the DNA sequence and can repress gene expression.

Methyl-Seq, also known as bisulfite sequencing, treats the DNA with bisulfite before sequencing to provide information about the methylation status of the DNA sequence. This information is primarily used to evaluate gene-environment interactions.
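The logic behind bisulfite sequencing is that bisulfite converts unmethylated cytosines to uracil, which is read as T after amplification, while methylated cytosines are protected and still read as C. The sketch below simulates that conversion on a short made-up sequence and then infers methylation by comparing the converted read with the reference. It assumes complete conversion and an error-free read from the same strand, which real data do not guarantee.

```python
def bisulfite_convert(sequence: str, methylated_positions: set) -> str:
    """Simulate bisulfite treatment: unmethylated C reads as T, methylated C stays C."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(sequence)
    )


def call_methylation(reference: str, converted_read: str) -> dict:
    """For each C in the reference, report True if the read still shows C."""
    return {
        i: read_base == "C"
        for i, (ref_base, read_base) in enumerate(zip(reference, converted_read))
        if ref_base == "C"
    }


# Hypothetical reference with the cytosine at position 1 methylated.
reference = "ACGTCCGA"
read = bisulfite_convert(reference, methylated_positions={1})

print(read)
print(call_methylation(reference, read))
```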

Single-cell sequencing is a powerful technology for investigating intra-tumoral heterogeneity (ITH) by identifying genomic alterations and distinct transcriptomic states in single tumor cells.

In a study published in Cancer Discovery, genomic analyses of metastatic pancreatic cancers suggested that approximately one third of pancreatic cancer patients may have a genomic alteration that could impact treatment decisions and guide doctors toward a specific therapy in a personalized medicine approach.

The study describes a metastatic tumor biopsy protocol now being used at Dana-Farber Cancer Institute called PancSeq, which was implemented to perform whole exome DNA sequencing and RNA sequencing for patients with advanced pancreatic cancer. Additionally, both tumor DNA and inherited DNA were sequenced for all patients. The analyzed data were then given to their clinicians to assist in the patients’ care.

Overall, 30% of enrolled patients had a change in their clinical care as a result of their genomic data, including the recommendation for some patients that family members consider genetic testing due to a potential inherited predisposition to pancreatic cancer. These data demonstrate how the timely collection of genetic information can impact treatment decision-making in pancreatic cancer through enrollment of patients in clinical trials or the use of off-label targeted therapies.

RNA sequencing (RNA-Seq) is a powerful method for studying the transcriptome qualitatively and quantitatively. It can identify the full catalogue of transcripts, precisely define the structure of genes, and accurately measure gene expression levels. Several RNA sequencing services offer flexibility in the analysis of different RNA species (coding, non-coding, and small transcripts) from a wide range of starting material using long- or short-read sequencing. NGS parallelizes the sequencing process, producing thousands or millions of sequences or fragments (short reads) concurrently. The many fragmented reads then need to be assembled on the basis of their overlapping regions or aligned to a reference genome (RNA alignment).
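Assembly by overlap can be illustrated with a toy greedy approach: repeatedly find the two reads with the longest suffix-to-prefix overlap and merge them. This is a minimal sketch on made-up, error-free reads; the function names and the minimum overlap of 3 bases are assumptions, and production assemblers use graph-based methods that cope with sequencing errors and repeats.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for n in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:n]):
            return n
    return 0


def greedy_assemble(reads, min_overlap: int = 3):
    """Merge the best-overlapping pair of reads until no overlaps remain."""
    contigs = list(reads)
    while len(contigs) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap_length(a, b, min_overlap)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Hypothetical short reads sampled from one sequence.
print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"]))
```

Alignment to a reference takes the other route described above: instead of merging reads with each other, each read is placed at the position in a known genome where it matches best.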

DISPENDIX Blog