Structure and Analysis of Eukaryotic Genes
- Split genes
- Multigene families
- Functional analysis of eukaryotic genes
Split genes and introns
- The mRNA-coding portion of a gene can be split by DNA sequences that do not encode mature mRNA
- Exons code for mRNA; introns are segments of genes that do not encode mRNA.
- Introns are found in most genes in eukaryotes
- They are also found in some bacteriophage genes and in some genes in archaea
R-loops can reveal introns
The mRNA-coding regions (exons) are separated by introns on the chromosome
Examples of R-loops in mammalian hemoglobin genes
Types of exons
Finding exons with computers
- Ab initio computation: uses an explicit, sophisticated model of gene structure, splice-site properties, etc. to predict exons
- Comparison of cDNA sequence with genomic sequence: BLAST2 alignments between cDNA and genomic sequences
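The cDNA-versus-genomic comparison can be illustrated with a minimal sketch. This is not BLAST2: it assumes the cDNA matches the genomic sequence exactly, with no mismatches or gaps, and simply extends exact matches greedily. The function name `find_exons`, the `min_len` seed length, and the toy sequences are all made up for illustration.

```python
def find_exons(genomic, cdna, min_len=8):
    """Tile the cDNA onto the genomic sequence with exact matches.

    Returns a list of (start, end) genomic intervals, 0-based and
    end-exclusive. Assumes exons appear in the same order and on the
    same strand in both sequences (an assumption of this sketch).
    """
    exons = []
    c = 0  # current position in the cDNA
    g = 0  # earliest genomic position the next exon may start at
    while c < len(cdna):
        seed = cdna[c:c + min_len]  # shorter than min_len only at the very end
        best_start, best_len = -1, 0
        start = genomic.find(seed, g)
        while start != -1:
            # Extend the seed match base by base until the sequences diverge.
            n = len(seed)
            while (c + n < len(cdna) and start + n < len(genomic)
                   and genomic[start + n] == cdna[c + n]):
                n += 1
            if n > best_len:
                best_start, best_len = start, n
            start = genomic.find(seed, start + 1)
        if best_start == -1:
            raise ValueError(f"cDNA position {c} not found in genomic sequence")
        exons.append((best_start, best_start + best_len))
        c += best_len
        g = best_start + best_len
    return exons


# Toy sequences (made up, not real HBB data): three exons, two introns.
exon1, exon2, exon3 = "ATGGTGCACCTG", "ACTCCTGAGGAG", "AAGTCTGCCTAA"
intron1, intron2 = "GTAAGTTTTTCTTTCAG", "GTGAGTCCCCTTTTTTAG"
genomic = "CCCC" + exon1 + intron1 + exon2 + intron2 + exon3 + "GGGG"
cdna = exon1 + exon2 + exon3

exons = find_exons(genomic, cdna)
for start, end in exons:
    print(start, end, genomic[start:end])
```

The gaps between consecutive intervals are the inferred introns. With real data an exon boundary can be ambiguous by a few bases when the start of the intron happens to match the start of the next exon, which is one reason alignment tools also consider splice-site signals.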
Eukaryotic and prokaryotic gene structure
Find exons for HBB
- Sequence for the human beta-globin gene (HBB): accession number L48217 (thalassemia variant)
- Sequence for HBB mRNA: NM_000518
- Retrieve both from GenBank at NCBI; get the files in FASTA format
- Run Genscan and BLAST2 Sequences
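Once the files are saved in FASTA format, they can be read with a few lines of code before being passed to other tools. The sketch below is a minimal reader, assuming plain FASTA with `>` header lines; the function name `read_fasta` and the toy records are hypothetical and stand in for the downloaded files.

```python
import io


def read_fasta(handle):
    """Read FASTA records from an open text handle.

    Returns a dict mapping the first word of each header line to the
    sequence, upper-cased and joined across lines.
    """
    records = {}
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records[name] = "".join(chunks)
            name = line[1:].split()[0]
            chunks = []
        elif name is not None:
            chunks.append(line.upper())
    if name is not None:
        records[name] = "".join(chunks)
    return records


# Toy input (made up); with real files, use open("your_file.fasta") instead.
toy = io.StringIO(">toy_gene made-up example\nacgtac\nGTTA\n>toy_mrna\nACGTTA\n")
records = read_fasta(toy)
for name, seq in records.items():
    print(name, len(seq))
```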
Genscan analysis of the HBB gene
BLAST2: HBB gene vs. cDNA
Introns are removed by splicing RNA precursors
Alternative splicing can generate multiple polypeptides from a single gene
The mRNA for Protein A is made by splicing together exons 1, 2 and 3:
Alternative splicing can generate multiple polypeptides from a single gene, part 2
Or, by an alternative splicing pathway that skips over exon 2, Protein B can be made:
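The two splicing pathways can be mimicked in a short sketch that joins chosen exons in order. The exon sequences are made up, and the names `splice` and `keeps_reading_frame` are hypothetical helpers introduced only for this illustration.

```python
def splice(exons, include):
    """Join the selected exons, in the order given, into one mRNA string."""
    return "".join(exons[number] for number in include)


def keeps_reading_frame(exon_sequence):
    """Skipping an exon preserves the downstream reading frame only if
    the exon's length is a multiple of three."""
    return len(exon_sequence) % 3 == 0


# Made-up exon sequences, keyed by exon number.
exons = {1: "AUGGCU", 2: "GAAUUC", 3: "CCGUAA"}

mrna_a = splice(exons, [1, 2, 3])  # pathway for Protein A
mrna_b = splice(exons, [1, 3])     # pathway for Protein B, exon 2 skipped

print(mrna_a)
print(mrna_b)
print(keeps_reading_frame(exons[2]))
```

In this toy case exon 2 is six nucleotides long, so skipping it removes two codons without shifting the frame of exon 3.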
Multigene families, e.g. encoding hemoglobin
Blot-hybridization analysis showing multiple beta-like globin genes in mammals
- A: clones, gel
- B: clones, blot hybridization
- C: genomic DNA, blot hybridization
Functional analysis of isolated genes
Gene Expression: where and how much?
- A gene is expressed when a functional product is made from it.
- One wants to know many things about how a gene is expressed, e.g.:
- – In which tissues?
- – At what developmental stages?
- – In response to which environmental conditions?
- – At which stages of the cell cycle?
- – How much product is made?
RNA blot-hybridization (Northern blot)
RNA blot-hybridization: Stage specificity
RT-PCR to detect RNA
In situ hybridization and immunoreactions
Sequence everything, find function later
- Determine the sequence of hundreds of thousands of cDNA clones from libraries constructed from many different tissues and stages of development of organisms of interest.
- Initially, the sequences are partial and are referred to as expressed sequence tags (ESTs).
- Use these cDNAs in high-throughput screening and testing, e.g. expression microarrays (next presentation).
Massively parallel screening of high-density chip arrays
- Once the sequence of an entire genome has been determined, a diagnostic sequence can be generated for all the genes.
- Synthesize this diagnostic sequence (a tag) for each gene on a high-density array on a chip, e.g. 6000 to 20,000 gene tags per chip.
- Hybridize the chip with labeled cDNA from each of the cellular states being examined.
- Measure the level of hybridization signals from each gene under each state.
- Identify the genes whose expression level differs in each state. The genes are already available.
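The last two steps, measuring signals and identifying genes that differ between states, can be sketched as a log2 ratio comparison. This is only an illustration: the two-fold cutoff, the `floor` value used to avoid dividing by very small signals, the function name `differential_genes`, and the signal values are all assumptions, and real analyses add normalization and statistics.

```python
import math


def differential_genes(state_a, state_b, min_log2_fold=1.0, floor=1.0):
    """Return {gene: log2(signal_b / signal_a)} for genes whose signal
    changes by at least min_log2_fold (1.0 means two-fold) in either
    direction. Signals below `floor` are raised to `floor` first.
    """
    hits = {}
    for gene in state_a.keys() & state_b.keys():
        a = max(state_a[gene], floor)
        b = max(state_b[gene], floor)
        log2_ratio = math.log2(b / a)
        if abs(log2_ratio) >= min_log2_fold:
            hits[gene] = log2_ratio
    return hits


# Made-up hybridization signals for three gene tags in two cellular states.
signals_a = {"gene1": 100.0, "gene2": 400.0, "gene3": 250.0}
signals_b = {"gene1": 800.0, "gene2": 100.0, "gene3": 260.0}

hits = differential_genes(signals_a, signals_b)
for gene in sorted(hits):
    print(gene, round(hits[gene], 2))
```

Working in log2 units makes induction and repression symmetric: an eight-fold increase and an eight-fold decrease are +3 and -3 respectively.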
Expression profiling using microarrays
Find clusters of co-regulated genes
Search the databases
- What can be learned from the DNA sequence of a novel gene or polypeptide?
- Many metabolic functions are carried out by proteins conserved from bacteria or yeast to humans – one may find a homolog with a known function.
- Many sequence motifs are associated with a specific biochemical function (e.g. kinase, ATPase). A match to such a motif identifies a potential class of reactions for the novel polypeptide
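A motif search of this kind can be sketched with a regular expression. The example uses the P-loop (Walker A) ATP/GTP-binding pattern, written in PROSITE style as [AG]-x(4)-G-K-[ST]; the protein sequence is made up and the function name `find_motif` is hypothetical. Because such motifs are short, a match only suggests a possible function and can occur by chance.

```python
import re

# P-loop (Walker A) pattern [AG]-x(4)-G-K-[ST] as a regular expression.
P_LOOP = re.compile(r"[AG].{4}GK[ST]")


def find_motif(protein, pattern=P_LOOP):
    """Return (start, matched_text) for each non-overlapping match,
    with 0-based start positions."""
    return [(m.start(), m.group()) for m in pattern.finditer(protein)]


# Made-up polypeptide sequence in one-letter amino acid code.
novel_protein = "MSTLAGPSGSGKSTLLR"
matches = find_motif(novel_protein)
print(matches)
```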
Databases, cont’d
- One may find a match to other genes with no known function, but their pattern of expression may be known.
- Types of databases:
- – Whole and partial genomic DNA sequences
- – Partial cDNAs from tissues (ESTs = expressed sequence tags)
- – Databases on gene expression
- – Genetic maps
Express the protein product
- Express the protein in large amounts:
- – In bacteria
- – In mammalian cells
- – In insect cells (baculovirus vectors)
- Purify it
- Assay for various enzymatic or other activities, guided by (e.g.):
- – The way you screened for the clone
- – Sequence matches
The phenotype of directed mutation
- Mutate the gene in the organism of interest, and then test for a phenotype
- Gain of function:
- – Over-expression
- – Ectopic expression (where the gene is normally silent)
- Loss of function:
- – Knock out expression of the endogenous gene (homologous recombination, antisense)
- – Express dominant-negative alleles
- – Conditional loss of function, e.g. knock-out by recombination only in selected tissues
Production of biological processes
Localization on a gene map
- E.g., use gene-specific probes for in situ hybridizations to mitotic chromosomes.
- Align the hybridization pattern with the banding pattern
- Are there any previously mapped genes in this region that provide some insight into your gene?