HUMAN GENOME ANALYSIS USING BIOINFORMATICS
- The human genome is the complete set of genetic information of humans stored as the DNA sequence.
- This DNA encodes the proteins and other products that define our cells and ultimately define who we are as biological entities.
- Variations in the genome account for many of the differences between people, from physical features to personality to disease states.
Human Genome Project (HGP)
- The HGP started in October 1990 and was completed in 2003.
- The HGP produced the first essentially complete reference sequence of the human genome.
- In the HGP, the human genome was sequenced using the hierarchical shotgun method.
- A shotgun approach breaks the genome into random, overlapping fragments and sequences each fragment.
- A computer then assembles the full sequence based on the overlaps between fragments.
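The overlap-based assembly step can be illustrated with a toy example. The sketch below is a minimal greedy assembler, assuming error-free fragments that all come from the same strand; the function names `overlap` and `greedy_assemble` are hypothetical, and this is not the software used by the HGP. Real assemblers also have to deal with sequencing errors, repeats, and reverse-complemented reads.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if the best overlap is shorter than `min_len`.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> str:
    """Repeatedly merge the two fragments with the largest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            break  # no overlaps left; the remaining contigs stay separate
        merged = frags[best_i] + frags[best_j][best_len:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return max(frags, key=len)  # longest contig


# Three overlapping fragments of a made-up 15-base sequence
reads = ["ACGTTAGC", "ATGGCGTA", "GCGTACGT"]
print(greedy_assemble(reads))
```

Greedy merging is used here only because it is short to write down; it can give wrong answers when the sequence contains repeats.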
GOAL OF HUMAN GENOME PROJECT
- Identify all the approximately 20,500 genes in human DNA,
- Determine the sequences of the 3 billion chemical base pairs that make up human DNA,
- Store this information in databases,
- Improve tools for data analysis,
- Transfer related technologies to the private sector, and
- Address the ethical, legal, and social issues (ELSI) that may arise from the project.
Bioinformatics and Computational Biology
- Improve the content and utility of databases.
- Develop better tools for data generation, capture, and annotation.
- Develop and improve tools and databases for comprehensive functional studies.
- Develop and improve tools for representing and analyzing sequence similarity and variation.
- Create mechanisms to support effective approaches for producing robust, exportable software that can be widely shared.
What is genome annotation?
- Annotation can be defined as a subfield of genome analysis, a general field that includes more or less anything that can be done with genome sequences by computational means.
- More narrowly, annotation may be defined as the part of genome analysis that is customarily performed before a genome sequence is deposited in GenBank.
Automation of genome annotation
- The GeneQuiz project was the first automatic system for genome analysis. It performed similarity searches, followed by automatic evaluation of the results and generation of functional annotation by an expert system based on a set of several predefined rules.
- Several other similar systems have been created since then, but GeneQuiz remains the only such tool that is open to the general public.
- GeneQuiz runs automated database searches and sequence analysis by taking a protein sequence and comparing it against a non-redundant protein database.
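The idea of turning similarity-search results into a functional annotation through predefined rules can be sketched as follows. This is an illustration only: the `Hit` record, the `annotate` function, and the thresholds (E-value 1e-5, 40% identity) are assumptions made for this example and are not GeneQuiz's actual rules.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One database match for a query protein (hypothetical record layout)."""
    description: str   # annotation of the matched database entry
    identity: float    # percent identity of the alignment
    evalue: float      # expectation value reported by the search


def annotate(hits: list[Hit]) -> str:
    """Apply a few illustrative rules to the best hit (lowest E-value)."""
    if not hits:
        return "hypothetical protein (no database match)"
    best = min(hits, key=lambda h: h.evalue)
    if best.evalue > 1e-5:            # assumed significance cutoff
        return "hypothetical protein (no significant match)"
    if best.identity >= 40.0:         # assumed cutoff for confident transfer
        return best.description
    return f"putative {best.description}"


hits = [
    Hit("hemoglobin subunit beta", identity=92.0, evalue=1e-80),
    Hit("myoglobin", identity=26.0, evalue=1e-8),
]
print(annotate(hits))
```

A real expert system would weigh many more kinds of evidence, such as alignment coverage and domain architecture, before transferring an annotation.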
Tools used for analysis:
GATEWAYS TO ACCESS THE HUMAN GENOME
The NCBI offers two main ways to access data on the human genome. From the main page of NCBI, you can select “human genome resources,” which provides links to each chromosome and a variety of web resources. The Map Viewer on this page allows searches by clicking on a chromosome or by entering a text query. The human Map Viewer integrates human sequence and data from cytogenetic maps, genetic linkage maps, radiation hybrid maps, and yeast artificial chromosomes (YACs). For example, the query “hbb” (the beta-globin gene) links to the Map Viewer.
Ensembl is a comprehensive resource for information about the human genome as well as many other genomes. It supports research in comparative genomics, evolution, sequence variation, and transcriptional regulation. Ensembl annotates genes, computes multiple alignments, predicts regulatory function, and collects disease data. Ensembl tools include BLAST, BLAT, BioMart, and the Variant Effect Predictor (VEP) for all supported species.
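Besides the web interface, Ensembl data can be queried programmatically. The sketch below assumes the public Ensembl REST service at rest.ensembl.org and its `lookup/symbol` endpoint; the helper names `lookup_url` and `lookup_gene` are hypothetical. The network call is wrapped in a function so that nothing is fetched until it is called explicitly.

```python
import json
import urllib.parse
import urllib.request

ENSEMBL_REST = "https://rest.ensembl.org"  # assumed public REST server


def lookup_url(symbol: str, species: str = "homo_sapiens") -> str:
    """Build the URL for looking up a gene by its symbol."""
    path = f"/lookup/symbol/{urllib.parse.quote(species)}/{urllib.parse.quote(symbol)}"
    return f"{ENSEMBL_REST}{path}?content-type=application/json"


def lookup_gene(symbol: str, species: str = "homo_sapiens", timeout: float = 10.0) -> dict:
    """Fetch the gene record as a dictionary (requires internet access)."""
    with urllib.request.urlopen(lookup_url(symbol, species), timeout=timeout) as resp:
        return json.load(resp)


print(lookup_url("HBB"))
# To run the query itself:
#     gene = lookup_gene("HBB")
#     print(gene.get("id"), gene.get("seq_region_name"), gene.get("start"), gene.get("end"))
```

The field names in the commented usage follow the JSON the service returns for gene lookups; `dict.get` is used so that a missing field yields `None` instead of an error.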
- The University of California at Santa Cruz Human Genome Browser
The “Golden Path” is the human genome sequence annotated at UCSC. Tools available there include the Genome Browser, BLAT, and In-Silico PCR, among others.
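To give a feel for what an in-silico PCR search does, here is a toy version. It assumes exact primer matches on a short template string and is not the UCSC implementation; the function names and the `max_size` default are hypothetical. The forward primer is searched on the given strand, and the reverse primer is searched as its reverse complement downstream of it.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def in_silico_pcr(template: str, forward: str, reverse: str,
                  max_size: int = 4000) -> Optional[str]:
    """Return the predicted amplicon, or None if the primers do not match."""
    template = template.upper()
    fwd = forward.upper()
    rev_site = reverse_complement(reverse.upper())  # reverse primer site on this strand
    start = template.find(fwd)
    if start == -1:
        return None
    end = template.find(rev_site, start + len(fwd))
    if end == -1:
        return None
    amplicon = template[start:end + len(rev_site)]
    return amplicon if len(amplicon) <= max_size else None


# Made-up template and primers
template = "TTGACCTAGGCATTCGAAGTCCGATT"
print(in_silico_pcr(template, forward="GACCTAGG", reverse="TCGGACT"))
```

A genome-scale tool would additionally search both strands, tolerate mismatches, and use an index instead of a linear scan.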
The National Human Genome Research Institute (NHGRI) has a leading role in genome sequencing, coordinating pilot-scale and large-scale sequencing efforts, technology development, and policy development.