CSE527 Notes Lecture One, Sept. 26, 2007, Prof. Ruzzo

Notes by: evr @u.       

The course web page is http://www.cs.washington.edu/527. Six to eight homeworks (reading, exercises, and programming; help is available) and a project (5-10 page report) will be the basis for the grades.

The human genome, about three billion nucleotides, has been sequenced, but much remains to be understood. The goals include disease diagnosis, drug discovery, and individualized medicine, along with basic biology. Cancer of the blood (leukemia), for example, has a dozen or so different forms. Prostate cancer, which may be present in all men over 50 years of age, probably also takes several forms; some kinds of it kill, others don't.

Points of contact and opportunities for computer science include scientific visualization, databases, text mining, machine learning, new and old algorithms, and artificial intelligence. We will learn about HMMs, EM, MLE, Gibbs sampling, the Viterbi algorithm, etc.

The central dogma of molecular biology is that the nucleic acid sequence of DNA is transferred to messenger RNA (mRNA), which is then translated, codon by codon, into protein. Aside from environmental effects, which are manifest as stochastic chemical reactions, we are a huge computer program with the code written in the genes. The DNA is transcribed into mRNA, which then migrates to the ribosomes, which read the mRNA and make proteins. The process of going from DNA to mRNA is called transcription, and the process from mRNA to protein is called translation. Going from RNA to DNA is called reverse transcription and is what retroviruses do in order to get their genetic material incorporated into that of the cell. Many functionally different types of RNA exist, including mRNA, tRNA, and rRNA. Transcription begins at a site on the DNA called the promoter (a particular sequence of bases), which is near the 5’ end of the gene. Reading off the template strand, A’s become U’s, T’s become A’s, C’s become G’s, and G’s become C’s in the mRNA. U (uracil) is another nucleotide that basically holds the position that a T would in DNA. Each codon (3 consecutive nucleotides) codes for an amino acid or a special stop/start signal. Three positions with four possible nucleotides each allow for 4^3 = 64 different codons.
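The transcription and translation steps described above can be sketched in a few lines of Python. This is only an illustration, not anything from the lecture: the function names `transcribe` and `translate` are hypothetical, the codon table is the standard genetic code written as one letter per codon in UCAG order, and the template strand is assumed to be written 3'->5' so that the resulting mRNA reads 5'->3'. Translation here simply starts at the first AUG and stops at the first stop codon.

```python
from itertools import product

# Template-strand DNA base -> mRNA base, as listed in the notes.
DNA_TO_MRNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

RNA_BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Enumerate all 4^3 codons and pair each with its amino acid letter.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(RNA_BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(template_dna):
    """Return the mRNA for a template strand (assumed written 3'->5')."""
    return "".join(DNA_TO_MRNA[base] for base in template_dna)


def translate(mrna):
    """Translate from the first AUG (start) up to the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(len(CODON_TABLE))            # 64 codons
    mrna = transcribe("TACAAACCGATT")
    print(mrna)                        # AUGUUUGGCUAA
    print(translate(mrna))             # MFG
```

Of the 64 codons, 61 code for amino acids and 3 are stop codons, which is why most of the twenty amino acids are encoded by more than one codon.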

  

The two strands of the DNA helix run in opposite directions (they are antiparallel). The double helix has a major groove and a minor groove. The major groove is much wider than the minor groove because the paired bases in the center meet each other at an angle. RNA usually exists as a single strand, but helical sections can form where the RNA folds back on itself. More complex tertiary structures also form. The DNA backbone is a covalently bonded polymer, with the information carried in the ordering of the A, T, C, and G nucleotides.
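Because the two strands pair base by base (A with T, C with G) and run in opposite directions, the sequence of one strand determines the other: reading the partner strand in its own 5'->3' direction gives the reverse complement. A minimal sketch, with `reverse_complement` as a hypothetical helper name:

```python
# Watson-Crick pairing in DNA: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand):
    """Return the partner strand, read 5'->3', for a strand given 5'->3'."""
    # Reverse first (the strands are antiparallel), then complement each base.
    return "".join(COMPLEMENT[base] for base in reversed(strand))


if __name__ == "__main__":
    print(reverse_complement("ATGC"))  # GCAT
```

Applying the function twice returns the original sequence, which reflects the fact that either strand carries the full information.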

   

Genetics is the study of heredity. The gene is the unit of heredity. (Recent discoveries may modify this.) The genotype is what is in the genes. The phenotype is what is displayed by the organism. Mendelian genetics says each parent contributes one copy of each gene (though this is not exactly so).

Cells are either eukaryotic (with a nucleus, mitochondria, etc.), as in multicellular organisms and yeast, or prokaryotic (no nucleus). Prokaryotes, having had more generations of evolution, may be more highly evolved for their lifestyle. The mitochondria in animals make energy; isolating this chemical process in a separate organelle probably protects the rest of the cell from damage by harmful byproducts. Mitochondria and chloroplasts replicate on their own. Mature mammalian red blood cells have no nucleus. Mammalian sperm contribute essentially no mitochondria to the offspring, so mitochondria are inherited from the mother. Y chromosomes all come from the male side of the family.

A chromosome is a single long DNA molecule containing many genes; the DNA in one human cell totals almost two meters in length. Among eukaryotes, humans and bats have 46 chromosomes, in pairs. Most prokaryotes have a single chromosome. Mitosis is cell division: all chromosomes are copied and one copy of each is pulled into each daughter cell. In higher organisms, cells are usually diploid: each cell has 2 copies of each chromosome (one paternal, one maternal; an exception is XY in male mammals). Gametes (egg, sperm) are haploid: they have only one copy of each chromosome.

Meiosis is the process by which haploid gametes are produced from diploid germ cells. Humans have about 20,000 genes and 23 chromosomes in the gametes. So, if independent segregation of chromosomes into gametes were the only process going on in meiosis, any individual could form only 2^23 (about 8.4 million) genetically different gametes. That's a lot, but small compared to the number of potential allele combinations, and it would mean alleles on the same chromosome would always be inherited together. Recombination makes this system much richer. During meiosis, the maternal and paternal copies of each chromosome are brought together and crossover occurs, meaning that the chromosomes in each gamete are a mosaic of maternal and paternal alleles.
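The gamete count and the mosaic effect of crossover can be illustrated with a short Python sketch. The `crossover` function is a hypothetical, deliberately simplified model (exactly one crossover point per chromosome, chosen uniformly at random), and the strings of "M" and "P" are made-up stand-ins for maternal and paternal alleles at successive positions.

```python
import random

HAPLOID_CHROMOSOMES = 23  # chromosomes in a human gamete, per the notes

# Independent segregation only: each chromosome in the gamete is either
# the maternal or the paternal copy, so there are 2 choices per chromosome.
gametes_without_recombination = 2 ** HAPLOID_CHROMOSOMES


def crossover(maternal, paternal, rng):
    """Return one gamete chromosome built from a single random crossover.

    Simplifying assumption: exactly one crossover point per chromosome.
    """
    if len(maternal) != len(paternal):
        raise ValueError("homologous chromosomes must have the same number of loci")
    point = rng.randrange(len(maternal) + 1)  # crossover position, 0..len inclusive
    # Randomly choose which parent's copy supplies the left-hand segment.
    left, right = (maternal, paternal) if rng.random() < 0.5 else (paternal, maternal)
    return left[:point] + right[point:]


if __name__ == "__main__":
    print(gametes_without_recombination)  # 8388608
    rng = random.Random(0)
    # Each printed chromosome is a mosaic of maternal (M) and paternal (P) alleles.
    for _ in range(3):
        print(crossover("MMMMMMMM", "PPPPPPPP", rng))
```

Even this one-crossover model lets alleles on the same chromosome be inherited separately, which independent segregation alone cannot do.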

  

Proteins are made up of any of twenty amino acids linked by peptide bonds.