DNA (Slide 1) is almost synonymous with life. But it is also just a chemical in the form of a long polymer chain of repeating units. This lecture will focus on the physical and chemical structure of DNA.


1.1 Chemical Structure of DNA and RNA

DNA (deoxyribonucleic acid) is the best known of all the molecules of life. Each DNA molecule in a cell is a long chain of repeating units called nucleotides, of which there are four types. These form the genetic code that carries the information necessary to specify the growth and life of any particular organism. RNA (ribonucleic acid) is very similar to DNA, but with subtle differences in its structure and biological roles. Both are good examples of polymer chains, and their physical properties can be modeled as freely-jointed and worm-like chains (which will be discussed later).

DNA and RNA are called nucleic acids because they are acids and are found in high concentration in the nucleus of cells. They are long, linear polymers made from repeated monomer units (nucleotides) containing a 5-carbon sugar, a phosphate and a base. The sugar and phosphate link together to form the backbone, a long chain of alternating sugar and phosphate groups. Each sugar attaches to a base, hanging off the side of the backbone (Slide 2).


The phosphates are acids: chemical groups with a tendency to donate H+ to water, often becoming negatively charged in the process. (The opposite is a base, with a tendency to receive H+ from water, often becoming positively charged in the process.) Because the phosphates in the backbone release H+ into water and become negatively charged, nucleic acids carry an overall negative charge.

RNA and DNA contain different sugars. The carbon atoms in the sugars are numbered 1′ to 5′. Ribonucleic acid (RNA) contains D-ribose, and deoxyribonucleic acid (DNA) contains 2-deoxy-D-ribose. The only difference between the sugars is that deoxyribose lacks the hydroxyl group on the 2′ carbon atom.

The sugars are linked by phosphodiester bridges formed between the 5′ and 3′ hydroxyl groups on adjacent sugars to form a repeating sugar-phosphate backbone (Slide 3). Again, the only difference between the DNA and RNA backbones is the OH group at the 2′ position on the ribose ring in RNA. One of four bases is attached to each sugar at the 1′ position.


For both DNA and RNA, two of the four bases are derivatives of purine and two are derivatives of a smaller molecule, pyrimidine (Slide 4). In DNA, the purine bases are adenine (A) and guanine (G) and the pyrimidines are cytosine (C) and thymine (T). RNA has the same bases, except that uracil (U) is substituted for thymine. The two are very similar: uracil lacks the methyl (CH3) group that thymine carries.


Slide 5 shows the final chemical structure (i.e. the covalently bonded structure, also known as single-stranded or ss-DNA). If this methyl group is not here, then this base is uracil and the molecule is RNA. If we reinstate the methyl group to make this thymine, and remove all the 2′ hydroxyl groups, then this would be DNA. Genetic information is stored in the sequence of bases. Note that there is no centre of inversion in the backbone: the two ends would be distinguishable even if all the bases were the same. The base sequences of nucleic acids are always written in the 5′ to 3′ direction. This is also the direction in which DNA and RNA are synthesised in living cells.
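The sequence conventions described above lend themselves to a simple representation. The following is a minimal Python sketch, not part of the original lecture material: a strand is stored as a string of base letters written 5′ to 3′, and the hypothetical helpers base_class and dna_to_rna encode the purine/pyrimidine classification and the substitution of uracil for thymine. A string records only the bases, so the 2′ hydroxyl difference between the two backbones is not represented.

```python
# Illustrative sketch only: a strand is a string of base letters, written 5' -> 3'.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}


def base_class(base: str) -> str:
    """Classify a single base letter as a purine or a pyrimidine."""
    letter = base.upper()
    if letter in PURINES:
        return "purine"
    if letter in PYRIMIDINES:
        return "pyrimidine"
    raise ValueError(f"unknown base: {base!r}")


def dna_to_rna(sequence: str) -> str:
    """Return the RNA sequence with the same bases, with U replacing T."""
    return sequence.upper().replace("T", "U")


if __name__ == "__main__":
    dna = "GATTACA"  # arbitrary example sequence
    print(dna_to_rna(dna))
    print([base_class(b) for b in dna])
```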


1.2 Physical Structure

1.2.1 DNA

The physical structure of DNA was worked out by Francis Crick and James Watson in 1953. This is the foundation on which modern biology is built, so biophysics is not new, and it can be fantastically important. Watson and Crick were working in the Cavendish Laboratory, under Sir Lawrence Bragg, of Bragg’s Law. They knew that genes were made from DNA, and believed that finding its physical structure was the key to understanding how it could be replicated. Watson’s book, The Double Helix, gives a very good idea of how exciting science can be.

Watson and Crick knew Chargaff’s rule: that for any organism the concentrations of bases obey

[A]/[T] ≈ 1

and

[C]/[G] ≈ 1

but [T]/[G] varies quite widely.

They also had X-ray diffraction data (Slide 6) obtained by Rosalind Franklin and Maurice Wilkins at King’s College London. The shape of this diffraction pattern told them that DNA has a helical structure with two characteristic repeat distances of 0.34 nm and 3.4 nm, a diameter of roughly 2 nm, and a twofold symmetry axis.
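As a worked check on these numbers, the short Python sketch below assumes that the 0.34 nm repeat is the spacing between successive base pairs along the helix axis and that the 3.4 nm repeat is the length of one full helical turn. This interpretation is that of the standard double-helix model and is not stated explicitly above; the variable names are illustrative.

```python
# Assumed interpretation of the two repeat distances (see the paragraph above).
RISE_PER_BASE_PAIR_NM = 0.34  # spacing between stacked base pairs
HELIX_PITCH_NM = 3.4          # length of one complete turn of the helix

base_pairs_per_turn = HELIX_PITCH_NM / RISE_PER_BASE_PAIR_NM
twist_per_base_pair_deg = 360.0 / base_pairs_per_turn

print(f"base pairs per turn: {base_pairs_per_turn:.1f}")
print(f"twist per base pair: {twist_per_base_pair_deg:.1f} degrees")
```

Under these assumptions there are about 10 base pairs per turn, so each base pair is rotated by about 36° relative to its neighbour.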


Crick and Watson deduced the structure of the molecule by building a model, using retort stands and metal plates, that satisfied these constraints. The model is in King’s College, London. Slide 7 shows the model that Watson and Crick constructed: two antiparallel strands of DNA wind round each other with the sugar-phosphate backbones on the outside. The bases meet in the middle, and hydrogen bonds between bases help to hold the two strands together.


Watson and Crick found that if a purine attached to one backbone is paired with a pyrimidine attached to the other, then the pair neatly fills the space in the centre of the helix, and that if the pairs are A-T and C-G then the bases are positioned ideally for hydrogen bonding to bind them together (Slide 8). An A-T pair is stabilized by two hydrogen bonds, and a C-G pair by three. [We will discuss the hydrogen bond in the next lecture. It is a weak chemical bond formed between two electronegative atoms, one of which is covalently attached to a hydrogen atom. The covalent bond is strongly polarized, leaving a fractional positive charge on the hydrogen, which forms a weak bond to the second electronegative atom.] The helix is also stabilized by van der Waals interactions between adjacent base pairs and by hydrophobic interactions [see next lecture]: less polar regions of the bases are hidden in the centre of the helix, whereas more polar regions, including the charged backbone, are exposed to water.
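The pairing rules can be turned into a simple count. The Python sketch below (an illustration with a hypothetical function name) takes one strand of a fully paired duplex and totals the hydrogen bonds between the strands, using two per A-T pair and three per C-G pair as stated above. It ignores the stacking and hydrophobic contributions, so it is not a measure of overall helix stability.

```python
# Hydrogen bonds formed by the base pair that contains each base.
HYDROGEN_BONDS = {"A": 2, "T": 2, "C": 3, "G": 3}


def count_hydrogen_bonds(strand: str) -> int:
    """Total inter-strand hydrogen bonds for a fully paired duplex,
    given the sequence of one of its strands."""
    return sum(HYDROGEN_BONDS[base] for base in strand.upper())


if __name__ == "__main__":
    # Arbitrary example sequences of equal length.
    for strand in ("ATATAT", "GCGCGC", "ATGCAT"):
        print(strand, count_hydrogen_bonds(strand))
```

For strands of equal length, the count rises with the proportion of C-G pairs.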


Slide 9 shows that both types of base-pair have the same span between 1′ carbons. Both are planar and can be flipped 180° about the red axis. The backbone link is slightly longer than the base-pair thickness, so base-pairs twist to stack, generating a helix. Notice that all mis-paired bases fit very badly. For example, swapping T and C opposes like partial charges instead of the H-bonds in the correct pair, while swapping A and C increases the distance between 1′ carbons.


Slide 10 shows the same view as Slide 9 but with the backbone included, viewed end-on down the axis of the double helix. Notice that a base-pair is not symmetric about the diameter of the helix. In this view there is more space above (“major groove”) than below (“minor groove”) the base-pair.


Slide 11 depicts molecular models that show the two backbones winding round each other, and the bases packing tightly in the centre with complementary pairs of bases from the two strands meeting in the centre to form hydrogen bonds.


Crick and Watson’s model fitted the X-ray data, and the base pairing rules accounted for Chargaff’s result that the number of As matches the number of Ts, and the number of Cs matches the number of Gs.
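To see how the pairing rules lead to Chargaff's result, the following Python sketch (illustrative; the function names and the example sequence are made up for this purpose) builds the partner of an arbitrary strand and counts the bases over both strands of the duplex.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def partner_strand(strand: str) -> str:
    """Return the complementary strand, written 5' -> 3'.

    The two strands are antiparallel, so the partner is read in reverse.
    """
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


def duplex_base_counts(strand: str) -> Counter:
    """Count each base over both strands of the double helix."""
    return Counter(strand.upper()) + Counter(partner_strand(strand))


counts = duplex_base_counts("AAGCTTTTGA")  # arbitrary example sequence
print("A/T =", counts["A"] / counts["T"])
print("C/G =", counts["C"] / counts["G"])
print("T/G =", counts["T"] / counts["G"])
```

Whatever single strand is chosen, the duplex counts satisfy A = T and C = G exactly, while the T/G ratio depends on the sequence.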

As illustrated in Slide 11, ss-DNA is made in the 5′ to 3′ direction, and sequences are written and read in the same direction. The two strands in the double helix of double-stranded DNA (ds-DNA) run in opposite directions.

1.2.2 RNA

Like DNA, RNA can also form a double helix. However, this structure is less stable in RNA, and the backbone is more flexible, allowing more complicated folded structures. One example is ribosomal RNA.

Protein synthesis is catalyzed by an immensely complicated molecular machine called the ribosome. This contains proteins, but is largely made of RNA – and it is the RNA component that is largely responsible for its catalytic activity. The E. coli ribosome has a mass of 2.5 MDa and contains about 4500 nucleotides – it is ~ 25 nm across.
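A rough estimate supports the statement that the ribosome is largely RNA. The Python sketch below assumes an average mass of about 330 Da per nucleotide in an RNA chain; this figure is an assumption introduced here, not a value from the lecture, and the result should be read as an approximate check only.

```python
RIBOSOME_MASS_DA = 2.5e6          # from the text: 2.5 MDa
NUCLEOTIDE_COUNT = 4500           # from the text: about 4500 nucleotides
ASSUMED_NUCLEOTIDE_MASS_DA = 330  # ASSUMPTION: typical average mass per RNA nucleotide

rna_mass_da = NUCLEOTIDE_COUNT * ASSUMED_NUCLEOTIDE_MASS_DA
rna_mass_fraction = rna_mass_da / RIBOSOME_MASS_DA

print(f"estimated RNA mass: {rna_mass_da / 1e6:.1f} MDa")
print(f"estimated RNA fraction of ribosome mass: {rna_mass_fraction:.0%}")
```

With these inputs roughly 60% of the mass is RNA, the remainder being protein.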

The pictures in Slide 12 give an idea of the complexity of the structure. On the left is the secondary structure (intramolecular base-pairing interactions) of the largest of the three main strands of RNA. Ladder-like sections are double-helices like DNA, which are linked in a complicated 3-dimensional folded pattern by “loops” with different physical structure. On the right is the X-ray crystal structure of the whole complex.


1.3 DNA replication

Above all, the DNA double-helix structure immediately suggested how genetic information could be copied. The two strands of DNA that make up a double helix have complementary base sequences: they carry the same information, in different forms. If the strands are separated, each one can act as a template on which a perfect copy of the original double helix can be built by templated polymerisation (Slide 13). In their 1953 paper in the journal Nature, Watson and Crick simply stated, ‘It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material’, which (along with their elucidation of the double helix structure) was enough to win them the Nobel Prize.
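The copying mechanism can be sketched in a few lines of Python. This is a toy model of templated polymerisation under idealised assumptions (perfect base pairing, no enzymes, no errors), and the names build_partner and replicate are hypothetical. A duplex is represented as a pair of strands, each written 5′ to 3′.

```python
from typing import Tuple

Duplex = Tuple[str, str]  # (strand 1, strand 2), each written 5' -> 3'

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def build_partner(template: str) -> str:
    """Build the strand that pairs with `template`, written 5' -> 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(template))


def replicate(duplex: Duplex) -> Tuple[Duplex, Duplex]:
    """Separate the two strands and build a new partner on each one."""
    strand_1, strand_2 = duplex
    daughter_1 = (strand_1, build_partner(strand_1))  # old strand 1 + new strand
    daughter_2 = (build_partner(strand_2), strand_2)  # new strand + old strand 2
    return daughter_1, daughter_2


# Arbitrary example duplex.
parent = ("ATGCCGTTA", build_partner("ATGCCGTTA"))
daughter_1, daughter_2 = replicate(parent)
print(daughter_1 == parent, daughter_2 == parent)
```

In this model each daughter duplex contains one strand from the parent and one newly built strand, and both daughters are identical to the parent.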


The image on the left of Slide 14 is a picture of a single molecule of fruit fly DNA, 1.2 cm long. It is obviously rather hard to take such a picture of a molecule only 2 nm in diameter. This is an autoradiograph: the DNA was radioactively labelled, spread on a glass slide, then covered with photographic emulsion and left for five months to develop. The image on the right is the DNA from a T2 bacteriophage, a virus that infects bacteria. The protein capsid (shell) of the virus was broken open by osmotic shock when the virus was placed in distilled water.
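The scale of the fruit fly molecule can be estimated from numbers already given. The Python sketch below assumes that the whole 1.2 cm molecule is an ideal double helix with a spacing of 0.34 nm between base pairs (the shorter repeat distance from the X-ray data, interpreted here as the base-pair spacing) and a diameter of 2 nm. The variable names are illustrative.

```python
MOLECULE_LENGTH_M = 1.2e-2      # 1.2 cm, from the text
RISE_PER_BASE_PAIR_M = 0.34e-9  # ASSUMPTION: 0.34 nm spacing between base pairs
HELIX_DIAMETER_M = 2e-9         # roughly 2 nm, from the text

number_of_base_pairs = MOLECULE_LENGTH_M / RISE_PER_BASE_PAIR_M
aspect_ratio = MOLECULE_LENGTH_M / HELIX_DIAMETER_M

print(f"base pairs: {number_of_base_pairs:.1e}")
print(f"length / diameter: {aspect_ratio:.1e}")
```

This gives about 35 million base pairs and a length-to-diameter ratio of about six million, which illustrates why imaging a single molecule is difficult.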


The ways in which DNA and RNA are used to store and transfer information in living cells are covered in Lecture 6 of this topic: ‘Self-organisation and evolution’. If desired and if time permits, some of Lecture 6 could be incorporated into Lecture 1.

Alternatively, some of the material from Lecture 2 in this topic (‘Modelling DNA and RNA’) could be brought forward into Lecture 1.