Subject Areas: Molecular Biology
The structure of the histone octamer core was deduced by Kornberg and Thomas in 1974, and perfected by X-ray crystallography by Luger et al. in 1997. According to the current nucleosome model, a standard Watson-Crick double helix winds superhelically around the octamer core, following a path called the “superhelical ramp” (or “superhelical groove”). This is, in actual crystallographic fact, the pathway DNA follows when the nucleosome is reconstituted in vitro. But must we assume that the laboratory-reconstituted structure is identical to the native structure in vivo, in the intact cell nucleus? It is presently impossible to isolate the native structure in a form amenable to detailed structural study (such as X-ray crystallography). It is therefore impossible to be sure, at the present time, that the in vitro and in vivo structures are truly the same.
The problem with the current in vitro reconstituted model is twofold. First of all, the region of the histone octamer containing both the highest number and the highest concentration of positively-charged basic amino acid residues is not the superhelical ramp, but rather the 8 subunit N-termini collectively (see Table 1). Since opposite charges attract, one might ask why DNA (if we may speak anthropomorphically) would “prefer” the superhelical ramp, when considerably more potential salt bridges are available in the N-termini. Surprisingly, however, the current textbook model of histone structure depicts the N-termini as being variably-degraded random coils, with no relationship to DNA, and, in fact, with no important function of any sort whatsoever.
The second part of the problem is that, although the superhelical ramp is indeed well-endowed with basic residues (although less so than the N-termini), the alignment in the current model, between DNA phosphate groups and the guanidinium/ε-amino groups of arginine/lysine respectively, is surprisingly poor. While there is no reason, a priori, to suppose that a “perfect” charge alignment between DNA and protein is biologically-mandated in the nucleosome (“perfect” being defined as each salt bridge being near to 3 Å in length), we again must ask: If, as we shall show to be the case when the DNA is placed in the N-terminal domain, a perfect DNA-protein charge alignment is possible between all DNA phosphate groups and basic amino acid R-groups, then why would DNA “prefer” the superhelical ramp, where many potential salt bridges simply cannot form, and where the alignment of those that are seen is surprisingly poor? Three possible explanations come quickly to mind:
1) The cell has no need for better alignment than that which is seen, or
2) conformational differences to be found in the ~100% humidity conditions of the cell nucleus might perhaps improve the alignment in vivo, or
3) the superhelical ramp is not the usual resting place of DNA in the nucleosome.
Possibility #1 cannot be ruled out, since the requirements of “functional” biology (as opposed to “structural” biology) cannot necessarily be foreseen. But regardless of what anyone predicts from “functional” biological considerations, the fact remains that the energetics, when considered purely from an energy-minimization standpoint, would appear to dictate that the N-terminal location for DNA, as presented herein, be given at least serious consideration.
Possibility #2 cannot be addressed at all, at the present time, for the reasons already stated above (i.e., the impossibility of isolating the native structure in a form amenable to study by tools such as X-ray crystallography). Therefore, if we are to base the answer to the above question upon available information only, the answer must logically default to #3, i.e., that the superhelical ramp may not, after all, be the in vivo location of DNA in the nucleosome.
Accordingly, I wish to propose a new nucleosome model, one which removes DNA from the superhelical ramp, and places it instead in the N-terminal domain of the histone octamer. The new model is based upon the structure of the sperm cell protein, protamine.
1.1. Protamine-DNA Structure
In 2006 I proposed a detailed molecular model for the structure of the protamine-DNA complex in sperm cells. To the best of my knowledge, it remains, to the date of this writing, the only serious model ever suggested, either before or since.
The model, in which the DNA is non-helical and the protein is in a β-sheet conformation, is favored by a multitude of salt bridges and hydrogen bonds, and is devoid of any serious steric hindrances, wherefore I would suggest that that structure, or one very similar to it, is likely to form unless some unforeseeable external force prevents it from doing so.
Figure 1 shows the main features of the protamine model. In Figure 1(A) we see a partly-schematic, longitudinal view of the structure. In the center are the protamine P1 and P2 chains, showing 5 amino acid residues from each. Since the full lengths of P1 and P2 are 51 and 57 residues, respectively, this is therefore approximately 10% of the full length of the protamine-DNA complex. In the region shown, all the residues are arginine, which is realistic if we understand the section to be taken from the middle of the structure, which is densely populated with arginine residues.
Figure 1. (A) Schematic longitudinal view of the mid-portion of the unit cell of a P1-P2 heterodimer and its associated DNA. Only the sugar-phosphate backbones of the DNA are shown. The DNA is in the Wu “straight ladder” conformation, with 6.8 Å spacing between residues. The protein is in the form of a β-sheet, also with 6.8 Å spacing. The arginine residues therefore align precisely with the DNA phosphate groups; (B) an axial view of the same structure. In all DNA models, the cross-duplex phosphate-to-phosphate distance is about 20 Å. In the protamine β-sheet, the distance between the guanidinium groups of fully-extended arginine residues is about 14 Å. This leaves a space of 3 Å on each side for the formation of salt bridges between protein and DNA. (Adapted from Figure 2 of reference  ).
Although not evident in this schematic rendition, the protamine P1 and P2 chains are conformed as β-strands, whose peptide backbones interact in the manner of a small β-sheet. In the published form of the structure, the strands are parallel, although an antiparallel orientation is entirely possible.
The unit cell may be regarded as one occurrence of the P1-P2 heterodimer, associated with approximately 58 base pairs of DNA, of which six bp are depicted in Figure 1(A). Only the sugar-phosphate backbones are shown.
It should hopefully be apparent in the figure that there is a perfect alignment of negative charges from DNA phosphate groups, and positive charges from arginine guanidinium groups. This perfect alignment comes about in two ways. For the DNA, it comes about by virtue of the “straight ladder” duplex structure of Tai Te Wu, which is a fully-extended, non-helical structure with 6.8 Å base pair spacing. The complete Wu structure is a tetraplex, requiring that two adjacent DNA duplexes have their base pairs mutually intercalated, thus restoring the base pair spacing to 3.4 Å. Each of the DNA duplexes shown in Figure 1(A) represents only half of the tetraplex, the remainder of which we shall see shortly.
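The intercalation arithmetic described above can be sketched numerically. The following is a minimal illustration only, assuming an idealized one-dimensional picture in which each duplex contributes base-pair levels every 6.8 Å and the second duplex is shifted axially by exactly half that repeat; the names `levels`, `SPACING` and `OFFSET` are hypothetical and are not taken from the published model.

```python
# Idealized 1-D sketch: axial positions (in Å) of base pairs in two
# "straight ladder" duplexes that are mutually intercalated.
SPACING = 6.8          # Å, base-pair spacing within one extended duplex (from the text)
OFFSET = SPACING / 2   # Å, ASSUMED axial shift of the second duplex

def levels(n, start=0.0, step=SPACING):
    """Return the axial positions of n base pairs, rounded to 0.001 Å."""
    return [round(start + i * step, 3) for i in range(n)]

duplex_a = levels(4)                 # first duplex of the tetraplex
duplex_b = levels(4, start=OFFSET)   # second duplex, shifted by half a repeat

# Interleave the two ladders and measure the stacking distance between
# consecutive base pairs in the combined tetraplex.
tetraplex = sorted(duplex_a + duplex_b)
gaps = [round(b - a, 3) for a, b in zip(tetraplex, tetraplex[1:])]
print(gaps)
```

Under these assumptions every consecutive pair of levels in the interleaved stack is 3.4 Å apart, which is the sense in which intercalation "restores" the usual stacking distance.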
The 6.8 Å spacing in the protein β-strands comes about by the assignment of ψ and φ angles of approximately +130.5˚ and −130.5˚, respectively, which are nearly-optimal angles for β-sheet formation in the Ramachandran Plot. Furthermore, these angles impart a straight, non-helical conformation to the protein.
As we shall see, the precisely-aligned charge-charge interactions shown here will be readily reproduced in the nucleosome, by simply modeling the eight N-termini as β-strands, and by changing the DNA from the “classic” double helix to the Wu “straight ladder” tetraplex.
Figure 1(B) is an axial view of Figure 1(A); a cross-section taken in the vicinity of one of the three disulfide bonds which stabilize the P1-P2 structure. The full extension of the arginine side-chains, giving a cross-protamine guanidinium-to-guanidinium spacing of 14 Å, in the setting of the DNA cross-duplex phosphate-to-phosphate spacing of 20 Å, readily allows for the formation of 3 Å salt bridges between protein and DNA. Thus, not only are the opposite charges of protein and DNA perfectly aligned in the model, but also perfectly spaced.
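The spacing argument can be checked with a small calculation. This is a hedged sketch of an idealized planar geometry, not the author's modeling procedure: the function `salt_bridge_lengths` and its parameters are hypothetical, and the only numbers taken from the text are the 20 Å, 14 Å and 6.8 Å distances.

```python
import math

PHOSPHATE_SPAN = 20.0    # Å, cross-duplex phosphate-to-phosphate distance (from the text)
GUANIDINIUM_SPAN = 14.0  # Å, guanidinium-to-guanidinium distance across the dimer (from the text)

def salt_bridge_lengths(n, dna_spacing=6.8, protein_spacing=6.8):
    """Charge-to-charge distances for n successive arginine/phosphate pairs.

    ASSUMPTION: the protein sits exactly midway between the two DNA
    backbones, so the lateral gap on each side is half the difference
    between the two spans.
    """
    lateral_gap = (PHOSPHATE_SPAN - GUANIDINIUM_SPAN) / 2
    lengths = []
    for i in range(n):
        # Axial mismatch that accumulates if the two repeats differ.
        axial_drift = i * (dna_spacing - protein_spacing)
        lengths.append(math.hypot(lateral_gap, axial_drift))
    return lengths

print(salt_bridge_lengths(5))                   # matched 6.8 Å repeats
print(salt_bridge_lengths(5, dna_spacing=7.0))  # hypothetical mismatched repeat
```

With matched repeats every pair stays at the 3 Å lateral gap; with the hypothetical 7.0 Å repeat the distances grow residue by residue, which illustrates why equal axial spacing matters for the alignment.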
These alignments and spacings will be readily reproduced in the nucleosome, except that the protein must be rotated. Please note that the P1-P2 dimer cross-section shown in Figure 1(B) is approximately square-shaped. The structure shown would not be importantly altered if the protein were rotated 90˚. As I pointed out in my original report, salmon protamine, which has only a single subunit type and no cysteine, might be envisioned as having a structure similar to that shown in Figure 1(B), if we simply rotate the protein 90˚. The quality of the charge alignment between protein and DNA would be nearly unaffected by the rotation, and the association of adjacent protamine β-strands, although weaker in the absence of the sort of covalent disulfide linkage shown in Figure 1(B), could still be realized through hydrogen bonding alone, as seen in β-sheet structures such as keratin.
As we shall see below, it is the 90˚-rotated structure which will be required in histones. The protamine-to-histone analogy will be as follows: The hydrogen-bonded P1-P2 peptide backbones, in the protamine-DNA complex, will correspond to the hydrogen bonding of the N-terminal peptide backbones of adjacent octamers in the histone-DNA complex.
1.2. Intercalation of Base Pairs
Figure 2(A) shows the conceptual unit cell for protamine-DNA, consisting of the complete P1-P2 dimer plus its associated DNA. Also added to Figure 2(A) is a single additional DNA duplex, “borrowed” from an otherwise-invisible adjacent unit cell on the left. Thus, on the left-hand side of Figure 2(A), the bases are stacked at 3.4 Å, on the right, 6.8 Å; the difference is visually evident.
The manner of base stacking is shown in more detail in Figure 2(B), which depicts portions of two adjacent “unit cells”, one gray and one black. In order to cogently depict the stacking, it was necessary to pull the unit cells apart 2 - 3 Å, otherwise, in this view, they would have had the appearance of a single fused structure.
The type of DNA tetraplex structure on the left of Figure 2(A) is shown here because it will be almost exactly duplicated in the new histone-DNA model. (It is perhaps worth noting that a few close contacts in the DNA component of the published 2006 protamine-DNA model have been improved in the current model).
1.3. Extension of Unit Cell Structure to Fill the Sperm Cell
Figure 3 is an extended axial view of the structure introduced in Figure 1(B). Shown here are columns of DNA tetraplexes (the “X” shapes in this axial view) alternating with columns of protamine heterodimers (the “H” shapes). This regular alternation gives rise to rows of DNA and protein columns (vertically-disposed with respect to the page), which will extend throughout the sperm cell nucleus. Adjacent rows are offset by one-half of the unit cell, whereby they can interact by the rather extraordinarily fortuitous square-shaped charge arrays indicated by the three gray boxes. Positive charges from arginine (or, less commonly in protamine, lysine) lie at opposite corners of one diagonal, and negative charges from DNA at opposite corners of the other diagonal.
Figure 2. (A) The complete unit cell of protamine-DNA structure may be thought of as consisting of protamine P1 (51 residues) and P2 (57 residues), plus about 58 base pairs of DNA. P1 and P2, largely overlapping in this view, are held together by β-sheet-like hydrogen bonds between their peptide backbones, and additionally by three disulfide bonds. The many arginine residues of protamine can be seen projecting laterally toward the DNA. On the left of panel A, an additional DNA duplex from an otherwise-invisible adjacent unit cell has been included, to show the effect of mutual intercalation of base pairs. Thus, the DNA bp spacing on the right of the unit cell in panel A is 6.8 Å, while on the left, after intercalation, the spacing is 3.4 Å; (B) a magnified view, showing the intercalation of adjacent unit cells. These are pulled about 2 - 3 Å apart, because in the fully-intercalated structure (shown on the left of panel A), it is difficult to tell which DNA sugar-phosphate backbone any given base pair belongs to.
Figure 3. Axial view of extended protamine-DNA structure. The “H” shapes are P1-P2 heterodimers; the “X” shapes are DNA tetraplexes. The protein dimers are held together both by disulfide bonds, and by standard β-sheet-like backbone hydrogen bonding (neither of which, however, is visible in this axial view). Note that adjacent protein-DNA rows are offset by half of the unit cell, giving rise to an extraordinarily-fortuitous pattern of charge arrays, indicated by the three gray squares. Also indicated is the 15 - 16 Å spacing between protamine dimers which is necessary to accommodate the Wu straight ladder DNA tetraplex.
It should be specifically noted that, because the force between two charges is inversely proportional to the square of the distance between them, and because the diagonal of a square is √2 times as long as its side, the charge attractions along the sides of these three gray boxes are almost exactly twice as strong as the charge repulsions across the diagonals, making these square charge arrays very stable structures.
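The factor of two follows directly from Coulomb's law. As a brief worked sketch, assume an ideal square of side s with point charges of equal magnitude q at its corners (the symbols k, q and s are introduced here for illustration only):

```latex
F(r) = \frac{k\,q^{2}}{r^{2}}, \qquad
\frac{F_{\text{side}}}{F_{\text{diag}}}
  = \frac{k\,q^{2}/s^{2}}{k\,q^{2}/(\sqrt{2}\,s)^{2}}
  = \frac{(\sqrt{2}\,s)^{2}}{s^{2}}
  = 2
```

In this idealized picture each corner charge has two oppositely-charged neighbors at distance s and one like-charged neighbor at distance √2·s, so each attraction is twice as strong as the single repulsion.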
We shall invoke this sort of square charge array again, in the new histone-DNA complex, in order to promote the association of adjacent histone octamers. Furthermore, for such octamer-octamer associations, we shall also invoke standard β-strand backbone-to-backbone hydrogen-bonding, as implied for protamine P1 and P2 in Figure 1(B) and Figure 3 (although not directly visible due to the axial perspective), only minus the disulfide bonds, as there are no disulfide bonds in the histone octamer.
These two species of interaction, namely the charge-charge interactions shown in Figure 3, and the β-sheet-like hydrogen bonding interactions exemplified by the protamine P1-P2 dimer, are critically-important in both the protamine and histone structures presented here.
The final thing I wish to point out in Figure 3 is the 15 - 16 Å spacing between basic residue side-chains on either side of an “X”-shaped DNA tetraplex. This is the protein-protein spacing necessary to accommodate the Wu DNA tetraplex. We shall seek, and find such spacings in the histone octamer.
As stated above, the nucleosome cannot, at the present time in history, be isolated from living cells in a form amenable to direct structural studies, such as X-ray crystallography. If we wish to know its structure in vivo, we shall therefore be compelled to employ the process of logical deduction. This process begins with the non-controverted fact that essentially all important nuclear DNA-protein interactions are based upon charge-charge attractions between the negatively-charged phosphate groups of DNA, and the positively-charged R-groups of lysine and arginine residues in the protein. That being the case, it would be difficult to refute the proposition that protamine, which is essentially nothing more than a long string of arginine residues, totally unencumbered by the intellectually-confounding effects of non-DNA-binding domains, must be the prototype structure for all nucleoprotein.
The logic continues by noting that in the 60+ years since the publication of the Watson-Crick structure, the entirety of the molecular biological modeling establishment has been unable to come up with even a rudimentary structure proposal for protamine. That is because it is categorically impossible to align the negative charges in DNA with the positive charges in protamine, if one insists that DNA must have the twisted Watson-Crick structure in the resulting complex. As soon as the arbitrary requirement for helicity is relaxed, however, the protamine-DNA structure virtually solves itself, as previously described. Since both a globular structure and an α-helix can be conclusively ruled out, the structure of protamine in the protamine-DNA complex defaults to a β-sheet. If this β-sheet is assigned ψ and φ angles of +130.5˚ and −130.5˚, respectively, which places it in the most-favorable portion of the Ramachandran Plot, then the sheet is untwisted, with a protein side-chain spacing of 6.8 Å on each side of the β-strand, exactly twice the 3.4 Å base-pair stacking distance found in published DNA structures. The only known DNA structures that have 6.8 Å residue spacings are tetraplex structures, in which the energetically-required 3.4 Å base stacking distances are restored by mutual intercalation of the base-pairs of the adjacent duplexes that comprise the tetraplex. The best-known of the tetraplex DNA structures is the Gehring Tetramer, but that is a twisted structure, whose negative charges cannot possibly be aligned with the positive charges of protamine. Of all existing published DNA structures I know of, the only one that can be aligned with protamine is the Wu “straight ladder” tetraplex.
The resulting complex is essentially devoid of steric hindrances, and is so heavily-favored by a veritable abundance of perfectly-spaced hydrogen bonds and salt bridges, that, to quote my own 2006 cover letter to the Journal of Theoretical Biology, “...if this structure is not correct, at least in its essential details, then an evil demon in a parallel world must be playing a trick on humankind. Nothing that works this well can possibly be just ‘coincidence’”.
The application of protamine-DNA structure to the histone-DNA complex, i.e., the nucleosome, was relatively trivial, since the N-termini of the four histone subunits strongly resemble the protamine P1 and P2 strands. All that was required was virtual modeling software, which, at the time most of this work was done, was generally priced in accordance with its usual use in new drug development; a price range entirely beyond my means. The 2006 protamine-DNA model was created with the virtual modeling program AmiraMol, which was loaned to me by Mercury Computer Systems. That company no longer supports that software, wherefore I am grateful that Schrödinger, Inc., kindly loaned me their Suite, which includes the Maestro virtual modeling program, which was used to create the histone-DNA model reported here. With the few exceptions noted, however, the graphic images of the models were not exported from Maestro, but from Discovery Studio DS Visualizer, which is provided as freeware by Accelrys, Inc.
3.1. Application of the Principles of Protamine Structure to Histone Structure
The histone octamer is a marvelous work of internal symmetries, which, alas, can only be hinted at in a two-dimensional printed page. An extensive and minutely-detailed exposition of that structure has, however, been prepared in PowerPoint format (see the entirety of reference  ).
Figure 4 shows a view of the octamer which I would define, not entirely arbitrarily, as “axial”. This view is derived directly from the Luger et al.  pdb file [1AOI]. In the top panel is the complete nucleosome, consisting of the octamer core plus 146 base pairs of DNA, wrapped 1.75 times around the superhelical ramp. In the middle panel is the same picture of the octamer, only minus the DNA. In this, the currently-accepted model, the N-termini are portrayed as being essentially random coils, without any definite structure, projecting outward like starfish arms. In particular, they are not currently regarded as being significantly or regularly involved in the binding of DNA. That function, rather, is almost universally considered to be limited to the superhelical ramp in the octamer core. This, however, is a matter of some cognitive dissonance, because the preponderance of lysine and arginine residues are located in the N-termini [Table 1]. The primary purpose of the model we shall present here is to structure these N-termini as β-strands, and to move the DNA out of the octamer core, into a “straight ladder” relationship with the newly-ordered N-termini.
The lowest panel in Figure 4 shows a slightly magnified view of the isolated octamer core, truncated by removal of the unstructured N-termini. Here we can begin to see some of the many symmetries and subunit-subunit relationships within this extraordinary structure.
At the moment, however, we are less concerned with admiring the octamer than with establishing its geographical orientation and polarity with respect to its surroundings. This we can begin to do by defining coordinate axes in these figures. With respect to the manuscript page, or computer screen, the x-axis is left-to-right, the y-axis is down-to-up, and the z-axis rises in a direction perpendicular to the page. With respect to the z-axis, we may now establish a “top” and “bottom” of the view in Figure 4, corresponding respectively to what lies in the foreground and background of the figure.
Figure 4. Three views of the currently-accepted model of the nucleosome. The top panel shows the entire structure, the core of which consists of two instances each of histone subunits H3, H4, H2A and H2B. Superhelically-wrapped around the core are 146 base pairs of DNA, following a path known as the “superhelical ramp”. The middle panel shows the isolated histone octamer, stripped of DNA. Note the random coil nature of the long N-termini, and their variable lengths. This is best exemplified by the large differences between the lengths of the N-termini of the yellow subunit H3[e], on the left, and the white subunit H3[a] on the right. The N-termini of these two should be the same in length, but in fact differ markedly. This is because of variable degradation in the process of purifying these subunits for X-ray crystallography. The bottom panel is a magnified view of the isolated octamer core, with the N- and C-termini removed. In this truncated view, we can begin to appreciate its orderliness, and its many symmetries.
Table 1. Percentage of DNA phosphate groups bound by Lys/Arg residues.
1) This is merely [DNA base pairs] × 2; 2) Arginine and lysine only; histidine not counted; 3) Defined as the total of all the amino acid residues, and associated DNA, contained within the spiral path of the [Helix I] + [β-strand 1] + [β-strand 2] regions contributed by each of the 8 subunits collectively; 4) Defined (in accordance with the new nucleosome model presented herein) as the total of all the amino acid residues, and associated DNA, of the 8 octamer N-termini collectively, proximal to the core α-helices.
The histone octamer consists of two copies each of the four unique subunits H3, H4, H2A and H2B. Each instance of the four subunits constitutes a “logical tetramer”, by which I mean a tetramer which does not form spontaneously in the laboratory from purified subunits, but which nevertheless functions as a distinct structural unit in the octamer setting. The two logical tetramers in the Luger et al. pdb file [1AOI] have chain names a-b-c-d and e-f-g-h respectively.
Referring back to the lowest panel in Figure 4, logical tetramer #1 is therefore H3[a](white), H4[b](red), H2A[c](green), and H2B[d](blue). This tetramer is in the foreground of the picture, closest to us along the z-axis, wherefore I have come to think of it as the “north pole” of the octamer.
Logical tetramer #2, consisting of H3[e](yellow), H4[f](brown), H2A[g](turquoise) and H2B[h](pink), is in the background of the picture, farther away from us along the z-axis, wherefore I refer to that half of the octamer as the “south pole”.
Logical tetramers #1 and #2 are related by a 2-fold axis of symmetry, wherefore we may deduce that the octamer has a plane of symmetry somewhere between them. It is important to locate this plane of symmetry, because it will create a basis for assigning directions to the DNA strands which will be associated with the N-termini. In order to facilitate locating the plane of symmetry, we next strip everything away from the truncated octamer core except Helix I [Figure 5]. (The reader not familiar with histone “anatomy” may refer to any of several extensive discussions thereof).
The eight instances of Helix I in the octamer core represent the most “N-ward” part of the core. Therefore, the lowest-numbered amino acid residues of each of the eight Helix I’s may be thought of as “sentinels”, standing at the junction between the octamer core behind them and the N-terminal strands lying without. We might perhaps therefore be well-justified in considering the geometries of the N-terminal ends of Helix I to be significantly important in determining the directions that the eight currently-unstructured N-terminal strands will take, after we re-model them as β-strands. The re-modeled N-terminal strands will then, in turn, determine the directions of the DNA strands to which they will bind.
With this in mind please consider Figure 5(A). The 15 Å labels show the approximate distances between the α-carbon atoms of the most N-terminal of the amino acid residues, in each of the four pairs of adjacent Helix I helices at the four octamer corners. We have therefore located the 15 - 16 Å spacing alluded to earlier, in Figure 3, namely the spacing necessary to accommodate the Wu “straight ladder” DNA tetraplex. This is a major step in applying the principles of protamine structure in the setting of the histone octamer.
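A measurement of this kind can be reproduced from any PDB-format coordinate file. The sketch below is illustrative only: it reads the fixed-column ATOM records defined by the PDB format, but the two sample records use made-up coordinates and residue numbers (they are not taken from 1AOI), and the helper names `parse_ca` and `ca_distance` are hypothetical.

```python
import math

def parse_ca(line):
    """Return (chain, residue number, (x, y, z)) for a CA ATOM record, else None."""
    if not line.startswith("ATOM") or line[12:16].strip() != "CA":
        return None
    # Fixed PDB columns: x = 31-38, y = 39-46, z = 47-54 (1-based).
    xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
    return line[21], int(line[22:26]), xyz

def ca_distance(lines, first, second):
    """Distance in Å between two alpha-carbons given as (chain, residue number)."""
    coords = {}
    for line in lines:
        record = parse_ca(line)
        if record:
            coords[(record[0], record[1])] = record[2]
    return math.dist(coords[first], coords[second])

# HYPOTHETICAL records with invented coordinates, built to the PDB column layout.
TEMPLATE = "ATOM  {:5d}  CA  {:3s} {:1s}{:4d}    {:8.3f}{:8.3f}{:8.3f}"
lines = [
    TEMPLATE.format(1, "ALA", "A", 44, 0.0, 0.0, 0.0),
    TEMPLATE.format(2, "ALA", "B", 24, 9.0, 12.0, 0.0),
]
print(ca_distance(lines, ("A", 44), ("B", 24)))
```

To apply this to a real structure, the `lines` list would be replaced by the lines of the coordinate file, and the chain and residue identifiers by the most N-terminal Helix I residues of the two subunits at a given corner.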
Moreover, we learn something else from the α-carbon atoms of these most N-terminal amino acid residues of Helix I. If lines are drawn, connecting the α-carbon atoms of the most-proximal residues of logical tetramer #1 (the red, white, green and blue subunits), we find that we have defined a nearly-planar structure, depicted in the figure as a flat-looking green quasi-rectangle. The green shape, when examined by virtual modeling software in 3D, is not perfectly planar, but it is very close to planarity. Moreover, insofar as it can be regarded as a plane, it turns out that it is very nearly parallel to the x-y plane of the image.
Figure 5. (A) Axial view of the histone octamer core, stripped down so that only Helix I remains. Shown here are the 15 Å spacings between the α carbon atoms of the most-proximal (i.e., “N-ward”) residues of the paired Helix I helices, in each of the four corners of the octamer core. Also shown are the nearly-planar, quasi-rectangular shapes that result when lines are drawn connecting these α carbon atoms. The green plane belongs to “logical tetramer #1”; the red plane to “logical tetramer #2”, as such terms are defined in the text. (B) This shows the natural “up” and “down” directions of the eight instances of Helix I in the crystal structure of the octamer core. As explained in the text, the directions “up” and “down” can be defined by reference to a virtual “histone equatorial plane”.
This is something I had not previously read about or expected, so I repeated the exercise with logical tetramer #2, giving the red quasi-rectangle in Figure 5. Upon further examination, these two nearly-planar quasi-rectangles prove to be not exactly co-planar, but nearly so, suggesting the existence of a virtual “compromise” plane located between them. This compromise plane is depicted in purple in Figure 6.
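Near-planarity of four points, and the tilt of their plane relative to the x-y plane, can be quantified in a few lines. This is a minimal sketch with invented corner coordinates (not the actual α-carbon positions); the function name `planarity` is hypothetical, and the approach shown (plane through three points, distance of the fourth) is one simple choice among several.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def planarity(p0, p1, p2, p3):
    """Return (out-of-plane distance of p3 in Å, tilt from the x-y plane in degrees).

    The reference plane is the one through p0, p1 and p2.
    """
    normal = cross(sub(p1, p0), sub(p2, p0))
    length = math.sqrt(dot(normal, normal))
    normal = tuple(c / length for c in normal)
    deviation = abs(dot(sub(p3, p0), normal))
    # Angle between the plane normal and the z-axis equals the plane's tilt.
    tilt = math.degrees(math.acos(min(1.0, abs(normal[2]))))
    return deviation, tilt

# HYPOTHETICAL corner coordinates in Å, for illustration only.
corners = [(0.0, 0.0, 0.0), (30.0, 0.0, 0.0), (30.0, 20.0, 0.0), (0.0, 20.0, 1.0)]
deviation, tilt = planarity(*corners)
print(deviation, tilt)
```

A small deviation relative to the side lengths, together with a tilt near zero, would correspond to the "nearly planar, nearly parallel to the x-y plane" description in the text.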
The view in Figure 6 is the same as the view in Figure 5, only rotated 90˚ “to the left”, that is, −90˚ about the y-axis of Figure 5. In Figure 6(B), we see a purple plane separating the white-red-green-blue subunits of logical tetramer #1 to its left, from the yellow-brown-turquoise-pink subunits of logical tetramer #2 to its right (although the red and brown H4 subunits overlap).
I refer to this purple “compromise” plane, located midway between the green and red quasi-rectangular planes of Figure 5 (still visible here, but only as not-quite-linear edge views), as the “histone equatorial plane”.
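One simple way to construct a plane midway between two nearly parallel quasi-planes is to average their centroids and their unit normals. The sketch below illustrates that idea under stated assumptions; it is not necessarily how the purple plane in Figure 6 was generated, the coordinates are invented, and the names `centroid`, `unit` and `midway_plane` are hypothetical.

```python
import math

def centroid(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def unit(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def midway_plane(points_a, normal_a, points_b, normal_b):
    """Return (point, unit normal) of a plane midway between two near-parallel planes."""
    ca, cb = centroid(points_a), centroid(points_b)
    na, nb = unit(normal_a), unit(normal_b)
    # Flip one normal if needed so both point the same way before averaging.
    if sum(x * y for x, y in zip(na, nb)) < 0:
        nb = tuple(-c for c in nb)
    point = tuple((x + y) / 2 for x, y in zip(ca, cb))
    normal = unit(tuple(x + y for x, y in zip(na, nb)))
    return point, normal

# HYPOTHETICAL quasi-rectangles 4 Å apart along z, for illustration only.
green = [(0, 0, 2), (30, 0, 2), (30, 20, 2), (0, 20, 2)]
red = [(0, 0, -2), (30, 0, -2), (30, 20, -2), (0, 20, -2)]
point, normal = midway_plane(green, (0, 0, 1), red, (0, 0, -1))
print(point, normal)
```

For these invented inputs the midway plane passes through the midpoint of the two centroids with its normal along z, which is the role the "histone equatorial plane" plays in the discussion above.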
The equatorial plane separates the histone octamer into a “north pole” (rotated −90˚ relative to Figure 5, so that here, in Figure 6(B), it is on the left), consisting of the a-b-c-d chains, and a “south pole” (now on the right), consisting of the e-f-g-h chains. The panels on the top and bottom of Figure 6 (Figure 6(A) and Figure 6(C)) show the two “poles” in isolation.
If we accept the “equatorial plane” as being the de facto plane of symmetry between logical tetramers #1 and #2, we may then use it to define likely directions for the N-terminal β-strands and their associated DNA. The final piece of evidence necessary to define these N-terminal strand directions is shown in the previous figure, in the panel labeled Figure 5(B). That figure attempts, with hopefully at least partial success, to demonstrate the natural “up” and “down” directions of the eight Helix I helices. The terms “up” and “down” may now be defined with respect to the histone equatorial plane. Please note the regular alternation of the “up” and “down” directions in the figure, as one travels around the octamer core. Note also that the directionality is subject to the plane of symmetry; thus H4[b] (red), in what we now define as the “northern hemisphere”, points up, but H4[f] (brown), in the “southern hemisphere”, points down (etc.). (Needless to say, these directionalities are better-appreciated in the virtual 3-D model than on a flat page).
Figure 6. “Histone equatorial plane”. This is the purple plane, which is a virtual construct lying midway between the green and red quasi-rectangular, quasi-planar shapes in Figure 5. The view here is the same, except that it’s rotated −90˚ about the Y-axis. The green and red quasi-planes are still visible here, but only as thin lines (if they were true planes, they would disappear entirely at this angle). Please note that the purple plane has not been accurately drawn; it’s visible here only because it was rotated a few degrees about the Y-axis; otherwise it would have disappeared entirely in this view. The terms “north pole” and “south pole” are defined in the text. The isolated “north pole” is shown in panel A; the “south pole” in panel C.
3.2. New Histone Model
We are now ready to look at the new protamine-based model of histone-DNA structure. (An extensive presentation of this new model, in PowerPoint slide format, appears elsewhere). Figure 7 shows the axial view. The only difference between this figure and Figure 4 is that the unordered N-termini in Figure 4 are portrayed here as being highly-ordered.
The upper panel, Figure 7(A), is an all-ribbon “perspective” view, showing four Wu straight-ladder tetraplexes rising up from the octamer core (and extending in the opposite direction as well, although we cannot see that), at an angle normal to the histone equatorial plane defined above.
Figure 7. Axial view of the new nucleosome structure. (A) shows an all-ribbon perspective view. (B) is a hybrid view; the octamer core is portrayed as ribbons, but the N-terminal protein-DNA complexes are portrayed as non-perspective atomic models. In (B), the labels in the lower left-hand corner of the octamer show the two β-strands which support the DNA tetraplex between them. The α-helix in the upper right of (B) is an additional helix not found in the crystal structure, as explained later in the text.
Figure 7(B), the lower panel, is a hybrid view. The core is still portrayed as ribbons, but to better-demonstrate certain features of the structure, the N-termini and their associated DNA have been portrayed as “non-perspective” atomic models.
The nucleosome, previously a quasi-circular shape, is now approximately square-shaped, with a DNA tetraplex in each of its four corners. The orientations of the DNA/protein complexes at the corners are not all the same, because in modeling them, I have endeavored to follow, as closely as possible, the natural directions of the N-termini as they emerge from the core of the Luger et al. crystal structure, which directions are not all the same.
Specifically, with respect to the two H3-H4 corners at the top of Figure 7(B) (i.e., the white/red H3[a]/H4[b] corner on the right, and the yellow/brown H3[e]/H4[f] corner on the left), the DNA tetraplex “X” shapes are oriented in a direction approximately parallel to the x-axis of the drawing.
With respect to the two H2A-H2B corners at the bottom of Figure 7(B) (i.e., the green/blue H2A[c]/H2B[d] corner on the left, and the turquoise/pink H2A[g]/H2B[h] corner on the right), the DNA tetraplexes appear to be oriented along lines which emanate radially from the center of the octamer core.
Although these diverse orientations have an almost disturbingly-arbitrary look at first, it turns out that the two different orientations facilitate, here in the setting of the histone-DNA complex, the two different modes of stabilization found in the protamine-DNA complex, both of which we have discussed previously, namely (1) the peculiar square charge arrays depicted in Figure 3 above, and (2) the alignment of adjacent β-strands, to form hydrogen-bonded β-sheet-like structures, as exemplified by the protamine P1-P2 dimer. How these two modes of stabilization apply to the histone-DNA complex will be shown presently.
The “X” shapes of the DNA tetraplexes, comparable to those of the protamine-DNA complex depicted above in Figure 3, are clearly seen in all four corners of Figure 7(B). The “H” shapes of the protamine P1-P2 dimers in Figure 3, however, are not seen here, because only one-half of the hydrogen-bonded β-sheet-like structures are present. The other halves, as we shall see, will come from adjacent nucleosomes which we have not yet added to the picture.
Although it will hardly be evident in Figure 7, either the protein or the DNA could have been rotated 90˚ relative to the orientations shown in the figure. The rotational state shown was therefore chosen over three competing rotational alternatives. The reasons for excluding the other three have been discussed elsewhere  .
Figure 8 shows part of the protein-DNA complex depicted in Figure 7. The part shown is subunit H3[a], the white subunit in the upper right-hand corner of either panel of Figure 7. In that previous view, the N-terminus of H3[a] pointed down, as such direction was defined in Figure 5(B). Thus, to get the view in Figure 8, in which the N-terminus points to the left, the subunit and its associated DNA, now shown as atomic models, had to be rotated exactly +90˚ about the y-axis of the drawing.
The uppermost drawing of Figure 8 shows the complete H3[a] subunit without DNA, with all the N-terminal lysine and arginine residues labeled. A comparison between this orderly structure, and the strikingly-disordered N-terminal structures portrayed in the middle panel of Figure 4, will reveal, at a glance, the central purpose of the entire histone-modeling endeavor reported in this manuscript.
In the lower portion of Figure 8 we see the manner of association between the β-strand (with ψ/φ = ±130.5˚) and the DNA (with 6.8 Å bp spacing), which is exactly the same as that seen in the protamine-DNA complex (refer back to Figure 1), except for the 90˚ rotation of the protein (which cannot be demonstrated in this view, but which was discussed with respect to salmon protamine in the text relating to Figure 1). Thus, as Figure 8 hopefully demonstrates, the axial alignment of positive and negative charges is nearly perfect, and the spacing of the salt bridges can also be nearly perfectly set at 3 Å.
We therefore see that not only is the number of potentially-DNA-binding basic residues substantially greater in the histone N-terminal domain than in the superhelical ramp (as quantitated in Table 1), but, perhaps more importantly, the alignment and spacing of the positive and negative charges are far better in our new model than in the current one.
Figure 8. Two atomic-model views of histone subunit H3[a]. This was portrayed by a white ribbon in Figure 7; here the ribbon is removed, and the entire subunit is rotated +90˚ about the y-axis. The top of the figure shows the isolated protein, with the N-terminal β-strand lysine and arginine residues labeled. The bottom shows the same structure, bound by salt bridges to a Wu 6.8 Å-spaced DNA “straight ladder” duplex.
3.3. How Do Adjacent Octamers Interact?
Figure 9(A) and Figure 9(B) show longitudinal views of the new nucleosome model in two rotational states. (The ribbon style and coloring are slightly different from those of the previous figures, because this is a Schrödinger Maestro graphic export, and the previous slides were Accelrys Discovery Studio graphic exports).
The figure demonstrates a key attribute of the structure. At each corner of the histone octamer (where “corner” is defined by the axial view in Figure 7 above), one of the paired β-strands points “up”, and the other “down” (as such directionality was previously defined in Figure 5(B)). This is far-better demonstrated in the longitudinal view shown here.
Note, for example, that in Figure 9(A), the N-terminal β-strands of H2A[c] (green) and H2B[d] (blue) are clearly seen to be on opposite sides of the DNA, and to be pointing in opposite directions. (The “up” and “down” directions, as previously defined with respect to the “histone equatorial plane”, are indicated in the middle of the current figure). The manner in which the directionality of the proximal end of Helix I leads to the N-terminal β-strand directions shown is, of course, best seen in the pdb virtual structure file or Jmol applet  . It can also be seen in PowerPoint slide format  or movie format  .
Because the N-terminal protein strands are on opposite sides of the DNA tetraplex, exactly half the DNA has no associated protein, as indicated for H2A by the four green question marks in Figure 9(A). These “bare” areas, as we shall see next, will be filled by the N-terminal β-strands of adjacent histone octamers.
Figure 9. Two longitudinal views of the new histone model. Panels A and B differ only in the rotational state of the structure about the x-axis. The “up” and “down” orientations of the N-termini, introduced in axial view in Figure 5(B), are clearly seen here. Also seen here is that half of the DNA is unassociated with a protein N-terminus. This is highlighted for the green H2A[c] subunit by the four green question marks above it. The bare DNA strands will be “filled” by the N-termini of adjacent histone octamers, as illustrated in Figure 10 below.
3.4. Two Octamers
Figure 10 continues where Figure 9 left off, by adding a second histone octamer to each of the nucleoprotein structures in Figure 9, giving a pair of di-nucleosome structures. As in Figure 9, the two pictures here (Figure 10(A) and Figure 10(B)) are the same except for the rotational state. The nucleosome monomers on the left-hand side of each di-nucleosome structure can be thought of as representing the cognate nucleosomes in Figure 9. Note that the formerly-bare region of DNA (identified by the four green question marks in Figure 9; now identified by a single green asterisk) is now inhabited by the blue N-terminus of the H2B[d] subunit of the histone octamer to its right. The nucleosome to the right, however, now has a large bare region of its own (marked with 4 green question marks), which will, in the future, be filled by a 3rd octamer to the right of it.
Note also that when multiple octamers are longitudinally-aligned as shown, to form an extended nucleoprotein structure, all like-colored N-termini are coaxially-aligned.
With regard to adjacent histone octamers binding to the same DNA, what will the inter-octamer spacing be? The question remains open. In the downloadable pdb file  the spacing is as shown in Figure 10, which is almost maximally-close-packed, with the N-terminal end of one β-strand abutting on the C-terminal end of the next, so that the two form a nearly-continuous structure (separated only by a very short “bare” stretch of DNA, as indicated for the white H3[a] subunits in Figure 10(B)). This close spacing seems logical, since it leaves almost no DNA without protein support, but it is by no means mandated by the model. There is at least one reason for doubting that the spacing should be quite this close, which reason has been discussed elsewhere  .
The “traditional”, or superhelical ramp model of nucleosome structure includes lengths of “spacer” DNA, which alternate with lengths of DNA that are superhelically-wound about the octamer core. Histone subunit H1/H5 is involved in this arrangement, although the molecular details, and the precise atomic modeling of it, are lacking. Our new nucleosome model makes no mention of H1/H5, because there is no logical way to define a role for it. Likewise, our new model could accommodate any length of “spacer” DNA between octamer cores, but there is no way to logically choose any one spacing over any other, wherefore I have left adjacent octamers close-packed, pending information to the contrary.
Figure 10. Panels A and B show two x-rotated states of a nucleoprotein length containing two adjacent histone octamers. The left-hand nucleosome in each panel can be thought of as representing the same nucleosome depicted in the corresponding location in Figure 9. The junction between the two octamers is best seen in the lower panel, for the white H3[a] subunits, between which a short length of “bare” DNA can be seen. The much larger “bare” stretch of DNA in Figure 9(A), marked there by four green question marks, is marked here by a single green asterisk, and is no longer “bare”, having become associated with the blue N-terminus of the H2B[d] subunit of the adjacent octamer. That octamer, however, has its own “bare” stretch of DNA (marked again by four green question marks), which will eventually be filled by the addition of a 3rd octamer not shown.
3.5. 10 nm and 30 nm Fiber
The diameter of our new nucleosome model is approximately 10 nm, the same as that of the 10 nm “beads on a string” strands which are shown in most textbooks, and which are cited as the next step in higher chromatin structure. The “beads on a string” are presumed to be individual histone octamers connected by the above-referenced “linker DNA”. The new model presented here gives a strand of the correct diameter, but it would not be expected to give the same “beads on a string” appearance, because we have not included any linker DNA. I have discussed this potential discrepancy at considerable length elsewhere  .
On the other hand, with respect to the mysterious 30-nm fiber, a currently poorly-defined condensation of 10-nm fibers, our new structure not only readily suggests an exact atomic model, but several such models. The problem therefore changes from one of coming up with any structure at all, to one of selecting from among a multiplicity of plausible structures which compete for our attention.
Figure 11(A) shows an axial view of the simplest possible form of a 30-nm fiber, consisting of a quadrilateral array of four nucleosome strands. Its size can be verified at a glance, being the size of three 10-nm nucleosomes, or, more precisely stated, two 10-nm nucleosomes plus a 10-nm hole, or channel in the middle.
The purple squares in Figure 11(A) highlight the two types of interactions which stabilize this, and all other like structures I have examined. The contents of the upper purple square are twice-magnified in Figure 11(B), which reveals a structure which is precisely the same as that shown in Figure 3 for protamine. This is a square charge array, with positive charges at one diagonal provided by basic residues, and negative charges at the other diagonal provided by DNA phosphate groups.
The contents of the lower purple square of Figure 11(A) are magnified in Figure 11(C), which is merely an axial view of a pair of β-strands, which (although obviously not visible in this axial view) have formed a small β-sheet, by hydrogen bonding along their peptide backbones, exactly as in the case of the protamine P1-P2 dimer (see Figure 1(B) and Figure 3), except without the benefit of disulfide bonds.
The 10-nm channel shown in Figure 11(A) would allow passage of smaller enzymes such as DNA polymerase I, but not necessarily larger polymerase enzyme complexes, wherefore I would not be quick to presume that the cell nucleus requires a perpetual channel such as this, simply for the passage of enzymes. If we eliminate the channel, then the simplest type of 30-nm model would be one such as that depicted in Figure 12, where six nucleosomes are arranged as two horizontal rows of three nucleosomes each. These rows are related by a two-fold axis of symmetry, and are bound securely by β-sheet-like hydrogen bonds, exactly as previously depicted in Figure 11(C). Additional stabilization of the Figure 12 structure can be readily perceived, if, rather than regarding the six nucleosomes as being two horizontal rows of three nucleosomes each, we think of them instead as three vertical columns of two nucleosomes each, wherein we notice that adjacent columns are bound by square-array charge interactions, exactly as previously depicted in Figure 11(B).
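The size arithmetic for the two 30-nm arrangements can be checked with a minimal sketch. It assumes that each nucleosome strand occupies one 10 nm × 10 nm cell of a square grid in axial view. The grid coordinates, and in particular the placement of the four strands of Figure 11(A) around a central empty cell, are illustrative assumptions rather than coordinates taken from the model files, and the names `NUCLEOSOME_NM` and `footprint_nm` are hypothetical.

```python
# Approximate nucleosome diameter stated in the text, in nanometres.
NUCLEOSOME_NM = 10


def footprint_nm(cells, unit=NUCLEOSOME_NM):
    """Return (width, height) in nm of the bounding box of occupied grid cells.

    `cells` is a set of (row, column) grid positions, one per nucleosome
    strand seen in axial view. The grid itself is an illustrative assumption.
    """
    rows = [r for r, _ in cells]
    cols = [c for _, c in cells]
    width = (max(cols) - min(cols) + 1) * unit
    height = (max(rows) - min(rows) + 1) * unit
    return width, height


# Assumed layout for Figure 11(A): four strands around an empty central cell,
# so that any line through the middle crosses nucleosome, channel, nucleosome.
square_with_channel = {(0, 1), (1, 0), (1, 2), (2, 1)}

# Layout for Figure 12: two rows of three strands, with no channel.
two_by_three = {(r, c) for r in range(2) for c in range(3)}

print(footprint_nm(square_with_channel))  # (30, 30)
print(footprint_nm(two_by_three))         # (30, 20)
```

Under these assumptions both arrangements span 30 nm in their widest direction, the first as two 10-nm nucleosomes plus a 10-nm channel, the second as three 10-nm nucleosomes side by side.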
It is not at all difficult to extend the structure shown in Figure 12, by means of 300-nm and 700-nm folding, into an entire chromosome, consisting of a single DNA duplex with no helical twist whatsoever. The resulting chromosome structure is sufficiently hypothetical, however, that I shall refrain from presenting it here, although I confess that I find it surprisingly plausible. The interested reader may find a presentation of the complete chromosome structure elsewhere  .
3.6. N-Terminal Length Mismatches
The completion of this modeling project required reconciliation of the extreme length mismatches between the N-terminal regions of the histone subunits, as such are portrayed in the current crystal structure. The lengths of the pre-Helix-I, N-terminal amino acid sequences of the four unique histone subunit types are:
H2A: 27 residues
H2B: 37 residues
H3: 63 residues
H4: 30 residues
Subunit H2B begins with an extraordinarily atypical, 10-amino-acid sequence that includes four proline residues: PEPAKSAPAP. Notwithstanding the lone lysine residue, it is difficult to imagine that the purpose of such a sequence is the binding of DNA phosphate groups. Equally extraordinarily, of the remaining H2B N-terminal residues, over 50% (14 of the remaining 27 residues) are lysine or arginine, making this latter region the very most basic of all the histone subunit N-termini!
Figure 11. (A) Simplest form of a 30-nm fiber, consisting of 4 nucleosomes in a square array. The purple squares highlight the two modes of stabilization for this, and all other 30-nm models which arise from our new nucleosome model. These are magnified in the panels below; (B) this is a magnified view of the upper purple square in panel A. This is the square charge array seen previously in the protamine-DNA complex, as explained in the text; (C) this is a magnified view of the lower purple square in panel A, showing an axial view of two adjacent β-strands, bound together by hydrogen bonding between their peptide backbones. (Obviously the hydrogen bonding cannot be seen in axial view).
Figure 12. Simple 6-nucleosome model for a 30-nm fiber. Viewed in axial perspective, we can think of this structure as being either two rows of 3 octamers each, or as three columns of 2 octamers each. The two rows are related by a 2-fold axis, and are stabilized by the hydrogen bonds between their β-strand peptide backbones, exactly as depicted in Figure 11(C). The three columns are related by square charge arrays, exactly as depicted in Figure 11(B).
I cannot make any sense out of the peculiar H2B N-terminal amino acid sequence, and I mention it here for one reason only: that if we disallow the first 10 amino acids, presuming that the high proline content removes them from the otherwise precise alignment with DNA which is depicted in Figures 8-10 (although how those 10 residues would then be conformed I cannot say), then the lengths of three of the four subunits, namely H2A, H2B and H4, suddenly become very well matched, at approximately 30 residues each. H3, however, has more than twice that many residues (63 in total), giving rise to an H3 N-terminus which is so severely length-mismatched as to render any β-strand-based modeling of the histone octamer impossible.
This severe length mismatch is partly resolved by simply allowing H3 residues 45 - 56 to remain in the octamer core, in the α-helical conformation reported for them in the Luger et al. crystal structure  . This I make special mention of only because the language I have used up until now, in discussions of the “N-terminus”, may have seemed to imply “everything proximal to Helix I”, that, however, being a region within which subunit H3 has an additional and significantly-long α-helix. But even if we allow this additional α-helix to remain in the octamer core as such, that still leaves H3 with a 44-residue N-terminus, which is much longer than any of the other three subunits.
I have solved this problem (conceptually, at least) by configuring H3 residues Ser28 to Arg42 as yet a second additional α-helix; one which was completely unforeseen at the outset of these studies. One of these novel α-helices, in the white H3[a] subunit, is shown and labeled in axial view in Figure 7(B). The α-helices of both the H3[a] (white) and H3[e] (yellow) subunits, however, are best depicted in longitudinal view in Figure 9(B) and Figure 10(B). The addition of these novel α-helices to the H3 N-termini reduces their lengths to 27 residues each, fully-consistent with the N-terminal lengths of the other three unique subunit types.
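The length bookkeeping of this section can be retraced with a minimal sketch. It uses only the residue counts and residue numbers quoted in the text. The variable and function names are hypothetical, and it rests on one assumption of interpretation: that the modeled N-terminal strand consists of every residue preceding the first residue of the nearest α-helix.

```python
# Pre-Helix-I N-terminal lengths quoted in the text (number of residues).
PRE_HELIX_I = {"H2A": 27, "H2B": 37, "H3": 63, "H4": 30}

# Proline-rich H2B leader that is set aside in the model.
H2B_PROLINE_RICH = "PEPAKSAPAP"
# Lysine plus arginine count in the remaining H2B residues, as stated in the text.
H2B_BASIC_AFTER_LEADER = 14

# First residue of the crystallographic H3 helix (residues 45-56) and of the
# additional helix proposed here (Ser28 to Arg42).
H3_CORE_HELIX_START = 45
H3_NOVEL_HELIX_START = 28


def modeled_strand_lengths():
    """Return the modeled N-terminal lengths and the intermediate H3 length.

    Assumption: the N-terminal strand is everything before the first residue
    of the nearest helix, so its length is that residue number minus one.
    """
    h2b = PRE_HELIX_I["H2B"] - len(H2B_PROLINE_RICH)
    h3_after_core_helix = H3_CORE_HELIX_START - 1
    h3 = H3_NOVEL_HELIX_START - 1
    lengths = {
        "H2A": PRE_HELIX_I["H2A"],
        "H2B": h2b,
        "H3": h3,
        "H4": PRE_HELIX_I["H4"],
    }
    return lengths, h3_after_core_helix


lengths, h3_intermediate = modeled_strand_lengths()
basic_fraction = H2B_BASIC_AFTER_LEADER / lengths["H2B"]

print(lengths)                   # {'H2A': 27, 'H2B': 27, 'H3': 27, 'H4': 30}
print(h3_intermediate)           # 44
print(round(basic_fraction, 2))  # 0.52
```

On this reading, three of the four subunit types come out at 27 residues and the fourth at 30, which is the sense in which the lengths are described as matched.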
This ad hoc additional H3 α-helix is not justified merely by my personal need to solve the length mismatch problem. More significantly, the new α-helix puts its residues Lys36 and Arg40 into such good alignment with DNA phosphate that I’d be hard-pressed to dismiss that alignment as mere coincidence. That fortuitous alignment, and the length mismatch problem generally, are discussed and illustrated in more detail elsewhere  .
4. Discussion and Conclusions
The present histone re-modeling project is merely the latest step in what has become, for this 66-year-old author, a life-long effort to persuade scientists not so much that DNA has a non-helical structure in the nuclei of living cells, but rather this: to persuade scientists that the question of helicity vs. non-helicity of DNA, in the nuclei of living cells, is a question of surpassing importance, and that the question needs to be addressed much more firmly and directly than it has been to date. Thus, the mere willingness to accept the possibility of a non-helical structure for DNA has resulted in an exceedingly-plausible structure for protamine, an important nucleoprotein whose structure had previously been a total mystery for over half a century. Likewise, the same non-helical DNA structure also now provides a solution (perhaps not quite rising to the level of “exceedingly-plausible”, but plausible all the same) for the histone N-terminal domain, where the preponderance of basic residues have always been known to reside, but whose structure, like that of protamine, was previously a complete mystery.
It is widely believed that the helix model of DNA structure is supported by what some have referred to as a veritable “mountain” of evidence. But when this “mountain” is examined, as I have been doing continuously for over 40 years now, it turns out that nearly every study involves DNA that has been subjected to one or more of the following perturbations:
1) The DNA has been torn from its native environment, where its inevitable association with a preponderance of positively-charged basic proteins must surely alter the artificial structure seen upon its removal and “purification”.
2) The DNA has been broken up into thousands, or millions of small fragments, which is almost certain to destroy the native structure, and replace it with the usual structure for small linear DNA in solution, which is the Watson-Crick double helix.
3) The DNA has been nicked with nucleases. This is particularly destructive in the case of circular DNA, where such treatment destroys, in an instant, the native winding.
4) The DNA, if circular, has been relaxed with topoisomerases, which then re-seal the chromosome with unnatural linking numbers.
The number of papers which purport to directly address the question of helicity vs. non-helicity in totally uncorrupted native DNA, and which conclude that DNA does indeed have a net helical twist, is vanishingly small, and that sparse work has, to date, been severely compromised, both by experimental error, and by conclusions not supported by the data presented, as has been extensively discussed previously  . On the other hand, there are several studies which would appear to present compelling evidence that pristine, unperturbed native DNA may not have a net helical twist after all, and they ought at least to be mentioned here. One such paper forcefully demonstrates that the strands of native circular duplex plasmid DNA can apparently be separated, as fully-intact circular single strands, without any strand breakage; something that would be impossible if double-stranded plasmid circular DNA had a net plectonemic twist  . It is incomprehensible to me that this latter study, which had ironclad controls, has been almost totally ignored for nearly 20 years.
In 2002 I published a paper demonstrating that the topological behavior of small circular DNA in alkali denaturation experiments, once considered by many to be all-but-conclusive proof of the Watson-Crick structure, could in fact be better-explained by a non-helical model  . This too has been almost totally ignored.
In 2006 I published the protamine-DNA model  which formed the basis for the current study of histone structure. Since protamine is little more than a long string of positive charges from arginine, and DNA a long string of negative charges from its phosphate groups, one might think that protamine should, as its name suggests, be the very prototype for all nucleoprotein structure, and, moreover, that its structure should have been deduced within months of publication of the double helix in 1953. And yet that did not happen, either then, or ever. The reason is that the protamine-DNA structure, like the classic “square peg in a round hole”, is simply impossible to solve with helical DNA, as has been discussed at length elsewhere  .
In the present manuscript I have proposed a new nucleosome structure, placing the DNA in the histone subunit N-termini, where the majority of basic amino acid residues lie. Unfortunately, in contrast to my opinion of the protamine model, which I regard as well-nigh incontrovertible, I readily concede that there is no such incontrovertibility in the new histone model. For one thing, we cannot be certain that the crystal structure of the octamer core  is a true model of the same structure in the 100% humidity environment of the living cell nucleus, since very little if anything in biology depends upon it being so. Nevertheless, there is no alternative histone core structure of which I am aware. Being of a basically argumentative nature, I have myself spent innumerable hours poring over the current octamer core structure, looking for alternative structures which might have been missed by others. Everywhere I looked, however, all I found was beauty, logic and symmetry. Since there is no logical basis for altering a single atom of it, we must therefore trust in it, and hope that it will survive the test of time.
Obviously, however, no such situation pertains to the histone N-terminal domain, which is currently portrayed as being a set of eight random coils, lacking any defined structure or function. As “nature abhors a vacuum”, I think that nature might perhaps also abhor eight random coils at the extremities of the protein support structure for the genetic material of all higher life. The fact that the greatest density of basic residues is found in that very domain is therefore startling, to say the least.
Although I am not certain that the principles of the protamine model must necessarily be held applicable to the histone N-termini, the plain and simple fact of the matter is that protamine-like structure is found there. Anthropomorphically speaking, therefore, these histone N-termini logically “cried out” to be so modeled, and I have obliged the human logical thought process by doing so. I believe that the result is sufficiently plausible that to not bring it to the attention of histone scientists, and to leave the N-terminal domain remaining in the intellectual darkness within which it is currently enshrouded, would be a disservice to science.
On the other hand, whether this re-modeling project is merely a stimulating intellectual exercise, with no relevance to biology, or whether, alternatively, it is a doorway into the still-mysterious world of intranuclear protein-DNA structure, only time will tell.
The Schrödinger Suite, which includes Maestro, the software used to create all the molecular models described herein, was generously loaned to me by Schrödinger, Inc., whose gift is gratefully acknowledged.
 Luger, K., Mader, A.W., Richmond, R.K., Sargent, D.F. and Richmond, T.J. (1997) Crystal Structure of the Nucleosome Core Particle at 2.8 Å Resolution. Nature, 389, 251-260.
 Biegeleisen, K. (2005) Protein Data Bank. Accession Numbers 2AWR (Protamine-DNA Complex 1) and 2AWS (Protamine-DNA Complex 2).
 Biegeleisen, K. (2014) Histone Structure. Part II. A Model Which Places DNA in the N-Terminal Region of the Octamer. Slides 82-130 review Protamine-DNA structure, assuming a parallel relationship between P1 and P2. Slides 131-141 present an antiparallel variation on the structure.
 Wu, T.T. (1969) Secondary Structures of DNA. Proceedings of the National Academy of Sciences of the United States of America, 63, 400-405.
 Gehring, K., Leroy, J.L. and Gueron, M. (1993) A Tetrameric DNA Structure with Protonated Cytosine-Cytosine Base Pairs. Nature, 363, 561-565.
 Ramakrishnan, V. (1997) Histone Structure and the Organization of the Nucleosome. Annual Review of Biophysics and Biomolecular Structure, 26, 83-112.