Wednesday, March 17, 2010

What I learned today

We had an interesting talk today by Antti Niemi from Uppsala University modestly titled "Can Theory of Everything Explain Life?" It was about string theory of a somewhat different kind. The string in this case is a protein and what the theory should explain is its folding. The talk was basically a summary of this paper: "A phenomenological model of protein folding." In a nutshell the idea is to put a U(1) gauge theory on a discretized string (the protein), define a gauge-invariant free energy and minimize it. The claim is that this provides a good match to available data.

I know next to nothing about protein folding, so it's hard for me to tell how good the model is. From the data he showed, I wasn't too impressed that one can fit a scatter plot with two maxima by a function that has 5 free parameters, but then that fit is not in the paper and I didn't quite catch the details. One thing I learned from this talk though is that PDB sometimes doesn't stand for Particle Data Book, but for Protein Data Bank. If you know more about protein folding than I, let me know what you think. I found it quite interesting.

Something else that I learned in the talk is that the DNA of the bacterium Escherichia coli is a closed string rather than an open string (see picture). I think I had heard that before. There are enzymes that act on the DNA, so-called topoisomerases, that don't change the DNA sequence but the topology of the string. In other words, these enzymes can produce knots. Simple knots, but still. I think I had also heard that before. However, I thought the topology-change of the DNA is a process that is useful for the winding/unwinding and reading/reproducing of the DNA. It seems however that the topology of the DNA affects the synthesis of proteins, in particular the folding and function of the proteins. This probably isn't really news for anybody who works in the field but I actually didn't know that the topology of the DNA, not only its sequence, has functional consequences. Alas, that flashed by only briefly and wasn't really content of the talk. But I find it intriguing.

46 comments:

Uncle Al said...

Global parallel minimization of protein conformer energy in vacuum is not the way Nature folds protein. A protein's N-terminus (free amino group) emerges from its ribosome into water, followed by amino acid residue after residue, until its C-terminus (free carboxyl group) emerges - into water.

Serially minimize protein conformer energy, successively introducing each new residue and calculating, in a blob of water.

One approach faster than massively parallel computation that does not work is serial computation that does work.

Bee said...

Hi Uncle,
Yes, the speaker also commented on that question. I believe the answer was roughly that even though in reality it may happen differently, it seems that the result of the proteins' folding can be classified in only some few thousand bases, so there seems to be some universality to the process that a simple model might be able to capture even though the details of the environment may be quite complicated. (I hope I got this roughly right. I think the paper might be more accurate a summary than I.) Best,

B.

stefan said...

Dear Bee,

thanks for sharing, this seems to have been an interesting seminar! I've heard once (cannot remember now where, unfortunately) that the landscape of protein folding is similarly vast as that of vacua in string theory - did he mention something like that in the talk? And I am also a bit confused that the knottedness of DNA influences the folding of the proteins... complicated stuff.

Cheers, Stefan

Aaron Sheldon said...

Hmmm...about the only conclusion I can reach is that this paper shows that it is plausible that the peptide bond angle sits in a quartic potential. The radius of gyration is roughly close.

Overall I have some concerns with assuming U(1) symmetry along the peptide chain axis, when individual amino acids are far from molecularly symmetric along that axis. Or maybe I am misinterpreting that part?

I really wish they would have published a table of the fitted parameters, and some form of physical interpretation of the fits. For example I think the chiral term represents nature's preference for L isomers over D isomers of the amino acids; but it would be better that they say that instead of me guessing it.

Finally, while it is great to propose a general potential, conformation is dependent on the organism. Bacterial ribosomes will not produce functional mammalian enzymes given the same genetic code.

Actually there is other stuff to like. What they have proposed is a class of fast algorithms to study conformation, by randomly sampling torsion and curvature and then inferring structure. The real question will be how well these algorithms can be adapted to more realistic potential energies, like folding in lipid environments versus aqueous environments.

Aaron Sheldon said...

Someone else is hot on their heels, with a similar Metropolis algorithm.

Check out this link in PNAS

http://www.pnas.org/content/107/11/4961.full

Bee said...

Hi Aaron,

Good point. Actually nobody asked how justified the U(1) symmetry is to begin with, so can't say much about it. It seems to me the motivation of the whole approach was somewhat backwards: let's just try this model and see how well it fits with the data. That's generally fine with me except that it's hard for me to tell how well it fits the data. I had the impression there will be another paper coming out, at least there was more in the talk than what is in the papers on the arxiv, so I'll have an eye on that and keep you updated. Best,

B.

Bee said...

Hi Stefan,

Yes, as you can guess that was the starting point of the seminar :-) The protein landscape seems to be "only" about 10^50 though, thus 450 zeros less than the string theory landscape. The amazing thing though is how rapidly the proteins fold into their final shape despite the huge number of possibilities. You'd think they err around in the landscape for quite a bit. (The folding time depends on the environment. In particular, the higher the temperature, the longer the folding takes. There weren't really any exact numbers for that in the talk, but I think the process can take place in μs in reality. How long would a computer need to scan the whole landscape?) I'm also still confused that the DNA shape influences the protein folding. As I said, that was mentioned only very briefly in the introduction though and not actually content of the talk. It doesn't seem entirely implausible to me. Just consider you "read off" the DNA sequence, you might imagine that the local curvature, not only the sequence, does play a role for the result. It's just that I had never heard of that before. I'm still not entirely sure about that. Best,

B.
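
The question in parentheses above can be given a rough number. The following is only a sketch: the landscape size of 10^50 is the figure quoted from the talk, while the scanning rate of 10^18 conformations per second is a made-up, generous assumption (roughly one conformation per operation on an exascale machine), and the function name is purely illustrative.

```python
# Rough estimate of how long a brute-force scan of the protein landscape would take.
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # about 3.16e7 seconds


def scan_time_years(n_conformations, rate_per_second):
    """Time in years to visit every conformation once at a fixed rate."""
    return n_conformations / rate_per_second / SECONDS_PER_YEAR


landscape_size = 1e50  # number of conformations, as quoted from the talk
assumed_rate = 1e18    # ASSUMPTION: conformations checked per second (hypothetical)

years = scan_time_years(landscape_size, assumed_rate)
print(f"{years:.1e} years")
```

Under these assumptions the scan takes of order 10^24 years, which is why an exhaustive search is not a plausible picture of what the protein does in microseconds.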

Steven Colyer said...

More questions than answers, as usual:

1) How can Biology, which grew out of Chemistry, the modern version of which grew out of work by Pauli and Schrödinger especially, tie itself to "String Theory"? "Knot theory" I can understand, especially simple ones. Both LQG and ST use knots, they are not restricted to any one QG theory.

2) Why is Death a good thing? Well, that one I can answer. The reason why death is a good thing is because after I'm dead I won't have to listen to superstring theorists falsely promote their theory as a "Theory of Everything" anymore! Argh.

Seriously, folks, if you want a T.O.E., go for it, but is it too much to ask we not leapfrog over G.U.T., the Grand Unified Theory of connecting the strong force and the electroweak force, first? What is wrong here? Just because Glashow et al. made a serious, but failed attempt, to come up with a GUT doesn't mean we didn't learn anything. At least we learned what doesn't work. Michelson-Morley is the greatest "failed" experiment of all time, but that doesn't mean we learned nothing by it.

3) Is this a good field to get into, especially since it's complicated (as Stefan says) but not complex (and also for economic reasons as politicians, wishing to live longer, are more prone to pour money into chemistry/biology than high energy theoretical physics)? Superstring theory mathematics is complex. The handedness of DNA molecules seems to follow simple rules.

4) Will investigation into this stuff in any way help Physics?

5) Can we work entropy into this somehow? :-)

Phil Warnell said...

Hi Bee,

I find it interesting that you’ve taken us from those sub atomic collisions which resembled frog spawn, to the folding of proteins that comprise the real thing. In each case however what is fundamental to both is the information contained, how that information is organized and reliably duplicated.

To be honest though I don’t know much about protein folding beyond it being one of the key yet still mysterious processes of the many that has life to be life. I do though find a synthesis between the two as it hints that despite all the ways things might organize there are a limited number that can work in a practical (purposeful) sense. However for me in some respects much of this inquiry is carried out like as if we were still sorting through all the possible shapes to consider how something is round, while ignoring it’s the shape that in character conserves what one has to begin with in terms of what is required to be done. That’s to suggest that by looking more to how things are qualified and not simply quantified may lend a better starting point in looking for the answers.

Best,

Phil

Bee said...

Hi Steven,

The remark about string theory was tongue-in-cheek, as they say. The protein is some sort of "string" and if you construct a dynamical theory for that string with a gauge-symmetry on it there aren't many options. It's thus not surprising there's some superficial resemblance to actual string theory as in theory of elementary particles. The idea to go from a point particle to a string is a quite general one, which is why it has applications in other domains than elementary particle physics.

I don't know if it will help physics. One never knows. That's the thing with fundamental research that one doesn't know in advance where it leads. There might be surprising cross-relations (let's take some proteins and have them scan the string landscape ;-)). However, I'm of course biased but while I know many examples of exporting physics models to other fields, the opposite seems to happen rarely, if at all. I would guess though that there is a return path, it's just less documented. Best,

B.

Bee said...

Hi Phil,

Indeed, there's still many mysteries left to solve. In particular I guess the question is how is the information encoded and how does the reading work. Naively you'd think once you have a human genome on a computer and the genome is some sort of "code" why not run the code! Well, one problem is that I suspect our biggest super-computer clusters would be sweating over what Nature does in a fraction of a second (!). But besides that the genome isn't "it." You also need to know what to do with it. How does the "string" become life? The amazing thing about it is that I can just imagine we'll be able to understand that step within a decade or two. Best,

B.

Steven Colyer said...

Hi Bee,

Thanks, good points as usual. I'd like a list if it's not too much trouble of the ways non-Physics fields have helped Physics. It's way the other way around from what I can tell, because of Physics' fundamentality. So it must be a short list. The "Eureka" discovery/inspiration of how Benzene molecules must connect (in a hexagon) for benzene to have the properties it has is the only example I can think of atm off the top of my head.

Hi Phil,

You wrote: However for me in some respects much of this inquiry is carried out like as if we were still sorting through all the possible shapes to consider how something is round, while ignoring it’s the shape that in character conserves what one has to begin with in terms of what is required to be done. That’s to suggest that by looking more to how things are qualified and not simply quantified may lend a better starting point in looking for the answers.

Well said! Geometry over Algebra! Form first, then function. And DNA is the perfect playground for that, from what I can see.

Bee said...

Steven,

Is what I'm saying, if there's a list it's probably short, read: I don't know any examples. Best,

B.

Phil Warnell said...

Hi Bee,

As a follow up to what you just said it had me become reminded of a TED presentation I watched a few years ago, where the folding of DNA rather than proteins was the discussion. It left me mindful it would before be hard to imagine that DNA would become the building blocks for the purposes this speaker has demonstrated and in future may be used. It had me to think about what is the more important as being fundamental to the essence of things, being its substance or form and to come better to understand that either has little meaning at all without first considering the information. So although it is clear with the speaker's origami who was the architect to have them to be as they are and yet still difficult to imagine what might serve to be that for those things that already are, as to come to believe that be nothing at all, as it being without reason.

Best,

Phil

Georg said...

Hello Bee,
You wrote:

"I'm also still confused that the DNA shape influences the protein folding. As I said, that was mentioned only very briefly in the introduction though and not actually content of the talk."

As far as I remember, the two parallel
DNA strands are separated before
a piece of them is transcribed to a messenger-RNA and so on...
So, which "shape" of DNA does
influence the protein shape?
As it is in the gene, or after
split?
Second problem: proteins are very
far from minimum energy. There
are maybe examples for minimum
(I think this is the case
for proteins like silk or wool).
There are e.g. -S-S- bonds
in proteins to prevent that,
the simple fact that one can
denature egg albumin
(an every-morning experience)
shows that this protein at least
is far from minimum.
Regards
Georg

Phil Warnell said...

Hi Steven,

For me it is neither function or form, yet rather what has each to be of meaning demonstrated in purpose as to have them together so able. That as indicated here relates to the information, yet not of the kind thought as entropy which has things move to disorder, yet rather the kind that arranges things to be ordered into the forms with functions that we observe.


Like J.S. Bell used to complain of such conception “information about what”, with me thinking him also wanting to ask information from where and for what. Plato’s simple answer being of course for “the good” and which I would define as what’s found being its quality. Of course this is not an original notion, yet rather one of Robert Pirsig’s whose philosophy I have always found as although reasonably logical, still has one not to understand what is quality as to know it for its reason beyond simply it is. So therefore from here my wondering still continues as do for many I suppose.

Best,

Phil

Bee said...

Hi Georg,

I don't know either. Best,

B.

Aaron Sheldon said...

Okay I have a completely hypothetical model for how DNA topology can affect protein conformation.

After transcription, RNA does not remain as a simple strand, but rather folds back on itself, pairing adjoint bases. As one can imagine there are multiple ways of pairing the adjoint bases, creating different loop structures. The folding could possibly be sensitive to transcription rate, which itself is sensitive to the DNA topology. Once the folded RNA reaches the ribosome it then has to unfold, which could affect the rate of translation, which finally could easily affect the final conformation of the protein, i.e. a peptide chain that is translated quickly will fold differently than a peptide chain that is translated slowly.

But this is purely speculative, and would make for a great mol.bol.phd.

Aaron Sheldon said...

Not a great introduction but a start to mRNA folding

http://nar.oxfordjournals.org/cgi/content/abstract/34/8/2428

http://arjournals.annualreviews.org/doi/abs/10.1146/annurev.biophys.26.1.113

http://www.jbc.org/content/270/36/20871.full

None of these articles mention DNA topology though, bummer.

Bee said...

Mol Bol, very nice :-) would go well with a Dipl Bibl.

Thomas Larsson said...

The standard model of folding of linear polymers is the SAW (self-avoiding random walk). It can be shown to be equivalent to an O(N) spin model in the N -> 0 limit, i.e. an N-component phi^4 theory or, in 2D, a CFT in the c -> 0 limit. Its fractal dimension in d dimensions is well approximated by the Flory formula D = (d+2)/3, exact for d = 1,2,4. The analogous model for branched polymers is called a lattice animal.
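
As a small illustration of the Flory formula quoted above, the sketch below just tabulates D = (d+2)/3 for a few dimensions. The function name is made up for this example; nothing here simulates an actual self-avoiding walk.

```python
from fractions import Fraction


def flory_dimension(d):
    """Flory estimate D = (d + 2) / 3 for the fractal dimension of a SAW in d dimensions."""
    return Fraction(d + 2, 3)


# Tabulate the estimate for d = 1..4, as an exact fraction and as a decimal.
for d in (1, 2, 3, 4):
    D = flory_dimension(d)
    print(f"d = {d}: D = {D} (about {float(D):.3f})")
```

For d = 4 this gives D = 2, the fractal dimension of an ordinary random walk, consistent with self-avoidance no longer mattering there; d = 3 gives 5/3, which per the comment is an approximation rather than an exact value.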

Bee said...

Yes... I believe the speaker said something like that. You know, there's research showing the average person can learn 7 new information items per hour. My 7 items were exhausted after the introduction. Today I learned the following: "Your body produces and destroys 50kg of ATP per day, half its body weight." Okay, so, unless I've exaggerated the chocolate intake lately, 50kg is not half my body weight, but even up to a factor Pi or so the number seems enormous. I have no clue if that's true, the speaker certainly sounded convincing. Best,

B.

Aaron Sheldon said...

The 50 kg/day of ATP processed sounded off, so from:

http://en.wikipedia.org/wiki/Basal_metabolic_rate

and

http://en.wikipedia.org/wiki/Adenosine_triphosphate

We can estimate the human metabolic consumption at 6000kJ/day using the Katch-McArdle formula, and converting calories to joules.

The Gibbs Free energy per mole of ATP of the Krebs Cycle under cellular physiological conditions is approximately 60kJ/mol

And the molar mass of ATP is approximately 500g/mol

Lo and behold this equals approximately 50kg/day of ATP processed. What an increase in entropy just for breathing and thinking!

Took a couple of tries to find the right units; there is a fair amount of confusion in the sports nutrition literature, where the kilo in kilocalories gets dropped quite often.
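
The arithmetic in this estimate can be written out in a few lines. This is only a sketch using the three rounded numbers quoted in the comment; the function and variable names are invented for illustration.

```python
# Back-of-the-envelope check of the daily ATP turnover estimate.
daily_energy_kj = 6000.0       # kJ/day, metabolic consumption as quoted above
energy_per_mol_atp_kj = 60.0   # kJ/mol, free energy per mole of ATP as quoted above
molar_mass_atp_g = 500.0       # g/mol, approximate molar mass of ATP as quoted above


def atp_turnover_kg_per_day(energy_kj, kj_per_mol, g_per_mol):
    """Mass of ATP cycled per day if all of the energy passes through ATP."""
    moles_per_day = energy_kj / kj_per_mol   # 6000 / 60 = 100 mol/day
    return moles_per_day * g_per_mol / 1000.0  # grams converted to kilograms


turnover = atp_turnover_kg_per_day(
    daily_energy_kj, energy_per_mol_atp_kj, molar_mass_atp_g
)
print(f"{turnover:.0f} kg/day")
```

With these inputs the result is 100 mol/day, i.e. 50 kg/day. Dropping the "kilo" in the energy input would shift the answer by a factor of 1000, which is the unit pitfall mentioned above.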

Bee said...

Hi Aaron,

Oh, wow, you beat me to it. I meant to check the number, but couldn't find the time. I suppose though most of the energy goes into maintaining a constant body temperature, not thinking? Best,

B.

Aaron Sheldon said...

It's lunch time in Mountain Standard Time.

Georg said...

Hello Bee,
here is a Wikipedia-Link:
http://en.wikipedia.org/wiki/Chaperone_%28protein%29
I knew about chaperones vaguely, but I did not remember
exactly.
This is a quite sophisticated
folding "apparatus", not easily
compatible with the ideas You heard of in
that seminar.
Regards
Georg

stefan said...

There was an article in SciAm about ATP last December, which I found quite surprising: ATP has many more functions in the body than just in metabolism. But no idea how this may influence the production rate...

Cheers, Stefan

Aaron Sheldon said...

In relation to metabolism there is more or less a constant amount of ATP in the body. The 50kg/day number can literally be translated by molar relations into the number of Krebs cycles processed in the body in a single day. So the number is a measure of metabolic activity, not actual mass output or input.

Plato said...

One of the first and most enduring facts most students learn in biology class is that all living cells use a small molecule called adenosine triphosphate (ATP) as fuel. That universal energy currency drives the biological reactions that allow cells to function and life to flourish—making ATP a crucial player in the biological world. The Double Life of ATP in Humans (bold added for emphasis).

I mean I can see the correlation in relation to false vacuum to the true as a force to be measured in the expression of life( genus figures)?:) Again, microseconds versus seconds and what reductio system has been adopted in the biology?

Thanks Stefan and thanks Bee for the "overlap" on reality. Who knew of the applicability to transcended not only the particularization of life from use of building blocks, but to mastered it as if in some quantum messenger system chemically induced over the organization of the human body and mind defined in the higgs field?:)

I know I used pheromones as an example, but in quantum computation how will this be seen? A leaf is a strange messenger about using the sun as a example.

I mean when you've been trained for the physics side of life and cannot find work, why not go over to "finance on wall street" to waste one's talent with the wholesale destruction of really caring about people.

I mean sure I am having trouble here identifying "quantity versus quality" of life too Phil. If you hold a "mantra intonation valuation" as a focus to resonate in life the message, then, that is what you shall get. Using the body to get that message across, eventually crystallizes into the "message transcribed."

Are we always aware of it?

A soul searching question for sure, and at a point to ask whether it is possible "to really make a difference" in the world with regards to people?

It seems "a hopeless message to me" that we should only care about ourselves? In face of such hopelessness the quality of life, had indeed managed to overlap the messages to the cells that would re-orientate the whole body because of that mind?

Best,

Steven Colyer said...

Have ye ever heard of Panspermia, the idea that life's origin is not only endemic to Earth, but that Earth life started somewhere else, perhaps in another now dead solar system (thanks to novae), and I see no reason to not keep extrapolating backwards to another galaxy as well.

Our solar system is a young-un, a mere 5 billion years old in a Universe 8.7 billion years older.

8.7 billion years is a long enough period for all sorts of wonderful things to have emerged, not just ATP. We are all of us so very damned ignorant, sigh.

Bee said...

Well, yes, there's plenty of science fiction that plays with the idea (you know, at some point we Earthlings discover our true home etc). While it doesn't seem impossible, it also doesn't seem particularly likely. Best,

B.

Plato said...

Only cosmologists like to think inside the box Steven:)

As if out of some Jack Nicholson phrase, "they can't handle the truth?":)

I swear, you're going to get "bumped" out of your orbit some day:)

Best,

Bee said...

Well, ask yourself if it's more promising to try to explain the emergence of life on another planet and then try to get it here.

Steven Colyer said...

And yet Bee, scientists who seek the production of life on Earth (beginning with the simplest amino acids) from given initial conditions of the early Earth a mere half-billion years after formation when it is assumed conditions seem to have been conducive to life-formation (the time itself not being in contention given the fossil record), have run into considerable roadblocks.

How can we explain that? And no it's not as though this field needs time ... it's been around since the 70's if not earlier.

Plato said...

Well in cosmology there is only this universe, and it came from where?

So one has to look for "the proof" somewhere as to the cosmological state of the universe.

I mean that's where Veneziano comes in right?:)

Plato said...

“You don’t see what you’re seeing until you see it,” Dr. Thurston said, “but when you do see it, it lets you see many other things.” Elusive Proof, Elusive Prover: A New Mathematical Mystery


Pre-big-bang scenario: the general picture

See: THE PRE-BIG BANG SCENARIO

Thinking outside of the box should not be warranted with such intonations so as to discourage the mind to consider "other options" that can help move the mind to see in different ways?

Is economics a good choice as to reveal the "seeds of a caring human being" with regard to others?

Best,

Aaron Sheldon said...

After my dozen comment rant on entropy change in the solar photon flux on Cosmic Variance, I can reasonably hypothesize that, given the purpose of life is to maximize the entropy in the solar photon flux, this amounts to keeping the Earth as warm as possible while minimizing the formation of upper atmospheric clouds.

The reason is that the Earth maximally increases the entropy in the solar photon flux when it is a black body at 390K. However if the atmosphere were that hot then it would form high level cloud cover, effectively dropping the Earth's thermal flux to 273K. Life strikes an elegant balance keeping the Earth's thermal flux to around 300K.

Aaron Sheldon said...

Sorry one more comment on the double life of ATP.

Biological systems have a relatively small number of direct chemical reactions available to modulate enzyme function. Basically phosphorylation and methylation. However there are a large number of substrates available to act on: the 5 nucleic acids, and the 23 amino acids.

So for example the nucleic acid adenosine can be recursively phosphorylated to form adenosine mono-phosphate, which is important in signal transduction, such as in axons; adenosine di-phosphate, which is used in photosynthesis; and adenosine tri-phosphate, which is used in, among other things, aerobic metabolism.

Hope this helps.

Plato said...

Hmm..interesting Aaron.

Bee and Stefan have always kept an inducible environment to thought production while keeping science interesting.

I mean if ever we to construct a computer based on quantum chlorophyll correlations of the photosynthetic process closely related to activation of a "neuronal transmitter" might be of interest to me?

Quantum computers injecting a hormonal fluid environment for some "endocrinologist messaging system."

"Conversion of energy" to batteries to sustain transportation?

If you can imagine "a painted house," then you can a car too? Off the grid, without using solar cells? Imagine what you could do to the economy?

Seeing polarizations on a 'geometrisized surface' would seem of relevance to me (layman) when considering the surface localization, as from changing locations of the sun, as to impute the largest factor of energy concentrated, so as to get the greatest energy into that storage?

The skin like(paint) acting as a solar cell? The "body shape" design painted regardless of sun's location for that greatest impute.

"I had an idea" is now out there.

So what do you think?:)

Best,

Aaron Sheldon said...

Lunch break.

hmmm...I don't have that breadth of knowledge plato.

I think given life's preference for metabolic processes at 310K, that this temperature probably represents some optimum balance between high altitude cloud cover and terrestrial thermal luminosity, that maximizes the entropy increase in the incident solar flux. So anything that can keep the Earth between 300-320K is probably going to be favorable.

I'm not sure of any biological processes that could incorporate quantum computing. Off hand the closest thing is the myelinated axons acting like coaxial cables (with a non-vacuum dielectric constant) and the nodes of Ranvier acting like transistor amplifiers. This gives very fast signal transduction over 10s of cm. But this mechanism is purely classical.

Don't quote me on this but I think it has yet to be determined whether the current through ion channels is classical or super-fluid. The balance between ion currents is well described classically by the Nernst equation, but the actual pore conductances I think are not explained as yet.

tspin said...

In general DNA topology does not influence protein folding. There may be some exceptions in bacteria where transcription of DNA into mRNA and translation of mRNA into proteins are closely linked but it's very unlikely in eukaryotes where both processes are decoupled and happen at different times and places.

As for the folding model it seems like a huge simplification of what actually happens so it all boils down to one question - are results good enough to be useful?

I also disagree with the suggestion made by the authors of the paper that "theoretical analysis of protein folding, is maybe the most important problem in molecular biology." It certainly is not, the most important problem is knowing the function of proteins not their structure. Although the structure can help in ruling out certain hypotheses it is often of relatively little help in determining the function as it depends not only on the structure of the protein itself but also on the structures and properties of all the other molecules in the cell.

As for ATP, mentioned in the comments, it serves a great many roles in cells:
energy storage,
a building block of polymers DNA and RNA,
an enzyme cofactor,
a molecular switch/clock (in proteins like actin or tubulin which have different properties depending on whether their bound ATP is intact or hydrolyzed to ADP),
a signaling molecule,
a donor of various groups (mostly PO4 and adenine) in biosynthesis and protein modification,
a precursor of other related compounds like cyclic AMP,
can also serve as a source of carbon, nitrogen and phosphorus.

Plato said...

My last comment on this subject so as to get back to Bee's post.

Just a couple of points to clarify.

Thin-Film Solar with High Efficiency

The idea: A vehicle's surface is never altogether flat in relation to the sun, and as the vehicle moves the sun overhead is always changing angles in respect to the vehicle, so, you have to capture the sun "in spots" most focused in relation to the sun at any one time. Extreme points, or bends, as Surface geometry "to focus" that energy.


An Idea: Percolating to the Surface

It also remains unclear exactly how a plant's structure permits this quantum effect to take place. "[The protein structure] of the plant has to be tuned to allow transfer among chromophores but not to allow transfers into [heat]," Engel says. "How that tuning works and how it is controlled, we don't know." Inside every spring leaf is a system capable of performing a speedy and efficient quantum computation, and therein lies the key to much of the energy on Earth.

Best,

Jérôme CHAUVET said...

Hi everybody,

Regarding protein folding, the situation is even worse if one considers that many proteins of interest have no pre-determined structure (see this). The so-called IUP seems to rapidly shift between different possible stable conformations so as to adapt to several molecular targets. And in cells, as Uncle Al first wrote here, many processes are accompanied by other proteins, which the model should take into account. Proteins undergo many biochemical post-translational modifications in a cell (e.g., phosphorylation), so the amino acid sequence alone may not be a complete explanation of its 3d structure...

DNA folding has an influence on transcription because it regulates accessibility of the DNA strands to RNA polymerase; a DNA knot lessens the possibility for it to reach that part of the strand. The way DNA folds depends on the sequence it is made of (A-T-C-G), as the link A<->T consists of 2 hydrogen bonds while the link C<->G consists of 3, so that a model for DNA folding should probably bear this constraint in itself, which by definition cannot be reduced... Special ATCG motifs could however be considered, together with the limitation it implies, but a universal theory for this issue seems to me quite a big thing to achieve.

Best,