SsDNA Kuhn length: Difference between revisions

From OxDNA
Jump to navigation Jump to search
No edit summary
 
(7 intermediate revisions by one other user not shown)
Line 1: Line 1:
In this example we will compute the kuhn length strand of ssDNA with two
[[Category:Examples]]
In this example we will compute the Kuhn length strand of ssDNA with two
different lengths: 10 and 30 nucleotides. Given the simplicity of the system,
different lengths: 10 and 30 nucleotides. Given the simplicity of the system,
the computation required is of the order on one hour to get very accurate
the computation required is of the order on one hour to get very accurate
Line 5: Line 6:


Since ssDNA has stacked and unstacked regions, in constant evolution, the usual
Since ssDNA has stacked and unstacked regions, in constant evolution, the usual
definition of persistence length as a function of the correleation function
definition of persistence length as a function of the correlation function
between segments is not directly appliable. Instead, we resort (see
between segments is not directly applicable. Instead, we resort (see
[[Publications|Ref. 2]]) to a more general property, which is the Kuhn length defined as:
[[Publications|Ref. 2]]) to a more general property, which is the Kuhn length defined as:


k = <'''L''' * '''L'''> / L_max
k = <'''L''' * '''L'''> / ''L''_max


where '''L''' is the end to end vector of the strand and '''L_max''' is the
where '''L''' is the end to end vector of the strand and ''L''_max is the
maximum lenght of the polymer. Since our ssDNA is not a rigid polymer, we use
maximum length of the polymer. Brackets indicate thermal average. Since our ssDNA is not a rigid polymer, we use
instead the average distance between nucleotides times the number of covalent
as a proxy of the maximum length the average distance between nucleotides times the number of covalent
bonds along the backbone.  Brakets indicate thermal average.
bonds along the backbone.


For this simulations, we will use Brownian Dynamics with the usual values for
For this simulations, we will use Brownian Dynamics with the usual values for
Line 28: Line 29:
respectively.
respectively.


==10-base ssDNA persistence length==
The files relative to this example are in the directory <tt>${oxDNA}/EXAMPLES/SSDNAKUHNLENGTH/</tt>.
 
==10-base ssDNA Kuhn length==
We start by creating a sequence file to
We start by creating a sequence file to
generate the initial confiugration. We just need to put 30 A's in a file which
generate the initial configuration. We just need to put 30 A's in a file which
we'll call <tt>seq.txt</tt>
we'll call <tt>seq.txt</tt>


Line 51: Line 54:


The end-to-end distance of ssDNA, which is the relevant parameter to monitor
The end-to-end distance of ssDNA, which is the relevant parameter to monitor
for measuring persistence length, has very large fluctuations in our model. A
for measuring the Kuhn length, has very large fluctuations in our model. A
large number of values is thus needed to have a reliable average; we store 2000
large number of values is thus needed to have a reliable average; we store 2000
configurations, thus we use a total numeber of steps of 50 000 * 2 000 = 100
configurations, thus we use a total number of steps of 50 000 * 2 000 = 100
000 000.
000 000.
The subdirectory <tt>10B</tt> contains a working sequence and input file ready to be run.


The next step is to generate the trajectory by running the program on the input
The next step is to generate the trajectory by running the program on the input
Line 63: Line 68:
</pre>
</pre>
Let's now discuss the analysis script <tt>sspl.py</tt> that will extract
Let's now discuss the analysis script <tt>sspl.py</tt> that will extract
the persistence length from the trajectory. The core of the program is as
the kuhn length from the trajectory. The core of the program is as
follows:
follows:


Line 113: Line 118:
and ''l_0''. All distances are computed relative to backbone sites, which are
and ''l_0''. All distances are computed relative to backbone sites, which are
0.4 simulation units away from the centres along the principal axis of the
0.4 simulation units away from the centres along the principal axis of the
molecule as described in [[Documentation]]. The script produces a file called <tt>end2end.dat</tt> with the time series of the end-to-end distance, which can be used to check for good statistics.
molecule as described in the [[Documentation#Geometry_of_the_Model|geometry]] of the model. The script produces a file called <tt>end2end.dat</tt> with the time series of the end-to-end distance, which can be used to check for good statistics.


At the end, we use those the averages to compute the Kuhn
At the end, we use those the averages to compute the Kuhn
Line 119: Line 124:


The result coming out should be around 1.9 simulation length units, which
The result coming out should be around 1.9 simulation length units, which
translates to 1.6nm.
translates to 1.6nm (see [[Model_introduction#Simulation_units|units]]).
 


==30-Base ssDNA==
==30-Base ssDNA==
All the steps above can be repeated specifying a longer sequence in the
All the steps above can be repeated specifying a longer sequence in the
<tt>seq.txt</tt> file. The Kuhn lenght should come out as roughly 2.9 simulation units.
<tt>seq.txt</tt> file.  
The subdirectory <tt>30B</tt> contains a working sequence and input file ready to be run. Note that we expect a threefold increase in simulation time due to the system being three times as large. We can use the same python script to analyse the trajectory; the Kuhn length should come out as roughly 2.9 simulation units.

Latest revision as of 17:15, 21 April 2012

In this example we will compute the Kuhn length strand of ssDNA with two different lengths: 10 and 30 nucleotides. Given the simplicity of the system, the computation required is of the order on one hour to get very accurate results.

Since ssDNA has stacked and unstacked regions, in constant evolution, the usual definition of persistence length as a function of the correlation function between segments is not directly applicable. Instead, we resort (see Ref. 2) to a more general property, which is the Kuhn length defined as:

k = <L * L> / L_max

where L is the end to end vector of the strand and L_max is the maximum length of the polymer. Brackets indicate thermal average. Since our ssDNA is not a rigid polymer, we use as a proxy of the maximum length the average distance between nucleotides times the number of covalent bonds along the backbone.

For this simulations, we will use Brownian Dynamics with the usual values for the [[Thermostat|thermostat]. We'll run the simulations at room temperature, and we will use poly-A sequence with the average parametrisation of the model. We use nucleotides that are all the same to avoid the formation of secondary structure that could affect the end-to-end distance. With the sequence-independent parametrisation, a poly-X sequence is for all intents and purposes the same as any sequence that does not form secondary structure.

The directories 30B and 10B inside EXAMPLES/SSDNA/ are ready to run the simulations of the 10-base and 30-base ssDNA, respectively.

The files relative to this example are in the directory ${oxDNA}/EXAMPLES/SSDNAKUHNLENGTH/.

10-base ssDNA Kuhn length

We start by creating a sequence file to generate the initial configuration. We just need to put 30 A's in a file which we'll call seq.txt

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

and then generate the initial configuration with

${OXDNA}/UTILS/generate-sa.py 20. seq.txt

which generates generated.top and generated.dat. We can use any standart input file for dynamics with the temperature set to 23C, which we will call input. We store configurations every 50000 steps, which should produce almost independent values for the end-to-end distance with the standart thermostat settings. Of course this is cannot be known a priori, but we know this since we have ran this simulation before.

The end-to-end distance of ssDNA, which is the relevant parameter to monitor for measuring the Kuhn length, has very large fluctuations in our model. A large number of values is thus needed to have a reliable average; we store 2000 configurations, thus we use a total number of steps of 50 000 * 2 000 = 100 000 000.

The subdirectory 10B contains a working sequence and input file ready to be run.

The next step is to generate the trajectory by running the program on the input file.

${OXDNA}/Release/oxDNA input

Let's now discuss the analysis script sspl.py that will extract the kuhn length from the trajectory. The core of the program is as follows:

line = conffile.readline ()
line = conffile.readline ()
line = conffile.readline ()
line = conffile.readline ()
while line:
    # end 2 end
    r1 = [float(x) for x in line.split()[0:3]]
    a1 = [float(x) for x in line.split()[3:6]]
    line = conffile.readline ()

    r2 = [float(x) for x in line.split()[0:3]]
    a2 = [float(x) for x in line.split()[3:6]]
    
    b1 = [r1[i] - 0.4 * a1[i] for i in xrange(3)]
    b2 = [r2[i] - 0.4 * a2[i] for i in xrange(3)]
    
    for i in xrange (nnucl - 2):
        line = conffile.readline()

    rN = [float(x) for x in line.split()[0:3]]
    aN = [float(x) for x in line.split()[3:6]]
    bN = [rN[i] - 0.4 * aN[i] for i in xrange(3)]
    
    for i in xrange (4):
        line = conffile.readline()

    r1N = [rN[i] - r1[i] for i in xrange(3)]
    r12 = [r2[i] - r1[i] for i in xrange(3)]
    b1N = [bN[i] - b1[i] for i in xrange(3)]
    b12 = [b2[i] - b1[i] for i in xrange(3)]
    
    niter += 1

Ll0 /= float (niter)
L2 /= float (niter)
l0 /= float (niter)

Kl = L2 / ((nnucl -1) * l0)

print "Kuhn length: ", Kl 

We read the position of the first, second and last nucleotide in the system, skipping all other lines, and compute the thermal average of L * L and l_0. All distances are computed relative to backbone sites, which are 0.4 simulation units away from the centres along the principal axis of the molecule as described in the geometry of the model. The script produces a file called end2end.dat with the time series of the end-to-end distance, which can be used to check for good statistics.

At the end, we use those the averages to compute the Kuhn length of the strand.

The result coming out should be around 1.9 simulation length units, which translates to 1.6nm (see units).

30-Base ssDNA

All the steps above can be repeated specifying a longer sequence in the seq.txt file. The subdirectory 30B contains a working sequence and input file ready to be run. Note that we expect a threefold increase in simulation time due to the system being three times as large. We can use the same python script to analyse the trajectory; the Kuhn length should come out as roughly 2.9 simulation units.