# Documentation

## Compile options

Compiling oxDNA requires that you change the first rows in the makefile to match your machine configuration. The following parameters can be passed to make:

• dbg=1 oxDNA will be compiled with debug flags (both for nvcc and gcc). The resulting executable will be put in the Debug directory.
• g=1 oxDNA will be compiled with both debug and optimization flags. The resulting executable will be put in the Release directory.
• intel=1 oxDNA will be compiled using the Intel icpc compiler. The resulting executable will be named oxDNA_intel.

## Usage

oxDNA input_file

The input file contains all the relevant information for the program to run, such as what initial configuration to use, the topology of the system, how often to print the energies to a file, etc. Please make sure you read the thermostat page if you use molecular dynamics.

## Input file

As always in UNIX environments, everything is case sensitive. Options take the form key = value, with arbitrary spaces allowed before and after both key and value. Lines with a leading # are treated as comments. In the listings below, | (pipe) separates the alternative values that a key accepts. Keys between [ and ] are optional, and the value after the equals sign is the default.
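
The format rules above can be illustrated with a small reader. This is only a sketch of the conventions described here, not the parser used by oxDNA. The function name `parse_input` and the sample values are made up for illustration, and the sketch assumes a comment line may be preceded by whitespace.

```python
def parse_input(text):
    """Parse 'key = value' lines; lines starting with '#' are comments."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments (assumption: leading spaces allowed).
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key = value line: {raw!r}")
        # Spaces around key and value are not significant; case is.
        options[key.strip()] = value.strip()
    return options


# Illustrative values only; the keys are ones documented on this page.
sample = """
# a minimal MD setup
sim_type = MD
backend = CPU
   steps   =   1000000
T = 300 K
"""

options = parse_input(sample)
print(options)
```

Values are kept as strings on purpose, since some of them (the temperature, for instance) carry a unit suffix that needs its own handling.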

### Generic options

The options listed here define the generic behavior of the entire program.

[sim_type=MD]
MD|MC
MD = Molecular Dynamics, MC = Monte Carlo
backend
CPU
backend_precision
float|double
[debug=0]
0|1
1 if you want verbose logs, 0 otherwise.

### Simulation options

The options listed here specify the behaviour of the simulation.

steps
number of steps to be performed.
[restart_step_counter=0]
0|1
0 means that the step counter will start from the value read in the configuration file; if 1, the step counter will be reset to 0. The total duration of the simulation is unchanged.
[seed=time(NULL)]
seed for the random number generator. On Unix systems, it will use by default a number from /dev/urandom + time(NULL)
T
temperature of the simulation. It can be expressed in simulation units or kelvin (append a k or K after the value) or celsius (append a c or C after the value).
Examples:

| Value | Simulation units |
|---|---|
| 0.1 | 0.1 |
| 300 K | 0.1 |
| 300k | 0.1 |
| 26.85c | 0.1 |
| 26.85 C | 0.1 |
verlet_skin
if a particle moves more than verlet_skin, the Verlet lists will be updated. The name is somewhat misleading: the actual Verlet skin is 2*verlet_skin.
[use_average_seq=1]
0|1
specifies whether to use the default hard-coded average parameters for base-pairing and stacking interaction strengths or not. If sequence dependence is to be used, set this to 0 and specify seq_dep_file.
[seq_dep_file]
specifies the file from which the sequence-dependent parameters should be read. Mandatory if use_average_seq=0, ignored otherwise. A sample file is provided (sequence_dependent_parameters.txt).
[external_forces=0]
0|1
specifies whether there are external forces acting on the nucleotides or not. If it is set to 1, then a file which specifies the external forces' configuration has to be provided (see external_forces_file).
[external_forces_file]
specifies the file containing all the external forces' configurations. Currently there are six supported force types (see EXAMPLES/TRAPS for some examples):
• string
• twist
• trap
• repulsion_plane
• repulsion_plane_moving
• mutual_trap
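
The temperature conventions listed under T can be made concrete with a short converter. This is a sketch, not oxDNA code: the function name is hypothetical, and the factor of 3000 K per simulation unit is inferred solely from the examples given for T (300 K corresponds to 0.1).

```python
KELVIN_PER_SIM_UNIT = 3000.0  # inferred from the examples: 300 K -> 0.1
CELSIUS_OFFSET = 273.15


def temperature_to_sim_units(value):
    """Convert '0.1', '300 K', '300k', '26.85c' or '26.85 C' to simulation units."""
    text = value.strip()
    if not text:
        raise ValueError("empty temperature")
    suffix = text[-1].lower()
    if suffix == "k":
        return float(text[:-1]) / KELVIN_PER_SIM_UNIT
    if suffix == "c":
        return (float(text[:-1]) + CELSIUS_OFFSET) / KELVIN_PER_SIM_UNIT
    # No suffix: the value is already in simulation units.
    return float(text)


for example in ("0.1", "300 K", "300k", "26.85c", "26.85 C"):
    print(example, "->", round(temperature_to_sim_units(example), 6))
```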

#### Molecular dynamics simulations options

dt
time step of the integration.
thermostat
no|refresh|john
no means no thermostat will be used. refresh will refresh all the particles' velocities from a Maxwellian every newtonian_steps steps. john is an Andersen-like thermostat (see pt). Make sure you read the thermostat page.
newtonian_steps
required if thermostat != no
number of steps after which a procedure of thermalization will be performed.
pt
used if thermostat == john. It's the probability that a particle's velocity will be refreshed during a thermalization procedure.
diff_coeff
required if pt is not specified
used internally to automatically compute the pt that would be needed if we wanted such a self diffusion coefficient. Not used if pt is set.
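
To illustrate how newtonian_steps and pt interact, here is a rough sketch of one thermalization procedure. It is not the oxDNA implementation. It assumes unit mass and k_B = 1 in simulation units, so each velocity component is drawn from a Gaussian with variance T; it ignores angular velocities; and the function name is hypothetical. With pt = 1 it behaves like the refresh thermostat.

```python
import math
import random


def thermalize(velocities, temperature, pt, rng):
    """Refresh each particle's velocity with probability pt; return how many changed."""
    sigma = math.sqrt(temperature)  # assumption: mass = 1 and k_B = 1
    refreshed = 0
    for i in range(len(velocities)):
        if rng.random() < pt:
            # Draw a new velocity from a Maxwellian at the target temperature.
            velocities[i] = [rng.gauss(0.0, sigma) for _ in range(3)]
            refreshed += 1
    return refreshed


rng = random.Random(0)
velocities = [[0.0, 0.0, 0.0] for _ in range(1000)]
newtonian_steps = 100  # illustrative value
for step in range(1, 1001):
    # ... one integration step of length dt would go here ...
    if step % newtonian_steps == 0:
        count = thermalize(velocities, 0.1, 0.1, rng)
        print(f"step {step}: refreshed {count} velocities")
```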

#### Monte Carlo simulations options

[check_energy_every=10]
this number times print_energy_every gives the number of steps after which the energy will be computed from scratch and checked against the actual value computed adding energy differences.
[check_energy_threshold=1e-4]
if abs((old_energy - new_energy)/old_energy) > check_energy_threshold then the program will die and warn the user.
ensemble
NVT
ensemble of the simulation. More ensembles could be added in future versions.
delta_translation
maximum displacement (per dimension) for translational moves in simulation units.
delta_rotation
maximum displacement for rotational moves in simulation units.
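
The two energy-check options combine as follows. The snippet is a sketch of the rule stated above using the documented defaults; the function name is hypothetical and the energies in the example are made up.

```python
check_energy_every = 10          # documented default
print_energy_every = 1000        # documented default
check_energy_threshold = 1e-4    # documented default

# Number of steps between two from-scratch energy computations.
steps_between_checks = check_energy_every * print_energy_every


def energy_check_fails(old_energy, new_energy, threshold=check_energy_threshold):
    """True if the relative difference exceeds the threshold (old_energy must be non-zero)."""
    return abs((old_energy - new_energy) / old_energy) > threshold


print(steps_between_checks)
print(energy_check_fails(-1.40000, -1.40001))  # relative difference of about 7e-6
print(energy_check_fails(-1.40000, -1.41000))  # relative difference of about 7e-3
```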

### Input/output

The options listed here are used to manage the I/O (read and write configurations, energies and so on)

conf_file
initial configuration file.
topology
file containing the system's topology.
trajectory_file
the main output of the program. All the configurations will be appended to this file as they are printed.
[confs_to_skip=0]
valid only if conf_file is a trajectory. Skip the first confs_to_skip configurations and then load in memory the (confs_to_skip+1)th.
[lastconf_file=last_conf.dat]
the file where the last configuration is saved (when the program finishes or is killed).
[refresh_vel=0]
0|1
if 1 the initial velocities will be refreshed from a maxwellian.
energy_file
energy output file.
[print_energy_every=1000]
this will make the program print the energies every print_energy_every steps.
[no_stdout_energy=0]
0|1
if 1 the energy will be printed just to the energy_file.
[time_scale=linear]
linear|log_lin
with linear, configurations will be saved every print_conf_interval steps.
with log_lin, configurations will be saved at logarithmically spaced times, print_conf_ppc times per cycle. After that the logarithmic sequence will restart.
print_conf_interval
linear interval if time_scale == linear. First step of the logarithmic scale if time_scale == log_lin.
print_conf_ppc
used if time_scale == log_lin
points per logarithmic cycle.
[print_reduced_conf_every=0]
every print_reduced_conf_every steps the program will print out the reduced configurations (i.e. confs containing only the centers of mass of strands).
reduced_conf_output_dir
used if print_reduced_conf_every > 0
output directory for reduced_conf files.
[log_file=stderr]
file where generic and debug information will be logged. If not specified then stderr will be used.
[print_timings=0]
0|1
if 1 the MD step timings will be printed to a file.
timings_filename
used if print_timings == 1
output file where the MD step timing will be appended to.

## Output files

• The log file contains all relevant information about the simulation (specified options, activated external forces, warnings about misconfigurations, critical errors, etc.). If the log file is omitted, this information is written to standard error (see log_file).
• The energy file layout for MD simulations is, one column each,
 time, potential energy, kinetic energy, total energy, hydrogen bonding energy
while for MC simulations it is
 time, potential energy, hydrogen bonding energy, acceptance ratio for translational moves, acceptance ratio for rotational moves
Note that the potential, kinetic and total energies are divided by the number of particles, whereas the hydrogen bonding energy is not.
• Configurations are saved in the trajectory file.
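
A row of the energy file can be read by mapping the columns onto names in the order given above. This is a sketch assuming whitespace-separated numeric columns; the column labels, the function name and the sample numbers are all invented for illustration.

```python
MD_COLUMNS = ("time", "potential", "kinetic", "total", "hydrogen_bonding")
MC_COLUMNS = ("time", "potential", "hydrogen_bonding",
              "acceptance_translation", "acceptance_rotation")


def parse_energy_line(line, columns):
    """Map the whitespace-separated fields of one row onto column names."""
    fields = line.split()
    if len(fields) < len(columns):
        raise ValueError(f"expected {len(columns)} columns, got {len(fields)}")
    return {name: float(field) for name, field in zip(columns, fields)}


# Made-up MD row for a system of 12 nucleotides.
row = parse_energy_line("1000 -1.25 0.25 -1.0 -6.0", MD_COLUMNS)
n_particles = 12

# Potential energy is stored per particle; hydrogen bonding energy is not.
total_potential = row["potential"] * n_particles
print(total_potential, row["hydrogen_bonding"])
```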

## Configuration and topology files

The current state of a system, as used by oxDNA, is described by two files: a configuration file and a topology file. The configuration file contains all the general information (timestep, energy and box size) together with the orientation and position of each nucleotide. The topology file, on the other hand, keeps track of the backbone-backbone bonds between nucleotides in the same strand. Working configuration and topology files can be found in the EXAMPLES directory.

### Configuration file

The first three rows of a configuration file contain the timestep T at which the configuration has been printed, the length of the box sides Lx, Ly and Lz and the total, potential and kinetic energies, Etot, U and K, respectively:

t = T
b = Lx Ly Lz
E = Etot U K


After this header, each row contains the position of the centre of mass, orientation, velocity and angular velocity of a single nucleotide in the following order:

${\displaystyle \overbrace {r_{x}\ r_{y}\ r_{z}} ^{\rm {Position}}\ \overbrace {b_{x}\ b_{y}\ b_{z}} ^{\rm {Backbone-base\ versor}}\ \overbrace {n_{x}\ n_{y}\ n_{z}} ^{\rm {Normal\ versor}}\ \overbrace {v_{x}\ v_{y}\ v_{z}} ^{\rm {Velocity}}\ \overbrace {L_{x}\ L_{y}\ L_{z}} ^{\rm {Angular\ velocity}}}$
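
The layout described above (three header rows followed by fifteen numbers per nucleotide) can be read with a few lines of Python. This is a sketch for a single configuration, not a trajectory reader; the function name, the dictionary keys and the sample values are made up for illustration.

```python
def parse_configuration(text):
    """Parse one configuration: 't =', 'b =', 'E =' headers, then 15 numbers per row."""
    lines = [line for line in text.splitlines() if line.strip()]
    timestep = int(lines[0].split("=")[1])
    box = [float(x) for x in lines[1].split("=")[1].split()]
    etot, potential, kinetic = (float(x) for x in lines[2].split("=")[1].split())
    nucleotides = []
    for line in lines[3:]:
        v = [float(x) for x in line.split()]
        if len(v) != 15:
            raise ValueError(f"expected 15 numbers per nucleotide, got {len(v)}")
        nucleotides.append({
            "position": v[0:3],           # centre of mass
            "backbone_base": v[3:6],      # backbone-base versor
            "normal": v[6:9],             # normal versor
            "velocity": v[9:12],
            "angular_velocity": v[12:15],
        })
    return {"timestep": timestep, "box": box,
            "energies": {"total": etot, "potential": potential, "kinetic": kinetic},
            "nucleotides": nucleotides}


# Illustrative configuration with a single nucleotide at rest.
sample_conf = """t = 0
b = 10.0 10.0 10.0
E = 0.0 0.0 0.0
1.0 2.0 3.0 1 0 0 0 0 1 0 0 0 0 0 0
"""
conf = parse_configuration(sample_conf)
print(conf["box"], conf["nucleotides"][0]["position"])
```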

### Topology file

The topology file stores the intra-strand, fixed bonding topology (i.e. which nucleotides share backbone links). The first row contains the total number of nucleotides N and the number of strands Ns:

N Ns


After this header, the i-th row specifies strand, base and 3' and 5' neighbors of the i-th nucleotide in this way:

S B 3' 5'


where S is the number of the strand (starting from 1) which the nucleotide belongs to, B is the base and 3' and 5' specify the index of the nucleotides with which the i-th nucleotide is bonded in the 3' and 5' direction, respectively. A -1 signals that the nucleotide terminates the strand in either 3' or 5' direction. The topology file of a strand of sequence GCGTTG would be:

6 1
1 G -1 1
1 C 0 2
1 G 1 3
1 T 2 4
1 T 3 5
1 G 4 -1


Specifying the topology in this way can simplify the process of simulating, for example, circular DNA.
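
The format above is simple enough to generate by hand. The sketch below builds the topology text for a list of sequences and reproduces the GCGTTG example. The function name is hypothetical, and the `circular` option is only an assumption about how a ring would be closed (the first nucleotide's 3' neighbour set to the last one and vice versa); check it against a working example before relying on it.

```python
def build_topology(sequences, circular=False):
    """Return topology file text for the given strand sequences."""
    n_total = sum(len(seq) for seq in sequences)
    rows = [f"{n_total} {len(sequences)}"]
    start = 0  # index of the first nucleotide of the current strand
    for strand_id, seq in enumerate(sequences, start=1):
        last = len(seq) - 1
        for pos, base in enumerate(seq):
            index = start + pos
            # 3' neighbour: previous nucleotide, or -1 at the strand end.
            if pos > 0:
                n3 = index - 1
            else:
                n3 = start + last if circular else -1
            # 5' neighbour: next nucleotide, or -1 at the strand end.
            if pos < last:
                n5 = index + 1
            else:
                n5 = start if circular else -1
            rows.append(f"{strand_id} {base} {n3} {n5}")
        start += len(seq)
    return "\n".join(rows)


print(build_topology(["GCGTTG"]))
```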

In order to generate initial configuration and topology files, we provide the ${oxDNA}/UTILS/generate-sa.py script. The usage of the script is

generate-sa.py <box side> <file with sequence>


where <box side> specifies the length of the box side in simulation units and <file with sequence> contains the sequence of the strands to be generated, one row per strand. If double strands are needed, each sequence must be preceded by DOUBLE. For example, a file containing

DOUBLE AGGGCT
CCTGTA


would generate a double strand with a sequence AGGGCT and a single strand with a sequence CCTGTA. Positions and orientations of the strands are all chosen at random in such a way that the resulting initial configuration does not contain significant excluded volume interactions between nucleotides belonging to different strands. Generated single and double strands have helical conformations (i.e. they are in the minimum of the intra-strand interaction energy). The output configuration and topology are stored in generated.dat and generated.top, respectively. Since this script will initialize nucleotides' velocities and angular velocities to 0, remember to put refresh_vel = 1 in the input file when performing molecular (or Brownian) dynamics simulations.

## Analysis of configurations

The configurations produced by oxDNA can be analysed with the output_bonds program in the ${oxDNA}/UTILS/process_data/ directory. This program takes as input the input file (to recover the temperature and topology file), a configuration/trajectory file and an optional number. Since output_bonds analyses a single configuration, the optional number selects which configuration in the trajectory to analyse. Analysing a whole trajectory can be done by looping over a counter.

Please note that output_bonds is not compiled automatically. If you have never compiled it, do so as described in the installation instructions.

output_bonds can be used as follows:

Here $oxDNA is the oxDNA source directory. The xyz representation is written to a file with the same name as the trajectory file, with .xyz appended. Please note that boundary conditions are implemented strand-wise, so strands that are bound might appear at two different sides of the box. Also, the center of mass of the system (where each strand is weighted the same regardless of its length) is set to 0 at each frame. Carbons represent the backbone sites and oxygens the base sites.

The resulting file can be read with a variety of programs. Here we will explain how to visualise it sensibly in VMD.

• Run VMD and load the xyz file.
• In the graphics menu, go to Representations.
• In the Selected Atoms line, input name C. Also select Drawing method CPK, sphere scale 0.8 and Bond Radius 0.
• In the Selected Atoms line, input name O. Also select Drawing method CPK, sphere scale 0.6 and Bond Radius 0.

This should produce a ball representation of our model DNA. Bonds automatically produced by VMD are NOT meaningful in our context.

### pdb format

Run

$oxDNA/UTILS/traj2vis.py pdb <trajectory> <topology>

to produce a trajectory/configuration in the pdb format. A further file called chimera.com will be produced (more on this later). All comments above about periodic boundaries and centre of mass apply here as well.

The pdb file can be visualised in VMD just like the xyz format, but a nicer output can be produced with UCSF Chimera (although only for snapshots at the moment) as follows:

Run Chimera and load the pdb file. An ugly output will be displayed.

Bring up the command line under the Tools → General Controls menu. Input read chimera.com in the command line and press enter. You should get a nicer visualisation with different bases in different colors, all the covalent bonds in the right place, etc.

On large configurations, the production of ellipsoids will be extremely slow. You can remove it by removing the line

aniso scale 0.75 smoothing 4

from the commands file. Loading the resulting file should be much faster.
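
Removing that line can also be scripted. The following is a small sketch: the function names and the output file name in the usage note are hypothetical, and it assumes the line appears in the commands file exactly as shown above.

```python
ANISO_LINE = "aniso scale 0.75 smoothing 4"


def strip_aniso(commands_text):
    """Return the commands with the ellipsoid-producing line removed."""
    kept = [line for line in commands_text.splitlines()
            if line.strip() != ANISO_LINE]
    return "\n".join(kept) + "\n"


def strip_aniso_file(source, destination):
    """Read a commands file and write a copy without the aniso line."""
    with open(source) as handle:
        text = handle.read()
    with open(destination, "w") as handle:
        handle.write(strip_aniso(text))
```

For example, `strip_aniso_file("chimera.com", "chimera_fast.com")` would write a reduced copy that can then be loaded with read chimera_fast.com in the Chimera command line.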

UCSF Chimera can in turn export the scene in a variety of formats.