mcmc

`mcmc`

Provide a map distribution using Markov Chain Monte Carlo. It works only in the comparative mapping approach (Section 2.7), for that a reference order dataset has to be loaded and merged with the biological dataset.

Synopsis:

The mcmc command is invoked either as:

mcmc Options
mcmc RandomSeed NbIter Burning

Description:

The mcmc command is Markov Chain Monte Carlo algorithm for estimating the posterior distribution of the marker order. It starts from the best map found in the CarthaGene heap. In order to visit maps, it uses a stochastic-biased complex (2-opt submap reversals, single marker re-insertions, and submap transpositions) operator based on 2-point likelihood genetic / Radiated Hybrid data. The random generator number is initialized with the RandomSeed parameter (should be a strictly positive integer). The number of iterations of MCMC is controlled by the NbIter parameter. At the end, it gives a list of maps with their posterior probability. The first Burning iterations are not used to estimate the map distribution. The maps are inserted in the CarthaGene heap. The map distribution is written to a file names PID.mcmc where PID is the Process ID of the current carthagene session.

The format of the output file is as follows: the first line recalls the starting map, then each line is a map from the distribution, sorted in decreasing (multipoint) posterior probability. For each line, it is indicated (in that order): the posterior probability using the multipoint likelihood, the posterior probability using the 2-pt approximation of the likelihood, the weight measuring the ratio of these two probabilities, the number of breakpoints with the reference order and finally the map given as the positions of the markers relative to the starting map (from 0 to N-1) where N is the number of markers.

Arguments :

Options : -u to obtain the synopsis of the normal use, -h to print a one line description, -H to print a short help.
RandomSeed: the random generator seed.
NbIter: the number of iterations requested.
Burning: the number of iterations (less than NbIter) for the burning period (these iterations are not taken into account in the final map distribution).

Returns:

An output file containing the map distribution.

Example:

   CarthaGene version 1.2-LKH, Copyright (c) 1997-2010 (INRA).

   CarthaGene comes with ABSOLUTELY NO WARRANTY.
   CarthaGene is free software. You are welcome to redistribute it,
   under certain conditions. See the License file for information.

Type 'help' for help.

CG> dsload Data/rh12_true.cg
{1 haploid RH 51 93 /home/tschiex/Dev/carthagene/doc/user/exemple/Data/rh12...
CG> dsload Data/order12_ref.cg
{2 order 51 /home/tschiex/Dev/carthagene/doc/user/exemple/Data/order12_ref.cg}
CG> dsmergor 1 2
{3 merged by order 51 93}
CG>

#select the comparative mapping criterion
CG> dsbplambda 2 1 1
New coefficient factor value is -2.682486.
1.0
CG>

#find a first good map
CG> lkhn 1 -1
[-522.68]
Best map with log10-likelihood = -522.68
TSP: optimum= 517.192000 lowerbound= 517.192000 gap= 0.000000% totaltime= 0.01

Map -1 : log10-likelihood =  -522.68
-------:
 Set : Marker List ...
   1 : UniSTS208788 UniSTS180373 UniSTS160192 UniSTS160197 UniSTS165815 Uni...
   2 : UniSTS208788 UniSTS180373 UniSTS160192 UniSTS160197 UniSTS165815 Uni...
Optimum found, equal to 517192! The minimum 1-tree is a tour.

CG>

#compute map distribution starting from this initial map
#see result in ProcessID.mcmc output file
CG> mcmc 1 1000 500
[-4.9e+02]
key#: 0 2pt_loglike: -488.80
Maximum transposed segment size = 10
[-488.52]
[-487.15]
[-486.90]
Completed.   178 maps visited. time elapsed:9.20 sec.
MCMC done
Gibbs: accepted/proposed = 0.002929
Computing IS weigths:

Done.

totaltime= 9.30 sec.

CG>