Appendix B: BuildVPIndex options
| option | meaning | default value | value in dna/build.sh |
|---|---|---|---|
| -t | data type: one of "protein", "dna", "vector", "image", "ms", "msms" | n/a | dna |
| -d | location of "mobiosData" directory; if directory does not exist, will be created at specified location | current directory | ../ |
| -i | input data file name | n/a | data.fasta |
| -o | output index name | n/a | dna_18_100000 |
| -psm | pivot selection method: random, fft, center, pcaonfft, pca | fft | |
| -p | number of pivots in an index node | 3 | 2 |
| -dpm | data partition method: balanced, clusteringkmeans, clusteringboundary | balanced | |
| -f | fanout of a pivot | 3 | |
| -m | maximum number of children in a leaf node | 100 | |
| -pl | path length | 0 | |
| -g | debug level | 0 | |
| -frag | fragment length, only meaningful for sequences* | n/a | 11 |
| -dim | dimension of vector data to load, only meaningful for vectors | n/a | n/a |
| -b | bucketing (1: will be used) | 0 | 1 |
| -s | size of index (number of data points) | all data points in source file | 100000 |
| -r | maximum radius for partition | 0.1 | n/a |
See mobios-v0.9-examples/dna/build.sh for an example of how to call BuildVPIndex.
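As a rough illustration of how the option flags combine, the sketch below assembles an argument list from several of the values listed for dna/build.sh in the table above. The helper `build_args` is hypothetical and not part of MoBIoS. How BuildVPIndex itself is launched is not shown in this appendix, so the sketch stops at the argument list; the real build.sh may order or quote its arguments differently.

```python
# Option values copied from the "value in dna/build.sh" column of the table.
# Options left at their defaults in build.sh are simply omitted.
DNA_BUILD_OPTIONS = {
    "-t": "dna",            # data type
    "-d": "../",            # location of the mobiosData directory
    "-i": "data.fasta",     # input data file name
    "-o": "dna_18_100000",  # output index name
    "-p": "2",              # number of pivots in an index node
    "-frag": "11",          # fragment length (sequences only)
    "-s": "100000",         # size of index (number of data points)
}


def build_args(options):
    """Flatten a {flag: value} mapping into [flag, value, flag, value, ...].

    Hypothetical helper for illustration only.
    """
    args = []
    for flag, value in options.items():
        args.extend([flag, str(value)])
    return args


args = build_args(DNA_BUILD_OPTIONS)
# The resulting list could be appended to whatever command starts BuildVPIndex.
print(" ".join(args))
```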
Advanced options
It is possible to build more than one index at a time with BuildVPIndex using the following options:
| option | meaning |
|---|---|
| -sm | size of smallest index |
| -la | size of largest index |
| -st | step size of index |
When the advanced options are used, the size of each index is appended to the given index name.
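To make the effect of -sm, -la, and -st concrete, the following sketch lists the index sizes and names one might expect from a given combination. It rests on two assumptions that this appendix does not confirm: that the largest size is included when the step lands on it exactly, and that the size is appended to the name by plain concatenation. The function `planned_indexes` is a hypothetical helper, not part of MoBIoS.

```python
def planned_indexes(name, smallest, largest, step):
    """Return (size, index_name) pairs for smallest, smallest + step, ..., up to largest.

    Assumptions (not confirmed by the documentation):
      * `largest` is inclusive;
      * the size is appended directly to `name`, with no separator added.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    return [(size, f"{name}{size}") for size in range(smallest, largest + 1, step)]


# Corresponds to: -o dna_18_ -sm 100000 -la 300000 -st 100000
for size, index_name in planned_indexes("dna_18_", 100000, 300000, 100000):
    print(size, index_name)
```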
*When building an index over sequences, each sequence is broken up into a set of overlapping fragments, or k-mers. For more information on this methodology, see the paper:
"Using MoBIoS' Scalable Genome Joins to Find Conserved Primer Pair Candidates Between Two Genomes," Weijia Xu, Willard J Briggs, Joanna Padolina, Wenguo Liu, C. Randall Linder, Daniel P. Miranker. ISMB Bioinformatics, 2004.
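The fragmenting idea in the footnote can be illustrated with a minimal sketch. It assumes consecutive fragments are shifted by one position, which is the usual definition of k-mers; the appendix does not state the overlap MoBIoS actually uses, so treat the stride as an assumption. The function `overlapping_fragments` is hypothetical, and its `frag_length` parameter plays the role of the -frag option.

```python
def overlapping_fragments(sequence, frag_length):
    """Split a sequence into overlapping fragments (k-mers) of length frag_length.

    Assumption: each fragment starts one position after the previous one.
    Sequences shorter than frag_length yield no fragments.
    """
    if frag_length <= 0:
        raise ValueError("frag_length must be positive")
    last_start = len(sequence) - frag_length
    return [sequence[i:i + frag_length] for i in range(last_start + 1)]


# A 13-base sequence with fragment length 11 (the -frag value in dna/build.sh)
fragments = overlapping_fragments("ACGTACGTACGTA", 11)
for fragment in fragments:
    print(fragment)
```

A sequence of length n therefore contributes n - frag_length + 1 fragments under this assumption, which is why the number of data points in a sequence index can greatly exceed the number of input sequences.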
