Command Line Interface
This page shows the MicroHapulator command line interface: how inputs and settings are specified for each subcommand.
NOTE: The MicroHapulator CLI is under Semantic Versioning. In brief, this means that every stable version of the MicroHapulator software is assigned a version number, and that any changes to the software's behavior or interface require the software version number to be updated in prescribed and predictable ways.
Haplotype calling
mhpl8r type
Perform haplotype calling
usage: mhpl8r type [-h] [-o FILE] [-b B] [-m M] tsv bam
Positional Arguments
- tsv
path of a TSV file containing marker metadata, specifically the offset of each SNP for every marker in the panel
- bam
path of a BAM file containing NGS reads aligned to marker reference sequences and sorted
Named Arguments
- -o, --out
write output to FILE; by default, output is written to the terminal (standard output)
- -b, --base-qual
minimum base quality (PHRED score) to be considered reliable for haplotype calling; by default B=10, corresponding to Q10, i.e., 90% probability that the base call is correct
- -m, --max-depth
maximum permitted read depth; by default M=1000000
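The link between a PHRED score and base-call accuracy described for --base-qual follows the standard definition Q = -10 * log10(p), where p is the probability that the base call is wrong. The Python sketch below only illustrates that conversion; it is not part of MicroHapulator, and the function names are hypothetical.

```python
def phred_to_error_prob(q):
    """Probability that a base call with PHRED score q is incorrect."""
    return 10 ** (-q / 10)


def phred_to_accuracy(q):
    """Probability that a base call with PHRED score q is correct."""
    return 1 - phred_to_error_prob(q)


# Compare the default cutoff (Q10) with two stricter cutoffs.
for q in (10, 20, 30):
    print(f"Q{q}: accuracy = {phred_to_accuracy(q):.3f}")
```

Raising the cutoff from 10 to 20 reduces the tolerated per-base error probability from 1 in 10 to 1 in 100.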
mhpl8r filter
Apply static and/or dynamic thresholds to distinguish true and false haplotypes. Thresholds are applied to the haplotype read counts of a raw typing result. Static integer thresholds are commonly used as detection thresholds, below which any haplotype count is considered noise. Dynamic thresholds are commonly used as analytical thresholds and represent a percentage of the total read count at the marker, after any haplotypes failing a static threshold are discarded.
usage: mhpl8r filter [-h] [-o FILE] [-s ST] [-d DT] [-c FILE] result
Positional Arguments
- result
MicroHapulator typing result in JSON format
Named Arguments
- -o, --out
write output to FILE; by default, output is written to the terminal (standard output)
- -s, --static
global fixed read count threshold
- -d, --dynamic
global percentage of total read count; e.g. use --dynamic=0.02 to apply a 2% analytical threshold
- -c, --config
CSV file specifying marker-specific thresholds to override global thresholds; three required columns: 'Marker' for the marker name; 'Static' and 'Dynamic' for marker-specific thresholds
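The order in which the two thresholds are applied (static first, then dynamic against the remaining total) can be sketched in Python for a single marker. This is an illustration under stated assumptions, not MicroHapulator's implementation: the function name, the haplotype labels, the read counts, and the choice to keep counts that exactly equal a threshold are all hypothetical.

```python
def apply_thresholds(counts, static=None, dynamic=None):
    """Filter haplotype read counts for one marker.

    counts  -- dict mapping haplotype label -> read count
    static  -- fixed read count threshold (detection threshold)
    dynamic -- fraction of the total read count (analytical threshold)
    """
    kept = dict(counts)
    if static is not None:
        # Assumption: a count equal to the threshold is retained.
        kept = {hap: n for hap, n in kept.items() if n >= static}
    if dynamic is not None:
        # The total is computed after the static filter has been applied.
        total = sum(kept.values())
        kept = {hap: n for hap, n in kept.items() if n >= dynamic * total}
    return kept


# Hypothetical raw counts for one marker.
raw = {"A,C,T": 480, "A,T,T": 455, "G,C,T": 12, "A,C,G": 3}
filtered = apply_thresholds(raw, static=5, dynamic=0.02)
print(filtered)
```

In this example the static threshold removes the haplotype with 3 reads, leaving a total of 947 reads; the 2% dynamic threshold (about 18.9 reads) then removes the haplotype with 12 reads.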
Analysis and interpretation
mhpl8r balance
Compute interlocus balance
usage: mhpl8r balance [-h] [-c FILE] [-D] input
Positional Arguments
- input
a typing result including haplotype counts in JSON format
Named Arguments
- -c, --csv
write read counts to FILE in CSV format
- -D, --no-discarded
do not include mapped but discarded reads in read counts; by default, reads that are mapped to the marker but discarded because they do not span all variants at the marker are included
mhpl8r contrib
Estimate the minimum number of DNA contributors to a suspected mixture
usage: mhpl8r contrib [-h] [-o FILE] result
Positional Arguments
- result
typing result in JSON format
Named Arguments
- -o, --out
write output to FILE; by default, output is written to the terminal (standard output)
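A common way to obtain a lower bound on the number of contributors is the maximum allele count rule: a diploid individual carries at most two distinct haplotypes per marker, so a marker showing k distinct haplotypes implies at least ceil(k / 2) contributors. The Python sketch below illustrates that rule only; whether mhpl8r contrib uses exactly this calculation is an assumption, and the profile layout and marker names are hypothetical.

```python
import math


def min_contributors(haplotypes_by_marker):
    """Lower bound on contributors from the marker with the most haplotypes."""
    max_haps = max(
        (len(haps) for haps in haplotypes_by_marker.values()), default=0
    )
    return math.ceil(max_haps / 2)


# Hypothetical mixture profile: marker name -> set of observed haplotypes.
mixture = {
    "mh01": {"A,C", "G,C", "G,T"},
    "mh02": {"T,T", "C,T"},
    "mh03": {"C,A", "C,G", "T,A", "T,G", "T,C"},
}
print(min_contributors(mixture))
```

Here mh03 shows five distinct haplotypes, which cannot be explained by fewer than three diploid contributors.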
mhpl8r prob
Compute a profile random match probability (RMP) or an RMP-based likelihood ratio (LR) test
usage: mhpl8r prob [-h] [-e ε] [-o FILE] freq profile1 [profile2]
Positional Arguments
- freq
population haplotype frequencies in tabular (TSV) format
- profile1
typing result or simulated genotype in JSON format
- profile2
typing result or simulated genotype in JSON format; optional
Named Arguments
- -e, --erate
rate of genotyping error; by default ε=0.001
- -o, --out
write output to FILE; by default, output is written to the terminal (standard output)
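For orientation, the textbook random match probability under Hardy-Weinberg equilibrium multiplies per-marker genotype frequencies: p^2 for a homozygous genotype and 2pq for a heterozygous one. The Python sketch below shows only that textbook calculation. It deliberately omits the genotyping error rate ε and the LR test, it is not taken from MicroHapulator, and the data layout, marker names, and frequencies are hypothetical.

```python
from math import prod


def marker_rmp(genotype, freqs):
    """Genotype frequency at one marker under Hardy-Weinberg equilibrium."""
    hap1, hap2 = genotype
    if hap1 == hap2:
        return freqs[hap1] ** 2            # homozygous: p^2
    return 2 * freqs[hap1] * freqs[hap2]   # heterozygous: 2pq


def profile_rmp(profile, freqs_by_marker):
    """Product of per-marker genotype frequencies (markers assumed independent)."""
    return prod(
        marker_rmp(genotype, freqs_by_marker[marker])
        for marker, genotype in profile.items()
    )


# Hypothetical population frequencies and a two-marker profile.
frequencies = {
    "mh01": {"A,C": 0.5, "G,T": 0.5},
    "mh02": {"T,T": 0.2, "C,T": 0.8},
}
profile = {"mh01": ("A,C", "G,T"), "mh02": ("T,T", "T,T")}
rmp = profile_rmp(profile, frequencies)
print(rmp)
```

The heterozygous marker contributes 2 * 0.5 * 0.5 = 0.5 and the homozygous marker contributes 0.2^2 = 0.04, giving a profile RMP of about 0.02.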
mhpl8r diff
Compare two profiles and determine the markers at which their genotypes differ
usage: mhpl8r diff [-h] [-o FILE] profile1 profile2
Positional Arguments
- profile1
typing result or simulated profile in JSON format
- profile2
typing result or simulated profile in JSON format
Named Arguments
- -o, --out
write output to FILE; by default, output is written to the terminal (standard output)
mhpl8r dist
Compute a simple Hamming distance between two profiles
usage: mhpl8r dist [-h] [-o FILE] profile1 profile2
Positional Arguments
- profile1
typing result or simulated profile in JSON format
- profile2
typing result or simulated profile in JSON format
Named Arguments
- -o, --out
write output to FILE; by default, output is written to the terminal (standard output)
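The idea behind a marker-level comparison of two profiles, and the Hamming distance derived from it, can be sketched in Python as follows. The sketch assumes that the distance is the number of markers at which the two genotypes differ; the profile layout (marker name mapped to a set of haplotypes), the function names, and the data are hypothetical and do not reflect MicroHapulator's JSON format.

```python
def differing_markers(profile1, profile2):
    """Markers at which the two profiles do not carry the same haplotypes."""
    markers = sorted(set(profile1) | set(profile2))
    return [
        m for m in markers
        if profile1.get(m, set()) != profile2.get(m, set())
    ]


def hamming_distance(profile1, profile2):
    """Number of markers at which the two profiles differ."""
    return len(differing_markers(profile1, profile2))


# Hypothetical profiles: marker name -> set of haplotypes.
p1 = {"mh01": {"A,C", "G,T"}, "mh02": {"T,T"}, "mh03": {"C,A", "C,G"}}
p2 = {"mh01": {"A,C", "G,T"}, "mh02": {"T,T", "C,T"}, "mh03": {"C,A", "C,G"}}
print(differing_markers(p1, p2), hamming_distance(p1, p2))
```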
mhpl8r contain
Perform a simple containment test
usage: mhpl8r contain [-h] [-o FILE] profile1 profile2
Positional Arguments
- profile1
simulated or inferred genotype profile in JSON format
- profile2
simulated or inferred genotype profile in JSON format
Named Arguments
- -o, --out
write output to FILE; by default, output is written to the terminal (standard output)
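A simple containment test can be thought of as counting how many haplotypes of one profile are also present in another. The Python sketch below illustrates that idea only. Which of the two command-line profiles plays the role of the query and which the container is not stated here, so the argument order, the function name, the profile layout, and the data are all assumptions.

```python
def containment(query, container):
    """Count the haplotypes of `query` that also occur in `container`.

    Returns (contained, total) so the caller can report a fraction.
    """
    contained = 0
    total = 0
    for marker, haplotypes in query.items():
        present = container.get(marker, set())
        for hap in haplotypes:
            total += 1
            if hap in present:
                contained += 1
    return contained, total


# Hypothetical single-source profile and mixture profile.
person = {"mh01": {"A,C", "G,T"}, "mh02": {"T,T", "C,T"}}
sample = {"mh01": {"A,C", "G,T", "G,C"}, "mh02": {"T,T"}}
contained, total = containment(person, sample)
print(f"{contained} of {total} haplotypes contained")
```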
mhpl8r convert
Convert a typing result to a format compatible with probabilistic genotyping software applications
usage: mhpl8r convert [-h] [-o FILE] [--no-counts] [-f] result sample
Positional Arguments
- result
filtered MicroHapulator typing result in JSON format
- sample
sample name
Named Arguments
- -o, --out
write output to FILE; by default, output is written to the terminal (standard output)
- --no-counts
do not include haplotype counts if you are interpreting your data with a semi-continuous probgen model such as LRMix Studio; by default, haplotype counts are included for interpretation with a fully continuous probgen model such as EuroForMix
- -f, --fix-homo
duplicate a homozygous haplotype so that it is reported twice
Simulation
mhpl8r sim
Simulate a diploid genotype from the specified microhaplotype frequencies
usage: mhpl8r sim [-h] [-s INT] [-o FILE] [--haplo-seq FILE] [--sequences FILE] [--markers FILE] freq
Positional Arguments
- freq
population microhaplotype frequencies in tabular (tab separated) format
Named Arguments
- -s, --seed
seed for random number generator
- -o, --out
write simulated profile data in JSON format to FILE
- --haplo-seq
write simulated haplotype sequences in FASTA format to FILE
- --sequences
microhaplotype sequences in FASTA format; required if --haplo-seq is enabled, ignored otherwise
- --markers
microhaplotype marker definitions in tabular (tab separated) format; required if --haplo-seq is enabled, ignored otherwise
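Simulating a diploid genotype from population frequencies amounts to drawing two haplotypes per marker, each weighted by its frequency, with a seed making the draw reproducible. The Python sketch below illustrates that idea using the standard library; it is not MicroHapulator's implementation, and the function name, data layout, marker names, and frequencies are hypothetical.

```python
import random


def simulate_genotype(freqs_by_marker, seed=None):
    """Draw two haplotypes per marker, weighted by population frequency."""
    rng = random.Random(seed)  # a fixed seed gives a reproducible genotype
    genotype = {}
    for marker, freqs in freqs_by_marker.items():
        haplotypes = list(freqs)
        weights = [freqs[hap] for hap in haplotypes]
        genotype[marker] = tuple(rng.choices(haplotypes, weights=weights, k=2))
    return genotype


# Hypothetical population frequencies: marker -> haplotype -> frequency.
frequencies = {
    "mh01": {"A,C": 0.6, "G,T": 0.3, "G,C": 0.1},
    "mh02": {"T,T": 0.5, "C,T": 0.5},
}
print(simulate_genotype(frequencies, seed=42))
```

Drawing with replacement means the same haplotype can be selected twice, which yields a homozygous genotype at that marker.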
mhpl8r mix
Combine simulated profiles into a mock DNA mixture
usage: mhpl8r mix [-h] [-o FILE] profiles [profiles ...]
Positional Arguments
- profiles
simulated genotype profiles in JSON format
Named Arguments
- -o, --out
write output to FILE; by default, output is written to the terminal (standard output)
mhpl8r unite
Simulate the creation of a new profile from a mother and father
usage: mhpl8r unite [-h] [-o FILE] [-s INT] mom dad
Positional Arguments
- mom
simulated or inferred genotype in JSON format
- dad
simulated or inferred genotype in JSON format
Named Arguments
- -o, --out
write output to FILE; by default, output is written to the terminal (standard output)
- -s, --seed
seed for random number generator
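Creating a child profile from two parents can be pictured as Mendelian inheritance at each marker: the child receives one randomly chosen haplotype from the mother and one from the father. The Python sketch below illustrates that idea only; it is not MicroHapulator's implementation, and the function name, the profile layout (marker name mapped to a pair of haplotypes), and the data are hypothetical.

```python
import random


def unite(mom, dad, seed=None):
    """Build a child genotype from one maternal and one paternal haplotype."""
    rng = random.Random(seed)  # a fixed seed gives a reproducible child
    child = {}
    # Only markers present in both parents are considered in this sketch.
    for marker in sorted(set(mom) & set(dad)):
        maternal = rng.choice(mom[marker])
        paternal = rng.choice(dad[marker])
        child[marker] = (maternal, paternal)
    return child


# Hypothetical parental genotypes: marker -> (haplotype 1, haplotype 2).
mom = {"mh01": ("A,C", "G,T"), "mh02": ("T,T", "T,T")}
dad = {"mh01": ("G,C", "G,T"), "mh02": ("C,T", "T,T")}
print(unite(mom, dad, seed=1))
```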
mhpl8r seq
Simulate paired-end Illumina MiSeq sequencing of the given profile(s)
usage: mhpl8r seq [-h] [-o OUT [OUT ...]] [-n N] [-p P [P ...]] [-s INT [INT ...]] tsv refrseqs profiles [profiles ...]
Positional Arguments
- tsv
microhaplotype marker definitions in tabular (TSV) format
- refrseqs
microhaplotype reference sequences in FASTA format
- profiles
one or more simple or complex profiles (JSON files)
Named Arguments
- -o, --out
write simulated paired-end MiSeq reads in FASTQ format to the specified file(s); if one filename is provided, reads are interleaved and written to the file; if two filenames are provided, reads are written to paired files; by default, reads are interleaved and written to the terminal (standard output)
- -n, --num-reads
number of reads to simulate; default is 500000
- -p, --proportions
simulate a mixture sample with multiple contributors at the specified proportions; by default, even proportions are used
- -s, --seeds
seeds for random number generator, 1 per profile
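One way to picture how --num-reads and --proportions interact is to split a total read budget across contributors according to their proportions, falling back to an even split when none are given. The Python sketch below illustrates that arithmetic only. It assumes that the read count is a total across all contributors, which is not stated here; the function name and the rounding strategy are hypothetical and not taken from MicroHapulator.

```python
def reads_per_contributor(num_reads, num_profiles, proportions=None):
    """Split a total read count across contributors by proportion."""
    if proportions is None:
        # Default: even proportions across all contributors.
        proportions = [1 / num_profiles] * num_profiles
    if len(proportions) != num_profiles:
        raise ValueError("need exactly one proportion per profile")
    total = sum(proportions)
    # Round down, then hand any leftover reads to the first contributor
    # so that the counts always sum to num_reads.
    counts = [int(num_reads * p / total) for p in proportions]
    counts[0] += num_reads - sum(counts)
    return counts


print(reads_per_contributor(500000, 2))
print(reads_per_contributor(500000, 3, proportions=[0.7, 0.2, 0.1]))
```

Normalizing by the sum of the proportions means that values such as 7, 2, 1 and 0.7, 0.2, 0.1 describe the same mixture in this sketch.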