Command Line Interface¶

This page shows the MicroHapulator command line interface: how inputs and settings are specified for each subcommand.

NOTE: The MicroHapulator CLI is under Semantic Versioning. In brief, this means that every stable version of the MicroHapulator software is assigned a version number, and that any changes to the software's behavior or interface require the software version number to be updated in prescribed and predictable ways.

Haplotype calling¶

`mhpl8r type`¶

Perform haplotype calling

usage: mhpl8r type [-h] [-o FILE] [-b B] [-m M] tsv bam

Positional Arguments¶

tsv: path of a TSV file containing marker metadata, specifically the offset of each SNP for every marker in the panel
bam: path of a sorted BAM file containing NGS reads aligned to marker reference sequences

Named Arguments¶

-o, --out: write output to FILE; by default, output is written to the terminal (standard output)
-b, --base-qual: minimum base quality (PHRED score) to be considered reliable for haplotype calling; by default B=10, corresponding to Q10, i.e., 90% probability that the base call is correct
-m, --max-depth: maximum permitted read depth; by default M=1000000
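
The correspondence between Q10 and 90% accuracy follows from the standard PHRED definition, Q = -10 log10(P_error). The following is a minimal sketch of that conversion; the function name is illustrative and is not part of MicroHapulator.

```python
def phred_to_accuracy(q):
    """Probability that a base call with PHRED score q is correct."""
    # P_error = 10^(-Q/10), so accuracy is its complement.
    return 1.0 - 10 ** (-q / 10)


# Compare a few candidate values for --base-qual.
for q in (10, 20, 30):
    print(q, phred_to_accuracy(q))
```

Raising `--base-qual` from 10 to 20 therefore moves the per-base accuracy requirement from roughly 90% to roughly 99%.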

`mhpl8r filter`¶

Apply static and/or dynamic thresholds to distinguish true and false haplotypes. Thresholds are applied to the haplotype read counts of a raw typing result. Static integer thresholds are commonly used as detection thresholds, below which any haplotype count is considered noise. Dynamic thresholds are commonly used as analytical thresholds and represent a percentage of the total read count at the marker, after any haplotypes failing a static threshold are discarded.
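
The order of operations described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not MicroHapulator's implementation: the function name and the example haplotypes are hypothetical, and a count exactly equal to a threshold is assumed to pass.

```python
def apply_thresholds(counts, static=0, dynamic=0.0):
    """Filter the haplotype read counts observed at a single marker.

    Assumption: a count equal to a threshold is kept.
    """
    # Static (detection) threshold: discard haplotypes with too few reads.
    passed = {hap: n for hap, n in counts.items() if n >= static}
    # Dynamic (analytical) threshold: a fraction of the total read count
    # that remains after the static threshold has been applied.
    total = sum(passed.values())
    return {hap: n for hap, n in passed.items() if n >= dynamic * total}


counts = {"A,C,T": 480, "A,C,G": 500, "G,C,T": 15, "A,T,T": 3}
print(apply_thresholds(counts, static=5, dynamic=0.02))
# {'A,C,T': 480, 'A,C,G': 500}
```

Here the 3-read haplotype fails the static threshold, leaving 995 reads; the 15-read haplotype then falls below 2% of 995 (19.9 reads) and is also discarded.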

usage: mhpl8r filter [-h] [-o FILE] [-s ST] [-d DT] [-c FILE] result

Positional Arguments¶

result: MicroHapulator typing result in JSON format

Named Arguments¶

-o, --out: write output to FILE; by default, output is written to the terminal (standard output)
-s, --static: global fixed read count threshold
-d, --dynamic: global percentage of total read count; e.g. use --dynamic=0.02 to apply a 2% analytical threshold
-c, --config: CSV file specifying marker-specific thresholds to override global thresholds; three required columns: 'Marker' for the marker name; 'Static' and 'Dynamic' for marker-specific thresholds
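
A configuration file with the three required columns could be produced as sketched below. The marker names and threshold values are hypothetical, and expressing the `Dynamic` column as a fraction (as with `--dynamic=0.02`) is an assumption.

```python
import csv
import io

# Hypothetical marker-specific thresholds.
rows = [
    {"Marker": "mhA", "Static": 10, "Dynamic": 0.05},
    {"Marker": "mhB", "Static": 5, "Dynamic": 0.02},
]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["Marker", "Static", "Dynamic"])
writer.writeheader()
writer.writerows(rows)
config_text = buffer.getvalue()
print(config_text)
```

Writing `config_text` to a file and passing that file with `--config` would then override the global `--static` and `--dynamic` values for the listed markers only.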

Analysis and interpretation¶

`mhpl8r balance`¶

Compute interlocus balance

usage: mhpl8r balance [-h] [-c FILE] [-D] input

Positional Arguments¶

input: a typing result including haplotype counts in JSON format

Named Arguments¶

-c, --csv: write read counts to FILE in CSV format
-D, --no-discarded: do not include mapped but discarded reads in read counts; by default, reads that are mapped to the marker but discarded because they do not span all variants at the marker are included

`mhpl8r contrib`¶

Estimate the minimum number of DNA contributors to a suspected mixture

usage: mhpl8r contrib [-h] [-o FILE] result

Positional Arguments¶

result: typing result in JSON format

Named Arguments¶

-o, --out: write output to FILE; by default, output is written to the terminal (standard output)
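
A common lower bound on the number of contributors comes from allele counting: each diploid contributor carries at most two haplotypes per marker, so a marker with k distinct haplotypes implies at least ceil(k / 2) contributors. The sketch below illustrates that rule; whether `mhpl8r contrib` applies exactly this rule is an assumption, and the profile layout (marker name mapped to a list of haplotypes) is hypothetical rather than MicroHapulator's JSON schema.

```python
import math


def min_contributors(profile):
    """Lower bound on contributors from the most diverse marker."""
    max_haps = max(len(set(haps)) for haps in profile.values())
    # Each diploid contributor can account for at most two haplotypes.
    return math.ceil(max_haps / 2)


profile = {
    "mhA": ["A,C", "A,T", "G,T"],
    "mhB": ["C,C", "C,T", "T,T", "T,C", "C,A"],
}
print(min_contributors(profile))
# 3
```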

`mhpl8r prob`¶

Compute a profile random match probability (RMP) or an RMP-based likelihood ratio (LR) test

usage: mhpl8r prob [-h] [-e ε] [-o FILE] freq profile1 [profile2]

Positional Arguments¶

freq: population haplotype frequencies in tabular (TSV) format
profile1: typing result or simulated genotype in JSON format
profile2: typing result or simulated genotype in JSON format; optional

Named Arguments¶

-e, --erate: rate of genotyping error; by default ε=0.001
-o, --out: write output to FILE; by default, output is written to the terminal (standard output)
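
For orientation, the textbook random match probability of a single-source diploid profile is the product, over markers, of p² for a homozygous genotype and 2pq for a heterozygous genotype, assuming Hardy-Weinberg equilibrium and independence between markers. The sketch below implements only that textbook formula. It does not model the `--erate` parameter or the LR test, and the data structures and names are hypothetical rather than MicroHapulator's own.

```python
def random_match_probability(genotype, freqs):
    """Textbook RMP: product of per-marker genotype frequencies."""
    rmp = 1.0
    for marker, (hap1, hap2) in genotype.items():
        p = freqs[marker][hap1]
        q = freqs[marker][hap2]
        # Homozygous: p^2; heterozygous: 2pq.
        rmp *= p * p if hap1 == hap2 else 2 * p * q
    return rmp


# Hypothetical population frequencies and genotype.
freqs = {
    "mhA": {"A,C": 0.2, "G,T": 0.3},
    "mhB": {"C,C": 0.5},
}
genotype = {"mhA": ("A,C", "G,T"), "mhB": ("C,C", "C,C")}
print(random_match_probability(genotype, freqs))
```

In this example the heterozygous marker contributes 2 × 0.2 × 0.3 and the homozygous marker contributes 0.5², so the profile frequency is their product.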

`mhpl8r diff`¶

Compare two profiles and determine the markers at which their genotypes differ

usage: mhpl8r diff [-h] [-o FILE] profile1 profile2

Positional Arguments¶

profile1: typing result or simulated profile in JSON format
profile2: typing result or simulated profile in JSON format

Named Arguments¶

-o, --out: write output to FILE; by default, output is written to the terminal (standard output)
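
Conceptually, the comparison amounts to a per-marker set difference. The following sketch shows the idea on a hypothetical profile layout (marker name mapped to a list of haplotypes); it is not MicroHapulator's implementation and does not reproduce the tool's output format.

```python
def differing_markers(profile1, profile2):
    """Map each differing marker to the haplotypes unique to each profile."""
    diffs = {}
    for marker in sorted(set(profile1) | set(profile2)):
        haps1 = set(profile1.get(marker, ()))
        haps2 = set(profile2.get(marker, ()))
        if haps1 != haps2:
            diffs[marker] = (haps1 - haps2, haps2 - haps1)
    return diffs


p1 = {"mhA": ["A,C", "G,T"], "mhB": ["C,C", "C,T"]}
p2 = {"mhA": ["A,C", "G,T"], "mhB": ["C,C", "T,T"]}
print(differing_markers(p1, p2))
# {'mhB': ({'C,T'}, {'T,T'})}
```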

`mhpl8r dist`¶

Compute a simple Hamming distance between two profiles

usage: mhpl8r dist [-h] [-o FILE] profile1 profile2

Positional Arguments¶

profile1: typing result or simulated profile in JSON format
profile2: typing result or simulated profile in JSON format

Named Arguments¶

-o, --out: write output to FILE; by default, output is written to the terminal (standard output)
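
One plausible reading of a Hamming distance between profiles is the number of markers at which the two sets of haplotypes are not identical. The sketch below assumes that reading and a hypothetical profile layout (marker name mapped to a list of haplotypes); the exact definition used by `mhpl8r dist` may differ.

```python
def hamming_distance(profile1, profile2):
    """Count markers whose haplotype sets differ between two profiles."""
    markers = set(profile1) | set(profile2)
    return sum(
        set(profile1.get(m, ())) != set(profile2.get(m, ())) for m in markers
    )


p1 = {"mhA": ["A,C", "G,T"], "mhB": ["C,C", "C,T"], "mhC": ["T,A", "T,A"]}
p2 = {"mhA": ["A,C", "G,T"], "mhB": ["C,C", "T,T"], "mhC": ["T,A", "G,A"]}
print(hamming_distance(p1, p2))
# 2
```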

`mhpl8r contain`¶

Perform a simple containment test

usage: mhpl8r contain [-h] [-o FILE] profile1 profile2

Positional Arguments¶

profile1: simulated or inferred genotype profile in JSON format
profile2: simulated or inferred genotype profile in JSON format

Named Arguments¶

-o, --out: write output to FILE; by default, output is written to the terminal (standard output)
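
A containment test can be thought of as asking how many of one profile's haplotypes are also present in the other. The sketch below is an illustration only: which of `profile1` and `profile2` plays the role of the container is an assumption, as are the function name and the profile layout (marker name mapped to a list of haplotypes).

```python
def containment(container, query):
    """Return (contained, total) haplotype counts for query vs. container."""
    contained = total = 0
    for marker, haps in query.items():
        present = set(container.get(marker, ()))
        for hap in set(haps):
            total += 1
            contained += hap in present
    return contained, total


mixture = {"mhA": ["A,C", "G,T", "A,T"], "mhB": ["C,C", "C,T"]}
person = {"mhA": ["A,C", "A,T"], "mhB": ["C,C", "T,T"]}
print(containment(mixture, person))
# (3, 4)
```

A result of 3 out of 4 means that one of the person's haplotypes is absent from the mixture.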

`mhpl8r convert`¶

Convert a typing result to a format compatible with probabilistic genotyping software applications

usage: mhpl8r convert [-h] [-o FILE] [--no-counts] [-f] result sample

Positional Arguments¶

result: filtered MicroHapulator typing result in JSON format
sample: sample name

Named Arguments¶

-o, --out: write output to FILE; by default, output is written to the terminal (standard output)
--no-counts: do not include haplotype counts; use this if you are interpreting your data with a semi-continuous probgen model such as LRMix Studio; by default, haplotype counts are included for interpretation with a fully continuous probgen model such as EuroForMix
-f, --fix-homo: duplicate a homozygous haplotype so that it is reported twice
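
The effect of `--fix-homo` on a single marker can be pictured as follows. This is a sketch of the idea only: the function name is hypothetical and the actual output format of `mhpl8r convert` is not reproduced here.

```python
def fix_homozygous(haplotypes):
    """Report a lone (homozygous) haplotype twice; leave others unchanged."""
    haps = list(haplotypes)
    return haps * 2 if len(haps) == 1 else haps


print(fix_homozygous(["A,C"]))         # ['A,C', 'A,C']
print(fix_homozygous(["A,C", "G,T"]))  # ['A,C', 'G,T']
```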

Simulation¶

`mhpl8r sim`¶

Simulate a diploid genotype from the specified microhaplotype frequencies

usage: mhpl8r sim [-h] [-s INT] [-o FILE] [--haplo-seq FILE]
                  [--sequences FILE] [--markers FILE]
                  freq

Positional Arguments¶

freq: population microhaplotype frequencies in tabular (tab separated) format

Named Arguments¶

-s, --seed: seed for random number generator
-o, --out: write simulated profile data in JSON format to FILE
--haplo-seq: write simulated haplotype sequences in FASTA format to FILE
--sequences: microhaplotype sequences in FASTA format; required if --haplo-seq is enabled, ignored otherwise
--markers: microhaplotype marker definitions in tabular (tab separated) format; required if --haplo-seq is enabled, ignored otherwise
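
Simulating a diploid genotype from haplotype frequencies amounts to drawing two haplotypes per marker, each with probability equal to its population frequency. The sketch below shows that idea with the standard library; the frequency table layout and function name are hypothetical, and the sketch is not MicroHapulator's implementation.

```python
import random


def simulate_genotype(freqs, seed=None):
    """Draw two haplotypes per marker, weighted by population frequency."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    genotype = {}
    for marker, hap_freqs in freqs.items():
        haps = list(hap_freqs)
        weights = list(hap_freqs.values())
        genotype[marker] = rng.choices(haps, weights=weights, k=2)
    return genotype


# Hypothetical frequency table.
freqs = {
    "mhA": {"A,C": 0.6, "G,T": 0.3, "A,T": 0.1},
    "mhB": {"C,C": 0.5, "C,T": 0.5},
}
print(simulate_genotype(freqs, seed=42))
```

As with `--seed`, reusing the same seed yields the same simulated genotype.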

`mhpl8r mix`¶

Combine simulated profiles into a mock DNA mixture

usage: mhpl8r mix [-h] [-o FILE] profiles [profiles ...]

Positional Arguments¶

profiles: simulated genotype profiles in JSON format

Named Arguments¶

-o, --out: write output to FILE; by default, output is written to the terminal (standard output)
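
From the perspective of what is observable in a mixture, combining profiles means taking the union of the contributors' haplotypes at each marker. The sketch below shows only that union. It uses a hypothetical profile layout, and the actual output of `mhpl8r mix` may retain additional information, such as which contributor carries which haplotype.

```python
def mix_profiles(*profiles):
    """Union of haplotypes per marker across all contributing profiles."""
    mixture = {}
    for profile in profiles:
        for marker, haps in profile.items():
            mixture.setdefault(marker, set()).update(haps)
    # Sort for a stable, readable result.
    return {marker: sorted(haps) for marker, haps in mixture.items()}


p1 = {"mhA": ["A,C", "G,T"], "mhB": ["C,C", "C,C"]}
p2 = {"mhA": ["A,C", "A,T"], "mhB": ["C,C", "T,T"]}
print(mix_profiles(p1, p2))
# {'mhA': ['A,C', 'A,T', 'G,T'], 'mhB': ['C,C', 'T,T']}
```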

`mhpl8r unite`¶

Simulate the creation of a new profile from a mother and father

usage: mhpl8r unite [-h] [-o FILE] [-s INT] mom dad

Positional Arguments¶

mom: simulated or inferred genotype in JSON format
dad: simulated or inferred genotype in JSON format

Named Arguments¶

-o, --out: write output to FILE; by default, output is written to the terminal (standard output)
-s, --seed: seed for random number generator
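
Under simple Mendelian inheritance, a child receives one haplotype from each parent at every marker, chosen at random. The sketch below illustrates that model; it assumes both parents are typed at the same markers with two haplotypes each, and its names and profile layout are hypothetical rather than MicroHapulator's own.

```python
import random


def unite(mom, dad, seed=None):
    """Build a child genotype: one random haplotype per parent per marker."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    child = {}
    for marker in mom:
        child[marker] = [rng.choice(mom[marker]), rng.choice(dad[marker])]
    return child


mom = {"mhA": ["A,C", "G,T"], "mhB": ["C,C", "C,T"]}
dad = {"mhA": ["A,T", "A,T"], "mhB": ["T,T", "C,C"]}
print(unite(mom, dad, seed=7))
```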

`mhpl8r seq`¶

Simulate paired-end Illumina MiSeq sequencing of the given profile(s)

usage: mhpl8r seq [-h] [-o OUT [OUT ...]] [-n N] [-p P [P ...]]
                  [-s INT [INT ...]]
                  tsv refrseqs profiles [profiles ...]

Positional Arguments¶

tsv: microhaplotype marker definitions in tabular (TSV) format
refrseqs: microhaplotype reference sequences in FASTA format
profiles: one or more simple or complex profiles (JSON files)

Named Arguments¶

-o, --out: write simulated paired-end MiSeq reads in FASTQ format to the specified file(s); if one filename is provided, reads are interleaved and written to the file; if two filenames are provided, reads are written to paired files; by default, reads are interleaved and written to the terminal (standard output)
-n, --num-reads: number of reads to simulate; default is 500000
-p, --proportions: simulate a mixture sample with multiple contributors at the specified proportions; by default, even proportions are used
-s, --seeds: seeds for random number generator, 1 per profile
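
The interaction between `--num-reads` and `--proportions` can be pictured as dividing the read budget among contributors. The sketch below is an illustration under assumptions: the function name is hypothetical, proportions are normalized to sum to 1, and any rounding remainder is assigned to the last contributor. MicroHapulator may handle these details differently.

```python
def reads_per_contributor(num_reads, n_profiles, proportions=None):
    """Split a read budget among contributors by relative proportion."""
    if proportions is None:
        # Default: even proportions.
        proportions = [1 / n_profiles] * n_profiles
    if len(proportions) != n_profiles:
        raise ValueError("need exactly one proportion per profile")
    total = sum(proportions)
    counts = [round(num_reads * p / total) for p in proportions]
    # Assign any rounding remainder to the last contributor.
    counts[-1] = num_reads - sum(counts[:-1])
    return counts


print(reads_per_contributor(500000, 3, [0.5, 0.3, 0.2]))
# [250000, 150000, 100000]
print(reads_per_contributor(500000, 2))
# [250000, 250000]
```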
