================================================================ To download all of the files from one of these admin/exe/ directories, for example admin/exe/linux.x86_64/, use the rsync command to copy them to your current directory: rsync -aP rsync://hgdownload.soe.ucsc.edu/genome/admin/exe/linux.x86_64/ ./ ================================================================ ======== addCols ==================================== ================================================================ ### kent source version 491 ### addCols - Sum columns in a text file. usage: addCols file adds all columns in the given file, outputs the sum of each column. The file can be the name: stdin to accept input from stdin. Options: -maxCols=N - maximum number of columns (defaults to 16) ================================================================ ======== ameme ==================================== ================================================================ ameme - find common patterns in DNA usage: ameme good=goodIn.fa [bad=badIn.fa] [numMotifs=2] [background=m1] [maxOcc=2] [motifOutput=fileName] [html=output.html] [gif=output.gif] [rcToo=on] [controlRun=on] [startScanLimit=20] [outputLogo] [constrainer=1] where goodIn.fa is a multi-sequence fa file containing instances of the motif you want to find, badIn.fa is a file containing similar sequences but lacking the motif, numMotifs is the number of motifs to scan for, background is m0, m1, or m2 for various levels of Markov models, maxOcc is the maximum occurrences of the motif you expect to find in a single sequence and motifOutput is the name of a file to store just the motifs in. rcToo=on searches both strands. If you include controlRun=on in the command line, a random set of sequences will be generated that match your foreground data set in size, and your background data set in nucleotide probabilities. The program will then look for motifs in this random set. If the scores you get in a real run are about the same as those you get in a control run, then the motifs Improbizer has found are probably not significant. ================================================================ ======== autoDtd ==================================== ================================================================ ### kent source version 491 ### autoDtd - Give this an XML document to look at and it will come up with a DTD to describe it. usage: autoDtd in.xml out.dtd out.stats options: -tree=out.tree - Output tag tree. -atree=out.atree - Output attributed tag tree. ================================================================ ======== autoSql ==================================== ================================================================ ### kent source version 491 ### autoSql - create SQL and C code for permanently storing a structure in database and loading it back into memory based on a specification file usage: autoSql specFile outRoot {optional: -dbLink -withNull -json} This will create outRoot.sql outRoot.c and outRoot.h based on the contents of specFile. options: -dbLink - optionally generates code to execute queries and updates of the table. -addBin - Add an initial bin field and index it as (chrom,bin) -withNull - optionally generates code and .sql to enable applications to accept and load data into objects with potential 'missing data' (NULL in SQL) situations. -defaultZeros - will put zero and/or empty string as default value -django - generate method to output object as django model Python code -json - generate method to output the object in JSON (JavaScript) format.
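A minimal sketch of a typical autoSql run (the spec and file names below are hypothetical examples, not part of this distribution). A specification file such as addressBook.as might contain:
table addressBook
"A simple address book entry"
    (
    string name;  "Person's name"
    string email; "Email address"
    uint age;     "Age in years"
    )
Running: autoSql addressBook.as addressBook should then produce addressBook.sql, addressBook.c and addressBook.h as described above.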
================================================================ ======== autoXml ==================================== ================================================================ autoXml - Generate structures code and parser for XML file from DTD-like spec usage: autoXml file.dtdx root This will generate root.c, root.h options: -textField=xxx what to name text between start/end tags. Default 'text' -comment=xxx Comment to appear at top of generated code files -picky Generate parser that rejects stuff it doesn't understand -main Put in a main routine that's a test harness -prefix=xxx Prefix to add to structure names. By default same as root -positive Don't write out optional attributes with negative values ================================================================ ======== ave ==================================== ================================================================ ave - Compute average and basic stats usage: ave file options: -col=N Which column to use. Default 1 -tableOut - output by columns (default output in rows) -noQuartiles - only calculate min,max,mean,standard deviation - for large data sets that will not fit in memory. ================================================================ ======== aveCols ==================================== ================================================================ aveCols - average together columns usage: aveCols file adds all columns (up to 16 columns) in the given file, outputs the average (sum/#ofRows) of each column. The file can be the name: stdin to accept input from stdin. ================================================================ ======== bamToPsl ==================================== ================================================================ ### kent source version 491 ### bamToPsl - Convert a bam file to a psl and optionally also a fasta file that contains the reads. usage: bamToPsl [options] in.bam out.psl options: -fasta=output.fa - output query sequences to specified file -chromAlias=file - specify a two-column file: 1: alias, 2: other name for target name translation from column 1 name to column 2 name names not found are passed through intact -nohead - do not output the PSL header, default has header output -dots=N - output progress dot(.) every N alignments processed Note: a chromAlias file can be obtained from a UCSC database, e.g.: hgsql -N -e 'select alias,chrom from chromAlias;' hg38 > hg38.chromAlias.tab Or from the downloads server: wget https://hgdownload.soe.ucsc.edu/goldenPath/hg38/database/chromAlias.txt.gz See also our tool chromToUcsc ================================================================ ======== barChartMaxLimit ==================================== ================================================================ ================================================================ ======== bedClip ==================================== ================================================================ ### kent source version 491 ### bedClip - Remove lines from bed file that refer to off-chromosome locations. usage: bedClip [options] input.bed chrom.sizes output.bed chrom.sizes is a two-column file/URL: <chromName><TAB><size> If the assembly is hosted by UCSC, chrom.sizes can be a URL like http://hgdownload.soe.ucsc.edu/goldenPath/<db>/bigZips/<db>.chrom.sizes or you may use the script fetchChromSizes to download the chrom.sizes file. If not hosted by UCSC, a chrom.sizes file can be generated by running twoBitInfo on the assembly .2bit file.
options: -truncate - truncate items that span ends of chrom instead of the default of dropping the items -verbose=2 - set to get list of lines clipped and why ================================================================ ======== bedCommonRegions ==================================== ================================================================ ### kent source version 491 ### bedCommonRegions - Create a bed file (just bed3) that contains the regions common to all inputs. Regions are common only if they have exactly the same chromosome, start, and end. Overlap is not enough. Each region must be in each input at most once. Output is stdout. usage: bedCommonRegions file1 file2 file3 ... fileN ================================================================ ======== bedExtendRanges ==================================== ================================================================ ### kent source version 491 ### bedExtendRanges - extend length of entries in bed 6+ data to be at least the given length, taking strand directionality into account. usage: bedExtendRanges database length file(s) options: -host mysql host -user mysql user -password mysql password -tab Separate by tabs rather than space -verbose=N - verbose level for extra information to STDERR example: bedExtendRanges hg18 250 stdin bedExtendRanges -user=genome -host=genome-mysql.soe.ucsc.edu hg18 250 stdin will transform: chr1 500 525 . 100 + chr1 1000 1025 . 100 - to: chr1 500 750 . 100 + chr1 775 1025 . 100 - ================================================================ ======== bedGeneParts ==================================== ================================================================ ### kent source version 491 ### bedGeneParts - Given a bed, spit out promoter, first exon, or all introns. usage: bedGeneParts part in.bed out.bed Where part is either 'exons' or 'firstExon' or 'introns' or 'promoter' or 'firstCodingSplice' or 'secondCodingSplice' options: -proStart=NN - start of promoter relative to txStart, default -100 -proEnd=NN - end of promoter relative to txStart, default 50 ================================================================ ======== bedGraphPack ==================================== ================================================================ ### kent source version 491 ### bedGraphPack v1 - Pack together adjacent records representing same value. usage: bedGraphPack in.bedGraph out.bedGraph The input needs to be sorted by chrom and this is checked. To put in a pipe use stdin and stdout in the command line in place of file names. ================================================================ ======== bedGraphToBigWig ==================================== ================================================================ ### kent source version 491 ### bedGraphToBigWig v 2.10 - Convert a bedGraph file to bigWig format (bbi version: 4). usage: bedGraphToBigWig in.bedGraph chrom.sizes out.bw where in.bedGraph is a four column file in the format: <chrom> <start> <end> <value> and chrom.sizes is a two-column file/URL: <chromName><TAB><size> and out.bw is the output indexed big wig file. If the assembly is hosted by UCSC, chrom.sizes can be a URL like http://hgdownload.soe.ucsc.edu/goldenPath/<db>/bigZips/<db>.chrom.sizes or you may use the script fetchChromSizes to download the chrom.sizes file. If not hosted by UCSC, a chrom.sizes file can be generated by running twoBitInfo on the assembly .2bit file.
The input bedGraph file must be sorted, use the unix sort command: LC_ALL=C sort -k1,1 -k2,2n unsorted.bedGraph > sorted.bedGraph Setting LC_ALL=C makes sort use byte-order (case-sensitive) comparison. options: -blockSize=N - Number of items to bundle in r-tree. Default 256 -itemsPerSlot=N - Number of data points bundled at lowest level. Default 1024 -sizesIsBb -- If set, the chrom.sizes file is assumed to be a bigBed file. -unc - If set, do not use compression. ================================================================ ======== bedJoinTabOffset ==================================== ================================================================ bedJoinTabOffset - Add the file offset and line length of the line in a text file whose key matches the BED name to each row of the BED. usage: bedJoinTabOffset inTabFile inBedFile outBedFile Given a bed file and tab file where each have a column with matching values: 1. first get the value of column0, the offset and line length from inTabFile. 2. Then go over the bed file, use the -bedKey (defaults to the name field) field and append its offset and length to the bed file as two separate fields. Write the new bed file to outBed. options: -bedKey=integer 0-based index key of the bed file to use to match up with the tab file. Default is 3 for the name field. ================================================================ ======== bedJoinTabOffset.py ==================================== ================================================================ ================================================================ ======== bedMergeAdjacent ==================================== ================================================================ ### kent source version 491 ### bedMergeAdjacent - merge adjacent blocks in a BED 12 usage: bedMergeAdjacent inBed outBed options: ================================================================ ======== bedPartition ==================================== ================================================================ ### kent source version 491 ### bedPartition - split BED ranges into non-overlapping ranges usage: bedPartition [options] bedFile rangesBed Split ranges in a BED into non-overlapping sets for use in cluster jobs. Output is a BED 4 of the ranges and a generated name. The bedFile may be compressed and no ordering is assumed. options: -verbose=1 - print statistics if >= 1 -minPartitionItems=0 - minimum number of input items in a partition. Partitions with fewer items will be merged into subsequent partitions -partMergeDist=0 - will combine adjacent non-overlapping partitions that are separated by no more than this number of bases. -partMergeSize is an obsolete name for this option. -parallel=n - use this many cores for parallel sorting notes: - The generated name is useful for identifying the partition ================================================================ ======== bedPileUps ==================================== ================================================================ ### kent source version 491 ### bedPileUps - Find (exact) overlaps if any in bed input usage: bedPileUps in.bed Where in.bed is in one of the ascii bed formats.
The in.bed file must be sorted by chromosome,start. To sort a bed file, use the unix sort command: sort -k1,1 -k2,2n unsorted.bed > sorted.bed Options: -name - include BED name field 4 when evaluating uniqueness -tab - use tabs to parse fields -verbose=2 - show the location and size of each pileUp ================================================================ ======== bedRemoveOverlap ==================================== ================================================================ ### kent source version 491 ### bedRemoveOverlap - Remove overlapping records from a (sorted) bed file. Gets rid of the smaller of overlapping records. usage: bedRemoveOverlap in.bed out.bed options: -xxx=XXX ================================================================ ======== bedRestrictToPositions ==================================== ================================================================ ### kent source version 491 ### bedRestrictToPositions - Filter bed file, restricting to only ones that match chrom/start/ends specified in restrict.bed file. usage: bedRestrictToPositions in.bed restrict.bed out.bed options: -xxx=XXX ================================================================ ======== bedSort ==================================== ================================================================ bedSort - Sort a .bed file by chrom,chromStart usage: bedSort in.bed out.bed in.bed and out.bed may be the same. ================================================================ ======== bedToBigBed ==================================== ================================================================ ### kent source version 491 ### bedToBigBed v. 2.10 - Convert bed file to bigBed. (bbi version: 4) usage: bedToBigBed in.bed chrom.sizes out.bb Where in.bed is in one of the ascii bed formats, but not including track lines and chrom.sizes is a two-column file/URL: <chromName><TAB><size> and out.bb is the output indexed big bed file. If the assembly is hosted by UCSC, chrom.sizes can be a URL like http://hgdownload.soe.ucsc.edu/goldenPath/<db>/bigZips/<db>.chrom.sizes or you may use the script fetchChromSizes to download the chrom.sizes file. If you have bed annotations on patch sequences from NCBI, a more inclusive chrom.sizes file can be found using a URL like http://hgdownload.soe.ucsc.edu/goldenPath/<db>/database/chromInfo.txt.gz If not hosted by UCSC, a chrom.sizes file can be generated by running twoBitInfo on the assembly .2bit file, or the 2bit file can be used directly if the -sizesIs2Bit option is specified. The chrom.sizes file may also be a chromAlias bigBed file, or a URL to such a file, by specifying the -sizesIsChromAliasBb option. When using a chromAlias bigBed file, the input BED file may have chromosome names matching any of the sequence name aliases in the chromAlias file. For UCSC provided genomes, the chromAlias files can be found under: https://hgdownload.soe.ucsc.edu/goldenPath/<db>/bigZips/<db>.chromAlias.bb For UCSC GenArk assembly hubs, the chrom aliases are named in the form: https://hgdownload.soe.ucsc.edu/hubs/GCF/006/542/625/GCF_006542625.1/GCF_006542625.1.chromAlias.bb For a description of generating chromAlias files for your own assembly hub, see: http://genomewiki.ucsc.edu/index.php/Chrom_Alias Without the -sort option, the in.bed file must be sorted by the chromosome and start fields.
To sort a BED file, you can use bedSort or the following Unix command: sort -k1,1 -k2,2n unsorted.bed > sorted.bed Sequences must be sorted by name so all sequences with the same name are collected together, but they don't need to be in any particular order. options: -type=bedN[+[P]] : N is between 3 and 15, optional (+) if extra "bedPlus" fields, optional P specifies the number of extra fields. Not required, but preferred. Examples: -type=bed6 or -type=bed6+ or -type=bed6+3 (see http://genome.ucsc.edu/FAQ/FAQformat.html#format1) -as=fields.as - If you have non-standard "bedPlus" fields, it's great to put a definition of each field in a row in AutoSql format here. -blockSize=N - Number of items to bundle in r-tree. Default 256 -itemsPerSlot=N - Number of data points bundled at lowest level. Default 512 -unc - If set, do not use compression. -tab - If set, expect fields to be tab separated, normally expects white space separator. -extraIndex=fieldList - If set, make an index on each field in a comma separated list extraIndex=name and extraIndex=name,id are commonly used. -sizesIs2Bit -- If set, the chrom.sizes file is assumed to be a 2bit file. -sizesIsChromAliasBb -- If set, then chrom.sizes file is assumed to be a chromAlias bigBed file or a URL to a such a file (see above). -sizesIsBb -- Obsolete name for -sizesIsChromAliasBb. -udcDir=/path/to/udcCacheDir -- sets the UDC cache dir for caching of remote files. -allow1bpOverlap -- allow exons to overlap by at most one base pair -fixScores -- change non-integer scores to 0 and force integer scores into the range 0..1000 -maxAlloc=N -- Set the maximum memory allocation size to N bytes -sort -- sort the input file ================================================================ ======== bedToPsl ==================================== ================================================================ ### kent source version 491 ### bedToPsl - convert bed format files to psl format usage: bedToPsl [options] chromSizes bedFile pslFile Convert a BED file to a PSL file. This the result is an alignment. It is intended to allow processing by tools that operate on PSL. If the BED has at least 12 columns, then a PSL with blocks is created. Otherwise single-exon PSLs are created. Options: -tabs - use tab as a separator -keepQuery - instead of creating a fake query, create PSL with identical query and target specs. Useful if bed features are to be lifted with pslMap and one wants to keep the source location in the lift result. ================================================================ ======== bedWeedOverlapping ==================================== ================================================================ ### kent source version 491 ### bedWeedOverlapping - Filter out beds that overlap a 'weed.bed' file. usage: bedWeedOverlapping weeds.bed input.bed output.bed options: -maxOverlap=0.N - maximum overlapping ratio, default 0 (any overlap) -invert - keep the overlapping and get rid of everything else ================================================================ ======== bigBedInfo ==================================== ================================================================ ### kent source version 491 ### bigBedInfo - Show information about a bigBed file. 
usage: bigBedInfo file.bb options: -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs -chroms - list all chromosomes and their sizes -zooms - list all zoom levels and their sizes -as - get autoSql spec -asOut - output only autoSql spec -extraIndex - list all the extra indexes ================================================================ ======== bigBedNamedItems ==================================== ================================================================ ### kent source version 491 ### bigBedNamedItems - Extract item of given name from bigBed usage: bigBedNamedItems file.bb name output.bed options: -nameFile - if set, treat name parameter as file full of space delimited names -field=fieldName - use index on field name, default is "name" -header - output a autoSql-style header (starts with '#'). ================================================================ ======== bigBedSummary ==================================== ================================================================ ### kent source version 491 ### bigBedSummary - Extract summary information from a bigBed file. usage: bigBedSummary file.bb chrom start end dataPoints Get summary data from bigBed for indicated region, broken into dataPoints equal parts. (Use dataPoints=1 for simple summary.) options: -type=X where X is one of: coverage - % of region that is covered (default) mean - average depth of covered regions min - minimum depth of covered regions max - maximum depth of covered regions -fields - print out information on fields in file. If fields option is used, the chrom, start, end, dataPoints parameters may be omitted -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs ================================================================ ======== bigBedToBed ==================================== ================================================================ ### kent source version 491 ### bigBedToBed v1 - Convert from bigBed to ascii bed format. usage: bigBedToBed input.bb output.bed options: -chrom=chr1 - if set restrict output to given chromosome -start=N - if set, restrict output to only that over start -end=N - if set, restrict output to only that under end -bed=in.bed - restrict output to all regions in a BED file -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs -header - output a autoSql-style header (starts with '#'). -tsv - output a TSV header (without '#'). 
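A short sketch of how the bigBed query tools above are commonly combined; the file name, item name and region below are hypothetical: bigBedInfo -chroms annotations.bb bigBedNamedItems annotations.bb myGene oneItem.bed bigBedSummary -type=coverage annotations.bb chr21 33031597 33041570 10 bigBedToBed -chrom=chr21 -start=33031597 -end=33041570 annotations.bb region.bed The first command lists chromosomes and sizes, the second pulls a single named item, the third reports coverage over ten windows, and the last extracts all records in the region as plain bed.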
================================================================ ======== bigChainBreaks ==================================== ================================================================ ### kent source version 491 ### bigChainBreaks - output a set of rearrangement breakpoints usage: bigChainBreaks bigChain.bb label breaks.txt options: -xxx=XXX ================================================================ ======== bigChainToChain ==================================== ================================================================ ### kent source version 491 ### bigChainToChain - convert bigChain files back into a chain file usage: bigChainToChain bigChain.bb bigLinks.bb output.chain options: -xxx=XXX ================================================================ ======== bigGenePredToGenePred ==================================== ================================================================ ### kent source version 491 ### bigGenePredToGenePred - convert bigGenePred file to genePred. usage: bigGenePredToGenePred bigGenePred.bb genePred.gp ================================================================ ======== bigGuessDb ==================================== ================================================================ Usage: bigGuessDb [options] inFile - given a bigBed or bigWig file or URL, guess the assembly based on the chrom names and sizes. Must have bigBedInfo and bigWigInfo in PATH. Also requires a bigGuessDb.txt.gz, an alpha version of which can be downloaded at https://hgwdev.gi.ucsc.edu/~max/bigGuessDb/bigGuessDb.txt.gz Example run: $ wget https://hgwdev.gi.ucsc.edu/~max/bigGuessDb/bigGuessDb.txt.gz $ bigGuessDb --best https://ftp.ncbi.nlm.nih.gov/geo/samples/GSM1014nnn/GSM1014177/suppl/GSM1014177_mm9_wgEncodeUwDnaseNih3t3NihsMImmortalSigRep2.bigWig mm9 ================================================================ ======== bigHeat ==================================== ================================================================ Usage: bigHeat [options] locationBed locationMatrixFnames chromSizes outDir - create one feature per matrix column: duplicate the BED features and color them by the values in locationMatrix. Creates new bigBed files in outDir and creates a basic trackDb.ra file there. BED file looks like this: chr1 1 1000 myGene 0 + 1 1000 0,0,0 chr2 1 1000 myGene2 0 + 1 1000 0,0,0 locationMatrix looks like this: gene sample1 sample2 sample3 myGene 1 2 3 myGene2 0.1 3 10 myGene2_probe2 0.1 3 10 This will create a composite with three subtracks (sample1, sample2, sample3). Each subtrack will have myGene, colored in intensity with sample3 more intense than sample1 and sample2. Same for myGene2.
It can also add a bigWig with a summary of all these values, one per nucleotide. ================================================================ ======== bigMafToMaf ==================================== ================================================================ ### kent source version 491 ### bigMafToMaf - convert bigMaf to maf file usage: bigMafToMaf bigMaf.bb file.maf options: ================================================================ ======== bigPslToPsl ==================================== ================================================================ ### kent source version 491 ### bigPslToPsl - convert bigPsl file to psl usage: bigPslToPsl bigPsl.bb output.psl options: -collapseStrand if target strand is '+', don't output it ================================================================ ======== bigWigAverageOverBed ==================================== ================================================================ ### kent source version 491 ### bigWigAverageOverBed v2 - Compute average score of big wig over each bed, which may have introns. usage: bigWigAverageOverBed in.bw in.bed out.tab The output columns are: name - name field from bed, which should be unique size - size of bed (sum of exon sizes) covered - # bases within exons covered by bigWig sum - sum of values over all bases covered mean0 - average over bases with non-covered bases counting as zeroes mean - average over just covered bases Options: -stats=stats.ra - Output a collection of overall statistics to stat.ra file -bedOut=out.bed - Make output bed that is echo of input bed but with mean column appended -sampleAroundCenter=N - Take sample at region N bases wide centered around bed item, rather than the usual sample in the bed item. -minMax - include two additional columns containing the min and max observed in the area. -tsv - include a TSV header for input to other tools. ================================================================ ======== bigWigCat ==================================== ================================================================ ### kent source version 491 ### bigWigCat v 4 - merge non-overlapping bigWig files directly into bigWig format usage: bigWigCat out.bw in1.bw in2.bw ... Where in*.bw is in big wig format and out.bw is the output indexed big wig file. options: -itemsPerSlot=N - Number of data points bundled at lowest level. Default 1024 Note: must use wigToBigWig -fixedSummaries -keepAllChromosomes (perhaps in parallel cluster jobs) to create the input files. Note: By non-overlapping we mean the entire span of each file, from first data point to last data point, must not overlap with that of other files.
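To illustrate the bigWigCat notes above, a sketch of the intended workflow (chromosome and file names are hypothetical): wigToBigWig -fixedSummaries -keepAllChromosomes chr1.wig hg38.chrom.sizes chr1.bw wigToBigWig -fixedSummaries -keepAllChromosomes chr2.wig hg38.chrom.sizes chr2.bw bigWigCat all.bw chr1.bw chr2.bw Each input bigWig spans a non-overlapping region (here, separate chromosomes), which is the condition bigWigCat requires.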
================================================================ ======== bigWigCluster ==================================== ================================================================ ### kent source version 491 ### bigWigCluster - Cluster bigWigs using a hacTree usage: bigWigCluster input.list chrom.sizes output.json output.tab where: input.list is a list of bigWig file names chrom.sizes is a tab-separated chromosome sizes file for the assembly the bigWigs are on output.json is json formatted output suitable for graphing with D3 output.tab is a tab-separated file of items ordered by tree with the fields label - label from -labels option or from file name with no dir or extension pos - number from 0-1 representing position according to tree and distance red - number from 0-255 representing recommended red component of color green - number from 0-255 representing recommended green component of color blue - number from 0-255 representing recommended blue component of color path - file name from input.list including directory and extension options: -labels=fileName - label files from tabSeparated file with fields path - path to bigWig file label - a string with no tabs -precalc=precalc.tab - tab separated file with columns. -threads=N - number of threads to use, default 10 -tmpDir=/tmp/path - place to put temp files, default current dir ================================================================ ======== bigWigCorrelate ==================================== ================================================================ ### kent source version 491 ### bigWigCorrelate - Correlate bigWig files, optionally only on target regions. usage: bigWigCorrelate a.bigWig b.bigWig or bigWigCorrelate listOfFiles options: -restrict=restrict.bigBed - restrict correlation to parts covered by this file -threshold=N.N - clip values to this threshold -rootNames - if set just report the root (minus directory and suffix) of file names when using listOfFiles -ignoreMissing - if set do not correlate where either side is missing data Normally missing data is treated as zeros ================================================================ ======== bigWigInfo ==================================== ================================================================ ### kent source version 491 ### bigWigInfo - Print out information about bigWig file. usage: bigWigInfo file.bw options: -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs -chroms - list all chromosomes and their sizes -zooms - list all zoom levels and their sizes -minMax - list the min and max on a single line ================================================================ ======== bigWigMerge ==================================== ================================================================ ### kent source version 491 ### bigWigMerge v2 - Merge together multiple bigWigs into a single output bedGraph. You'll have to run bedGraphToBigWig to make the output bigWig. The signal values are just added together to merge them usage: bigWigMerge in1.bw in2.bw .. inN.bw out.bedGraph options: -threshold=0.N - don't output values at or below this threshold.
Default is 0.0 -adjust=0.N - add adjustment to each value -clip=NNN.N - values higher than this are clipped to this value -inList - input files are lists of file names of bigWigs -max - merged value is maximum from input files rather than sum ================================================================ ======== bigWigSummary ==================================== ================================================================ ### kent source version 491 ### bigWigSummary - Extract summary information from a bigWig file. usage: bigWigSummary file.bigWig chrom start end dataPoints Get summary data from bigWig for indicated region, broken into dataPoints equal parts. (Use dataPoints=1 for simple summary.) NOTE: start and end coordinates are in BED format (0-based) options: -type=X where X is one of: mean - average value in region (default) min - minimum value in region max - maximum value in region std - standard deviation in region coverage - % of region that is covered -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs ================================================================ ======== bigWigToBedGraph ==================================== ================================================================ ### kent source version 491 ### bigWigToBedGraph - Convert from bigWig to bedGraph format. usage: bigWigToBedGraph in.bigWig out.bedGraph options: -chrom=chr1 - if set restrict output to given chromosome -start=N - if set, restrict output to only that over start -end=N - if set, restrict output to only that under end -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs ================================================================ ======== bigWigToWig ==================================== ================================================================ ### kent source version 491 ### bigWigToWig - Convert bigWig to wig. This will keep more of the same structure of the original wig than bigWigToBedGraph does, but still will break up large stepped sections into smaller ones. usage: bigWigToWig in.bigWig out.wig options: -chrom=chr1 - if set restrict output to given chromosome -start=N - if set, restrict output to only that over start -end=N - if set, restrict output to only that under end -bed=input.bed Extract values for all ranges specified by input.bed. If bed4, will also print the bed name. -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs ================================================================ ======== binFromRange ==================================== ================================================================ ### kent source version 491 ### binFromRange - Translate a 0-based half open start and end into a bin range sql expression. usage: binFromRange start end ================================================================ ======== blat ==================================== ================================================================ ### kent source version 491 ### blat - Standalone BLAT v. 39x1 fast sequence search command line tool usage: blat database query [-ooc=11.ooc] output.psl where: database and query are each either a .fa, .nib or .2bit file, or a list of these files with one file name per line. -ooc=11.ooc tells the program to load over-occurring 11-mers from an external file. This will increase the speed by a factor of 40 in many cases, but is not required. output.psl is the name of the output file.
Subranges of .nib and .2bit files may be specified using the syntax: /path/file.nib:seqid:start-end or /path/file.2bit:seqid:start-end or /path/file.nib:start-end With the second form, a sequence id of file:start-end will be used. options: -t=type Database type. Type is one of: dna - DNA sequence prot - protein sequence dnax - DNA sequence translated in six frames to protein The default is dna. -q=type Query type. Type is one of: dna - DNA sequence rna - RNA sequence prot - protein sequence dnax - DNA sequence translated in six frames to protein rnax - DNA sequence translated in three frames to protein The default is dna. -prot Synonymous with -t=prot -q=prot. -ooc=N.ooc Use overused tile file N.ooc. N should correspond to the tileSize. -tileSize=N Sets the size of match that triggers an alignment. Usually between 8 and 12. Default is 11 for DNA and 5 for protein. -stepSize=N Spacing between tiles. Default is tileSize. -oneOff=N If set to 1, this allows one mismatch in tile and still triggers an alignment. Default is 0. -minMatch=N Sets the number of tile matches. Usually set from 2 to 4. Default is 2 for nucleotide, 1 for protein. -minScore=N Sets minimum score. This is the matches minus the mismatches minus some sort of gap penalty. Default is 30. -minIdentity=N Sets minimum sequence identity (in percent). Default is 90 for nucleotide searches, 25 for protein or translated protein searches. -maxGap=N Sets the size of maximum gap between tiles in a clump. Usually set from 0 to 3. Default is 2. Only relevant for minMatch > 1. -noHead Suppresses .psl header (so it's just a tab-separated file). -makeOoc=N.ooc Make overused tile file. Target needs to be complete genome. -repMatch=N Sets the number of repetitions of a tile allowed before it is marked as overused. Typically this is 256 for tileSize 12, 1024 for tile size 11, 4096 for tile size 10. Default is 1024. Typically comes into play only with makeOoc. Also affected by stepSize: when stepSize is halved, repMatch is doubled to compensate. -noSimpRepMask Suppresses simple repeat masking. -mask=type Mask out repeats. Alignments won't be started in masked region but may extend through it in nucleotide searches. Masked areas are ignored entirely in protein or translated searches. Types are: lower - mask out lower-cased sequence upper - mask out upper-cased sequence out - mask according to database.out RepeatMasker .out file file.out - mask database according to RepeatMasker file.out -qMask=type Mask out repeats in query sequence. Similar to -mask above, but for query rather than target sequence. -repeats=type Type is same as mask types above. Repeat bases will not be masked in any way, but matches in repeat areas will be reported separately from matches in other areas in the psl output. -minRepDivergence=NN Minimum percent divergence of repeats to allow them to be unmasked. Default is 15. Only relevant for masking using RepeatMasker .out files. -dots=N Output dot every N sequences to show program's progress. -trimT Trim leading poly-T. -noTrimA Don't trim trailing poly-A. -trimHardA Remove poly-A tail from qSize as well as alignments in psl output. -fastMap Run for fast DNA/DNA remapping - not allowing introns, requiring high %ID. Query sizes must not exceed 5000. -out=type Controls output file format. Type is one of: psl - Default. 
Tab-separated format, no sequence pslx - Tab-separated format with sequence axt - blastz-associated axt format maf - multiz-associated maf format sim4 - similar to sim4 format wublast - similar to wublast format blast - similar to NCBI blast format blast8- NCBI blast tabular format blast9 - NCBI blast tabular format with comments -fine For high-quality mRNAs, look harder for small initial and terminal exons. Not recommended for ESTs. -maxIntron=N Sets maximum intron size. Default is 750000. -extendThroughN Allows extension of alignment through large blocks of Ns. To filter PSL files to the best hits (e.g. minimum ID > 90% or 'only best match'), you can use the commands pslReps, pslCDnaFilter or pslUniq. ================================================================ ======== blatHuge ==================================== ================================================================ ### kent source version 491 ### blat - Standalone BLAT v. 39x1 fast sequence search command line tool usage: blat database query [-ooc=11.ooc] output.psl where: database and query are each either a .fa, .nib or .2bit file, or a list of these files with one file name per line. -ooc=11.ooc tells the program to load over-occurring 11-mers from an external file. This will increase the speed by a factor of 40 in many cases, but is not required. output.psl is the name of the output file. Subranges of .nib and .2bit files may be specified using the syntax: /path/file.nib:seqid:start-end or /path/file.2bit:seqid:start-end or /path/file.nib:start-end With the second form, a sequence id of file:start-end will be used. options: -t=type Database type. Type is one of: dna - DNA sequence prot - protein sequence dnax - DNA sequence translated in six frames to protein The default is dna. -q=type Query type. Type is one of: dna - DNA sequence rna - RNA sequence prot - protein sequence dnax - DNA sequence translated in six frames to protein rnax - DNA sequence translated in three frames to protein The default is dna. -prot Synonymous with -t=prot -q=prot. -ooc=N.ooc Use overused tile file N.ooc. N should correspond to the tileSize. -tileSize=N Sets the size of match that triggers an alignment. Usually between 8 and 12. Default is 11 for DNA and 5 for protein. -stepSize=N Spacing between tiles. Default is tileSize. -oneOff=N If set to 1, this allows one mismatch in tile and still triggers an alignment. Default is 0. -minMatch=N Sets the number of tile matches. Usually set from 2 to 4. Default is 2 for nucleotide, 1 for protein. -minScore=N Sets minimum score. This is the matches minus the mismatches minus some sort of gap penalty. Default is 30. -minIdentity=N Sets minimum sequence identity (in percent). Default is 90 for nucleotide searches, 25 for protein or translated protein searches. -maxGap=N Sets the size of maximum gap between tiles in a clump. Usually set from 0 to 3. Default is 2. Only relevant for minMatch > 1. -noHead Suppresses .psl header (so it's just a tab-separated file). -makeOoc=N.ooc Make overused tile file. Target needs to be complete genome. -repMatch=N Sets the number of repetitions of a tile allowed before it is marked as overused. Typically this is 256 for tileSize 12, 1024 for tile size 11, 4096 for tile size 10. Default is 1024. Typically comes into play only with makeOoc. Also affected by stepSize: when stepSize is halved, repMatch is doubled to compensate. -noSimpRepMask Suppresses simple repeat masking. -mask=type Mask out repeats. 
Alignments won't be started in masked region but may extend through it in nucleotide searches. Masked areas are ignored entirely in protein or translated searches. Types are: lower - mask out lower-cased sequence upper - mask out upper-cased sequence out - mask according to database.out RepeatMasker .out file file.out - mask database according to RepeatMasker file.out -qMask=type Mask out repeats in query sequence. Similar to -mask above, but for query rather than target sequence. -repeats=type Type is same as mask types above. Repeat bases will not be masked in any way, but matches in repeat areas will be reported separately from matches in other areas in the psl output. -minRepDivergence=NN Minimum percent divergence of repeats to allow them to be unmasked. Default is 15. Only relevant for masking using RepeatMasker .out files. -dots=N Output dot every N sequences to show program's progress. -trimT Trim leading poly-T. -noTrimA Don't trim trailing poly-A. -trimHardA Remove poly-A tail from qSize as well as alignments in psl output. -fastMap Run for fast DNA/DNA remapping - not allowing introns, requiring high %ID. Query sizes must not exceed 5000. -out=type Controls output file format. Type is one of: psl - Default. Tab-separated format, no sequence pslx - Tab-separated format with sequence axt - blastz-associated axt format maf - multiz-associated maf format sim4 - similar to sim4 format wublast - similar to wublast format blast - similar to NCBI blast format blast8- NCBI blast tabular format blast9 - NCBI blast tabular format with comments -fine For high-quality mRNAs, look harder for small initial and terminal exons. Not recommended for ESTs. -maxIntron=N Sets maximum intron size. Default is 750000. -extendThroughN Allows extension of alignment through large blocks of Ns. To filter PSL files to the best hits (e.g. minimum ID > 90% or 'only best match'), you can use the commands pslReps, pslCDnaFilter or pslUniq. ================================================================ ======== calc ==================================== ================================================================ ### kent source version 491 ### calc - Little command line calculator usage: calc this + that * theOther / (a + b) Options: -h - output result as a human-readable integer numbers, with k/m/g/t suffix ================================================================ ======== catDir ==================================== ================================================================ catDir - concatenate files in directory to stdout. For those times when too many files for cat to handle. usage: catDir dir(s) options: -r Recurse into subdirectories -suffix=.suf This will restrict things to files ending in .suf '-wild=*.???' This will match wildcards. 
-nonz Prints file name of non-zero length files ================================================================ ======== catUncomment ==================================== ================================================================ catUncomment - Concatenate input removing lines that start with '#' Output goes to stdout usage: catUncomment file(s) ================================================================ ======== chainToBigChain ==================================== ================================================================ ### kent source version 491 ### chainToBigChain - converts chain to bigChain input (bed format with extra fields) usage: chainToBigChain chainIn bigChainOut bigLinkOut Output will be sorted To build bigBed files: bedToBigBed -type=bed6+6 -as=bigChain.as -tab data.bigChain hg38.chrom.sizes data.bb bedToBigBed -type=bed4+1 -as=bigLink.as -tab data.bigLink hg38.chrom.sizes data.link.bb ================================================================ ======== chopFaLines ==================================== ================================================================ chopFaLines - Read in FA file with long lines and rewrite it with shorter lines usage: chopFaLines in.fa out.fa ================================================================ ======== chromGraphFromBin ==================================== ================================================================ ./chromGraphFromBin: /lib64/libcurl.so.4: no version information available (required by ./chromGraphFromBin) ### kent source version 491 ### chromGraphFromBin - Convert chromGraph binary to ascii format. usage: chromGraphFromBin in.chromGraph out.tab options: -chrom=chrX - restrict output to single chromosome ================================================================ ======== chromGraphToBin ==================================== ================================================================ ./chromGraphToBin: /lib64/libcurl.so.4: no version information available (required by ./chromGraphToBin) ### kent source version 491 ### chromGraphToBin - Make binary version of chromGraph. usage: chromGraphToBin in.tab out.chromGraph options: -xxx=XXX ================================================================ ======== chromToUcsc ==================================== ================================================================ Usage: chromToUcsc [options] filename - change NCBI or Ensembl chromosome names to UCSC names in tabular or wiggle files, using a chromAlias table. Supports these UCSC file formats: BED, genePred, PSL, wiggle (all formats), bedGraph, VCF, SAM, GTF, Chain ... or any other csv or tsv format where the sequence (chromosome) name is a separate field. Requires a .chromAlias.tsv file which can be downloaded like this: chromToUcsc --get hg19 # download the file hg19.chromAlias.tsv into current directory Which also works for GenArk assemblies and can take an output directory: chromToUcsc --get GCF_000001735.3 -o /tmp/ # for GenArk assemblies, will translate to NCBI sequence names (accessions) If you do not want to use the --get option to retrieve the mapping tables, you can also download the alias mapping files yourself, e.g. 
for mm10 with 'wget https://hgdownload.soe.ucsc.edu/goldenPath/mm10/database/chromAlias.txt.gz' Then the script can be run like this: chromToUcsc -i in.bed -o out.bed -a hg19.chromAlias.tsv chromToUcsc -i in.bed -o out.bed -a https://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/chromAlias.txt.gz Or in pipes, like this: cat test.bed | chromToUcsc -a mm10.chromAlias.tsv > test.ucsc.bed For BAM files use this program in a pipe with samtools: samtools view -h in.bam | ./chromToUcsc -a mm10.chromAlias.tsv | samtools -bS > out.bam By default, this script expects the chromosome name in the first field. The default works for BED, bedGraph, GTF, wiggle, VCF. For the following file formats, you will need to set the -k option to these values manually: genePred: 2 -- PSL: 10 (query) or 14 (target) -- chain: 2 (target) or 7 (query) -- SAM: 2 (If a line starts with @ (SAM format), -k is automatically set to 2.) Options: -h, --help show this help message and exit --get=DOWNLOADDB download a chrom alias table from UCSC for the genomeDb into the current directory or directory provided by -o and exit -a ALIASFNAME, --chromAlias=ALIASFNAME a UCSC chromAlias file in tab-sep format or the http/https URL to one -i INFNAME, --in=INFNAME input filename, default: /dev/stdin -o OUTFNAME, --out=OUTFNAME output filename, default: /dev/stdout -d, --debug show debug messages -s, --skipUnknown skip unknown sequence rather than generate an error. -k FIELDNO, --field=FIELDNO Index of field (1-based) that contains the chromosome name. No other field is touched by this program, unless the SAM format is detected. Default is 1 (first field). ================================================================ ======== clusterMatrixToBarChartBed ==================================== ================================================================ ### kent source version 491 ### clusterMatrixToBarChartBed - Compute a barchart bed file from a gene matrix and a gene bed file and a way to cluster samples. NOTE: consider using matrixClusterColumns and matrixToBarChartBed instead usage: clusterMatrixToBarChartBed sampleClusters.tsv geneMatrix.tsv geneset.bed output.bed where: sampleClusters.tsv is a two column tab separated file with sampleId and clusterId geneMatrix.tsv has a row for each gene. The first row uses the same sampleId as above geneset.bed maps the genes in the matrix (from its first column) to the genome geneset.bed needs 6 standard bed fields. Unless -name2 is set it also needs a name2 field as the last field output.bed is the resulting bar chart, with one column per cluster options: -simple - don't store the position of gene in geneMatrix.tsv file in output -median - use median (instead of mean) -name2=twoColFile.tsv - get name2 from file where first col is same as geneset.bed's name ================================================================ ======== colTransform ==================================== ================================================================ colTransform - Add and/or multiply column by constant. usage: colTransform column input.tab addFactor mulFactor output.tab where: column is the column to transform, starting with 1 input.tab is the tab delimited input file addFactor is what to add. Use 0 here not to change anything mulFactor is what to multiply by.
Use 1 here not to change anything output.tab is the tab delimited output file ================================================================ ======== countChars ==================================== ================================================================ countChars - Count the number of occurrences of a particular char usage: countChars char file(s) Char can either be a two digit hexadecimal value or a single letter literal character ================================================================ ======== cpg_lh ==================================== ================================================================ cpg_lh - calculate CpG Island data for cpgIslandExt tracks usage: cpg_lh sequence.fa where sequence.fa is fasta sequence, must be more than 200 bases of legitimate sequence, not all N's To process the output into the UCSC bed file format: cpglh fastaInput.fa \ | awk '{$2 = $2 - 1; width = $3 - $2; printf("%s\t%d\t%s\t%s %s\t%s\t%s\t%0.0f\t%0.1f\t%s\t%s\n", $1, $2, $3, $5, $6, width, $6, width*$7*0.01, 100.0*2*$6/width, $7, $9);}' \ | sort -k1,1 -k2,2n > output.bed The original cpg.c was written by Gos Miklem from the Sanger Center. LaDeana Hillier added some modifications --> cpg_lh.c, and UCSC has added some further modifications to cpg_lh.c, so that its expected number of CpGs in an island is calculated as described in Gardiner-Garden, M. and M. Frommer, 1987 CpG islands in vertebrate genomes. J. Mol. Biol. 196:261-282 Expected = (Number of C's * Number of G's) / Length Instead of a sliding-window search for CpG islands, this cpg program uses a running-sum score where a 'C' followed by a 'G' increases the score by 17 and anything else decreases the score by 1. When the score transitions from positive to 0 (and at the end of the sequence), the sequence in the current span is evaluated to see if it qualifies as a CpG island (>200 bp length, >50% GC, >0.6 ratio of observed CpG to expected). Then the search recurses on the span from the position with the max running score up to the current position. ================================================================ ======== crTreeIndexBed ==================================== ================================================================ ### kent source version 491 ### crTreeIndexBed - Create an index for a bed file. usage: crTreeIndexBed in.bed out.cr options: -blockSize=N - number of children per node in index tree. Default 1024 -itemsPerSlot=N - number of items per index slot. Default is half block size -noCheckSort - Don't check sorting order of in.tab ================================================================ ======== crTreeSearchBed ==================================== ================================================================ ### kent source version 491 ### crTreeSearchBed - Search a crTree indexed bed file and print all items that overlap query.
usage: crTreeSearchBed file.bed index.cr chrom start end ================================================================ ======== dbDbToHubTxt ==================================== ================================================================ ./dbDbToHubTxt: /lib64/libcurl.so.4: no version information available (required by ./dbDbToHubTxt) ### kent source version 491 ### dbDbToHubTxt - Reformat dbDb line to hub and genome stanzas for assembly hubs usage: dbDbToHubTxt database email groups hubAndGenome.txt options: -xxx=XXX ================================================================ ======== endsInLf ==================================== ================================================================ endsInLf - Check that last letter in files is end of line usage: endsInLf file(s) options: -zeroOk ================================================================ ======== expMatrixToBarchartBed ==================================== ================================================================ usage: expMatrixToBarchartBed [-h] [--autoSql AUTOSQL] [--groupOrderFile GROUPORDERFILE] [--useMean] [--verbose] sampleFile matrixFile bedFile outputFile Generate a barChart bed6+5 file from a matrix, meta data, and coordinates. positional arguments: sampleFile Two column no header, the first column is the samples which should match the matrix, the second is the grouping (cell type, tissue, etc) matrixFile The input matrix file. The samples in the first row should exactly match the ones in the sampleFile. The labels (ex ENST*****) in the first column should exactly match the ones in the bed file. bedFile Bed6+1 format. File that maps the column labels from the matrix to coordinates. Tab separated; chr, start coord, end coord, label, score, strand, gene name. The score column is ignored. outputFile The output file, bed 6+5 format. See the schema in kent/src/hg/lib/barChartBed.as. optional arguments: -h, --help show this help message and exit --autoSql AUTOSQL Optional autoSql description of extra fields in the input bed. --groupOrderFile GROUPORDERFILE Optional file to define the group order, list the groups in a single column in the order desired. The default ordering is alphabetical. --useMean Calculate the group values using mean rather than median. --verbose Show runtime messages. ================================================================ ======== faAlign ==================================== ================================================================ ### kent source version 491 ### faAlign - Align two fasta files usage: faAlign target.fa query.fa output.axt options: -dna - use DNA scoring scheme ================================================================ ======== faCmp ==================================== ================================================================ ### kent source version 491 ### faCmp - Compare two .fa files usage: faCmp [options] a.fa b.fa options: -softMask - use the soft masking information during the compare Differences will be noted if the masking is different. -sortName - sort input files by name before comparing -peptide - read as peptide sequences default: no masking information is used during compare. It is as if both sequences were not masked. 
Exit codes: - 0 if files are the same - 1 if files differ - 255 on an error ================================================================ ======== faCount ==================================== ================================================================ ### kent source version 491 ### faCount - count base statistics and CpGs in FA files. usage: faCount file(s).fa -summary show only summary statistics -dinuc include statistics on dinucleotide frequencies -strands count bases on both strands ================================================================ ======== faFilter ==================================== ================================================================ ### kent source version 491 ### faFilter - Filter fa records, selecting ones that match the specified conditions usage: faFilter [options] in.fa out.fa Options: -name=wildCard - Only pass records where name matches wildcard * matches any string or no character. ? matches any single character. anything else must match the character exactly (these will need to be quoted for the shell) -namePatList=filename - A list of regular expressions, one per line, that will be applied to the fasta name the same as -name -v - invert match, select non-matching records. -minSize=N - Only pass sequences at least this big. -maxSize=N - Only pass sequences this size or smaller. -maxN=N Only pass sequences with fewer than this number of N's -uniq - Removes duplicate sequence ids, keeping the first. -i - make -uniq ignore case so sequence IDs ABC and abc count as dupes. All specified conditions must pass to pass a sequence. If no conditions are specified, all records will be passed. ================================================================ ======== faFilterN ==================================== ================================================================ faFilterN - Get rid of sequences with too many N's usage: faFilterN in.fa out.fa maxPercentN options: -out=in.fa.out -uniq=self.psl ================================================================ ======== faFrag ==================================== ================================================================ faFrag - Extract a piece of DNA from a .fa file.
usage: faFrag in.fa start end out.fa options: -mixed - preserve mixed-case in FASTA file ================================================================ ======== faNoise ==================================== ================================================================ faNoise - Add noise to .fa file usage: faNoise inName outName transitionPpt transversionPpt insertPpt deletePpt chimeraPpt options: -upper - output in upper case ================================================================ ======== faOneRecord ==================================== ================================================================ faOneRecord - Extract a single record from a .FA file usage: faOneRecord in.fa recordName ================================================================ ======== faPolyASizes ==================================== ================================================================ ### kent source version 491 ### faPolyASizes - get poly A sizes usage: faPolyASizes in.fa out.tab output file has four columns: id seqSize tailPolyASize headPolyTSize options: ================================================================ ======== faRandomize ==================================== ================================================================ ### kent source version 491 ### faRandomize - Program to create random fasta records usage: faRandomize [-seed=N] in.fa randomized.fa Use optional -seed argument to specify seed (integer) for random number generator (rand). Generated sequence has the same base frequency as seen in original fasta records. ================================================================ ======== faRc ==================================== ================================================================ faRc - Reverse complement a FA file usage: faRc in.fa out.fa In.fa and out.fa may be the same file. options: -keepName - keep name identical (don't prepend RC) -keepCase - works well for ACGTUN in either case. bizarre for other letters. without it bases are turned to lower, all else to n's -justReverse - prepends R unless asked to keep name -justComplement - prepends C unless asked to keep name (cannot appear together with -justReverse) ================================================================ ======== faSize ==================================== ================================================================ ### kent source version 491 ### faSize - print total base count in fa files. usage: faSize file(s).fa Command flags -detailed outputs name and size of each record has the side effect of printing nothing else -tab output statistics in a tab separated format -veryDetailed outputs name, size, #Ns, #real, #upper, #lower of each record ================================================================ ======== faSomeRecords ==================================== ================================================================ ### kent source version 491 ### faSomeRecords - Extract multiple fa records usage: faSomeRecords in.fa listFile out.fa options: -exclude - output sequences not in the list file. ================================================================ ======== faSplit ==================================== ================================================================ ### kent source version 491 ### faSplit - Split an fa file into several files. usage: faSplit how input.fa count outRoot where how is either 'about' 'byname' 'base' 'gap' 'sequence' or 'size'. Files split by sequence will be broken at the nearest fa record boundary. 
Files split by base will be broken at any base. Files broken by size will be broken every count bases. Examples: faSplit sequence estAll.fa 100 est This will break up estAll.fa into 100 files (numbered est001.fa est002.fa, ... est100.fa) Files will only be broken at fa record boundaries faSplit base chr1.fa 10 1_ This will break up chr1.fa into 10 files faSplit size input.fa 2000 outRoot This breaks up input.fa into 2000 base chunks faSplit about est.fa 20000 outRoot This will break up est.fa into files of about 20000 bytes each by record. faSplit byname scaffolds.fa outRoot/ This breaks up scaffolds.fa using sequence names as file names. Use the terminating / on the outRoot to get it to work correctly. faSplit gap chrN.fa 20000 outRoot This breaks up chrN.fa into files of at most 20000 bases each, at gap boundaries if possible. If the sequence ends in N's, the last piece, if larger than 20000, will be all one piece. Options: -verbose=2 - Write names of each file created (=3 more details) -maxN=N - Suppress pieces with more than maxN n's. Only used with size. default is size-1 (only suppresses pieces that are all N). -oneFile - Put output in one file. Only used with size -extra=N - Add N extra bytes at the end to form overlapping pieces. Only used with size. -out=outFile Get masking from outfile. Only used with size. -lift=file.lft Put info on how to reconstruct sequence from pieces in file.lft. Only used with size and gap. -minGapSize=X Consider a block of Ns to be a gap if block size >= X. Default value 1000. Only used with gap. -noGapDrops - include all N's when splitting by gap. -outDirDepth=N Create N levels of output directory under current dir. This helps prevent NFS problems with a large number of files in a directory. Using -outDirDepth=3 would produce ./1/2/3/outRoot123.fa. -prefixLength=N - used with byname option. Create a separate output file for each group of sequence names with the same prefix of length N. ================================================================ ======== faToFastq ==================================== ================================================================ ### kent source version 491 ### faToFastq - Convert fa to fastq format, just faking quality values. usage: faToFastq in.fa out.fastq options: -qual=X quality letter to use. Default is '<' which is good I think.... ================================================================ ======== faToTab ==================================== ================================================================ faToTab - convert fa file to tab separated file usage: faToTab infileName outFileName options: -type=seqType sequence type, dna or protein, default is dna -keepAccSuffix - don't strip dot version off of sequence id, keep as is ================================================================ ======== faToTwoBit ==================================== ================================================================ ### kent source version 491 ### faToTwoBit - Convert DNA from fasta to 2bit format usage: faToTwoBit in.fa [in2.fa in3.fa ...] out.2bit options: -long use 64-bit offsets for index. Allow for twoBit to contain more than 4Gb of sequence. NOT COMPATIBLE WITH OLDER CODE. -noMask Ignore lower-case masking in fa file. -stripVersion Strip off version number after '.' for GenBank accessions. -ignoreDups Convert first sequence only if there are duplicate sequence names. Use 'twoBitDup' to find duplicate sequences. -namePrefix=XX. add XX. to start of sequence name in 2bit.
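As a quick illustration of faSplit and faToTwoBit together (a minimal sketch; myAssembly.fa, myAssembly.2bit and the chroms/ output directory are hypothetical names, and chroms/ is assumed to already exist):
  faToTwoBit myAssembly.fa myAssembly.2bit
  faSplit byname myAssembly.fa chroms/
The first command packs the FASTA into a single 2bit file; the second writes one FASTA file per sequence into chroms/, using the sequence names as file names (note the terminating / on the outRoot, as the byname mode described above requires).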
================================================================ ======== faToVcf ==================================== ================================================================ ### kent source version 491 ### faToVcf - Convert a FASTA alignment file to Variant Call Format (VCF) single-nucleotide diffs usage: faToVcf in.fa out.vcf options: -ambiguousToN Treat all IUPAC ambiguous bases (N, R, V etc) as N (no call). -excludeFile=file Exclude sequences named in file which has one sequence name per line -includeNoAltN Include base positions with no alternate alleles observed, but at least one N (missing base / no-call) -includeRef Include the reference in the genotype columns (default: omitted as redundant) -maskSites=file Exclude variants in positions recommended for masking in file (typically https://github.com/W-L/ProblematicSites_SARS-CoV2/raw/master/problematic_sites_sarsCov2.vcf) -maxDiff=N Exclude sequences with more than N mismatches with the reference (if -windowSize is used, sequences are masked accordingly first) -minAc=N Ignore alternate alleles observed fewer than N times -minAf=F Ignore alternate alleles observed in less than F of non-N bases -minAmbigInWindow=N When -windowSize is provided, mask any base for which there are at least this many N, ambiguous or gap characters within the window. (default: 2) -noGenotypes Output 8-column VCF, without the sample genotype columns -ref=seqName Use seqName as the reference sequence; must be present in faFile (default: first sequence in faFile) -resolveAmbiguous For IUPAC ambiguous characters like R (A or G), if the character represents two bases and one is the reference base, convert it to the non-reference base. Otherwise convert it to N. -startOffset=N Add N bases to each position (for trimmed alignments) -vcfChrom=seqName Use seqName for the CHROM column in VCF (default: ref sequence) -windowSize=N Mask any base for which there are at least -minAmbigInWindow bases in a window of +-N bases around the base. Masking approach adapted from https://github.com/roblanf/sarscov2phylo/ file scripts/mask_seq.py Use -windowSize=7 for the same results. in.fa must contain a series of sequences with different names and the same length. Both N and - are treated as missing information. ================================================================ ======== faTrans ==================================== ================================================================ ### kent source version 491 ### faTrans - Translate DNA .fa file to peptide usage: faTrans in.fa out.fa options: -stop stop at first stop codon (otherwise puts in Z for stop codons) -offset=N start at a particular offset. -cdsUpper - cds is in upper case ================================================================ ======== fastqStatsAndSubsample ==================================== ================================================================ ### kent source version 491 ### fastqStatsAndSubsample v2 - Go through a fastq file doing sanity checks and collecting stats and also producing a smaller fastq out of a sample of the data. The fastq input may be compressed with gzip or bzip2. Paired-end samples: run on both files, the seed is fixed so it will choose the paired reads usage: fastqStatsAndSubsample in.fastq out.stats out.fastq options: -sampleSize=N - default 100000 -seed=N - Use given seed for random number generator. Default 0. -smallOk - Not an error if less than sampleSize reads.
out.fastq will be entire in.fastq -json - out.stats will be in json rather than text format Use /dev/null for out.fastq and/or out.stats if not interested in these outputs ================================================================ ======== fastqToFa ==================================== ================================================================ ### kent source version 491 ### # no name checks will be made on lines beginning with @ # ignore quality scores # using default Phred quality score algorithm # all errors will cause exit fastqToFa - Convert from fastq to fasta format. usage: fastqToFa [options] in.fastq out.fa options: -nameVerify='string' - for multi-line fastq files, 'string' must match somewhere in the sequence names in order to correctly identify the next sequence block (e.g.: -nameVerify='Supercontig_') -qual=file.qual.fa - output quality scores to specified file (default: quality scores are ignored) -qualSizes=qual.sizes - write sizes file for the quality scores -noErrors - warn only on problems, do not error out (specify -verbose=3 to see warnings) -solexa - use Solexa/Illumina quality score algorithm (instead of Phred quality) -verbose=2 - set warning level to get some stats output during processing ================================================================ ======== fetchChromSizes ==================================== ================================================================ fetchChromSizes - script to grab chrom.sizes from UCSC via either of: mysql, wget or ftp usage: fetchChromSizes <db> > <db>.chrom.sizes used to fetch chrom.sizes information from UCSC for the given <db> - name of UCSC database, e.g.: hg38, hg18, mm9, etc ... This script expects to find one of the following commands: wget, mysql, or ftp in order to fetch information from UCSC. Route the output to the file <db>.chrom.sizes as indicated above. This data is available at the URL: http://hgdownload.soe.ucsc.edu/goldenPath/<db>/bigZips/<db>.chrom.sizes Example: fetchChromSizes hg38 > hg38.chrom.sizes ================================================================ ======== findMotif ==================================== ================================================================ ### kent source version 491 ### findMotif - find specified motif in sequence usage: findMotif [options] -motif=<acgt...> sequence where: sequence is a .fa, .nib or .2bit file or a file which is a list of sequence files. options: -motif=<acgt...> - search for this specified motif (case ignored, [acgt] only) NOTE: motif must be at least 4 characters, less than 32 -chr=<chrN> - process only this one chrN from the sequence -strand=<+|-> - limit to only one strand. Default is both. -bedOutput - output bed format (this is the default) -wigOutput - output wiggle data format instead of bed file -misMatch=N - allow N mismatches (0 default == perfect match) -verbose=N - set information level [1-4] -verbose=4 - will display gaps as bed file data lines to stderr
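For example (a minimal sketch; hg38.2bit and ecoRI.bed are hypothetical names, and the bed output, the default, is assumed here to be written to stdout):
  findMotif -motif=gaattc hg38.2bit > ecoRI.bed
This scans both strands of the assembly for the EcoRI site gaattc with no mismatches; the motif satisfies the stated constraints (at least 4 and fewer than 32 characters, [acgt] only), and -misMatch=1 could be added to tolerate a single mismatch.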
================================================================ ======== fixStepToBedGraph.pl ==================================== ================================================================ fixStepToBedGraph.pl - read fixedStep wiggle input data, output four column bedGraph format data usage: fixStepToBedGraph.pl run in a pipeline like this: zcat fixedStepData.gz | fixStepToBedGraph.pl | gzip > bedGraph.gz ================================================================ ======== gapToLift ==================================== ================================================================ ### kent source version 491 ### gapToLift - create lift file from gap table(s) usage: gapToLift [options] db liftFile.lft uses gap table(s) from specified db. Writes to liftFile.lft generates lift file segments separated by non-bridged gaps. options: -chr=chrN - work only on given chrom -minGap=M - examine only gaps >= M -insane - do *not* perform coordinate sanity checks on gaps -bedFile=fileName.bed - output segments to fileName.bed -allowBridged - consider any type of gap not just the non-bridged gaps -verbose=N - N > 1 see more information about procedure ================================================================ ======== gencodeVersionForGenes ==================================== ================================================================ ### kent source version 491 ### gencodeVersionForGenes - Figure out which version of a gencode gene set a set of gene identifiers best fits usage: gencodeVersionForGenes genes.txt geneSymVer.tsv where: genes.txt is a list of gene symbols or identifiers, one per line geneSymVer.tsv is output of gencodeGeneSymVer, usually /hive/data/inside/geneSymVerTx.tsv options: -bed=output.bed - Create bed file for mapping genes to genome via best gencode fit -upperCase - Force genes to be upper case -allBed=outputDir - Output beds for all versions in geneSymVer.tsv -geneToId=geneToId.tsv - Output two column file with the symbol from genes.txt and the gencode gene name as the second column. Symbols with no gene found are omitted -miss=output.txt - unassigned genes are put here, one per line -target=ucscDb - something like hg38 or hg19.
If set, this will use the most recent version of each gene that exists for the assembly in symbol mode ================================================================ ======== genePredFilter ==================================== ================================================================ ### kent source version 491 ### genePredFilter - filter a genePred file usage: genePredFilter [options] genePredIn genePredOut Filter a genePredFile, dropping invalid entries options: -db=db - If specified, then this database is used to get chromosome sizes. -chromSizes=file.chrom.sizes - use chrom sizes from tab separated file (name,size) instead of from chromInfo table in specified db. -verbose=2 - level >= 2 prints out errors for each problem found. ================================================================ ======== genePredToBigGenePred ==================================== ================================================================ ### kent source version 491 ### genePredToBigGenePred - converts genePred or genePredExt to bigGenePred input (bed format with extra fields) usage: genePredToBigGenePred [-known] [-score=scores] [-geneNames=geneNames] [-colors=colors] file.gp stdout | sort -k1,1 -k2,2n > file.bgpInput NOTE: In order to visualize on Genome Browser, the bigGenePred file needs to be converted to a bigBed such as the following: wget https://genome.ucsc.edu/goldenpath/help/examples/bigGenePred.as bedToBigBed -type=bed12+8 -tab -as=bigGenePred.as file.bgpInput chrom.sizes output.bb options: -known input file is a genePred in knownGene format -score=scores scores is a two column file with id's mapping to scores -geneNames=geneNames geneNames is a three column file with id's mapping to two gene names -colors=colors colors is a four column file with id's mapping to r,g,b -cds=cds cds is a five column file with id's mapping to cds status codes and exonFrames (see knownCds.as) -geneType=geneType geneType is a two column file with id's mapping to geneType ================================================================ ======== genePredToProt ==================================== ================================================================ ### kent source version 491 ### genePredToProt - create protein sequences by translating gene annotations usage: genePredToProt genePredFile genomeSeqs protFa This honors frame if genePred has frames, dropping partial codons. genomeSeqs is a 2bit or directory of nib files. options: -cdsFa=fasta - output FASTA with CDS that was used to generate protein. This will not include dropped partial codons. -protIdSuffix=str - add this string to the end of the name for protein FASTA -cdsIdSuffix=str - add this string to the end of the name for CDS FASTA -translateSeleno - assume internal TGA code for selenocysteine and translate to `U'. -includeStop - If the CDS ends with a stop codon, represent it as a `*' -starForInframeStops - use `*' instead of `X' for in-frame stop codons. This will result in selenocysteines being `*', with only codons containing `N' being translated to `X'.
This doesn't include terminal stop ================================================================ ======== gensub2 ==================================== ================================================================ gensub2 - version 12.20 Generate condor submission file from template and two file lists. Usage: gensub2