##############################################################################
This file is from:

http://hgdownload.soe.ucsc.edu/admin/exe/macOSX.arm64/README.txt

This directory contains applications for stand-alone use, 
built on a Mac OSX 14.4 M1 (arm64) machine.  (Sonoma)
Darwin Kernel Version 23.4.0, gcc version:
Apple clang version 15.0.0 (clang-1500.3.9.4)
Target: arm64-apple-darwin23.4.0
Thread model: posix

kent source tree v473 November 2024.

For help on the bigBed and bigWig applications see:
http://genome.ucsc.edu/goldenPath/help/bigBed.html
http://genome.ucsc.edu/goldenPath/help/bigWig.html

View the file 'FOOTER.txt' to see the usage statement for 
each of the applications.

The shared libraries used by these binaries are:  (from: otool -L <binary>)

/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 1336.61.1)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1600.157.0)
/usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.12)

##############################################################################
This entire directory can be copied with the rsync command
into the local directory ./

rsync -aP rsync://hgdownload.soe.ucsc.edu/genome/admin/exe/macOSX.arm64/ ./

Individual programs can be copied by adding their name, for example:

rsync -aP \
   rsync://hgdownload.soe.ucsc.edu/genome/admin/exe/macOSX.arm64/faSize ./
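
To copy several named programs in one go, the single-program rsync command
above can be wrapped in a loop. The following is a minimal sketch, not part of
the kent tools: the function name fetch_ucsc_tools and the DRY_RUN variable
are hypothetical, and the three tool names are simply examples taken from the
listing in this directory.

```shell
# Base URL copied from the rsync examples in this README.
UCSC_RSYNC_BASE="rsync://hgdownload.soe.ucsc.edu/genome/admin/exe/macOSX.arm64"

# fetch_ucsc_tools (hypothetical helper): copy each named program into ./
# When DRY_RUN=1 it only prints the rsync commands it would run.
fetch_ucsc_tools() {
    for tool in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "rsync -aP $UCSC_RSYNC_BASE/$tool ./"
        else
            rsync -aP "$UCSC_RSYNC_BASE/$tool" ./
        fi
    done
}

# Preview the commands in a subshell so DRY_RUN does not leak into the caller.
(DRY_RUN=1; fetch_ucsc_tools faSize twoBitInfo bedToBigBed)
```

Dropping the DRY_RUN setting runs the real transfers.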

##############################################################################
      Name                       Last modified      Size  Description
Parent Directory - blat/ 2024-11-07 05:06 - barChartMaxLimit 2024-11-06 21:30 1.0K fixStepToBedGraph.pl 2024-11-07 04:03 1.4K varStepToBedGraph.pl 2024-11-07 04:03 1.7K fetchChromSizes 2024-11-07 04:03 3.0K tdbRename 2024-11-06 21:30 3.1K tdbSort 2024-11-06 21:30 4.2K bedJoinTabOffset.py 2024-11-06 21:30 4.3K ucscApiClient 2024-11-06 21:30 4.8K bigGuessDb 2024-11-06 21:30 9.1K webSync 2024-11-06 21:30 10K chromToUcsc 2024-11-06 21:30 11K vai.pl 2024-11-06 21:30 12K expMatrixToBarchartBed 2024-11-06 21:30 16K bigHeat 2024-11-06 21:30 17K trackDbIndexBb 2024-11-06 21:30 19K sizeof 2024-11-07 04:02 33K gmtime 2024-11-07 04:02 33K localtime 2024-11-07 04:02 33K mktime 2024-11-07 04:02 33K cpg_lh 2024-11-07 04:02 51K FOOTER.txt 2024-11-07 04:04 330K toLower 2024-11-07 04:02 5.0M toUpper 2024-11-07 04:02 5.0M aveCols 2024-11-07 04:02 5.0M faOneRecord 2024-11-07 04:02 5.0M tailLines 2024-11-07 04:02 5.0M catUncomment 2024-11-07 04:02 5.0M colTransform 2024-11-07 04:02 5.0M endsInLf 2024-11-07 04:02 5.0M splitFile 2024-11-07 04:02 5.0M countChars 2024-11-07 04:02 5.0M catDir 2024-11-07 04:02 5.0M subChar 2024-11-07 04:02 5.0M headRest 2024-11-07 04:02 5.0M raToLines 2024-11-07 04:02 5.0M bedGraphPack 2024-11-07 04:02 5.0M calc 2024-11-07 04:02 5.0M linesToRa 2024-11-07 04:02 5.0M tickToDate 2024-11-07 04:02 5.0M addCols 2024-11-07 04:02 5.0M stringify 2024-11-07 04:02 5.0M randomLines 2024-11-07 04:02 5.0M bedRemoveOverlap 2024-11-07 04:02 5.0M faSomeRecords 2024-11-07 04:02 5.0M newPythonProg 2024-11-07 04:02 5.0M bedRestrictToPositions 2024-11-07 04:02 5.0M paraTestJob 2024-11-07 04:02 5.0M bedCommonRegions 2024-11-07 04:02 5.0M bedJoinTabOffset 2024-11-07 04:02 5.0M bedPileUps 2024-11-07 04:02 5.0M fastqToFa 2024-11-07 04:02 5.0M paraNodeStart 2024-11-07 04:02 5.0M clusterMatrixToBarChartBed 2024-11-07 04:02 5.0M newProg 2024-11-07 04:02 5.0M subColumn 2024-11-07 04:02 5.0M textHistogram 2024-11-07 04:02 5.0M matrixMarketToTsv 2024-11-07 04:02 5.0M paraFetch 2024-11-07 
04:02 5.0M splitFileByColumn 2024-11-07 04:02 5.0M paraSync 2024-11-07 04:02 5.0M spacedToTab 2024-11-07 04:02 5.0M rowsToCols 2024-11-07 04:02 5.0M gensub2 2024-11-07 04:02 5.0M matrixToBarChartBed 2024-11-07 04:02 5.0M ave 2024-11-07 04:02 5.0M wordLine 2024-11-07 04:02 5.0M netSplit 2024-11-07 04:02 5.0M autoXml 2024-11-07 04:02 5.0M htmlCheck 2024-11-07 04:02 5.0M wigEncode 2024-11-07 04:03 5.0M netToBed 2024-11-07 04:02 5.0M raToTab 2024-11-07 04:02 5.0M xmlCat 2024-11-07 04:02 5.0M nibSize 2024-11-07 04:02 5.0M bedIntersect 2024-11-07 04:02 5.0M faRc 2024-11-07 04:02 5.0M rmFaDups 2024-11-07 04:02 5.0M chopFaLines 2024-11-07 04:02 5.0M chainSort 2024-11-07 04:02 5.0M chainSwap 2024-11-07 04:02 5.0M chainStitchId 2024-11-07 04:02 5.0M chainSplit 2024-11-07 04:02 5.0M netFilter 2024-11-07 04:02 5.0M chainPreNet 2024-11-07 04:02 5.0M chainFilter 2024-11-07 04:02 5.0M autoDtd 2024-11-07 04:02 5.0M mafNoAlign 2024-11-07 04:02 5.0M mafSpeciesList 2024-11-07 04:03 5.0M mafMeFirst 2024-11-07 04:02 5.0M mafToBigMaf 2024-11-07 04:03 5.0M mafOrder 2024-11-07 04:03 5.0M mafSpeciesSubset 2024-11-07 04:02 5.0M chainMergeSort 2024-11-07 04:02 5.0M mafRanges 2024-11-07 04:02 5.0M netSyntenic 2024-11-07 04:02 5.0M axtSort 2024-11-07 04:02 5.0M faFrag 2024-11-07 04:02 5.0M faToVcf 2024-11-07 04:03 5.0M axtSwap 2024-11-07 04:02 5.0M faCmp 2024-11-07 04:02 5.0M faPolyASizes 2024-11-07 04:02 5.0M matrixNormalize 2024-11-07 04:02 5.0M faToFastq 2024-11-07 04:02 5.0M crTreeSearchBed 2024-11-07 04:03 5.0M faRandomize 2024-11-07 04:02 5.0M hgFakeAgp 2024-11-07 04:03 5.0M faCount 2024-11-07 04:02 5.0M faTrans 2024-11-07 04:02 5.0M faSize 2024-11-07 04:02 5.0M faToTab 2024-11-07 04:02 5.0M crTreeIndexBed 2024-11-07 04:03 5.0M faNoise 2024-11-07 04:02 5.0M matrixClusterColumns 2024-11-07 04:02 5.0M bigWigCluster 2024-11-07 04:02 5.0M netChainSubset 2024-11-07 04:02 5.0M faFilter 2024-11-07 04:02 5.0M mafFilter 2024-11-07 04:02 5.0M qaToQac 2024-11-07 04:02 5.0M nibFrag 2024-11-07 04:02 
5.0M paraHubStop 2024-11-07 04:02 5.0M paraNodeStop 2024-11-07 04:02 5.0M qacToQa 2024-11-07 04:02 5.0M paraNodeStatus 2024-11-07 04:02 5.0M qacToWig 2024-11-07 04:02 5.0M gencodeVersionForGenes 2024-11-07 04:03 5.0M validateManifest 2024-11-07 04:02 5.0M faSplit 2024-11-07 04:02 5.0M strexCalc 2024-11-07 04:02 5.0M xmlToSql 2024-11-07 04:02 5.0M twoBitInfo 2024-11-07 04:02 5.0M trfBig 2024-11-07 04:02 5.0M ixIxx 2024-11-07 04:03 5.0M fastqStatsAndSubsample 2024-11-07 04:02 5.0M bigWigSummary 2024-11-07 04:02 5.0M twoBitDup 2024-11-07 04:02 5.0M bigWigInfo 2024-11-07 04:02 5.0M bigWigToWig 2024-11-07 04:02 5.0M bigWigToBedGraph 2024-11-07 04:02 5.0M chainNet 2024-11-07 04:02 5.0M mafToAxt 2024-11-07 04:02 5.0M bigWigMerge 2024-11-07 04:02 5.0M axtToMaf 2024-11-07 04:02 5.0M faAlign 2024-11-07 04:02 5.0M mafAddQRows 2024-11-07 04:02 5.0M paraNode 2024-11-07 04:02 5.0M faToTwoBit 2024-11-07 04:02 5.0M mafAddIRows 2024-11-07 04:02 5.0M chainToBigChain 2024-11-07 04:03 5.0M chainAntiRepeat 2024-11-07 04:02 5.0M qacAgpLift 2024-11-07 04:02 5.0M autoSql 2024-11-07 04:02 5.0M vcfToBed 2024-11-07 04:03 5.0M tabToTabDir 2024-11-07 04:03 5.0M findMotif 2024-11-07 04:02 5.0M parasol 2024-11-07 04:02 5.0M chainToPsl 2024-11-07 04:02 5.0M lavToAxt 2024-11-07 04:02 5.0M chainToAxt 2024-11-07 04:02 5.0M pslCat 2024-11-07 04:02 5.0M pslDropOverlap 2024-11-07 04:02 5.0M checkAgpAndFa 2024-11-07 04:02 5.0M pslSortAcc 2024-11-07 04:02 5.0M pslFilter 2024-11-07 04:02 5.0M netToAxt 2024-11-07 04:02 5.0M pslScore 2024-11-07 04:02 5.0M pslxToFa 2024-11-07 04:02 5.0M pslPosTarget 2024-11-07 04:02 5.0M pslRc 2024-11-07 04:02 5.0M pslRemoveFrameShifts 2024-11-07 04:02 5.0M pslSwap 2024-11-07 04:02 5.0M pslStats 2024-11-07 04:02 5.0M pslPartition 2024-11-07 04:02 5.0M pslSplitOnTarget 2024-11-07 04:02 5.0M pslSomeRecords 2024-11-07 04:02 5.0M pslMapPostChain 2024-11-07 04:02 5.0M pslSelect 2024-11-07 04:02 5.0M pslSort 2024-11-07 04:02 5.0M pslProtToRnaCoords 2024-11-07 04:02 5.0M pslHisto 
2024-11-07 04:02 5.0M faFilterN 2024-11-07 04:02 5.1M chainBridge 2024-11-07 04:02 5.1M chainScore 2024-11-07 04:02 5.1M pslReps 2024-11-07 04:02 5.1M pslPairs 2024-11-07 04:02 5.1M rmskAlignToPsl 2024-11-07 04:03 5.1M para 2024-11-07 04:02 5.1M blastToPsl 2024-11-07 04:02 5.1M pslToChain 2024-11-07 04:02 5.1M chainToPslBasic 2024-11-07 04:02 5.1M mafToPsl 2024-11-07 04:02 5.1M axtToPsl 2024-11-07 04:02 5.1M pslMrnaCover 2024-11-07 04:02 5.1M pslMap 2024-11-07 04:02 5.1M pslSpliceJunctions 2024-11-07 04:02 5.1M gff3ToPsl 2024-11-07 04:03 5.1M paraHub 2024-11-07 04:02 5.1M bamToPsl 2024-11-07 04:02 5.1M pslToPslx 2024-11-07 04:02 5.1M pslRecalcMatch 2024-11-07 04:02 5.1M chainCleaner 2024-11-07 04:02 5.1M pslPretty 2024-11-07 04:02 5.1M axtChain 2024-11-07 04:02 5.1M blastXmlToPsl 2024-11-07 04:02 5.2M hgsqldump 2024-11-07 04:02 5.3M hgsql 2024-11-07 04:02 5.3M hgBbiDbLink 2024-11-07 04:03 5.3M sqlToXml 2024-11-07 04:02 5.3M hgLoadSqlTab 2024-11-07 04:03 5.3M dbSnoop 2024-11-07 04:03 5.3M ameme 2024-11-07 04:03 7.3M bedSort 2024-11-07 04:02 7.3M bedToExons 2024-11-07 04:02 7.3M bedMergeAdjacent 2024-11-07 04:03 7.3M bedPartition 2024-11-07 04:03 7.3M bedGeneParts 2024-11-07 04:02 7.3M liftOverMerge 2024-11-07 04:02 7.3M bedToPsl 2024-11-07 04:03 7.3M pslToBed 2024-11-07 04:02 7.3M bedWeedOverlapping 2024-11-07 04:03 7.3M mafsInRegion 2024-11-07 04:02 7.3M mafSplit 2024-11-07 04:03 7.3M maskOutFa 2024-11-07 04:02 7.3M lavToPsl 2024-11-07 04:02 7.3M twoBitMask 2024-11-07 04:03 7.3M twoBitToFa 2024-11-07 04:02 7.3M bigMafToMaf 2024-11-07 04:03 7.4M bigBedInfo 2024-11-07 04:02 7.4M bigBedSummary 2024-11-07 04:02 7.4M bigBedNamedItems 2024-11-07 04:02 7.4M bigBedToBed 2024-11-07 04:02 7.4M bigWigAverageOverBed 2024-11-07 04:02 7.4M bigChainBreaks 2024-11-07 04:03 7.4M bigPslToPsl 2024-11-07 04:03 7.4M bedClip 2024-11-07 04:02 7.4M oligoMatch 2024-11-07 04:03 7.4M bigWigCorrelate 2024-11-07 04:02 7.4M bedGraphToBigWig 2024-11-07 04:02 7.4M bigWigCat 2024-11-07 04:02 7.4M 
wigToBigWig 2024-11-07 04:02 7.4M wigCorrelate 2024-11-07 04:02 7.4M bedToBigBed 2024-11-07 04:02 7.4M mafFetch 2024-11-07 04:02 9.6M positionalTblCheck 2024-11-07 04:03 9.6M binFromRange 2024-11-07 04:03 9.6M hicInfo 2024-11-07 04:03 9.6M mafFrags 2024-11-07 04:02 9.6M mafToSnpBed 2024-11-07 04:02 9.6M hgSpeciesRna 2024-11-07 04:02 9.6M dbDbToHubTxt 2024-11-07 04:03 9.6M genePredToFakePsl 2024-11-07 04:02 9.6M hgvsToVcf 2024-11-07 04:03 9.6M chromGraphToBin 2024-11-07 04:03 9.6M bigChainToChain 2024-11-07 04:03 9.6M chromGraphFromBin 2024-11-07 04:03 9.6M mafFrag 2024-11-07 04:02 9.6M bedExtendRanges 2024-11-07 04:03 9.6M bigGenePredToGenePred 2024-11-07 04:03 9.6M genePredToProt 2024-11-07 04:03 9.6M genePredCheck 2024-11-07 04:02 9.6M genePredFilter 2024-11-07 04:03 9.6M hubPublicCheck 2024-11-07 04:03 9.6M bedToGenePred 2024-11-07 04:02 9.6M genePredToBed 2024-11-07 04:02 9.6M fixTrackDb 2024-11-07 04:03 9.6M makeTableList 2024-11-07 04:03 9.6M getRna 2024-11-07 04:02 9.6M liftOver 2024-11-07 04:02 9.6M gtfToGenePred 2024-11-07 04:03 9.6M mafSplitPos 2024-11-07 04:03 9.6M bedItemOverlapCount 2024-11-07 04:02 9.6M genePredToGtf 2024-11-07 04:02 9.6M genePredHisto 2024-11-07 04:02 9.6M checkCoverageGaps 2024-11-07 04:02 9.6M hgFindSpec 2024-11-07 04:03 9.6M pslCheck 2024-11-07 04:02 9.6M hubClone 2024-11-07 04:03 9.6M gapToLift 2024-11-07 04:03 9.6M hgLoadGap 2024-11-07 04:03 9.6M hgGcPercent 2024-11-07 04:03 9.6M mrnaToGene 2024-11-07 04:02 9.6M checkTableCoords 2024-11-07 04:02 9.6M transMapPslToGenePred 2024-11-07 04:03 9.6M hgTrackDb 2024-11-07 04:03 9.6M pslLiftSubrangeBlat 2024-11-07 04:03 9.6M genePredSingleCover 2024-11-07 04:02 9.6M raSqlQuery 2024-11-07 04:03 9.6M mafGene 2024-11-07 04:02 9.6M pslToBigPsl 2024-11-07 04:03 9.6M hgLoadChain 2024-11-07 04:03 9.6M getRnaPred 2024-11-07 04:02 9.6M genePredToBigGenePred 2024-11-07 04:03 9.6M hgLoadWiggle 2024-11-07 04:03 9.6M dbTrash 2024-11-07 04:02 9.6M hgLoadNet 2024-11-07 04:03 9.6M bedCoverage 2024-11-07 
04:02 9.6M ldHgGene 2024-11-07 04:03 9.6M mafCoverage 2024-11-07 04:02 9.6M hgLoadOutJoined 2024-11-07 04:03 9.6M estOrient 2024-11-07 04:02 9.6M netClass 2024-11-07 04:02 9.6M clusterGenes 2024-11-07 04:02 9.6M hgGoldGapGl 2024-11-07 04:03 9.6M hgLoadMaf 2024-11-07 04:03 9.6M tdbQuery 2024-11-07 04:03 9.6M hubCheck 2024-11-07 04:03 9.6M hgLoadBed 2024-11-07 04:03 9.6M hgLoadOut 2024-11-07 04:03 9.6M hgLoadMafSummary 2024-11-07 04:03 9.6M gff3ToGenePred 2024-11-07 04:03 9.6M genePredToMafFrames 2024-11-07 04:02 9.6M featureBits 2024-11-07 04:02 9.6M overlapSelect 2024-11-07 04:03 9.6M hgWiggle 2024-11-07 04:03 9.6M pslCDnaFilter 2024-11-07 04:02 9.6M validateFiles 2024-11-07 04:02 9.6M liftUp 2024-11-07 04:02 9.6M checkHgFindSpec 2024-11-07 04:02 9.6M
================================================================
to download all of the files from one of these admin/exe/ directories,
  for example: admin/exe/linux.x86_64/
    using the rsync command to your current directory:

  rsync -aP rsync://hgdownload.soe.ucsc.edu/genome/admin/exe/linux.x86_64/ ./

================================================================
========   addCols   ====================================
================================================================
### kent source version 473 ###
addCols - Sum columns in a text file.
usage:
   addCols <fileName>
adds all columns in the given file, 
outputs the sum of each column.  <fileName> can be the
name: stdin to accept input from stdin.
Options:
    -maxCols=N - maximum number of columns (defaults to 16)
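
As a cross-check on what "sum of each column" means, the same arithmetic can
be sketched with awk. This is an illustration only: sum_columns is a
hypothetical name, and the output layout (sums separated by single spaces) is
an assumption, not the exact format addCols prints.

```shell
# sum_columns (hypothetical): sum every whitespace-separated column of the
# named files, or of stdin when no file is given.
sum_columns() {
    awk '{
             for (i = 1; i <= NF; i++) sum[i] += $i
             if (NF > n) n = NF
         }
         END {
             for (i = 1; i <= n; i++)
                 printf "%s%s", sum[i] + 0, (i < n ? " " : "\n")
         }' "$@"
}

# Two rows, three columns.
printf '1 2 3\n4 5 6\n' | sum_columns
```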

================================================================
========   ameme   ====================================
================================================================
ameme - find common patterns in DNA
usage
    ameme good=goodIn.fa [bad=badIn.fa] [numMotifs=2] [background=m1] [maxOcc=2] [motifOutput=fileName] [html=output.html] [gif=output.gif] [rcToo=on] [controlRun=on] [startScanLimit=20] [outputLogo] [constrainer=1]
where goodIn.fa is a multi-sequence fa file containing instances
of the motif you want to find, badIn.fa is a file containing similar
sequences but lacking the motif, numMotifs is the number of motifs
to scan for, background is m0,m1, or m2 for various levels of Markov
models, maxOcc is the maximum occurrences of the motif you 
expect to find in a single sequence and motifOutput is the name 
of a file to store just the motifs in. rcToo=on searches both strands.
If you include controlRun=on in the command line, a random set of 
sequences will be generated that match your foreground data set in size, 
and your background data set in nucleotide probabilities. The program 
will then look for motifs in this random set. If the scores you get in a 
real run are about the same as those you get in a control run, then the motifs
Improbizer has found are probably not significant.

================================================================
========   autoDtd   ====================================
================================================================
### kent source version 473 ###
autoDtd - Give this an XML document to look at and it will come up with a DTD
to describe it.
usage:
   autoDtd in.xml out.dtd out.stats
options:
   -tree=out.tree - Output tag tree.
   -atree=out.atree - Output attributed tag tree.

================================================================
========   autoSql   ====================================
================================================================
### kent source version 473 ###
autoSql - create SQL and C code for permanently storing
a structure in database and loading it back into memory
based on a specification file
usage:
    autoSql specFile outRoot {optional: -dbLink -withNull -json} 
This will create outRoot.sql outRoot.c and outRoot.h based
on the contents of specFile. 

options:
  -dbLink - optionally generates code to execute queries and
            updates of the table.
  -addBin - Add an initial bin field and index it as (chrom,bin)
  -withNull - optionally generates code and .sql to enable
              applications to accept and load data into objects
              with potential 'missing data' (NULL in SQL)
              situations.
  -defaultZeros - will put zero and or empty string as default value
  -django - generate method to output object as django model Python code
  -json - generate method to output the object in JSON (JavaScript) format.

================================================================
========   autoXml   ====================================
================================================================
autoXml - Generate structures code and parser for XML file from DTD-like spec
usage:
   autoXml file.dtdx root
This will generate root.c, root.h
options:
   -textField=xxx what to name text between start/end tags. Default 'text'
   -comment=xxx Comment to appear at top of generated code files
   -picky  Generate parser that rejects stuff it doesn't understand
   -main   Put in a main routine that's a test harness
   -prefix=xxx Prefix to add to structure names. By default same as root
   -positive Don't write out optional attributes with negative values

================================================================
========   ave   ====================================
================================================================
ave - Compute average and basic stats
usage:
   ave file
options:
   -col=N Which column to use.  Default 1
   -tableOut - output by columns (default output in rows)
   -noQuartiles - only calculate min,max,mean,standard deviation
                - for large data sets that will not fit in memory.
================================================================
========   aveCols   ====================================
================================================================
aveCols - average together columns
usage:
   aveCols file
adds all columns (up to 16 columns) in the given file, 
outputs the average (sum/#ofRows) of each column.  <fileName> can be the
name: stdin to accept input from stdin.
================================================================
========   axtChain   ====================================
================================================================
axtChain - Chain together axt alignments.
usage:
   axtChain [options] -linearGap=loose in.axt tNibDir qNibDir out.chain
Where tNibDir/qNibDir are either directories full of nib files, the name
of a .2bit file, or a single fasta file with additional -faQ or -faT options.
options:
   -psl Use psl instead of axt format for input
   -faQ The specified qNibDir is a fasta file with multiple sequences for query
   -faT The specified tNibDir is a fasta file with multiple sequences for target
                NOTE: will not work with gzipped fasta files
   -minScore=N  Minimum score for chain, default 1000
   -details=fileName Output some additional chain details
   -scoreScheme=fileName Read the scoring matrix from a blastz-format file
   -linearGap=<medium|loose|filename> Specify type of linearGap to use.
              *Must* specify this argument to one of these choices.
              loose is chicken/human linear gap costs.
              medium is mouse/human linear gap costs.
              Or specify a piecewise linearGap tab delimited file.
   sample linearGap file (loose)
tablesize       11
smallSize       111
position        1       2       3       11      111     2111    12111   32111   72111   152111  252111
qGap    325     360     400     450     600     1100    3600    7600    15600   31600   56600
tGap    325     360     400     450     600     1100    3600    7600    15600   31600   56600
bothGap 625     660     700     750     900     1400    4000    8000    16000   32000   57000
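
The usage text asks for a tab delimited file, while the sample above is shown
with runs of spaces. The sketch below writes the same "loose" values with real
tabs and checks that every data row carries as many values as tablesize
declares. The helper names and the file name loose.gap are hypothetical; the
numbers are copied from the sample above.

```shell
# write_loose_gap_file (hypothetical): emit the sample table with tab separators.
write_loose_gap_file() {
    {
        printf 'tablesize\t11\n'
        printf 'smallSize\t111\n'
        printf 'position\t1\t2\t3\t11\t111\t2111\t12111\t32111\t72111\t152111\t252111\n'
        printf 'qGap\t325\t360\t400\t450\t600\t1100\t3600\t7600\t15600\t31600\t56600\n'
        printf 'tGap\t325\t360\t400\t450\t600\t1100\t3600\t7600\t15600\t31600\t56600\n'
        printf 'bothGap\t625\t660\t700\t750\t900\t1400\t4000\t8000\t16000\t32000\t57000\n'
    } > "$1"
}

# check_gap_file (hypothetical): fail if any row length disagrees with tablesize.
check_gap_file() {
    awk -F '\t' '
        $1 == "tablesize" { n = $2 }
        $1 == "position" || $1 == "qGap" || $1 == "tGap" || $1 == "bothGap" {
            if ((NF - 1) != n) {
                print $1 ": expected " n " values, found " (NF - 1)
                bad = 1
            }
        }
        END { exit bad }' "$1"
}

write_loose_gap_file loose.gap && check_gap_file loose.gap && echo "loose.gap is consistent"
# The file could then be passed as: axtChain -linearGap=loose.gap ...
```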

================================================================
========   axtSort   ====================================
================================================================
axtSort - Sort axt files
usage:
   axtSort in.axt out.axt
options:
   -query - Sort by query position, not target
   -byScore - Sort by score

================================================================
========   axtSwap   ====================================
================================================================
axtSwap - Swap source and query in an axt file
usage:
   axtSwap source.axt target.sizes query.sizes dest.axt
options:
   -xxx=XXX

================================================================
========   axtToMaf   ====================================
================================================================
### kent source version 473 ###
axtToMaf - Convert from axt to maf format
usage:
   axtToMaf in.axt tSizes qSizes out.maf
Where tSizes and qSizes are files that contain
the sizes of the target and query sequences.
Very often these will be chrom.sizes files.
Options:
    -qPrefix=XX. - add XX. to start of query sequence name in maf
    -tPrefix=YY. - add YY. to start of target sequence name in maf
    -tSplit Create a separate maf file for each target sequence.
            In this case output is a dir rather than a file
            In this case in.axt must be sorted by target.
    -score       - recalculate score 
    -scoreZero   - recalculate score if zero 

================================================================
========   axtToPsl   ====================================
================================================================
axtToPsl - Convert axt to psl format
usage:
   axtToPsl in.axt tSizes qSizes out.psl
Where tSizes and qSizes are tab-delimited files with
       <seqName><size>
columns.
options:
   -xxx=XXX

================================================================
========   bamToPsl   ====================================
================================================================
### kent source version 473 ###
bamToPsl - Convert a bam file to a psl and optionally also a fasta file that contains the reads.
usage:
   bamToPsl [options] in.bam out.psl
options:
   -fasta=output.fa - output query sequences to specified file
   -chromAlias=file - specify a two-column file: 1: alias, 2: other name
          for target name translation from column 1 name to column 2 name
          names not found are passed through intact
   -nohead          - do not output the PSL header, default has header output
   -dots=N          - output progress dot(.) every N alignments processed

Note: a chromAlias file can be obtained from a UCSC database, e.g.:
 hgsql -N -e 'select alias,chrom from chromAlias;' hg38 > hg38.chromAlias.tab
 Or from the downloads server:
  wget https://hgdownload.soe.ucsc.edu/goldenPath/hg38/database/chromAlias.txt.gz
See also our tool chromToUcsc

================================================================
========   barChartMaxLimit   ====================================
================================================================
Can't open file '-verbose=2' for reading
================================================================
========   bedClip   ====================================
================================================================
### kent source version 473 ###
bedClip - Remove lines from bed file that refer to off-chromosome locations.
usage:
   bedClip [options] input.bed chrom.sizes output.bed
chrom.sizes is a two-column file/URL: <chromosome name> <size in bases>
If the assembly <db> is hosted by UCSC, chrom.sizes can be a URL like
  http://hgdownload.soe.ucsc.edu/goldenPath/<db>/bigZips/<db>.chrom.sizes
or you may use the script fetchChromSizes to download the chrom.sizes file.
If not hosted by UCSC, a chrom.sizes file can be generated by running
twoBitInfo on the assembly .2bit file.
options:
   -truncate  - truncate items that span ends of chrom instead of the
                default of dropping the items
   -verbose=2 - set to get list of lines clipped and why
================================================================
========   bedCommonRegions   ====================================
================================================================
### kent source version 473 ###
bedCommonRegions - Create a bed file (just bed3) that contains the regions common to all inputs.
Regions are common only if they have exactly the same chromosome, start, and end.  Overlap is not enough.
Each region must be in each input at most once. Output is stdout.
usage:
   bedCommonRegions file1 file2 file3 ... fileN
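
The exact-match rule can be sketched with standard tools: keep the first
three columns of every input, then keep the triples seen once per input. This
is an illustration of the rule, not the tool itself. common_regions is a
hypothetical name, output order is plain byte order rather than whatever order
bedCommonRegions uses, and it relies on the stated requirement that a region
occurs at most once per input.

```shell
# common_regions (hypothetical): print chrom/start/end triples present in
# every file given on the command line.
common_regions() {
    n=$#
    for f in "$@"; do
        awk 'BEGIN { OFS = "\t" } { print $1, $2, $3 }' "$f"
    done | LC_ALL=C sort | uniq -c |
        awk -v n="$n" 'BEGIN { OFS = "\t" } $1 == n { print $2, $3, $4 }'
}

# Small demonstration with two temporary files.
tmp=$(mktemp -d)
printf 'chr1\t10\t20\nchr1\t30\t40\n' > "$tmp/a.bed"
printf 'chr1\t10\t20\nchr1\t30\t45\n' > "$tmp/b.bed"
common_regions "$tmp/a.bed" "$tmp/b.bed"
rm -r "$tmp"
```

In the demonstration only chr1 10 20 is identical in both files; the second
region overlaps but has a different end, so it is not reported.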

================================================================
========   bedCoverage   ====================================
================================================================
bedCoverage - Analyse coverage by bed files - chromosome by 
chromosome and genome-wide.
usage:
   bedCoverage database bedFile
Note bed file must be sorted by chromosome
   -restrict=restrict.bed Restrict to parts in restrict.bed

================================================================
========   bedExtendRanges   ====================================
================================================================
### kent source version 473 ###
bedExtendRanges - extend length of entries in bed 6+ data to be at least the given length,
taking strand directionality into account.

usage:
   bedExtendRanges database length file(s)

options:
   -host	mysql host
   -user	mysql user
   -password	mysql password
   -tab		Separate by tabs rather than space
   -verbose=N - verbose level for extra information to STDERR

example:

   bedExtendRanges hg18 250 stdin

   bedExtendRanges -user=genome -host=genome-mysql.soe.ucsc.edu hg18 250 stdin

will transform:
    chr1 500 525 . 100 +
    chr1 1000 1025 . 100 -
to:
    chr1 500 750 . 100 +
    chr1 775 1025 . 100 -
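
The strand-aware arithmetic in this example can be reproduced with a short awk
sketch: plus-strand items grow at the end, minus-strand items grow at the
start. extend_ranges is a hypothetical name. The sketch clamps starts at zero
but knows nothing about chromosome sizes, which the real program presumably
gets from the database argument.

```shell
# extend_ranges (hypothetical): read space-separated bed 6 on stdin and extend
# items shorter than LEN to exactly LEN, in the direction of the strand.
extend_ranges() {
    awk -v len="$1" '{
        if ($3 - $2 < len) {
            if ($6 == "-") {
                $2 = $3 - len
                if ($2 < 0) $2 = 0
            } else {
                $3 = $2 + len
            }
        }
        print
    }'
}

# Same input as the example above.
printf 'chr1 500 525 . 100 +\nchr1 1000 1025 . 100 -\n' | extend_ranges 250
```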

================================================================
========   bedGeneParts   ====================================
================================================================
### kent source version 473 ###
bedGeneParts - Given a bed, spit out promoter, first exon, or all introns.
usage:
   bedGeneParts part in.bed out.bed
Where part is either 'exons' or 'firstExon' or 'introns' or 'promoter' or 'firstCodingSplice'
or 'secondCodingSplice'
options:
   -proStart=NN - start of promoter relative to txStart, default -100
   -proEnd=NN - end of promoter relative to txStart, default 50

================================================================
========   bedGraphPack   ====================================
================================================================
### kent source version 473 ###
bedGraphPack v1 - Pack together adjacent records representing same value.
usage:
   bedGraphPack in.bedGraph out.bedGraph
The input needs to be sorted by chrom and this is checked.  To put in a pipe
use stdin and stdout in the command line in place of file names.
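
What "packing" means can be sketched in awk: consecutive records on the same
chromosome that touch end-to-start and carry the same value collapse into one
record. This is an illustration under that reading of "adjacent", with a
hypothetical function name; it does not perform the sort check that
bedGraphPack does.

```shell
# pack_bedgraph (hypothetical): merge touching records with equal values.
# Reads a sorted four-column bedGraph on stdin, writes tab-separated output.
pack_bedgraph() {
    awk 'BEGIN { OFS = "\t" }
         NR > 1 && $1 == chrom && $2 == end && $4 == val { end = $3; next }
         NR > 1 { print chrom, start, end, val }
         { chrom = $1; start = $2; end = $3; val = $4 }
         END { if (NR > 0) print chrom, start, end, val }'
}

# The first two records touch and share a value, so they are merged.
printf 'chr1\t0\t10\t1.5\nchr1\t10\t20\t1.5\nchr1\t20\t30\t2\n' | pack_bedgraph
```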

================================================================
========   bedGraphToBigWig   ====================================
================================================================
### kent source version 473 ###
bedGraphToBigWig v 2.10 - Convert a bedGraph file to bigWig format (bbi version: 4).
usage:
   bedGraphToBigWig in.bedGraph chrom.sizes out.bw
where in.bedGraph is a four column file in the format:
      <chrom> <start> <end> <value>
and chrom.sizes is a two-column file/URL: <chromosome name> <size in bases>
and out.bw is the output indexed big wig file.
If the assembly <db> is hosted by UCSC, chrom.sizes can be a URL like
  http://hgdownload.soe.ucsc.edu/goldenPath/<db>/bigZips/<db>.chrom.sizes
or you may use the script fetchChromSizes to download the chrom.sizes file.
If not hosted by UCSC, a chrom.sizes file can be generated by running
twoBitInfo on the assembly .2bit file.
The input bedGraph file must be sorted, use the unix sort command:
  LC_ALL=C sort -k1,1 -k2,2n unsorted.bedGraph > sorted.bedGraph
The LC_ALL=C variable activates case-sensitive sorting.
options:
   -blockSize=N - Number of items to bundle in r-tree.  Default 256
   -itemsPerSlot=N - Number of data points bundled at lowest level. Default 1024
   -sizesIsBb  -- If set, the chrom.sizes file is assumed to be a bigBed file.
   -unc - If set, do not use compression.
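
A minimal sketch of the sorting step described above, plus a check that a file
is already in the required order before conversion. The two function names are
hypothetical; the sort keys are the ones given in the usage text, and sort -c
is the standard option that tests order without rewriting the file.

```shell
# sort_bedgraph (hypothetical): sort IN into OUT by chrom, then numeric start.
sort_bedgraph() {
    LC_ALL=C sort -k1,1 -k2,2n "$1" > "$2"
}

# is_sorted_bedgraph (hypothetical): succeed only if FILE is already sorted
# the way sort_bedgraph would sort it.
is_sorted_bedgraph() {
    LC_ALL=C sort -c -k1,1 -k2,2n "$1" 2>/dev/null
}

# Demonstration on a three-line file in a temporary directory.
tmp=$(mktemp -d)
printf 'chr2\t0\t10\t1\nchr10\t5\t9\t2\nchr10\t0\t5\t3\n' > "$tmp/unsorted.bedGraph"
sort_bedgraph "$tmp/unsorted.bedGraph" "$tmp/sorted.bedGraph"
is_sorted_bedgraph "$tmp/sorted.bedGraph" && cat "$tmp/sorted.bedGraph"
rm -r "$tmp"

# With real data the conversion would follow, for example:
#   bedGraphToBigWig sorted.bedGraph chrom.sizes out.bw
```

Note that byte-order sorting places chr10 before chr2; that is the order the
sort command above produces.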
================================================================
========   bedIntersect   ====================================
================================================================
### kent source version 473 ###
bedIntersect - Intersect two bed files
usage:
bed columns four(name) and five(score) are optional
   bedIntersect a.bed b.bed output.bed
options:
   -aHitAny        output all of a if any of it is hit by b
   -minCoverage=0.N  min coverage of b to output match (or if -aHitAny, of a).
                   Not applied to 0-length items.  Default 0.000010
   -bScore         output score from b.bed (must be at least 5 field bed)
   -tab            chop input at tabs not spaces
   -allowStartEqualEnd  Don't discard 0-length items of a or b
                        (e.g. point insertions)

================================================================
========   bedItemOverlapCount   ====================================
================================================================
### kent source version 473 ###
bedItemOverlapCount - count number of times a base is overlapped by the
	items in a bed file.  Output is bedGraph 4 to stdout.
usage:
 sort bedFile.bed | bedItemOverlapCount [options] <database> stdin
To create a bigWig file from this data to use in a custom track:
 sort -k1,1 bedFile.bed | bedItemOverlapCount [options] <database> stdin \
         > bedFile.bedGraph
 bedGraphToBigWig bedFile.bedGraph chrom.sizes bedFile.bw
   where the chrom.sizes is obtained with the script: fetchChromSizes
   See also:
 http://genome-test.gi.ucsc.edu/~kent/src/unzipped/utils/userApps/fetchChromSizes
options:
   -zero      add blocks with zero count, normally these are omitted
   -bed12     expect bed12 and count based on blocks
              Without this option, only the first three fields are used.
   -max       if counts per base overflows set to max (4294967295) instead of exiting
   -outBounds output min/max to stderr
   -chromSize=sizefile	Read chrom sizes from file instead of database
             sizefile contains two white space separated fields per line:
		chrom name and size
   -host=hostname	mysql host used to get chrom sizes
   -user=username	mysql user
   -password=password	mysql password

Notes:
 * You may want to separate your + and - strand
   items before sending into this program as it only looks at
   the chrom, start and end columns of the bed file.
 * Program requires a <database> connection to look up chrom sizes for a sanity
   check of the incoming data.  Even when the -chromSize argument is used
   the <database> must be present, but it will not be used.

 * The bed file *must* be sorted by chrom
 * Maximum count per base is 4294967295. Recompile with new unitSize to increase this
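
Since the program ignores the strand column, the first note suggests splitting
the input by strand beforehand. A minimal sketch of that split follows,
assuming the strand is in column 6 as in standard bed 6. The function name and
the output naming scheme (PREFIX.plus.bed, PREFIX.minus.bed) are hypothetical.

```shell
# split_by_strand (hypothetical): write plus- and minus-strand items of BED
# into PREFIX.plus.bed and PREFIX.minus.bed. Items with any other strand
# value are skipped.
split_by_strand() {
    bed=$1
    prefix=$2
    awk -v plus="$prefix.plus.bed" -v minus="$prefix.minus.bed" '
        $6 == "+" { print > plus }
        $6 == "-" { print > minus }' "$bed"
}

# Each half can then be counted separately, following the usage above
# (hg38 and the file names are placeholders):
#   split_by_strand reads.bed reads
#   sort -k1,1 reads.plus.bed \
#     | bedItemOverlapCount -chromSize=chrom.sizes hg38 stdin > reads.plus.bedGraph
```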
================================================================
========   bedJoinTabOffset   ====================================
================================================================
bedJoinTabOffset - Add file offset and length of line in a text file with the same name as the BED name to each row of BED.
usage:
   bedJoinTabOffset inTabFile inBedFile outBedFile

Given a bed file and tab file where each have a column with matching values:
1. first get the value of column0, the offset and line length from inTabFile.
2. Then go over the bed file, use the -bedKey (defaults to the name field)
   field and append its offset and length to the bed file as two separate
   fields. Write the new bed file to outBed.
options:
   -bedKey=integer   0-based index key of the bed file to use to match up with
                     the tab file. Default is 3 for the name field.

================================================================
========   bedJoinTabOffset.py   ====================================
================================================================
Usage: bedJoinTabOffset.py [options] inTabFile inBedFile outBedFile - given a bed file and tab file where each have a column with matching values: first get the value of column0, the offset and line length from inTabFile. Then go over the bed file, use the name field and append its offset and length to the bed file as two separate fields. Write the new bed file to outBed.

================================================================
========   bedMergeAdjacent   ====================================
================================================================
### kent source version 473 ###
bedMergeAdjacent - merge adjacent blocks in a BED 12
usage:
   bedMergeAdjacent inBed outBed
options:

================================================================
========   bedPartition   ====================================
================================================================
### kent source version 473 ###
bedPartition - split BED ranges into non-overlapping ranges
usage:
   bedPartition [options] bedFile rangesBed

Split ranges in a BED into non-overlapping sets for use in cluster jobs.
Output is a BED 3 of the ranges.
The bedFile may be compressed and no ordering is assumed.

options:
   -partSize=1 - will combine non-overlapping partitions, up to
    this number of ranges.
    per set of overlapping records.
   -parallel=n - use this many cores for parallel sorting
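
The idea of splitting ranges into non-overlapping sets can be sketched as
below. This is a simplified illustration under stated assumptions, not the
program's algorithm: the function name is hypothetical, coordinates are
treated as 0-based half-open, and the -partSize and -parallel behavior is
not modeled.

```python
from collections import defaultdict

def partition_ranges(records):
    """Collapse overlapping (chrom, start, end) records into BED 3 ranges.

    Input order does not matter; ranges that merely touch (end == start)
    are kept separate because half-open ranges do not overlap there.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in records:
        by_chrom[chrom].append((start, end))

    parts = []
    for chrom in sorted(by_chrom):
        cur_start = cur_end = None
        for start, end in sorted(by_chrom[chrom]):
            if cur_start is None:
                cur_start, cur_end = start, end
            elif start < cur_end:
                # Overlaps the current partition: extend it.
                cur_end = max(cur_end, end)
            else:
                parts.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
        if cur_start is not None:
            parts.append((chrom, cur_start, cur_end))
    return parts

parts = partition_ranges([
    ("chr1", 150, 300),
    ("chr2", 5, 10),
    ("chr1", 100, 200),
    ("chr1", 300, 400),
])
```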

================================================================
========   bedPileUps   ====================================
================================================================
### kent source version 473 ###
bedPileUps - Find (exact) overlaps if any in bed input
usage:
   bedPileUps in.bed
Where in.bed is in one of the ascii bed formats.
The in.bed file must be sorted by chromosome,start,
  to sort a bed file, use the unix sort command:
     sort -k1,1 -k2,2n unsorted.bed > sorted.bed

Options:
  -name - include BED name field 4 when evaluating uniqueness
  -tab  - use tabs to parse fields
  -verbose=2 - show the location and size of each pileUp

================================================================
========   bedRemoveOverlap   ====================================
================================================================
### kent source version 473 ###
bedRemoveOverlap - Remove overlapping records from a (sorted) bed file.  Gets rid of
the smaller of overlapping records.
usage:
   bedRemoveOverlap in.bed out.bed
options:
   -xxx=XXX

================================================================
========   bedRestrictToPositions   ====================================
================================================================
### kent source version 473 ###
bedRestrictToPositions - Filter bed file, restricting to only ones that match chrom/start/ends specified in restrict.bed file.
usage:
   bedRestrictToPositions in.bed restrict.bed out.bed
options:
   -xxx=XXX

================================================================
========   bedSort   ====================================
================================================================
bedSort - Sort a .bed file by chrom,chromStart
usage:
   bedSort in.bed out.bed
in.bed and out.bed may be the same.
================================================================
========   bedToBigBed   ====================================
================================================================
### kent source version 473 ###
bedToBigBed v. 2.10 - Convert bed file to bigBed. (bbi version: 4)
usage:
   bedToBigBed in.bed chrom.sizes out.bb
Where in.bed is in one of the ascii bed formats, but not including track lines
and chrom.sizes is a two-column file/URL: <chromosome name> <size in bases>
and out.bb is the output indexed big bed file.

If the assembly <db> is hosted by UCSC, chrom.sizes can be a URL like
  http://hgdownload.soe.ucsc.edu/goldenPath/<db>/bigZips/<db>.chrom.sizes
or you may use the script fetchChromSizes to download the chrom.sizes file.
If you have bed annotations on patch sequences from NCBI, a more inclusive
chrom.sizes file can be found using a URL like
  http://hgdownload.soe.ucsc.edu/goldenPath/<db>/database/chromInfo.txt.gz
If not hosted by UCSC, a chrom.sizes file can be generated by running
twoBitInfo on the assembly .2bit file, or the .2bit file can be used directly
if the -sizesIs2Bit option is specified.

The chrom.sizes file may also be a chromAlias bigBed file, or a URL to
such a file, by specifying the -sizesIsChromAliasBb option.  When using
a chromAlias bigBed file, the input BED file may have chromosome names
matching any of the sequence name aliases in the chromAlias file.

For UCSC provided genomes, the chromAlias files can be found under:
    https://hgdownload.soe.ucsc.edu/goldenPath/<db>/bigZips/<db>.chromAlias.bb
For UCSC GenArk assembly hubs, the chrom aliases are named in the form:
    https://hgdownload.soe.ucsc.edu/hubs/GCF/006/542/625/GCF_006542625.1/GCF_006542625.1.chromAlias.bb
For a description of generating chromAlias files for your own assembly hub, see:
      http://genomewiki.ucsc.edu/index.php/Chrom_Alias

The in.bed file must be sorted by chromosome,start,
  to sort a bed file, use the unix sort command:
     sort -k1,1 -k2,2n unsorted.bed > sorted.bed
Sequences must be sorted by name so all sequences with the same name
are collected together, but they don't need to be in any particular order.
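
The ordering requirement described above can be checked before conversion
with a short Python sketch. The function name is hypothetical, and this is
only a pre-flight check written from the description above, not the
validation that bedToBigBed itself performs.

```python
def is_sorted_for_bigbed(lines):
    """Return True if rows of each chromosome are contiguous and the
    start coordinates never decrease within a chromosome.

    The chromosomes themselves may appear in any order.
    """
    seen = set()
    prev_chrom, prev_start = None, -1
    for line in lines:
        chrom, start = line.split()[:2]
        start = int(start)
        if chrom != prev_chrom:
            if chrom in seen:
                return False  # chromosome reappears after another one
            seen.add(chrom)
            prev_chrom, prev_start = chrom, start
        else:
            if start < prev_start:
                return False  # starts out of order within a chromosome
            prev_start = start
    return True

grouped = is_sorted_for_bigbed(["chr2 5 10", "chr1 1 2", "chr1 3 4"])
split_up = is_sorted_for_bigbed(["chr1 1 2", "chr2 5 10", "chr1 3 4"])
unsorted = is_sorted_for_bigbed(["chr1 5 6", "chr1 1 2"])
```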

options:
   -type=bedN[+[P]] : 
                      N is between 3 and 15, 
                      optional (+) if extra "bedPlus" fields, 
                      optional P specifies the number of extra fields. Not required, but preferred.
                      Examples: -type=bed6 or -type=bed6+ or -type=bed6+3 
                      (see http://genome.ucsc.edu/FAQ/FAQformat.html#format1)
   -as=fields.as - If you have non-standard "bedPlus" fields, it's great to put a definition
                   of each field in a row in AutoSql format here.
   -blockSize=N - Number of items to bundle in r-tree.  Default 256
   -itemsPerSlot=N - Number of data points bundled at lowest level. Default 512
   -unc - If set, do not use compression.
   -tab - If set, expect fields to be tab separated, normally
           expects white space separator.
   -extraIndex=fieldList - If set, make an index on each field in a comma separated list
           extraIndex=name and extraIndex=name,id are commonly used.
   -sizesIs2Bit  -- If set, the chrom.sizes file is assumed to be a 2bit file.
   -sizesIsChromAliasBb -- If set, then chrom.sizes file is assumed to be a chromAlias
    bigBed file or a URL to such a file (see above).
   -sizesIsBb  -- Obsolete name for -sizesIsChromAliasBb.
   -udcDir=/path/to/udcCacheDir  -- sets the UDC cache dir for caching of remote files.
   -allow1bpOverlap  -- allow exons to overlap by at most one base pair
   -maxAlloc=N -- Set the maximum memory allocation size to N bytes
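
As an illustration of the -type=bedN[+[P]] syntax described above, the
following Python sketch splits a type string into its parts. The function
name and regular expression are hypothetical and written from the option
text alone; they are not taken from the bedToBigBed source.

```python
import re

# bed, then N, then an optional '+' with an optional count P.
TYPE_RE = re.compile(r"^bed(\d+)(\+(\d+)?)?$")

def parse_bed_type(value):
    """Return (N, has_plus, P) for a string such as 'bed6+3'.

    P is None when no extra-field count is given.
    """
    match = TYPE_RE.match(value)
    if not match:
        raise ValueError("not a bedN[+[P]] type: %r" % value)
    n = int(match.group(1))
    if not 3 <= n <= 15:
        raise ValueError("N must be between 3 and 15, got %d" % n)
    has_plus = match.group(2) is not None
    extra = int(match.group(3)) if match.group(3) else None
    return n, has_plus, extra

examples = [parse_bed_type(t) for t in ("bed6", "bed6+", "bed6+3")]
```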

================================================================
========   bedToExons   ====================================
================================================================
### kent source version 473 ###
bedToExons - Split a bed up into individual beds.
One for each internal exon.
usage:
   bedToExons originalBeds.bed splitBeds.bed
options:
   -cdsOnly - Only output the coding portions of exons.

================================================================
========   bedToGenePred   ====================================
================================================================
### kent source version 473 ###
bedToGenePred - convert bed format files to genePred format
usage:
   bedToGenePred bedFile genePredFile

Convert a bed file to a genePred file. If BED has at least 12 columns,
then a genePred with blocks is created. Otherwise single-exon genePreds are
created.
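
The block handling described above can be sketched as follows. This is an
illustration of how BED 12 blocks map to absolute exon coordinates, with a
hypothetical function name; it is not the converter's source and it does
not write a full genePred record.

```python
def bed_to_exons(fields):
    """Return a list of (exonStart, exonEnd) for one BED row.

    With 12 or more columns, blockSizes (column 11) and blockStarts
    (column 12, relative to chromStart) define the exons. With fewer
    columns a single exon spanning chromStart..chromEnd is returned.
    """
    chrom_start = int(fields[1])
    chrom_end = int(fields[2])
    if len(fields) < 12:
        return [(chrom_start, chrom_end)]
    count = int(fields[9])
    sizes = [int(x) for x in fields[10].rstrip(",").split(",")][:count]
    starts = [int(x) for x in fields[11].rstrip(",").split(",")][:count]
    return [(chrom_start + s, chrom_start + s + size)
            for s, size in zip(starts, sizes)]

# Hypothetical rows: a two-block BED 12 and a plain BED 4.
bed12 = "chr1 1000 2000 tx1 0 + 1100 1900 0 2 100,200, 0,800,".split()
bed4 = "chr1 1000 2000 tx2".split()
exons12 = bed_to_exons(bed12)
exons4 = bed_to_exons(bed4)
```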

================================================================
========   bedToPsl   ====================================
================================================================
### kent source version 473 ###
bedToPsl - convert bed format files to psl format
usage:
   bedToPsl [options] chromSizes bedFile pslFile

Convert a BED file to a PSL file. The result is an alignment.
 It is intended to allow processing by tools that operate on PSL.
If the BED has at least 12 columns, then a PSL with blocks is created.
Otherwise single-exon PSLs are created.

Options:
-tabs        -  use tab as a separator
-keepQuery   -  instead of creating a fake query, create PSL with identical query and
                target specs. Useful if bed features are to be lifted with pslMap and one 
                wants to keep the source location in the lift result.

================================================================
========   bedWeedOverlapping   ====================================
================================================================
### kent source version 473 ###
bedWeedOverlapping - Filter out beds that overlap a 'weed.bed' file.
usage:
   bedWeedOverlapping weeds.bed input.bed output.bed
options:
   -maxOverlap=0.N - maximum overlapping ratio, default 0 (any overlap)
   -invert - keep the overlapping and get rid of everything else

================================================================
========   bigBedInfo   ====================================
================================================================
### kent source version 473 ###
bigBedInfo - Show information about a bigBed file.
usage:
   bigBedInfo file.bb
options:
   -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs
   -chroms - list all chromosomes and their sizes
   -zooms - list all zoom levels and their sizes
   -as - get autoSql spec
   -asOut - output only autoSql spec
   -extraIndex - list all the extra indexes

================================================================
========   bigBedNamedItems   ====================================
================================================================
### kent source version 473 ###
bigBedNamedItems - Extract item of given name from bigBed
usage:
   bigBedNamedItems file.bb name output.bed
options:
   -nameFile - if set, treat name parameter as file full of space delimited names
   -field=fieldName - use index on field name, default is "name"
   -header - output an autoSql-style header (starts with '#').

================================================================
========   bigBedSummary   ====================================
================================================================
### kent source version 473 ###
bigBedSummary - Extract summary information from a bigBed file.
usage:
   bigBedSummary file.bb chrom start end dataPoints
Get summary data from bigBed for indicated region, broken into
dataPoints equal parts.  (Use dataPoints=1 for simple summary.)
options:
   -type=X where X is one of:
         coverage - % of region that is covered (default)
         mean - average depth of covered regions
         min - minimum depth of covered regions
         max - maximum depth of covered regions
   -fields - print out information on fields in file.
      If fields option is used, the chrom, start, end, dataPoints
      parameters may be omitted
   -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs

================================================================
========   bigBedToBed   ====================================
================================================================
### kent source version 473 ###
bigBedToBed v1 - Convert from bigBed to ascii bed format.
usage:
   bigBedToBed input.bb output.bed
options:
   -chrom=chr1 - if set restrict output to given chromosome
   -start=N - if set, restrict output to only that over start
   -end=N - if set, restrict output to only that under end
   -bed=in.bed - restrict output to all regions in a BED file
   -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs
   -header - output an autoSql-style header (starts with '#').
   -tsv - output a TSV header (without '#').

================================================================
========   bigChainBreaks   ====================================
================================================================
### kent source version 473 ###
bigChainBreaks - output a set of rearrangement breakpoints
usage:
   bigChainBreaks bigChain.bb label breaks.txt
options:
   -xxx=XXX

================================================================
========   bigChainToChain   ====================================
================================================================
### kent source version 473 ###
bigChainToChain - convert bigChain files back into a chain file
usage:
   bigChainToChain bigChain.bb bigLinks.bb output.chain
options:
   -xxx=XXX

================================================================
========   bigGenePredToGenePred   ====================================
================================================================
### kent source version 473 ###
bigGenePredToGenePred - convert bigGenePred file to genePred.
usage:
   bigGenePredToGenePred bigGenePred.bb genePred.gp


================================================================
========   bigGuessDb   ====================================
================================================================
Usage: bigGuessDb [options] inFile - given a bigBed or bigWig file or URL, 
guess the assembly based on the chrom names and sizes. Must have bigBedInfo and
bigWigInfo in PATH. Also requires a bigGuessDb.txt.gz, an alpha version of
which can be downloaded at https://hgwdev.gi.ucsc.edu/~max/bigGuessDb/bigGuessDb.txt.gz

Example run:
    $ wget https://hgwdev.gi.ucsc.edu/~max/bigGuessDb/bigGuessDb.txt.gz
    $ bigGuessDb --best https://ftp.ncbi.nlm.nih.gov/geo/samples/GSM1014nnn/GSM1014177/suppl/GSM1014177_mm9_wgEncodeUwDnaseNih3t3NihsMImmortalSigRep2.bigWig
    mm9



================================================================
========   bigHeat   ====================================
================================================================
Usage: bigHeat [options] locationBed locationMatrixFnames chromSizes outDir - create one feature
            Duplicate BED features and color them by the values in locationMatrix.
            Creates new bigBed files in outDir and creates a basic trackDb.ra file there.

    BED file looks like this:

            chr1 1 1000 myGene 0 + 1 1000 0,0,0
            chr2 1 1000 myGene2 0 + 1 1000 0,0,0

    locationMatrix looks like this:

            gene sample1 sample2 sample3
            myGene 1 2 3
            myGene2 0.1 3 10
            myGene2_probe2 0.1 3 10

    This will create a composite with three subtracks (sample1, sample2, sample3). Each subtrack will have myGene,
    and colored in intensity with sample3 more intense than sample1 and sample2. Same for myGene2.
    Also can add a bigWig with a summary of all these values, one per nucleotide
    

================================================================
========   bigMafToMaf   ====================================
================================================================
### kent source version 473 ###
bigMafToMaf - convert bigMaf to maf file
usage:
   bigMafToMaf bigMaf.bb file.maf
options:

================================================================
========   bigPslToPsl   ====================================
================================================================
### kent source version 473 ###
bigPslToPsl - convert bigPsl file to psl
usage:
   bigPslToPsl bigPsl.bb output.psl
options:
   -collapseStrand   if target strand is '+', don't output it

================================================================
========   bigWigAverageOverBed   ====================================
================================================================
### kent source version 473 ###
bigWigAverageOverBed v2 - Compute average score of big wig over each bed, which may have introns.
usage:
   bigWigAverageOverBed in.bw in.bed out.tab
The output columns are:
   name - name field from bed, which should be unique
   size - size of bed (sum of exon sizes)
   covered - # bases within exons covered by bigWig
   sum - sum of values over all bases covered
   mean0 - average over bases with non-covered bases counting as zeroes
   mean - average over just covered bases
Options:
   -stats=stats.ra - Output a collection of overall statistics to the stats.ra file
   -bedOut=out.bed - Make output bed that is echo of input bed but with mean column appended
   -sampleAroundCenter=N - Take sample at region N bases wide centered around bed item, rather
                     than the usual sample in the bed item.
   -minMax - include two additional columns containing the min and max observed in the area.
   -tsv - include a TSV header for input to other tools.
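
The difference between the mean0 and mean columns can be illustrated with a
small Python sketch. The function name and the per-base dictionary are
hypothetical stand-ins for real bigWig access, so this shows only the
arithmetic behind the output columns listed above.

```python
def average_over_bed(exons, covered_values):
    """Compute size, covered, sum, mean0 and mean for one bed item.

    exons: list of half-open (start, end) blocks of the bed item.
    covered_values: dict mapping a base position to its value, holding
    only the bases that the (hypothetical) bigWig covers.
    """
    size = sum(end - start for start, end in exons)
    values = [covered_values[pos]
              for start, end in exons
              for pos in range(start, end)
              if pos in covered_values]
    covered = len(values)
    total = sum(values)
    return {
        "size": size,
        "covered": covered,
        "sum": total,
        # Non-covered bases count as zeroes.
        "mean0": total / size if size else 0.0,
        # Only covered bases are averaged.
        "mean": total / covered if covered else 0.0,
    }

# Position 5 lies in the intron, so it is ignored.
row = average_over_bed(
    exons=[(0, 4), (10, 12)],
    covered_values={0: 2.0, 1: 2.0, 2: 4.0, 5: 100.0, 10: 4.0},
)
```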

================================================================
========   bigWigCat   ====================================
================================================================
### kent source version 473 ###
bigWigCat v 4 - merge non-overlapping bigWig files
directly into bigWig format
usage:
   bigWigCat out.bw in1.bw in2.bw ...
Where in*.bw is in big wig format
and out.bw is the output indexed big wig file.
options:
   -itemsPerSlot=N - Number of data points bundled at lowest level. Default 1024

Note: must use wigToBigWig -fixedSummaries -keepAllChromosomes (perhaps in parallel cluster jobs) to create the input files.
Note: By non-overlapping we mean the entire span of each file, from first data point to last data point, must not overlap with that of other files.

================================================================
========   bigWigCluster   ====================================
================================================================
### kent source version 473 ###
bigWigCluster - Cluster bigWigs using a hacTree
usage:
   bigWigCluster input.list chrom.sizes output.json output.tab
where: input.list is a list of bigWig file names
       chrom.sizes is tab separated <chrom><size> for assembly for bigWigs
       output.json is json formatted output suitable for graphing with D3
       output.tab is tab-separated file of items ordered by tree with the fields
           label - label from -labels option or from file name with no dir or extension
           pos - number from 0-1 representing position according to tree and distance
           red - number from 0-255 representing recommended red component of color
           green - number from 0-255 representing recommended green component of color
           blue - number from 0-255 representing recommended blue component of color
           path - file name from input.list including directory and extension
options:
   -labels=fileName - label files from tabSeparated file with fields
           path - path to bigWig file
           label - a string with no tabs
   -precalc=precalc.tab - tab separated file with <file1> <file2> <distance>
            columns.
   -threads=N - number of threads to use, default 10
   -tmpDir=/tmp/path - place to put temp files, default current dir

================================================================
========   bigWigCorrelate   ====================================
================================================================
### kent source version 473 ###
bigWigCorrelate - Correlate bigWig files, optionally only on target regions.
usage:
   bigWigCorrelate a.bigWig b.bigWig
or
   bigWigCorrelate listOfFiles
options:
   -restrict=restrict.bigBed - restrict correlation to parts covered by this file
   -threshold=N.N - clip values to this threshold
   -rootNames - if set just report the root (minus directory and suffix) of file
                names when using listOfFiles
   -ignoreMissing - if set do not correlate where either side is missing data
                Normally missing data is treated as zeros

================================================================
========   bigWigInfo   ====================================
================================================================
### kent source version 473 ###
bigWigInfo - Print out information about bigWig file.
usage:
   bigWigInfo file.bw
options:
   -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs
   -chroms - list all chromosomes and their sizes
   -zooms - list all zoom levels and their sizes
   -minMax - list the min and max on a single line

================================================================
========   bigWigMerge   ====================================
================================================================
### kent source version 473 ###
bigWigMerge v2 - Merge together multiple bigWigs into a single output bedGraph.
You'll have to run bedGraphToBigWig to make the output bigWig.
The signal values are just added together to merge them
usage:
   bigWigMerge in1.bw in2.bw .. inN.bw out.bedGraph
options:
   -threshold=0.N - don't output values at or below this threshold. Default is 0.0
   -adjust=0.N - add adjustment to each value
   -clip=NNN.N - values higher than this are clipped to this value
   -inList - input files are lists of file names of bigWigs
   -max - merged value is maximum from input files rather than sum

================================================================
========   bigWigSummary   ====================================
================================================================
### kent source version 473 ###
bigWigSummary - Extract summary information from a bigWig file.
usage:
   bigWigSummary file.bigWig chrom start end dataPoints
Get summary data from bigWig for indicated region, broken into
dataPoints equal parts.  (Use dataPoints=1 for simple summary.)

NOTE:  start and end coordinates are in BED format (0-based)

options:
   -type=X where X is one of:
         mean - average value in region (default)
         min - minimum value in region
         max - maximum value in region
         std - standard deviation in region
         coverage - % of region that is covered
   -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs

================================================================
========   bigWigToBedGraph   ====================================
================================================================
### kent source version 473 ###
bigWigToBedGraph - Convert from bigWig to bedGraph format.
usage:
   bigWigToBedGraph in.bigWig out.bedGraph
options:
   -chrom=chr1 - if set restrict output to given chromosome
   -start=N - if set, restrict output to only that over start
   -end=N - if set, restrict output to only that under end
   -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs

================================================================
========   bigWigToWig   ====================================
================================================================
### kent source version 473 ###
bigWigToWig - Convert bigWig to wig.  This will keep more of the same structure of the
original wig than bigWigToBedGraph does, but still will break up large stepped sections
into smaller ones.
usage:
   bigWigToWig in.bigWig out.wig
options:
   -chrom=chr1 - if set restrict output to given chromosome
   -start=N - if set, restrict output to only that over start
   -end=N - if set, restrict output to only that under end
   -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs

================================================================
========   binFromRange   ====================================
================================================================
### kent source version 473 ###
binFromRange - Translate a 0-based half open start and end into a bin range sql expression.
usage:
   binFromRange start end
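
For background, the standard UCSC binning scheme that such a bin range is
based on can be sketched in Python. The function names are hypothetical,
the sketch covers only the standard scheme for coordinates up to 512 Mb
(2^29 bases), and it returns numeric bin ranges rather than reproducing the
SQL text that binFromRange prints.

```python
# Bin offsets per level: 128 kb, 1 Mb, 8 Mb, 64 Mb and 512 Mb bins.
BIN_OFFSETS = (512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0)
FIRST_SHIFT = 17   # 2^17 = 128 kb, the smallest bin size
NEXT_SHIFT = 3     # each level is 8 times larger than the one below
MAX_END = 1 << 29  # limit of the standard scheme

def smallest_bin(start, end):
    """Smallest bin that fully contains the half-open range."""
    if not 0 <= start < end <= MAX_END:
        raise ValueError("range outside the standard binning scheme")
    start_bin = start >> FIRST_SHIFT
    end_bin = (end - 1) >> FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        start_bin >>= NEXT_SHIFT
        end_bin >>= NEXT_SHIFT
    raise ValueError("range outside the standard binning scheme")

def overlapping_bin_ranges(start, end):
    """Per level, the (first, last) bin that can hold overlapping items."""
    if not 0 <= start < end <= MAX_END:
        raise ValueError("range outside the standard binning scheme")
    start_bin = start >> FIRST_SHIFT
    end_bin = (end - 1) >> FIRST_SHIFT
    ranges = []
    for offset in BIN_OFFSETS:
        ranges.append((offset + start_bin, offset + end_bin))
        start_bin >>= NEXT_SHIFT
        end_bin >>= NEXT_SHIFT
    return ranges

small = smallest_bin(0, 1000)
wide = smallest_bin(0, 200000)
levels = overlapping_bin_ranges(0, 200000)
```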

================================================================
========   blastToPsl   ====================================
================================================================
### kent source version 473 ###
blastToPsl - Convert blast alignments to PSLs.

usage:
   blastToPsl [options] blastOutput psl

Options:
  -scores=file - Write score information to this file.  Format is:
       strands qName qStart qEnd tName tStart tEnd bitscore eVal
  -verbose=n - n >= 3 prints each line of file after parsing.
               n >= 4 dumps the result of each query
  -eVal=n n is e-value threshold to filter results. Format can be either
          an integer, double or 1e-10. Default is no filter.
  -pslx - create PSLX output (includes sequences for blocks)

Output only results of last round from PSI BLAST

================================================================
========   blastXmlToPsl   ====================================
================================================================
### kent source version 473 ###
blastXmlToPsl - convert blast XML output to PSLs
usage:
   blastXmlToPsl [options] blastXml psl

options:
  -scores=file - Write score information to this file.  Format is:
       strands qName qStart qEnd tName tStart tEnd bitscore eVal qDef tDef
  -verbose=n - n >= 3 prints each line of file after parsing.
               n >= 4 dumps the result of each query
  -eVal=n n is e-value threshold to filter results. Format can be either
          an integer, double or 1e-10. Default is no filter.
  -pslx - create PSLX output (includes sequences for blocks)
  -convertToNucCoords - convert protein to nucleic alignments to nucleic
   to nucleic coordinates
  -qName=src - define element used to obtain the qName.  The following
   values are supported:
     o query-ID - use contents of the <Iteration_query-ID> element if it
       exists, otherwise use <BlastOutput_query-ID>
     o query-def0 - use the first white-space separated word of the
       <Iteration_query-def> element if it exists, otherwise the first word
       of <BlastOutput_query-def>.
   Default is query-def0.
  -tName=src - define element used to obtain the tName.  The following
   values are supported:
     o Hit_id - use contents of the <Hit_id> element.
     o Hit_def0 - use the first white-space separated word of the
       <Hit_def> element.
     o Hit_accession - contents of the <Hit_accession> element.
   Default is Hit_def0.
  -forcePsiBlast - treat as output of PSI-BLAST. blast-2.2.16 and maybe
   others identify psiblast as blastp.
Output only results of last round from PSI BLAST

================================================================
========   blat   ====================================
================================================================
### kent source version 473 ###
blat - Standalone BLAT v. 39x1 fast sequence search command line tool
usage:
   blat database query [-ooc=11.ooc] output.psl
where:
   database and query are each either a .fa, .nib or .2bit file,
      or a list of these files with one file name per line.
   -ooc=11.ooc tells the program to load over-occurring 11-mers from
      an external file.  This will increase the speed
      by a factor of 40 in many cases, but is not required.
   output.psl is the name of the output file.
   Subranges of .nib and .2bit files may be specified using the syntax:
      /path/file.nib:seqid:start-end
   or
      /path/file.2bit:seqid:start-end
   or
      /path/file.nib:start-end
   With the second form, a sequence id of file:start-end will be used.
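
The subrange syntax above can be taken apart with a short Python sketch.
The function name is hypothetical and the parsing is written from the usage
text only; it assumes the file path itself contains no ':' characters and
is not how blat parses its arguments internally.

```python
def parse_subrange(spec):
    """Split 'file:seqid:start-end' or 'file:start-end' into parts.

    Returns (path, seq_id, start, end). seq_id is None when the spec
    gives only a range, and all three are None for a plain file name.
    """
    parts = spec.split(":")
    if len(parts) == 3:
        path, seq_id, span = parts
    elif len(parts) == 2:
        path, span = parts
        seq_id = None
    else:
        return spec, None, None, None
    start_text, end_text = span.split("-")
    return path, seq_id, int(start_text), int(end_text)

with_id = parse_subrange("/path/file.2bit:chr1:100-200")
range_only = parse_subrange("/path/file.nib:100-200")
plain = parse_subrange("/path/file.fa")
```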
options:
   -t=type        Database type.  Type is one of:
                    dna - DNA sequence
                    prot - protein sequence
                    dnax - DNA sequence translated in six frames to protein
                  The default is dna.
   -q=type        Query type.  Type is one of:
                    dna - DNA sequence
                    rna - RNA sequence
                    prot - protein sequence
                    dnax - DNA sequence translated in six frames to protein
                    rnax - DNA sequence translated in three frames to protein
                  The default is dna.
   -prot          Synonymous with -t=prot -q=prot.
   -ooc=N.ooc     Use overused tile file N.ooc.  N should correspond to 
                  the tileSize.
   -tileSize=N    Sets the size of match that triggers an alignment.  
                  Usually between 8 and 12.
                  Default is 11 for DNA and 5 for protein.
   -stepSize=N    Spacing between tiles. Default is tileSize.
   -oneOff=N      If set to 1, this allows one mismatch in tile and still
                  triggers an alignment.  Default is 0.
   -minMatch=N    Sets the number of tile matches.  Usually set from 2 to 4.
                  Default is 2 for nucleotide, 1 for protein.
   -minScore=N    Sets minimum score.  This is the matches minus the 
                  mismatches minus some sort of gap penalty.  Default is 30.
   -minIdentity=N Sets minimum sequence identity (in percent).  Default is
                  90 for nucleotide searches, 25 for protein or translated
                  protein searches.
   -maxGap=N      Sets the size of maximum gap between tiles in a clump.  Usually
                  set from 0 to 3.  Default is 2. Only relevant for minMatch > 1.
   -noHead        Suppresses .psl header (so it's just a tab-separated file).
   -makeOoc=N.ooc Make overused tile file. Target needs to be complete genome.
   -repMatch=N    Sets the number of repetitions of a tile allowed before
                  it is marked as overused.  Typically this is 256 for tileSize
                  12, 1024 for tile size 11, 4096 for tile size 10.
                  Default is 1024.  Typically comes into play only with makeOoc.
                  Also affected by stepSize: when stepSize is halved, repMatch is
                  doubled to compensate.
   -noSimpRepMask Suppresses simple repeat masking.
   -mask=type     Mask out repeats.  Alignments won't be started in masked region
                  but may extend through it in nucleotide searches.  Masked areas
                  are ignored entirely in protein or translated searches. Types are:
                    lower - mask out lower-cased sequence
                    upper - mask out upper-cased sequence
                    out   - mask according to database.out RepeatMasker .out file
                    file.out - mask database according to RepeatMasker file.out
   -qMask=type    Mask out repeats in query sequence.  Similar to -mask above, but
                  for query rather than target sequence.
   -repeats=type  Type is same as mask types above.  Repeat bases will not be
                  masked in any way, but matches in repeat areas will be reported
                  separately from matches in other areas in the psl output.
   -minRepDivergence=NN   Minimum percent divergence of repeats to allow 
                  them to be unmasked.  Default is 15.  Only relevant for 
                  masking using RepeatMasker .out files.
   -dots=N        Output dot every N sequences to show program's progress.
   -trimT         Trim leading poly-T.
   -noTrimA       Don't trim trailing poly-A.
   -trimHardA     Remove poly-A tail from qSize as well as alignments in 
                  psl output.
   -fastMap       Run for fast DNA/DNA remapping - not allowing introns, 
                  requiring high %ID. Query sizes must not exceed 5000.
   -out=type      Controls output file format.  Type is one of:
                    psl - Default.  Tab-separated format, no sequence
                    pslx - Tab-separated format with sequence
                    axt - blastz-associated axt format
                    maf - multiz-associated maf format
                    sim4 - similar to sim4 format
                    wublast - similar to wublast format
                    blast - similar to NCBI blast format
                    blast8 - NCBI blast tabular format
                    blast9 - NCBI blast tabular format with comments
   -fine          For high-quality mRNAs, look harder for small initial and
                  terminal exons.  Not recommended for ESTs.
   -maxIntron=N  Sets maximum intron size. Default is 750000.
   -extendThroughN  Allows extension of alignment through large blocks of Ns.

  To filter PSL files to the best hits (e.g. minimum ID > 90% or 'only best match'),
  you can use the commands pslReps, pslCDnaFilter or pslUniq.
================================================================
========   blatHuge   ====================================
================================================================
### kent source version 473 ###
blat - Standalone BLAT v. 39x1 fast sequence search command line tool
usage:
   blat database query [-ooc=11.ooc] output.psl
where:
   database and query are each either a .fa, .nib or .2bit file,
      or a list of these files with one file name per line.
   -ooc=11.ooc tells the program to load over-occurring 11-mers from
      an external file.  This will increase the speed
      by a factor of 40 in many cases, but is not required.
   output.psl is the name of the output file.
   Subranges of .nib and .2bit files may be specified using the syntax:
      /path/file.nib:seqid:start-end
   or
      /path/file.2bit:seqid:start-end
   or
      /path/file.nib:start-end
   With the last form (no seqid), a sequence id of file:start-end will be used.
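
The subrange syntax above can be illustrated with a small Python sketch. This
is not blat's own parser: parse_seq_spec and SeqSpec are hypothetical names,
and the sketch assumes the file path itself contains no ':' characters. When
no seqid is given it returns None rather than guessing the exact
file:start-end id that blat assigns.

```python
from typing import NamedTuple, Optional


class SeqSpec(NamedTuple):
    path: str
    seq_id: Optional[str]
    start: Optional[int]
    end: Optional[int]


def parse_seq_spec(spec: str) -> SeqSpec:
    """Split 'file[:seqid]:start-end' into its parts (hypothetical helper)."""
    parts = spec.split(":")
    if len(parts) == 1:
        return SeqSpec(spec, None, None, None)  # plain file name, no subrange
    if len(parts) > 3:
        raise ValueError(f"too many ':' separated parts in {spec!r}")
    start_text, end_text = parts[-1].split("-")
    # With two parts there is no seqid; blat then names the sequence itself.
    seq_id = parts[1] if len(parts) == 3 else None
    return SeqSpec(parts[0], seq_id, int(start_text), int(end_text))


print(parse_seq_spec("/path/file.2bit:chr1:100-200"))
print(parse_seq_spec("/path/file.nib:0-50"))
```
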
options:
   -t=type        Database type.  Type is one of:
                    dna - DNA sequence
                    prot - protein sequence
                    dnax - DNA sequence translated in six frames to protein
                  The default is dna.
   -q=type        Query type.  Type is one of:
                    dna - DNA sequence
                    rna - RNA sequence
                    prot - protein sequence
                    dnax - DNA sequence translated in six frames to protein
                    rnax - RNA sequence translated in three frames to protein
                  The default is dna.
   -prot          Synonymous with -t=prot -q=prot.
   -ooc=N.ooc     Use overused tile file N.ooc.  N should correspond to 
                  the tileSize.
   -tileSize=N    Sets the size of match that triggers an alignment.  
                  Usually between 8 and 12.
                  Default is 11 for DNA and 5 for protein.
   -stepSize=N    Spacing between tiles. Default is tileSize.
   -oneOff=N      If set to 1, this allows one mismatch in tile and still
                  triggers an alignment.  Default is 0.
   -minMatch=N    Sets the number of tile matches.  Usually set from 2 to 4.
                  Default is 2 for nucleotide, 1 for protein.
   -minScore=N    Sets minimum score.  This is the matches minus the 
                  mismatches minus some sort of gap penalty.  Default is 30.
   -minIdentity=N Sets minimum sequence identity (in percent).  Default is
                  90 for nucleotide searches, 25 for protein or translated
                  protein searches.
   -maxGap=N      Sets the size of maximum gap between tiles in a clump.  Usually
                  set from 0 to 3.  Default is 2. Only relevant for minMatch > 1.
   -noHead        Suppresses .psl header (so it's just a tab-separated file).
   -makeOoc=N.ooc Make overused tile file. Target needs to be complete genome.
   -repMatch=N    Sets the number of repetitions of a tile allowed before
                  it is marked as overused.  Typically this is 256 for tileSize
                  12, 1024 for tile size 11, 4096 for tile size 10.
                  Default is 1024.  Typically comes into play only with makeOoc.
                  Also affected by stepSize: when stepSize is halved, repMatch is
                  doubled to compensate.
   -noSimpRepMask Suppresses simple repeat masking.
   -mask=type     Mask out repeats.  Alignments won't be started in masked region
                  but may extend through it in nucleotide searches.  Masked areas
                  are ignored entirely in protein or translated searches. Types are:
                    lower - mask out lower-cased sequence
                    upper - mask out upper-cased sequence
                    out   - mask according to database.out RepeatMasker .out file
                    file.out - mask database according to RepeatMasker file.out
   -qMask=type    Mask out repeats in query sequence.  Similar to -mask above, but
                  for query rather than target sequence.
   -repeats=type  Type is same as mask types above.  Repeat bases will not be
                  masked in any way, but matches in repeat areas will be reported
                  separately from matches in other areas in the psl output.
   -minRepDivergence=NN   Minimum percent divergence of repeats to allow 
                  them to be unmasked.  Default is 15.  Only relevant for 
                  masking using RepeatMasker .out files.
   -dots=N        Output dot every N sequences to show program's progress.
   -trimT         Trim leading poly-T.
   -noTrimA       Don't trim trailing poly-A.
   -trimHardA     Remove poly-A tail from qSize as well as alignments in 
                  psl output.
   -fastMap       Run for fast DNA/DNA remapping - not allowing introns, 
                  requiring high %ID. Query sizes must not exceed 5000.
   -out=type      Controls output file format.  Type is one of:
                    psl - Default.  Tab-separated format, no sequence
                    pslx - Tab-separated format with sequence
                    axt - blastz-associated axt format
                    maf - multiz-associated maf format
                    sim4 - similar to sim4 format
                    wublast - similar to wublast format
                    blast - similar to NCBI blast format
                    blast8 - NCBI blast tabular format
                    blast9 - NCBI blast tabular format with comments
   -fine          For high-quality mRNAs, look harder for small initial and
                  terminal exons.  Not recommended for ESTs.
   -maxIntron=N   Sets maximum intron size. Default is 750000.
   -extendThroughN  Allows extension of alignment through large blocks of Ns.

  To filter PSL files to the best hits (e.g. minimum ID > 90% or 'only best match'),
  you can use the commands pslReps, pslCDnaFilter or pslUniq.
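
The -repMatch rule of thumb described in the options above can be written as a
small Python sketch. Only the three documented points (256, 1024 and 4096 for
tile sizes 12, 11 and 10) and the "halve stepSize, double repMatch" remark come
from the usage text. The factor-of-4 extrapolation to other tile sizes and the
exact stepSize scaling are assumptions, and typical_rep_match is a hypothetical
name, not part of blat.

```python
from typing import Optional


def typical_rep_match(tile_size: int, step_size: Optional[int] = None) -> int:
    """Rule-of-thumb repMatch for a DNA tile size (illustrative, not blat's code)."""
    if step_size is None:
        step_size = tile_size  # -stepSize defaults to tileSize
    # Documented points: 12 -> 256, 11 -> 1024, 10 -> 4096 (factor of 4 per base).
    if tile_size <= 11:
        base = 1024 * 4 ** (11 - tile_size)
    else:
        base = 1024 // 4 ** (tile_size - 11)
    # Assumption: scale inversely with stepSize, so halving the step doubles the value.
    return base * tile_size // step_size


print(typical_rep_match(11))     # 1024
print(typical_rep_match(12, 6))  # 512
```
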
================================================================
========   calc   ====================================
================================================================
### kent source version 473 ###
calc - Little command line calculator
usage:
   calc this + that * theOther / (a + b)
Options:
  -h - output result as a human-readable integer, with k/m/g/t suffix

================================================================
========   catDir   ====================================
================================================================
catDir - concatenate files in directory to stdout.
For those times when there are too many files for cat to handle.
usage:
   catDir dir(s)
options:
   -r            Recurse into subdirectories
   -suffix=.suf  This will restrict things to files ending in .suf
   '-wild=*.???' This will match wildcards.
   -nonz         Prints file name of non-zero length files

================================================================
========   catUncomment   ====================================
================================================================
catUncomment - Concatenate input removing lines that start with '#'
Output goes to stdout
usage:
   catUncomment file(s)

================================================================
========   chainAntiRepeat   ====================================
================================================================
### kent source version 473 ###
chainAntiRepeat - Get rid of chains that are primarily the results of repeats and degenerate DNA
usage:
   chainAntiRepeat tNibDir qNibDir inChain outChain
options:
   -minScore=N - minimum score (after repeat stuff) to pass
   -noCheckScore=N - score that will pass without checks (speed tweak)

================================================================
========   chainBridge   ====================================
================================================================
### kent source version 473 ###
chainBridge - Attempt to extend alignments through double-sided gaps of similar size
usage:
   chainBridge in.chain target.2bit query.2bit out.chain
options:
   -diffTolerance=D  Don't try to extend when a 2-sided gap's longer side is Dx
                     longer than its shorter side (default: 0.3, i.e. 30% longer)
   -maxGap=N  Maximum size of double-sided gap to try to bridge (default: 6000)
              Note: there is no size limit for exact sequence matches
   -scoreScheme=fileName Read the scoring matrix from a blastz-format file
   -linearGap=<medium|loose|filename> Specify type of linearGap to use.
              loose is chicken/human linear gap costs.
              medium is mouse/human linear gap costs.
              Or specify a piecewise linearGap tab delimited file.
              (default: loose)
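
One possible reading of the -diffTolerance and -maxGap tests, as a Python
sketch. The function name is hypothetical, and applying maxGap to the longer
side, as well as treating the tolerance boundary as inclusive, are assumptions.
The exception for exact sequence matches is not modeled.

```python
def is_bridge_candidate(t_gap: int, q_gap: int,
                        diff_tolerance: float = 0.3,
                        max_gap: int = 6000) -> bool:
    """Return True if a double-sided gap looks worth trying to bridge."""
    if t_gap <= 0 or q_gap <= 0:
        return False  # not a double-sided gap
    shorter, longer = sorted((t_gap, q_gap))
    if longer > max_gap:
        return False  # assumption: the size limit applies to the longer side
    # Longer side may exceed the shorter side by at most diff_tolerance.
    return longer <= shorter * (1 + diff_tolerance)


print(is_bridge_candidate(100, 120))  # True: 20% longer
print(is_bridge_candidate(100, 140))  # False: 40% longer
```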

================================================================
========   chainCleaner   ====================================
================================================================
### kent source version 473 ###
chainCleaner - Remove chain-breaking alignments from chains that break nested chains.

NOTATION: The "breaking chain" contains a local alignment block (called "chain-breaking alignment" (CBA) or "suspect") that breaks a nested chain ("broken chain") into two nets.

usage:
   chainCleaner in.chain tNibDir qNibDir out.chain out.bed -net=in.net 
 OR 
   chainCleaner in.chain tNibDir qNibDir out.chain out.bed -tSizes=/dir/to/target/chrom.sizes -qSizes=/dir/to/query/chrom.sizes 
 First option:   you have netted the chains and specify the net file via -net=netFile
 Second option:  you have not netted the chains. Then chainCleaner will net them. In this case, you must specify the chrom.sizes file for the target and query with -tSizes/-qSizes
 tNibDir/qNibDir are either directories with nib files, or the name of a .2bit file


output:
   out.chain      output file in chain format containing the untouched chains, the original broken chain and the modified breaking chains. NOTE: this file is chainSort-ed.
   out.bed        output file in bed format containing the coords and information about the removed chain-breaking alignments.

Most important options for deciding which chain-breaking alignments (CBA) to remove:
   -LRfoldThreshold=N        threshold for removing local alignment blocks if the score of the left and right fill of brokenChain / CBA score is at least this fold threshold. Default 2.5
   -doPairs                  flag: if set, do test if pairs of CBAs can be removed
   -LRfoldThresholdPairs=N   threshold for removing local alignment blocks if the score of the left and right fill of brokenChain / CBA score is at least this fold threshold. Default 10.0
   -maxPairDistance=N        only consider pairs of CBAs where the distance between the end of the upstream CBA and the start of the downstream CBA is at most that many bp (Default 10000)

   -scoreScheme=fileName       Read the scoring matrix from a blastz-format file
   -linearGap=<medium|loose|filename> Specify type of linearGap to use.
              *Must* specify this argument to one of these choices.
              loose is chicken/human linear gap costs.
              medium is mouse/human linear gap costs.
              Or specify a piecewise linearGap tab delimited file.
   sample linearGap file (loose)
tablesize       11
smallSize       111
position        1       2       3       11      111     2111    12111   32111   72111   152111  252111
qGap    325     360     400     450     600     1100    3600    7600    15600   31600   56600
tGap    325     360     400     450     600     1100    3600    7600    15600   31600   56600
bothGap 625     660     700     750     900     1400    4000    8000    16000   32000   57000


Other options for deciding which suspects to remove: 
   -foldThreshold=N          threshold for removing local alignment blocks if the brokenChain score / suspect score is at least this fold threshold. Default 0.0
   -maxSuspectBases=N        threshold for number of target bases in aligning blocks of the suspect subChain. If higher, do not remove suspect. Default 2147483647
   -maxSuspectScore=N        threshold for score of suspect subChain. If higher, do not remove suspect. Default 100000
   -minBrokenChainScore=N    threshold for minimum score of the entire broken chain. If the broken chain scores lower, it is less likely to be a real alignment and we will not remove the suspect. Default 50000
   -minLRGapSize=N           threshold for min size of left/right gap (how far the suspect is away from other blocks in the breaking chain). If lower, do not remove suspect (suspect too close to left or right part of breaking chain). Default 0


Debug and testing options: 
   -newChainIDDict=fileName  output 'newChainID{tab}breakingChainID' to this file. Gives a dictionary of the new IDs of chains representing removed suspects and the chain ID of the breaking chain that had the suspect before.
   -suspectDataFile=fileName output all the data for suspects to this file in bed format. If set, we do not clean any suspect as this would lead to updating the suspect values (updating the L/R fill region).
   -debug                    produces output chain files with the suspect and broken chains, and a bed file with information about all possible suspects. For debugging.

================================================================
========   chainFilter   ====================================
================================================================
### kent source version 473 ###
chainFilter - Filter chain files.  Output goes to standard out.
usage:
   chainFilter file(s)
options:
   -q=chr1,chr2 - restrict query side sequence to those named
   -notQ=chr1,chr2 - restrict query side sequence to those not named
   -t=chr1,chr2 - restrict target side sequence to those named
   -notT=chr1,chr2 - restrict target side sequence to those not named
   -id=N - only get one with ID number matching N
   -minScore=N - restrict to those scoring at least N
   -maxScore=N - restrict to those scoring less than N
   -qStartMin=N - restrict to those with qStart at least N
   -qStartMax=N - restrict to those with qStart less than N
   -qEndMin=N - restrict to those with qEnd at least N
   -qEndMax=N - restrict to those with qEnd less than N
   -tStartMin=N - restrict to those with tStart at least N
   -tStartMax=N - restrict to those with tStart less than N
   -tEndMin=N - restrict to those with tEnd at least N
   -tEndMax=N - restrict to those with tEnd less than N
   -qOverlapStart=N - restrict to those where the query overlaps a region starting here
   -qOverlapEnd=N - restrict to those where the query overlaps a region ending here
   -tOverlapStart=N - restrict to those where the target overlaps a region starting here
   -tOverlapEnd=N - restrict to those where the target overlaps a region ending here
   -strand=?    - restrict strand (to + or -)
   -long        - output in long format
   -zeroGap     - get rid of gaps of length zero
   -minGapless=N - pass those with minimum gapless block of at least N
   -qMinGap=N     - pass those with minimum gap size of at least N
   -tMinGap=N     - pass those with minimum gap size of at least N
   -qMaxGap=N     - pass those with maximum gap size no larger than N
   -tMaxGap=N     - pass those with maximum gap size no larger than N
   -qMinSize=N    - minimum size of spanned query region
   -qMaxSize=N    - maximum size of spanned query region
   -tMinSize=N    - minimum size of spanned target region
   -tMaxSize=N    - maximum size of spanned target region
   -noRandom      - suppress chains involving '_random' chromosomes
   -noHap         - suppress chains involving '_hap|_alt' chromosomes
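
The effect of -t, -minScore and -maxScore can be mimicked in a few lines of
Python. This is a sketch of the filtering idea only, not chainFilter's
implementation; filter_chains is a hypothetical name. It relies on the chain
header layout "chain score tName tSize tStrand tStart tEnd qName qSize qStrand
qStart qEnd id" and keeps or drops each header together with the block lines
that follow it.

```python
from typing import Iterable, Iterator, Optional, Set


def filter_chains(lines: Iterable[str],
                  t_names: Optional[Set[str]] = None,
                  min_score: Optional[float] = None,
                  max_score: Optional[float] = None) -> Iterator[str]:
    """Yield the lines of every chain record that passes all given tests."""
    keep = False
    for line in lines:
        if line.startswith("chain "):
            fields = line.split()
            score, t_name = float(fields[1]), fields[2]
            keep = (
                (t_names is None or t_name in t_names)
                and (min_score is None or score >= min_score)  # at least N
                and (max_score is None or score < max_score)   # less than N
            )
        if keep:
            yield line


example = [
    "chain 5000 chr1 1000 + 0 100 chr2 900 + 0 100 1",
    "100",
    "",
    "chain 1500 chr3 1000 + 0 50 chr2 900 + 0 50 2",
    "50",
    "",
]
for kept in filter_chains(example, min_score=2000):
    print(kept)
```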

================================================================
========   chainMergeSort   ====================================
================================================================
### kent source version 473 ###
chainMergeSort - Combine sorted files into larger sorted file
usage:
   chainMergeSort file(s)
Output goes to standard output
options:
   -saveId - keep the existing chain ids.
   -inputList=somefile - somefile contains list of input chain files.
   -tempDir=somedir/ - somedir has space for temporary sorting data, default ./

================================================================
========   chainNet   ====================================
================================================================
### kent source version 473 ###
chainNet - Make alignment nets out of chains
usage:
   chainNet in.chain target.sizes query.sizes target.net query.net
where:
   in.chain is the chain file sorted by score
   target.sizes contains the size of the target sequences
   query.sizes contains the size of the query sequences
   target.net is the output over the target genome
   query.net is the output over the query genome
options:
   -minSpace=N - minimum gap size to fill, default 25
   -minFill=N  - default half of minSpace
   -minScore=N - minimum chain score to consider, default 2000.0
   -verbose=N - Alter verbosity (default 1)
   -inclHap - include query sequences named in the form *_hap*|*_alt*.
              Normally these are excluded from nets as being haplotype
              pseudochromosomes

================================================================
========   chainPreNet   ====================================
================================================================
### kent source version 473 ###
chainPreNet - Remove chains that don't have a chance of being netted
usage:
   chainPreNet in.chain target.sizes query.sizes out.chain
options:
   -dots=N - output a dot every so often
   -pad=N - extra to pad around blocks to decrease trash
            (default 1)
   -inclHap - include query sequences named in the form *_hap*|*_alt*.
              Normally these are excluded from nets as being haplotype
              pseudochromosomes

================================================================
========   chainScore   ====================================
================================================================
### kent source version 473 ###
chainScore - score chains
usage:
   chainScore in.chain t.2bit q.2bit out.chain
options:
   -faQ  q.2bit is read as a fasta file
   -minScore=N  Minimum score for chain, default 1000
   -scoreScheme=fileName Read the scoring matrix from a blastz-format file
   -linearGap=filename Read piecewise linear gap from tab delimited file
   sample linearGap file 
tablesize 11
smallSize 111
position 1 2 3 11 111 2111 12111 32111 72111 152111 252111
qGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900
tGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900
bothGap 750 825 850 1000 1300 3300 23300 58300 118300 218300 318300
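
A minimal Python sketch of how such a linearGap table could be read and
queried. The function names are hypothetical, and the plain linear
interpolation between listed positions (with the last segment's slope used
beyond the final position) is an assumption about the intended behavior, not a
copy of the kent implementation. Fields are split on any whitespace.

```python
SAMPLE = """\
tablesize 11
smallSize 111
position 1 2 3 11 111 2111 12111 32111 72111 152111 252111
qGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900
tGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900
bothGap 750 825 850 1000 1300 3300 23300 58300 118300 218300 318300
"""


def parse_linear_gap(text):
    """Return {row name: list of ints} for each non-empty row of the table."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            table[fields[0]] = [int(value) for value in fields[1:]]
    return table


def gap_cost(table, kind, size):
    """Cost of a gap of the given size; kind is 'qGap', 'tGap' or 'bothGap'."""
    positions, costs = table["position"], table[kind]
    if size <= positions[0]:
        return costs[0]
    # Find the segment containing size; fall through to the last segment
    # when size lies beyond the final position (assumed extrapolation).
    for i in range(1, len(positions)):
        if size <= positions[i]:
            break
    x0, x1 = positions[i - 1], positions[i]
    y0, y1 = costs[i - 1], costs[i]
    return y0 + (y1 - y0) * (size - x0) / (x1 - x0)


table = parse_linear_gap(SAMPLE)
print(gap_cost(table, "qGap", 61))  # halfway between positions 11 and 111
```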

================================================================
========   chainSort   ====================================
================================================================
### kent source version 473 ###
chainSort - Sort chains.  By default sorts by score.
Note this loads all chains into memory, so it is not
suitable for large sets.  Instead, run chainSort on
multiple small files, followed by chainMergeSort.
usage:
   chainSort inFile outFile
Note that inFile and outFile can be the same
options:
   -target sort on target start rather than score
   -query sort on query start rather than score
   -index=out.tab build simple two column index file
                    <out file position>  <value>
                  where <value> is score, target, or query 
                  depending on the sort.
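
The "sort small files, then merge" strategy recommended above is a standard
external merge sort. The Python sketch below illustrates the idea with
in-memory lists standing in for files. It is not how chainSort or
chainMergeSort are implemented, the (score, id) tuples are made-up data, and
the best-score-first ordering is an assumption.

```python
import heapq


def sort_chunk(chains):
    """Sort one small batch of (score, id) pairs, highest score first."""
    return sorted(chains, key=lambda chain: chain[0], reverse=True)


def merge_sorted(chunks):
    """Merge already-sorted batches without re-sorting everything at once."""
    return list(heapq.merge(*chunks, key=lambda chain: chain[0], reverse=True))


batches = [
    [(10, "a"), (500, "b")],
    [(300, "c"), (20, "d")],
]
merged = merge_sorted([sort_chunk(batch) for batch in batches])
print(merged)
```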

================================================================
========   chainSplit   ====================================
================================================================
### kent source version 473 ###
chainSplit - Split chains up by target or query sequence
usage:
   chainSplit outDir inChain(s)
options:
   -q  - Split on query (default is on target)
   -lump=N  Lump together so have only N split files.

================================================================
========   chainStitchId   ====================================
================================================================
### kent source version 473 ###
chainStitchId - Join chain fragments with the same chain ID into a single
   chain per ID.  Chain fragments must be from same original chain but
   must not overlap.  Chain fragment scores are summed.
usage:
   chainStitchId in.chain out.chain

================================================================
========   chainSwap   ====================================
================================================================
chainSwap - Swap target and query in chain
usage:
   chainSwap in.chain out.chain

================================================================
========   chainToAxt   ====================================
================================================================
### kent source version 473 ###
chainToAxt - Convert from chain to axt file
usage:
   chainToAxt in.chain tNibDirOr2bit qNibDirOr2bit out.axt
options:
   -maxGap=N    maximum gap size allowed without breaking, default 100
   -maxChain=N  maximum chain size allowed without breaking, default 2147483647
   -minScore=N  minimum score of chain
   -minId=N     minimum percentage ID within blocks
   -bed  Output bed instead of axt

================================================================
========   chainToBigChain   ====================================
================================================================
### kent source version 473 ###
chainToBigChain - converts chain to bigChain input (bed format with extra fields)
usage:
  chainToBigChain chainIn bigChainOut bigLinkOut

Output will be sorted

To build bigBed files:
  bedToBigBed -type=bed6+6 -as=bigChain.as -tab data.bigChain hg38.chrom.sizes data.bb
  bedToBigBed -type=bed4+1 -as=bigLink.as -tab data.bigLink hg38.chrom.sizes data.link.bb

================================================================
========   chainToPsl   ====================================
================================================================
chainToPsl - Convert chain file to psl format
usage:
   chainToPsl in.chain tSizes qSizes target.lst query.lst out.psl
Where tSizes and qSizes are tab-delimited files with
       <seqName><size>
columns.
The target and query lists can either be fasta files, nib files, 2bit files
or a list of fasta, 2bit and/or nib files one per line

================================================================
========   chainToPslBasic   ====================================
================================================================
### kent source version 473 ###
chainToPslBasic - Basic conversion of chain file to psl format
usage:
   chainToPslBasic in.chain out.psl
If you need match and mismatch stats updated, pipe output through pslRecalcMatch

================================================================
========   checkAgpAndFa   ====================================
================================================================
### kent source version 473 ###
checkAgpAndFa - takes a .agp file and .fa file and ensures that they are in synch
usage:

   checkAgpAndFa in.agp in.fa

options:
   -exclude=seq - Ignore seq (e.g. chrM for which we usually get
                  sequence from GenBank but don't have AGP)
in.fa can be a .2bit file.  If it is .fa then sequences must appear
in the same order in .agp and .fa.


================================================================
========   checkCoverageGaps   ====================================
================================================================
### kent source version 473 ###
checkCoverageGaps - Check for biggest gap in coverage for a list of tracks.
For most tracks a gap in coverage of 10,000,000 or more will indicate that there was
a mistake in generating the track.
usage:
   checkCoverageGaps database track1 ... trackN
Note: for bigWig and bigBeds, the biggest gap is rounded to the nearest 10,000 or so
options:
   -allParts  If set then include _hap and _random and other weird chroms
   -female If set then don't check chrY
   -noComma - Don't put commas in biggest gap output

================================================================
========   checkHgFindSpec   ====================================
================================================================
### kent source version 473 ###
checkHgFindSpec - test and describe search specs in hgFindSpec tables.
usage:
  checkHgFindSpec database [options | termToSearch]
If given a termToSearch, displays the list of tables that will be searched
and how long it took to figure that out; then performs the search and
reports the time it took.
options:
  -showSearches       Show the order in which tables will be searched in
                      general.  [This will be done anyway if no
                      termToSearch or options are specified.]
  -checkTermRegex     For each search spec that includes a regular
                      expression for terms, make sure that all values of
                      the table field to be searched match the regex.  (If
                      not, some of them could be excluded from searches.)
  -checkIndexes       Make sure that an index is defined on each field to
                      be searched.
  -noHtml             Do not print the html results list, just figure out
                      the search results and print timing

================================================================
========   checkTableCoords   ====================================
================================================================
### kent source version 473 ###
checkTableCoords - check invariants on genomic coords in table(s).
usage:
  checkTableCoords database [tableName]
Searches for illegal genomic coordinates in all tables in database
unless narrowed down using options.  Uses ~/.hg.conf to determine
genome database connection info.  For psl/alignment tables, checks
target coords only.
options:
  -table=tableName  Check this table only.  (Default: all tables)
  -daysOld=N        Check tables that have been modified at most N days ago.
  -hoursOld=N       Check tables that have been modified at most N hours ago.
                    (days and hours are additive)
  -exclude=patList  Exclude tables matching any pattern in comma-separated
                    patList.  patList can contain wildcards (*?) but should
                    be escaped or single-quoted if it does.  patList can
                    contain "genbank" which will be expanded to all tables
                    generated by the automated genbank build process.
  -ignoreBlocks     To save time (but lose coverage), skip block coord checks.
  -verboseBlocks    Print out more details about illegal block coords, since 
                    they can't be found by simple SQL queries.

================================================================
========   chopFaLines   ====================================
================================================================
chopFaLines - Read in FA file with long lines and rewrite it with shorter lines
usage:
   chopFaLines in.fa out.fa

================================================================
========   chromGraphFromBin   ====================================
================================================================
### kent source version 473 ###
chromGraphFromBin - Convert chromGraph binary to ascii format.
usage:
   chromGraphFromBin in.chromGraph out.tab
options:
   -chrom=chrX - restrict output to single chromosome

================================================================
========   chromGraphToBin   ====================================
================================================================
### kent source version 473 ###
chromGraphToBin - Make binary version of chromGraph.
usage:
   chromGraphToBin in.tab out.chromGraph
options:
   -xxx=XXX

================================================================
========   chromToUcsc   ====================================
================================================================
Usage: chromToUcsc [options] filename - change NCBI or Ensembl chromosome names to UCSC names in tabular or wiggle files, using a chromAlias table.

    Supports these UCSC file formats:
    BED, genePred, PSL, wiggle (all formats), bedGraph, VCF, SAM, GTF, Chain
    ... or any other csv or tsv format where the sequence (chromosome) name is a separate field.

    Requires a <genome>.chromAlias.tsv file which can be downloaded like this:
        chromToUcsc --get hg19              # download the file hg19.chromAlias.tsv into current directory
    Which also works for GenArk assemblies and can take an output directory:
        chromToUcsc --get GCF_000001735.3 -o /tmp/  # for GenArk assemblies, will translate to NCBI sequence names (accessions)

    If you do not want to use the --get option to retrieve the mapping tables, you can also download the alias mapping
    files yourself, e.g. for mm10 with 'wget https://hgdownload.soe.ucsc.edu/goldenPath/mm10/database/chromAlias.txt.gz'

    Then the script can be run like this:
        chromToUcsc -i in.bed -o out.bed -a hg19.chromAlias.tsv
        chromToUcsc -i in.bed -o out.bed -a https://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/chromAlias.txt.gz
    Or in pipes, like this:
        cat test.bed | chromToUcsc -a mm10.chromAlias.tsv > test.ucsc.bed
    For BAM files use this program in a pipe with samtools:
        samtools view -h in.bam | ./chromToUcsc -a mm10.chromAlias.tsv | samtools view -bS - > out.bam

    By default, this script expects the chromosome name in the first field.
    The default works for BED, bedGraph, GTF, wiggle, VCF.
    For the following file formats, you will need to set the -k option to these values manually:
    genePred: 2 -- PSL: 10 (query) or 14 (target) -- chain: 2 (target) or 7 (query) -- SAM: 2
    (If a line starts with @ (SAM format), -k is automatically set to 2.)
    

Options:
  -h, --help            show this help message and exit
  --get=DOWNLOADDB      download a chrom alias table from UCSC for the
                        genomeDb into the current directory or directory
                        provided by -o and exit
  -a ALIASFNAME, --chromAlias=ALIASFNAME
                        a UCSC chromAlias file in tab-sep format or the
                        http/https URL to one
  -i INFNAME, --in=INFNAME
                        input filename, default: /dev/stdin
  -o OUTFNAME, --out=OUTFNAME
                        output filename, default: /dev/stdout
  -d, --debug           show debug messages
  -s, --skipUnknown     skip unknown sequence rather than generate an error.
  -k FIELDNO, --field=FIELDNO
                        Index of field (1-based) that contains the chromosome
                        name. No other field is touched by this program,
                        unless the SAM format is detected. Default is 1 (first
                        field).
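
The core of the renaming step, replacing only the field selected by -k and
leaving all others untouched, can be sketched in Python as follows. This is an
illustration, not the script's own code: rename_chrom and the small alias
dictionary are hypothetical, and reading a real chromAlias file is left out.

```python
def rename_chrom(line, alias, field_no=1, skip_unknown=False):
    """Rewrite the 1-based field field_no of a tab-separated line via alias."""
    fields = line.rstrip("\n").split("\t")
    name = fields[field_no - 1]
    if name not in alias:
        if skip_unknown:
            return None  # caller drops the line, as with -s / --skipUnknown
        raise KeyError(f"unknown sequence name: {name}")
    fields[field_no - 1] = alias[name]
    return "\t".join(fields)


alias = {"1": "chr1", "MT": "chrM"}  # made-up mapping for illustration
print(rename_chrom("1\t100\t200", alias))
print(rename_chrom("geneA\tMT\t+\t5\t90", alias, field_no=2))
```
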
================================================================
========   clusterGenes   ====================================
================================================================
### kent source version 473 ###
clusterGenes - Cluster genes from genePred tracks
usage:
   clusterGenes [options] outputFile database table1 ... tableN
   clusterGenes [options] -trackNames outputFile database track1 table1 ... trackN tableN

Where outputFile is a tab-separated file describing the clustering,
database is a genome database such as mm4 or hg16,
and the table parameters are either tables in genePred format in that
database or genePred tab-separated files. If the input is all from files, the database
argument can be `no'.
options:
   -verbose=N - Print copious debugging info. 0 for none, 3 for loads
   -chrom=chrN - Just work on this chromosome; may be repeated.
   -cds - cluster only on CDS exons. Non-coding genes are dropped.
   -trackNames - If specified, inputs are pairs of track names and files.
    This is useful when the file names don't reflect the desired track
    names.
   -ignoreStrand - cluster positive and negative strand together
   -clusterBed=bed - output BED file for each cluster.  If -cds is specified,
    this only contains bounds based on CDS
   -clusterTxBed=bed - output BED file for each cluster.  If -cds is specified,
    this contains bounds based on full transcript range of genes, not just CDS
   -flatBed=bed - output BED file that contains the exons of all genes
    flattened into a single record. If -cds is specified, this only contains
    bounds based on CDS
   -joinContained - join genes that are contained within a larger locus
    into that locus. Intended as a way to handle fragments and exon-level
    predictions, as genes-in-introns on the same strand are very rare.
   -conflicted - detect conflicted loci. Conflicted loci are loci that
    contain genes that share no sequence.  This option greatly increases
    size of output file.
   -ignoreBases=N - ignore this many bases at the start and end of each
    transcript when determining overlap.  This prevents small amounts of overlap
    from merging transcripts.  If -cds is specified, this amount of the CDS
    is ignored. The default is 0.

The cdsConflicts and exonConflicts columns contain `y' if the cluster has
conflicts. A conflict is a cluster in which not all of the genes share exons.
Conflicts may be either internal to a table or between tables.

================================================================
========   clusterMatrixToBarChartBed   ====================================
================================================================
### kent source version 473 ###
clusterMatrixToBarChartBed - Compute a barchart bed file from a gene matrix
and a gene bed file and a way to cluster samples.
NOTE: consider using matrixClusterColumns and matrixToBarChartBed instead
usage:
   clusterMatrixToBarChartBed sampleClusters.tsv geneMatrix.tsv geneset.bed output.bed
where:
   sampleClusters.tsv is a two column tab separated file with sampleId and clusterId
   geneMatrix.tsv has a row for each gene. The first row uses the same sampleId as above
   geneset.bed maps the genes in the matrix (from its first column) to the genome
        geneset.bed needs 6 standard bed fields.  Unless -name2 is set it also needs a name2
        field as the last field
   output.bed is the resulting bar chart, with one column per cluster
options:
   -simple - don't store the position of gene in geneMatrix.tsv file in output
   -median - use median (instead of mean)
   -name2=twoColFile.tsv - get name2 from file where first col is same as geneset.bed's name
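
The per-cluster summarizing implied by the inputs and the -median option can
be sketched in Python as follows. This is only an illustration under stated
assumptions: the function name cluster_values and the in-memory data layout
are hypothetical, and the actual program also reads the bed file and writes
bar chart bed output, which is not shown here.

```python
from collections import defaultdict
from statistics import mean, median


def cluster_values(sample_to_cluster, sample_ids, row, use_median=False):
    """Collapse one gene's per-sample values into one value per cluster.

    sample_to_cluster: dict built from the two-column sampleClusters.tsv
    sample_ids: sample ids from the first row of the matrix
    row: the numeric values for one gene, in the same order as sample_ids
    """
    grouped = defaultdict(list)
    for sample_id, value in zip(sample_ids, row):
        grouped[sample_to_cluster[sample_id]].append(value)
    summarize = median if use_median else mean
    # Sorting cluster ids here is an assumption made for a stable example.
    return {cluster: summarize(vals) for cluster, vals in sorted(grouped.items())}


# Made-up data, for illustration only.
sample_to_cluster = {"s1": "A", "s2": "A", "s3": "B", "s4": "A"}
sample_ids = ["s1", "s2", "s3", "s4"]
row = [1.0, 2.0, 10.0, 9.0]
print(cluster_values(sample_to_cluster, sample_ids, row))
print(cluster_values(sample_to_cluster, sample_ids, row, use_median=True))
```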

================================================================
========   colTransform   ====================================
================================================================
colTransform - Add and/or multiply column by constant.
usage:
   colTransform column input.tab addFactor mulFactor output.tab
where:
   column is the column to transform, starting with 1
   input.tab is the tab delimited input file
   addFactor is what to add.  Use 0 here to not change anything
   mulFactor is what to multiply by.  Use 1 here to not change anything
   output.tab is the tab delimited output file

================================================================
========   countChars   ====================================
================================================================
countChars - Count the number of occurrences of a particular char
usage:
   countChars char file(s)
Char can either be a two digit hexadecimal value or
a single letter literal character
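
One way to read the rule above for the char argument is sketched below in
Python. The helper names parse_char and count_char are hypothetical, and
treating every two-character argument as hexadecimal is an assumption drawn
from the usage text, not from the countChars source.

```python
def parse_char(arg):
    """Interpret arg as a literal character or a two digit hex value."""
    if len(arg) == 1:
        return arg                    # single literal character
    if len(arg) == 2:
        return chr(int(arg, 16))      # two digit hexadecimal value
    raise ValueError("expected one literal character or two hex digits")


def count_char(arg, text):
    """Count occurrences of the character described by arg in text."""
    return text.count(parse_char(arg))


print(count_char("a", "banana"))      # literal character
print(count_char("0a", "x\ny\n"))     # 0a is the newline character
```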
================================================================
========   cpg_lh   ====================================
================================================================
cpg_lh - calculate CpG Island data for cpgIslandExt tracks
usage:
    cpg_lh <sequence.fa>
where <sequence.fa> is fasta sequence, must be more than
   200 bases of legitimate sequence, not all N's

To process the output into the UCSC bed file format:

cpg_lh fastaInput.fa \
 | awk '{$2 = $2 - 1; width = $3 - $2;
   printf("%s\t%d\t%s\t%s %s\t%s\t%s\t%0.0f\t%0.1f\t%s\t%s\n",
    $1, $2, $3, $5, $6, width, $6, width*$7*0.01, 100.0*2*$6/width, $7, $9);}' \
     | sort -k1,1 -k2,2n > output.bed

The original cpg.c was written by Gos Micklem from the Sanger Center.
LaDeana Hillier added some modifications --> cpg_lh.c, and UCSC has
added some further modifications to cpg_lh.c, so that its expected
number of CpGs in an island is calculated as described in
  Gardiner-Garden, M. and M. Frommer, 1987
  CpG islands in vertebrate genomes. J. Mol. Biol. 196:261-282

    Expected = (Number of C's * Number of G's) / Length

Instead of a sliding-window search for CpG islands, this cpg program
uses a running-sum score where a 'C' followed by a 'G' increases the
score by 17 and anything else decreases the score by 1.  When the
score transitions from positive to 0 (and at the end of the sequence),
the sequence in the current span is evaluated to see if it qualifies
as a CpG island (>200 bp length, >50% GC, >0.6 ratio of observed CpG
to expected).  Then the search recurses on the span from the position
with the max running score up to the current position.
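
The scoring rule and the qualification test described above can be sketched
in Python as follows. This is a simplified illustration, not the cpg_lh code:
the function names are hypothetical, the recursive search is omitted, and two
details are assumptions (a CG pair is consumed as a single step, and the
score never drops below 0).

```python
def running_scores(seq, cpg_bonus=17):
    """Return the running score after each scoring step."""
    seq = seq.upper()
    scores = []
    score = 0
    i = 0
    while i < len(seq):
        if seq[i:i + 2] == "CG":
            score += cpg_bonus          # a 'C' followed by a 'G'
            i += 2                      # assumption: the pair is one step
        else:
            score = max(0, score - 1)   # assumption: floor at 0
            i += 1
        scores.append(score)
    return scores


def qualifies_as_island(span):
    """Apply the >200 bp, >50% GC, >0.6 observed/expected CpG test."""
    span = span.upper()
    length = len(span)
    if length == 0:
        return False
    c_count = span.count("C")
    g_count = span.count("G")
    # Expected = (Number of C's * Number of G's) / Length
    expected = c_count * g_count / length
    observed = span.count("CG")
    gc_fraction = (c_count + g_count) / length
    ratio = observed / expected if expected > 0 else 0.0
    return length > 200 and gc_fraction > 0.5 and ratio > 0.6


print(running_scores("ACGTT"))
print(qualifies_as_island("CG" * 150))
```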
================================================================
========   crTreeIndexBed   ====================================
================================================================
### kent source version 473 ###
crTreeIndexBed - Create an index for a bed file.
usage:
   crTreeIndexBed in.bed out.cr
options:
   -blockSize=N - number of children per node in index tree. Default 1024
   -itemsPerSlot=N - number of items per index slot. Default is half block size
   -noCheckSort - Don't check sorting order of in.bed

================================================================
========   crTreeSearchBed   ====================================
================================================================
### kent source version 473 ###
crTreeSearchBed - Search a crTree indexed bed file and print all items that overlap query.
usage:
   crTreeSearchBed file.bed index.cr chrom start end

================================================================
========   dbDbToHubTxt   ====================================
================================================================
### kent source version 473 ###
dbDbToHubTxt - Reformat dbDb line to hub and genome stanzas for assembly hubs
usage:
   dbDbToHubTxt database email groups hubAndGenome.txt
options:
   -xxx=XXX

================================================================
========   dbSnoop   ====================================
================================================================
### kent source version 473 ###
dbSnoop - Produce an overview of a database.
usage:
   dbSnoop database output
options:
   -unsplit - if set will merge together tables split by chromosome
   -noNumberCommas - if set will leave out commas in big numbers
   -justSchema - only schema parts, no contents
   -skipTable=tableName - if set skip a given table name
   -profile=profileName - use profile for connection settings, default = 'db'
   -sortAlpha - if set changes output order of fields to make comparisons between databases which have field order swapped easier.

================================================================
========   dbTrash   ====================================
================================================================
### kent source version 473 ###
dbTrash - drop tables from a database older than specified N hours
usage:
   dbTrash -age=N [-drop] [-historyToo] [-db=<DB>] [-verbose=N]
options:
   -age=N - number of hours old to qualify for drop.  N can be a float.
   -drop - actually drop the tables, default is merely to display tables.
   -dropLimit=N - ERROR out if number of tables to drop is greater than limit,
                - default is to drop all expired tables
   -db=<DB> - Specify a database to work with, default is customTrash.
   -historyToo - also consider the table called 'history' for deletion.
               - default is to leave 'history' alone no matter how old.
               - this applies to the table 'metaInfo' also.
   -extFile    - check extFile for lines that reference files
               - no longer in trash
   -extDel     - delete lines in extFile that fail file check
               - otherwise just verbose(2) lines that would be deleted
   -topDir     - directory name to prepend to file names in extFile
               - default is /usr/local/apache/trash
               - file names in extFile are typically: "../trash/ct/..."
   -tableStatus  - use 'show table status' to get size data, very inefficient
   -delLostTable - delete tables that exist but are missing from metaInfo
                 - this operation can be even slower than -tableStatus
                 - if there are many tables to check.
   -verbose=N - 2 == show arguments, dates, and dropped tables,
              - 3 == show date information for all tables.
================================================================
========   endsInLf   ====================================
================================================================
endsInLf - Check that last letter in files is end of line
usage:
   endsInLf file(s)
options:
   -zeroOk

================================================================
========   estOrient   ====================================
================================================================
### kent source version 473 ###
estOrient - convert ESTs so that orientation matches direction of transcription

usage: estOrient [options] db estTable outPsl

Read ESTs from a database and determine orientation based on
estOrientInfo table or direction in gbCdnaInfo table.  Update
PSLs so that the strand reflects the direction of transcription.
By default, PSLs where the direction can't be determined are dropped.

Options:
   -chrom=chr - process this chromosome, may be repeated
   -keepDisoriented - don't drop ESTs where orientation can't
    be determined.
   -disoriented=psl - output ESTs whose orientation can't
    be determined to this file.
   -inclVer - add NCBI version number to accession if not already
    present.
   -fileInput - estTable is a psl file
   -estOrientInfo=file - instead of getting the orientation information
    from the estOrientInfo table, load it from this file.  This data is the
    output of polyInfo command.  If this option is specified, the direction
    will not be looked up in the gbCdnaInfo table and db can be `no'.
   -info=infoFile - write information about each EST to this tab
    separated file 
       qName tName tStart tEnd origStrand newStrand orient
    where orient is < 0 if PSL was reversed, > 0 if it was left unchanged
    and 0 if the orientation couldn't be determined (and was left
    unchanged).

================================================================
========   expMatrixToBarchartBed   ====================================
================================================================
usage: expMatrixToBarchartBed [-h] [--autoSql AUTOSQL]
                              [--groupOrderFile GROUPORDERFILE] [--useMean]
                              [--verbose]
                              sampleFile matrixFile bedFile outputFile

Generate a barChart bed6+5 file from a matrix, meta data, and coordinates.

positional arguments:
  sampleFile            Two column no header, the first column is the samples
                        which should match the matrix, the second is the
                        grouping (cell type, tissue, etc)
  matrixFile            The input matrix file. The samples in the first row
                        should exactly match the ones in the sampleFile. The
                        labels (ex ENST*****) in the first column should
                        exactly match the ones in the bed file.
  bedFile               Bed6+1 format. File that maps the column labels from
                        the matrix to coordinates. Tab separated; chr, start
                        coord, end coord, label, score, strand, gene name. The
                        score column is ignored.
  outputFile            The output file, bed 6+5 format. See the schema in
                        kent/src/hg/lib/barChartBed.as.

optional arguments:
  -h, --help            show this help message and exit
  --autoSql AUTOSQL     Optional autoSql description of extra fields in the
                        input bed.
  --groupOrderFile GROUPORDERFILE
                        Optional file to define the group order, list the
                        groups in a single column in the order desired. The
                        default ordering is alphabetical.
  --useMean             Calculate the group values using mean rather than
                        median.
  --verbose             Show runtime messages.
================================================================
========   faAlign   ====================================
================================================================
### kent source version 473 ###
faAlign - Align two fasta files
usage:
   faAlign target.fa query.fa output.axt
options:
   -dna - use DNA scoring scheme

================================================================
========   faCmp   ====================================
================================================================
### kent source version 473 ###
faCmp - Compare two .fa files
usage:
   faCmp [options] a.fa b.fa
options:
    -softMask - use the soft masking information during the compare
                Differences will be noted if the masking is different.
    -sortName - sort input files by name before comparing
    -peptide - read as peptide sequences
default:
    no masking information is used during compare.  It is as if both
    sequences were not masked.

Exit codes:
   - 0 if files are the same
   - 1 if files differ
   - 255 on an error


================================================================
========   faCount   ====================================
================================================================
### kent source version 473 ###
faCount - count base statistics and CpGs in FA files.
usage:
   faCount file(s).fa
     -summary  show only summary statistics
     -dinuc    include statistics on dinucleotide frequencies
     -strands  count bases on both strands

================================================================
========   faFilter   ====================================
================================================================
### kent source version 473 ###
faFilter - Filter fa records, selecting ones that match the specified conditions
usage:
   faFilter [options] in.fa out.fa

Options:
    -name=wildCard  - Only pass records where name matches wildcard
                      * matches any string or no character.
                      ? matches any single character.
                      anything else must match the character exactly
                      (these will need to be quoted for the shell)
    -namePatList=filename - A list of regular expressions, one per line, that
                            will be applied to the fasta name the same as -name
    -v - invert match, select non-matching records.
    -minSize=N - Only pass sequences at least this big.
    -maxSize=N - Only pass sequences this size or smaller.
    -maxN=N - Only pass sequences with fewer than this number of N's
    -uniq - Removes duplicate sequence ids, keeping the first.
    -i    - make -uniq ignore case so sequence IDs ABC and abc count as dupes.

All specified conditions must pass to pass a sequence.  If no conditions are
specified, all records will be passed.
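
The wildcard rules for -name and the "all conditions must pass" rule can be
sketched in Python as follows. The function names and keyword arguments are
hypothetical and only cover -name, -minSize and -maxSize; this is an
illustration of the documented behavior, not the faFilter implementation.

```python
import re


def wildcard_to_regex(pattern):
    """Translate the -name wildcard syntax into a compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")           # any string, including empty
        elif ch == "?":
            parts.append(".")            # any single character
        else:
            parts.append(re.escape(ch))  # everything else matches exactly
    return re.compile("".join(parts), re.DOTALL)


def passes(name, seq, name_pattern=None, min_size=None, max_size=None):
    """Return True only if every specified condition is satisfied."""
    if name_pattern is not None:
        if not wildcard_to_regex(name_pattern).fullmatch(name):
            return False
    if min_size is not None and len(seq) < min_size:
        return False
    if max_size is not None and len(seq) > max_size:
        return False
    return True


print(passes("chr1", "ACGT", name_pattern="chr?"))
print(passes("chr10", "ACGT", name_pattern="chr*", min_size=5))
```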

================================================================
========   faFilterN   ====================================
================================================================
faFilterN - Get rid of sequences with too many N's
usage:
   faFilterN in.fa out.fa maxPercentN
options:
   -out=in.fa.out
   -uniq=self.psl

================================================================
========   faFrag   ====================================
================================================================
faFrag - Extract a piece of DNA from a .fa file.
usage:
   faFrag in.fa start end out.fa
options:
   -mixed - preserve mixed-case in FASTA file

================================================================
========   faNoise   ====================================
================================================================
faNoise - Add noise to .fa file
usage:
   faNoise inName outName transitionPpt transversionPpt insertPpt deletePpt chimeraPpt
options:
   -upper - output in upper case

================================================================
========   faOneRecord   ====================================
================================================================
faOneRecord - Extract a single record from a .FA file
usage:
   faOneRecord in.fa recordName

================================================================
========   faPolyASizes   ====================================
================================================================
### kent source version 473 ###
faPolyASizes - get poly A sizes
usage:
   faPolyASizes in.fa out.tab

output file has four columns:
   id seqSize tailPolyASize headPolyTSize

options:

================================================================
========   faRandomize   ====================================
================================================================
### kent source version 473 ###
faRandomize - Program to create random fasta records
usage:
  faRandomize [-seed=N] in.fa randomized.fa
    Use optional -seed argument to specify seed (integer) for random
    number generator (rand).  Generated sequence has the
    same base frequency as seen in original fasta records.
================================================================
========   faRc   ====================================
================================================================
faRc - Reverse complement a FA file
usage:
   faRc in.fa out.fa
In.fa and out.fa may be the same file.
options:
   -keepName - keep name identical (don't prepend RC)
   -keepCase - works well for ACGTUN in either case. bizarre for other letters.
               without it bases are turned to lower, all else to n's
   -justReverse - prepends R unless asked to keep name
   -justComplement - prepends C unless asked to keep name
                     (cannot appear together with -justReverse)
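
The difference between the default behavior and -keepCase can be sketched in
Python as follows. This is an illustration only: the function name is
hypothetical, record naming (the RC prefix) is not handled, and under
keep_case the sketch leaves letters other than ACGT untouched, which may
differ from what faRc does with them.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq, keep_case=False):
    """Reverse complement a DNA string.

    Without keep_case, bases are turned to lower case and everything
    that is not a, c, g or t becomes n, as the option text describes.
    """
    if not keep_case:
        seq = "".join(b if b in "acgt" else "n" for b in seq.lower())
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ACGTn"))
print(reverse_complement("AcGT", keep_case=True))
```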

================================================================
========   faSize   ====================================
================================================================
### kent source version 473 ###
faSize - print total base count in fa files.
usage:
   faSize file(s).fa
Command flags
   -detailed        outputs name and size of each record
                    has the side effect of printing nothing else
   -tab             output statistics in a tab separated format
   -veryDetailed    outputs name, size, #Ns, #real, #upper, #lower of each record

================================================================
========   faSomeRecords   ====================================
================================================================
### kent source version 473 ###
faSomeRecords - Extract multiple fa records
usage:
   faSomeRecords in.fa listFile out.fa
options:
   -exclude - output sequences not in the list file.

================================================================
========   faSplit   ====================================
================================================================
### kent source version 473 ###
faSplit - Split an fa file into several files.
usage:
   faSplit how input.fa count outRoot
where how is either 'about' 'byname' 'base' 'gap' 'sequence' or 'size'.  
Files split by sequence will be broken at the nearest fa record boundary. 
Files split by base will be broken at any base.  
Files broken by size will be broken every count bases.

Examples:
   faSplit sequence estAll.fa 100 est
This will break up estAll.fa into 100 files
(numbered est001.fa, est002.fa, ... est100.fa)
Files will only be broken at fa record boundaries

   faSplit base chr1.fa 10 1_
This will break up chr1.fa into 10 files

   faSplit size input.fa 2000 outRoot
This breaks up input.fa into 2000 base chunks

   faSplit about est.fa 20000 outRoot
This will break up est.fa into files of about 20000 bytes each by record.

   faSplit byname scaffolds.fa outRoot/ 
This breaks up scaffolds.fa using sequence names as file names.
       Use the terminating / on the outRoot to get it to work correctly.

   faSplit gap chrN.fa 20000 outRoot
This breaks up chrN.fa into files of at most 20000 bases each, 
at gap boundaries if possible.  If the sequence ends in N's, the last
piece, if larger than 20000, will be all one piece.

Options:
    -verbose=2 - Write names of each file created (=3 more details)
    -maxN=N - Suppress pieces with more than maxN n's.  Only used with size.
              default is size-1 (only suppresses pieces that are all N).
    -oneFile - Put output in one file. Only used with size
    -extra=N - Add N extra bytes at the end to form overlapping pieces.  Only used with size.
    -out=outFile Get masking from outfile.  Only used with size.
    -lift=file.lft Put info on how to reconstruct sequence from
                   pieces in file.lft.  Only used with size and gap.
    -minGapSize=X Consider a block of Ns to be a gap if block size >= X.
                  Default value 1000.  Only used with gap.
    -noGapDrops - include all N's when splitting by gap.
    -outDirDepth=N Create N levels of output directory under current dir.
                   This helps prevent NFS problems with a large number of
                   files in a directory.  Using -outDirDepth=3 would
                   produce ./1/2/3/outRoot123.fa.
    -prefixLength=N - used with byname option. create a separate output
                   file for each group of sequence names with the same prefix
                   of length N.

================================================================
========   faToFastq   ====================================
================================================================
### kent source version 473 ###
faToFastq - Convert fa to fastq format, just faking quality values.
usage:
   faToFastq in.fa out.fastq
options:
   -qual=X quality letter to use.  Default is '<' which is good I think....

================================================================
========   faToTab   ====================================
================================================================
faToTab - convert fa file to tab separated file
usage:
   faToTab infileName outFileName
options:
     -type=seqType   sequence type, dna or protein, default is dna
     -keepAccSuffix - don't strip dot version off of sequence id, keep as is

================================================================
========   faToTwoBit   ====================================
================================================================
### kent source version 473 ###
faToTwoBit - Convert DNA from fasta to 2bit format
usage:
   faToTwoBit in.fa [in2.fa in3.fa ...] out.2bit
options:
   -long            use 64-bit offsets for index.   Allow for twoBit to contain more than 4Gb of sequence. 
                    NOT COMPATIBLE WITH OLDER CODE.
   -noMask          Ignore lower-case masking in fa file.
   -stripVersion    Strip off version number after '.' for GenBank accessions.
   -ignoreDups      Convert first sequence only if there are duplicate sequence
                    names.  Use 'twoBitDup' to find duplicate sequences.
   -namePrefix=XX.  add XX. to start of sequence name in 2bit.
================================================================
========   faToVcf   ====================================
================================================================
### kent source version 473 ###
faToVcf - Convert a FASTA alignment file to Variant Call Format (VCF) single-nucleotide diffs
usage:
   faToVcf in.fa out.vcf
options:
   -ambiguousToN         Treat all IUPAC ambiguous bases (N, R, V etc) as N (no call).
   -excludeFile=file     Exclude sequences named in file which has one sequence name per line
   -includeNoAltN        Include base positions with no alternate alleles observed, but at
                         least one N (missing base / no-call)
   -includeRef           Include the reference in the genotype columns
                         (default: omitted as redundant)
   -maskSites=file       Exclude variants in positions recommended for masking in file
                         (typically https://github.com/W-L/ProblematicSites_SARS-CoV2/raw/master/problematic_sites_sarsCov2.vcf)
   -maxDiff=N            Exclude sequences with more than N mismatches with the reference
                         (if -windowSize is used, sequences are masked accordingly first)
   -minAc=N              Ignore alternate alleles observed fewer than N times
   -minAf=F              Ignore alternate alleles observed in less than F of non-N bases
   -minAmbigInWindow=N   When -windowSize is provided, mask any base for which there are at
                         least this many N, ambiguous or gap characters within the window.
                         (default: 2)
   -noGenotypes          Output 8-column VCF, without the sample genotype columns
   -ref=seqName          Use seqName as the reference sequence; must be present in faFile
                         (default: first sequence in faFile)
   -resolveAmbiguous     For IUPAC ambiguous characters like R (A or G), if the character
                         represents two bases and one is the reference base, convert it to the
                         non-reference base.  Otherwise convert it to N.
   -startOffset=N        Add N bases to each position (for trimmed alignments)
   -vcfChrom=seqName     Use seqName for the CHROM column in VCF (default: ref sequence)
   -windowSize=N         Mask any base for which there are at least -minAmbigInWindow bases in a
                         window of +-N bases around the base.  Masking approach adapted from
                         https://github.com/roblanf/sarscov2phylo/ file scripts/mask_seq.py
                         Use -windowSize=7 for same results.
in.fa must contain a series of sequences with different names and the same length.
Both N and - are treated as missing information.
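
The rule given for -resolveAmbiguous can be sketched in Python as follows.
The table lists the standard IUPAC codes that represent exactly two bases;
the function name resolve_ambiguous is hypothetical and the sketch is an
illustration of the documented rule, not the faToVcf source.

```python
# IUPAC codes that stand for exactly two bases.
TWO_BASE_CODES = {
    "R": "AG", "Y": "CT", "S": "CG",
    "W": "AT", "K": "GT", "M": "AC",
}


def resolve_ambiguous(base, ref_base):
    """Resolve one aligned character against the reference base."""
    base = base.upper()
    ref_base = ref_base.upper()
    if len(base) == 1 and base in "ACGT":
        return base                          # already unambiguous
    options = TWO_BASE_CODES.get(base)
    if options and len(ref_base) == 1 and ref_base in options:
        return options.replace(ref_base, "")  # keep the non-reference base
    return "N"                               # everything else is a no-call


print(resolve_ambiguous("R", "A"))
print(resolve_ambiguous("R", "C"))
```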
================================================================
========   faTrans   ====================================
================================================================
### kent source version 473 ###
faTrans - Translate DNA .fa file to peptide
usage:
   faTrans in.fa out.fa
options:
   -stop stop at first stop codon (otherwise puts in Z for stop codons)
   -offset=N start at a particular offset.
   -cdsUpper - cds is in upper case

================================================================
========   fastqStatsAndSubsample   ====================================
================================================================
### kent source version 473 ###
fastqStatsAndSubsample v2 - Go through a fastq file doing sanity checks and collecting stats
and also producing a smaller fastq out of a sample of the data.  The fastq input may be
compressed with gzip or bzip2.
Paired-end samples: run on both files, the seed is fixed so it will choose the paired reads
usage:
   fastqStatsAndSubsample in.fastq out.stats out.fastq
options:
   -sampleSize=N - default 100000
   -seed=N - Use given seed for random number generator.  Default 0.
   -smallOk - Not an error if less than sampleSize reads.  out.fastq will be entire in.fastq
   -json - out.stats will be in json rather than text format
Use /dev/null for out.fastq and/or out.stats if not interested in these outputs

================================================================
========   fastqToFa   ====================================
================================================================
### kent source version 473 ###
#	no name checks will be made on lines beginning with @
#	ignore quality scores
#	using default Phred quality score algorithm
#	all errors will cause exit
fastqToFa - Convert from fastq to fasta format.
usage:
   fastqToFa [options] in.fastq out.fa
options:
   -nameVerify='string' - for multi-line fastq files, 'string' must
	match somewhere in the sequence names in order to correctly
	identify the next sequence block (e.g.: -nameVerify='Supercontig_')
   -qual=file.qual.fa - output quality scores to specified file
	(default: quality scores are ignored)
   -qualSizes=qual.sizes - write sizes file for the quality scores
   -noErrors - warn only on problems, do not error out
              (specify -verbose=3 to see warnings)
   -solexa - use Solexa/Illumina quality score algorithm
	(instead of Phred quality)
   -verbose=2 - set warning level to get some stats output during processing
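
For the simplest case, a fastq file with exactly four lines per record, the
conversion with quality scores ignored can be sketched in Python as follows.
The function name is hypothetical, and multi-line records (the case that
-nameVerify addresses) are deliberately not handled in this illustration.

```python
def fastq_to_fasta(lines):
    """Convert four-line fastq records to fasta lines, dropping qualities."""
    records = [line.rstrip("\n") for line in lines]
    if len(records) % 4 != 0:
        raise ValueError("expected a multiple of four lines")
    out = []
    for i in range(0, len(records), 4):
        header, seq, plus, _qual = records[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed record starting at line %d" % (i + 1))
        out.append(">" + header[1:])   # '@name' becomes '>name'
        out.append(seq)
    return out


example = ["@r1\n", "ACGT\n", "+\n", "IIII\n"]
print("\n".join(fastq_to_fasta(example)))
```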
================================================================
========   featureBits   ====================================
================================================================
### kent source version 473 ###
featureBits - Correlate tables via bitmap projections. 
usage:
   featureBits database table(s)
This will return the number of bits in all the tables bitwise ANDed together
Pipe warning:  output goes to stderr.
Options:
   -bed=output.bed   Put intersection into bed format. Can use stdout.
   -fa=output.fa     Put sequence in intersection into .fa file
   -faMerge          For fa output merge overlapping features.
   -minSize=N        Minimum size to output (default 1)
   -chrom=chrN       Restrict to one chromosome
   -chromSize=sizefile       Read chrom sizes from file instead of database. 
                             (chromInfo three column format)
   -or               Bitwise OR tables together instead of ANDing them.
   -not              Output negation of resulting bit set.
   -countGaps        Count gaps in denominator
   -countBlocks      Count blocks in bed12 files rather than entire extent.
   -noRandom         Don't include _random (or Un) chromosomes
   -noHap            Don't include _hap|_alt chromosomes
   -primaryChroms    Primary assembly (chroms without '_' in name)
   -dots=N           Output dot every N chroms (scaffolds) processed
   -minFeatureSize=n Don't include bits of the track that are smaller than
                     minFeatureSize, useful for differentiating between
                     alignment gaps and introns.
   -bin=output.bin   Put bin counts in output file
   -binSize=N        Bin size for generating counts in bin file (default 500000)
   -binOverlap=N     Bin overlap for generating counts in bin file (default 250000)
   -bedRegionIn=input.bed    Read in a bed file for bin counts in specific regions 
                     and write to bedRegionOut
   -bedRegionOut=output.bed  Write a bed file of bin counts in specific regions 
                     from bedRegionIn
   -enrichment       Calculates coverage and enrichment assuming first table
                     is reference gene track and second track something else
                     Enrichment is the amount of table1 that covers table2 vs. the
                     amount of table1 that covers the genome. It's how much denser
                     table1 is in table2 than it is genome-wide.
   '-where=some sql pattern'  Restrict to features matching some sql pattern
You can include a '!' before a table name to negate it.
   To prevent your shell from interpreting the '!' you will need
   to use the backslash \!, for example the gap table: \!gap
Some table names can be followed by modifiers such as:
    :exon:N          Break into exons and add N to each end of each exon
    :cds             Break into coding exons
    :intron:N        Break into introns, remove N from each end
    :utr5, :utr3     Break into 5' or 3' UTRs
    :upstream:N      Consider the region of N bases before region
    :end:N           Consider the region of N bases after region
    :score:N         Consider records with score >= N 
    :upstreamAll:N   Like upstream, but doesn't filter out genes that 
                     have txStart==cdsStart or txEnd==cdsEnd
    :endAll:N        Like end, but doesn't filter out genes that 
                     have txStart==cdsStart or txEnd==cdsEnd
The tables can be bed, psl, or chain files, or a directory full of
such files as well as actual database tables.  To count the bits
used in dir/chrN_something*.bed you'd do:
   featureBits database dir/_something.bed
File types supported are BED, bigBed, PSL, and chain.  The suffix of the file 
is used to determine the type and MUST be .bed, .bb, .psl, or .chain respectively.
NB: by default, featureBits omits gap regions from its calculation of the total
number of bases.  This requires connecting to a database server using credentials
from a .hg.conf file (or similar).  If such a connection is not available, you will
need to specify -countGaps (which skips the database connection) in addition to
providing all tables as files or directories.
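
As a rough illustration of the -enrichment description above, the ratio can be
worked through by hand.  The Python sketch below is not featureBits code: the
function name and all base counts are made up for the example, and it simply
divides the density of table1 inside table2 by its density genome-wide.

```python
def enrichment(both_bases, table2_bases, table1_bases, genome_bases):
    """Density of table1 inside table2 divided by its genome-wide density."""
    density_in_table2 = both_bases / table2_bases
    density_genome_wide = table1_bases / genome_bases
    return density_in_table2 / density_genome_wide

# Hypothetical counts: table1 covers 2% of the genome but 10% of table2,
# so table1 is about five times denser inside table2.
ratio = enrichment(both_bases=1_000, table2_bases=10_000,
                   table1_bases=2_000_000, genome_bases=100_000_000)
print(round(ratio, 3))
```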

================================================================
========   fetchChromSizes   ====================================
================================================================
fetchChromSizes - script to grab chrom.sizes from UCSC via either of: mysql, wget or ftp

usage: fetchChromSizes <db> > <db>.chrom.sizes
    used to fetch chrom.sizes information from UCSC for the given <db>
<db> - name of UCSC database, e.g.: hg38, hg18, mm9, etc ...

This script expects to find one of the following commands:
    wget, mysql, or ftp in order to fetch information from UCSC.
Route the output to the file <db>.chrom.sizes as indicated above.
This data is available at the URL:
  http://hgdownload.soe.ucsc.edu/goldenPath/<db>/bigZips/<db>.chrom.sizes

Example:   fetchChromSizes hg38 > hg38.chrom.sizes
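
The resulting file can be loaded by other scripts.  A minimal Python sketch,
assuming the two-column layout (sequence name, then size in bases, tab
separated); the function name is hypothetical and the sample sizes are made up:

```python
def read_chrom_sizes(lines):
    """Return {chrom: size} from the lines of a chrom.sizes file."""
    sizes = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        name, size = line.split("\t")[:2]
        sizes[name] = int(size)
    return sizes

# Inline sample instead of a real file; the sizes are made-up values.
sample = ["chr1\t1000\n", "chrM\t16\n"]
print(read_chrom_sizes(sample))
```

With a real file, pass the open file handle instead of the sample list, for
instance the hg38.chrom.sizes file written by the example above.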
================================================================
========   findMotif   ====================================
================================================================
### kent source version 473 ###
findMotif - find specified motif in sequence
usage:
   findMotif [options] -motif=<acgt...> sequence
where:
   sequence is a .fa , .nib or .2bit file or a file which is a list of
       sequence files.
options:
   -motif=<acgt...> - search for this specified motif
                      (case ignored, [acgt] only)
   NOTE: motif must be at least 4 characters, less than 32
   -chr=<chrN> - process only this one chrN from the sequence
   -strand=<+|-> - limit to only one strand.  Default is both.
   -bedOutput - output bed format (this is the default)
   -wigOutput - output wiggle data format instead of bed file
   -misMatch=N - allow N mismatches (0 default == perfect match)
   -verbose=N - set information level [1-4]
   -verbose=4 - will display gaps as bed file data lines to stderr
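
To make the -motif, -strand and -misMatch options concrete, here is a minimal
Python sketch of a mismatch-tolerant motif scan on both strands.  It is an
illustration only, not the findMotif implementation; the function names and
the 0-based half-open coordinates in the result are assumptions made for the
example.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")

def reverse_complement(seq):
    """Reverse complement of a lower-case acgt string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_motif(seq, motif, mis_match=0):
    """Yield (start, end, strand) for each window within mis_match of motif."""
    seq, motif = seq.lower(), motif.lower()   # case is ignored
    n = len(motif)
    # A '-' strand hit is a '+' strand match to the reverse complement.
    targets = [("+", motif), ("-", reverse_complement(motif))]
    for start in range(len(seq) - n + 1):
        window = seq[start:start + n]
        for strand, target in targets:
            mismatches = sum(a != b for a, b in zip(window, target))
            if mismatches <= mis_match:
                yield (start, start + n, strand)

for hit in find_motif("ttGATTACAtttTGTAATCtt", "gattaca"):
    print(hit)
```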

 * libpopcnt.h - C/C++ library for counting the number of 1 bits (bit
 * population count) in an array as quickly as possible using
 * specialized CPU instructions i.e. POPCNT, AVX2, AVX512, NEON.
 *
 * Copyright (c) 2016 - 2020, Kim Walisch
 * Copyright (c) 2016 - 2018, Wojciech Muła
 *
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions are met:
 *
 * 1. Redistributions of source code must retain the above copyright notice,
 *    this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright notice,
 *    this list of conditions and the following disclaimer in the documentation
 *    and/or other materials provided with the distribution.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 'AS IS'
 * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
 * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
 * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
 * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
 * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 * POSSIBILITY OF SUCH DAMAGE.
================================================================
========   fixStepToBedGraph.pl   ====================================
================================================================
fixStepToBedGraph.pl - read fixedStep wiggle input data, output four column bedGraph format data
usage: fixStepToBedGraph.pl
	run in a pipeline like this:
usage: zcat fixedStepData.gz | fixStepToBedGraph.pl | gzip > bedGraph.gz
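
For readers who want to see what the conversion involves, here is a minimal
Python sketch of the same idea.  It is not a port of the Perl script: it
assumes standard fixedStep declaration lines (chrom=, start=, optional step=
and span=, both defaulting to 1), 1-based wiggle starts, and 0-based half-open
bedGraph output.  The function name is hypothetical.

```python
def fixed_step_to_bedgraph(lines):
    """Yield (chrom, start, end, value) bedGraph rows from fixedStep lines."""
    chrom, pos, step, span = None, 0, 1, 1
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("track", "browser", "#")):
            continue
        if line.startswith("fixedStep"):
            # Declaration line: key=value pairs after the keyword.
            fields = dict(f.split("=", 1) for f in line.split()[1:])
            chrom = fields["chrom"]
            pos = int(fields["start"]) - 1      # wiggle starts are 1-based
            step = int(fields.get("step", 1))
            span = int(fields.get("span", 1))
            continue
        # Data line: one value covering span bases, then advance by step.
        yield (chrom, pos, pos + span, float(line))
        pos += step

wig = ["fixedStep chrom=chr1 start=101 step=10 span=5", "1.5", "2.0"]
for row in fixed_step_to_bedgraph(wig):
    print("\t".join(str(x) for x in row))
```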
================================================================
========   fixTrackDb   ====================================
================================================================
### kent source version 473 ###
fixTrackDb - check for data accessible for everything in trackDb table
usage:
   fixTrackDb database trackDbTable
options:
  -gbdbList=list - list of files to confirm existence of bigDataUrl files

================================================================
========   gapToLift   ====================================
================================================================
### kent source version 473 ###
gapToLift - create lift file from gap table(s)
usage:
   gapToLift [options] db liftFile.lft
       uses gap table(s) from specified db.  Writes to liftFile.lft
       generates lift file segments separated by non-bridged gaps.
options:
   -chr=chrN - work only on given chrom
   -minGap=M - examine only gaps of size >= M
   -insane - do *not* perform coordinate sanity checks on gaps
   -bedFile=fileName.bed - output segments to fileName.bed
   -allowBridged - consider any type of gap not just the non-bridged gaps
   -verbose=N - N > 1 see more information about procedure
================================================================
========   gencodeVersionForGenes   ====================================
================================================================
### kent source version 473 ###
gencodeVersionForGenes - Figure out which version of a gencode gene set a set of gene 
identifiers best fits
usage:
   gencodeVersionForGenes genes.txt geneSymVer.tsv
where:
   genes.txt is a list of gene symbols or identifiers, one per line
   geneSymVer.tsv is output of gencodeGeneSymVer, usually /hive/data/inside/geneSymVerTx.tsv
options:
   -bed=output.bed - Create bed file for mapping genes to genome via best gencode fit
   -upperCase - Force genes to be upper case
   -allBed=outputDir - Output beds for all versions in geneSymVer.tsv
   -geneToId=geneToId.tsv - Output two column file with symbol from genes.txt first and
                  gencode gene name second. Symbols with no gene found are omitted
   -miss=output.txt - unassigned genes are put here, one per line
   -target=ucscDb - something like hg38 or hg19.  If set this will use most recent
                version of each gene that exists for the assembly in symbol mode

================================================================
========   genePredCheck   ====================================
================================================================
### kent source version 473 ###
genePredCheck - validate genePred files or tables
usage:
   genePredCheck [options] fileTbl ..

If fileTbl is an existing file, then it is checked.  Otherwise, if -db
is provided, then a table by this name in db is checked.

options:
   -db=db - If specified, then this database is used to
          get chromosome sizes, and perhaps the table to check.
   -chromSizes=file.chrom.sizes - use chrom sizes from tab separated
          file (name,size) instead of from chromInfo table in specified db.
================================================================
========   genePredFilter   ====================================
================================================================
### kent source version 473 ###
genePredFilter - filter a genePred file
usage:
   genePredFilter [options] genePredIn genePredOut

Filter a genePredFile, dropping invalid entries

options:
   -db=db - If specified, then this database is used to
         get chromosome sizes.
   -chromSizes=file.chrom.sizes - use chrom sizes from tab separated
         file (name,size) instead of from chromInfo table in specified db.
   -verbose=2 - level >= 2 prints out errors for each problem found.
================================================================
========   genePredHisto   ====================================
================================================================
### kent source version 473 ###
genePredHisto - get data for generating histograms from a genePred file.
usage:
   genePredHisto [options] what genePredFile histoOut

Options:
  -ids - a second column with the gene name, useful for finding outliers.

The what argument indicates the type of output. The output file is
a list of numbers suitable for input to textHistogram or similar.
The following values are currently implemented:
   exonLen- length of exons
   5utrExonLen- length of 5'UTR regions of exons
   cdsExonLen- length of CDS regions of exons
   3utrExonLen- length of 3'UTR regions of exons
   exonCnt- count of exons
   5utrExonCnt- count of exons containing 5'UTR
   cdsExonCnt- count of exons containing CDS
   3utrExonCnt- count of exons containing 3'UTR

================================================================
========   genePredSingleCover   ====================================
================================================================
### kent source version 473 ###
genePredSingleCover - create single-coverage genePred files
usage:
    genePredSingleCover [options] inGenePred outGenePred

Create a genePred file that has single CDS coverage of the genome.
UTR is allowed to overlap.  The default is to keep the gene with the
largest number of CDS bases.

Options:
  -scores=file - read scores used in selecting genes from this file.
   It consists of tab separated lines of
       name chrom txStart score
   where score is a real or integer number. Higher scoring genes will
   be chosen over lower scoring ones.  Equally scoring genes are
   chosen by number of CDS bases.  If this option is supplied, all
   genes must be in the file.


================================================================
========   genePredToBed   ====================================
================================================================
### kent source version 473 ###
genePredToBed - Convert from genePred to bed format. Does not yet handle genePredExt
usage:
   genePredToBed in.genePred out.bed
options:
   -tab - genePred fields are separated by tab instead of just white space
   -fillSpace - when tab input, fill space chars in 'name' with underscore: _
   -score=N - set score to N in bed output (default 0)
================================================================
========   genePredToBigGenePred   ====================================
================================================================
### kent source version 473 ###
genePredToBigGenePred - converts genePred or genePredExt to bigGenePred input (bed format with extra fields)
usage:
  genePredToBigGenePred [-known] [-score=scores] [-geneNames=geneNames] [-colors=colors] file.gp stdout | sort -k1,1 -k2,2n > file.bgpInput
NOTE: In order to visualize on Genome Browser, the bigGenePred file needs to be converted to a bigBed such as the following:
   wget https://genome.ucsc.edu/goldenpath/help/examples/bigGenePred.as
   bedToBigBed -type=bed12+8 -tab -as=bigGenePred.as file.bgpInput chrom.sizes output.bb
options:
    -known                input file is a genePred in knownGene format
    -score=scores         scores is two column file with id's mapping to scores
    -geneNames=geneNames  geneNames is a three column file with id's mapping to two gene names
    -colors=colors        colors is a four column file with id's mapping to r,g,b
    -cds=cds              cds is a five column file with id's mapping to cds status codes and exonFrames (see knownCds.as)
    -geneType=geneType              geneType is a two column file with id's mapping to geneType

================================================================
========   genePredToFakePsl   ====================================
================================================================
### kent source version 473 ###
genePredToFakePsl - Create a psl of fake-mRNA aligned to gene-preds from a file or table.
usage:
   genePredToFakePsl [options] db fileTbl pslOut cdsOut

If fileTbl is an existing file, then it is used.
Otherwise, the table by this name is used.

pslOut specifies the fake-mRNA output psl filename.

cdsOut specifies the output cds tab-separated file which contains
genbank-style CDS records showing cdsStart..cdsEnd
e.g. NM_123456 34..305
options:
   -chromSize=sizefile	Read chrom sizes from file instead of database
             sizefile contains two white space separated fields per line:
		chrom name and size
   -qSizes=qSizesFile	Read in query sizes to fixup qSize and qStarts


================================================================
========   genePredToGtf   ====================================
================================================================
### kent source version 473 ###
genePredToGtf - Convert genePred table or file to gtf.
usage:
   genePredToGtf database genePredTable output.gtf
If database is 'file' then genePredTable is interpreted as a file
rather than a table in database.
options:
   -utr - Add 5UTR and 3UTR features
   -honorCdsStat - use cdsStartStat/cdsEndStat when defining start/end
    codon records
   -source=src set source name to use
   -addComments - Add comments before each set of transcript records.
    allows for easier visual inspection
Note: use a refFlat table or extended genePred table or file to include
the gene_name attribute in the output.  This will not work with a refFlat
table dump file. If you are using a genePred file that starts with a numeric
bin column, drop it using the UNIX cut command:
    cut -f 2- in.gp | genePredToGtf file stdin out.gtf

================================================================
========   genePredToMafFrames   ====================================
================================================================
### kent source version 473 ###
genePredToMafFrames - create mafFrames tables from genePreds
usage:
    genePredToMafFrames [options] targetDb maf mafFrames geneDb1 genePred1 [geneDb2 genePred2...] 

Create frame annotations for one or more components of a MAF.
It is significantly faster to process multiple gene sets in the same run,
as 95% of the CPU time is spent reading the MAF.

Arguments:
  o targetDb - db of target genome
  o maf - input MAF file
  o mafFrames - output file
  o geneDb1 - db in MAF that corresponds to genePred's organism.
  o genePred1 - genePred file.  Overlapping annotations should have
    been removed.  This file may optionally include frame annotations.
Options:
  -bed=file - output a bed for each mafFrame region, useful for debugging.
  -verbose=level - enable verbose tracing, the following levels are implemented:
     3 - print information about data used to compute each record.
     4 - dump information about the gene mappings that were constructed
     5 - dump information about the gene mappings after split processing
     6 - dump information about the gene mappings after frame linking


================================================================
========   genePredToProt   ====================================
================================================================
### kent source version 473 ###
genePredToProt - create protein sequences by translating gene annotations
usage:
   genePredToProt genePredFile genomeSeqs protFa

This honors frame if genePred has frames, dropping partial codons.
genomeSeqs is a 2bit or directory of nib files.

options:
  -cdsFa=fasta - output FASTA with CDS that was used to generate protein.
                 This will not include dropped partial codons.
  -protIdSuffix=str - add this string to the end of the name for protein FASTA
  -cdsIdSuffix=str - add this string to the end of the name for CDS FASTA
  -translateSeleno - assume internal TGA code for selenocysteine and translate to `U'.
  -includeStop - If the CDS ends with a stop codon, represent it as a `*'
  -starForInframeStops - use `*' instead of `X' for in-frame stop codons.
                  This will result in selenocysteine's being `*', with only codons
                  containing `N' being translated to `X'.  This doesn't include terminal
                  stop

================================================================
========   gensub2   ====================================
================================================================
gensub2 - version 12.19
Generate condor submission file from template and two file lists.
Usage:
   gensub2 <file list 1> <file list 2> <template file> <output file>
This will substitute each file in the file lists for $(path1) and $(path2)
in the template between #LOOP and #ENDLOOP, and write the results to
the output.  Other substitution variables are:
       $(path1)  - Full path name of first file.
       $(path2)  - Full path name of second file.
       $(dir1)   - First directory. Includes trailing slash if any.
       $(dir2)   - Second directory.
       $(lastDir1) - The last directory in the first path. Includes trailing slash if any.
       $(lastDir2) - The last directory in the second path. Includes trailing slash if any.
       $(lastDirs1=<n>) - The last n directories in the first path.
       $(lastDirs2=<n>) - The last n directories in the second path.
       $(root1)  - First file name without directory or extension.
       $(root2)  - Second file name without directory or extension.
       $(ext1)   - First file extension.
       $(ext2)   - Second file extension.
       $(file1)  - Name without dir of first file.
       $(file2)  - Name without dir of second file.
       $(num1)   - Index of first file in list.
       $(num2)   - Index of second file in list.
The <file list 2> parameter can be 'single' if there is only one file list and 
'selfPair' if there is a single list, but you want all
pairs of single list with itself.  By default the order is diagonal, meaning if 
the first list is ABC and the second list is abc the combined 
order is Aa Ba Ab Ca Bb Ac Cb Bc Cc.  This tends to put the 
largest jobs first if the file lists are both sorted by size. 
The following options can change this:
    -group1 - write elements in order Aa Ab Ac Ba Bb Bc Ca Cb Cc
    -group2 - write elements in order Aa Ba Ca Ab Bb Cb Ac Bc Cc
template file syntax help for check statement: {check 'when' 'what' <file>}
 where 'when' is either 'in' or 'out'
 and 'what' is one of: 'exists' 'exists+' 'line' 'line+'
 'exists' means file exists, may be zero size
 'exists+' means file exists and is non-zero size
 'line' means file may have 0 or more lines of ascii data and is properly
        line-feed terminated
 'line+' means file is 1 or more lines of ascii data and is properly
        line-feed terminated
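
The default diagonal order and the -group1/-group2 orders described above can
be reproduced with a short Python sketch.  This is an illustration, not the
gensub2 source: it matches the ABC/abc example given in the text, but how
gensub2 treats lists of unequal length is an assumption here, and the function
names are hypothetical.

```python
def diagonal_pairs(list1, list2):
    """Pair items along diagonals where index1 + index2 is constant."""
    pairs = []
    for total in range(len(list1) + len(list2) - 1):
        # Walk each diagonal from the highest list1 index downward.
        for i in range(min(total, len(list1) - 1), -1, -1):
            j = total - i
            if j < len(list2):
                pairs.append((list1[i], list2[j]))
    return pairs

def group1_pairs(list1, list2):
    """-group1 style: all partners of the first list's item before moving on."""
    return [(a, b) for a in list1 for b in list2]

def group2_pairs(list1, list2):
    """-group2 style: all partners of the second list's item before moving on."""
    return [(a, b) for b in list2 for a in list1]

print(" ".join(a + b for a, b in diagonal_pairs("ABC", "abc")))
```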
================================================================
========   getRna   ====================================
================================================================
### kent source version 473 ###
getRna - Get mrna for GenBank or RefSeq sequences found in a database
usage:
   getRna [options] database accFile outfa

Get mrna for all accessions in accFile, writing to a fasta file. If accession
 has a version, that version is returned or an error generated

Options:
  -cdsUpper - lookup CDS and output it as upper case. If CDS annotation
    can't be obtained, the sequence is skipped with a warning.
  -cdsUpperAll - like -cdsUpper, except keep sequences without CDS
  -inclVer - include version with sequence id.
  -peptides - translate mRNAs to peptides
  -seqTbl=tbl - use this table instead of gbSeq and seq. Many other options don't work if this is used.
  -extFileTbl=tbl - use this table instead of gbExtFile and extFile


================================================================
========   getRnaPred   ====================================
================================================================
### kent source version 473 ###
getRnaPred - Get genome putative RNA for gene predictions
usage:
   getRnaPred [options] database table chromosome output.fa
table can be a table or a file.  Specify chromosome of 'all' to
process all chromosomes.

options:
   -weird - only get ones with weird splice sites
   -cdsUpper - output CDS in upper case
   -cdsOnly - only output CDS
   -cdsOut=file - write CDS to this tab-separated file, in the form
      acc  start  end
    where start..end are genbank style, one-based coordinates
   -keepMasking - un/masked in upper/lower case.
   -pslOut=psl - output PSLs for the genomic mRNAs.  Allows
    mRNA to be analyzed by tools that work on PSLs
   -suffix=suf - append suffix to each id to avoid confusion with mRNAs
    used to define the genes.
   -peptides - output the translation of the CDS to a peptide sequence.
    The newer program genePredToProt may produce better results in cases
    where there are frame-shifting indels in the CDS.
   -exonIndices - output indices of exon boundaries after sequence name,
    e.g., "103 243 290" says positions 1-103 are from the first exon,
    positions 104-243 are from the second exon, etc. 
   -maxSize=size - output a maximum of size characters.  Useful when
    testing gene predictions by RT-PCR.
   -genomeSeqs=spec - get genome sequences from the specified nib directory
    or 2bit file instead of going through the path found in chromInfo.
   -includeCoords - include the genomic coordinates as a comment in the
    fasta header.  This is necessary when there are multiple genePreds
    with the same name.
   -genePredExt - (for use with -peptides) use extended genePred format,
    and consider frame information when translating (Warning: only
    considers offset at 5' end, not frameshifts between blocks)

================================================================
========   gfClient   ====================================
================================================================
### kent source version 473 ###
gfClient v. 39x1 - A client for the genomic finding program that produces a .psl file
usage:
   gfClient host port seqDir in.fa out.psl
where
   host is the name of the machine running the gfServer
   port is the same port that you started the gfServer with
   seqDir is the path of the .2bit or .nib files relative to the current dir
       (note these are needed by the client as well as the server)
   in.fa is a fasta format file.  May contain multiple records
   out.psl is where to put the output
options:
   -t=type       Database type. Type is one of:
                   dna - DNA sequence
                   prot - protein sequence
                   dnax - DNA sequence translated in six frames to protein
                 The default is dna.
   -q=type       Query type. Type is one of:
                   dna - DNA sequence
                   rna - RNA sequence
                   prot - protein sequence
                   dnax - DNA sequence translated in six frames to protein
                   rnax - DNA sequence translated in three frames to protein
   -prot         Synonymous with -t=prot -q=prot.
   -dots=N       Output a dot every N query sequences.
   -nohead       Suppresses 5-line psl header.
   -minScore=N   Sets minimum score.  This is twice the matches minus the 
                 mismatches minus some sort of gap penalty.  Default is 30.
   -minIdentity=N   Sets minimum sequence identity (in percent).  Default is
                 90 for nucleotide searches, 25 for protein or translated
                 protein searches.
   -out=type     Controls output file format.  Type is one of:
                   psl - Default.  Tab-separated format without actual sequence
                   pslx - Tab-separated format with sequence
                   axt - blastz-associated axt format
                   maf - multiz-associated maf format
                   sim4 - similar to sim4 format
                   wublast - similar to wublast format
                   blast - similar to NCBI blast format
                   blast8 - NCBI blast tabular format
                   blast9 - NCBI blast tabular format with comments
   -maxIntron=N   Sets maximum intron size. Default is 750000.
   -genome=name  When using a dynamic gfServer, the genome name is used to 
                 find the data files relative to the dynamic gfServer root, named 
                 in the form $genome.2bit, $genome.untrans.gfidx, and $genome.trans.gfidx
   -genomeDataDir=path
                 When using a dynamic gfServer, this is the dynamic gfServer root directory
                 that contained the genome data files.  Defaults to being the root directory.
                
================================================================
========   gfPcr   ====================================
================================================================
### kent source version 473 ###
gfPcr - In silico PCR version 39x1 using gfServer index.
usage:
   gfPcr host port seqDir fPrimer rPrimer output
or
   gfPcr host port seqDir batch output
Where:
   host is the name of the machine running the gfServer
   port is the gfServer port (usually 17779)
   seqDir is where the nib or 2bit files for the genome database are
   fPrimer is the forward strand primer
   rPrimer is the reverse strand primer
   output is the output file.  Use 'stdout' for output to standard output
   batch is a space or tab delimited file with the following fields on each line
       name/fPrimer/rPrimer/maxProductSize
options:
   -maxSize=N - Maximum size of PCR product (default 4000)
   -minPerfect=N - Minimum size of perfect match at 3' end of primer (default 15)
   -minGood=N - Minimum size where there must be 2 matches for each mismatch (default 18)
   -out=XXX - Output format.  Either
      fa - fasta with position, primers in header (default)
      bed - tab delimited format. Fields: chrom/start/end/name/score/strand
      psl - blat format.
   -name=XXX - Name to use in bed output.
   -genome=name  When using a dynamic gfServer, the genome name is used to 
                 find the data files relative to the dynamic gfServer root, named 
                 in the form $genome.2bit, and $genome.untrans.gfidx.
   -genomeDataDir=path
                 When using a dynamic gfServer, this is the dynamic gfServer root directory
                 that contained the genome data files.  Defaults to being the root directory.
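
The batch file layout described above (name, fPrimer, rPrimer, maxProductSize
on each line, space or tab delimited) can be checked before a run with a short
Python sketch.  This is only an illustration of the stated field order; the
function name, the dictionary keys and the primer sequences are made up for
the example.

```python
def parse_pcr_batch(lines):
    """Parse lines of: name fPrimer rPrimer maxProductSize."""
    records = []
    for line in lines:
        fields = line.split()        # splits on spaces or tabs
        if not fields:
            continue                 # skip blank lines
        name, f_primer, r_primer, max_size = fields[:4]
        records.append({
            "name": name,
            "fPrimer": f_primer,
            "rPrimer": r_primer,
            "maxProductSize": int(max_size),
        })
    return records

# Made-up primer pairs, one space delimited and one tab delimited.
batch = ["pair1 ACGTACGTACGTACGTAC TTGCATGCATGCATGCAA 4000",
         "pair2\tGGCCAATTGGCCAATTGG\tCCGGTTAACCGGTTAACC\t2500"]
for record in parse_pcr_batch(batch):
    print(record["name"], record["maxProductSize"])
```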
                

================================================================
========   gfServer   ====================================
================================================================
### kent source version 473 ###
gfServer v 39x1 - Make a server to quickly find where DNA occurs in genome (32-bit index)
   To set up a server:
      gfServer start host port file(s)
      where the files are .2bit or .nib format files specified relative to the current directory
   To remove a server:
      gfServer stop host port
   To query a server with DNA sequence:
      gfServer query host port probe.fa
   To query a server with protein sequence:
      gfServer protQuery host port probe.fa
   To query a server with translated DNA sequence:
      gfServer transQuery host port probe.fa
   To query server with PCR primers:
      gfServer pcr host port fPrimer rPrimer maxDistance
   To process one probe fa file against a .2bit format genome (not starting server):
      gfServer direct probe.fa file(s).2bit
   To test PCR without starting server:
      gfServer pcrDirect fPrimer rPrimer file(s).2bit
   To figure out if server is alive, on static instances get usage statistics as well:
      gfServer status host port
     For dynamic gfServer instances, specify -genome and optionally the -genomeDataDir
     to get information on an untranslated genome index. Include -trans to get
     information about a translated genome index.
   To get input file list:
      gfServer files host port
   To generate a precomputed index:
      gfServer index gfidx file(s)
     where the files are .2bit or .nib format files.  Separate indexes are
     created for untranslated and translated queries.  These can be used
     with a persistent server as with 'start -indexFile' or a dynamic server.
     They must follow the naming convention for dynamic servers.
   To run a dynamic server (usually called by xinetd):
      gfServer dynserver rootdir
     Data files for genomes are found relative to the root directory.
     Queries are made using the prefix of the file path relative to the root
     directory.  The files $genome.2bit, $genome.untrans.gfidx, and
     $genome.trans.gfidx are required. Typically the structure will be in
     the form:
         $rootdir/$genomeDataDir/$genome.2bit
         $rootdir/$genomeDataDir/$genome.untrans.gfidx
         $rootdir/$genomeDataDir/$genome.trans.gfidx
     in this case, one would call gfClient with 
         -genome=$genome -genomeDataDir=$genomeDataDir
     Often $genomeDataDir will be the same name as $genome, however it
     can be a multi-level path. For instance:
          GCA/902/686/455/GCA_902686455.1_mSciVul1.1/
     The translated or untranslated index may be omitted if there is no
     need to handle that type of request.
     The -perSeqMax functionality can be implemented by creating a file
         $rootdir/$genomeDataDir/$genome.perseqmax
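
     The file layout above can be spelled out with a short Python sketch that
     builds the expected paths for one genome.  It only joins the names given
     in the text; the function name, the dictionary keys and the example root
     directory /gfRoot are assumptions made for illustration.

```python
from pathlib import PurePosixPath

def dynamic_server_files(root_dir, genome_data_dir, genome):
    """Paths a dynamic gfServer would look for, following the layout above."""
    base = PurePosixPath(root_dir) / genome_data_dir
    return {
        "twoBit": base / f"{genome}.2bit",
        "untrans": base / f"{genome}.untrans.gfidx",
        "trans": base / f"{genome}.trans.gfidx",
        "perSeqMax": base / f"{genome}.perseqmax",   # optional file
    }

# Hypothetical root directory; genomeDataDir equal to the genome name.
for kind, path in dynamic_server_files("/gfRoot", "hg38", "hg38").items():
    print(kind, path)
```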

options:
   -tileSize=N     Size of n-mers to index.  Default is 11 for nucleotides, 4 for
                   proteins (or translated nucleotides).
   -stepSize=N     Spacing between tiles. Default is tileSize.
   -minMatch=N     Number of n-mer matches that trigger detailed alignment.
                   Default is 2 for nucleotides, 3 for proteins.
   -maxGap=N       Number of insertions or deletions allowed between n-mers.
                   Default is 2 for nucleotides, 0 for proteins.
   -trans          Translate database to protein in 6 frames.  Note: it is best
                   to run this on RepeatMasked data in this case.
   -log=logFile    Keep a log file that records server requests.
   -seqLog         Include sequences in log file (not logged with -syslog).
   -ipLog          Include user's IP in log file (not logged with -syslog).
   -debugLog       Include debugging info in log file.
   -syslog         Log to syslog.
   -logFacility=facility  Log to the specified syslog facility - default local0.
   -mask           Use masking from .2bit file.
   -repMatch=N     Number of occurrences of a tile (n-mer) that triggers repeat masking the
                   tile. Default is 1024.
   -noSimpRepMask  Suppresses simple repeat masking.
   -maxDnaHits=N   Maximum number of hits for a DNA query that are sent from the server.
                   Default is 100.
   -maxTransHits=N Maximum number of hits for a translated query that are sent from the server.
                   Default is 200.
   -maxNtSize=N    Maximum size of untranslated DNA query sequence.
                   Default is 40000.
   -maxAaSize=N    Maximum size of protein or translated DNA queries.
                   Default is 8000.
   -perSeqMax=file File contains one seq filename (possibly with ':seq' suffix) per line.
                   -maxDnaHits will be applied to each filename[:seq] separately: each may
                   have at most maxDnaHits/2 hits.  The filename MUST not include the directory.
                   Useful for assemblies with many alternate/patch sequences.
   -canStop        If set, a quit message will actually take down the server.
   -indexFile      Index file created by `gfServer index'. Saving index can speed up
                   gfServer startup by two orders of magnitude.  The parameters must
                   exactly match the parameters when the file is written or bad things
                   will happen.
   -timeout=N      Timeout in seconds.
                   Default is 90.

================================================================
========   gfServerHuge   ====================================
================================================================
### kent source version 473 ###
gfServer v 39x1 - Make a server to quickly find where DNA occurs in genome (64-bit index)
   To set up a server:
      gfServer start host port file(s)
      where the files are .2bit or .nib format files specified relative to the current directory
   To remove a server:
      gfServer stop host port
   To query a server with DNA sequence:
      gfServer query host port probe.fa
   To query a server with protein sequence:
      gfServer protQuery host port probe.fa
   To query a server with translated DNA sequence:
      gfServer transQuery host port probe.fa
   To query server with PCR primers:
      gfServer pcr host port fPrimer rPrimer maxDistance
   To process one probe fa file against a .2bit format genome (not starting server):
      gfServer direct probe.fa file(s).2bit
   To test PCR without starting server:
      gfServer pcrDirect fPrimer rPrimer file(s).2bit
   To figure out if server is alive, on static instances get usage statistics as well:
      gfServer status host port
     For dynamic gfServer instances, specify -genome and optionally the -genomeDataDir
     to get information on an untranslated genome index. Include -trans to get
     information about a translated genome index.
   To get input file list:
      gfServer files host port
   To generate a precomputed index:
      gfServer index gfidx file(s)
     where the files are .2bit or .nib format files.  Separate indexes are
     created for untranslated and translated queries.  These can be used
     with a persistent server via 'start -indexFile' or with a dynamic server.
     They must follow the naming convention for dynamic servers.
   To run a dynamic server (usually called by xinetd):
      gfServer dynserver rootdir
     Data files for genomes are found relative to the root directory.
     Queries are made using the prefix of the file path relative to the root
     directory.  The files $genome.2bit, $genome.untrans.gfidx, and
     $genome.trans.gfidx are required. Typically the structure will be in
     the form:
         $rootdir/$genomeDataDir/$genome.2bit
         $rootdir/$genomeDataDir/$genome.untrans.gfidx
         $rootdir/$genomeDataDir/$genome.trans.gfidx
     in this case, one would call gfClient with 
         -genome=$genome -genomeDataDir=$genomeDataDir
     Often $genomeDataDir will be the same name as $genome, however it
     can be a multi-level path. For instance:
          GCA/902/686/455/GCA_902686455.1_mSciVul1.1/
     The translated or untranslated index may be omitted if there is no
     need to handle that type of request.
     The -perSeqMax functionality can be implemented by creating a file
         $rootdir/$genomeDataDir/$genome.perseqmax
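
A minimal Python sketch of the dynamic server layout described above. It only
assembles the expected file paths from the naming convention; the root
directory and the genome name used in the example are assumptions for
illustration, and nothing here talks to gfServer itself.

```python
from pathlib import PurePosixPath


def dynamic_genome_files(rootdir, genome_data_dir, genome):
    """Return the file paths the dynamic-server naming convention implies."""
    base = PurePosixPath(rootdir) / genome_data_dir
    return {
        "2bit": str(base / f"{genome}.2bit"),
        "untrans": str(base / f"{genome}.untrans.gfidx"),
        "trans": str(base / f"{genome}.trans.gfidx"),
        "perseqmax": str(base / f"{genome}.perseqmax"),
    }


files = dynamic_genome_files(
    "/data/blat",  # hypothetical root directory
    "GCA/902/686/455/GCA_902686455.1_mSciVul1.1",  # multi-level path from the text
    "GCA_902686455.1_mSciVul1.1",  # assumed genome name for this example
)
for kind, path in files.items():
    print(kind, path)
```

The genome and genome_data_dir arguments correspond to the -genome and
-genomeDataDir values passed to gfClient.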

options:
   -tileSize=N     Size of n-mers to index.  Default is 11 for nucleotides, 4 for
                   proteins (or translated nucleotides).
   -stepSize=N     Spacing between tiles. Default is tileSize.
   -minMatch=N     Number of n-mer matches that trigger detailed alignment.
                   Default is 2 for nucleotides, 3 for proteins.
   -maxGap=N       Number of insertions or deletions allowed between n-mers.
                   Default is 2 for nucleotides, 0 for proteins.
   -trans          Translate database to protein in 6 frames.  Note: it is best
                   to run this on RepeatMasked data in this case.
   -log=logFile    Keep a log file that records server requests.
   -seqLog         Include sequences in log file (not logged with -syslog).
   -ipLog          Include user's IP in log file (not logged with -syslog).
   -debugLog       Include debugging info in log file.
   -syslog         Log to syslog.
   -logFacility=facility  Log to the specified syslog facility - default local0.
   -mask           Use masking from .2bit file.
   -repMatch=N     Number of occurrences of a tile (n-mer) that triggers repeat masking the
                   tile. Default is 1024.
   -noSimpRepMask  Suppresses simple repeat masking.
   -maxDnaHits=N   Maximum number of hits for a DNA query that are sent from the server.
                   Default is 100.
   -maxTransHits=N Maximum number of hits for a translated query that are sent from the server.
                   Default is 200.
   -maxNtSize=N    Maximum size of untranslated DNA query sequence.
                   Default is 40000.
   -maxAaSize=N    Maximum size of protein or translated DNA queries.
                   Default is 8000.
   -perSeqMax=file File contains one seq filename (possibly with ':seq' suffix) per line.
                   -maxDnaHits will be applied to each filename[:seq] separately: each may
                   have at most maxDnaHits/2 hits.  The filename MUST not include the directory.
                   Useful for assemblies with many alternate/patch sequences.
   -canStop        If set, a quit message will actually take down the server.
   -indexFile      Index file created by `gfServer index'. Saving index can speed up
                   gfServer startup by two orders of magnitude.  The parameters must
                   exactly match the parameters when the file is written or bad things
                   will happen.
   -timeout=N      Timeout in seconds.
                   Default is 90.

================================================================
========   gff3ToGenePred   ====================================
================================================================
### kent source version 473 ###
gff3ToGenePred - convert a GFF3 file to a genePred file
usage:
   gff3ToGenePred inGff3 outGp
options:
  -warnAndContinue - on bad genePreds being created, put out warning but continue
  -useName - use the 'name' tag as the name, if present
  -rnaNameAttr=attr - If this attribute exists on an RNA record, use it as the genePred
   name column
  -geneNameAttr=attr - If this attribute exists on a gene record, use it as the genePred
   name2 column
  -attrsOut=file - output attributes of mRNA record to file.  These are per-genePred row,
   not per-GFF3 record. They are derived from GFF3 attributes, not the attributes themselves.
  -processAllGeneChildren - output genePred for all children of a gene regardless of feature
  -unprocessedRootsOut=file - output GFF3 root records that were not used.  This will not be a
   valid GFF3 file.  It's expected that many non-root records will not be used and they are not
   reported.
  -bad=file   - output genepreds that fail checks to file
  -maxParseErrors=50 - Maximum number of parsing errors before aborting. A negative
   value will allow an unlimited number of errors.  Default is 50.
  -maxConvertErrors=50 - Maximum number of conversion errors before aborting. A negative
   value will allow an unlimited number of errors.  Default is 50.
  -honorStartStopCodons - only set CDS start/stop status to complete if there are
   corresponding start_stop codon records
  -defaultCdsStatusToUnknown - default the CDS status to unknown rather
   than complete.
  -allowMinimalGenes - normally this program assumes that genes contain
   transcripts, which contain exons.  If this option is specified, genes with exons
   as direct children and stand-alone genes with no exon or transcript
   children will be converted.
  -refseqHacks - enable various hacks to make RefSeq conversion work:
     This turns on -useName, -allowMinimalGenes, and -processAllGeneChildren.
     It tries harder to find an accession in attributes.

This converts:
   - top-level gene records with RNA records
   - top-level RNA records
   - RNA records that contain:
       - exon and CDS
       - CDS, five_prime_UTR, three_prime_UTR
       - only exon for non-coding
   - top-level gene records with transcript records
   - top-level transcript records
   - transcript records that contain:
       - exon
where RNA can be mRNA, ncRNA, or rRNA, and transcript can be either
transcript or primary_transcript
The first step is to parse the GFF3 file; up to 50 errors are reported before
aborting.  If the GFF3 file is successfully parsed, it is converted to gene
annotations.  Up to 50 conversion errors are reported before aborting.

Input file must conform to the GFF3 specification:
   http://www.sequenceontology.org/gff3.shtml

================================================================
========   gff3ToPsl   ====================================
================================================================
### kent source version 473 ###
gff3ToPsl - convert a GFF3 CIGAR file to a PSL file
usage:
   gff3ToPsl [options] queryChromSizes targetChromSizes inGff3 out.psl
arguments:
   queryChromSizes  file with query (main coordinates) chromosome sizes.
               File formatted:  chromName<tab>chromSize
   targetChromSizes file with target (Target attribute) chromosome sizes.
   inGff3     GFF3 formatted file with Gap attribute in match records
   out.psl    PSL formatted output
options:
   -dropQ     drop record when query not found in queryChromSizes
   -dropT     drop record when target not found in targetChromSizes
The first step is to parse the GFF3 file; up to 50 errors are reported before
aborting.  If the GFF3 file is successfully parsed, it is converted to PSL.

Input file must conform to the GFF3 specification:
   http://www.sequenceontology.org/gff3.shtml
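
The chromosome sizes files named in the arguments use one
chromName<tab>chromSize pair per line. A small Python sketch of reading that
format follows; the function name and the sample rows are illustrative, and
the sketch makes no claim about how gff3ToPsl itself parses the files.

```python
def parse_chrom_sizes(text):
    """Parse 'chromName<tab>chromSize' lines into a dict of name -> size."""
    sizes = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        name, size = line.split("\t")[:2]
        sizes[name] = int(size)
    return sizes


example = "chr1\t248956422\nchrM\t16569\n"  # sample rows for illustration
print(parse_chrom_sizes(example))
```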

================================================================
========   gmtime   ====================================
================================================================
gmtime - convert unix timestamp to date string
usage: gmtime <time stamp>
	<time stamp> - integer 0 to 2147483647
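
For comparison, the same conversion can be sketched in Python. The output
format below is an assumption of this sketch and is not necessarily the
format that gmtime prints.

```python
from datetime import datetime, timezone


def gmtime_string(timestamp):
    """Format a Unix timestamp (seconds since the epoch) as a UTC date string."""
    if not 0 <= timestamp <= 2147483647:
        raise ValueError("timestamp must be an integer from 0 to 2147483647")
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


print(gmtime_string(0))           # 1970-01-01 00:00:00 UTC
print(gmtime_string(2147483647))  # 2038-01-19 03:14:07 UTC
```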
================================================================
========   gtfToGenePred   ====================================
================================================================
### kent source version 473 ###
gtfToGenePred - convert a GTF file to a genePred
usage:
   gtfToGenePred gtf genePred

options:
     -genePredExt - create an extended genePred, including frame
      information and gene name
     -allErrors - skip groups with errors rather than aborting.
      Useful for getting information about as many errors as possible.
     -ignoreGroupsWithoutExons - skip groups that contain no exons rather than
      generating an error.
     -infoOut=file - write a file with information on each transcript
     -sourcePrefix=pre - only process entries where the source name has the
      specified prefix.  May be repeated.
     -impliedStopAfterCds - implied stop codon is after CDS
     -simple    - just check column validity, not hierarchy, resulting genePred may be damaged
     -geneNameAsName2 - if specified, use gene_name for the name2 field
      instead of gene_id.
     -includeVersion - if gene_version and/or transcript_version attributes exist, include the version
      in the corresponding identifiers.

================================================================
========   headRest   ====================================
================================================================
### kent source version 473 ###
headRest - Return all *but* the first N lines of a file.
usage:
   headRest count fileName
You can use stdin for fileName
options:
   -xxx=XXX
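
The behavior described above (drop the first N lines, keep the rest) can be
sketched in Python as follows. This is an illustration of the idea, not the
program's implementation.

```python
from itertools import islice


def head_rest(lines, count):
    """Return an iterator over every line except the first `count` lines."""
    return islice(lines, count, None)


print(list(head_rest(["a", "b", "c", "d"], 2)))  # ['c', 'd']
```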

================================================================
========   hgBbiDbLink   ====================================
================================================================
### kent source version 473 ###
hgBbiDbLink - Add table that just contains a pointer to a bbiFile to database.  This program 
is used to add bigWigs and bigBeds.
usage:
   hgBbiDbLink database trackName fileName

================================================================
========   hgFakeAgp   ====================================
================================================================
### kent source version 473 ###
hgFakeAgp - Create fake AGP file by looking at N's
usage:
   hgFakeAgp input.fa output.agp
options:
   -minContigGap=N Minimum size for a gap between contigs.  Default 25
   -minScaffoldGap=N Min size for a gap between scaffolds. Default 50000
   -singleContigs - when a full sequence has no gaps, maintain contig
	name without adding index extension.
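
A rough Python sketch of the idea behind the two gap thresholds: find runs of
N's and keep those at least minContigGap long. Labeling a run 'contig' or
'scaffold' by comparing it with minScaffoldGap is an assumption of this
sketch, as are the function name and the 0-based, half-open coordinates; it
does not write AGP output.

```python
import re


def find_gaps(seq, min_contig_gap=25, min_scaffold_gap=50000):
    """Return (start, end, kind) for each run of N's at least min_contig_gap long."""
    gaps = []
    for match in re.finditer(r"[Nn]+", seq):
        size = match.end() - match.start()
        if size < min_contig_gap:
            continue  # too short to count as a gap
        # Assumed labeling: long runs separate scaffolds, shorter ones contigs.
        kind = "scaffold" if size >= min_scaffold_gap else "contig"
        gaps.append((match.start(), match.end(), kind))
    return gaps


seq = "ACGT" + "N" * 30 + "ACGT" + "N" * 5 + "AC"
print(find_gaps(seq))  # [(4, 34, 'contig')]
```

The second run of five N's is below the default minContigGap of 25, so it is
not reported as a gap.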

================================================================
========   hgFindSpec   ====================================
================================================================
### kent source version 473 ###
hgFindSpec - Create hgFindSpec table from trackDb.ra files.

usage:
   hgFindSpec [options] orgDir database hgFindSpec hgFindSpec.sql hgRoot

Options:
  -strict		Add spec to hgFindSpec only if its table(s) exist.
  -raName=trackDb.ra - Specify a file name to use other than trackDb.ra
   for the ra files.
  -release=alpha|beta|public - Include trackDb entries with this release tag only.

================================================================
========   hgGcPercent   ====================================
================================================================
### kent source version 473 ###
hgGcPercent - Calculate GC Percentage in 20kb windows
usage:
   hgGcPercent [options] database nibDir
     nibDir can be a .2bit file, a directory that contains a
     database.2bit file, or a directory that contains *.nib files.
     Loads gcPercent table with counts from sequence.
options:
   -win=<size> - change windows size (default 20000)
   -noLoad - do not load mysql table - create bed file
   -file=<filename> - output to <filename> (stdout OK) (implies -noLoad)
   -chr=<chrN> - process only chrN from the nibDir
   -noRandom - ignore random chromosomes from the nibDir
   -noDots - do not display ... progress during processing
   -doGaps - process gaps correctly (default: gaps are not counted as GC)
   -wigOut - output wiggle ascii data ready to pipe to wigEncode
   -overlap=N - overlap windows by N bases (default 0)
   -verbose=N - display details to stderr during processing
   -bedRegionIn=input.bed   Read in a bed file for GC content in specific regions and write to bedRegionOut
   -bedRegionOut=output.bed Write a bed file of GC content in specific regions from bedRegionIn

example:
  calculate GC percent in 5 base windows using a 2bit assembly (dp2):
    hgGcPercent -wigOut -doGaps -win=5 -file=stdout -verbose=0 \
      dp2 /cluster/data/dp2 \
    | wigEncode stdin gc5Base.wig gc5Base.wib
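
To make the windowing concrete, here is a minimal Python sketch of a
GC-percentage calculation over fixed-size windows. It assumes the percentage
is (G + C) divided by the window length, so N's add to the length but not to
the GC count; the function name and coordinate convention (0-based,
half-open) are also assumptions, and the sketch does not reproduce the
-doGaps or -overlap behavior.

```python
def gc_percent_windows(seq, win=20000):
    """Return (start, end, gc_percent) for consecutive windows of `win` bases."""
    results = []
    for start in range(0, len(seq), win):
        window = seq[start:start + win].upper()
        gc = window.count("G") + window.count("C")
        results.append((start, start + len(window), 100.0 * gc / len(window)))
    return results


print(gc_percent_windows("GGCCAATTGC", win=5))
```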
================================================================
========   hgGoldGapGl   ====================================
================================================================
hgGoldGapGl - Put chromosome .agp and .gl files into browser database.
usage:
   hgGoldGapGl database gsDir ooSubDir
	(this usage creates split gold and gap tables)
or
   hgGoldGapGl database agpFile
	(this usage creates single gold and gap tables)
options:
   -noGl  - don't do gl bits
   -chrom=chrN - just do a single chromosome.  Don't delete old tables.
   -chromLst=chrom.lst - chromosomes subdirs are named in chrom.lst (1, 2, ...)
   -noLoad - do not load tables, leave SQL files instead.
   -verbose n - n==2 brief information and SQL table create statements
              - n==3 show all gaps
example:
   hgGoldGapGl -noGl hg16 /cluster/data/hg16 .

================================================================
========   hgLoadBed   ====================================
================================================================
### kent source version 473 ###
hgLoadBed - Load a generic bed file into database
usage:
   hgLoadBed database track file(s).bed
options:
   -noSort  don't sort (you better be sorting before this)
   -noBin   suppress bin field
   -oldTable add to existing table
   -onServer This will speed things up if you're running in a directory that
             the mysql server can access.
   -sqlTable=table.sql Create table from .sql file
   -renameSqlTable Rename table created with -sqlTable to match track
   -trimSqlTable   If sqlTable has n fields, and input has m fields, only load m fields, meaning the last n-m fields in the sqlTable are optional
   -type=bedN[+[P]] : 
                      N is between 3 and 15, 
                      optional (+) if extra "bedPlus" fields, 
                      optional P specifies the number of extra fields. Not required, but preferred.
                      Examples: -type=bed6 or -type=bed6+ or -type=bed6+3 
                      (see http://genome.ucsc.edu/FAQ/FAQformat.html#format1)
                      Recommended to use with -as option for better bedPlus validation.
   -as=fields.as   If you have extra "bedPlus" fields, it's great to put a definition
                     of each field in a row in AutoSql format here.
   -chromInfo=file.txt    Specify chromInfo file to validate chrom names and sizes.
   -tab       Separate by tabs rather than space
   -hasBin    Input bed file starts with a bin field.
   -noLoad     - Do not load database and do not clean up tab files
   -noHistory  - Do not add history table comments (for custom tracks)
   -notItemRgb - Do not parse column nine as r,g,b when commas seen (bacEnds)
   -bedGraph=N - wiggle graph column N of the input file as float dataValue
               - bedGraph N is typically 4: -bedGraph=4
   -bedDetail  - bedDetail format with id and text for hgc clicks
               - requires tab and sqlTable options
   -maxChromNameLength=N  - specify max chromName length to avoid
               - reference to chromInfo table
   -tmpDir=<path>  - path to directory for creation of temporary .tab file
                   - which will be removed after loading
   -noNameIx  - no index for the name column (default creates index)
   -ignoreEmpty  - no error on empty input file
   -noStrict  - don't perform coord sanity checks
              - by default we abort when: chromStart >= chromEnd
   -allowStartEqualEnd  - even when doing strict checks, allow
                          chromStart==chromEnd (zero-length e.g. insertion)
   -allowNegativeScores  - sql definition of score column is int, not unsigned
   -customTrackLoader  - turns on: -noNameIx, -noHistory, -ignoreEmpty,
                         -allowStartEqualEnd, -allowNegativeScores, -verbose=0
                         Plus, this turns on a 20 minute time-out exit.
   -fillInScore=colName - if every score value is zero, then use column 'colName' to fill in the score column (from minScore-1000)
   -minScore=N - minimum value for score field for -fillInScore option (default 100)
   -verbose=N - verbose level for extra information to STDERR
   -dotIsNull=N - if the specified field is a '.' then replace it with -1
   -lineLimit=N - limit input file to this number of lines
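
Two of the rules above lend themselves to a short illustration: the
-type=bedN[+[P]] syntax and the default coordinate check that aborts when
chromStart >= chromEnd. The Python below is a sketch with hypothetical
function names; it mirrors the documented rules only and is not taken from
the hgLoadBed source.

```python
import re

BED_TYPE = re.compile(r"^bed(\d+)(\+(\d+)?)?$")


def parse_bed_type(value):
    """Split 'bedN[+[P]]' into (N, has_plus, P); P is None when not given."""
    match = BED_TYPE.match(value)
    if not match:
        raise ValueError(f"not a bedN[+[P]] type: {value!r}")
    n = int(match.group(1))
    if not 3 <= n <= 15:
        raise ValueError("N must be between 3 and 15")
    extra = int(match.group(3)) if match.group(3) else None
    return n, match.group(2) is not None, extra


def coords_ok(chrom_start, chrom_end, allow_start_equal_end=False):
    """Apply the documented strict check, optionally allowing zero-length items."""
    if allow_start_equal_end and chrom_start == chrom_end:
        return True
    return chrom_start < chrom_end


for value in ("bed6", "bed6+", "bed6+3"):
    print(value, parse_bed_type(value))
```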

================================================================
========   hgLoadChain   ====================================
================================================================
### kent source version 473 ###
hgLoadChain - Load a generic Chain file into database
usage:
   hgLoadChain database chrN_track chrN.chain
options:
   -tIndex  Include tName in indexes (for non-split chain tables)
   -noBin   suppress bin field, default: bin field is added
   -noSort  Don't sort by target (memory-intensive) -- input *must* be
            sorted by target already if this option is used.
   -oldTable add to existing table, default: create new table
   -sqlTable=table.sql Create table from .sql file
   -normScore add normalized score column to table, default: not added
   -qPrefix=xxx   prepend "xxx" to query name
   -test    suppress loading to database

================================================================
========   hgLoadGap   ====================================
================================================================
### kent source version 473 ###
hgLoadGap - Load gap table from AGP-style file containing only gaps
usage:
   hgLoadGap database dir
options:
   -chrom=chrN - just do a single chromosome.  Don't delete old tables.
   -unsplit    - Instead of making chr*_gap tables from .gap files found 
                 in dir, expect dir to be a .gap file and make an 
                 unsplit gap table from its contents.
   -noLoad     - Don't load the database table (for testing).
example:
   hgLoadGap fr1 /cluster/data/fr1
      Gap file must be named with .gap extension 
      This is only needed if gap table needs to be rebuilt,
      without changing the gold table.

================================================================
========   hgLoadMaf   ====================================
================================================================
### kent source version 473 ###
hgLoadMaf - Load a maf file index into the database
usage:
   hgLoadMaf database table
options:
   -warn            warn instead of error upon empty/incomplete alignments
   -WARN            warn instead of error, with detail for the warning
   -test=infile     use infile as input, and suppress loading
                    the database. Just create .tab file in current dir.
   -pathPrefix=dir  load files from specified directory 
                    (default /gbdb/database/table).
   -tmpDir=<path>   path to directory for creation of temporary .tab file
                    which will be removed after loading
   -loadFile=file   use file as input
   -maxNameLen=N    specify max chromosome name length to avoid
                    reference to chromInfo table
   -defPos=file     file to put default position in
                    default position is first block
   -custom          loading a custom track, don't use history
                    or extFile tables

NOTE: The maf files need to be in chromosome coordinates,
      the reference species must be the first component, and 
      the blocks must be correctly ordered and be on the
      '+' strand

================================================================
========   hgLoadMafSummary   ====================================
================================================================
### kent source version 473 ###
hgLoadMafSummary - Load a summary table of pairs in a maf into a database
usage:
   hgLoadMafSummary database table file.maf
options:
   -mergeGap=N   max size of gap to merge regions (default 500)
   -minSize=N         merge blocks smaller than N (default 10000)
   -maxSize=N         break up blocks larger than N (default 50000)
   -minSeqSize=N skip alignments when reference sequence is less than N
                 (default 1000000 -- match with hgTracks min window size for
                 using summary table)
   -test         suppress loading the database. Just create .tab file(s)
                     in current dir.

================================================================
========   hgLoadNet   ====================================
================================================================
### kent source version 473 ###
hgLoadNet - Load a generic net file into database
usage:
   hgLoadNet database track file(s).net
options:
   -noBin   suppress bin field
   -oldTable add to existing table
   -sqlTable=table.sql Create table from .sql file
   -qPrefix=xxx prepend "xxx-" to query name
   -warn load even with missing fields
   -test suppress loading table

================================================================
========   hgLoadOut   ====================================
================================================================
### kent source version 473 ###
hgLoadOut - load RepeatMasker .out files into database
usage:
   hgLoadOut database file(s).out
For multiple files chrN.out this will create the single table 'rmsk'
in the database.  Use the -split argument to obtain separate chrN_rmsk tables.
options:
   -tabFile=text.tab - don't actually load database, just create tab file
   -split - load chrN_rmsk separate tables even if a single file is given
   -table=name - use a different suffix other than the default (rmsk)
note: the input file.out can also be a compressed file.out.gz file,
      or a URL to a file.out or file.out.gz
================================================================
========   hgLoadOutJoined   ====================================
================================================================
### kent source version 473 ###
hgLoadOutJoined - load new style (2014) RepeatMasker .out files into database
usage:
   hgLoadOutJoined database file(s).out
For multiple files chrN.out this will create the single table 'rmskOutBaseline'
in the database.
options:
   -tabFile=text.tab - don't actually load database, just create tab file
   -table=name - use a different suffix other than the default (rmskOutBaseline)
================================================================
========   hgLoadSqlTab   ====================================
================================================================
### kent source version 473 ###
hgLoadSqlTab - Load table into database from SQL and text files.
usage:
   hgLoadSqlTab database table file.sql file(s).tab
file.sql contains a SQL create statement for table
file.tab contains tab-separated text (rows of table)
The actual table name will come from the command line, not the sql file.
options:
  -warn - warn instead of abort on mysql errors or warnings
  -notOnServer - file is *not* in a directory that the mysql server can see
  -oldTable|-append - add to existing table

To load bed 3+ sorted tab files as hgLoadBed would do automatically,
sort the input file first:
  sort -k1,1 -k2,2n file(s).tab | hgLoadSqlTab database table file.sql stdin
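
The same ordering (column 1 as text, then column 2 as a number) can be
sketched in Python. Note that Python compares strings by code point, which
matches sort only under a byte-order locale such as LC_ALL=C; the function
name and sample rows are illustrative.

```python
def sort_bed_rows(lines):
    """Sort tab-separated rows by column 1 (text), then column 2 (numeric)."""
    def key(line):
        fields = line.rstrip("\n").split("\t")
        return fields[0], int(fields[1])
    return sorted(lines, key=key)


rows = ["chr2\t100\t200\n", "chr1\t500\t600\n", "chr1\t50\t60\n"]
for row in sort_bed_rows(rows):
    print(row, end="")
```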

================================================================
========   hgLoadWiggle   ====================================
================================================================
### kent source version 473 ###
hgLoadWiggle - Load a wiggle track definition into database
usage:
   hgLoadWiggle [options] database track file(s).wig
options:
   -noBin	suppress bin field
   -noLoad	do not load table, only create .tab file
   -noHistory	do not add history table comments (for custom tracks)
   -oldTable	add to existing table
   -tab		Separate by tabs rather than space
   -pathPrefix=<path>	.wib file path prefix to use (default /gbdb/<DB>/wib)
   -chromInfoDb=<DB>	database to extract chromInfo size information
   -maxChromNameLength=N  - specify max chromName length to avoid
               - reference to chromInfo table
   -tmpDir=<path>  - path to directory for creation of temporary .tab file
                   - which will be removed after loading
   -verbose=N	N=2 see # of lines input and SQL create statement,
		N=3 see chrom size info, N=4 see details on chrom size info
================================================================
========   hgSpeciesRna   ====================================
================================================================
### kent source version 473 ###
hgSpeciesRna - Create fasta file with RNA from one species
usage:
   hgSpeciesRna database Genus species output.fa
options:
   -est         - If set will get ESTs rather than mRNAs
   -filter=file - only read accessions listed in file

================================================================
========   hgTrackDb   ====================================
================================================================
### kent source version 473 ###
hgTrackDb - Create trackDb table from text files.

Note that the browser supports multiple trackDb tables, usually
in the form: trackDb_YourUserName. Which particular trackDb
table the browser uses is specified in the hg.conf file found
either in your home directory file '.hg.conf' or in the web server's
cgi-bin/hg.conf configuration file with the setting: db.trackDb=trackDb
see also: src/product/ex.hg.conf discussion of this setting.
usage:
   hgTrackDb [options] org database trackDb trackDb.sql hgRoot

Options:
  org - a directory name with a hierarchy of trackDb.ra files to examine
      - in the case of a single directory with a single trackDb.ra file use .
  database - name of database to create the trackDb table in
  trackDb  - name of table to create, usually trackDb, or trackDb_${USER}
  trackDb.sql  - SQL definition of the table to create, typically from
               - the source tree file: src/hg/lib/trackDb.sql
               - the table name in the CREATE statement is replaced by the
               - table name specified on this command line.
  hgRoot - a directory name to prepend to org to locate the hierarchy:
           hgRoot/trackDb.ra - top level trackDb.ra file processed first
           hgRoot/org/trackDb.ra - second level file processed second
           hgRoot/org/database/trackDb.ra - third level file processed last
         - for no directory hierarchy use .
  -strict - only include tables that exist (and complain about missing html files).
  -raName=trackDb.ra - Specify a file name to use other than trackDb.ra
   for the ra files.
  -release=alpha|beta|public - Include trackDb entries with this release tag only.
  -settings - for trackDb scanning, output table name, type line,
            -  and settings hash to stderr while loading everything.
  -gbdbList=list - list of files to confirm existence of bigDataUrl files
  -addVersion - add cartVersion pseudo-table
  -noHtmlCheck - don't check for HTML even if strict is set
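
The three-level trackDb.ra hierarchy described under hgRoot can be written
out as a short Python sketch. It only lists the paths in the stated
processing order; the function name and the example arguments are
hypothetical.

```python
from pathlib import PurePosixPath


def trackdb_ra_paths(hg_root, org, database, ra_name="trackDb.ra"):
    """List the .ra files in the order described: top, org, then database level."""
    root = PurePosixPath(hg_root)
    return [
        str(root / ra_name),
        str(root / org / ra_name),
        str(root / org / database / ra_name),
    ]


# Hypothetical hgRoot, org and database values for illustration.
for path in trackdb_ra_paths("/hgRoot", "human", "hg38"):
    print(path)
```

Passing a different ra_name corresponds to the -raName option.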

================================================================
========   hgWiggle   ====================================
================================================================
### kent source version 473 ###
#	no database specified, using .wig files
#	doAscii option on, perform the default ascii output
hgWiggle - fetch wiggle data from database or file
usage:
   hgWiggle [options] <track names ...>
options:
   -db=<database> - use specified database
   -chr=chrN - examine data only on chrN
   -chrom=chrN - same as -chr option above
   -position=[chrN:]start-end - examine data in window start-end (1-relative)
             (the chrN: is optional)
   -chromLst=<file> - file with list of chroms to examine
   -doAscii - perform the default ascii output, in addition to other outputs
            - Any of the other -do outputs turn off the default ascii output
            - ***WARNING*** this ascii output is 0-relative offset which
            - *** is *not* the normal wiggle input format.  Use the -lift
            - *** argument -lift=1 to get 1-relative offset:
   -lift=<D> - lift ascii output positions by D (0 default)
   -rawDataOut - output just the data values, nothing else
   -htmlOut - output stats or histogram in HTML instead of plain text
   -doStats - perform stats measurement, default output text, see -htmlOut
   -doBed - output bed format
   -bedFile=<file> - constrain output to ranges specified in bed <file>
   -dataConstraint='DC' - where DC is one of < = >= <= == != 'in range'
   -ll=<F> - lowerLimit compare data values to F (float) (all but 'in range')
   -ul=<F> - upperLimit compare data values to F (float)
		(need both ll and ul when 'in range')

   -help - display more examples and extra options (to stderr)

   When no database is specified, track names will refer to .wig files

   example using the file chrM.wig:
	hgWiggle chrM
   example using the database table hg17.gc5Base:
	hgWiggle -chr=chrM -db=hg17 gc5Base
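
The -lift warning above comes down to a constant offset added to each
position. A minimal Python sketch of that arithmetic, assuming the ascii
output has already been parsed into (position, value) pairs. The function
name lift_positions is hypothetical and is not part of hgWiggle:

```python
def lift_positions(rows, lift=0):
    """Shift each position by `lift`, mirroring what -lift=<D> is described to do.

    rows is a list of (position, value) pairs; this parsed shape is an
    assumption of the sketch, not hgWiggle's own data structure.
    """
    return [(pos + lift, value) for pos, value in rows]


# 0-relative positions, as in the default ascii output
rows = [(0, 0.25), (5, 0.5)]

# Equivalent of -lift=1: positions become 1-relative
one_relative = lift_positions(rows, lift=1)
```

With lift left at its default of 0 the rows come back unchanged, which
matches the documented default for -lift.
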
================================================================
========   hgsqldump   ====================================
================================================================
hgsqldump - Execute mysqldump using passwords from .hg.conf
usage:
   hgsqldump [OPTIONS] database [tables]
or:
   hgsqldump [OPTIONS] --databases [OPTIONS] DB1 [DB2 DB3 ...]
or:
   hgsqldump [OPTIONS] --all-databases [OPTIONS]
Generally anything in command line is passed to mysqldump
	after an implicit '-u user -ppassword'
See also: mysqldump
Note: the directory for results must be writable by mysql, e.g. 'chmod 777 .'
This is a security risk, so remember to change permissions back after use.
e.g.: hgsqldump --all -c --tab=. cb1

================================================================
========   hgvsToVcf   ====================================
================================================================
### kent source version 473 ###
hgvsToVcf - Convert HGVS terms to VCF tab-separated output
usage:
   hgvsToVcf db input.hgvs output.vcf
options:
   -noLeftShift        Don't do the VCF-conventional left shifting of ambiguous placements
db is a UCSC database such as hg19, hg38 etc.
Only nucleotide HGVS terms (g., c., n., m.) are supported, not protein (p.).

================================================================
========   hicInfo   ====================================
================================================================
### kent source version 473 ###
hicInfo - Retrieve and display header information for a .hic file.  Uses UDC for remote files.
usage:
   hicInfo file.hic
options:
   -attrs - write out the attribute dictionary for the file (might be large)
   -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs

================================================================
========   htmlCheck   ====================================
================================================================
### kent source version 473 ###
htmlCheck - Do a little reading and verification of html file
usage:
   htmlCheck how url
where how is:
   ok - just check for 200 return.  Print error message and exit -1 if no 200
   getAll - read the url (header and html) and print to stdout
   getHeader - read the header and print to stdout
   getCookies - print list of cookies
   getHtml - print the html, but not the header to stdout
   getForms - print the form structure to stdout
   getVars - print the form variables to stdout
   getLinks - print links
   getTags - print out just the tags
   checkLinks - check links in page
   checkLinks2 - check links in page and all subpages in same host
             (Just one level of recursion)
   checkLocalLinks - check local links in page
   checkLocalLinks2 - check local links in page and connected local pages
             (Just one level of recursion)
   submit - submit first form in page if any using 'GET' method
   validate - do some basic validations including TABLE/TR/TD nesting
   strictTagNestCheck - check tags are correctly nested
options:
   cookies=cookie.txt - Cookies is a two column file
           containing <cookieName><space><value><newLine>
   withSrc - causes the get and checkLinks commands to also include SRC= links.
note: url will need to be in quotes if it contains an ampersand or question mark.
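
The cookies= option above describes a two column file with one
<cookieName><space><value> pair per line. A small Python sketch of writing
and reading that layout. Both function names are hypothetical, and splitting
at the first space is an assumption drawn from the format description rather
than from htmlCheck's source:

```python
def format_cookie_file(cookies):
    """Render {name: value} as '<cookieName><space><value><newLine>' lines."""
    lines = []
    for name, value in cookies.items():
        # A space in the name or a newline anywhere would break the two-column layout.
        if " " in name or "\n" in name or "\n" in value:
            raise ValueError(f"cannot represent cookie {name!r} in two columns")
        lines.append(f"{name} {value}\n")
    return "".join(lines)


def parse_cookie_file(text):
    """Split each non-empty line at the first space into name and value."""
    cookies = {}
    for line in text.splitlines():
        if not line:
            continue
        name, _, value = line.partition(" ")
        cookies[name] = value
    return cookies
```

The text returned by format_cookie_file can be saved as cookie.txt and
passed as cookies=cookie.txt.
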
================================================================
========   hubCheck   ====================================
================================================================
### kent source version 473 ###
hubCheck - Check a track data hub for integrity.
usage:
   hubCheck http://yourHost/yourDir/hub.txt
options:
   -noTracks             - don't check remote files for tracks, just trackDb (faster)
   -checkSettings        - check trackDb settings to spec
   -version=[v?|url]     - version to validate settings against
                                     (defaults to version in hub.txt, or current standard)
   -extra=[file|url]     - accept settings in this file (or url)
   -level=base|required  - reject settings below this support level
   -settings             - just list settings with support level
   -genome=genome        - only check this genome
   -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs.
                                     Will create this directory if not existing
   -httpsCertCheck=[abort,warn,log,none] - set the ssl certificate verification mode.
   -httpsCertCheckDomainExceptions= - space separated list of domains to whitelist.
   -printMeta            - print the metadata for each track
   -cacheTime=N          - set cache refresh time in seconds, default 1
   -verbose=2            - output verbosely

================================================================
========   hubClone   ====================================
================================================================
### kent source version 473 ###
hubClone - Clone the remote hub text files to a local copy in newDirectoryName, fixing up bigDataUrls to remote location if necessary
usage:
   hubClone http://url/to/hub.txt
options:
   -udcDir=/dir/to/udcCache   Path to udc directory
   -download                  Download data files in addition to the hub configuration files

================================================================
========   hubPublicCheck   ====================================
================================================================
### kent source version 473 ###
hubPublicCheck - checks that the labels in hubPublic match what is in the hub labels
   outputs SQL statements to put the table into compliance
usage:
   hubPublicCheck tableName 
options:
   -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs
   -addHub=url           - output statements to add url to table

================================================================
========   isPcr   ====================================
================================================================
### kent source version 473 ###
isPcr - Standalone v 39x1 In-Situ PCR Program
usage:
   isPcr database query output
where database is a fasta, nib, or twoBit file or a text file containing
a list of these files,  query is a text file containing three columns: name,
forward primer, and reverse primer,  and output is where the results go.
The names 'stdin' and 'stdout' can be used as file names to make using the
program in pipes easier.
options:
   -ooc=N.ooc  Use overused tile file N.ooc.  N should correspond to 
               the tileSize
   -tileSize=N the size of match that triggers an alignment.  
               Default is 11.
   -stepSize=N spacing between tiles. Default is 5.
   -maxSize=N - Maximum size of PCR product (default 4000)
   -minSize=N - Minimum size of PCR product (default 0)
   -minPerfect=N - Minimum size of perfect match at 3' end of primer (default 15)
   -minGood=N - Minimum size where there must be 2 matches for each mismatch (default 15)
   -mask=type  Mask out repeats.  Alignments won't be started in masked region
               but may extend through it in nucleotide searches.  Masked areas
               are ignored entirely in protein or translated searches. Types are
                 lower - mask out lower cased sequence
                 upper - mask out upper cased sequence
                 out   - mask according to database.out RepeatMasker .out file
                 file.out - mask database according to RepeatMasker file.out
   -makeOoc=N.ooc Make overused tile file. Database needs to be complete genome.
   -repMatch=N sets the number of repetitions of a tile allowed before
               it is marked as overused.  Typically this is 256 for tileSize
               12, 1024 for tile size 11, 4096 for tile size 10.
               Default is 1024.  Only comes into play with makeOoc
   -noSimpRepMask Suppresses simple repeat masking.
   -flipReverse Reverse complement reverse (second) primer before using
   -out=XXX - Output format.  Either
      fa - fasta with position, primers in header (default)
      bed - tab delimited format. Fields: chrom/start/end/name/score/strand
      psl - blat format.
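
The query file described above has three columns: name, forward primer, and
reverse primer. A Python sketch that renders such a file follows. The usage
text does not name the column separator, so the tab used here is an
assumption, and the function name and primer sequences are placeholders:

```python
def format_query(primer_pairs):
    """Return query text with one 'name, forward primer, reverse primer' line per pair."""
    lines = []
    for name, forward, reverse in primer_pairs:
        for field in (name, forward, reverse):
            # Whitespace inside a field would make the column count ambiguous.
            if not field or any(ch.isspace() for ch in field):
                raise ValueError(f"invalid query field: {field!r}")
        # Tab separator is an assumption; the usage text only says "three columns".
        lines.append("\t".join((name, forward, reverse)) + "\n")
    return "".join(lines)


# Placeholder primers, for illustration only
query_text = format_query([
    ("pair1", "ACGTACGTACGTACGTAC", "TGCATGCATGCATGCATG"),
])
```

Written to a file, the result would be passed as the query argument in
'isPcr database query output'.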

================================================================
========   ixIxx   ====================================
================================================================
### kent source version 473 ###
ixIxx - Create indices for simple line-oriented file of format 
<symbol> <free text>
usage:
   ixIxx in.text out.ix out.ixx
Where out.ix is a word index, and out.ixx is an index into the index.
options:
   -prefixSize=N Size of prefix to index on in ixx.  Default is 5.
   -binSize=N Size of bins in ixx.  Default is 64k.
   -maxWordLength=N Maximum allowed word length. 
     Words with more characters than this limit are ignored and will not appear in index or be searchable.  Default is 31.

================================================================
========   lavToAxt   ====================================
================================================================
lavToAxt - Convert blastz lav file to an axt file (which includes sequence)
usage:
   lavToAxt in.lav tNibDir qNibDir out.axt
Where tNibDir/qNibDir are either directories full of nib files, or a single
twoBit file
options:
   -fa  qNibDir is interpreted as a fasta file of multiple dna seq instead of directory of nibs
   -tfa tNibDir is interpreted as a fasta file of multiple dna seq instead of directory of nibs
   -dropSelf  drops alignment blocks on the diagonal for self alignments
   -scoreScheme=fileName Read the scoring matrix from a blastz-format file.
                (only used in conjunction with -dropSelf, to rescore 
                alignments when blocks are dropped)

================================================================
========   lavToPsl   ====================================
================================================================
lavToPsl - Convert blastz lav to psl format
usage:
   lavToPsl in.lav out.psl
options:
   -target-strand=c set the target strand to c (default is no strand)
   -bed output bed instead of psl
   -scoreFile=filename  output lav scores to side file, such that
                        each psl line in out.psl is matched by a score line.

================================================================
========   ldHgGene   ====================================
================================================================
### kent source version 473 ###
ldHgGene - load database with gene predictions from a gff file.
usage:
     ldHgGene database table file(s).gff
options:
     -bin         Add bin column (now the default)
     -nobin       don't add binning (you probably don't want this)
     -exon=type   Sets type field for exons to specific value
     -oldTable    Don't overwrite what's already in table
     -noncoding   Forces whole prediction to be UTR
     -gtf         input is GTF, stop codon is not in CDS
     -predTab     input is already in genePredTab format
     -requireCDS  discard genes that don't have CDS annotation
     -out=gpfile  write output, in genePred format, instead of loading
                  table. Database is ignored.
     -genePredExt create an extended genePred, including frame
                  information and gene name
     -impliedStopAfterCds - implied stop codon in GFF/GTF after CDS

================================================================
========   liftOver   ====================================
================================================================
### kent source version 473 ###
liftOver - Move annotations from one assembly to another
usage:
   liftOver oldFile map.chain newFile unMapped
oldFile and newFile are in bed format by default, but can be in GFF and
maybe eventually others with the appropriate flags below.
The map.chain file has the old genome as the target and the new genome
as the query.

***********************************************************************
WARNING: liftOver was only designed to work between different
         assemblies of the same organism. It may not do what you want
         if you are lifting between different organisms. If there has
         been a rearrangement in one of the species, the size of the
         region being mapped may change dramatically after mapping.
***********************************************************************

options:
   -minMatch=0.N Minimum ratio of bases that must remap. Default 0.95
   -gff  File is in gff/gtf format.  Note that the gff lines are converted
         separately.  It would be good to have a separate check after this
         that the lines that make up a gene model still make a plausible gene
         after liftOver
   -genePred - File is in genePred format
   -sample - File is in sample format
   -bedPlus=N - File is bed N+ format (i.e. first N fields conform to bed format)
   -positions - File is in browser "position" format
   -hasBin - File has bin value (used only with -bedPlus)
   -tab - Separate by tabs rather than space (used only with -bedPlus)
   -pslT - File is in psl format, map target side only
   -ends=N - Lift the first and last N bases of each record and combine the
             result. This is useful for lifting large regions like BAC end pairs.
   -minBlocks=0.N Minimum ratio of alignment blocks or exons that must map
                  (default 1.00)
   -fudgeThick    (bed 12 or 12+ only) If thickStart/thickEnd is not mapped,
                  use the closest mapped base.  Recommended if using 
                  -minBlocks.
   -multiple               Allow multiple output regions
   -noSerial               In -multiple mode, do not put a serial number in the 5th BED column
   -minChainT, -minChainQ  Minimum chain size in target/query, when mapping
                           to multiple output regions (default 0, 0)
   -minSizeT               deprecated synonym for -minChainT (ENCODE compat.)
   -minSizeQ               Min matching region size in query with -multiple.
   -chainTable             Used with -multiple, format is db.tablename,
                               to extend chains from net (preserves dups)
   -errorHelp              Explain error messages
   -preserveInput          Attach positions from the input file to item names, to assist in
                           determining what got mapped where (bed4+, gff, genePred, sample only)
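
When liftOver is driven from a script, the usage line and options above map
onto an argument list. A Python sketch that only builds that list and does
not run anything. The function name is hypothetical, only a few of the
options are covered, and placing options ahead of the file arguments is a
choice of this sketch:

```python
def liftover_argv(old_file, chain, new_file, unmapped,
                  min_match=None, bed_plus=None, tab=False, multiple=False):
    """Build an argument list for: liftOver oldFile map.chain newFile unMapped."""
    argv = ["liftOver"]
    if min_match is not None:
        # -minMatch is a ratio of bases, so it must lie between 0 and 1.
        if not 0.0 <= min_match <= 1.0:
            raise ValueError("min_match must be between 0 and 1")
        argv.append(f"-minMatch={min_match}")
    if bed_plus is not None:
        argv.append(f"-bedPlus={bed_plus}")
    if tab:
        # The usage text says -tab is used only with -bedPlus.
        if bed_plus is None:
            raise ValueError("-tab is used only with -bedPlus")
        argv.append("-tab")
    if multiple:
        argv.append("-multiple")
    argv += [old_file, chain, new_file, unmapped]
    return argv


argv = liftover_argv("old.bed", "map.chain", "new.bed", "unMapped", min_match=0.9)
```

The resulting list can be handed to subprocess.run(argv, check=True) on a
machine where liftOver is on the PATH.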

================================================================
========   liftOverMerge   ====================================
================================================================
### kent source version 473 ###
liftOverMerge - Merge multiple regions in BED 5 files
                   generated by liftOver -multiple
usage:
   liftOverMerge oldFile newFile
options:
   -mergeGap=N    Max size of gap to merge regions (default 0)

================================================================
========   liftUp   ====================================
================================================================
### kent source version 473 ###
liftUp - change coordinates of .psl, .agp, .gap, .gl, .out, .align, .gff, .gtf
.bscore .tab .gdup .axt .chain .net, .gp, .genepred, .wab, .bed, .bed3, or .bed8 files to
parent coordinate system.

usage:
   liftUp [-type=.xxx] destFile liftSpec how sourceFile(s)
The optional -type parameter tells what type of files to lift
If omitted the type is inferred from the suffix of destFile
Type is one of the suffixes described above.
DestFile will contain the merged and lifted source files,
with the coordinates translated as per liftSpec.  LiftSpec
is tab-delimited with each line of the form:
   offset oldName oldSize newName newSize
LiftSpec may optionally have a sixth column specifying + or - strand,
but strand is not supported for all input types.
The 'how' parameter controls what the program will do with
items which are not in the liftSpec.  It must be one of:
   carry - Items not in liftSpec are carried to dest without translation
   drop  - Items not in liftSpec are silently dropped from dest
   warn  - Items not in liftSpec are dropped.  A warning is issued
   error - Items not in liftSpec generate an error
If the destination is a .agp file then a 'large inserts' file
also needs to be included in the command line:
   liftUp dest.agp liftSpec how inserts sourceFile(s)
This file describes where large inserts due to heterochromatin
should be added. Use /dev/null and set -gapsize if there's no inserts file.

options:
   -nohead  No header written for .psl files
   -dots=N Output a dot every N lines processed
   -pslQ  Lift query (rather than target) side of psl
   -axtQ  Lift query (rather than target) side of axt
   -chainQ  Lift query (rather than target) side of chain
   -netQ  Lift query (rather than target) side of net
   -wabaQ  Lift query (rather than target) side of waba alignment
   	(waba lifts only work with query side at this time)
   -nosort Don't sort bed, gff, or gdup files, to save memory
   -gapsize change contig gapsize from default
   -ignoreVersions - Ignore NCBI-style version number in sequence ids of input files
   -extGenePred lift extended genePred
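
The liftSpec format and the 'how' parameter described above can be pictured
with a short Python sketch. It assumes translation means adding the offset to
start and end, ignores the optional strand column, and uses hypothetical
function names. It is an illustration of the file format, not liftUp's
implementation:

```python
import sys


def parse_lift_spec(text):
    """Map oldName -> (offset, oldSize, newName, newSize) from tab-delimited lines."""
    spec = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Columns: offset oldName oldSize newName newSize [strand]
        offset, old_name, old_size, new_name, new_size = line.split("\t")[:5]
        spec[old_name] = (int(offset), int(old_size), new_name, int(new_size))
    return spec


def lift_item(spec, name, start, end, how="error"):
    """Translate one item into parent coordinates; return None if it is dropped."""
    if name not in spec:
        if how == "carry":
            return (name, start, end)   # carried to dest without translation
        if how == "drop":
            return None                 # silently dropped
        if how == "warn":
            print(f"warning: {name} not in liftSpec", file=sys.stderr)
            return None
        if how == "error":
            raise KeyError(f"{name} not in liftSpec")
        raise ValueError(f"unknown how: {how!r}")
    offset, _old_size, new_name, _new_size = spec[name]
    # Assumption: plus-strand translation is a plain offset shift.
    return (new_name, start + offset, end + offset)


spec = parse_lift_spec("1000\tctg1\t500\tchr1\t20000\n")
lifted = lift_item(spec, "ctg1", 10, 20)
```

Here an item at 10-20 on ctg1 lands at 1010-1020 on chr1, because ctg1
starts at offset 1000 within chr1.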

================================================================
========   linesToRa   ====================================
================================================================
### kent source version 473 ###
linesToRa - generate .ra format from lines with pipe-separated fields
usage:
   linesToRa in.txt out.ra

================================================================
========   localtime   ====================================
================================================================
localtime - convert unix timestamp to date string
usage: localtime <time stamp>
	<time stamp> - integer 0 to 2147483647
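
For comparison, the same conversion can be sketched in Python. The accepted
range is taken from the usage text above (2147483647 is 2**31 - 1). The
output layout used here is an assumption and may differ from what the
localtime program prints:

```python
import time

MAX_STAMP = 2147483647  # upper bound quoted in the usage text (2**31 - 1)


def stamp_to_local_string(stamp):
    """Convert a unix timestamp to a date string in the local time zone."""
    if not 0 <= stamp <= MAX_STAMP:
        raise ValueError(f"time stamp must be an integer 0 to {MAX_STAMP}")
    # The YYYY-MM-DD HH:MM:SS layout is a choice of this sketch.
    return time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(stamp))
```

The result depends on the time zone of the machine running it, as with any
local-time conversion.
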
================================================================
========   mafAddIRows   ====================================
================================================================
### kent source version 473 ###
mafAddIRows - add 'i' rows to a maf
usage:
   mafAddIRows mafIn twoBitFile mafOut
WARNING:  requires a maf with only a single target sequence
options:
   -nBeds=listOfBedFiles
       reads in list of bed files, one per species, with N locations
   -addN
       adds rows of N's into maf blocks (rather than just annotating them)
   -addDash
       adds rows of -'s into maf blocks (rather than just annotating them)

================================================================
========   mafAddQRows   ====================================
================================================================
### kent source version 473 ###
mafAddQRows - Add quality data to a maf
usage:
   mafAddQRows species.lst in.maf out.maf
where each species.lst line contains two fields
   1) species name
   2) directory where the .qac and .qdx files are located
options:
  -divisor=n is value to divide Q value by.  Default is 5.

================================================================
========   mafCoverage   ====================================
================================================================
### kent source version 473 ###
mafCoverage - Analyse coverage by maf files - chromosome by 
chromosome and genome-wide.
usage:
   mafCoverage database mafFile
Note maf file must be sorted by chromosome,tStart
   -restrict=restrict.bed Restrict to parts in restrict.bed
   -count=N Number of matching species to count coverage. Default = 3 

================================================================
========   mafFetch   ====================================
================================================================
mafFetch - get overlapping records from an MAF using an index table
usage:
   mafFetch db table overBed mafOut

Select MAF records overlapping records in the BED, using
the database table to look up the file and record offset.
Only the first 3 columns are required in the bed.

Options:

================================================================
========   mafFilter   ====================================
================================================================
### kent source version 473 ###
mafFilter - Filter out maf files. Output goes to standard out
usage:
   mafFilter file(s).maf
options:
   -tolerate - Just ignore bad input rather than aborting.
   -minCol=N - Filter out blocks with fewer than N columns (default 1)
   -minRow=N - Filter out blocks with fewer than N rows (default 2)
   -maxRow=N - Filter out blocks with >= N rows (default 100)
   -factor - Filter out scores below -minFactor * (ncol**2) * nrow
   -minFactor=N - Factor to use with -factor (default 5)
   -minScore=N - Minimum allowed score (alternative to -minFactor)
   -reject=filename - Save rejected blocks in filename
   -needComp=species - all alignments must have species as one of the components
   -overlap - Reject overlapping blocks in reference (assumes ordered blocks)
   -componentFilter=filename - Filter out blocks without a component listed in filename 
   -speciesFilter=filename - Filter out blocks without a species listed in filename 

================================================================
========   mafFrag   ====================================
================================================================
### kent source version 473 ###
mafFrag - Extract maf sequences for a region from database
usage:
   mafFrag database mafTrack chrom start end strand out.maf
options:
   -outName=XXX  Use XXX instead of database.chrom for the name

================================================================
========   mafFrags   ====================================
================================================================
### kent source version 473 ###
mafFrags - Collect MAFs from regions specified in a 6 column bed file
usage:
   mafFrags database track in.bed out.maf
options:
   -orgs=org.txt - File with list of databases/organisms in order
   -bed12 - If set, in.bed is a bed 12 file, including exons
   -thickOnly - Only extract subset between thickStart/thickEnd
   -meFirst - Put native sequence first in maf
   -txStarts - Add MAF txstart region definitions ('r' lines) using BED name
    and output actual reference genome coordinates in MAF.
   -refCoords - output actual reference genome coordinates in MAF.

================================================================
========   mafGene   ====================================
================================================================
### kent source version 473 ###
mafGene - output protein alignments using maf and genePred
usage:
   mafGene dbName mafTable genePredTable species.lst output
arguments:
   dbName         name of SQL database
   mafTable       name of maf file table
   genePredTable  name of the genePred table
   species.lst    list of species names
   output         put output here
options:
   -useFile           genePredTable argument is a genePred file name
   -geneName=foobar   name of gene as it appears in genePred
   -geneList=foolst   name of file with list of genes
   -geneBeds=foo.bed  name of bed file with genes and positions
   -chrom=chr1        name of chromosome from which to grab genes
   -exons             output exons
   -noTrans           don't translate output into amino acids
   -uniqAA            put out unique pseudo-AA for every different codon
   -includeUtr        include the UTRs, use only with -noTrans
   -delay=N           delay N seconds between genes (default 0)
   -noDash            don't output lines with all dashes

================================================================
========   mafMeFirst   ====================================
================================================================
### kent source version 473 ###
mafMeFirst - Move component to top if it is one of the named ones.  
Useful in conjunction with mafFrags when you don't want the one with 
the gene name to be in the middle.
usage:
   mafMeFirst in.maf me.list out.maf
options:
   -xxx=XXX

================================================================
========   mafNoAlign   ====================================
================================================================
### kent source version 473 ###
mafNoAlign - output a bed where there are no other sequences mapping to the reference sequence
NOTE: code expects first sequence in blocks to be the reference and to be on the '+' strand
      and the sequence name to be in the form species.chromosome.
usage:
   mafNoAlign in.maf out.bed

================================================================
========   mafOrder   ====================================
================================================================
### kent source version 473 ###
mafOrder - order components within a maf file
usage:
   mafOrder mafIn order.lst mafOut
where order.lst has one species per line
options:

================================================================
========   mafRanges   ====================================
================================================================
### kent source version 473 ###
mafRanges - Extract ranges of target (or query) coverage from maf and 
            output as BED 3 (e.g. for processing by featureBits).
usage:
   mafRanges in.maf db out.bed
            db should appear in in.maf alignments as the first part of 
            "db.seqName"-style sequence names.  The seqName part will 
            be used as the chrom field in the range printed to out.bed.
options:
   -otherDb=oDb  Output ranges only for alignments that include oDb.
                 oDb can be a comma-separated list.
   -notAllOGap   Don't include bases for which all other species have a gap.
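
The "db.seqName" naming convention mentioned above can be sketched in
Python. Splitting at the first period is an assumption of this sketch, and
both function names are hypothetical:

```python
def split_src(src):
    """Split 'db.seqName' at the first period; db is None when there is no period."""
    db, sep, seq_name = src.partition(".")
    if not sep:
        return None, src
    return db, seq_name


def chrom_for(src, db):
    """Return the seqName part if src belongs to db, else None."""
    src_db, seq_name = split_src(src)
    return seq_name if src_db == db else None


chrom = chrom_for("hg17.chr11", "hg17")
```

The returned seqName is what would go in the chrom field of a BED 3 line.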


================================================================
========   mafSpeciesList   ====================================
================================================================
### kent source version 473 ###
mafSpeciesList - Scan maf and output all species used in it.
usage:
   mafSpeciesList in.maf out.lst
options:
   -ignoreFirst - If true ignore first species in each maf, useful when this
                  is a mafFrags result that puts gene id there.

================================================================
========   mafSpeciesSubset   ====================================
================================================================
### kent source version 473 ###
mafSpeciesSubset - Extract a maf that just has a subset of species.
usage:
   mafSpeciesSubset in.maf species.lst out.maf
Where:
    in.maf is a file where the sequence sources are either simple species
           names, or species.something.  Usually actually it's a genome
           database name rather than a species before the dot to tell the
           truth.
    species.lst is a file with a list of species to keep
    out.maf is the output.  It will have columns that are all - or . in
           the reduced species set removed, as well as the lines representing
           species not in species.lst removed.
options:
   -keepFirst - If set, keep the first 'a' line in a maf no matter what
                Useful for mafFrag results where we use this for the gene name

================================================================
========   mafSplit   ====================================
================================================================
### kent source version 473 ###
mafSplit - Split multiple alignment files
usage:
   mafSplit splits.bed outRoot file(s).maf
options:
   -byTarget       Make one file per target sequence.  (splits.bed input
                   is ignored).
   -outDirDepth=N  For use only with -byTarget.
                   Create N levels of output directory under current dir.
                   This helps prevent NFS problems with a large number of
                   files in a directory.  Using -outDirDepth=3 would
                   produce ./1/2/3/outRoot123.maf.
   -useSequenceName  For use only with -byTarget.
                     Instead of auto-incrementing an integer to determine
                     output filename, expect each target sequence name to
                     end with a unique number and use that number as the
                     integer to tack onto outRoot.
   -useFullSequenceName  For use only with -byTarget.
                     Instead of auto-incrementing an integer to determine
                     output filename, use the target sequence name
                     to tack onto outRoot.
   -useHashedName=N  For use only with -byTarget.
                     Instead of auto-incrementing an integer or requiring
                     a unique number in the sequence name, use a hash
                     function on the sequence name to compute an N-bit
                     number.  This limits the max #filenames to 2^N and
                     ensures that even if different subsets of sequences
                     appear in different pairwise mafs, the split file
                     names will be consistent (due to hash function).
                     This option is useful when a "scaffold-based"
                     assembly has more than one sequence name pattern,
                     e.g. both chroms and scaffolds.
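
The idea behind -useHashedName=N can be sketched in Python: hashing the
sequence name to an N-bit number gives at most 2^N distinct values, and the
same name always yields the same number. zlib.crc32 is only a stand-in here
and is not the hash function mafSplit uses. The function name is
hypothetical:

```python
import zlib


def hashed_file_number(seq_name, n_bits):
    """Reduce a sequence name to an n_bits-wide integer (0 .. 2**n_bits - 1)."""
    # crc32 yields 32 bits, so this stand-in supports at most 32 bits.
    if not 1 <= n_bits <= 32:
        raise ValueError("n_bits must be between 1 and 32")
    digest = zlib.crc32(seq_name.encode("utf-8"))
    return digest & ((1 << n_bits) - 1)  # keep only the low n_bits bits


number = hashed_file_number("scaffold_12345", 8)
```

Because the number depends only on the name, two different pairwise mafs
containing the same sequence would map it to the same split file.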


================================================================
========   mafSplitPos   ====================================
================================================================
### kent source version 473 ###
mafSplitPos - Pick positions to split multiple alignment input files
usage:
   mafSplitPos database size(Mbp) out.bed
options:
   -chrom=chrN   Restrict to one chromosome
   -minGap=N     Split only on gaps >N bp, defaults to 100, specify -1 to disable
   -minRepeat=N  Split only on repeats >N bp, defaults to 100, specify -1 to disable

================================================================
========   mafToAxt   ====================================
================================================================
### kent source version 473 ###
mafToAxt - Convert from maf to axt format
usage:
   mafToAxt in.maf tName qName output
Where tName and qName are the names for the
target and query sequences respectively.
tName should be maf target since it must always
be oriented in "+" direction.
  Use 'first' for tName to always use first sequence
Options:
  -stripDb - Strip names from start to first period.

================================================================
========   mafToBigMaf   ====================================
================================================================
### kent source version 473 ###
mafToBigMaf - Put ucsc standard maf file into bigMaf format
usage:
   mafToBigMaf referenceDb input.maf out.bed
options:
   -xxx=XXX

================================================================
========   mafToPsl   ====================================
================================================================
### kent source version 473 ###
mafToPsl - Convert maf to psl format
usage:
   mafToPsl querySrc targetSrc in.maf out.psl

The query and target src can be either an organism prefix (hg17),
or a full src sequence name (hg17.chr11), or just the sequence name
if the MAF does not contain organism prefixes.


================================================================
========   mafToSnpBed   ====================================
================================================================
### kent source version 473 ###
mafToSnpBed - finds SNPs in MAF and builds a bed with their functional consequence
usage:
   mafToSnpBed database input.maf input.gp output.bed
options:
   -xxx=XXX

================================================================
========   mafsInRegion   ====================================
================================================================
### kent source version 473 ###
mafsInRegion - Extract MAFS in a genomic region
usage:
    mafsInRegion regions.bed out.maf|outDir in.maf(s)
options:
    -outDir - output separate files named by bed name field to outDir
    -keepInitialGaps - keep alignment columns at the beginning and end of a block that are gapped in all species

================================================================
========   makeTableList   ====================================
================================================================
### kent source version 473 ###
makeTableList - create/recreate tableList tables (cache of SHOW TABLES and DESCRIBE)
usage:
   makeTableList [assemblies]
options:
   -host               show tables: mysql host
   -user               show tables: mysql user
   -password           show tables: mysql password
   -toProf             optional: mysql profile to write table list to (target server)
   -toHost             alternative to toProf: mysql target host
   -toUser             alternative to toProf: mysql target user
   -toPassword         alternative to toProf: mysql target pwd
   -hgcentral          specify an alternative hgcentral db name when using -all
   -all                recreate tableList for all active assemblies in hg.conf's hgcentral
   -bigFiles           create table with tuples (track, name of bigfile)

================================================================
========   maskOutFa   ====================================
================================================================
### kent source version 473 ###
maskOutFa - Produce a masked .fa file given an unmasked .fa and
a RepeatMasker .out file, or a .bed file to mask on.
usage:
   maskOutFa in.fa maskFile out.fa.masked
where in.fa and out.fa.masked can be the same file, and
maskFile can end in .out (for RepeatMasker) or .bed.
MaskFile parameter can also be the word 'hard' in which case 
lower case letters are converted to N's.
options:
   -soft - puts masked parts in lower case other in upper.
   -softAdd - lower cases masked bits, leaves others unchanged
   -clip - clip out of bounds mask records rather than dying.
   -maskFormat=fmt - "out" or "bed" for when input does not have required extension.

================================================================
========   matrixClusterColumns   ====================================
================================================================
### kent source version 473 ###
matrixClusterColumns - Group the columns of a matrix into clusters, and output a matrix with
the same number of rows and generally much fewer columns. Combines columns by taking mean.
usage:
   matrixClusterColumns inMatrix.tsv meta.tsv cluster outMatrix.tsv outStats.tsv [cluster2 outMatrix2.tsv outStats2.tsv ... ]
where:
   inMatrix.tsv is a file in tsv format with cell labels in first row and gene labels in first column
   meta.tsv is a table where the first row is field labels and the first column is sample ids
   cluster is the name of the field with the cluster names
You can produce multiple clusterings in the same pass through the input matrix by specifying
additional cluster/outMatrix/outStats triples in the command line.
options:
   -makeIndex=index.tsv - output index tsv file with <matrix-col1><input-file-pos><line-len>
   -median if set output median rather than mean cluster value
   -excludeZeros if set exclude zeros when calculating mean/median
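
The column-combining step described above can be sketched in Python. This is
only an illustration of the idea, not the tool's implementation: the function
name, the in-memory data layout, and the choice to emit 0.0 for a cluster with
no remaining values are all assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean, median


def cluster_columns(cell_labels, rows, cell_to_cluster,
                    use_median=False, exclude_zeros=False):
    """Combine matrix columns that share a cluster label.

    cell_labels     -- column labels of the input matrix
    rows            -- list of (gene_label, [values]) pairs
    cell_to_cluster -- dict mapping each cell label to its cluster name
    """
    clusters = sorted(set(cell_to_cluster[c] for c in cell_labels))
    combine = median if use_median else mean
    out_rows = []
    for gene, values in rows:
        grouped = defaultdict(list)
        for cell, value in zip(cell_labels, values):
            if exclude_zeros and value == 0:
                continue  # mimic the idea behind -excludeZeros
            grouped[cell_to_cluster[cell]].append(value)
        # Assumption: an empty cluster is reported as 0.0
        combined = [combine(grouped[c]) if grouped[c] else 0.0 for c in clusters]
        out_rows.append((gene, combined))
    return clusters, out_rows
```

The output keeps one row per gene and one column per cluster, which is why the
result has the same number of rows and generally far fewer columns.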

================================================================
========   matrixMarketToTsv   ====================================
================================================================
### kent source version 473 ###
matrixMarketToTsv - Convert matrix file from Matrix Market sparse matrix format to tab-separated-values.
usage:
   matrixMarketToTsv in.mtx sampleLabels.lst geneLabels.lst out.tsv
where in.mtx is a matrix market format matrix.  SampleLabels is a text file
with one label per line.  It will end up in the first row of the output.
GeneLabels.lst is a text file with one gene name per line.  It will end up
in the first column of the output
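
As a rough illustration of the conversion, the following Python sketch expands
Matrix Market coordinate entries (1-based "row col value" triples after a
"rows cols entries" size line) into tab-separated rows. It assumes matrix rows
are genes and matrix columns are samples; the function name and the "gene"
corner label are hypothetical, and the tool's actual output details may differ.

```python
def matrix_market_to_rows(mtx_lines, sample_labels, gene_labels):
    """Return a list of tab-separated lines for a sparse coordinate matrix."""
    # Skip blank lines and the '%' header/comment lines
    body = [ln for ln in mtx_lines if ln.strip() and not ln.startswith("%")]
    n_rows, n_cols, _n_entries = (int(x) for x in body[0].split())
    dense = [[0.0] * n_cols for _ in range(n_rows)]
    for ln in body[1:]:
        i, j, value = ln.split()
        dense[int(i) - 1][int(j) - 1] = float(value)  # indices are 1-based
    # Sample labels form the first row, gene labels the first column
    lines = ["\t".join(["gene"] + list(sample_labels))]
    for gene, values in zip(gene_labels, dense):
        lines.append("\t".join([gene] + [format(v, "g") for v in values]))
    return lines
```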

================================================================
========   matrixNormalize   ====================================
================================================================
### kent source version 473 ###
matrixNormalize - Normalize a matrix somehow - make its columns or rows all sum to one or have vector length one.
usage:
   matrixNormalize direction how inMatrix outMatrix
where "direction" is one of
   row - normalize rows to one
   column - normalize columns to one
and "how" is one of
   sum - sum adds to one after normalization
   length - Euclidean length as a vector is one
options:
   -target=val - use target val instead of one for normalizing
   -serial - disable use of pthreads to speed up via parallelization
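
The two "how" modes and the two directions can be sketched in Python as
follows. This is a minimal illustration under stated assumptions, not the
tool's code: the function names are made up for this example, and leaving an
all-zero vector unchanged is an assumption about how to avoid dividing by zero.

```python
import math


def normalize_vector(values, how="sum", target=1.0):
    """Scale values so their sum (or Euclidean length) equals target."""
    if how == "sum":
        scale = sum(values)
    elif how == "length":
        scale = math.sqrt(sum(v * v for v in values))
    else:
        raise ValueError("how must be 'sum' or 'length'")
    if scale == 0:
        return list(values)  # assumption: all-zero vectors pass through
    return [v * target / scale for v in values]


def normalize_matrix(matrix, direction="row", how="sum", target=1.0):
    """Normalize each row or each column of a list-of-lists matrix."""
    if direction == "row":
        return [normalize_vector(row, how, target) for row in matrix]
    if direction == "column":
        cols = [normalize_vector(list(col), how, target) for col in zip(*matrix)]
        return [list(row) for row in zip(*cols)]
    raise ValueError("direction must be 'row' or 'column'")
```

Here the `target` parameter plays the role described for -target=val: each row
or column ends up summing to (or having length) `target` instead of one.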

================================================================
========   matrixToBarChartBed   ====================================
================================================================
### kent source version 473 ###
matrixToBarChartBed - Attach a labeled expression matrix to a bed file joining
on the matrix's first column and the bed's name column.
usage:
   matrixToBarChartBed matrix.tsv mapping.bed barChartOutput.bed
where
   matrix is tab-separated values with the first row and column used as labels
   mapping.bed maps a row of the matrix using the bed's name field and matrix's 1st field
        The mapping.bed file is expected to have a 'name2' field as its last column
        and otherwise be at least bed 6.   Only the first 6 fields plus the name2 field are
        used.  Often this file will be made with gencodeVersionForGenes
options:
   -bedIx=N - use the N'th column of the mapping.bed as the id column. Default 4
   -trackDb=stanza.txt - output a trackDb stanza for this as a track
   -stats=stats.tsv - stats file from matrixClusterColumns, makes coloring in trackDb better

================================================================
========   mktime   ====================================
================================================================
mktime - convert date string to unix timestamp
usage: mktime YYYY-MM-DD HH:MM:SS
valid dates: 1970-01-01 00:00:00 to 2038-01-19 03:14:07
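
The same kind of conversion can be sketched in Python. This example assumes
the date string is interpreted as UTC, which is what makes the end of the
valid range above line up with the largest signed 32-bit timestamp; the tool
itself may instead honor the local time zone. The function name is
hypothetical.

```python
import calendar
import time


def date_to_unix(date_str):
    """Convert 'YYYY-MM-DD HH:MM:SS' (assumed UTC) to a unix timestamp."""
    parsed = time.strptime(date_str, "%Y-%m-%d %H:%M:%S")
    return calendar.timegm(parsed)  # timegm treats the struct_time as UTC
```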
================================================================
========   mrnaToGene   ====================================
================================================================
### kent source version 473 ###
mrnaToGene - convert PSL alignments of mRNAs to gene annotations
usage:
   mrnaToGene [options] psl genePredFile

Convert PSL alignments with CDS annotation from genbank to gene
annotations in genePred format.  Accessions without valid CDS are
optionally dropped. A best attempt is made to convert incomplete CDS
annotations.

The psl argument may either be a PSL file or a table in a database,
depending on options.  CDS may be obtained from the database or file.
Accessions in PSL files are tried with and without genbank versions.

Options:
  -db=db - get PSLs and CDS from this database, psl specifies the table.
  -cdsDb=db - get CDS from this database, psl is a file.
  -cdsFile=file - get CDS from this file, psl is a file.
   File is tab-separated with name as the first column and
   NCBI CDS the second
  -insertMergeSize=8 - Merge inserts (gaps) no larger than this many bases.
   A negative size disables merging of blocks.  This differs from specifying zero
   in that adjacent blocks will not be merged, allowing tracking of frame for
   each block. Defaults to 8 unless -cdsMergeSize or -utrMergeSize are specified,
   if either of these are specified, this option is ignored.
  -smallInsertSize=n - alias for -insertMergeSize
  -cdsMergeSize=-1 - merge gaps in CDS no larger than this size.
   A negative value disables.
  -cdsMergeMod3 - only merge CDS gaps if their size is a multiple of 3
  -utrMergeSize=-1  - merge gaps in UTR no larger than this size.
   A negative value disables.
  -requireUtr - Drop sequences that don't have both 5' and 3' UTR annotated.
  -genePredExt - create an extended genePred, including frame information.
  -allCds - consider PSL to be all CDS.
  -noCds - consider PSL to not contain any CDS.
  -keepInvalid - Keep sequences with invalid CDS.
  -quiet - Don't print info about dropped sequences.
  -ignoreUniqSuffix - ignore all characters after last `-' in qName
   when looking up CDS. Used when a suffix has been added to make qName
   unique.  It is not removed from the name in the genePred.


================================================================
========   netChainSubset   ====================================
================================================================
### kent source version 473 ###
netChainSubset - Create chain file with subset of chains that appear in the net
usage:
   netChainSubset in.net in.chain out.chain
options:
   -gapOut=gap.tab - Output gap sizes to file
   -type=XXX - Restrict output to particular type in net file
   -splitOnInsert - Split chain when get an insertion of another chain
   -wholeChains - Write entire chains referenced by net, don't split
    when a high-level net is encountered.  This is useful when nets
    have been filtered.
   -skipMissing - skip chains that are not found instead of generating
    an error.  Useful if chains have been filtered.

================================================================
========   netClass   ====================================
================================================================
### kent source version 473 ###
netClass - Add classification info to net
usage:
   netClass [options] in.net tDb qDb out.net
       tDb - database to fetch target repeat masker table information
       qDb - database to fetch query repeat masker table information
options:
   -tNewR=dir - Dir of chrN.out.spec files, with RepeatMasker .out format
                lines describing lineage specific repeats in target
   -qNewR=dir - Dir of chrN.out.spec files for query
   -noAr - Don't look for ancient repeats
   -qRepeats=table - table name for query repeats in place of rmsk
   -tRepeats=table - table name for target repeats in place of rmsk
                   - for example: -tRepeats=windowmaskerSdust
   -liftQ=file.lft - Lift in.net's query coords to chrom-level using
                     file.lft (for accessing chrom-level coords in qDb)
   -liftT=file.lft - Lift in.net's target coords to chrom-level using
                     file.lft (for accessing chrom-level coords in tDb)
   -qSizes=chrom.sizes - file with query chrom.sizes instead of reading
                   - the chromInfo table from the database

================================================================
========   netFilter   ====================================
================================================================
### kent source version 473 ###
netFilter - Filter out parts of net.  What passes
filter goes to standard output.  Note a net is a
recursive data structure.  If a parent fails to pass
the filter, the children are not even considered.
usage:
   netFilter in.net(s)
options:
   -q=chr1,chr2 - restrict query side sequence to those named
   -notQ=chr1,chr2 - restrict query side sequence to those not named
   -t=chr1,chr2 - restrict target side sequence to those named
   -notT=chr1,chr2 - restrict target side sequence to those not named
   -minScore=N - restrict to those scoring at least N
   -maxScore=N - restrict to those scoring less than N
   -minGap=N  - restrict to those with gap size (tSize) >= N
   -minAli=N - restrict to those with at least given bases aligning
   -maxAli=N - restrict to those with at most given bases aligning
   -minSizeT=N - restrict to those at least this big on target
   -minSizeQ=N - restrict to those at least this big on query
   -qStartMin=N - restrict to those with qStart at least N
   -qStartMax=N - restrict to those with qStart less than N
   -qEndMin=N - restrict to those with qEnd at least N
   -qEndMax=N - restrict to those with qEnd less than N
   -tStartMin=N - restrict to those with tStart at least N
   -tStartMax=N - restrict to those with tStart less than N
   -tEndMin=N - restrict to those with tEnd at least N
   -tEndMax=N - restrict to those with tEnd less than N
   -qOverlapStart=N - restrict to those where the query overlaps a region starting here
   -qOverlapEnd=N - restrict to those where the query overlaps a region ending here
   -tOverlapStart=N - restrict to those where the target overlaps a region starting here
   -tOverlapEnd=N - restrict to those where the target overlaps a region ending here
   -type=XXX - restrict to given type, may be repeated to allow several types
   -syn        - do filtering based on synteny (tuned for human/mouse).  
   -minTopScore=N - Minimum score for top level alignments. default 300000
   -minSynScore=N - Min syntenic block score (def=200,000). 
                      Default covers 27,000 bases including 9,000 
                      aligning--a very stringent requirement. 
   -minSynSize=N - Min syntenic block size (def=20,000).
   -minSynAli=N  - Min syntenic alignment size(def=10,000).
   -maxFar=N     - Max distance to allow synteny (def=200,000). 
   -nonsyn     - do inverse filtering based on synteny (tuned for human/mouse).  
   -chimpSyn   - do filtering based on synteny (tuned for human/chimp).  
   -fill - Only pass fills, not gaps. Only useful with -line.
   -gap  - Only pass gaps, not fills. Only useful with -line.
   -line - Do this a line at a time, not recursing
   -noRandom      - suppress chains involving 'random' chromosomes
   -noHap         - suppress chains involving chromosome names inc '_hap|_alt'
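
The note above that a net is a recursive data structure, and that children of
a failing parent are never considered, can be illustrated with a short Python
sketch. The dictionary layout ("score", "children") and the function name are
hypothetical stand-ins for this example, not the net file format or the tool's
internals.

```python
def filter_net(node, passes):
    """Return a filtered copy of node, or None if node fails the filter.

    node   -- dict with a 'children' list of nested nodes
    passes -- function taking a node and returning True to keep it
    """
    if not passes(node):
        # Parent fails: the whole subtree is dropped without
        # looking at the children.
        return None
    kept = []
    for child in node.get("children", []):
        filtered = filter_net(child, passes)
        if filtered is not None:
            kept.append(filtered)
    result = dict(node)
    result["children"] = kept
    return result
```

With a minimum-score test, a high-scoring grandchild nested under a
low-scoring child is dropped along with that child.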

================================================================
========   netSplit   ====================================
================================================================
netSplit - Split a genome net file into chromosome net files
usage:
   netSplit in.net outDir
options:
   -xxx=XXX

================================================================
========   netSyntenic   ====================================
================================================================
### kent source version 473 ###
netSyntenic - Add synteny info to net.
usage:
   netSyntenic in.net out.net
options:
   -xxx=XXX

================================================================
========   netToAxt   ====================================
================================================================
### kent source version 473 ###
netToAxt - Convert net (and chain) to axt.
usage:
   netToAxt in.net in.chain target.2bit query.2bit out.axt
note:
   directories full of .nib files (an older format)
   may also be used in place of target.2bit and query.2bit.
options:
   -qChain - net is with respect to the q side of chains.
   -maxGap=N - maximum size of gap before breaking. Default 100
   -gapOut=gap.tab - Output gap sizes to file
   -noSplit - Don't split chain when there is an insertion of another chain

================================================================
========   netToBed   ====================================
================================================================
### kent source version 473 ###
netToBed - Convert target coverage of net to a bed file.
usage:
   netToBed in.net out.bed
options:
   -maxGap=N - break up at gaps of given size or more
   -minFill=N - only include fill of given size or above.

================================================================
========   newProg   ====================================
================================================================
### kent source version 473 ###
newProg - make a new C source skeleton.
usage:
   newProg progName description words
This will make a directory 'progName' and a file in it 'progName.c'
with a standard skeleton

Options:
   -jkhgap - include jkhgap.a and mysql libraries as well as jkweb.a archives 
   -cgi    - create shell of a CGI script for web
================================================================
========   newPythonProg   ====================================
================================================================
### kent source version 473 ###
newPythonProg - Make a skeleton for a new python program
usage:
   newPythonProg programName "The usage statement"
options:
   -xxx=XXX

================================================================
========   nibFrag   ====================================
================================================================
### kent source version 473 ###
nibFrag - Extract part of a nib file as .fa (all bases/gaps lower case by default)
usage:
   nibFrag [options] file.nib start end strand out.fa
where strand is + (plus) or m (minus)
options:
   -masked       Use lower-case characters for bases meant to be masked out.
   -hardMasked   Use upper-case for not masked-out, and 'N' characters for masked-out bases.
   -upper        Use upper-case characters for all bases.
   -name=name    Use given name after '>' in output sequence.
   -dbHeader=db  Add full database info to the header, with or without -name option.
   -tbaHeader=db Format header for compatibility with tba, takes database name as argument.

================================================================
========   nibSize   ====================================
================================================================
### kent source version 473 ###
nibSize - print size of nibs
usage:
   nibSize nib1 [...]

================================================================
========   oligoMatch   ====================================
================================================================
oligoMatch - find perfect matches in sequence.
usage:
   oligoMatch oligos sequence output.bed
where "oligos" and "sequence" can be .fa, .nib, or .2bit files.
The oligos may contain IUPAC codes.

================================================================
========   overlapSelect   ====================================
================================================================
### kent source version 473 ###
overlapSelect - Select records based on overlapping chromosome ranges.
usage:
   overlapSelect [options] selectFile inFile outFile

Select records based on overlapping chromosome ranges.  The ranges are
specified in the selectFile, with each block specifying a range.
Records are copied from the inFile to outFile based on the selection
criteria.  Selection is based on blocks or exons rather than entire
range.

Options starting with -select* apply to selectFile and those starting
with -in* apply to inFile.

Options:
  -selectFmt=fmt - specify selectFile format:
          psl - PSL format (default for *.psl files).  Additional columns
                are copied as-is.
          pslq - PSL format, using query instead of target.  Additional columns
                 are copied as-is.
          genePred - genePred format (default for *.gp or *.genePred files).  Additional
                     columns are copied as-is.
          bed - BED format (default for *.bed files).
                If BED doesn't have blocks, the bed range is used. If it has more
                than 12 columns, the remainder of the columns are copied as-is.
          bed3+ - BED format where only the first three columns are used as the
                  range, the remainder are copied as-is.
          bed6+ - BED format where only the first six columns are used as the
                  range and strand, the remainder are copied as-is.
          chain - chain file format (default from .chain files)
          chainq - chain file format, using query instead of target
  -selectCoordCols=spec - selectFile is tab-separated with coordinates
       as described by spec, which is one of:
            o chromCol - chrom in this column followed by start and end.
            o chromCol,startCol,endCol,strandCol,name - chrom, start, end, and
              strand in specified columns. Columns can be omitted from the end
              or left empty to not specify.
          NOTE: column numbers are zero-based
  -selectCds - Use only CDS in the selectFile
  -selectRange - Use entire range instead of blocks from records in
          the selectFile.
  -inFmt=fmt - specify inFile format, same values as -selectFmt.
  -inCoordCols=spec - inFile is tab-separated with coordinates specified by
      spec, in format described above.
  -inCds - Use only CDS in the inFile
  -inRange - Use entire range instead of blocks of records in the inFile.
  -nonOverlapping - select non-overlapping instead of overlapping records
  -strand - must be on the same strand to be considered overlapping
  -oppositeStrand - must be on the opposite strand to be considered overlapping
  -excludeSelf - don't compare records with the same coordinates and name.
      Warning: using only one of -inCds or -selectCds will result in different
      coordinates for the same record.
  -idMatch - only select overlapping records if they have the same id
  -aggregate - instead of computing overlap based on individual select entries, 
      compute it based on the total number of inFile bases overlapped by selectFile
      records. -overlapSimilarity and -mergeOutput will not work with
      this option.
  -overlapThreshold=N - minimum fraction of an inFile record that
      must be overlapped by a single select record to be considered
      overlapping.  Note that this is only coverage by a single select
      record, not total coverage.  Default is 0.0.
  -overlapThresholdCeil=N - select only inFile records with less than
      this fraction of overlap with a single record, provided they are selected
      by other criteria.
  -overlapSimilarity=N - minimum fraction of bases in inFile and selectFile
      records that overlap the same genomic locations.  This is computed
      by (2*overlapBase)/(inFileBase+selectFileBases).
      Note that this is only coverage by a single select record and this
      is bidirectional: inFile and selectFile must overlap by this
      amount.  A value of 1.0 will select identical records (or CDS if
      both CDS options are specified).  Not currently supported with
      -aggregate.   Default is 0.0.
  -overlapSimilarityCeil=N - select only inFile records with less than this
      amount of similarity with a single record, provided they are selected by
      other criteria.
  -overlapBases=-1 - minimum number of bases of overlap, < 0 disables.
  -statsOutput - output overlap statistics instead of selected records. 
      If no overlap criteria are specified, all overlapping entries are
      reported.  Otherwise only the pairs passing the criteria are
      reported. This results in a tab-separated file with the columns:
         inId selectId inOverlap selectOverlap overBases
      Where inOverlap is the fraction of the inFile record overlapped by
      the selectFile record and selectOverlap is the fraction of the
      select record overlapped by inFile records.  With -aggregate, output
      is:
         inId inOverlap inOverBases inBases
  -statsOutputAll - like -statsOutput, however output all inFile records,
      including those that are not overlapped.
  -statsOutputBoth - like -statsOutput, however output all selectFile and
      inFile records, including those that are not overlapped.
  -mergeOutput - output file will be a merge of the input file with the
      selectFile records that selected it.  The format is
         inRec<tab>selectRec.
      if multiple select records hit, inRec is repeated. This will increase
      the memory required. Not supported with -nonOverlapping or -aggregate.
  -idOutput - output a tab-separated file of pairs of
         inId selectId
      with -aggregate, only a single column of inId is written
  -dropped=file  - output rows that were dropped to this file.
  -verbose=n - verbose > 1 prints some details.
  -tsv - output TSV headers instead of autoSql headers for statistics output. 
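
The -overlapThreshold and -overlapSimilarity measures described above can be
worked through with a small Python sketch. It is simplified to records with a
single block each, using half-open (start, end) ranges on the same chromosome
as in BED; the function names are made up for this example and this is not the
tool's implementation.

```python
def overlap_bases(a, b):
    """Number of bases shared by two half-open (start, end) ranges."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def in_overlap_fraction(in_range, select_range):
    """Fraction of the inFile record covered by one select record."""
    in_bases = in_range[1] - in_range[0]
    return overlap_bases(in_range, select_range) / in_bases


def overlap_similarity(in_range, select_range):
    """(2*overlapBases) / (inFileBases + selectFileBases)."""
    in_bases = in_range[1] - in_range[0]
    select_bases = select_range[1] - select_range[0]
    return 2 * overlap_bases(in_range, select_range) / (in_bases + select_bases)
```

For example, an inFile record at 100-200 and a select record at 150-250 share
50 bases, giving an overlap fraction of 0.5 and a similarity of
(2*50)/(100+100) = 0.5. Identical ranges give a similarity of 1.0.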

================================================================
========   para   ====================================
================================================================
### kent source version 473 ###
para - version 12.19
Manage a batch of jobs in parallel on a compute cluster.
Normal usage is to do a 'para create' followed by 'para push' until
job is done.  Use 'para check' to check status.
usage:

   para [options] command [command-specific arguments]

The commands are:

para create jobList
   This makes the job-tracking database from a text file with the
   command line for each job on a separate line.
   options:
      -cpu=N  Number of CPUs used by the jobs, default 1.
      -ram=N  Number of bytes of RAM used by the jobs.
         Default is RAM on node divided by number of cpus on node.
         Shorthand expressions allow t,g,m,k for tera, giga, mega, kilo.
         e.g. 4g = 4 Gigabytes.
      -batch=batchDir - specify the directory path that is used to store the
       batch control files.  The batchDir can be an absolute path or a path
       relative to the current directory.  The resulting path is used as the
       batch name.  The directory is created if it doesn't exist.  When
       creating a new batch, batchDir should not have been previously used as
       a batch name.  The batchDir must be writable by the paraHub process.
       This does not affect the working directory assigned to jobs.  It defaults
       to the directory where para is run.  If used, this option must be specified
       on all para commands for the  batch.  For example to run two batches in the
       same directory:
          para -batch=b1 make jobs1
          para -batch=b2 make jobs2
para push 
   This pushes forward the batch of jobs by submitting jobs to parasol
   It will limit parasol queue size to something not too big and
   retry failed jobs.
   options:
      -retries=N  Number of retries per job - default 4.
      -maxQueue=N  Number of jobs to allow on parasol queue. 
         Default 2000000.
      -minPush=N  Minimum number of jobs to queue. 
         Default 1.  Overrides maxQueue.
      -maxPush=N  Maximum number of jobs to queue - default 100000.
      -warnTime=N  Number of minutes job runs before hang warning. 
         Default 4320 (3 days).
      -killTime=N  Number of minutes hung job runs before push kills it.
         Killing is off by default for backwards compatibility.
      -delayTime=N  Number of seconds to delay before submitting next job 
         to minimize i/o load at startup - default 0.
      -priority=x  Set batch priority to high, medium, or low.
         Default medium (use high only with approval).
         If needed, use with make, push, create, shove, or try.
         Or, set batch priority to a specific numeric value - default 10.
         1 is emergency high priority, 
         10 is normal medium, 
         100 is low for bottomfeeders.
         Setting priority higher than normal (1-9) will be logged.
         Please keep low priority jobs short, they won't be pre-empted.
      -maxJob=x  Limit the number of jobs the batch can run.
         Specify number of jobs, for example 10 or 'unlimited'.
         Default unlimited displays as -1.
      -jobCwd=dir - specify the directory path to use as the current working
       directory for each job.  The dir can be an absolute path or a path
       relative to the current directory. It defaults to the directory where
       para is run.
para try 
   This is like para push, but only submits up to 10 jobs.
para shove
   Push jobs in this database until all are done or one fails after N retries.
para make jobList
   Create database and run all jobs in it if possible.  If one job
   fails repeatedly this will fail.  Suitable for inclusion in makefiles.
   Same as a 'create' followed by a 'shove'.
para check
   This checks on the progress of the jobs.
para stop
   This stops all the jobs in the batch.
para chill
   Tells system to not launch more jobs in this batch, but
   does not stop jobs that are already running.
para finished
   List jobs that have finished.
para hung
   List hung jobs in the batch (running > killTime).
para slow
   List slow jobs in the batch (running > warnTime).
para crashed
   List jobs that crashed or failed output checks the last time they were run.
para failed
   List jobs that crashed after repeated restarts.
para status
   List individual job status, including times.
para problems
   List jobs that had problems (even if successfully rerun).
   Includes host info.
para running
   Print info on currently running jobs.
para hippos time
   Print info on currently running jobs taking > 'time' (minutes) to run.
para time
   List timing information.
para recover jobList newJobList
   Generate a job list by selecting jobs from an existing list where
   the `check out' tests fail.
para priority 999
   Set batch priority. Values explained under 'push' options above.
para maxJob 999
   Set batch maxJob. Values explained under 'push' options above.
para ram 999
   Set batch ram usage. Values explained under 'push' options above.
para cpu 999
   Set batch cpu usage. Values explained under 'push' options above.
para resetCounts
   Set batch done and crash counters to 0.
para flushResults
   Flush results file.  Warns if batch has jobs queued or running.
para freeBatch
   Free all batch info on hub.  Works only if batch has nothing queued or running.
para showSickNodes
   Show sick nodes which have failed when running this batch.
para clearSickNodes
   Clear sick nodes statistics and consecutive crash counts of batch.

Common options
   -verbose=1 - set verbosity level.
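
Since 'para create' expects a text file with one job command line per line, a
jobList is often generated by a script. The Python sketch below shows one way
this might look; the netToBed command is taken from this file, but the
directory names, chromosome list, and function name are hypothetical.

```python
def make_job_list(chroms):
    """Return jobList text: one command line per job, one job per line."""
    return "".join(
        f"netToBed nets/{chrom}.net beds/{chrom}.bed\n" for chrom in chroms
    )


if __name__ == "__main__":
    # Write the list to a file named jobList in the current directory
    with open("jobList", "w") as handle:
        handle.write(make_job_list(["chr1", "chr2", "chrX"]))
```

The resulting file could then be passed to 'para create jobList', followed by
'para push' and 'para check' as described above.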

================================================================
========   paraFetch   ====================================
================================================================
### kent source version 473 ###
paraFetch - try to fetch url with multiple connections
usage:
   paraFetch N R URL {outPath}
   where N is the number of connections to use
         R is the number of retries
   outPath is optional. If not specified, it will attempt to parse URL to discover output filename.
options:
   -newer  only download a file if it is newer than the version we already have.
   -progress  Show progress of download.

================================================================
========   paraHub   ====================================
================================================================
### kent source version 473 ###
paraHub - parasol hub server version 12.19
usage:
    paraHub machineList
Where machine list is a file with the following columns:
    name - Network name
    cpus - Number of CPUs we can use
    ramSize - Megabytes of memory
    tempDir - Location of (local) temp dir
    localDir - Location of local data dir
    localSize - Megabytes of local disk
    switchName - Name of switch this is on

options:
   -spokes=N  Number of processes that feed jobs to nodes - default 30.
   -jobCheckPeriod=N  Minutes between checking on job - default 10.
   -machineCheckPeriod=N  Minutes between checking on machine - default 20.
   -subnet=XXX.YYY.ZZZ Only accept connections from subnet (example 192.168).
     Or CIDR notation (example 192.168.1.2/24).
     Supports comma-separated list of IPv4 or IPv6 subnets in CIDR notation.
   -nextJobId=N  Starting job ID number.
   -logFacility=facility  Log to the specified syslog facility - default local0.
   -logMinPriority=pri minimum syslog priority to log, also filters file logging.
    defaults to "warn"
   -log=file  Log to file instead of syslog.
   -debug  Don't daemonize
   -noResume  Don't try to reconnect with jobs running on nodes.
   -ramUnit=N  Number of bytes of RAM in the base unit used by the jobs.
      Default is RAM on node divided by number of cpus on node.
      Shorthand expressions allow t,g,m,k for tera, giga, mega, kilo.
      e.g. 4g = 4 Gigabytes.
   -defaultJobRam=N Number of ram units if a job has no specified ram usage.
      Defaults to 1.
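
The shorthand accepted by -ramUnit (t, g, m, k) can be illustrated with a
minimal Python sketch. The helper name parse_ram is hypothetical and not part
of parasol, and the 1024-based multipliers are an assumption: the usage text
above does not say whether the units are powers of 1024 or of 1000.

```python
# Hypothetical helper, not part of parasol: expand '4g'-style shorthand to bytes.
# Assumption: k, m, g, t are 1024-based multipliers.
MULTIPLIERS = {"k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}


def parse_ram(text):
    """Return the number of bytes described by text such as '4g' or '512'."""
    text = text.strip().lower()
    if not text:
        raise ValueError("empty RAM specification")
    suffix = text[-1]
    if suffix in MULTIPLIERS:
        return int(text[:-1]) * MULTIPLIERS[suffix]
    return int(text)


if __name__ == "__main__":
    for spec in ("4g", "512m", "2048"):
        print(spec, parse_ram(spec))
```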

================================================================
========   paraHubStop   ====================================
================================================================
paraHubStop - version 12.19
Shut down paraHub daemon.
usage:
   paraHubStop now

================================================================
========   paraNode   ====================================
================================================================
### kent source version 473 ###
paraNode - version 12.19
Parasol node server.
usage:
    paraNode start
options:
    -logFacility=facility  Log to the specified syslog facility - default local0.
    -logMinPriority=pri minimum syslog priority to log, also filters file logging.
     defaults to "warn"
    -log=file  Log to file instead of syslog.
    -debug  Don't daemonize
    -hub=host  Restrict access to connections from hub.
    -umask=000  Set umask to run under - default 002.
    -userPath=bin:bin/i386  User dirs to add to path.
    -sysPath=/sbin:/local/bin  System dirs to add to path.
    -env=name=value - add environment variable to jobs.  May be repeated.
    -randomDelay=N  Up to this many milliseconds of random delay before
        starting a job.  This is mostly to avoid swamping NFS with
        file opens when loading up an idle cluster.  Also it limits
        the impact on the hub of very short jobs. Default 5000.
    -cpu=N  Number of CPUs to use - default 1.

================================================================
========   paraNodeStart   ====================================
================================================================
### kent source version 473 ###
paraNodeStart - version 12.19
Start up parasol node daemons on a list of machines.
usage:
    paraNodeStart machineList
where machineList is a file containing a list of hosts.
Machine list contains the following columns:
     <name> <number of cpus>
It may have other columns as well.
options:
    -exe=/path/to/paraNode
    -logFacility=facility  Log to the specified syslog facility - default local0.
    -logMinPriority=pri minimum syslog priority to log, also filters file logging.
     defaults to "warn"
    -log=file  Log to file instead of syslog.
    -umask=000  Set umask to run under - default 002.
    -randomDelay=N  Set random start delay in milliseconds - default 5000.
    -userPath=bin:bin/i386  User dirs to add to path.
    -sysPath=/sbin:/local/bin  System dirs to add to path.
    -env=name=value - add environment variable to jobs.  May be repeated.
    -hub=machineHostingParaHub  Nodes will ignore messages from elsewhere.
    -rsh=/path/to/rsh/like/command.

================================================================
========   paraNodeStatus   ====================================
================================================================
paraNodeStatus - version 12.19
Check status of paraNode on a list of machines.
usage:
    paraNodeStatus machineList
options:
    -retries=N  Number of retries to get in touch with machine.
        The first retry is after 1/100th of a second. 
        Each retry after that takes twice as long up to a maximum
        of 1 second per retry.  Default is 7 retries and takes
        about a second.
    -long  List details of current and recent jobs.
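
The retry schedule described for -retries can be sketched in Python as
follows. This is only an illustration of the doubling-with-cap rule stated
above, not the paraNodeStatus implementation; the function name retry_delays
is hypothetical. With the default of 7 retries the delays sum to about 1.27
seconds, which matches "about a second".

```python
def retry_delays(retries=7, first=0.01, cap=1.0):
    """Delay in seconds before each retry: doubles each time, capped at cap."""
    delays = []
    delay = first
    for _ in range(retries):
        delays.append(min(delay, cap))
        delay *= 2
    return delays


if __name__ == "__main__":
    delays = retry_delays()
    print(delays)
    print("total seconds: %.2f" % sum(delays))
```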

================================================================
========   paraNodeStop   ====================================
================================================================
Couldn't open -verbose=2 , No such file or directory
================================================================
========   paraSync   ====================================
================================================================
### kent source version 473 ###
paraSync 1.0
paraSync - uses paraFetch to recursively mirror url to given path
usage:
   paraSync {options} N R URL outPath
   where N is the number of connections to use
         R is the number of retries
options:
   -A='ext1,ext2'  means accept only files with ext1 or ext2
   -newer  only download a file if it is newer than the version we already have.
   -progress  Show progress of download.

================================================================
========   paraTestJob   ====================================
================================================================
paraTestJob - version 12.19
A good test job to run on Parasol.  Can be configured to take a long time or crash.
usage:
   paraTestJob count
Run a relatively time consuming algorithm count times.
This algorithm takes about 1/10 of a second each time.
options:
   -crash  Try to write to NULL when done.
   -err  Return -1 error code when done.
   -output=file  Make some output in file as well.
   -heavy=n  Make output heavy: n extra lumberjack lines.
   -input=file  Make it read in a file too.
   -sleep=n  Sleep for N seconds.

================================================================
========   parasol   ====================================
================================================================
parasol - parallel job management system for a compute cluster

Parasol version 12.19
Parasol is the name given to the overall system for managing jobs on
a computer cluster and to this specific command.  This command is
intended primarily for system administrators.  The 'para' command
is the primary command for users.
Usage in brief:
   parasol add machine machineFullHostName localTempDir  - Add new machine to pool.
    or 
   parasol add machine machineFullHostName cpus ramSizeMB localTempDir localDir localSizeMB switchName
   parasol remove machine machineFullHostName "reason why"  - Remove machine from pool.
   parasol check dead - Check machines marked dead ASAP, some have been fixed.
   parasol add spoke  - Add a new spoke daemon.
   parasol [options] add job command-line   - Add job to list.
         options:
            -in=in - Where to get stdin, default /dev/null
            -out=out - Where to put stdout, default /dev/null
            -wait - If set wait for job to finish to return and return with job status code
            -err=outFile - set stderr to out file - only works with wait flag
            -verbose=N - set verbosity level, default level is 1
            -printId - prints jobId to stdout
            -dir=dir - set output results dir, default is current dir
            -results=resultFile fully qualified path to the results file, 
             or `results' in the current directory if not specified.
            -cpu=N  Number of CPUs used by the jobs, default 1.
            -ram=N  Number of bytes of RAM used by the jobs.
             Default is RAM on node divided by number of cpus on node.
             Shorthand expressions allow t,g,m,k for tera, giga, mega, kilo.
             e.g. 4g = 4 Gigabytes.
   parasol [options] clear sick  - Clear sick stats on a batch.
         options:
            -results=resultFile fully qualified path to the results file, 
             or `results' in the current directory if not specified.
   parasol remove job id  - Remove job of given ID.
   parasol ping [count]  - Ping hub server to make sure it's alive.
   parasol remove jobs userName [jobPattern]  - Remove jobs submitted by user that
         match jobPattern (which may include ? and * escaped for shell).
   parasol list machines  - List machines in pool.
   parasol [-extended] list jobs  - List jobs one per line.
   parasol list users  - List users one per line.
   parasol [options] list batches  - List batches one per line.
         option - 'all' if set include inactive
   parasol list sick  - List nodes considered sick by all running batches, one per line.
   parasol status  - Summarize status of machines, jobs, and spoke daemons.
   parasol [options] pstat2  - Get status of jobs queued and running.
         options:
            -results=resultFile fully qualified path to the results file, 
             or `results' in the current directory if not specified.
   parasol flushResults
         Flush results file.  Warns if batch has jobs queued or running.
         options:
            -results=resultFile fully qualified path to the results file, 
             or `results' in the current directory if not specified.
options:
   -host=hostname - connect to a paraHub process on a remote host instead of
                    localhost.
Important note:
  Options must precede positional arguments

================================================================
========   positionalTblCheck   ====================================
================================================================
### kent source version 473 ###
positionalTblCheck - check that positional tables are sorted
usage:
   positionalTblCheck db table

options:
  -verbose=n  n>=2, print tables as checked
This will check sorting of a table in a variety of formats.
It looks for commonly used names for chrom and chrom start
columns.  It also handles split tables.

================================================================
========   pslCDnaFilter   ====================================
================================================================
### kent source version 473 ###
pslCDnaFilter - Filter cDNA alignments in psl format.
usage:
    pslCDnaFilter [options] inPsl outPsl

Filter cDNA alignments in psl format.  Filtering criteria are
comparative, selecting near best in genome alignments for each
given cDNA and non-comparative, based only on the quality of an
individual alignment.

WARNING: comparative filters require that the input is sorted by
query name.  The command: 'sort -k 10,10' will do the trick.
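
For cases where the PSL rows are already in memory, the same grouping by
query name can be sketched in Python. This assumes headerless, tab-separated
PSL lines with qName in column 10; the function names and the sample rows are
made up for illustration, and the resulting order may differ from that of
'sort -k 10,10' depending on locale.

```python
def sort_psl_by_query(lines):
    """Return headerless PSL lines ordered by qName (tab-separated column 10)."""
    return sorted(lines, key=lambda line: line.split("\t")[9])


def make_row(q_name, t_name):
    """Build a made-up 21-column PSL row, for illustration only."""
    fields = ["100", "0", "0", "0", "0", "0", "0", "0", "+",
              q_name, "100", "0", "100",
              t_name, "1000", "0", "100",
              "1", "100,", "0,", "0,"]
    return "\t".join(fields)


if __name__ == "__main__":
    rows = [make_row("b", "chr1"), make_row("a", "chr2"), make_row("b", "chr3")]
    for row in sort_psl_by_query(rows):
        print(row)
```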

Each alignment is assigned a score that is based on identity and
weighted towards longer alignments and those with introns.  This
can do either global or local best-in-genome selection.  Local
near best in genome keeps fragments of an mRNA that align in
discontinuous locations from other fragments.  It is useful for
unfinished genomes.  Global near best in genome keeps alignments
based on overall score.

Options:
   -algoHelp - print message describing the filtering algorithm.

   -localNearBest=-1.0 - local near best in genome filtering,
    keeping alignments within this fraction of the top score for
    each aligned portion of the mRNA. A value of zero keeps only
    the best for each fragment. A value of -1.0 disables
    (default).

   -globalNearBest=-1.0 - global near best in genome filtering,
    keeping alignments within this fraction of the top score.  A
    value of zero keeps only the best alignment.  A value of -1.0
    disables (default).

   -ignoreNs - don't include Ns (repeat masked) while calculating the
    score and coverage. That is treat them as unaligned rather than
    mismatches.  Ns are still counted as mismatches when calculating
    the identity.

   -ignoreIntrons - don't favor apparent introns when scoring.

   -minId=0.0 - only keep alignments with at least this fraction
    identity.

   -minCover=0.0 - minimum fraction of query that must be
    aligned.  If -polyASizes is specified and the query is in
    the file, the poly-A is not included in coverage
    calculation.

   -decayMinCover  -  the minimum coverage is calculated
    per alignment from the query size using the formula:
       minCoverage = 1.0 - qSize / 250.0
    and minCoverage is bounded between 0.25 and 0.9.

   -minSpan=0.0 - keep only alignments whose target length is
    at least this fraction of the longest alignment passing the
    other filters.  This can be useful for removing possible
    retroposed genes.

   -minQSize=0 - drop queries shorter than this size

   -minAlnSize=0 - minimum number of aligned bases.  This includes
    repeats, but excludes poly-A/poly-T bases if available.

   -minNonRepSize=0 - Minimum number of matching bases that are not repeats.
    This does not include mismatches.
    Must use -repeats on BLAT if doing unmasked alignments.

   -maxRepMatch=1.0 - Maximum fraction of matching bases
    that are repeats.  Must use -repeats on BLAT if doing
    unmasked alignments.

   -repsAsMatch - treat matches in repeats just like other matches
 
   -maxAlignsDrop=-1 - maximum number of alignments for a given query. If
    exceeded, then all alignments of this query are dropped.
    A value of -1 disables (default)

   -maxAligns=-1 - maximum number of alignments for a given query. If
    exceeded, then alignments are sorted by score and only this number
    will be saved.  A value of -1 disables (default)

   -polyASizes=file - tab-separated file with information about
    poly-A tails and poly-T heads.  Format is outputted by
    faPolyASizes:

        id seqSize tailPolyASize headPolyTSize

   -usePolyTHead - if a poly-T head was detected and is longer
    than the poly-A tail, it is used when calculating coverage
    instead of the poly-A tail.

   -bestOverlap - filter overlapping alignments, keeping the best of
    alignments that are similar.  This is designed to be used with
    overlapping, windowed alignments, where one alignment might be truncated.
    Does not discard ones with weird overlap unless -filterWeirdOverlapped
    is specified.

   -hapRegions=psl - PSL format alignments of each haplotype pseudo-chromosome
    to the corresponding reference chromosome region.  This is used to map
    alignments between regions.

   -dropped=psl - save psls that were dropped to this file.

   -weirdOverlapped=psl - output weirdly overlapping PSLs to
    this file.

   -filterWeirdOverlapped - Filter weirdly overlapped alignments, keeping
    the single highest scoring one or an arbitrary one if multiple with
    the same high score.

   -alignStats=file - output the per-alignment statistics to this file

   -uniqueMapped - keep only cDNAs that are uniquely aligned after all
    other filters have been applied.

   -noValidate - don't run pslCheck validation.

   -statsOut=file - write filtering stats to this file, overrides -verbose=1

   -verbose=1 - 0: quiet
                1: output stats, unless -statsOut is specified
                2: list problem alignment (weird or invalid)
                3: list dropped alignments and reason for dropping
                4: list kept psl and info
                5: info about all PSLs

   -hapRefMapped=psl - output PSLs of haplotype to reference chromosome
    cDNA alignments mappings (for debugging purposes).

   -hapRefCDnaAlns=psl - output PSLs of haplotype cDNA to reference cDNA
    alignments (for debugging purposes).

   -hapLociAlns=outfile - output grouping of final alignments created by
    haplotype mapping process.  Each row will start with an integer haplotype
    group id number followed by a PSL record.  All rows with the same id are
    alignments of a given cDNA that were determined to be haplotypes of
    the same locus.  Alignments that are not part of a haplotype locus are not
    included.

   -alnIdQNameMode - add internal assigned alignment numbers to cDNA names
    on output.  Useful for debugging, as they are included in the verbose
    tracing as [#1], etc.  Will make a mess of normal production usage.

   -blackList=file.txt - adds a list of accession ranges to a black list.
    Any accession on this list is dropped. Black list file is two columns
    where the first column is the beginning of the range, and the second
    column is the end of the range, inclusive.


The default options don't do any filtering. If no filtering
criteria are specified, all PSLs will be passed through, except
those that are internally inconsistent.

THE INPUT MUST BE SORTED BY QUERY for the comparative filters.
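
The -decayMinCover formula given in the options above can be written out as a
short Python sketch. The function name decay_min_cover is hypothetical; the
arithmetic follows the usage text literally. Under that formula, queries of
25 bases or fewer get the ceiling of 0.9 and queries of 187.5 bases or more
get the floor of 0.25.

```python
def decay_min_cover(q_size):
    """minCoverage = 1.0 - qSize / 250.0, bounded between 0.25 and 0.9."""
    min_coverage = 1.0 - q_size / 250.0
    return max(0.25, min(0.9, min_coverage))


if __name__ == "__main__":
    for q_size in (10, 50, 100, 1000):
        print(q_size, decay_min_cover(q_size))
```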

================================================================
========   pslCat   ====================================
================================================================
pslCat - concatenate psl files
usage:
   pslCat file(s)
options:
   -check parses input.  Detects more errors but slower
   -nohead omit psl header
   -dir  files are directories (concatenate all in dirs)
   -out=file put output to file rather than stdout
   -ext=.xxx  limit files in directories to those with extension

================================================================
========   pslCheck   ====================================
================================================================
### kent source version 473 ###
pslCheck - validate PSL files
usage:
   pslCheck fileTbl(s)
options:
   -db=db - get targetSizes from this database, and if file doesn't exist,
    look for a table in this database.
   -prot - confirm psls are protein psls
   -noCountCheck - don't validate that match/mismatch counts match
    the total size of the alignment blocks
   -pass=pslFile - write PSLs without errors to this file
   -fail=pslFile - write PSLs with errors to this file
   -filter - use program as a filter, with -pass and/or -fail, don't error exit
    on problems, but do report them.
   -targetSizes=sizesFile - tab file with columns of target and size.
    If specified, psl is checked to have a valid target and target
    coordinates.
   -skipInsertCounts - Don't validate insert counts.  Useful for BLAT protein
    PSLs where these are not computed consistently.
   -querySizes=sizesFile - file with query sizes.
   -ignoreQUniq - ignore everything after the last `-' in the qName field, that
    is sometimes used to generate a unique identifier
   -quiet - don't write error messages, just filter

================================================================
========   pslDropOverlap   ====================================
================================================================
pslDropOverlap - deletes all overlapping (trivial/diagonal) self-alignment blocks. 
usage:
    pslDropOverlap in.psl out.psl
This discards information in mismatch, repMatch and nCount, lumping all into match.
(all matching bases are counted in the match column).

================================================================
========   pslFilter   ====================================
================================================================
pslFilter - filter out psl file
    pslFilter in.psl out.psl 
options
    -dir  Input files are directories rather than single files
    -reward=N (default 1) Bonus to score for match
    -cost=N (default 1) Penalty to score for mismatch
    -gapOpenCost=N (default 4) Penalty for gap opening
    -gapSizeLogMod=N (default 1.00) Penalty for gap sizes
    -minScore=N (default 15) Minimum score to pass filter
    -minMatch=N (default 30) Min match (including repeats) to pass
    -minUniqueMatch (default 20) Min non-repeats to pass
    -maxBadPpt (default 700) Maximum divergence in parts per thousand
    -minAli (default 600) Minimum ratio query in alignment in ppt
    -noHead  Don't output psl header
    -minAliT (default 0) Like minAli for target

================================================================
========   pslHisto   ====================================
================================================================
### kent source version 473 ###
pslHisto - Collect counts on PSL alignments for making histograms.
usage:
    pslHisto [options] what inPsl outHisto

Collect counts on PSL alignments for making histograms. These
can then be analyzed with R, textHistogram, etc.

The 'what' argument determines what data to collect, the following
are currently supported:

  o alignsPerQuery - number of alignments per query. Output is one
    line per query with the number of alignments.

  o coverSpread - difference between the highest and lowest coverage
    for alignments of a query.  Output line per query, with the difference.
    Only includes queries with multiple alignments

  o idSpread - difference between the highest and lowest fraction identity
    for alignments of a query.  Output line per query, with the difference.

Options:
   -multiOnly - omit queries with only one alignment from output.
   -nonZero - omit queries with zero values.
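
As a rough illustration of what 'alignsPerQuery' collects, the following
Python sketch counts alignments per qName (column 10 of tab-separated PSL
rows). It is not the pslHisto implementation: the function name, the
multi_only parameter (mirroring -multiOnly), and the sample rows are
assumptions made for this example.

```python
from collections import Counter


def aligns_per_query(psl_lines, multi_only=False):
    """Count alignments per query name; optionally drop single-alignment queries."""
    counts = Counter(
        line.split("\t")[9] for line in psl_lines if line.strip()
    )
    return {q: n for q, n in counts.items() if not multi_only or n > 1}


def fake_row(q_name):
    """Made-up 21-column row with only qName filled in, for illustration."""
    return "\t".join(["0"] * 9 + [q_name] + ["0"] * 11)


if __name__ == "__main__":
    rows = [fake_row("mrnaA"), fake_row("mrnaB"), fake_row("mrnaA")]
    for q_name, count in sorted(aligns_per_query(rows).items()):
        print(q_name, count)
```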

================================================================
========   pslLiftSubrangeBlat   ====================================
================================================================
### kent source version 473 ###
pslLiftSubrangeBlat - lift PSLs from blat subrange alignments
usage:
   pslLiftSubrangeBlat inPsl outPsl

Lift a PSL with target coordinates from a blat subrange query
(e.g. blah/hg18.2bit:chr1:1000-20000) which has subrange
coordinates as the target name (e.g. chr1:1000-20000) to
actual target coordinates.

options:
  -tSizes=szfile - lift target side based on tName, using target sizes from
                   this tab separated file.
  -qSizes=szfile - lift query side based on qName, using query sizes from
                   this tab separated file.
Must specify -tSizes, -qSizes, or both.
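
The idea of lifting subrange coordinates can be sketched in Python as below.
This is a simplified illustration, not the program's algorithm: it assumes a
positive-strand target, assumes the subrange start is a zero-based offset to
add to each coordinate, and uses hypothetical function names.

```python
def parse_subrange(name):
    """Split 'chr1:1000-20000' into ('chr1', 1000, 20000)."""
    chrom, _, span = name.rpartition(":")
    start, sep, end = span.partition("-")
    if not chrom or not sep:
        raise ValueError("not a subrange name: %r" % name)
    return chrom, int(start), int(end)


def lift_target(t_name, t_start, t_end):
    """Shift subrange-relative coordinates by the subrange start (assumed offset)."""
    chrom, offset, _ = parse_subrange(t_name)
    return chrom, t_start + offset, t_end + offset


if __name__ == "__main__":
    print(lift_target("chr1:1000-20000", 5, 50))
```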

================================================================
========   pslMap   ====================================
================================================================
### kent source version 473 ###
pslMap - map PSL alignments to new targets using alignments of the old target to the new target.
usage:
   pslMap [options] inPsl mapFile outPsl

pslMap - map PSL alignments to new targets using alignments of
the old target to the new target.  Given inPsl and mapPsl, where
the target of inPsl is the query of mapPsl, create a new PSL
with the query of inPsl aligned to all the targets of mapPsl.

If inPsl is a protein to nucleotide alignment and mapPsl is a
nucleotide to nucleotide alignment, the resulting alignment is a
nucleotide to nucleotide alignment, in CDS coordinates, of the mRNA that
would code for the protein.  This is useful as it gives base
alignments of spliced codons.

Protein to NA alignments can be determined from the PSL, otherwise
they are assumed to be NA-NA unless the types of the alignments are
specified with -inType and -mapType.  The following combinations are
valid, along with the type of output,

     inPslType   mapPslType  outPslType
     na_na       na_na       na_na
     prot_prot   prot_prot   prot_prot
     prot_na     na_na       cds_na
     prot_prot   na_na       cds_na
     prot_prot   prot_na     cds_na

A chain file may be used instead of mapPsl.

Options:
  -chainMapFile - mapFile is a chain file instead of a psl file
  -swapMap - swap query and target sides of map file.
  -swapIn - swap query and target sides of inPsl file.
  -check - validate input, mapping, and mapped PSLs.  This does slow
   down the program some, so it is optional.
  -suffix=str - append str to the query ids in the output
   alignment.  Useful with protein alignments, where the result
   is not actually an alignment of the protein.
  -keepTranslated - if either psl is translated, the output psl
   will be translated (both strands explicit).  Normally an
   untranslated psl will always be created.
  -mapFileWithInQName - The first column of the mapFile PSL records are a qName,
   the remainder is a standard PSL.  When an inPsl record is mapped, only
   mapping records are used with the corresponding qName.
  -inType=type - input alignment type (prot-prot, prot-na, na-na)
   This is the type after swapping if -swapIn is supplied.
  -mapType=type - map alignment type (prot-prot, prot-na, na-na)
   This is the type after swapping if -swapMap is supplied.
  -mapInfo=file - output a file with information about each mapping.
   The file has the following columns:
     o srcQName, srcQStart, srcQEnd, srcQSize - qName, etc of
       psl being mapped (source alignment)
     o srcTName, srcTStart, srcTEnd - tName, etc of psl being
       mapped
     o srcStrand - strand of psl being mapped
     o srcAligned - number of aligned bases in psl being mapped
     o mappingQName, mappingQStart, mappingQEnd - qName, etc of
       mapping psl used to map alignment
     o mappingTName, mappingTStart, mappingTEnd - tName, etc of
       mapping psl
     o mappingStrand - strand of mapping psl
     o mappingId - chain id, or psl file row
     o mappedQName mappedQStart, mappedQEnd - qName, etc of
       mapped psl
     o mappedTName, mappedTStart, mappedTEnd - tName, etc of
       mapped psl
     o mappedStrand - strand of mapped psl
     o mappedAligned - number of aligned bases that were mapped
     o qStartTrunc - aligned bases at qStart not mapped due to
       mapping psl/chain not covering the entire source psl.
       This is from the start of the query in the positive
       direction.
     o qEndTrunc - similarly for qEnd
     o mappedPslLine - zero-based line number of the corresponding PSL
       in outPsl.
   If the psl could not be mapped, the mapping* and mapped* columns are empty.
  -tsv - write output of mapInfo as a TSV rather than autoSql format file.
  -mappingPsls=pslFile - write mapping alignments that were used in
   PSL format to this file.  Transformations that were done, such as
   -swapMap, will be reflected in this file.  There will be a one-to-one
   correspondence of rows of this file to rows of the outPsl file.
  -simplifyMappingIds - simplify mapping ids (inPsl target
   name and mapFile query name) before matching them. This
   first drops everything after the last `-', and then drops
   everything after the last remaining `.'.
  -verbose=n  - verbose output
     2 - show each overlap and the mapping
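
The two-step rule described for -simplifyMappingIds can be illustrated with a
small Python sketch. The function name and the example ids are hypothetical,
and the sketch assumes the `-' and `.' delimiters themselves are dropped along
with the text after them.

```python
def simplify_mapping_id(identifier):
    """Drop text from the last '-' on, then from the last remaining '.' on."""
    head, sep, _ = identifier.rpartition("-")
    if sep:
        identifier = head
    head, sep, _ = identifier.rpartition(".")
    if sep:
        identifier = head
    return identifier


if __name__ == "__main__":
    # Made-up ids for illustration only.
    for example in ("NM_000123.4-1", "NM_000123.4", "NM_000123"):
        print(example, "->", simplify_mapping_id(example))
```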

================================================================
========   pslMapPostChain   ====================================
================================================================
### kent source version 473 ###
pslMapPostChain - Post genomic pslMap (TransMap) chaining.
usage:
    pslMapPostChain [options] inPsl outPsl

Post genomic pslMap (TransMap) chaining.  This takes transcripts
that have been mapped via genomic chains and adds back in
blocks that didn't get included in genomic chains due
to complex rearrangements or other issues.
This can also handle other PSLs, including protein-RNA alignments

This program has not seen much use and may not do what you want

================================================================
========   pslMrnaCover   ====================================
================================================================
pslMrnaCover - Make histogram of coverage percentage of mRNA in psl.
usage:
   pslMrnaCover mrna.psl mrna.fa
options:
   -minSize=N  - default 100.  Minimum size of mRNA considered
   -listZero=zero.tab - List accessions that don't align in zero.tab

================================================================
========   pslPairs   ====================================
================================================================
pslPairs - join paired ends in psl alignments
usage: pslPairs <pslFile> <pairFile> <pslTableName> <outFilePrefix>
  creates: <outFilePrefix>.pairs file
  pslFile	- filtered psl alignments of ends from kluster run
  pairFile	- three column tab separated: forward reverse cloneId
		- forward and reverse columns can be comma separated end ids
  pslTableName	- table name the psl alignments have been loaded into
  outFilePrefix	- prefix used for each output file name
Options:
  -max=N	- maximum length of clone sequence (default=47000)
  -min=N	- minimum length of clone sequence (default=32000)
  -slopval=N	- deviation from max/min clone lengths allowed for slop report
		- (default=5000)
  -nearTop=N	- maximum deviation from best match allowed (default=0.001)
  -minId=N	- minimum pct ID of at least one end (default=0.96)
  -minOrphanId=N - minimum pct ID for orphan alignment (default=0.96)
  -tInsert=N	- maximum insert bases allowed in sequence alignment
		- (default=500)
  -hardMax=N	- absolute maximum clone length for long report (default=75000)
  -verbose	- display all informational messages
  -noBin	- do not include bin column in output file
  -noRandom	- do not include placements on random portions
		- {length(chr name) < 7}
  -slop		- create <outFilePrefix>.slop file of pairs that fall within
		- slop length
  -short	- create <outFilePrefix>.short file of pairs shorter than
		- min size
  -long		- create <outFilePrefix>.long file of pairs longer than
		- max size, but less than hardMax size
  -mismatch	- create <outFilePrefix>.mismatch file of pairs with
		- bad orientation of ends
  -orphan	- create <outFilePrefix>.orphan file of unmatched end sequences
================================================================
========   pslPartition   ====================================
================================================================
### kent source version 473 ###
pslPartition - split PSL files into non-overlapping sets
usage:
   pslPartition [options] pslFile outDir

Split psl files into non-overlapping sets for use in cluster jobs,
limiting memory usage, etc. Multiple levels of directories can be
created under outDir to prevent slow access to huge directories.
The pslFile may be compressed and no ordering is assumed.

options:
  -outLevels=0 - number of output subdirectory levels.  0 puts all files
   directly in outDir; 2 will create files in the form outDir/0/0/00.psl
  -partSize=20000 - will combine non-overlapping partitions, while attempting
   to keep them under this number of PSLs.  This reduces the number of
   files that are created while ensuring that there are no overlaps
   between any two PSL files.  A value of 0 creates a PSL file per set of
   overlapping PSLs.
  -dropContained - drop PSLs that are completely contained in a block of
   another PSL.
  -parallel=n - use this many cores for parallel sorting


================================================================
========   pslPosTarget   ====================================
================================================================
### kent source version 473 ###
pslPosTarget - flip psl strands so target is positive and implicit
usage:
   pslPosTarget inPsl outPsl

================================================================
========   pslPretty   ====================================
================================================================
pslPretty - Convert PSL to human-readable output
usage:
   pslPretty in.psl target.lst query.lst pretty.out
options:
   -axt             Save in format like Scott Schwartz's axt format.
                    Note gaps in both sequences are still allowed in the
                    output, which not all axt readers will expect.
   -dot=N           Output a dot every N records.
   -long            Don't abbreviate long inserts.
   -check=fileName  Output alignment checks to filename.
It's recommended that the psl file be sorted by target if it contains
multiple targets; otherwise, this will be extremely slow. The target and query
lists can be fasta, 2bit or nib files, or a list of these files, one per line.

================================================================
========   pslProtToRnaCoords   ====================================
================================================================
### kent source version 473 ###
pslProtToRnaCoords - Convert protein alignments to RNA coordinates
usage:
   pslProtToRnaCoords inPsl outPsl

Convert either a protein/protein or protein/NA PSL to NA/NA PSL.  This
multiplies coordinates and statistics by three.  As this can occasionally
result in blocks overlapping, overlap is trimmed as needed.
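
The multiply-by-three step can be shown with a tiny Python sketch. It covers
only the scaling of block starts and sizes from amino-acid to nucleotide
units; the overlap trimming mentioned above is not modeled, and the function
name is hypothetical.

```python
def scale_protein_blocks(block_starts, block_sizes):
    """Convert protein-unit block starts and sizes to nucleotide units (x3)."""
    return [s * 3 for s in block_starts], [n * 3 for n in block_sizes]


if __name__ == "__main__":
    starts, sizes = scale_protein_blocks([0, 10], [5, 7])
    print(starts, sizes)
```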

================================================================
========   pslRc   ====================================
================================================================
### kent source version 473 ###
pslRc - reverse-complement psl
usage:
    pslRc [options] inPsl outPsl

reverse-complement psl

Options:

================================================================
========   pslRecalcMatch   ====================================
================================================================
### kent source version 473 ###
pslRecalcMatch - Recalculate match,mismatch,repMatch columns in psl file.
This can be useful if the psl went through pslMap, or if you've added 
lower-case repeat masking after the fact
usage:
   pslRecalcMatch in.psl targetSeq querySeq out.psl
where targetSeq is either a nib directory or a two bit file
and querySeq is a fasta file, nib file, two bit file, or list
of such files.  The psl's should be simple non-translated ones.
This will work faster if the in.psl is sorted on target.
options:
   -ignoreQUniq - ignore everything after the last `-' in the qName field, that
    is sometimes used to generate a unique identifier
   -ignoreQMissing - pass through the record if querySeq doesn't include qName

================================================================
========   pslRemoveFrameShifts   ====================================
================================================================
### kent source version 473 ###
pslRemoveFrameShifts - remove frame shifts from psl
usage:
   pslRemoveFrameShifts file.psl out.psl

================================================================
========   pslReps   ====================================
================================================================
### kent source version 473 ###
pslReps - Analyze repeats and generate genome-wide best alignments from a
sorted set of local alignments
usage:
    pslReps in.psl out.psl out.psr
where:
    in.psl is an alignment file generated by psLayout and sorted by pslSort
    out.psl is the best alignment output
    out.psr contains repeat info
options:
    -nohead            Don't add PSL header.
    -ignoreSize        Will not weigh as much in favor of larger alignments.
    -noIntrons         Will not penalize for not having introns when calculating
                       size factor.
    -singleHit         Takes single best hit, not splitting into parts.
    -minCover=0.N      Minimum coverage to output.  Default is 0.
    -ignoreNs          Ignore Ns when calculating minCover.
    -minAli=0.N        Minimum alignment ratio.  Default is 0.93.
    -nearTop=0.N       How much can deviate from top and be taken.
                       Default is 0.01.
    -minNearTopSize=N  Minimum size of alignment that is near top
                       for alignment to be kept.  Default 30.
    -coverQSizes=file  Tab-separated file with effective query sizes.
                       When used with -minCover, this allows polyAs
                       to be excluded from the coverage calculation.

================================================================
========   pslScore   ====================================
================================================================
### kent source version 473 ###
pslScore - calculate web blat score from psl files
usage:
   pslScore <file.psl> [moreFiles.psl]
options:
   none at this time

columns in output:

#tName	tStart	tEnd	qName:qStart-qEnd	score	percentIdentity
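
The score column can be approximated with the Python sketch below. It follows
the score formula published in the UCSC Genome Browser BLAT FAQ; the function
and argument names are made up for this example, and the kent source remains
the authoritative definition.

```python
def psl_score(match, mis_match, rep_match, q_num_insert, t_num_insert,
              is_protein=False):
    """Approximate the web BLAT score from PSL count columns."""
    # Protein alignments are weighted by three bases per residue.
    size_mul = 3 if is_protein else 1
    # Repeat matches count for half (integer shift, as in the C code).
    return (size_mul * (match + (rep_match >> 1))
            - size_mul * mis_match
            - q_num_insert
            - t_num_insert)


print(psl_score(match=100, mis_match=2, rep_match=10,
                q_num_insert=1, t_num_insert=1))
```

The percentIdentity column uses a separate, more involved calculation that is
not reproduced here.
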
================================================================
========   pslSelect   ====================================
================================================================
### kent source version 473 ###
pslSelect - select records from a PSL file.

usage:
   pslSelect [options] inPsl outPsl

Must specify a selection option

Options:
   -qtPairs=file - file is tab-separated qName and tName pairs to select
   -qPass        - pass all PSLs with queries that do not appear in qtPairs file at all
                   (default is to remove all PSLs for queries that are not in file)
   -queries=file - file has qNames to select
   -queryPairs=file - file is tab-separated pairs of qNames to select
    with new qName to substitute in output file
   -qtStart=file - file is tab-separated rows of qName,tName,tStart
   -qDelim=char  - use only the part of the query name before this character
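
As a rough illustration of what -qtPairs selects, the Python sketch below
keeps only PSL lines whose (qName, tName) pair appears in a given set. It
assumes headerless, tab-separated PSL lines; the helper names and the sample
records are invented for the example.

```python
def select_qt_pairs(psl_lines, pairs):
    """Keep PSL lines whose (qName, tName) is in pairs."""
    kept = []
    for line in psl_lines:
        cols = line.rstrip("\n").split("\t")
        # In PSL, qName is column 10 and tName is column 14 (1-based).
        if (cols[9], cols[13]) in pairs:
            kept.append(line)
    return kept


def make_psl(q_name, t_name):
    """Build a dummy 21-column PSL line (illustrative data only)."""
    cols = ["0"] * 21
    cols[9], cols[13] = q_name, t_name
    return "\t".join(cols)


records = [make_psl("NM_1", "chr1"), make_psl("NM_1", "chr2"),
           make_psl("NM_2", "chr1")]
selected = select_qt_pairs(records, {("NM_1", "chr2")})
print(len(selected))
```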

================================================================
========   pslSomeRecords   ====================================
================================================================
### kent source version 473 ###
pslSomeRecords - Extract multiple psl records
usage:
   pslSomeRecords pslIn listFile pslOut
where:
   pslIn is the input psl file
   listFile is a file with a qName (rna accession usually)
          on each line
   pslOut is the output psl file
options:
   -not  - include psl if name is NOT in list
   -tToo - if set, the list file is two column, qName and tName.
           In this case only records matching on both q and t are
           output

================================================================
========   pslSort   ====================================
================================================================
pslSort - Merge and sort psCluster .psl output files
usage:
      pslSort dirs[1|2] outFile tempDir inDir(s)OrFile(s)

   This will sort all of the .psl input files or those in the directories
   inDirs in two stages - first into temporary files in tempDir
   and second into outFile.  The device on tempDir must have
   enough space (typically 15-20 gigabytes if processing whole genome).

      pslSort g2g[1|2] outFile tempDir inDir(s)

   This will sort a genome-to-genome alignment, reflecting the
   alignments across the diagonal.

   Adding 1 or 2 to the dirs or g2g option will limit the program to only
   the first or second pass respectively of the sort.

options:
   -nohead      Do not write psl header.
   -verbose=N   Set verbosity level, higher for more output. Default is 1.

================================================================
========   pslSortAcc   ====================================
================================================================
pslSortAcc - sort pslSort .psl output file by accession
Make one output .psl file per accession.
usage:
  pslSortAcc how outDir tempDir inFile(s)
This will sort the inFiles by accession in two steps.
Intermediate results will be put in tempDir.  The final
result (one .psl file per target) will be put in outDir.
Both outDir and tempDir will be created if they do not
already exist.  The 'how' parameter should be either
'head' or 'nohead'
================================================================
========   pslSpliceJunctions   ====================================
================================================================
### kent source version 473 ###
pslSpliceJunctions - Extract splice junctions from a PSL file
usage:
   pslSpliceJunctions pslFile genome2bit junctionsTsv
options:

Output query and target coordinates of target gaps, often introns,
in alignments. Output is always in query-positive and target-positive coordinates,
with only gaps in the target reported. Canonical junctions will be in upper case,
unknown ones in lower case.

================================================================
========   pslSplitOnTarget   ====================================
================================================================
### kent source version 473 ###
pslSplitOnTarget - Split psl files into one per target.
usage:
   pslSplitOnTarget inFile.psl outDir
options:
   -maxTargetCount=N - Maximum allowed targets (default is 300).
        This implementation keeps an open file handle for each target.
   -lump - useful with scaffolds, hashes on targ name to lump together.
           (creates maxTargetCount lumps off scaffold name hash).

================================================================
========   pslStats   ====================================
================================================================
### kent source version 473 ###
pslStats - collect statistics from a psl file.

usage:
   pslStats [options] psl statsOut

Options:
  -queryStats - output per-query statistics, the default is per-alignment stats
  -overallStats - output overall statistics.
  -queries=querySizeFile - tab-separated file of expected qNames and sizes.
   If specified, statistics will include queries that didn't align.
  -warnOnConflicts - warn and ignore when two PSLs with the same qName conflict.
   This can happen with bogus generated names.
  -tsv - write a TSV header instead of an autoSql header

================================================================
========   pslSwap   ====================================
================================================================
### kent source version 473 ###
pslSwap - swap target and query in psls
usage:
    pslSwap [options] inPsl outPsl

Options:
  -noRc - don't reverse complement untranslated alignments to
   keep target positive strand.  This will make the target strand
   explicit.

================================================================
========   pslToBed   ====================================
================================================================
### kent source version 473 ###
pslToBed - transform a psl format file to a bed format file.
usage:
    pslToBed [options] psl bed
options:
    -cds=cdsFile
cdsFile specifies a tab-separated input CDS file which contains
genbank-style CDS records showing cdsStart..cdsEnd
e.g. NM_123456 34..305
These coordinates are assumed to be in the query coordinate system
of the psl, like those that are created from genePredToFakePsl
    -posName
changes the qName field to qName:qStart-qEnd
(can be used to create links to query position on details page)

================================================================
========   pslToBigPsl   ====================================
================================================================
### kent source version 473 ###
pslToBigPsl - converts psl to bigPsl input (bed format with extra fields)
usage:
  pslToBigPsl file.psl stdout | sort -k1,1 -k2,2n > file.bigPslInput
options:
  -cds=file.cds - tab-separated columns with the qName and a GenBank style one-based CDS (e.g. 475..1731, <1..354)
  -fa=file.fasta
NOTE: to build bigBed:
   wget https://genome.ucsc.edu/goldenPath/help/examples/bigPsl.as
   bedToBigBed -type=bed12+13 -tab -as=bigPsl.as file.bigPslInput chrom.sizes output.bb
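
The sort step in the usage line orders records by chromosome name and then by
numeric start, which is the order bedToBigBed expects. A Python sketch of the
same ordering is shown below; the function name and sample lines are
illustrative, and names are compared by code point, which matches sort(1)
only under the C locale.

```python
def sort_bed_lines(lines):
    """Order BED-style lines like: sort -k1,1 -k2,2n"""
    def key(line):
        fields = line.split("\t")
        # Field 1 is compared as text, field 2 as a number.
        return (fields[0], int(fields[1]))
    return sorted(lines, key=key)


unsorted = ["chr2\t5\t10", "chr1\t200\t300", "chr1\t30\t40"]
for line in sort_bed_lines(unsorted):
    print(line)
```

Sorting the start column as text instead would put 200 before 30, which is
why the -k2,2n numeric flag matters.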

================================================================
========   pslToChain   ====================================
================================================================
### kent source version 473 ###
pslToChain - Convert psl records to chain records 
usage:
   pslToChain pslIn chainOut
Options:
   -fixStrand  reverse-complement negative target strand PSLs
   -ignore   ignore psl records with negative target strand rather than exiting

================================================================
========   pslToPslx   ====================================
================================================================
### kent source version 473 ###
pslToPslx - Convert from psl to pslx format, which includes sequences
usage:
   pslToPslx [options] in.psl qSeqSpec tSeqSpec out.pslx

qSeqSpec and tSeqSpec can be nib directory, a 2bit file, or a FASTA file.
FASTA files should end in .fa, .fa.gz, .fa.Z, or .fa.bz2 and are read into
memory.

Options:
  -masked - if specified, repeats are in lower case, otherwise the entire
            sequence is lower case.

================================================================
========   pslxToFa   ====================================
================================================================
### kent source version 473 ###
pslxToFa - convert pslx (with sequence) to fasta file
usage:
   pslxToFa in.psl out.fa
options:
   -liftTarget=liftTarget.lft
   -liftQuery=liftQuery.lft

================================================================
========   qaToQac   ====================================
================================================================
qaToQac - convert from uncompressed to compressed
quality score format.
usage:
   qaToQac in.qa out.qac
================================================================
========   qacAgpLift   ====================================
================================================================
### kent source version 473 ###
qacAgpLift - Use AGP to combine per-scaffold qac into per-chrom qac.
usage:
   qacAgpLift scaffoldToChrom.agp scaffolds.qac chrom.qac
options:
    -mScore=N - score to use for missing data (otherwise fail)
            range: 0-99, recommended values are 98 (low qual) or 99 (high)
================================================================
========   qacToQa   ====================================
================================================================
### kent source version 473 ###
qacToQa - convert from compressed to uncompressed
quality score format.
usage:
   qacToQa in.qac out.qa
	-name=name  restrict output to just this sequence name

================================================================
========   qacToWig   ====================================
================================================================
### kent source version 473 ###
qacToWig - convert from compressed quality score format to wiggle format.
usage:
   qacToWig in.qac outFileOrDir
	-name=name    restrict output to just this sequence name
	-fixed        output single file with wig headers and fixed step size
   If neither -name nor -fixed is used, outFileOrDir is a directory which
   will be created if it does not already exist.  If -name and/or -fixed is
   used, outFileOrDir is a file (or "stdout").

================================================================
========   raSqlQuery   ====================================
================================================================
### kent source version 473 ###
raSqlQuery - Do a SQL-like query on a RA file.
   raSqlQuery raFile(s) query-options
or
   raSqlQuery -db=dbName query-options
Where dbName is a UCSC Genome database like hg18, sacCer1, etc.
One of the following query-options must be specified
   -queryFile=fileName
   "-query=select list,of,fields from file where field='this'"
The queryFile just has a query in it in the same form as the query option.
The syntax of a query statement is very SQL-like. The most common commands are:
    select tag1,tag2,tag3 where tag1 like 'prefix%'
where the % is a SQL wildcard.  Sorry to mix wildcards. Another common query is
    select count(*) from * where tag = 'val'
The from list is optional.  If it exists it is a list of raFile names
    select track,type from *Encode* where type like 'bigWig%'
Other command line options:
   -addFile - Add 'file' field to say where record is defined
   -addDb - Add 'db' field to say where record is defined
   -strict - Used only with db option.  Only report tracks that exist in db
   -key=keyField - Use this as the key field for merges and parenting. Default name
   -parent - Merge together inheriting on parentField
   -parentField=field - Use field as the one that tells us who is our parent. Default subTrack
   -overrideNeeded - If set records are only overridden field-by-field by later records
               if 'override' follows the track name. Otherwise later record replaces
               earlier record completely.  If not set all records overridden field by field
   -noInheritField=field - If field is present don't inherit fields from parent
   -merge - If there are multiple raFiles, records with the same keyField will be
          merged together with fields in later files overriding fields in earlier files
   -restrict=keyListFile - restrict output to only ones with keys in file.
   -db=hg19 - Acts on trackDb files for the given database.  Sets up list of files
              appropriately and sets parent, merge, and override all.
              Use db=all for all databases

================================================================
========   raToLines   ====================================
================================================================
### kent source version 473 ###
raToLines - Output .ra file stanzas as single lines, with pipe-separated fields.

usage:
   raToLines in.ra out.txt

================================================================
========   raToTab   ====================================
================================================================
### kent source version 473 ###
raToTab - Convert ra file to table.
usage:
   raToTab in.ra out.tab
options:
   -cols=a,b,c - List columns in order to output in table
                 Only these columns will be output.  If you
                 don't give this option, all columns are output
                 in alphabetical order
   -head - Put column names in header

================================================================
========   randomLines   ====================================
================================================================
### kent source version 473 ###
randomLines - Pick out random lines from file
usage:
   randomLines inFile count outFile
options:
   -seed=N - Set seed used for randomizing, useful for debugging.
   -decomment - remove blank lines and those starting with 

================================================================
========   rmFaDups   ====================================
================================================================
rmFaDups - remove duplicate records in FA file
usage:
   rmFaDups oldName.fa newName.fa

================================================================
========   rmskAlignToPsl   ====================================
================================================================
### kent source version 473 ###
rmskAlignToPsl - convert repeatmasker alignments to PSLs

usage:
   rmskAlignToPsl rmskAlignTab rmskPslFile

  -bigRmsk - input is the text version of bigRmskAlignBed files.
  -repSizes=tab - two column tab file with repeat name and size.
   Sometimes the repeat sizes are incorrect in the align file.
   If a repeat alignment doesn't match the size here or is not
   in the file it will be discarded.
   
  -dump - print alignments to stdout for debugging purposes

This converts a *.fa.align.tsv file, created by
RepeatMasker/util/rmToUCSCTables.pl into a PSL file.
Non-TE Repeats without consensus sequence are not included.

================================================================
========   rowsToCols   ====================================
================================================================
### kent source version 473 ###
rowsToCols - Convert rows to columns and vice versa in a text file.
usage:
   rowsToCols in.txt out.txt
By default all columns are space-separated, and all rows must have the
same number of columns.
options:
   -varCol - rows may have varying numbers of columns.
   -tab - fields are separated by tab
   -fs=X - fields are separated by given character
   -fixed - fields are of fixed width with space padding
   -offsets=X,Y,Z - fields are of fixed width at given offsets
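
The default behaviour amounts to a matrix transpose. A minimal Python sketch,
assuming whitespace-separated input held in memory (the function name is
hypothetical, and the -varCol, -fixed and -offsets modes are not covered):

```python
def rows_to_cols(text, sep=None):
    """Transpose a small table; sep=None splits on runs of whitespace."""
    rows = [line.split(sep) for line in text.splitlines() if line]
    # Mirror the default requirement that all rows have the same width.
    if len({len(row) for row in rows}) > 1:
        raise ValueError("rows have different numbers of columns")
    joiner = " " if sep is None else sep
    return "\n".join(joiner.join(col) for col in zip(*rows))


print(rows_to_cols("a b c\n1 2 3"))
```

Passing sep="\t" gives the equivalent of the -tab option in this sketch.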

================================================================
========   sizeof   ====================================
================================================================
sizeof - show size of various C types for reference

     type   bytes    bits
     char	1	8
unsigned char	1	8
short int	2	16
u short int	2	16
      int	4	32
 unsigned	4	32
     long	8	64
unsigned long	8	64
long long	8	64
u long long	8	64
   size_t	8	64
   void *	8	64
    float	4	32
   double	8	64
long double	8	64
LITTLE ENDIAN machine detected
byte order: normal order: 0x12345678 in memory: 0x78563412
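
The byte-order line can be reproduced in Python with the standard struct
module. This sketch packs the same test value explicitly as little-endian, so
it shows the layout reported above regardless of the machine it runs on:

```python
import struct
import sys

value = 0x12345678
# "<I" = little-endian unsigned 32-bit integer.
in_memory = struct.pack("<I", value)
print("in memory: 0x" + in_memory.hex())   # in memory: 0x78563412
print("native byte order:", sys.byteorder)
```
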
================================================================
========   spacedToTab   ====================================
================================================================
### kent source version 473 ###
spacedToTab - Convert fixed width space separated fields to tab separated
Note this requires two passes, so it can't be done on a pipe
usage:
   spacedToTab in.txt out.tab
options:
   -sizes=X,Y,Z - Force it to have columns of the given widths.
                 The final char in each column should be space or newline

================================================================
========   splitFile   ====================================
================================================================
splitFile - Split up a file
usage:
   splitFile source linesPerFile outBaseName
options:
   -head=file - put head in front of each output
   -tail=file - put tail at end of each output
================================================================
========   splitFileByColumn   ====================================
================================================================
### kent source version 473 ###
splitFileByColumn - Split text input into files named by column value
usage:
   splitFileByColumn source outDir
options:
   -col=N      - Use the Nth column value (default: N=1, first column)
   -head=file  - Put head in front of each output
   -tail=file  - Put tail at end of each output
   -chromDirs  - Split into subdirs of outDir that are distilled from chrom
                 names, e.g. chr3_random -> outDir/3/chr3_random.XXX .
   -ending=XXX - Use XXX as the dot-suffix of split files (default: taken
                 from source).
   -tab        - Split by tab characters instead of whitespace.
Split source into multiple files in outDir, with each filename determined
by values from a column of whitespace-separated input in source.
If source begins with a header, you should pipe "tail -n +N source" to this
program where N is number of header lines plus 1, or use some similar
method to strip the header from the input.

================================================================
========   sqlToXml   ====================================
================================================================
### kent source version 473 ###
sqlToXml - dump out all or part of a relational database to XML, guided
by a dump specification.  See sqlToXml.doc for additional information.
usage:
   sqlToXml database dumpSpec.od output.xml
options:
   -topTag=name - Give the top level XML tag the given name.  By
               default it will be the same as the database name.
   -query=file.sql - Instead of dumping whole database, just dump those
                  records matching SQL select statement in file.sql.
                  This statement should be of the form:
           select * from table where ...
                   or
           select table.* from table,otherTables where ...
                   Where the table is the same as the table in the first
                   line of dumpSpec.
   -tab=N - number of spaces between tabs in xml.dumpSpec - by default it's 8.
            (It may be best just to avoid tabs in that file though.)
   -maxList=N - This will limit any lists in the output to no more than
                size N.  This is mostly just for testing.

================================================================
========   strexCalc   ====================================
================================================================
### kent source version 473 ###
strexCalc - String expression calculator, mostly to test strex expression evaluator.
usage:
   strexCalc [variable assignments] expression
command options in strexCalc are used to seed variables so for instance the command
   strexCalc a=12 b=13 c=xyz 'a + b + c'
ends up returning 1213xyz

================================================================
========   stringify   ====================================
================================================================
### kent source version 473 ###
stringify - Convert file to C strings
usage:
   stringify [options] in.txt
A stringified version of in.txt  will be printed to standard output.

Options:
  -var=varname - create a variable with the specified name containing
                 the string.
  -static - create the variable but put static in front of it.
  -array - create an array of strings, one for each line


================================================================
========   subChar   ====================================
================================================================
subChar - Substitute one character for another throughout a file.
usage:
   subChar oldChar newChar file(s)
oldChar and newChar can either be single letter literal characters,
or two digit hexadecimal ascii codes
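
A Python sketch of how the two argument forms could be interpreted is given
below. The helper names are invented, and treating every two-character
argument as a hex code is an assumption based on the description above:

```python
def parse_char(arg):
    """Turn a subChar-style argument into a single character."""
    if len(arg) == 1:
        return arg                 # literal character
    if len(arg) == 2:
        return chr(int(arg, 16))   # two-digit hexadecimal ASCII code
    raise ValueError("expected one character or two hex digits: %r" % arg)


def sub_char(old, new, text):
    """Replace every occurrence of old with new in text."""
    return text.replace(parse_char(old), parse_char(new))


# 09 is tab and 2c is comma, so this turns tabs into commas.
print(sub_char("09", "2c", "a\tb\tc"))
```
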
================================================================
========   subColumn   ====================================
================================================================
### kent source version 473 ###
subColumn - Substitute one column in a tab-separated file.
usage:
   subColumn column in.tab sub.tab out.tab
Where:
    column is the column number (starting with 1)
    in.tab is a tab-separated file
    sub.tab is a file where the first column is old values, the second new
    out.tab is the substituted output
options:
   -list - Column is a comma-separated list.  Substitute all elements in list
   -miss=fileName - Print misses to this file instead of aborting
   -skipMiss -- skip missing id's instead of outputting them

================================================================
========   tabToTabDir   ====================================
================================================================
### kent source version 473 ###
tabToTabDir - Convert a large tab-separated table to a directory full of such tables according
to a specification. The program is designed to make it relatively easy to unpack overloaded
single fields into multiple fields, and to create normalized, less redundant representations.
The command line is:
   tabToTabDir in.tsv spec.x outDir
options:
   -id=fieldName - Add a numeric id field of given name that starts at 1 and autoincrements 
                   for each table
   -startId=fieldName - sets starting ID to be something other than 1
   -sort - if set then sort tables before output
usage:
   in.tsv is a tab-separated input file.  The first line is the label names and may start with #
   spec.x is a file that says what columns to put into the output, described in more detail below.
The spec.x file contains one blank line separated stanza per output table.
Each stanza should look like:
        table tableName    key-column
        columnName1	sourceExpression1
        columnName2	sourceExpression2
              ...
If the sourceExpression is missing it is assumed to be just a field of the same name from in.tsv
Otherwise the sourceExpression can be a strex expression involving fields in in.tsv.

Each output table has duplicate rows merged using the key-column to determine uniqueness.
Please see tabToTabDir.doc in the source code for more information on what can go into spec.x.

================================================================
========   tailLines   ====================================
================================================================
tailLines - add tail to each line of file
usage:
   tailLines file tail
This will add tail to each line of file and print to stdout.
================================================================
========   tdbQuery   ====================================
================================================================
### kent source version 473 ###
tdbQuery - Query the trackDb system using SQL syntax.
Usage:
    tdbQuery sqlStatement
Where the SQL statement is enclosed in quotations to avoid the shell interpreting it.
Only a very restricted subset of a single SQL statement (select) is supported.   Examples:
    tdbQuery "select count(*) from hg18"
counts all of the tracks in hg18 and prints the results to stdout
   tdbQuery "select count(*) from *"
counts all tracks in all databases.
   tdbQuery "select  track,shortLabel from hg18 where type like 'bigWig%'"
prints to stdout a two-field .ra file containing just the track and shortLabels of bigWig 
type tracks in the hg18 version of trackDb.
   tdbQuery "select * from hg18 where track='knownGene' or track='ensGene'"
prints the hg18 knownGene and ensGene track's information to stdout.
   tdbQuery "select *Label from mm9"
prints all fields that end in 'Label' from the mm9 trackDb.
OPTIONS:
   -root=/path/to/trackDb/root/dir
Sets the root directory of the trackDb.ra directory hierarchy to be given path. By default
this is ~/kent/src/hg/makeDb/trackDb.
   -check
Check that trackDb is internally consistent.  Prints diagnostic output to stderr and aborts if 
there are problems.
   -strict
Mimic -strict option on hgTrackDb. Suppresses tracks where corresponding table does not exist.
   -release=alpha|beta|public
Include trackDb entries with this release tag only. Default is alpha.
   -noBlank
Don't print out blank lines separating records
   -oneLine
Print single ('|') pipe-separated line per record
   -noCompSub
Subtracks don't inherit fields from parents
   -shortLabelLength=N
Complain if shortLabels are over N characters
   -longLabelLength=N
Complain if longLabels are over N characters

================================================================
========   tdbRename   ====================================
================================================================
Usage: tdbRename [options] inFile tagName replaceFile outFile - mass-rename trackDb tags given a file with oldVal<tab>newVal
            
    Examples:
        tdbRename trackDb.orig.txt track replace.tsv trackDb.txt
        tdbRename trackDb.orig.txt shortLabel replace.tsv trackDb.txt
    

Options:
  -h, --help           show this help message and exit
  -d, --debug          show debug messages
  --newMeta=NEWMETA    keep the old name as metadata tag with this name
  --suffList=SUFFLIST  comma-sep list of suffixes. These are ignored when
                       comparing values. Many tracks need suffixes for the
                       various track types, e.g. peaks and coverage. A typical
                       value could be 'pk,cov'
================================================================
========   tdbSort   ====================================
================================================================
Usage: tdbSort [options] inFile tagName outFile - sort a trackDb file by a tag
            
    Examples:
        tdbSort trackDb.orig.txt shortLabel trackDb.txt
    

Options:
  -h, --help            show this help message and exit
  -d, --debug           show debug messages
  -p PARENT, --parent=PARENT
                        only sort tracks that have a given 'parent' tag
  -i, --ignCase         ignore case when sorting
================================================================
========   textHistogram   ====================================
================================================================
### kent source version 473 ###
textHistogram - Make a histogram in ascii
usage:
   textHistogram [options] inFile
Where inFile contains one number per line.
  options:
   -binSize=N - Size of bins, default 1
   -maxBinCount=N - Maximum # of bins, default 25
   -minVal=N - Minimum value to put in histogram, default 0
   -log - Do log transformation before plotting
   -noStar - Don't draw asterisks
   -col=N - Which column to use. Default 1
   -aveCol=N - A second column to average over. The averages
             will be output in place of counts of primary column.
   -real - Data input are real values (default is integer)
   -autoScale=N - autoscale to N # of bins
   -probValues - show prob-Values (density and cum.distr.) (sets -noStar too)
   -freq - show frequencies instead of counts
   -skip=N - skip N lines before starting, default 0
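
The binning implied by -binSize, -maxBinCount and -minVal can be sketched in
Python as below. The function names are hypothetical, and silently dropping
values that fall outside the bins is an assumption; the real program may
handle them differently.

```python
from collections import Counter


def histogram(values, bin_size=1, max_bin_count=25, min_val=0):
    """Count values per bin; returns {bin_index: count}."""
    counts = Counter()
    for value in values:
        bin_index = int((value - min_val) // bin_size)
        # Assumed behaviour: ignore values outside the allowed bins.
        if 0 <= bin_index < max_bin_count:
            counts[bin_index] += 1
    return dict(counts)


def render(counts, bin_size=1, min_val=0):
    """Draw one row of asterisks per non-empty bin."""
    return "\n".join(
        "%s %s %d" % (min_val + index * bin_size, "*" * count, count)
        for index, count in sorted(counts.items()))


print(render(histogram([0, 1, 1, 2, 30])))
```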

================================================================
========   tickToDate   ====================================
================================================================
tickToDate - Convert seconds since 1970 to time and date
usage:
   tickToDate ticks
Use 'now' for current ticks and date
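
For reference, the same conversion can be sketched in Python. This is an
illustration only: the function name ticks_to_date is hypothetical, and the
sketch prints UTC, whereas the tool's own time zone and output format may differ.

```python
from datetime import datetime, timezone

def ticks_to_date(ticks):
    """Format seconds since 1970-01-01 00:00:00 UTC as a UTC date and time."""
    moment = datetime.fromtimestamp(ticks, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")

print(ticks_to_date(0))  # 1970-01-01 00:00:00
```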

================================================================
========   toLower   ====================================
================================================================
toLower - Convert upper case to lower case in file. Leave other chars alone
usage:
   toLower inFile outFile
equivalent to the unix commands: cat inFile | tr '[A-Z]' '[a-z]' > outFile
================================================================
========   toUpper   ====================================
================================================================
toUpper - Convert lower case to upper case in file. Leave other chars alone
usage:
   toUpper inFile outFile
equivalent to the unix commands: cat inFile | tr '[a-z]' '[A-Z]' > outFile
================================================================
========   trackDbIndexBb   ====================================
================================================================
usage: trackDbIndexBb [-h] [-o OUTDIR] [-p TOOLSPATH] [-n] [-m METADATAVAR]
                      [-s SUBGROUPREMOVE]
                      trackName raFile chromSizes

Given a track name, a trackDb.ra composite of bigBeds, and a chrom.sizes
file, will create index files needed to optimize hideEmptySubtracks setting.
Will also build track associations between tracks sharing metadata, which
will cause them to display together whenever the primary bigBed track is present.
Depending on size and quantity of files, can take over 60 minutes.

This script has three dependencies: bigBedToBed, bedToBigBed, and bedtools. The first two are
UCSC Genome Browser utilities which can be downloaded to the current directory with the 
following commands:

1) wget http://hgdownload.soe.ucsc.edu/admin/exe/<system>.x86_64/<tool>

where: 
<system> is macOSX or linux
<tool> is bedToBigBed and bigBedToBed

2) chmod +x <tool>

bedtools can be found here: https://bedtools.readthedocs.io
    
These dependencies can be in the path, in the local directory the script is run from, or 
specified using the optional flags.

    
Example run:
    trackDbIndexBb mm10HMMdata mm10HMMdata.ra mm10chrom.sizes
    trackDbIndexBb hg19peaks hg19peaks.ra hg19chrom.sizes -o ./hg19peaks/output -p /user/bin

required arguments:
  trackName             Track name for top level composite which contains the
                        bigBed tracks.
  raFile                Relative or absolute path to trackDb.ra file
                        containing the composite track with bigDataUrls.
  chromSizes            Chrom.sizes for database which the track belongs to.
                        Needed to build final bigBed file.

optional arguments:
  -h, --help            show this help message and exit
  -o OUTDIR, --out OUTDIR
                        Optional: Output directory for files. Default current
                        directory.
  -p TOOLSPATH, --pathTools TOOLSPATH
                        Optional: Path to directory where
                        bedtools/bedToBigBed/bigBedToBed can be found
  -n, --noDelete        Optional: Do not delete intermediary multibed.bed
                        file. This option will result in both the multibed.bb
                        and multibed.bed files in the output directory.
  -m METADATAVAR, --metaDataVar METADATAVAR
                        Optional: Used when there are associated tracks to be
                        displayed alongside the primary BB track. Such as peak
                        tracks with related signals. To relate the tracks,
                        trackDbIndexBb expects all except one of the metaData
                        variables to match among associated tracks. By
                        default, trackDbIndexBb attempts to make association
                        between tracks by using the metaData in the
                        'subGroups' trackDb parameter. Use this flag to change
                        it to a different association, often 'metaData' is
                        also used.
  -s SUBGROUPREMOVE, --subGroupRemove SUBGROUPREMOVE
                        Optional: Used when there are associated tracks to be
                        displayed alongside the primary BB track. Such as peak
                        tracks with related signals. To relate the tracks,
                        trackDbIndexBb expects all except one of the metaData
                        variables to match among associated tracks. This
                        metaData often looks like: 'view=Peaks
                        mark=A6_H3K36me3' for the .bb track, and 'view=Signal
                        mark=A6_H3K36me3' for the .bw track. In this case, you
                        would want to exclude the 'view' variable to make
                        histone mark associations (A6_H3K36me3). This flag can
                        be used to pass a different exclusionary variable than
                        the default 'view'
================================================================
========   transMapPslToGenePred   ====================================
================================================================
### kent source version 473 ###
transMapPslToGenePred - convert PSL alignments of mRNAs to gene annotations.

usage:
   transMapPslToGenePred [options] sourceGenePred mappedPsl mappedGenePred

Convert PSL alignments from transmap to genePred.  It specifically handles
alignments where the source genes are genomic annotations in genePred
format that were converted to PSL for mapping; this program then creates a
new genePred from the mapped alignments.

This is an alternative to mrnaToGene which determines CDS and frame from
the original annotation, which may have been imported from GFF/GTF.  This
was created because the genbankCds structure used by mrnaToGene doesn't
handle partial start/stop codons or programmed frame shifts.  This requires
handling the list of CDS regions and the /codon_start attribute.  At some
point, this program may be extended to handle genbank alignments correctly.

Options:
  -nonCodingGapFillMax=0 - fill gaps in non-coding regions up to this many bases
   in length.
  -codingGapFillMax=0 - fill gaps in coding regions up to this many bases
   in length.  Only coding gaps that are a multiple of three will be filled,
   with the max rounded down.
  -noBlockMerge - don't do any block merging of genePred, even of adjacent blocks.
   This is mainly for debugging.
  -frameShifts=tsv - Write TSV with locations of frame-shifting indels.  The coordinates
   give a context for the shift for browsing, not an exact location.
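
The -codingGapFillMax rule can be read as the following Python sketch. It is
one interpretation of the option text above, not code from the program; the
function name should_fill_coding_gap is hypothetical.

```python
def should_fill_coding_gap(gap_len, coding_gap_fill_max=0):
    """Decide whether a coding gap of gap_len bases would be filled.

    Interpretation of the option text: the maximum is rounded down to a
    multiple of three, and only gaps that are themselves a multiple of
    three (and no longer than that maximum) are filled.
    """
    effective_max = coding_gap_fill_max - (coding_gap_fill_max % 3)
    return gap_len > 0 and gap_len % 3 == 0 and gap_len <= effective_max

# With -codingGapFillMax=10 the effective maximum is 9 bases.
print(should_fill_coding_gap(9, 10))   # True
print(should_fill_coding_gap(12, 10))  # False
print(should_fill_coding_gap(4, 10))   # False
```

With the default of 0, no gaps are filled under this reading.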


================================================================
========   trfBig   ====================================
================================================================
### kent source version 473 ###
trfBig - Mask tandem repeats on a big sequence file.
usage:
   trfBig inFile outFile
This will repeatedly run trf to mask tandem repeats in inFile
and put masked results in outFile.  inFile and outFile can be .fa
or .nib format. outFile can be .bed as well. Sequence output is hard
masked, lowercase.

   -bed creates a bed file in current dir
   -bedAt=path.bed - create a bed file at explicit location
   -tempDir=dir Where to put temp files.
   -trf=trfExe explicitly specifies trf executable name
   -maxPeriod=N  Maximum period size of repeat (default 2000)
   -keep  don't delete tmp files
   -l=<n> when used here, for new trf v4.09 option:
          maximum TR length expected (in millions)
          (eg, -l=3 for 3 million), Human genome hg38 would need -l=6
================================================================
========   twoBitDup   ====================================
================================================================
### kent source version 473 ###
twoBitDup - check to see if a twobit file has any identical sequences in it
usage:
   twoBitDup file.2bit
options:
  -keyList=file - file to write a key list, two columns: md5sum and sequenceName
  -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs

example: twoBitDup -keyList=stdout db.2bit \
          | grep -v 'are identical' | sort > db.idKeys.txt
================================================================
========   twoBitInfo   ====================================
================================================================
### kent source version 473 ###
twoBitInfo - get information about sequences in a .2bit file
usage:
   twoBitInfo input.2bit output.tab
options:
   -maskBed instead of seq sizes, output BED records that define 
           areas with masked sequence
   -nBed   instead of seq sizes, output BED records that define 
           areas with N's in sequence
   -noNs   outputs the length of each sequence, but does not count Ns 
   -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs
Output file has the columns:
   seqName size

The 2bit file may be specified in the form path:seq or path:seq1,seq2,seqN...
so that information is returned only on the requested sequence(s).
If the form path:seq:start-end is used, start-end is ignored.

================================================================
========   twoBitMask   ====================================
================================================================
### kent source version 473 ###
twoBitMask - apply masking to a .2bit file, creating a new .2bit file
usage:
   twoBitMask input.2bit maskFile output.2bit
options:
   -add   Don't remove pre-existing masking before applying maskFile.
   -type=.XXX   Type of maskFile is XXX (bed or out).
maskFile can be a RepeatMasker .out file or a .bed file.  It must not
contain rows for sequences which are not in input.2bit.

================================================================
========   twoBitToFa   ====================================
================================================================
### kent source version 473 ###
twoBitToFa - Convert all or part of .2bit file to fasta
usage:
   twoBitToFa input.2bit output.fa
options:
   -seq=name       Restrict this to just one sequence.
   -start=X        Start at given position in sequence (zero-based).
   -end=X          End at given position in sequence (non-inclusive).
   -seqList=file   File containing list of the desired sequence names 
                   in the format seqSpec[:start-end], e.g. chr1 or chr1:0-189
                   where coordinates are half-open zero-based, i.e. [start,end).
   -noMask         Convert sequence to all upper case.
   -bpt=index.bpt  Use bpt index instead of built-in one.
   -bed=input.bed  Grab sequences specified by input.bed. Will exclude introns.
   -bedPos         With -bed, use chrom:start-end as the fasta ID in output.fa.
   -udcDir=/dir/to/cache  Place to put cache for remote bigBed/bigWigs.

Input file can be a URL
Sequence and range may also be specified as part of the input
file name using the syntax:
      /path/input.2bit:name
   or
      /path/input.2bit:name:start-end
examples:
  wget https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/latest/hg38.2bit
  twoBitToFa hg38.2bit -seq=chr1 -start=1000000 -end=20000000 out.fa
  echo 'chr1 1 1000 testRegion' > test.bed
  twoBitToFa https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/latest/hg38.2bit -bed=test.bed out.fa
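
Because -start and -end are zero-based and half-open, a 1-based inclusive
position of the kind shown in a Genome Browser position box needs its start
shifted by one. The Python sketch below shows that arithmetic; the helper
name position_to_twobit_args is hypothetical and the 1-based inclusive input
convention is an assumption about where the position came from.

```python
from typing import List

def position_to_twobit_args(position: str) -> List[str]:
    """Turn 'chrom:start-end' (1-based, inclusive) into twoBitToFa options."""
    chrom, span = position.rsplit(":", 1)
    start_text, end_text = span.split("-")
    start_1 = int(start_text.replace(",", ""))
    end_1 = int(end_text.replace(",", ""))
    if start_1 < 1 or end_1 < start_1:
        raise ValueError("invalid position: " + position)
    # Half-open zero-based: subtract 1 from the start, keep the end as is.
    return ["-seq=" + chrom, "-start=%d" % (start_1 - 1), "-end=%d" % end_1]

print(position_to_twobit_args("chr1:1000001-20000000"))
# ['-seq=chr1', '-start=1000000', '-end=20000000']
```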

================================================================
========   ucscApiClient   ====================================
================================================================
usage: ucscApiClient [-h] [-p] [--debug] [-test0] [-getDnaExample]
                     [endpoint] [parameters]

Command line utility for UCSC Genome Browser API access

positional arguments:
  endpoint            Endpoint string like "/list/tracks" or "/getData/track/"
  parameters          Parameters to endpoints. semi-colon separated key=value
                      formatted string, like
                      "genome=hg38;chrom=chrM;maxItemsOutput=2"

optional arguments:
  -h, --help          show this help message and exit
  -p, --pretty-print  Print json response with newlines
  --debug             Print final URL of the request
  -test0              Run special test
  -getDnaExample      Show example query for fetching Human GRCh38(hg38) DNA
                      sequence

Example usage:
ucscApiClient "/getData/track" "track=gold;genome=hg38;chrom=chrM;maxItemsOutput=2"
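
The endpoint and parameter strings combine into a request URL. The Python
sketch below only builds and inspects such a URL and makes no network request.
The base address https://api.genome.ucsc.edu and the helper names are
assumptions for illustration; use --debug to see the URL the client really sends.

```python
API_BASE = "https://api.genome.ucsc.edu"  # assumed base address of the API

def parse_parameters(parameters):
    """Split 'key=value;key=value' into a dictionary."""
    pairs = (item.split("=", 1) for item in parameters.split(";") if item)
    return {key: value for key, value in pairs}

def build_api_url(endpoint, parameters=""):
    """Join an endpoint such as '/getData/track' with its parameter string."""
    url = API_BASE + "/" + endpoint.strip("/")
    if parameters:
        url += "?" + parameters
    return url

params = "track=gold;genome=hg38;chrom=chrM;maxItemsOutput=2"
print(parse_parameters(params)["genome"])  # hg38
print(build_api_url("/getData/track", params))
```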
================================================================
========   vai.pl   ====================================
================================================================
vai.pl - Invokes hgVai (Variant Annotation Integrator) on a set of variant calls to add functional effect predictions and other data relevant to function.

usage:
    vai.pl [options] db input.(vcf|pgsnp|pgSnp|txt)[.gz] > output.tab

Invokes hgVai (Variant Annotation Integrator) on a set of variant calls to
add functional effect predictions (e.g. does the variant fall within a
regulatory region or part of a gene) and other data relevant to function.

input.(...) must be a file or URL containing either variants formatted as VCF
or pgSnp, or a sequence of dbSNP rs# IDs, optionally compressed by gzip.
Output is printed to stdout.

options:
  --hgVai=/path/to/hgVai          Path to hgVai executable
                                  (default: /usr/local/apache/cgi-bin/hgVai)
  --position=chrX:N-M             Sequence name, start and end of range to query
                                  (default: genome-wide query)
  --rsId                          Attempt to match dbSNP rs# ID with variant
                                  position at the expense of performance.
                                  (default: don't attempt to match dbSNP rs# ID)
  --udcCache=/path/to/udcCache    Path to udc cache, overriding hg.conf setting
                                  (default: use value in hg.conf file)
  --geneTrack=track               Genome Browser track with transcript predictions
                                  (default: refGene)
  --hgvsBreakDelIns=on|off        HGVS delins: show "delAGinsTT" instead of "delinsTT"
                                  (default: off)
  --hgvsCN=on|off                 Include HGVS c./n. (coding/noncoding) terms in output (RefSeq transcripts only)
                                  (default: on)
  --hgvsG=on|off                  Include HGVS g. (genomic) terms in output (RefSeq transcripts only)
                                  (default: on)
  --hgvsP=on|off                  Include HGVS p. (protein) terms in output (RefSeq transcripts only)
                                  (default: on)
  --hgvsPAddParens=on|off         Add parentheses around HGVS p. predicted changes
                                  (default: off)
  --include_cdsNonSyn=on|off      Include CDS non-synonymous variants in output
                                  (default: on)
  --include_cdsSyn=on|off         Include CDS synonymous variants in output
                                  (default: on)
  --include_exonLoss=on|off       Include exon loss variants in output
                                  (default: on)
  --include_intergenic=on|off     Include intergenic variants in output
                                  (default: on)
  --include_intron=on|off         Include intron variants in output
                                  (default: on)
  --include_nmdTranscript=on|off  Include variants in NMD transcripts in output
                                  (default: on)
  --include_noVariation=on|off    Include "variants" with no observed variation in output
                                  (default: on)
  --include_nonCodingExon=on|off  Include non-coding exon variants in output
                                  (default: on)
  --include_splice=on|off         Include splice site and splice region variants in output
                                  (default: on)
  --include_upDownstream=on|off   Include upstream and downstream variants in output
                                  (default: on)
  --include_utr=on|off            Include 3' and 5' UTR variants in output
                                  (default: on)
  --variantLimit=N                Maximum number of variants to process
                                  (default: 10000)
  -n, --dry-run                   Display hgVai command, but don't execute it
  -h, --help                      Display this message
================================================================
========   validateFiles   ====================================
================================================================
### kent source version 473 ###
validateFiles - Validates the format of different genomic files.
                Exits with a zero status for no errors detected and non-zero for errors.
                Uses filename 'stdin' to read from stdin.
                Automatically decompresses Files in .gz, .bz2, .zip, .Z format.
                Accepts multiple input files of the same type.
                Writes Error messages to stderr
usage:
   validateFiles -chromInfo=FILE -options -type=FILE_TYPE file1 [file2 [...]]

   -type=
       fasta        : Fasta files (only one line of sequence, and no quality scores)
       fastq        : Fasta with quality scores (see http://maq.sourceforge.net/fastq.shtml)
       csfasta      : Colorspace fasta (implies -colorSpace)
       csqual       : Colorspace quality (see link below)
                      See http://marketing.appliedbiosystems.com/mk/submit/SOLID_KNOWLEDGE_RD?_JS=T&rd=dm
       bam          : Binary Alignment/Map
                      See http://samtools.sourceforge.net/SAM1.pdf
       bigWig       : Big Wig
                      See http://genome.ucsc.edu/goldenPath/help/bigWig.html
       bedN[+P]     : BED N or BED N+ or BED N+P
                      where N is a number between 3 and 15 of standard BED columns,
                      optional + indicates the presence of additional columns
                      and P is the number of additional columns
                      Examples: -type=bed6 or -type=bed6+ or -type=bed6+3 
                      See http://genome.ucsc.edu/FAQ/FAQformat.html#format1
       bigBedN[+P]  : bigBED N  or bigBED N+ or bigBED N+P, similar to BED
                      See http://genome.ucsc.edu/goldenPath/help/bigBed.html
       tagAlign     : Alignment files, replaced with BAM
       pairedTagAlign  
       broadPeak    : ENCODE Peak formats
       narrowPeak     These are specialized bedN+P formats.
       gappedPeak     See http://genomewiki.soe.ucsc.edu/EncodeDCC/index.php/File_Formats
       bedGraph    :  BED Graph
       rcc         :  NanoString RCC
       idat        :  Illumina IDAT
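
The bedN[+P] naming scheme can be unpacked as in the Python sketch below.
It illustrates the notation only and is not the validator's own parser; the
names BedType and parse_bed_type are hypothetical, and applying the 3 to 15
range to bigBed types as well is an assumption.

```python
import re
from typing import NamedTuple, Optional

class BedType(NamedTuple):
    standard: int          # N: number of standard BED columns
    plus: bool             # True when a '+' is present
    extra: Optional[int]   # P: number of additional columns, if given

_BED_TYPE = re.compile(r"^(?:bed|bigBed)(\d+)(\+(\d+)?)?$")

def parse_bed_type(type_name):
    """Parse names such as 'bed6', 'bed6+' or 'bed6+3'."""
    match = _BED_TYPE.match(type_name)
    if not match:
        raise ValueError("not a bedN[+P] type: " + type_name)
    standard = int(match.group(1))
    if not 3 <= standard <= 15:
        raise ValueError("N must be between 3 and 15: " + type_name)
    extra = int(match.group(3)) if match.group(3) else None
    return BedType(standard, match.group(2) is not None, extra)

for name in ("bed6", "bed6+", "bed6+3"):
    print(name, parse_bed_type(name))
```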

   -as=fields.as                If you have extra "bedPlus" fields, it's great to put a definition
                                of each field in a row in AutoSql format here. Applies to bed-related types.
   -tab                         If set, expect fields to be tab separated, normally
                                expects white space separator. Applies to bed-related types.
   -chromDb=db                  Specify DB containing chromInfo table to validate chrom names
                                and sizes
   -chromInfo=file.txt          Specify chromInfo file to validate chrom names and sizes
   -colorSpace                  Sequences include colorspace values [0-3] (can be used 
                                with formats such as tagAlign and pairedTagAlign)
   -isSorted                    Input is sorted by chrom, only affects types tagAlign and pairedTagAlign
   -doReport                    Output report in filename.report
   -version                     Print version

For Alignment validations
   -genome=path/to/hg18.2bit    REQUIRED to validate sequence mappings match the genome specified
                                in the .2bit file. (BAM, tagAlign, pairedTagAlign)
   -nMatch                      N's do not count as a mismatch
   -matchFirst=n                Only check the first N bases of the sequence
   -mismatches=n                Maximum number of mismatches in sequence (or read pair) 
   -mismatchTotalQuality=n      Maximum total quality score at mismatching positions
   -mmPerPair                   Check that neither read of a pair exceeds the mismatch count when
                                  validating pairedTagAlign files (default is the total for the pair)
   -mmCheckOneInN=n             Check mismatches in only one in 'n' lines (default=1, all)
   -allowOther                  Allow chromosomes that aren't native in BAM's
   -allowBadLength              Allow chromosomes that have the wrong length in BAM
   -complementMinus             Complement the query sequence on the minus strand (for testing BAM)
   -bamPercent=N.N              Percentage of BAM alignments that must be compliant
   -privateData                 Private data so empty sequence is tolerated


================================================================
========   validateManifest   ====================================
================================================================
### kent source version 473 ###
validateManifest v1.9 - Validates the ENCODE3 manifest.txt file.
                Calls validateFiles on each file in the manifest.
                Exits with a zero status for no errors detected and non-zero for errors.
                Writes Error messages to stderr
usage:
   validateManifest

   -dir=workingDir, defaults to the current directory.
   -encValData=encValDataDir, relative to workingDir, defaults to encValData.

   Input files in the working directory: 
     manifest.txt - current input manifest file
     validated.txt - input from previous run of validateManifest

   Output file in the working directory: 
     validated.txt - results of validated input


================================================================
========   varStepToBedGraph.pl   ====================================
================================================================
Can't open -verbose=2: No such file or directory at ./varStepToBedGraph.pl line 34.
Processed 0 lines input, 0 data lines, 0 variable step declarations
================================================================
========   vcfToBed   ====================================
================================================================
### kent source version 473 ###
vcfToBed - Convert VCF to BED9+ with optional extra fields.
usage:
    vcfToBed in.vcf outPrefix
options:
    -fixChromNames If present, prepend 'chr' to chromosome names
    -fields=comma-sep list of tags to include in the bed file, other fields
           will be placed into out.extraFields.tab
    -fieldsIsFile If present, the -fields argument is a file with one tag per
           line to keep. Note only the first word (white space delimited) will
           be kept per line

NOTE: Extra VCF tags (that aren't listed in -fields)  get placed into a separate tab
file for later indexing.

================================================================
========   webSync   ====================================
================================================================
Usage: webSync [options] <url> - download from https server, using files.txt on their end to get the list of files

    To create files.txt on the remote end, this simple command can be used to create a list of files:
      du -ab > files.txt
    But the above command is slow, includes directories (will lead to warnings) and does not follow
    symlinks, so rather use this command:
      find -L . -type f -print0 | du -Lab --files0-from=- > files.txt

    Then run this in the download directory:
      webSync https://there.org/

    This will create a "webSyncLog" directory in the current directory, compare
    https://there.org/files.txt with the files in the current directory,
    transfer the missing files and write the changes to webSyncLog/transfer.log.

    The URL will be saved after the first run and is not necessary from then on. You can add
    cd xxx && webSync to your crontab. It will not start if it's already running (flagfile).

    Status files after a run:
    - webSyncLog/biggerHere.txt - list of files that are bigger here. These could be errors or OK.
    - webSyncLog/files.here.txt - the list of files here
    - webSyncLog/files.there.txt - the list of files there, current copy of https://there.org/files.txt
    - webSyncLog/missingThere.txt - the list of files not on https://there.org anymore but here
    - webSyncLog/transfer.log - big transfer log, each run, date and size of transferred file is noted here.
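
The comparison described above can be sketched in Python as follows. This is
a reading of the description, not webSync's code: the function names are
hypothetical, and the sketch assumes files.txt holds "size<TAB>path" lines as
written by du -ab.

```python
def parse_du_listing(text):
    """Map each path to its size in bytes, from 'size<TAB>path' lines."""
    sizes = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        size, path = line.split("\t", 1)
        sizes[path] = int(size)
    return sizes

def compare_listings(here, there):
    """Classify files the way the status files above are described."""
    return {
        # present remotely but not locally: candidates for transfer
        "missingHere": sorted(p for p in there if p not in here),
        # present on both sides but larger locally
        "biggerHere": sorted(p for p in here if p in there and here[p] > there[p]),
        # present locally but no longer listed remotely
        "missingThere": sorted(p for p in here if p not in there),
    }

there = parse_du_listing("10\t./a.txt\n20\t./sub/b.txt\n7\t./d.txt\n")
here = parse_du_listing("10\t./a.txt\n30\t./sub/b.txt\n5\t./c.txt\n")
print(compare_listings(here, there))
```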
    

Options:
  -h, --help            show this help message and exit
  -d, --debug           show debug messages
  -x CONNECTIONS, --connections=CONNECTIONS
                        Maximum number of parallel connections to the server,
                        default 10
  -s, --skipScan        Do not scan local file sizes again, in case you know
                        it is up to date
================================================================
========   wigCorrelate   ====================================
================================================================
### kent source version 473 ###
wigCorrelate - Produce a table that correlates all pairs of wigs.
usage:
   wigCorrelate one.wig two.wig ... n.wig
This works on bigWig as well as wig files.
The output is to stdout
options:
   -clampMax=N - values larger than this are clipped to this value

================================================================
========   wigEncode   ====================================
================================================================
### kent source version 473 ###
wigEncode - convert Wiggle ascii data to binary format

usage:
    wigEncode [options] wigInput wigFile wibFile
	wigInput - wiggle ascii data input file (stdin OK)
	wigFile - .wig output file to be used with hgLoadWiggle
	wibFile - .wib output file to be symlinked into /gbdb/<db>/wib/

This processes the three data input format types described at:
	http://genome.ucsc.edu/encode/submission.html#WIG
	(track and browser lines are tolerated, i.e. ignored)
options:
    -lift=<D> - lift all input coordinates by D amount, default 0
              - can be negative as well as positive
    -allowOverlap - allow overlapping data, default: overlap not allowed
              - only effective for fixedStep and if fixedStep declarations
              - are in order by chromName,chromStart
    -noOverlapSpanData - check for overlapping span data
    -wibSizeLimit=<N> - ignore rest of input when wib size is >= N

Example:
    hgGcPercent -wigOut -doGaps -file=stdout -win=5 xenTro1 \
        /cluster/data/xenTro1 | wigEncode stdin gc5Base.wig gc5Base.wib
load the resulting .wig file with hgLoadWiggle:
    hgLoadWiggle -pathPrefix=/gbdb/xenTro1/wib xenTro1 gc5Base gc5Base.wig
    ln -s `pwd`/gc5Base.wib /gbdb/xenTro1/wib
================================================================
========   wigToBigWig   ====================================
================================================================
### kent source version 473 ###
wigToBigWig v 2.9 - Convert ascii format wig file (in fixedStep, variableStep
or bedGraph format) to binary big wig format (bbi version: 4).
usage:
   wigToBigWig in.wig chrom.sizes out.bw
Where in.wig is in one of the ascii wiggle formats, but not including track lines
and chrom.sizes is a two-column file/URL: <chromosome name> <size in bases>
and out.bw is the output indexed big wig file.
If the assembly <db> is hosted by UCSC, chrom.sizes can be a URL like
  http://hgdownload.soe.ucsc.edu/goldenPath/<db>/bigZips/<db>.chrom.sizes
or you may use the script fetchChromSizes to download the chrom.sizes file.
If not hosted by UCSC, a chrom.sizes file can be generated by running
twoBitInfo on the assembly .2bit file.
options:
   -blockSize=N - Number of items to bundle in r-tree.  Default 256
   -itemsPerSlot=N - Number of data points bundled at lowest level. Default 1024
   -clip - If set just issue warning messages rather than dying if wig
                  file contains items off end of chromosome or chromosomes
                  that are not in the chrom.sizes file.
   -unc - If set, do not use compression.
   -fixedSummaries - If set, use a predefined sequence of summary levels.
   -keepAllChromosomes - If set, store all chromosomes in b-tree.
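
The two conditions that -clip downgrades to warnings can be checked ahead of
time with a sketch like the one below. It is an illustration under stated
assumptions, not part of wigToBigWig: the function names are hypothetical and
the example sizes are made up for the demonstration.

```python
def parse_chrom_sizes(text):
    """Read a two-column chrom.sizes file: <chromosome name> <size in bases>."""
    sizes = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        sizes[fields[0]] = int(fields[1])
    return sizes

def item_problem(chrom, start, end, sizes):
    """Return a description of the problem with one item, or None if it fits."""
    if chrom not in sizes:
        return "%s is not in chrom.sizes" % chrom
    if end > sizes[chrom]:
        return "%s:%d-%d runs off the end of %s (%d bases)" % (
            chrom, start, end, chrom, sizes[chrom])
    return None

sizes = parse_chrom_sizes("chrA\t1000\nchrB\t500\n")  # illustrative sizes
print(item_problem("chrA", 900, 1000, sizes))  # None
print(item_problem("chrB", 450, 600, sizes))
print(item_problem("chrC", 0, 10, sizes))
```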
================================================================
========   wordLine   ====================================
================================================================
### kent source version 473 ###
wordLine - chop up words by white space and output them with one
word to each line.
usage:
    wordLine inFile(s)
Output will go to stdout.
Options:
    -csym - Break up words based on C symbol rules rather than white space

================================================================
========   xmlCat   ====================================
================================================================
### kent source version 473 ###
xmlCat - Concatenate xml files together, stuffing all records inside a single outer tag. 
usage:
   xmlCat XXX
options:
   -xxx=XXX

================================================================
========   xmlToSql   ====================================
================================================================
### kent source version 473 ###
xmlToSql - Convert XML dump into a fairly normalized relational database
   in the form of a directory full of tab-separated files and table
   creation SQL.  You'll need to run autoDtd on the XML file first to
   get the dtd and stats files.
usage:
   xmlToSql in.xml in.dtd in.stats outDir
options:
   -prefix=name - A name to prefix all tables with
   -textField=name - Name to use for text field (default 'text')
   -maxPromoteSize=N - Maximum size (default 32) for an element that
                       just defines a string to be promoted to a field
                       in parent table

================================================================