Index of /admin/exe/linux.x86

This directory contains applications for stand-alone use, 
built specifically for a Linux 64-bit machine.

For help on the bigBed and bigWig applications see:
http://genome.ucsc.edu/goldenPath/help/bigBed.html
http://genome.ucsc.edu/goldenPath/help/bigWig.html

View the file 'FOOTER.txt' to see the usage statement for 
each of the applications.

##############################################################################
Thank you to Bob Harris for permission to distribute a binary
version of the lastz and lastz_D programs, from:

   https://github.com/lastz/lastz

Version 1.04.00 as of April 2018:

-rwxrwxr-x 1  625283 Apr  6 11:15 lastz-1.04.00
-rwxrwxr-x 1  628835 Apr  6 11:15 lastz_D-1.04.00

$ md5sum lastz*
429e61ffdf1612b7f0f0c8c2095609a7  lastz-1.04.00
4f9a558a65c3a07d0f992cd39b3a27e1  lastz_D-1.04.00
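
After copying the binaries, the sums above can be compared against the local files. The following is a minimal sketch, assuming GNU md5sum is installed. The helper name check_md5 is hypothetical and not part of this distribution.

```shell
# check_md5 FILE EXPECTED: succeed only if FILE's MD5 digest equals EXPECTED.
# check_md5 is a hypothetical helper name used for illustration.
check_md5() {
    actual=$(md5sum "$1" | awk '{print $1}')
    [ "$actual" = "$2" ]
}

# Sums copied from the listing above; checked only if the files are present.
if [ -f lastz-1.04.00 ]; then
    check_md5 lastz-1.04.00 429e61ffdf1612b7f0f0c8c2095609a7 && echo "lastz OK"
fi
if [ -f lastz_D-1.04.00 ]; then
    check_md5 lastz_D-1.04.00 4f9a558a65c3a07d0f992cd39b3a27e1 && echo "lastz_D OK"
fi
```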

##############################################################################
This entire directory can be copied with the rsync command
into the local directory ./

rsync -aP rsync://hgdownload.soe.ucsc.edu/genome/admin/exe/linux.x86_64/ ./

Or from our mirror site:

rsync -aP rsync://hgdownload-sd.soe.ucsc.edu/genome/admin/exe/linux.x86_64/ ./

Individual programs can be copied by adding their name, for example:

rsync -aP \
   rsync://hgdownload.soe.ucsc.edu/genome/admin/exe/linux.x86_64/faSize ./
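
To copy a handful of programs rather than one, the command above can be wrapped in a loop. This is a sketch only: the helper names exe_url and fetch_tools are hypothetical, and the chmod is a precaution in case the execute bit is lost locally.

```shell
# Base URL taken from the rsync examples above.
EXE_BASE="rsync://hgdownload.soe.ucsc.edu/genome/admin/exe/linux.x86_64"

# exe_url NAME: print the rsync URL for one program (hypothetical helper).
exe_url() {
    printf '%s/%s\n' "$EXE_BASE" "$1"
}

# fetch_tools NAME...: copy each named program into ./ and mark it executable.
fetch_tools() {
    for name in "$@"; do
        rsync -aP "$(exe_url "$name")" ./ && chmod +x "./$name"
    done
}

# Example (not run automatically, since it needs network access):
#   fetch_tools faSize twoBitInfo bedClip
```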

      Name                       Last modified      Size  Description
      Parent Directory                                -   
      FOOTER                     2018-03-13 15:36  256K  
      FOOTER.txt                 2026-07-28 14:59  375   
      addCols                    2026-04-07 16:36  7.5M  
      ameme                      2026-04-07 16:36  8.3M  
      autoDtd                    2026-04-07 16:36  7.5M  
      autoSql                    2026-04-07 16:36  7.7M  
      autoXml                    2026-04-07 16:36  7.5M  
      ave                        2026-04-07 16:36  7.5M  
      aveCols                    2026-04-07 16:36  7.5M  
      axtChain                   2025-06-27 09:37  8.0M  
      axtSort                    2025-06-27 09:37  7.5M  
      axtSwap                    2025-06-27 09:37  7.5M  
      axtToMaf                   2025-06-27 09:37  7.6M  
      axtToPsl                   2025-06-27 09:37  7.7M  
      bamToPsl                   2026-04-07 16:36  7.8M  
      barChartMaxLimit           2020-04-20 10:08  1.0K  
      bedClip                    2026-04-07 16:36  8.4M  
      bedCommonRegions           2026-04-07 16:36  7.5M  
      bedCoverage                2025-06-27 09:37   24M  
      bedExtendRanges            2026-01-14 23:38   24M  
      bedGeneParts               2026-04-07 16:36  8.2M  
      bedGraphPack               2026-04-07 16:36  7.5M  
      bedGraphToBigWig           2026-04-07 16:36  8.4M  
      bedIntersect               2025-06-27 09:36  7.5M  
      bedItemOverlapCount        2025-06-27 09:36   24M  
      bedJoinTabOffset           2026-04-07 16:36  7.5M  
      bedJoinTabOffset.py        2019-07-15 10:23  4.3K  
      bedMergeAdjacent           2026-04-07 16:36  8.2M  
      bedPartition               2026-04-07 16:36  8.2M  
      bedPileUps                 2026-04-07 16:36  7.5M  
      bedRemoveOverlap           2026-04-07 16:36  7.5M  
      bedRestrictToPositions     2026-04-07 16:36  7.5M  
      bedSort                    2026-04-07 16:36  8.2M  
      bedToBigBed                2026-04-07 16:36  8.5M  
      bedToExons                 2025-06-27 09:37  8.2M  
      bedToGenePred              2025-06-27 09:36   24M  
      bedToPsl                   2026-04-07 16:36  8.2M  
      bedWeedOverlapping         2026-04-07 16:36  8.2M  
      bigBedInfo                 2026-04-07 16:36  8.3M  
      bigBedNamedItems           2026-04-07 16:36  8.3M  
      bigBedSummary              2026-04-07 16:36  8.3M  
      bigBedToBed                2026-04-07 16:36  8.3M  
      bigChainBreaks             2026-04-07 16:36  8.3M  
      bigChainToChain            2025-12-03 00:15   24M  
      bigGenePredToGenePred      2026-01-14 23:38   24M  
      bigGuessDb                 2026-03-31 08:41   10K  
      bigHeat                    2020-12-01 16:37   17K  
      bigMafToMaf                2026-03-20 10:34  8.3M  
      bigPslToPsl                2026-04-07 16:36  8.4M  
      bigWigAverageOverBed       2026-04-07 16:36  8.3M  
      bigWigCat                  2026-04-07 16:36  8.4M  
      bigWigCluster              2026-04-07 16:36  7.6M  
      bigWigCorrelate            2026-04-07 16:36  8.4M  
      bigWigInfo                 2026-04-07 16:36  7.6M  
      bigWigMerge                2026-04-07 16:36  7.6M  
      bigWigSummary              2026-04-07 16:36  7.6M  
      bigWigToBedGraph           2026-04-07 16:36  8.4M  
      bigWigToWig                2026-04-07 16:36  8.4M  
      binFromRange               2026-01-14 23:38   24M  
      blastToPsl                 2025-06-27 09:36  7.7M  
      blastXmlToPsl              2025-06-27 09:36  8.0M  
      blat/                      2026-04-07 16:51    -   
      calc                       2026-04-07 16:36  7.5M  
      catDir                     2026-04-07 16:36  7.5M  
      catUncomment               2026-04-07 16:36  7.5M  
      chainAntiRepeat            2025-06-27 09:37  7.6M  
      chainBridge                2025-06-27 09:37  7.7M  
      chainCleaner               2025-06-27 09:37  7.8M  
      chainFilter                2025-06-27 09:37  7.5M  
      chainMergeSort             2025-06-27 09:37  7.5M  
      chainNet                   2025-06-27 09:37  7.5M  
      chainPreNet                2025-06-27 09:37  7.5M  
      chainScore                 2025-06-27 09:37  7.7M  
      chainSort                  2025-06-27 09:37  7.5M  
      chainSplit                 2025-06-27 09:37  7.5M  
      chainStitchId              2025-06-27 09:37  7.5M  
      chainSwap                  2025-06-27 09:37  7.5M  
      chainToAxt                 2025-06-27 09:37  7.7M  
      chainToBigChain            2026-04-07 16:36  7.6M  
      chainToPsl                 2025-06-27 09:37  7.7M  
      chainToPslBasic            2025-06-27 09:37  7.7M  
      checkAgpAndFa              2025-06-27 09:36  7.7M  
      checkCoverageGaps          2025-06-27 09:36   24M  
      checkHgFindSpec            2025-06-27 09:36   24M  
      checkTableCoords           2025-06-27 09:36   24M  
      chopFaLines                2026-04-07 16:36  7.5M  
      chromGraphFromBin          2026-01-14 23:38   24M  
      chromGraphToBin            2025-12-03 00:15   24M  
      chromToUcsc                2023-11-28 16:23   11K  
      clusterGenes               2025-06-27 09:36   24M  
      clusterMatrixToBarChartBed 2026-04-07 16:36  7.5M  
      colTransform               2026-04-07 16:36  7.5M  
      countChars                 2026-04-07 16:36  7.5M  
      cpg_lh                     2026-04-07 16:36   32K  
      crTreeIndexBed             2026-04-07 16:36  7.6M  
      crTreeSearchBed            2026-04-07 16:36  7.6M  
      dbDbToHubTxt               2026-01-14 23:38   24M  
      dbSnoop                    2025-06-27 09:36   17M  
      dbTrash                    2025-06-27 09:36   24M  
      endsInLf                   2026-04-07 16:36  7.5M  
      estOrient                  2025-06-27 09:36   24M  
      expMatrixToBarchartBed     2025-11-03 15:25   16K  
      faAlign                    2026-04-07 16:36  7.6M  
      faCmp                      2026-04-07 16:36  7.5M  
      faCount                    2026-04-07 16:36  7.7M  
      faFilter                   2026-04-07 16:36  7.5M  
      faFilterN                  2026-04-07 16:36  7.7M  
      faFrag                     2026-04-07 16:36  7.5M  
      faNoise                    2026-04-07 16:36  7.5M  
      faOneRecord                2026-04-07 16:36  7.5M  
      faPolyASizes               2026-04-07 16:36  7.5M  
      faRandomize                2026-04-07 16:36  7.5M  
      faRc                       2026-04-07 16:36  7.5M  
      faSize                     2026-04-07 16:36  7.7M  
      faSomeRecords              2026-04-07 16:36  7.5M  
      faSplit                    2026-04-07 16:36  7.6M  
      faToFastq                  2026-04-07 16:36  7.5M  
      faToTab                    2026-04-07 16:36  7.5M  
      faToTwoBit                 2026-04-07 16:36  7.6M  
      faToVcf                    2026-04-07 16:36  7.6M  
      faTrans                    2026-04-07 16:36  7.5M  
      fastqStatsAndSubsample     2026-04-07 16:36  7.6M  
      fastqToFa                  2026-04-07 16:36  7.5M  
      featureBits                2025-06-27 09:36   24M  
      fetchChromSizes            2023-06-20 09:53  3.0K  
      findMotif                  2026-04-07 16:36  7.7M  
      fixStepToBedGraph.pl       2026-04-07 16:36  1.4K  
      fixTrackDb                 2025-06-27 09:36   24M  
      gapToLift                  2026-01-14 23:38   24M  
      gencodeVersionForGenes     2026-03-20 10:34  7.6M  
      genePredCheck              2025-06-27 09:37   24M  
      genePredCompare            2026-02-27 08:52  6.6K  
      genePredFilter             2026-01-14 23:38   24M  
      genePredHisto              2025-06-27 09:36   24M  
      genePredSingleCover        2025-06-27 09:36   24M  
      genePredToBed              2025-06-27 09:36   24M  
      genePredToBigGenePred      2025-12-03 00:15   24M  
      genePredToFakePsl          2025-06-27 09:36   24M  
      genePredToGtf              2025-06-27 09:37   24M  
      genePredToMafFrames        2025-06-27 09:36   24M  
      genePredToProt             2025-12-03 00:15   24M  
      gensub2                    2026-04-07 16:36  7.5M  
      getRna                     2025-06-27 09:36   24M  
      getRnaPred                 2025-06-27 09:36   24M  
      gff3ToGenePred             2025-12-03 00:15   24M  
      gff3ToPsl                  2026-03-20 10:34  7.8M  
      gmtime                     2026-04-07 16:36   21K  
      gtfToGenePred              2025-12-03 00:15   24M  
      headRest                   2026-04-07 16:36  7.5M  
      hgBbiDbLink                2026-04-07 16:36   17M  
      hgFakeAgp                  2026-04-07 16:36  7.5M  
      hgFindSpec                 2026-01-14 23:38   24M  
      hgGcPercent                2026-01-14 23:38   24M  
      hgGoldGapGl                2026-01-14 23:38   24M  
      hgLoadBed                  2026-01-14 23:38   24M  
      hgLoadChain                2026-01-14 23:38   24M  
      hgLoadGap                  2025-06-27 09:36   24M  
      hgLoadMaf                  2026-01-14 23:38   24M  
      hgLoadMafSummary           2026-01-14 23:38   24M  
      hgLoadNet                  2026-01-14 23:38   24M  
      hgLoadOut                  2026-01-14 23:38   24M  
      hgLoadOutJoined            2026-01-14 23:38   24M  
      hgLoadSqlTab               2026-04-07 16:36   17M  
      hgLoadWiggle               2026-01-14 23:38   24M  
      hgSpeciesRna               2025-06-27 09:36   24M  
      hgTrackDb                  2026-01-14 23:38   24M  
      hgWiggle                   2026-01-14 23:38   24M  
      hgsql                      2025-06-27 09:36   17M  
      hgsqldump                  2025-06-27 09:36   16M  
      hgvsToVcf                  2025-12-03 00:15   24M  
      hicInfo                    2025-12-03 00:15   24M  
      htmlCheck                  2026-04-07 16:36  7.5M  
      hubCheck                   2025-12-03 00:15   24M  
      hubClone                   2025-12-03 00:15   24M  
      hubPublicCheck             2025-12-03 00:15   24M  
      ixIxx                      2026-04-07 16:36  7.6M  
      lastz-1.04.00              2018-04-06 11:15  611K  
      lastz_D-1.04.00            2018-04-06 11:15  614K  
      lavToAxt                   2025-06-27 09:37  7.7M  
      lavToPsl                   2025-06-27 09:37  8.2M  
      ldHgGene                   2026-01-14 23:38   24M  
      liftOver                   2025-06-27 09:36   24M  
      liftOverMerge              2025-06-27 09:36  8.2M  
      liftUp                     2026-01-14 23:38   24M  
      linesToRa                  2026-04-07 16:36  7.5M  
      localtime                  2026-04-07 16:36   21K  
      mafAddIRows                2025-06-27 09:37  7.6M  
      mafAddQRows                2025-06-27 09:37  7.6M  
      mafCoverage                2025-06-27 09:37   24M  
      mafFetch                   2025-06-27 09:37   24M  
      mafFilter                  2025-06-27 09:37  7.5M  
      mafFrag                    2025-06-27 09:37   24M  
      mafFrags                   2025-06-27 09:37   24M  
      mafGene                    2025-06-27 09:37   24M  
      mafMeFirst                 2025-06-27 09:37  7.5M  
      mafNoAlign                 2025-06-27 09:37  7.5M  
      mafOrder                   2025-06-27 09:37  7.5M  
      mafRanges                  2025-06-27 09:37  7.5M  
      mafSpeciesList             2025-06-27 09:37  7.5M  
      mafSpeciesSubset           2025-06-27 09:37  7.5M  
      mafSplit                   2025-06-27 09:37  8.2M  
      mafSplitPos                2025-06-27 09:37   24M  
      mafToAxt                   2025-06-27 09:37  7.6M  
      mafToBigMaf                2026-03-20 10:34  7.5M  
      mafToPsl                   2025-06-27 09:37  7.7M  
      mafToSnpBed                2025-06-27 09:37   24M  
      mafsInRegion               2025-06-27 09:37  8.2M  
      makeTableList              2025-12-03 00:15   24M  
      maskOutFa                  2025-06-27 09:36  8.2M  
      matrixClusterColumns       2026-04-07 16:36  7.5M  
      matrixMarketToTsv          2026-04-07 16:36  7.5M  
      matrixNormalize            2026-04-07 16:36  7.5M  
      matrixToBarChartBed        2026-04-07 16:36  7.5M  
      mktime                     2026-04-07 16:36   21K  
      mrnaToGene                 2025-06-27 09:37   24M  
      multiz/                    2024-09-30 09:44    -   
      netChainSubset             2025-06-27 09:37  7.5M  
      netClass                   2025-06-27 09:37   24M  
      netFilter                  2025-06-27 09:37  7.5M  
      netSplit                   2025-06-27 09:37  7.5M  
      netSyntenic                2025-06-27 09:37  7.5M  
      netToAxt                   2025-06-27 09:37  7.7M  
      netToBed                   2025-06-27 09:37  7.5M  
      newProg                    2026-04-07 16:36  7.5M  
      newPythonProg              2026-04-07 16:36  7.5M  
      nibFrag                    2026-04-07 16:36  7.6M  
      nibSize                    2026-04-07 16:36  7.5M  
      oligoMatch                 2026-02-26 16:35  8.3M  
      overlapSelect              2025-09-30 22:54   24M  
      para                       2026-04-07 16:36  7.7M  
      paraFetch                  2026-04-07 16:36  7.5M  
      paraHub                    2026-04-07 16:36  7.8M  
      paraHubStop                2026-04-07 16:36  7.6M  
      paraNode                   2026-04-07 16:36  7.6M  
      paraNodeStart              2026-04-07 16:36  7.5M  
      paraNodeStatus             2026-04-07 16:36  7.6M  
      paraNodeStop               2026-04-07 16:36  7.6M  
      paraSync                   2026-04-07 16:36  7.5M  
      paraTestJob                2026-04-07 16:36  7.5M  
      parasol                    2026-04-07 16:36  7.7M  
      positionalTblCheck         2025-09-30 22:54   24M  
      pslCDnaFilter              2025-06-27 09:37   24M  
      pslCat                     2025-06-27 09:37  7.7M  
      pslCheck                   2025-06-27 09:37   24M  
      pslDropOverlap             2025-06-27 09:37  7.7M  
      pslFilter                  2025-06-27 09:37  7.7M  
      pslHisto                   2025-06-27 09:37  7.7M  
      pslLiftSubrangeBlat        2025-09-30 22:54   24M  
      pslMap                     2026-04-07 16:36  7.8M  
      pslMapPostChain            2026-04-07 16:36  7.7M  
      pslMrnaCover               2025-06-27 09:37  7.7M  
      pslPairs                   2025-06-27 09:37  7.7M  
      pslPartition               2025-06-27 09:37  7.7M  
      pslPosTarget               2026-04-07 16:36  7.7M  
      pslPretty                  2025-06-27 09:37  7.9M  
      pslProtToRnaCoords         2026-04-07 16:36  7.7M  
      pslRc                      2026-04-07 16:36  7.7M  
      pslRecalcMatch             2025-06-27 09:37  7.9M  
      pslRemoveFrameShifts       2026-04-07 16:36  7.7M  
      pslReps                    2025-06-27 09:37  7.7M  
      pslScore                   2026-04-07 16:36  7.7M  
      pslSelect                  2025-06-27 09:37  7.7M  
      pslSomeRecords             2025-06-27 09:37  7.7M  
      pslSort                    2025-06-27 09:37  7.7M  
      pslSortAcc                 2025-06-27 09:37  7.7M  
      pslSpliceJunctions         2026-04-07 16:36  7.8M  
      pslSplitOnTarget           2025-06-27 09:37  7.7M  
      pslStats                   2025-06-27 09:37  7.7M  
      pslSwap                    2026-04-07 16:36  7.7M  
      pslToBed                   2025-06-27 09:37  8.2M  
      pslToBigPsl                2025-06-27 09:36   24M  
      pslToChain                 2025-06-27 09:37  7.7M  
      pslToPslx                  2026-04-07 16:36  7.9M  
      pslxToFa                   2025-06-27 09:37  7.7M  
      qaToQac                    2025-06-27 09:37  7.5M  
      qacAgpLift                 2025-06-27 09:37  7.6M  
      qacToQa                    2025-06-27 09:37  7.5M  
      qacToWig                   2025-06-27 09:37  7.5M  
      raSqlQuery                 2025-06-27 09:36   24M  
      raToLines                  2026-04-07 16:36  7.5M  
      raToTab                    2026-04-07 16:36  7.5M  
      randomLines                2026-04-07 16:36  7.5M  
      rmFaDups                   2026-04-07 16:36  7.5M  
      rmskAlignToPsl             2025-06-27 09:36  7.8M  
      rowsToCols                 2026-04-07 16:36  7.5M  
      sizeof                     2026-04-07 16:36   20K  
      spacedToTab                2026-04-07 16:36  7.5M  
      splitFile                  2026-04-07 16:36  7.5M  
      splitFileByColumn          2026-04-07 16:36  7.5M  
      sqlToXml                   2025-06-27 09:37   17M  
      strexCalc                  2026-04-07 16:36  7.6M  
      stringify                  2026-04-07 16:36  7.5M  
      subChar                    2026-04-07 16:36  7.5M  
      subColumn                  2026-04-07 16:36  7.5M  
      tabFmt                     2026-04-07 16:36  7.5M  
      tabQuery                   2017-05-23 15:11  4.1M  
      tabToTabDir                2026-04-07 16:36  7.7M  
      tailLines                  2026-04-07 16:36  7.5M  
      tdbQuery                   2025-06-27 09:36   24M  
      tdbRename                  2019-04-16 17:26  3.1K  
      tdbSort                    2019-03-05 15:18  4.2K  
      textHistogram              2026-04-07 16:36  7.5M  
      tickToDate                 2026-04-07 16:36  7.5M  
      toLower                    2026-04-07 16:36  7.5M  
      toUpper                    2026-04-07 16:36  7.5M  
      trackDbIndexBb             2020-06-11 14:09   19K  
      transMapPslToGenePred      2025-06-27 09:36   24M  
      trfBig                     2025-06-27 09:37  7.6M  
      twoBitDup                  2026-04-07 16:36  7.6M  
      twoBitInfo                 2026-04-07 16:36  7.6M  
      twoBitMask                 2025-06-27 09:36  8.3M  
      twoBitToFa                 2026-04-07 16:36  8.3M  
      ucscApiClient              2019-07-22 10:00  4.8K  
      udr                        2016-05-18 16:21  2.7M  
      vai.pl                     2020-03-09 09:57   12K  
      validateFiles              2025-06-27 09:36   24M  
      validateManifest           2025-06-27 09:36  7.6M  
      varStepToBedGraph.pl       2026-04-07 16:36  1.7K  
      vcfToBed                   2025-06-27 09:36  7.6M  
      webSync                    2018-04-16 10:33   10K  
      wigCorrelate               2026-04-07 16:36  8.5M  
      wigEncode                  2026-04-07 16:36  7.5M  
      wigToBigWig                2026-04-07 16:36  8.4M  
      wordLine                   2026-04-07 16:36  7.5M  
      xmlCat                     2025-06-27 09:37  7.5M  
      xmlToSql                   2025-06-27 09:37  7.6M

================================================================
To download all of the files from one of these admin/exe/ directories
(for example, admin/exe/linux.x86_64/) to your current directory,
use the rsync command:

  rsync -aP rsync://hgdownload.cse.ucsc.edu/genome/admin/exe/linux.x86_64/ ./

================================================================
========   addCols   ====================================
================================================================
### kent source version 362 ###
addCols - Sum columns in a text file.
usage:
   addCols <fileName>
adds all columns in the given file, 
outputs the sum of each column.  <fileName> can be the
name: stdin to accept input from stdin.
Options:
    -maxCols=N - maximum number of columns (defaults to 16)
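
The column-summing idea can be illustrated with awk. This is not the addCols implementation: the function name sum_cols and its space-separated output format are choices made for this sketch.

```shell
# sum_cols: print the sum of every whitespace-separated column read from stdin.
# An awk illustration of the idea only; the output format is this sketch's choice.
sum_cols() {
    awk '{ for (i = 1; i <= NF; i++) s[i] += $i; if (NF > n) n = NF }
         END { for (i = 1; i <= n; i++) printf "%s%s", s[i], (i < n ? " " : "\n") }'
}

printf '1 2\n3 4\n' | sum_cols    # prints "4 6"

# With the real tool installed, the same input can be piped in via stdin:
#   printf '1 2\n3 4\n' | addCols stdin
```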

================================================================
========   ameme   ====================================
================================================================
ameme - find common patterns in DNA
usage:
    ameme good=goodIn.fa [bad=badIn.fa] [numMotifs=2] [background=m1] [maxOcc=2] [motifOutput=fileName] [html=output.html] [gif=output.gif] [rcToo=on] [controlRun=on] [startScanLimit=20] [outputLogo] [constrainer=1]
where goodIn.fa is a multi-sequence fa file containing instances
of the motif you want to find, badIn.fa is a file containing similar
sequences but lacking the motif, numMotifs is the number of motifs
to scan for, background is m0, m1, or m2 for various levels of Markov
models, maxOcc is the maximum occurrences of the motif you 
expect to find in a single sequence and motifOutput is the name 
of a file to store just the motifs in. rcToo=on searches both strands.
If you include controlRun=on in the command line, a random set of 
sequences will be generated that match your foreground data set in size, 
and your background data set in nucleotide probabilities. The program 
will then look for motifs in this random set. If the scores you get in a 
real run are about the same as those you get in a control run, then the motifs
Improbizer has found are probably not significant.

================================================================
========   autoDtd   ====================================
================================================================
### kent source version 362 ###
autoDtd - Give this an XML document to look at and it will come up with a DTD
to describe it.
usage:
   autoDtd in.xml out.dtd out.stats
options:
   -tree=out.tree - Output tag tree.
   -atree=out.atree - Output attributed tag tree.

================================================================
========   autoSql   ====================================
================================================================
### kent source version 362 ###
autoSql - create SQL and C code for permanently storing
a structure in database and loading it back into memory
based on a specification file
usage:
    autoSql specFile outRoot {optional: -dbLink -withNull -json} 
This will create outRoot.sql outRoot.c and outRoot.h based
on the contents of specFile. 

options:
  -dbLink - optionally generates code to execute queries and
            updates of the table.
  -addBin - Add an initial bin field and index it as (chrom,bin)
  -withNull - optionally generates code and .sql to enable
              applications to accept and load data into objects
              with potential 'missing data' (NULL in SQL)
              situations.
  -defaultZeros - will put zero and or empty string as default value
  -django - generate method to output object as django model Python code
  -json - generate method to output the object in JSON (JavaScript) format.
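
A specification file might look like the one written below. The table name exampleRegion and its fields are invented for illustration. autoSql is run only if it is found on the PATH.

```shell
# Write a small specification file. The table and field names are made up.
cat > exampleRegion.as <<'EOF'
table exampleRegion
"A named genomic interval (illustrative only)"
    (
    string chrom;      "Chromosome name"
    uint   chromStart; "Start position"
    uint   chromEnd;   "End position"
    string name;       "Item name"
    )
EOF

# Generate exampleRegion.sql, exampleRegion.c and exampleRegion.h if autoSql is installed.
if command -v autoSql >/dev/null 2>&1; then
    autoSql exampleRegion.as exampleRegion
fi
```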

================================================================
========   autoXml   ====================================
================================================================
autoXml - Generate structures code and parser for XML file from DTD-like spec
usage:
   autoXml file.dtdx root
This will generate root.c, root.h
options:
   -textField=xxx what to name text between start/end tags. Default 'text'
   -comment=xxx Comment to appear at top of generated code files
   -picky  Generate parser that rejects stuff it doesn't understand
   -main   Put in a main routine that's a test harness
   -prefix=xxx Prefix to add to structure names. By default same as root
   -positive Don't write out optional attributes with negative values

================================================================
========   ave   ====================================
================================================================
ave - Compute average and basic stats
usage:
   ave file
options:
   -col=N Which column to use.  Default 1
   -tableOut - output by columns (default output in rows)
   -noQuartiles - only calculate min,max,mean,standard deviation
                - for large data sets that will not fit in memory.
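
The statistics named for -noQuartiles can all be accumulated in a single pass, which is why they suit data that does not fit in memory. The awk sketch below shows the idea. It is not ave's code, and the choice of the sample (n-1) standard deviation and the output format are assumptions of this sketch.

```shell
# stream_stats [COL]: one-pass min, max, mean and sample standard deviation.
# Only running totals are kept, so memory use does not grow with input size.
stream_stats() {
    awk -v col="${1:-1}" '
        NF >= col {
            x = $col + 0
            if (n == 0 || x < min) min = x
            if (n == 0 || x > max) max = x
            n++; sum += x; sumsq += x * x
        }
        END {
            if (n == 0) exit 1
            mean = sum / n
            var = (n > 1) ? (sumsq - n * mean * mean) / (n - 1) : 0
            if (var < 0) var = 0    # guard against rounding error
            printf "min %g max %g mean %g sd %.4f\n", min, max, mean, sqrt(var)
        }'
}

printf '1\n2\n3\n4\n' | stream_stats 1    # prints "min 1 max 4 mean 2.5 sd 1.2910"
```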
================================================================
========   aveCols   ====================================
================================================================
aveCols - average together columns
usage:
   aveCols <fileName>
adds all columns (up to 16 columns) in the given file, 
outputs the average (sum/#ofRows) of each column.  <fileName> can be the
name: stdin to accept input from stdin.
================================================================
========   axtChain   ====================================
================================================================
axtChain - Chain together axt alignments.
usage:
   axtChain [options] -linearGap=loose in.axt tNibDir qNibDir out.chain
Where tNibDir/qNibDir are either directories full of nib files, the name
of a .2bit file, or a single fasta file with additional -faQ or -faT options.
options:
   -psl Use psl instead of axt format for input
   -faQ The specified qNibDir is a fasta file with multiple sequences for query
   -faT The specified tNibDir is a fasta file with multiple sequences for target
                NOTE: will not work with gzipped fasta files
   -minScore=N  Minimum score for chain, default 1000
   -details=fileName Output some additional chain details
   -scoreScheme=fileName Read the scoring matrix from a blastz-format file
   -linearGap=<medium|loose|filename> Specify type of linearGap to use.
              *Must* specify this argument to one of these choices.
              loose is chicken/human linear gap costs.
              medium is mouse/human linear gap costs.
              Or specify a piecewise linearGap tab delimited file.
   sample linearGap file (loose)
tablesize       11
smallSize       111
position        1       2       3       11      111     2111    12111   32111   72111   152111  252111
qGap    325     360     400     450     600     1100    3600    7600    15600   31600   56600
tGap    325     360     400     450     600     1100    3600    7600    15600   31600   56600
bothGap 625     660     700     750     900     1400    4000    8000    16000   32000   57000
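
Because the linearGap file must be tab delimited, it can help to generate it with explicit tab characters. The sketch below writes the sample "loose" table shown above. The output file name and the placeholder inputs (in.axt, target.2bit, query.2bit) are assumptions, and axtChain is invoked only if it and the input file are present.

```shell
# Write the sample "loose" table above as a tab-delimited file.
{
    printf 'tablesize\t11\n'
    printf 'smallSize\t111\n'
    printf 'position\t1\t2\t3\t11\t111\t2111\t12111\t32111\t72111\t152111\t252111\n'
    printf 'qGap\t325\t360\t400\t450\t600\t1100\t3600\t7600\t15600\t31600\t56600\n'
    printf 'tGap\t325\t360\t400\t450\t600\t1100\t3600\t7600\t15600\t31600\t56600\n'
    printf 'bothGap\t625\t660\t700\t750\t900\t1400\t4000\t8000\t16000\t32000\t57000\n'
} > loose.linearGap.tab

# Hypothetical invocation; in.axt, target.2bit and query.2bit are placeholders.
if command -v axtChain >/dev/null 2>&1 && [ -f in.axt ]; then
    axtChain -linearGap=loose.linearGap.tab in.axt target.2bit query.2bit out.chain
fi
```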

================================================================
========   axtSort   ====================================
================================================================
axtSort - Sort axt files
usage:
   axtSort in.axt out.axt
options:
   -query - Sort by query position, not target
   -byScore - Sort by score

================================================================
========   axtSwap   ====================================
================================================================
axtSwap - Swap source and query in an axt file
usage:
   axtSwap source.axt target.sizes query.sizes dest.axt
options:
   -xxx=XXX

================================================================
========   axtToMaf   ====================================
================================================================
### kent source version 362 ###
axtToMaf - Convert from axt to maf format
usage:
   axtToMaf in.axt tSizes qSizes out.maf
Where tSizes and qSizes are files that contain
the sizes of the target and query sequences.
Very often these will be chrom.sizes files.
Options:
    -qPrefix=XX. - add XX. to start of query sequence name in maf
    -tPrefix=YY. - add YY. to start of target sequence name in maf
    -tSplit Create a separate maf file for each target sequence.
            In this case output is a dir rather than a file
            In this case in.axt must be sorted by target.
    -score       - recalculate score 
    -scoreZero   - recalculate score if zero 

================================================================
========   axtToPsl   ====================================
================================================================
axtToPsl - Convert axt to psl format
usage:
   axtToPsl in.axt tSizes qSizes out.psl
Where tSizes and qSizes are tab-delimited files with
       <seqName> <size>
columns.
options:
   -xxx=XXX

================================================================
========   bamToPsl   ====================================
================================================================
### kent source version 362 ###
bamToPsl - Convert a bam file to a psl and optionally also a fasta file that contains the reads.
usage:
   bamToPsl [options] in.bam out.psl
options:
   -fasta=output.fa - output query sequences to specified file
   -chromAlias=file - specify a two-column file: 1: alias, 2: other name
          for target name translation from column 1 name to column 2 name
          names not found are passed through intact
   -nohead          - do not output the PSL header, default has header output
   -allowDups       - for fasta output, allow duplicate query sequences output
                    - default is to eliminate duplicate sequences
                    - runs much faster without the duplicate check
   -noSequenceVerify - when checking for dups, do not verify each sequence
                    - when two names are identical, assume the sequences are too
                    - speeds up the dup check but is less thorough
   -dots=N          - output progress dot(.) every N alignments processed

note: a chromAlias file can be obtained from a UCSC database, e.g.:
 hgsql -N -e 'select alias,chrom from chromAlias;' hg38 > hg38.chromAlias.tab
================================================================
========   bedClip   ====================================
================================================================
### kent source version 362 ###
bedClip - Remove lines from bed file that refer to off-chromosome locations.
usage:
   bedClip [options] input.bed chrom.sizes output.bed
chrom.sizes is a two-column file/URL: <chromosome name> <size in bases>
If the assembly <db> is hosted by UCSC, chrom.sizes can be a URL like
  http://hgdownload.cse.ucsc.edu/goldenPath/<db>/bigZips/<db>.chrom.sizes
or you may use the script fetchChromSizes to download the chrom.sizes file.
If not hosted by UCSC, a chrom.sizes file can be generated by running
twoBitInfo on the assembly .2bit file.
options:
   -truncate  - truncate items that span ends of chrom instead of the
                default of dropping the items
   -verbose=2 - set to get list of lines clipped and why
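
The default drop rule can be approximated in awk as a quick sanity check. This is a sketch rather than bedClip itself: the name clip_bed is hypothetical, -truncate is not modelled, and silently dropping items whose chromosome is missing from chrom.sizes is this sketch's own choice.

```shell
# clip_bed CHROM_SIZES < in.bed: keep only items that lie inside their chromosome.
# Approximates the default (drop) behaviour. Lines on chromosomes missing from
# CHROM_SIZES are also dropped, which is a choice made by this sketch.
clip_bed() {
    awk '
        NR == FNR { size[$1] = $2 + 0; next }    # first file: chrom.sizes
        ($1 in size) && $2 + 0 >= 0 && $3 + 0 <= size[$1]
    ' "$1" -
}

# Example, assuming hg38.chrom.sizes and input.bed exist locally:
#   clip_bed hg38.chrom.sizes < input.bed > output.bed
```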
================================================================
========   bedCommonRegions   ====================================
================================================================
### kent source version 362 ###
bedCommonRegions - Create a bed file (just bed3) that contains the regions common to all inputs.
Regions are common only if they have exactly the same chromosome, start, and end.  Overlap is not enough.
Each region must be in each input at most once. Output is stdout.
usage:
   bedCommonRegions file1 file2 file3 ... fileN
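
The exact-match rule can be sketched in Python as a set intersection over
(chrom, start, end) triples. This is an illustration only, not the program's
implementation. The function name common_regions is hypothetical.

```python
def common_regions(*beds):
    """Return regions present, with identical chrom/start/end, in every input.

    Each argument is an iterable of (chrom, start, end) tuples.
    """
    sets = [set(bed) for bed in beds]
    if not sets:
        return []
    shared = set.intersection(*sets)
    # Sorted for stable output; the real tool's output order is not assumed.
    return sorted(shared)
```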

================================================================
========   bedCoverage   ====================================
================================================================
bedCoverage - Analyse coverage by bed files - chromosome by 
chromosome and genome-wide.
usage:
   bedCoverage database bedFile
Note bed file must be sorted by chromosome
   -restrict=restrict.bed Restrict to parts in restrict.bed

================================================================
========   bedExtendRanges   ====================================
================================================================
### kent source version 362 ###
bedExtendRanges - extend length of entries in bed 6+ data to be at least the given length,
taking strand directionality into account.

usage:
   bedExtendRanges database length file(s)

options:
   -host	mysql host
   -user	mysql user
   -password	mysql password
   -tab		Separate by tabs rather than space
   -verbose=N - verbose level for extra information to STDERR

example:

   bedExtendRanges hg18 250 stdin

   bedExtendRanges -user=genome -host=genome-mysql.cse.ucsc.edu hg18 250 stdin

will transform:
    chr1 500 525 . 100 +
    chr1 1000 1025 . 100 -
to:
    chr1 500 750 . 100 +
    chr1 775 1025 . 100 -
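
The transformation in the example above can be sketched in Python as follows.
This is an illustration under stated assumptions, not the program's code. The
function name extend_range is hypothetical. The sketch clamps at 0 but does
not clip at the chromosome end, since it has no chromosome sizes available.

```python
def extend_range(start, end, strand, length):
    """Extend a BED interval to at least `length` bases, strand-aware.

    '+' items grow at the end; '-' items grow at the start.
    Items already at least `length` long are returned unchanged.
    """
    if end - start >= length:
        return start, end
    if strand == "-":
        return max(end - length, 0), end  # assumption: clamp at 0
    return start, start + length
```

With length 250, extend_range(500, 525, "+", 250) gives (500, 750) and
extend_range(1000, 1025, "-", 250) gives (775, 1025), matching the example.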

================================================================
========   bedGeneParts   ====================================
================================================================
### kent source version 362 ###
bedGeneParts - Given a bed, spit out promoter, first exon, or all introns.
usage:
   bedGeneParts part in.bed out.bed
Where part is either 'exons' or 'firstExon' or 'introns' or 'promoter' or 'firstCodingSplice'
or 'secondCodingSplice'
options:
   -proStart=NN - start of promoter relative to txStart, default -100
   -proEnd=NN - end of promoter relative to txStart, default 50

================================================================
========   bedGraphPack   ====================================
================================================================
### kent source version 362 ###
bedGraphPack v1 - Pack together adjacent records representing same value.
usage:
   bedGraphPack in.bedGraph out.bedGraph
The input needs to be sorted by chrom and this is checked.  To put in a pipe
use stdin and stdout in the command line in place of file names.

================================================================
========   bedGraphToBigWig   ====================================
================================================================
### kent source version 362 ###
bedGraphToBigWig v 4 - Convert a bedGraph file to bigWig format.
usage:
   bedGraphToBigWig in.bedGraph chrom.sizes out.bw
where in.bedGraph is a four column file in the format:
      <chrom> <start> <end> <value>
and chrom.sizes is a two-column file/URL: <chromosome name> <size in bases>
and out.bw is the output indexed big wig file.
If the assembly <db> is hosted by UCSC, chrom.sizes can be a URL like
  http://hgdownload.cse.ucsc.edu/goldenPath/<db>/bigZips/<db>.chrom.sizes
or you may use the script fetchChromSizes to download the chrom.sizes file.
If not hosted by UCSC, a chrom.sizes file can be generated by running
twoBitInfo on the assembly .2bit file.
The input bedGraph file must be sorted, use the unix sort command:
  sort -k1,1 -k2,2n unsorted.bedGraph > sorted.bedGraph
options:
   -blockSize=N - Number of items to bundle in r-tree.  Default 256
   -itemsPerSlot=N - Number of data points bundled at lowest level. Default 1024
   -unc - If set, do not use compression.
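
For readers who prepare input in a script, the sort order required above
(chromosome name as a plain string, then start as a number) can be sketched in
Python. The function name sort_bedgraph is hypothetical. Python compares
strings by code point, which for ASCII names matches byte order.

```python
def sort_bedgraph(lines):
    """Sort bedGraph lines by chrom (plain string order), then numeric start."""
    def key(line):
        fields = line.split()
        return fields[0], int(fields[1])
    return sorted(lines, key=key)
```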
================================================================
========   bedIntersect   ====================================
================================================================
### kent source version 362 ###
bedIntersect - Intersect two bed files
usage:
bed columns four(name) and five(score) are optional
   bedIntersect a.bed b.bed output.bed
options:
   -aHitAny        output all of a if any of it is hit by b
   -minCoverage=0.N  min coverage of b to output match (or if -aHitAny, of a).
                   Not applied to 0-length items.  Default 0.000010
   -bScore         output score from b.bed (must be at least 5 field bed)
   -tab            chop input at tabs not spaces
   -allowStartEqualEnd  Don't discard 0-length items of a or b
                        (e.g. point insertions)

================================================================
========   bedItemOverlapCount   ====================================
================================================================
### kent source version 362 ###
bedItemOverlapCount - count number of times a base is overlapped by the
	items in a bed file.  Output is bedGraph 4 to stdout.
usage:
 sort bedFile.bed | bedItemOverlapCount [options] <database> stdin
To create a bigWig file from this data to use in a custom track:
 sort -k1,1 bedFile.bed | bedItemOverlapCount [options] <database> stdin \
         > bedFile.bedGraph
 bedGraphToBigWig bedFile.bedGraph chrom.sizes bedFile.bw
   where the chrom.sizes is obtained with the script: fetchChromSizes
   See also:
 http://genome-test.cse.ucsc.edu/~kent/src/unzipped/utils/userApps/fetchChromSizes
options:
   -zero      add blocks with zero count, normally these are omitted
   -bed12     expect bed12 and count based on blocks
              Without this option, only the first three fields are used.
   -max       if counts per base overflows set to max (4294967295) instead of exiting
   -outBounds output min/max to stderr
   -chromSize=sizefile	Read chrom sizes from file instead of database
             sizefile contains two white space separated fields per line:
		chrom name and size
   -host=hostname	mysql host used to get chrom sizes
   -user=username	mysql user
   -password=password	mysql password

Notes:
 * You may want to separate your + and - strand
   items before sending into this program as it only looks at
   the chrom, start and end columns of the bed file.
 * Program requires a <database> connection to lookup chrom sizes for a sanity
   check of the incoming data.  Even when the -chromSize argument is used
   the <database> must be present, but it will not be used.

 * The bed file *must* be sorted by chrom
 * Maximum count per base is 4294967295. Recompile with new unitSize to increase this
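
The per-base counting idea can be sketched in Python with a sweep over start
and end positions. This is an illustration, not the program's algorithm. The
function name overlap_counts is hypothetical, and merging adjacent runs of
equal depth is an assumption about the output form.

```python
from collections import defaultdict

def overlap_counts(items):
    """Return bedGraph-style (chrom, start, end, count) runs.

    items: iterable of (chrom, start, end), 0-based half-open.
    Only chrom, start and end are used. Runs with a count of zero
    are omitted, as when -zero is not given.
    """
    deltas = defaultdict(lambda: defaultdict(int))
    for chrom, start, end in items:
        deltas[chrom][start] += 1   # coverage rises at start
        deltas[chrom][end] -= 1     # and falls at end
    runs = []
    for chrom in sorted(deltas):
        depth, prev = 0, None
        for pos in sorted(deltas[chrom]):
            change = deltas[chrom][pos]
            if change == 0:
                continue  # depth unchanged here, keep the run going
            if depth > 0:
                runs.append((chrom, prev, pos, depth))
            depth += change
            prev = pos
    return runs
```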
================================================================
========   bedJoinTabOffset   ====================================
================================================================
Usage: bedJoinTabOffset [options] inTabFile inBedFile outBedFile

Given a bed file and a tab file that each have a column with matching values:
first get the value of column0, the offset and the line length from inTabFile.
Then go over the bed file, use the name field and append its offset and length
to the bed file as two separate fields. Write the new bed file to outBedFile.

Options:
  -h, --help            show this help message and exit
  -d, --debug           show debug messages
  -t TABKEYFIELD, --tabKeyField=TABKEYFIELD
                        the index of the key field in the tab file that
                        matches the key field in the bed file. default 0
  -b BEDKEYFIELD, --bedKeyField=BEDKEYFIELD
                        the index of the key field in the bed file that
                        matches the key field in the tab file. default 3
================================================================
========   bedPileUps   ====================================
================================================================
### kent source version 362 ###
bedPileUps - Find (exact) overlaps if any in bed input
usage:
   bedPileUps in.bed
Where in.bed is in one of the ascii bed formats.
The in.bed file must be sorted by chromosome,start,
  to sort a bed file, use the unix sort command:
     sort -k1,1 -k2,2n unsorted.bed > sorted.bed

Options:
  -name - include BED name field 4 when evaluating uniqueness
  -tab  - use tabs to parse fields
  -verbose=2 - show the location and size of each pileUp

================================================================
========   bedRemoveOverlap   ====================================
================================================================
### kent source version 362 ###
bedRemoveOverlap - Remove overlapping records from a (sorted) bed file.  Gets rid of
the smaller of overlapping records.
usage:
   bedRemoveOverlap in.bed out.bed
options:
   -xxx=XXX

================================================================
========   bedRestrictToPositions   ====================================
================================================================
### kent source version 362 ###
bedRestrictToPositions - Filter bed file, restricting to only ones that match chrom/start/ends specified in restrict.bed file.
usage:
   bedRestrictToPositions in.bed restrict.bed out.bed
options:
   -xxx=XXX

================================================================
========   bedSort   ====================================
================================================================
bedSort - Sort a .bed file by chrom,chromStart
usage:
   bedSort in.bed out.bed
in.bed and out.bed may be the same.
================================================================
========   bedToBigBed   ====================================
================================================================
### kent source version 362 ###
bedToBigBed v. 2.7 - Convert bed file to bigBed. (BigBed version: 4)
usage:
   bedToBigBed in.bed chrom.sizes out.bb
Where in.bed is in one of the ascii bed formats, but not including track lines
and chrom.sizes is a two-column file/URL: <chromosome name> <size in bases>
and out.bb is the output indexed big bed file.
If the assembly <db> is hosted by UCSC, chrom.sizes can be a URL like
  http://hgdownload.cse.ucsc.edu/goldenPath/<db>/bigZips/<db>.chrom.sizes
or you may use the script fetchChromSizes to download the chrom.sizes file.
If not hosted by UCSC, a chrom.sizes file can be generated by running
twoBitInfo on the assembly .2bit file.
The in.bed file must be sorted by chromosome,start,
  to sort a bed file, use the unix sort command:
     sort -k1,1 -k2,2n unsorted.bed > sorted.bed
Sorting must be set to skip Unicode mapping (LC_COLLATE=C).

options:
   -type=bedN[+[P]] : 
                      N is between 3 and 15, 
                      optional (+) if extra "bedPlus" fields, 
                      optional P specifies the number of extra fields. Not required, but preferred.
                      Examples: -type=bed6 or -type=bed6+ or -type=bed6+3 
                      (see http://genome.ucsc.edu/FAQ/FAQformat.html#format1)
   -as=fields.as - If you have non-standard "bedPlus" fields, it's great to put a definition
                   of each field in a row in AutoSql format here.
   -blockSize=N - Number of items to bundle in r-tree.  Default 256
   -itemsPerSlot=N - Number of data points bundled at lowest level. Default 512
   -unc - If set, do not use compression.
   -tab - If set, expect fields to be tab separated, normally
           expects white space separator.
   -extraIndex=fieldList - If set, make an index on each field in a comma separated list
           extraIndex=name and extraIndex=name,id are commonly used.
   -sizesIs2Bit  -- If set, the chrom.sizes file is assumed to be a 2bit file.
   -udcDir=/path/to/udcCacheDir  -- sets the UDC cache dir for caching of remote files.
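
The -type=bedN[+[P]] syntax described above can be checked with a short Python
sketch. This only illustrates the grammar as written in the option text. The
function name parse_bed_type is hypothetical and is not part of bedToBigBed.

```python
import re

def parse_bed_type(value):
    """Parse a -type value such as 'bed6', 'bed6+' or 'bed6+3'.

    Returns (n_standard, has_plus, n_extra), where n_extra is None
    when no count follows the '+'.
    """
    match = re.fullmatch(r"bed(\d+)(\+(\d+)?)?", value)
    if match is None:
        raise ValueError(f"not a bedN[+[P]] type: {value!r}")
    n = int(match.group(1))
    if not 3 <= n <= 15:
        raise ValueError(f"N must be between 3 and 15, got {n}")
    has_plus = match.group(2) is not None
    extra = int(match.group(3)) if match.group(3) else None
    return n, has_plus, extra
```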

================================================================
========   bedToExons   ====================================
================================================================
### kent source version 362 ###
bedToExons - Split a bed up into individual beds.
One for each internal exon.
usage:
   bedToExons originalBeds.bed splitBeds.bed
options:
   -cdsOnly - Only output the coding portions of exons.

================================================================
========   bedToGenePred   ====================================
================================================================
### kent source version 362 ###
bedToGenePred - convert bed format files to genePred format
usage:
   bedToGenePred bedFile genePredFile

Convert a bed file to a genePred file. If BED has at least 12 columns,
then a genePred with blocks is created. Otherwise single-exon genePreds are
created.

================================================================
========   bedToPsl   ====================================
================================================================
### kent source version 362 ###
bedToPsl - convert bed format files to psl format
usage:
   bedToPsl chromSizes bedFile pslFile

Convert a BED file to a PSL file. The result is an alignment.
 It is intended to allow processing by tools that operate on PSL.
If the BED has at least 12 columns, then a PSL with blocks is created.
Otherwise single-exon PSLs are created.

Options:
-keepQuery  -  instead of creating a fake query, create PSL with identical query and
                target specs. Useful if bed features are to be lifted with pslMap and one 
                wants to keep the source location in the lift result.

================================================================
========   bedWeedOverlapping   ====================================
================================================================
### kent source version 362 ###
bedWeedOverlapping - Filter out beds that overlap a 'weed.bed' file.
usage:
   bedWeedOverlapping weeds.bed input.bed output.bed
options:
   -maxOverlap=0.N - maximum overlapping ratio, default 0 (any overlap)
   -invert - keep the overlapping and get rid of everything else

================================================================
========   bigBedInfo   ====================================
================================================================
### kent source version 362 ###
bigBedInfo - Show information about a bigBed file.
usage:
   bigBedInfo file.bb
options:
   -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs
   -chroms - list all chromosomes and their sizes
   -zooms - list all zoom levels and their sizes
   -as - get autoSql spec
   -extraIndex - list all the extra indexes

================================================================
========   bigBedNamedItems   ====================================
================================================================
### kent source version 362 ###
bigBedNamedItems - Extract item of given name from bigBed
usage:
   bigBedNamedItems file.bb name output.bed
options:
   -nameFile - if set, treat name parameter as file full of space delimited names
   -field=fieldName - use index on field name, default is "name"

================================================================
========   bigBedSummary   ====================================
================================================================
### kent source version 362 ###
bigBedSummary - Extract summary information from a bigBed file.
usage:
   bigBedSummary file.bb chrom start end dataPoints
Get summary data from bigBed for indicated region, broken into
dataPoints equal parts.  (Use dataPoints=1 for simple summary.)
options:
   -type=X where X is one of:
         coverage - % of region that is covered (default)
         mean - average depth of covered regions
         min - minimum depth of covered regions
         max - maximum depth of covered regions
   -fields - print out information on fields in file.
      If fields option is used, the chrom, start, end, dataPoints
      parameters may be omitted
   -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs

================================================================
========   bigBedToBed   ====================================
================================================================
### kent source version 362 ###
bigBedToBed v1 - Convert from bigBed to ascii bed format.
usage:
   bigBedToBed input.bb output.bed
options:
   -chrom=chr1 - if set restrict output to given chromosome
   -start=N - if set, restrict output to only that over start
   -end=N - if set, restrict output to only that under end
   -maxItems=N - if set, restrict output to first N items
   -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs

================================================================
========   bigMafToMaf   ====================================
================================================================
### kent source version 362 ###
bigMafToMaf - convert bigMaf to maf file
usage:
   bigMafToMaf bigMaf.bb file.maf
options:
   -xxx=XXX

================================================================
========   bigPslToPsl   ====================================
================================================================
### kent source version 362 ###
bigPslToPsl - convert bigPsl file to psl
usage:
   bigPslToPsl bigPsl.bb output.psl
options:
   -collapseStrand   if target strand is '+', don't output it

================================================================
========   bigWigAverageOverBed   ====================================
================================================================
### kent source version 362 ###
bigWigAverageOverBed v2 - Compute average score of big wig over each bed, which may have introns.
usage:
   bigWigAverageOverBed in.bw in.bed out.tab
The output columns are:
   name - name field from bed, which should be unique
   size - size of bed (sum of exon sizes)
   covered - # bases within exons covered by bigWig
   sum - sum of values over all bases covered
   mean0 - average over bases with non-covered bases counting as zeroes
   mean - average over just covered bases
Options:
   -stats=stats.ra - Output a collection of overall statistics to stats.ra file
   -bedOut=out.bed - Make output bed that is echo of input bed but with mean column appended
   -sampleAroundCenter=N - Take sample at region N bases wide centered around bed item, rather
                     than the usual sample in the bed item.
   -minMax - include two additional columns containing the min and max observed in the area.
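
The relationship between the output columns can be sketched in Python for a
single bed item. This is an illustration of the column definitions above, not
the program's code. The function name average_over_bed and the dict-based
signal representation are hypothetical, and returning 0.0 for empty items is
an assumption.

```python
def average_over_bed(blocks, values):
    """Compute size, covered, sum, mean0 and mean for one BED item.

    blocks: list of (start, end) exon intervals, 0-based half-open.
    values: dict mapping base position to signal value; positions
            absent from the dict count as not covered.
    """
    size = sum(end - start for start, end in blocks)
    covered_values = [values[pos]
                      for start, end in blocks
                      for pos in range(start, end)
                      if pos in values]
    covered = len(covered_values)
    total = sum(covered_values)
    mean0 = total / size if size else 0.0        # uncovered bases count as 0
    mean = total / covered if covered else 0.0   # covered bases only
    return size, covered, total, mean0, mean
```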

================================================================
========   bigWigCat   ====================================
================================================================
### kent source version 362 ###
bigWigCat v 4 - merge non-overlapping bigWig files
directly into bigWig format
usage:
   bigWigCat out.bw in1.bw in2.bw ...
Where in*.bw is in big wig format
and out.bw is the output indexed big wig file.
options:
   -itemsPerSlot=N - Number of data points bundled at lowest level. Default 1024

Note: must use wigToBigWig -fixedSummaries -keepAllChromosomes (perhaps in parallel cluster jobs) to create the input files.
Note: By non-overlapping we mean the entire span of each file, from first data point to last data point, must not overlap with that of other files.

================================================================
========   bigWigCluster   ====================================
================================================================
### kent source version 362 ###
bigWigCluster - Cluster bigWigs using a hacTree
usage:
   bigWigCluster input.list chrom.sizes output.json output.tab
where: input.list is a list of bigWig file names
       chrom.sizes is tab separated <chrom><size> for assembly for bigWigs
       output.json is json formatted output suitable for graphing with D3
       output.tab is tab-separated file of items ordered by tree with the fields
           label - label from -labels option or from file name with no dir or extension
           pos - number from 0-1 representing position according to tree and distance
           red - number from 0-255 representing recommended red component of color
           green - number from 0-255 representing recommended green component of color
           blue - number from 0-255 representing recommended blue component of color
           path - file name from input.list including directory and extension
options:
   -labels=fileName - label files from tabSeparated file with fields
           path - path to bigWig file
           label - a string with no tabs
   -precalc=precalc.tab - tab separated file with <file1> <file2> <distance>
            columns.
   -threads=N - number of threads to use, default 10
   -tmpDir=/tmp/path - place to put temp files, default current dir

================================================================
========   bigWigCorrelate   ====================================
================================================================
### kent source version 362 ###
bigWigCorrelate - Correlate bigWig files, optionally only on target regions.
usage:
   bigWigCorrelate a.bigWig b.bigWig
or
   bigWigCorrelate listOfFiles
options:
   -restrict=restrict.bigBed - restrict correlation to parts covered by this file
   -threshold=N.N - clip values to this threshold
   -rootNames - if set just report the root (minus directory and suffix) of file
                names when using listOfFiles
   -ignoreMissing - if set do not correlate where either side is missing data
                Normally missing data is treated as zeros

================================================================
========   bigWigInfo   ====================================
================================================================
### kent source version 362 ###
bigWigInfo - Print out information about bigWig file.
usage:
   bigWigInfo file.bw
options:
   -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs
   -chroms - list all chromosomes and their sizes
   -zooms - list all zoom levels and their sizes
   -minMax - list the min and max on a single line

================================================================
========   bigWigMerge   ====================================
================================================================
### kent source version 362 ###
bigWigMerge v2 - Merge together multiple bigWigs into a single output bedGraph.
You'll have to run bedGraphToBigWig to make the output bigWig.
The signal values are just added together to merge them
usage:
   bigWigMerge in1.bw in2.bw .. inN.bw out.bedGraph
options:
   -threshold=0.N - don't output values at or below this threshold. Default is 0.0
   -adjust=0.N - add adjustment to each value
   -clip=NNN.N - values higher than this are clipped to this value
   -inList - input files are lists of file names of bigWigs
   -max - merged value is maximum from input files rather than sum

================================================================
========   bigWigSummary   ====================================
================================================================
### kent source version 362 ###
bigWigSummary - Extract summary information from a bigWig file.
usage:
   bigWigSummary file.bigWig chrom start end dataPoints
Get summary data from bigWig for indicated region, broken into
dataPoints equal parts.  (Use dataPoints=1 for simple summary.)

NOTE:  start and end coordinates are in BED format (0-based)

options:
   -type=X where X is one of:
         mean - average value in region (default)
         min - minimum value in region
         max - maximum value in region
         std - standard deviation in region
         coverage - % of region that is covered
   -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs

================================================================
========   bigWigToBedGraph   ====================================
================================================================
### kent source version 362 ###
bigWigToBedGraph - Convert from bigWig to bedGraph format.
usage:
   bigWigToBedGraph in.bigWig out.bedGraph
options:
   -chrom=chr1 - if set restrict output to given chromosome
   -start=N - if set, restrict output to only that over start
   -end=N - if set, restrict output to only that under end
   -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs

================================================================
========   bigWigToWig   ====================================
================================================================
### kent source version 362 ###
bigWigToWig - Convert bigWig to wig.  This will keep more of the same structure of the
original wig than bigWigToBedGraph does, but still will break up large stepped sections
into smaller ones.
usage:
   bigWigToWig in.bigWig out.wig
options:
   -chrom=chr1 - if set restrict output to given chromosome
   -start=N - if set, restrict output to only that over start
   -end=N - if set, restrict output to only that under end
   -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs

================================================================
========   blastToPsl   ====================================
================================================================
### kent source version 362 ###
blastToPsl - Convert blast alignments to PSLs.

usage:
   blastToPsl [options] blastOutput psl

Options:
  -scores=file - Write score information to this file.  Format is:
       strands qName qStart qEnd tName tStart tEnd bitscore eVal
  -verbose=n - n >= 3 prints each line of file after parsing.
               n >= 4 dumps the result of each query
  -eVal=n n is e-value threshold to filter results. Format can be either
          an integer, double or 1e-10. Default is no filter.
  -pslx - create PSLX output (includes sequences for blocks)

Output only results of last round from PSI BLAST

================================================================
========   blastXmlToPsl   ====================================
================================================================
### kent source version 362 ###
blastXmlToPsl - convert blast XML output to PSLs
usage:
   blastXmlToPsl [options] blastXml psl

options:
  -scores=file - Write score information to this file.  Format is:
       strands qName qStart qEnd tName tStart tEnd bitscore eVal qDef tDef
  -verbose=n - n >= 3 prints each line of file after parsing.
               n >= 4 dumps the result of each query
  -eVal=n n is e-value threshold to filter results. Format can be either
          an integer, double or 1e-10. Default is no filter.
  -pslx - create PSLX output (includes sequences for blocks)
  -convertToNucCoords - convert protein to nucleic alignments to nucleic
   to nucleic coordinates
  -qName=src - define element used to obtain the qName.  The following
   values are supported:
     o query-ID - use contents of the <Iteration_query-ID> element if it
       exists, otherwise use <BlastOutput_query-ID>
     o query-def0 - use the first white-space separated word of the
       <Iteration_query-def> element if it exists, otherwise the first word
       of <BlastOutput_query-def>.
   Default is query-def0.
  -tName=src - define element used to obtain the tName.  The following
   values are supported:
     o Hit_id - use contents of the <Hit_id> element.
     o Hit_def0 - use the first white-space separated word of the
       <Hit_def> element.
     o Hit_accession - contents of the <Hit_accession> element.
   Default is Hit_def0.
  -forcePsiBlast - treat as output of PSI-BLAST. blast-2.2.16 and maybe
   others identify psiblast as blastp.
Output only results of last round from PSI BLAST

================================================================
========   blat   ====================================
================================================================
### kent source version 362 ###
blat - Standalone BLAT v. 36x2 fast sequence search command line tool
usage:
   blat database query [-ooc=11.ooc] output.psl
where:
   database and query are each either a .fa, .nib or .2bit file,
      or a list of these files with one file name per line.
   -ooc=11.ooc tells the program to load over-occurring 11-mers from
      an external file.  This will increase the speed
      by a factor of 40 in many cases, but is not required.
   output.psl is the name of the output file.
   Subranges of .nib and .2bit files may be specified using the syntax:
      /path/file.nib:seqid:start-end
   or
      /path/file.2bit:seqid:start-end
   or
      /path/file.nib:start-end
   With the second form, a sequence id of file:start-end will be used.
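
The subrange syntax above can be illustrated with a small Python parser. This
is a sketch of the notation only and not how blat itself parses arguments. The
function name parse_subrange is hypothetical, and the sketch assumes the file
path contains no colon.

```python
def parse_subrange(spec):
    """Split 'file:seqid:start-end' or 'file:start-end' into parts.

    Returns (path, seqid, start, end); seqid is None for the
    two-part form, and start/end are None when no range is given.
    """
    parts = spec.split(":")
    if len(parts) == 1:
        return spec, None, None, None
    if len(parts) > 3:
        raise ValueError(f"unexpected subrange spec: {spec!r}")
    path = parts[0]
    seqid = parts[1] if len(parts) == 3 else None
    start, end = parts[-1].split("-")
    return path, seqid, int(start), int(end)
```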
options:
   -t=type        Database type.  Type is one of:
                    dna - DNA sequence
                    prot - protein sequence
                    dnax - DNA sequence translated in six frames to protein
                  The default is dna.
   -q=type        Query type.  Type is one of:
                    dna - DNA sequence
                    rna - RNA sequence
                    prot - protein sequence
                    dnax - DNA sequence translated in six frames to protein
                    rnax - DNA sequence translated in three frames to protein
                  The default is dna.
   -prot          Synonymous with -t=prot -q=prot.
   -ooc=N.ooc     Use overused tile file N.ooc.  N should correspond to 
                  the tileSize.
   -tileSize=N    Sets the size of match that triggers an alignment.  
                  Usually between 8 and 12.
                  Default is 11 for DNA and 5 for protein.
   -stepSize=N    Spacing between tiles. Default is tileSize.
   -oneOff=N      If set to 1, this allows one mismatch in tile and still
                  triggers an alignment.  Default is 0.
   -minMatch=N    Sets the number of tile matches.  Usually set from 2 to 4.
                  Default is 2 for nucleotide, 1 for protein.
   -minScore=N    Sets minimum score.  This is the matches minus the 
                  mismatches minus some sort of gap penalty.  Default is 30.
   -minIdentity=N Sets minimum sequence identity (in percent).  Default is
                  90 for nucleotide searches, 25 for protein or translated
                  protein searches.
   -maxGap=N      Sets the size of maximum gap between tiles in a clump.  Usually
                  set from 0 to 3.  Default is 2. Only relevant for minMatch > 1.
   -noHead        Suppresses .psl header (so it's just a tab-separated file).
   -makeOoc=N.ooc Make overused tile file. Target needs to be complete genome.
   -repMatch=N    Sets the number of repetitions of a tile allowed before
                  it is marked as overused.  Typically this is 256 for tileSize
                  12, 1024 for tile size 11, 4096 for tile size 10.
                  Default is 1024.  Typically comes into play only with makeOoc.
                  Also affected by stepSize: when stepSize is halved, repMatch is
                  doubled to compensate.
   -mask=type     Mask out repeats.  Alignments won't be started in masked region
                  but may extend through it in nucleotide searches.  Masked areas
                  are ignored entirely in protein or translated searches. Types are:
                    lower - mask out lower-cased sequence
                    upper - mask out upper-cased sequence
                    out   - mask according to database.out RepeatMasker .out file
                    file.out - mask database according to RepeatMasker file.out
   -qMask=type    Mask out repeats in query sequence.  Similar to -mask above, but
                  for query rather than target sequence.
   -repeats=type  Type is same as mask types above.  Repeat bases will not be
                  masked in any way, but matches in repeat areas will be reported
                  separately from matches in other areas in the psl output.
   -minRepDivergence=NN   Minimum percent divergence of repeats to allow 
                  them to be unmasked.  Default is 15.  Only relevant for 
                  masking using RepeatMasker .out files.
   -dots=N        Output dot every N sequences to show program's progress.
   -trimT         Trim leading poly-T.
   -noTrimA       Don't trim trailing poly-A.
   -trimHardA     Remove poly-A tail from qSize as well as alignments in 
                  psl output.
   -fastMap       Run for fast DNA/DNA remapping - not allowing introns, 
                  requiring high %ID. Query sizes must not exceed 5000.
   -out=type      Controls output file format.  Type is one of:
                    psl - Default.  Tab-separated format, no sequence
                    pslx - Tab-separated format with sequence
                    axt - blastz-associated axt format
                    maf - multiz-associated maf format
                    sim4 - similar to sim4 format
                    wublast - similar to wublast format
                    blast - similar to NCBI blast format
                    blast8 - NCBI blast tabular format
                    blast9 - NCBI blast tabular format with comments
   -fine          For high-quality mRNAs, look harder for small initial and
                  terminal exons.  Not recommended for ESTs.
   -maxIntron=N  Sets maximum intron size. Default is 750000.
   -extendThroughN  Allows extension of alignment through large blocks of Ns.
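
The -repMatch guidance above (256 for tileSize 12, 1024 for tileSize 11,
4096 for tileSize 10, doubled when stepSize is halved) can be sketched in
Python. This is only an illustration of the figures quoted in the help text.
The function name typical_rep_match is hypothetical, and the sketch assumes
that stepSize defaults to tileSize and that repMatch scales by
tileSize/stepSize; neither assumption is taken from the blat source.

```python
# Typical -repMatch values quoted in the help text, keyed by tileSize.
TYPICAL_REP_MATCH = {12: 256, 11: 1024, 10: 4096}


def typical_rep_match(tile_size, step_size=None):
    """Return a typical repMatch for a tileSize/stepSize pair.

    Assumption: stepSize defaults to tileSize, and repMatch scales by
    tileSize/stepSize, so halving stepSize doubles repMatch.
    """
    if step_size is None:
        step_size = tile_size
    base = TYPICAL_REP_MATCH[tile_size]
    return base * tile_size // step_size
```

For example, typical_rep_match(12, 6) doubles the tileSize 12 figure because
the step is half the tile size.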
================================================================
========   calc   ====================================
================================================================
### kent source version 362 ###
calc - Little command line calculator
usage:
   calc this + that * theOther / (a + b)
Options:
  -h - output result as a human-readable integer, with k/m/g/t suffix

================================================================
========   catDir   ====================================
================================================================
catDir - concatenate files in directory to stdout.
For those times when there are too many files for cat to handle.
usage:
   catDir dir(s)
options:
   -r            Recurse into subdirectories
   -suffix=.suf  This will restrict things to files ending in .suf
   '-wild=*.???' This will match wildcards.
   -nonz         Prints file name of non-zero length files

================================================================
========   catUncomment   ====================================
================================================================
catUncomment - Concatenate input removing lines that start with '#'
Output goes to stdout
usage:
   catUncomment file(s)
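
The behavior described above is small enough to sketch in Python. This is not
the kent implementation; the function name cat_uncomment is hypothetical, and
the sketch assumes that only lines whose very first character is '#' are
dropped.

```python
import sys


def cat_uncomment(paths, out=sys.stdout):
    """Write every line of every file to out, skipping lines that start with '#'."""
    for path in paths:
        with open(path) as handle:
            for line in handle:
                # A line with leading whitespace before '#' is kept.
                if not line.startswith("#"):
                    out.write(line)
```

Calling cat_uncomment(["a.txt", "b.txt"]) would write the uncommented lines
of both files, in order, to standard output.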

================================================================
========   chainAntiRepeat   ====================================
================================================================
### kent source version 362 ###
chainAntiRepeat - Get rid of chains that are primarily the results of repeats and degenerate DNA
usage:
   chainAntiRepeat tNibDir qNibDir inChain outChain
options:
   -minScore=N - minimum score (after repeat stuff) to pass
   -noCheckScore=N - score that will pass without checks (speed tweak)

================================================================
========   chainFilter   ====================================
================================================================
### kent source version 362 ###
chainFilter - Filter chain files.  Output goes to standard out.
usage:
   chainFilter file(s)
options:
   -q=chr1,chr2 - restrict query side sequence to those named
   -notQ=chr1,chr2 - restrict query side sequence to those not named
   -t=chr1,chr2 - restrict target side sequence to those named
   -notT=chr1,chr2 - restrict target side sequence to those not named
   -id=N - only get one with ID number matching N
   -minScore=N - restrict to those scoring at least N
   -maxScore=N - restrict to those scoring less than N
   -qStartMin=N - restrict to those with qStart at least N
   -qStartMax=N - restrict to those with qStart less than N
   -qEndMin=N - restrict to those with qEnd at least N
   -qEndMax=N - restrict to those with qEnd less than N
   -tStartMin=N - restrict to those with tStart at least N
   -tStartMax=N - restrict to those with tStart less than N
   -tEndMin=N - restrict to those with tEnd at least N
   -tEndMax=N - restrict to those with tEnd less than N
   -qOverlapStart=N - restrict to those where the query overlaps a region starting here
   -qOverlapEnd=N - restrict to those where the query overlaps a region ending here
   -tOverlapStart=N - restrict to those where the target overlaps a region starting here
   -tOverlapEnd=N - restrict to those where the target overlaps a region ending here
   -strand=?    - restrict strand (to + or -)
   -long        - output in long format
   -zeroGap     - get rid of gaps of length zero
   -minGapless=N - pass those with minimum gapless block of at least N
   -qMinGap=N     - pass those with minimum gap size of at least N
   -tMinGap=N     - pass those with minimum gap size of at least N
   -qMaxGap=N     - pass those with maximum gap size no larger than N
   -tMaxGap=N     - pass those with maximum gap size no larger than N
   -qMinSize=N    - minimum size of spanned query region
   -qMaxSize=N    - maximum size of spanned query region
   -tMinSize=N    - minimum size of spanned target region
   -tMaxSize=N    - maximum size of spanned target region
   -noRandom      - suppress chains involving '_random' chromosomes
   -noHap         - suppress chains involving '_hap|_alt' chromosomes
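
A few of the filters above can be sketched in Python to show how the bounds
are read: -minScore keeps chains scoring at least N, while -maxScore keeps
chains scoring less than N. The function name filter_chains is hypothetical,
and the sketch assumes the usual chain header layout (chain score tName tSize
tStrand tStart tEnd qName qSize qStrand qStart qEnd id). It is not the kent
implementation.

```python
def filter_chains(lines, min_score=None, max_score=None, targets=None):
    """Yield chain-file lines for chains passing the given filters.

    Assumes each chain starts with a header line of the form:
    chain score tName tSize tStrand tStart tEnd qName qSize qStrand qStart qEnd id
    """
    keep = False
    for line in lines:
        if line.startswith("chain "):
            fields = line.split()
            score, t_name = float(fields[1]), fields[2]
            keep = True
            if min_score is not None and score < min_score:
                keep = False      # -minScore: at least N
            if max_score is not None and score >= max_score:
                keep = False      # -maxScore: less than N
            if targets is not None and t_name not in targets:
                keep = False      # -t: only the named targets
        # Block lines and the blank separator follow their header's fate.
        if keep:
            yield line
```

Passing an open file as lines, for example
filter_chains(open("in.chain"), min_score=2000), streams the surviving chains
without loading the whole file.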

================================================================
========   chainMergeSort   ====================================
================================================================
### kent source version 362 ###
chainMergeSort - Combine sorted files into larger sorted file
usage:
   chainMergeSort file(s)
Output goes to standard output
options:
   -saveId - keep the existing chain ids.
   -inputList=somefile - somefile contains list of input chain files.
   -tempDir=somedir/ - somedir has space for temporary sorting data, default ./

================================================================
========   chainNet   ====================================
================================================================
### kent source version 362 ###
chainNet - Make alignment nets out of chains
usage:
   chainNet in.chain target.sizes query.sizes target.net query.net
where:
   in.chain is the chain file sorted by score
   target.sizes contains the size of the target sequences
   query.sizes contains the size of the query sequences
   target.net is the output over the target genome
   query.net is the output over the query genome
options:
   -minSpace=N - minimum gap size to fill, default 25
   -minFill=N  - default half of minSpace
   -minScore=N - minimum chain score to consider, default 2000.0
   -verbose=N - Alter verbosity (default 1)
   -inclHap - include query sequences with names of the form *_hap*|*_alt*.
              Normally these are excluded from nets as being haplotype
              pseudochromosomes

================================================================
========   chainPreNet   ====================================
================================================================
### kent source version 362 ###
chainPreNet - Remove chains that don't have a chance of being netted
usage:
   chainPreNet in.chain target.sizes query.sizes out.chain
options:
   -dots=N - output a dot every so often
   -pad=N - extra to pad around blocks to decrease trash
            (default 1)
   -inclHap - include query sequences with names of the form *_hap*|*_alt*.
              Normally these are excluded from nets as being haplotype
              pseudochromosomes

================================================================
========   chainSort   ====================================
================================================================
### kent source version 362 ###
chainSort - Sort chains.  By default sorts by score.
Note this loads all chains into memory, so it is not
suitable for large sets.  Instead, run chainSort on
multiple small files, followed by chainMergeSort.
usage:
   chainSort inFile outFile
Note that inFile and outFile can be the same
options:
   -target sort on target start rather than score
   -query sort on query start rather than score
   -index=out.tab build simple two column index file
                    <out file position>  <value>
                  where <value> is score, target, or query 
                  depending on the sort.
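
The advice above (sort several small files, then combine them with
chainMergeSort) is the classic sort-then-merge pattern. A minimal Python
sketch follows. The function names and the (score, id) record shape are
hypothetical, and the sketch assumes that sorting by score means highest
score first.

```python
import heapq


def sort_batch(chains):
    """Sort one small batch of (score, chain_id) records in memory."""
    return sorted(chains, key=lambda c: c[0], reverse=True)


def merge_sorted(batches):
    """Lazily merge already-sorted batches into one sorted stream.

    Only one record per batch is held at a time, so the merge step
    does not need the whole data set in memory.
    """
    return heapq.merge(*batches, key=lambda c: c[0], reverse=True)
```

Each batch plays the role of one chainSort output file, and merge_sorted
plays the role of the final merge.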

================================================================
========   chainSplit   ====================================
================================================================
### kent source version 362 ###
chainSplit - Split chains up by target or query sequence
usage:
   chainSplit outDir inChain(s)
options:
   -q  - Split on query (default is on target)
   -lump=N  Lump together so there are only N split files.

================================================================
========   chainStitchId   ====================================
================================================================
### kent source version 362 ###
chainStitchId - Join chain fragments with the same chain ID into a single
   chain per ID.  Chain fragments must be from same original chain but
   must not overlap.  Chain fragment scores are summed.
usage:
   chainStitchId in.chain out.chain

================================================================
========   chainSwap   ====================================
================================================================
chainSwap - Swap target and query in chain
usage:
   chainSwap in.chain out.chain

================================================================
========   chainToAxt   ====================================
================================================================
### kent source version 362 ###
chainToAxt - Convert from chain to axt file
usage:
   chainToAxt in.chain tNibDirOr2bit qNibDirOr2bit out.axt
options:
   -maxGap=maximum gap size allowed without breaking, default 100
   -maxChain=maximum chain size allowed without breaking, default 1073741823
   -minScore=minimum score of chain
   -minId=minimum percentage ID within blocks
   -bed  Output bed instead of axt

================================================================
========   chainToPsl   ====================================
================================================================
chainToPsl - Convert chain file to psl format
usage:
   chainToPsl in.chain tSizes qSizes target.lst query.lst out.psl
Where tSizes and qSizes are tab-delimited files with
       <seqName><size>
columns.
The target and query lists can either be fasta files, nib files, 2bit files
or a list of fasta, 2bit and/or nib files one per line

================================================================
========   chainToPslBasic   ====================================
================================================================
### kent source version 362 ###
chainToPslBasic - Basic conversion chain file to psl format
usage:
   chainToPslBasic in.chain out.psl
If you need match and mismatch stats updated, pipe output through pslRecalcMatch

================================================================
========   checkAgpAndFa   ====================================
================================================================
### kent source version 362 ###

checkAgpAndFa - takes a .agp file and .fa file and ensures that they are in synch
usage:

   checkAgpAndFa in.agp in.fa

options:
   -exclude=seq - Ignore seq (e.g. chrM for which we usually get
                  sequence from GenBank but don't have AGP)
in.fa can be a .2bit file.  If it is .fa then sequences must appear
in the same order in .agp and .fa.


================================================================
========   checkCoverageGaps   ====================================
================================================================
### kent source version 362 ###
checkCoverageGaps - Check for biggest gap in coverage for a list of tracks.
For most tracks a coverage gap of 10,000,000 or more will indicate that there was
a mistake in generating the track.
usage:
   checkCoverageGaps database track1 ... trackN
Note: for bigWig and bigBeds, the biggest gap is rounded to the nearest 10,000 or so
options:
   -allParts  If set then include _hap and _random and other weird chroms
   -female If set then don't check chrY
   -noComma - Don't put commas in biggest gap output

================================================================
========   checkHgFindSpec   ====================================
================================================================
### kent source version 362 ###
checkHgFindSpec - test and describe search specs in hgFindSpec tables.
usage:
  checkHgFindSpec database [options | termToSearch]
If given a termToSearch, displays the list of tables that will be searched
and how long it took to figure that out; then performs the search and the
time it took.
options:
  -showSearches       Show the order in which tables will be searched in
                      general.  [This will be done anyway if no
                      termToSearch or options are specified.]
  -checkTermRegex     For each search spec that includes a regular
                      expression for terms, make sure that all values of
                      the table field to be searched match the regex.  (If
                      not, some of them could be excluded from searches.)
  -checkIndexes       Make sure that an index is defined on each field to
                      be searched.

================================================================
========   checkTableCoords   ====================================
================================================================
### kent source version 362 ###
checkTableCoords - check invariants on genomic coords in table(s).
usage:
  checkTableCoords database [tableName]
Searches for illegal genomic coordinates in all tables in database
unless narrowed down using options.  Uses ~/.hg.conf to determine
genome database connection info.  For psl/alignment tables, checks
target coords only.
options:
  -table=tableName  Check this table only.  (Default: all tables)
  -daysOld=N        Check tables that have been modified at most N days ago.
  -hoursOld=N       Check tables that have been modified at most N hours ago.
                    (days and hours are additive)
  -exclude=patList  Exclude tables matching any pattern in comma-separated
                    patList.  patList can contain wildcards (*?) but should
                    be escaped or single-quoted if it does.  patList can
                    contain "genbank" which will be expanded to all tables
                    generated by the automated genbank build process.
  -ignoreBlocks     To save time (but lose coverage), skip block coord checks.
  -verboseBlocks    Print out more details about illegal block coords, since 
                    they can't be found by simple SQL queries.

================================================================
========   chopFaLines   ====================================
================================================================
chopFaLines - Read in FA file with long lines and rewrite it with shorter lines
usage:
   chopFaLines in.fa out.fa

================================================================
========   chromGraphFromBin   ====================================
================================================================
### kent source version 362 ###
chromGraphFromBin - Convert chromGraph binary to ascii format.
usage:
   chromGraphFromBin in.chromGraph out.tab
options:
   -chrom=chrX - restrict output to single chromosome

================================================================
========   chromGraphToBin   ====================================
================================================================
### kent source version 362 ###
chromGraphToBin - Make binary version of chromGraph.
usage:
   chromGraphToBin in.tab out.chromGraph
options:
   -xxx=XXX

================================================================
========   colTransform   ====================================
================================================================
colTransform - Add and/or multiply column by constant.
usage:
   colTransform column input.tab addFactor mulFactor output.tab
where:
   column is the column to transform, starting with 1
   input.tab is the tab delimited input file
   addFactor is what to add.  Use 0 here to not change anything
   mulFactor is what to multiply by.  Use 1 here not to change anything
   output.tab is the tab delimited output file

================================================================
========   countChars   ====================================
================================================================
countChars - Count the number of occurrences of a particular char
usage:
   countChars char file(s)
Char can either be a two digit hexadecimal value or
a single letter literal character
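
The two accepted forms of the char argument can be sketched in Python. The
function names are hypothetical, and the rule used to tell the forms apart
(one character means literal, two characters means hexadecimal) is an
assumption drawn from the wording above, not from the kent source.

```python
def parse_char(spec):
    """Turn a command-line char spec into a single byte value."""
    if len(spec) == 1:
        return ord(spec)          # single literal character
    if len(spec) == 2:
        return int(spec, 16)      # two-digit hexadecimal value
    raise ValueError("expected one literal character or two hex digits")


def count_chars(spec, data):
    """Count occurrences of the character described by spec in bytes data."""
    return data.count(bytes([parse_char(spec)]))
```

With this sketch, a spec of 0a counts newlines and a spec of a counts the
letter a.
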
================================================================
========   crTreeIndexBed   ====================================
================================================================
### kent source version 362 ###
crTreeIndexBed - Create an index for a bed file.
usage:
   crTreeIndexBed in.bed out.cr
options:
   -blockSize=N - number of children per node in index tree. Default 1024
   -itemsPerSlot=N - number of items per index slot. Default is half block size
   -noCheckSort - Don't check sorting order of in.bed

================================================================
========   crTreeSearchBed   ====================================
================================================================
### kent source version 362 ###
crTreeSearchBed - Search a crTree indexed bed file and print all items that overlap query.
usage:
   crTreeSearchBed file.bed index.cr chrom start end

================================================================
========   dbSnoop   ====================================
================================================================
### kent source version 362 ###
dbSnoop - Produce an overview of a database.
usage:
   dbSnoop database output
options:
   -unsplit - if set will merge together tables split by chromosome
   -noNumberCommas - if set will leave out commas in big numbers
   -justSchema - only schema parts, no contents
   -skipTable=tableName - if set skip a given table name
   -profile=profileName - use profile for connection settings, default = 'db'

================================================================
========   dbTrash   ====================================
================================================================
### kent source version 362 ###
dbTrash - drop tables from a database older than specified N hours
usage:
   dbTrash -age=N [-drop] [-historyToo] [-db=<DB>] [-verbose=N]
options:
   -age=N - number of hours old to qualify for drop.  N can be a float.
   -drop - actually drop the tables, default is merely to display tables.
   -db=<DB> - Specify a database to work with, default is customTrash.
   -historyToo - also consider the table called 'history' for deletion.
               - default is to leave 'history' alone no matter how old.
               - this applies to the table 'metaInfo' also.
   -extFile    - check extFile for lines that reference files
               - no longer in trash
   -extDel     - delete lines in extFile that fail file check
               - otherwise just verbose(2) lines that would be deleted
   -topDir     - directory name to prepend to file names in extFile
               - default is /usr/local/apache/trash
               - file names in extFile are typically: "../trash/ct/..."
   -tableStatus  - use 'show table status' to get size data, very inefficient
   -delLostTable - delete tables that exist but are missing from metaInfo
                 - this operation can be even slower than -tableStatus
                 - if there are many tables to check.
   -verbose=N - 2 == show arguments, dates, and dropped tables,
              - 3 == show date information for all tables.
================================================================
========   estOrient   ====================================
================================================================
### kent source version 362 ###
usage:
   estOrient [options] db estTable outPsl

Read ESTs from a database and determine orientation based on
estOrientInfo table or direction in gbCdnaInfo table.  Update
PSLs so that the strand reflects the direction of transcription.
By default, PSLs where the direction can't be determined are dropped.

Options:
   -chrom=chr - process this chromosome, may be repeated
   -keepDisoriented - don't drop ESTs where orientation can't
    be determined.
   -disoriented=psl - output ESTs where orientation can't
    be determined to this file.
   -inclVer - add NCBI version number to accession if not already
    present.
   -fileInput - estTable is a psl file
   -estOrientInfo=file - instead of getting the orientation information
    from the estOrientInfo table, load it from this file.  This data is the
    output of polyInfo command.  If this option is specified, the direction
    will not be looked up in the gbCdnaInfo table and db can be `no'.
   -info=infoFile - write information about each EST to this tab
    separated file 
       qName tName tStart tEnd origStrand newStrand orient
    where orient is < 0 if PSL was reversed, > 0 if it was left unchanged
    and 0 if the orientation couldn't be determined (and was left
    unchanged).

================================================================
========   expMatrixToBarchartBed   ====================================
================================================================
usage: expMatrixToBarchartBed [-h] [--groupOrderFile GROUPORDERFILE]
                              [--useMean] [--verbose]
                              sampleFile matrixFile bedFile outputFile

Generate a barChart bed6+5 file from a matrix, meta data, and coordinates.

positional arguments:
  sampleFile            Two column no header, the first column is the samples
                        which should match the matrix, the second is the
                        grouping (cell type, tissue, etc)
  matrixFile            The input matrix file. The samples in the first row
                        should exactly match the ones in the sampleFile. The
                        labels (ex ENST*****) in the first column should
                        exactly match the ones in the bed file.
  bedFile               Bed6+1 format. File that maps the column labels from
                        the matrix to coordinates. Tab separated; chr, start
                        coord, end coord, label, score, strand, gene name. The
                        score column is ignored.
  outputFile            The output file, bed 6+5 format. See the schema in
                        kent/src/hg/lib/barChartBed.as.

optional arguments:
  -h, --help            show this help message and exit
  --groupOrderFile GROUPORDERFILE
                        Optional file to define the group order, list the
                        groups in a single column in the order desired. The
                        default ordering is alphabetical.
  --useMean             Calculate the group values using mean rather than
                        median.
  --verbose             Show runtime messages.
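
The grouping step described above (one value per group, median by default,
mean with --useMean, groups in alphabetical order by default) can be sketched
in Python. The function name group_values and its argument shapes are
hypothetical; this is an illustration of the idea, not the script's code.

```python
from statistics import mean, median


def group_values(samples, groups, values, use_mean=False):
    """Collapse one matrix row into one value per group.

    samples: sample names from the matrix header row
    groups:  dict mapping sample name -> group (the two-column sampleFile)
    values:  expression values for one row, in the same order as samples
    Groups are returned in alphabetical order, the stated default.
    """
    by_group = {}
    for sample, value in zip(samples, values):
        by_group.setdefault(groups[sample], []).append(value)
    summarize = mean if use_mean else median
    return {g: summarize(by_group[g]) for g in sorted(by_group)}
```

The median is less sensitive than the mean to a single outlying sample, which
is one reason it makes a reasonable default.
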
================================================================
========   faAlign   ====================================
================================================================
### kent source version 362 ###
faAlign - Align two fasta files
usage:
   faAlign target.fa query.fa output.axt
options:
   -dna - use DNA scoring scheme

================================================================
========   faCmp   ====================================
================================================================
### kent source version 362 ###
faCmp - Compare two .fa files
usage:
   faCmp [options] a.fa b.fa
options:
    -softMask - use the soft masking information during the compare
                Differences will be noted if the masking is different.
    -sortName - sort input files by name before comparing
    -peptide - read as peptide sequences
default:
    no masking information is used during compare.  It is as if both
    sequences were not masked.

Exit codes:
   - 0 if files are the same
   - 1 if files differ
   - 255 on an error


================================================================
========   faCount   ====================================
================================================================
### kent source version 362 ###
faCount - count base statistics and CpGs in FA files.
usage:
   faCount file(s).fa
     -summary  show only summary statistics
     -dinuc    include statistics on dinucleotide frequencies
     -strands  count bases on both strands
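
The kind of statistics involved can be sketched in Python for a single
sequence. The function name fa_count and the returned keys are hypothetical,
and treating upper and lower case alike is an assumption; the real output
columns are not reproduced here.

```python
from collections import Counter


def fa_count(sequence):
    """Return per-base counts and the number of CG dinucleotides."""
    seq = sequence.upper()        # assumption: case is ignored
    counts = Counter(seq)
    result = {base: counts.get(base, 0) for base in "ACGTN"}
    result["cpg"] = seq.count("CG")
    result["len"] = len(seq)
    return result
```

A CpG here is simply a C immediately followed by a G on the given strand.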

================================================================
========   faFilter   ====================================
================================================================
### kent source version 362 ###
faFilter - Filter fa records, selecting ones that match the specified conditions
usage:
   faFilter [options] in.fa out.fa

Options:
    -name=wildCard  - Only pass records where name matches wildcard
                      * matches any string or no character.
                      ? matches any single character.
                      anything else must match the character exactly
                      (these will need to be quoted for the shell)
    -namePatList=filename - A list of regular expressions, one per line, that
                            will be applied to the fasta name the same as -name
    -v - invert match, select non-matching records.
    -minSize=N - Only pass sequences at least this big.
    -maxSize=N - Only pass sequences this size or smaller.
    -maxN=N Only pass sequences with fewer than this number of N's
    -uniq - Removes duplicate sequence ids, keeping the first.
    -i    - make -uniq ignore case so sequence IDs ABC and abc count as dupes.

All specified conditions must pass to pass a sequence.  If no conditions are
specified, all records will be passed.
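
The rule that every specified condition must pass can be sketched in Python.
The function name passes is hypothetical. The sketch uses fnmatchcase from the
standard library for the * and ? wildcards; note that fnmatchcase also gives
[ and ] a special meaning, which the help text above does not mention, so the
two matchers may differ on names containing brackets.

```python
from fnmatch import fnmatchcase


def passes(name, sequence, name_pattern=None, min_size=None, max_size=None):
    """Return True when a record meets every condition that was supplied."""
    if name_pattern is not None and not fnmatchcase(name, name_pattern):
        return False              # -name=wildCard
    if min_size is not None and len(sequence) < min_size:
        return False              # -minSize: at least this big
    if max_size is not None and len(sequence) > max_size:
        return False              # -maxSize: this size or smaller
    return True
```

With no conditions supplied, every record passes, matching the last sentence
of the help text.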

================================================================
========   faFilterN   ====================================
================================================================
faFilterN - Get rid of sequences with too many N's
usage:
   faFilterN in.fa out.fa maxPercentN
options:
   -out=in.fa.out
   -uniq=self.psl

================================================================
========   faFrag   ====================================
================================================================
faFrag - Extract a piece of DNA from a .fa file.
usage:
   faFrag in.fa start end out.fa
options:
   -mixed - preserve mixed-case in FASTA file

================================================================
========   faNoise   ====================================
================================================================
faNoise - Add noise to .fa file
usage:
   faNoise inName outName transitionPpt transversionPpt insertPpt deletePpt chimeraPpt
options:
   -upper - output in upper case

================================================================
========   faOneRecord   ====================================
================================================================
faOneRecord - Extract a single record from a .FA file
usage:
   faOneRecord in.fa recordName

================================================================
========   faPolyASizes   ====================================
================================================================
### kent source version 362 ###
faPolyASizes - get poly A sizes
usage:
   faPolyASizes in.fa out.tab

output file has four columns:
   id seqSize tailPolyASize headPolyTSize

options:

================================================================
========   faRandomize   ====================================
================================================================
### kent source version 362 ###
faRandomize - Program to create random fasta records
usage:
  faRandomize [-seed=N] in.fa randomized.fa
    Use optional -seed argument to specify seed (integer) for random
    number generator (rand).  Generated sequence has the
    same base frequency as seen in original fasta records.
================================================================
========   faRc   ====================================
================================================================
faRc - Reverse complement a FA file
usage:
   faRc in.fa out.fa
In.fa and out.fa may be the same file.
options:
   -keepName - keep name identical (don't prepend RC)
   -keepCase - works well for ACGTUN in either case. bizarre for other letters.
               without it bases are turned to lower, all else to n's
   -justReverse - prepends R unless asked to keep name
   -justComplement - prepends C unless asked to keep name
                     (cannot appear together with -justReverse)
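
The default behavior described above (without -keepCase, bases are turned to
lower case and everything else to n) can be sketched in Python. The function
name reverse_complement is hypothetical, the sketch covers only a, c, g and t,
and it leaves out name handling and the U base.

```python
COMPLEMENT = str.maketrans("acgtn", "tgcan")


def reverse_complement(sequence):
    """Reverse complement, lower-casing bases and turning the rest into n."""
    lowered = sequence.lower()
    # Anything that is not a, c, g or t becomes n before complementing.
    cleaned = "".join(base if base in "acgt" else "n" for base in lowered)
    return cleaned.translate(COMPLEMENT)[::-1]
```

Applying the function twice returns the cleaned, lower-cased input, which is
a quick sanity check for any reverse-complement routine.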

================================================================
========   faSize   ====================================
================================================================
### kent source version 362 ###
faSize - print total base count in fa files.
usage:
   faSize file(s).fa
Command flags
   -detailed        outputs name and size of each record
                    has the side effect of printing nothing else
   -tab             output statistics in a tab separated format

================================================================
========   faSomeRecords   ====================================
================================================================
### kent source version 362 ###
faSomeRecords - Extract multiple fa records
usage:
   faSomeRecords in.fa listFile out.fa
options:
   -exclude - output sequences not in the list file.

================================================================
========   faSplit   ====================================
================================================================
### kent source version 362 ###
faSplit - Split an fa file into several files.
usage:
   faSplit how input.fa count outRoot
where how is either 'about' 'byname' 'base' 'gap' 'sequence' or 'size'.  
Files split by sequence will be broken at the nearest fa record boundary. 
Files split by base will be broken at any base.  
Files broken by size will be broken every count bases.

Examples:
   faSplit sequence estAll.fa 100 est
This will break up estAll.fa into 100 files
(numbered est001.fa, est002.fa, ... est100.fa)
Files will only be broken at fa record boundaries

   faSplit base chr1.fa 10 1_
This will break up chr1.fa into 10 files

   faSplit size input.fa 2000 outRoot
This breaks up input.fa into 2000 base chunks

   faSplit about est.fa 20000 outRoot
This will break up est.fa into files of about 20000 bytes each by record.

   faSplit byname scaffolds.fa outRoot/ 
This breaks up scaffolds.fa using sequence names as file names.
       Use the terminating / on the outRoot to get it to work correctly.

   faSplit gap chrN.fa 20000 outRoot
This breaks up chrN.fa into files of at most 20000 bases each, 
at gap boundaries if possible.  If the sequence ends in N's, the last
piece, if larger than 20000, will be all one piece.

Options:
    -verbose=2 - Write names of each file created (=3 more details)
    -maxN=N - Suppress pieces with more than maxN n's.  Only used with size.
              default is size-1 (only suppresses pieces that are all N).
    -oneFile - Put output in one file. Only used with size
    -extra=N - Add N extra bytes at the end to form overlapping pieces.  Only used with size.
    -out=outFile Get masking from outfile.  Only used with size.
    -lift=file.lft Put info on how to reconstruct sequence from
                   pieces in file.lft.  Only used with size and gap.
    -minGapSize=X Consider a block of Ns to be a gap if block size >= X.
                  Default value 1000.  Only used with gap.
    -noGapDrops - include all N's when splitting by gap.
    -outDirDepth=N Create N levels of output directory under current dir.
                   This helps prevent NFS problems with a large number of
                   files in a directory.  Using -outDirDepth=3 would
                   produce ./1/2/3/outRoot123.fa.
    -prefixLength=N - Used with byname option. Create a separate output
                   file for each group of sequence names with the same prefix
                   of length N.
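
As a rough illustration of the 'size' mode and the -extra option described
above, the Python sketch below cuts one sequence into pieces that start every
count bases. The function name split_by_size and its parameters are
hypothetical. This is not the faSplit implementation, and it ignores -maxN,
masking and lift files.

```python
def split_by_size(seq, count, extra=0):
    """Cut seq into pieces that start every `count` bases.

    Each piece also carries up to `extra` bases from the start of the next
    piece, loosely mirroring the overlap that -extra=N describes.
    Returns a list of (start offset, piece) tuples.
    """
    if count <= 0:
        raise ValueError("count must be positive")
    pieces = []
    for start in range(0, len(seq), count):
        end = min(start + count + extra, len(seq))
        pieces.append((start, seq[start:end]))
    return pieces


# Toy input, not real data.
for start, piece in split_by_size("ACGTACGTAC", 4, extra=1):
    print(start, piece)
```

The start offsets are kept with each piece because that is the kind of
information a lift file would need to reconstruct the original sequence.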

================================================================
========   faToFastq   ====================================
================================================================
### kent source version 362 ###
faToFastq - Convert fa to fastq format, just faking quality values.
usage:
   faToFastq in.fa out.fastq
options:
   -qual=X quality letter to use.  Default is '<' which is good I think....
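
A minimal Python sketch of the idea, assuming plain multi-line FASTA input:
every base gets the same placeholder quality letter. The function name
fa_to_fastq is hypothetical and this is not the faToFastq source.

```python
def fa_to_fastq(fa_text, qual="<"):
    """Convert FASTA text to FASTQ text, repeating one quality letter."""
    records = []
    name, chunks = None, []
    for line in fa_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    # One four-line FASTQ record per FASTA record.
    return "".join(
        f"@{rec_name}\n{seq}\n+\n{qual * len(seq)}\n" for rec_name, seq in records
    )


print(fa_to_fastq(">r1\nAC\nGT\n"), end="")
```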

================================================================
========   faToTab   ====================================
================================================================
faToTab - convert fa file to tab separated file
usage:
   faToTab infileName outFileName
options:
     -type=seqType   sequence type, dna or protein, default is dna
     -keepAccSuffix - don't strip dot version off of sequence id, keep as is

================================================================
========   faToTwoBit   ====================================
================================================================
### kent source version 362 ###
faToTwoBit - Convert DNA from fasta to 2bit format
usage:
   faToTwoBit in.fa [in2.fa in3.fa ...] out.2bit
options:
   -long          use 64-bit offsets for index.   Allow for twoBit to contain more than 4Gb of sequence. 
                  NOT COMPATIBLE WITH OLDER CODE.
   -noMask        Ignore lower-case masking in fa file.
   -stripVersion  Strip off version number after '.' for GenBank accessions.
   -ignoreDups    Convert first sequence only if there are duplicate sequence
                  names.  Use 'twoBitDup' to find duplicate sequences.
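
For orientation, the Python sketch below shows the basic packing idea behind
a 2-bit encoding: four bases per byte, first base in the most significant
bits, using the code T=0, C=1, A=2, G=3. The function name pack_two_bit is
hypothetical. The sketch covers only the packed bases. It does not write a
.2bit file (no header, index, N blocks or mask blocks) and it rejects any
character other than A, C, G or T.

```python
# Two-bit code per base (assumed here: T=00, C=01, A=10, G=11).
CODE = {"T": 0, "C": 1, "A": 2, "G": 3}


def pack_two_bit(dna):
    """Pack a DNA string into bytes, four bases per byte."""
    packed = bytearray()
    for i in range(0, len(dna), 4):
        chunk = dna[i:i + 4].upper()
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]  # KeyError on N or other letters
        # Pad a short final chunk with zero bits on the right.
        byte <<= 2 * (4 - len(chunk))
        packed.append(byte)
    return bytes(packed)


print(pack_two_bit("TCAG").hex())
```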
================================================================
========   faTrans   ====================================
================================================================
### kent source version 362 ###
faTrans - Translate DNA .fa file to peptide
usage:
   faTrans in.fa out.fa
options:
   -stop stop at first stop codon (otherwise puts in Z for stop codons)
   -offset=N start at a particular offset.
   -cdsUpper - cds is in upper case
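
The effect of -stop and -offset can be sketched in Python as follows,
assuming the standard genetic code. The function name translate and the use
of 'X' for codons that are not plain A/C/G/T are assumptions of this sketch,
not taken from faTrans.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, offset=0, stop=False):
    """Translate DNA to peptide starting at `offset`.

    With stop=True translation ends at the first stop codon; otherwise
    stop codons are written as 'Z', as the usage text describes.
    """
    dna = dna.upper()
    peptide = []
    for i in range(offset, len(dna) - 2, 3):
        aa = CODONS.get(dna[i:i + 3], "X")  # 'X' is this sketch's choice
        if aa == "*":
            if stop:
                break
            aa = "Z"
        peptide.append(aa)
    return "".join(peptide)


print(translate("ATGGCCTAAGGG"))
print(translate("ATGGCCTAAGGG", stop=True))
```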

================================================================
========   fastqStatsAndSubsample   ====================================
================================================================
### kent source version 362 ###
fastqStatsAndSubsample v2 - Go through a fastq file doing sanity checks and collecting stats
and also producing a smaller fastq out of a sample of the data.  The fastq input may be
compressed with gzip or bzip2.
usage:
   fastqStatsAndSubsample in.fastq out.stats out.fastq
options:
   -sampleSize=N - default 100000
   -seed=N - Use given seed for random number generator.  Default 0.
   -smallOk - Not an error if less than sampleSize reads.  out.fastq will be entire in.fastq
   -json - out.stats will be in json rather than text format
Use /dev/null for out.fastq and/or out.stats if not interested in these outputs
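
One common way to draw a fixed-size sample in a single pass is reservoir
sampling. The Python sketch below uses that technique to show how
-sampleSize, -seed and -smallOk could interact. It is an illustration only:
the sampling method actually used by fastqStatsAndSubsample is not stated
above, and the function name subsample_reads is hypothetical.

```python
import random


def subsample_reads(reads, sample_size=100000, seed=0, small_ok=False):
    """Return up to sample_size reads chosen uniformly at random."""
    rng = random.Random(seed)  # fixed seed gives a repeatable sample
    reservoir = []
    total = 0
    for read in reads:
        total += 1
        if len(reservoir) < sample_size:
            reservoir.append(read)
        else:
            # Replace a random slot with probability sample_size / total.
            j = rng.randrange(total)
            if j < sample_size:
                reservoir[j] = read
    if total < sample_size and not small_ok:
        raise ValueError(f"only {total} reads, need {sample_size}")
    return reservoir


toy_reads = [f"read{i}" for i in range(1000)]
print(len(subsample_reads(toy_reads, sample_size=10, seed=0)))
```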

================================================================
========   fastqToFa   ====================================
================================================================
### kent source version 362 ###
#	no name checks will be made on lines beginning with @
#	ignore quality scores
#	using default Phred quality score algorithm
#	all errors will cause exit
fastqToFa - Convert from fastq to fasta format.
usage:
   fastqToFa [options] in.fastq out.fa
options:
   -nameVerify='string' - for multi-line fastq files, 'string' must
	match somewhere in the sequence names in order to correctly
	identify the next sequence block (e.g.: -nameVerify='Supercontig_')
   -qual=file.qual.fa - output quality scores to specified file
	(default: quality scores are ignored)
   -qualSizes=qual.sizes - write sizes file for the quality scores
   -noErrors - warn only on problems, do not error out
              (specify -verbose=3 to see warnings)
   -solexa - use Solexa/Illumina quality score algorithm
	(instead of Phred quality)
   -verbose=2 - set warning level to get some stats output during processing
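
A minimal Python sketch of the default behaviour (quality scores ignored),
assuming simple four-line FASTQ records. The function names are hypothetical.
The helper phred_scores assumes the common Phred+33 ASCII encoding, which is
an assumption of this sketch and is not stated in the usage text above.

```python
def fastq_to_fa(fastq_text):
    """Convert four-line FASTQ records to FASTA, dropping quality lines."""
    lines = [ln for ln in fastq_text.splitlines() if ln.strip()]
    if len(lines) % 4 != 0:
        raise ValueError("expected four lines per record")
    out = []
    for i in range(0, len(lines), 4):
        header, seq, plus, _qual = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record at line {i + 1}")
        out.append(f">{header[1:]}\n{seq}\n")
    return "".join(out)


def phred_scores(qual_line, ascii_offset=33):
    """Decode a quality string to integers (Phred+33 assumed)."""
    return [ord(ch) - ascii_offset for ch in qual_line]


print(fastq_to_fa("@r1\nACGT\n+\nIIII\n"), end="")
```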
================================================================
========   featureBits   ====================================
================================================================
### kent source version 362 ###
featureBits - Correlate tables via bitmap projections. 
usage:
   featureBits database table(s)
This will return the number of bits in all the tables anded together
Pipe warning:  output goes to stderr.
Options:
   -bed=output.bed   Put intersection into bed format. Can use stdout.
   -fa=output.fa     Put sequence in intersection into .fa file
   -faMerge          For fa output merge overlapping features.
   -minSize=N        Minimum size to output (default 1)
   -chrom=chrN       Restrict to one chromosome
   -chromSize=sizefile       Read chrom sizes from file instead of database. 
                             (chromInfo three column format)
   -or               Or tables together instead of anding them
   -not              Output negation of resulting bit set.
   -countGaps        Count gaps in denominator
   -noRandom         Don't include _random (or Un) chromosomes
   -noHap            Don't include _hap|_alt chromosomes
   -dots=N           Output dot every N chroms (scaffolds) processed
   -minFeatureSize=n Don't include bits of the track that are smaller than
                     minFeatureSize, useful for differentiating between
                     alignment gaps and introns.
   -bin=output.bin   Put bin counts in output file
   -binSize=N        Bin size for generating counts in bin file (default 500000)
   -binOverlap=N     Bin overlap for generating counts in bin file (default 250000)
   -bedRegionIn=input.bed    Read in a bed file for bin counts in specific regions 
                     and write to bedRegionOut
   -bedRegionOut=output.bed  Write a bed file of bin counts in specific regions 
                     from bedRegionIn
   -enrichment       Calculates coverage and enrichment assuming first table
                     is reference gene track and second track something else
                     Enrichment is the amount of table1 that covers table2 vs. the
                     amount of table1 that covers the genome. It's how much denser
                     table1 is in table2 than it is genome-wide.
   '-where=some sql pattern'  Restrict to features matching some sql pattern
You can include a '!' before a table name to negate it.
   To prevent your shell from interpreting the '!' you will need
   to use the backslash \!, for example the gap table: \!gap
Some table names can be followed by modifiers such as:
    :exon:N          Break into exons and add N to each end of each exon
    :cds             Break into coding exons
    :intron:N        Break into introns, remove N from each end
    :utr5, :utr3     Break into 5' or 3' UTRs
    :upstream:N      Consider the region of N bases before region
    :end:N           Consider the region of N bases after region
    :score:N         Consider records with score >= N 
    :upstreamAll:N   Like upstream, but doesn't filter out genes that 
                     have txStart==cdsStart or txEnd==cdsEnd
    :endAll:N        Like end, but doesn't filter out genes that 
                     have txStart==cdsStart or txEnd==cdsEnd
The tables can be bed, psl, or chain files, or a directory full of
such files as well as actual database tables.  To count the bits
used in dir/chrN_something*.bed you'd do:
   featureBits database dir/_something.bed
NB: by default, featureBits omits gap regions from its calculation of the total
number of bases.  This requires connecting to a database server using credentials
from a .hg.conf file (or similar).  If such a connection is not available, you will
need to specify -countGaps (which skips the database connection) in addition to
providing all tables as files or directories.
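
The "bitmap projection" idea can be sketched in Python on a toy chromosome:
each table becomes one bit per base, and the bitmaps are combined with AND
(the default), OR (-or) or negation (-not). The function names and the
interval lists are made up for illustration. Coordinates are assumed to be
zero-based, half-open (start, end) pairs. This is not the featureBits
implementation and it does no gap handling.

```python
def to_bitmap(intervals, chrom_size):
    """Project (start, end) intervals onto one boolean per base."""
    bits = [False] * chrom_size
    for start, end in intervals:
        for pos in range(max(start, 0), min(end, chrom_size)):
            bits[pos] = True
    return bits


def feature_bits(tables, chrom_size, use_or=False, negate=False):
    """Return (bases covered, chrom_size) for the combined tables."""
    maps = [to_bitmap(table, chrom_size) for table in tables]
    combine = any if use_or else all
    result = [combine(column) for column in zip(*maps)]
    if negate:
        result = [not bit for bit in result]
    return sum(result), chrom_size


genes = [(10, 50), (70, 90)]  # toy intervals
repeats = [(40, 80)]
covered, total = feature_bits([genes, repeats], chrom_size=100)
print(f"{covered} bases of {total} in intersection")
```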

================================================================
========   fetchChromSizes   ====================================
================================================================
usage: fetchChromSizes <db> > <db>.chrom.sizes
    used to fetch chrom.sizes information from UCSC for the given <db>
<db> - name of UCSC database, e.g.: hg38, hg18, mm9, etc ...

This script expects to find one of the following commands:
    wget, mysql, or ftp in order to fetch information from UCSC.
Route the output to the file <db>.chrom.sizes as indicated above.
This data is available at the URL:
  http://hgdownload.cse.ucsc.edu/goldenPath/<db>/bigZips/<db>.chrom.sizes

Example:   fetchChromSizes hg38 > hg38.chrom.sizes
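
A small Python sketch of working with the result, assuming the chrom.sizes
file holds one chromosome name and one size per line separated by white
space. The URL pattern is the one given above. The function names and the
toy file contents are hypothetical, and nothing is downloaded here.

```python
def chrom_sizes_url(db):
    """Build the download URL for a given UCSC database name."""
    return (
        "http://hgdownload.cse.ucsc.edu/goldenPath/"
        f"{db}/bigZips/{db}.chrom.sizes"
    )


def parse_chrom_sizes(text):
    """Parse chrom.sizes text into a {chrom: size} dictionary."""
    sizes = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        chrom, size = line.split()[:2]
        sizes[chrom] = int(size)
    return sizes


print(chrom_sizes_url("hg38"))
print(parse_chrom_sizes("chrA\t1000\nchrB\t250\n"))  # toy values
```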
================================================================
========   findMotif   ====================================
================================================================
### kent source version 362 ###
findMotif - find specified motif in sequence
usage:
   findMotif [options] -motif=<acgt...> sequence
where:
   sequence is a .fa, .nib or .2bit file or a file which is a list of sequence files.
options:
   -motif=<acgt...> - search for this specified motif (case ignored, [acgt] only)
   -chr=<chrN> - process only this one chrN from the sequence
   -strand=<+|-> - limit to only one strand.  Default is both.
   -bedOutput - output bed format (this is the default)
   -wigOutput - output wiggle data format instead of bed file
   -verbose=N - set information level [1-4]
   NOTE: motif must be longer than 4 characters, less than 17
   -verbose=4 - will display gaps as bed file data lines to stderr
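
As a sketch of the search itself, the Python below scans one sequence for a
motif on both strands, enforcing the length rule from the NOTE above. The
function name find_motif and the returned tuple layout (name, start, end,
strand) are assumptions for illustration, not the findMotif output format.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def find_motif(name, seq, motif, strand=None):
    """Return (name, start, end, strand) hits, zero-based half-open."""
    motif = motif.lower()
    if not 4 < len(motif) < 17:
        raise ValueError("motif must be longer than 4 and shorter than 17")
    if set(motif) - set("acgt"):
        raise ValueError("motif may contain only a, c, g, t")
    seq = seq.lower()
    # A minus-strand hit is a plus-strand match of the reverse complement.
    rev_comp = motif.translate(COMPLEMENT)[::-1]
    size = len(motif)
    hits = []
    for start in range(len(seq) - size + 1):
        window = seq[start:start + size]
        if strand in (None, "+") and window == motif:
            hits.append((name, start, start + size, "+"))
        if strand in (None, "-") and window == rev_comp:
            hits.append((name, start, start + size, "-"))
    return hits


for hit in find_motif("chrT", "ttGATTACAggTGTAATCaa", "gattaca"):
    print(*hit, sep="\t")
```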
================================================================
========   gapToLift   ====================================
================================================================
### kent source version 362 ###
gapToLift - create lift file from gap table(s)
usage:
   gapToLift [options] db liftFile.lft
       uses gap table(s) from specified db.  Writes to liftFile.lft
       generates lift file segments separated by non-bridged gaps.
options:
   -chr=chrN - work only on given chrom
   -minGap=M - examine only gaps >= M
   -insane - do *not* perform coordinate sanity checks on gaps
   -bedFile=fileName.bed - output segments to fileName.bed
   -verbose=N - N > 1 see more information about procedure
================================================================
========   genePredCheck   ====================================
================================================================
### kent source version 362 ###
genePredCheck - validate genePred files or tables
usage:
   genePredCheck [options] fileTbl ..

If fileTbl is an existing file, then it is checked.  Otherwise, if -db
is provided, then a table by this name in db is checked.

options:
   -db=db - If specified, then this database is used to
          get chromosome sizes, and perhaps the table to check.
   -chromSizes=file.chrom.sizes - use chrom sizes from tab separated
          file (name,size) instead of from chromInfo table in specified db.
================================================================
========   genePredFilter   ====================================
================================================================
### kent source version 362 ###
genePredFilter - filter a genePred file
usage:
   genePredFilter [options] genePredIn genePredOut

Filter a genePredFile, dropping invalid entries

options:
   -db=db - If specified, then this database is used to
    get chromosome sizes.
   -verbose=2 - level >= 2 prints out errors for each problem found.


================================================================
========   genePredHisto   ====================================
================================================================
### kent source version 362 ###
genePredHisto - get data for generating histograms from a genePred file.
usage:
   genePredHisto [options] what genePredFile histoOut

Options:
  -ids - a second column with the gene name, useful for finding outliers.

The what argument indicates the type of output. The output file is
a list of numbers suitable for input to textHistogram or similar.
The following values are currently implemented:
   exonLen - length of exons
   5utrExonLen - length of 5'UTR regions of exons
   cdsExonLen - length of CDS regions of exons
   3utrExonLen - length of 3'UTR regions of exons
   exonCnt - count of exons
   5utrExonCnt - count of exons containing 5'UTR
   cdsExonCnt - count of exons containing CDS
   3utrExonCnt - count of exons containing 3'UTR

================================================================
========   genePredSingleCover   ====================================
================================================================
### kent source version 362 ###
genePredSingleCover - create single-coverage genePred files

genePredSingleCover [options] inGenePred outGenePred

Create a genePred file that has single CDS coverage of the genome.
UTR is allowed to overlap.  The default is to keep the gene with the
largest number of CDS bases.

Options:
  -scores=file - read scores used in selecting genes from this file.
   It consists of tab separated lines of
       name chrom txStart score
   where score is a real or integer number. Higher scoring genes will
   be chosen over lower scoring ones.  Equally scoring genes are
   chosen by number of CDS bases.  If this option is supplied, all
   genes must be in the file.


================================================================
========   genePredToBed   ====================================
================================================================
### kent source version 362 ###
genePredToBed - Convert from genePred to bed format. Does not yet handle genePredExt
usage:
   genePredToBed in.genePred out.bed
options:
   -xxx=XXX
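
The field mapping between the two formats can be sketched in Python as below,
assuming a basic 10-column genePred line (name, chrom, strand, txStart,
txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds) and BED12 output.
The function name is hypothetical, and the fixed score of 0 and itemRgb of 0
are choices made for this sketch rather than documented genePredToBed
behaviour.

```python
def gene_pred_to_bed(line):
    """Convert one 10-column genePred line to a BED12 line."""
    f = line.rstrip("\n").split("\t")
    name, chrom, strand = f[0], f[1], f[2]
    tx_start, tx_end, cds_start, cds_end = (int(x) for x in f[3:7])
    exon_count = int(f[7])
    starts = [int(x) for x in f[8].rstrip(",").split(",")]
    ends = [int(x) for x in f[9].rstrip(",").split(",")]
    if not (len(starts) == len(ends) == exon_count):
        raise ValueError("exonCount does not match exon lists")
    # BED blocks are sizes plus starts relative to chromStart.
    sizes = ",".join(str(e - s) for s, e in zip(starts, ends)) + ","
    rel_starts = ",".join(str(s - tx_start) for s in starts) + ","
    fields = [chrom, tx_start, tx_end, name, 0, strand,
              cds_start, cds_end, 0, exon_count, sizes, rel_starts]
    return "\t".join(str(x) for x in fields)


# Toy record, not real annotation.
row = "geneA\tchr1\t+\t100\t500\t150\t450\t2\t100,300,\t200,500,"
print(gene_pred_to_bed(row))
```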

================================================================
========   genePredToBigGenePred   ====================================
================================================================
### kent source version 362 ###
genePredToBigGenePred - converts genePred or genePredExt to bigGenePred input (bed format with extra fields)
usage:
  genePredToBigGenePred [-known] [-score=scores] [-geneNames=geneNames] [-colors=colors] file.gp stdout | sort -k1,1 -k2,2n > file.bgpInput
NOTE: to build bigBed:
   bedToBigBed -type=bed12+8 -tab -as=bigGenePred.as file.bgpInput chrom.sizes output.bb
options:
    -known                input file is a genePred in knownGene format
    -score=scores         scores is two column file with id's mapping to scores
    -geneNames=geneNames  geneNames is a three column file with id's mapping to two gene names
    -colors=colors        colors is a four column file with id's mapping to r,g,b
    -cds=cds              cds is a five column file with id's mapping to cds status codes and exonFrames (see knownCds.as)

================================================================
========   genePredToFakePsl   ====================================
================================================================
### kent source version 362 ###
genePredToFakePsl - Create a psl of fake-mRNA aligned to gene-preds from a file or table.
usage:
   genePredToFakePsl [options] db fileTbl pslOut cdsOut

If fileTbl is an existing file, then it is used.
Otherwise, the table by this name is used.

pslOut specifies the fake-mRNA output psl filename.

cdsOut specifies the output cds tab-separated file which contains
genbank-style CDS records showing cdsStart..cdsEnd
e.g. NM_123456 34..305
options:
   -chromSize=sizefile	Read chrom sizes from file instead of database
             sizefile contains two white space separated fields per line:
		chrom name and size
   -qSizes=qSizesFile	Read in query sizes to fixup qSize and qStarts


================================================================
========   genePredToGtf   ====================================
================================================================
### kent source version 362 ###
genePredToGtf - Convert genePred table or file to gtf.
usage:
   genePredToGtf database genePredTable output.gtf
If database is 'file' then genePredTable is interpreted as a file
rather than a table in database.
options:
   -utr - Add 5UTR and 3UTR features
   -honorCdsStat - use cdsStartStat/cdsEndStat when defining start/end
    codon records
   -source=src set source name to use
   -addComments - Add comments before each set of transcript records.
    allows for easier visual inspection
Note: use a refFlat table or extended genePred table or file to include
the gene_name attribute in the output.  This will not work with a refFlat
table dump file. If you are using a genePred file that starts with a numeric
bin column, drop it using the UNIX cut command:
    cut -f 2- in.gp | genePredToGtf file stdin out.gtf

================================================================
========   genePredToMafFrames   ====================================
================================================================
### kent source version 362 ###
genePredToMafFrames - create mafFrames tables from genePreds

genePredToMafFrames [options] targetDb maf mafFrames geneDb1 genePred1 [geneDb2 genePred2...] 

Create frame annotations for one or more components of a MAF.
It is significantly faster to process multiple gene sets in the same run,
as 95% of the CPU time is spent reading the MAF.

Arguments:
  o targetDb - db of target genome
  o maf - input MAF file
  o mafFrames - output file
  o geneDb1 - db in MAF that corresponds to genePred's organism.
  o genePred1 - genePred file.  Overlapping annotations should have
    been removed.  This file may optionally include frame annotations.
Options:
  -bed=file - output a bed for each mafFrame region, useful for debugging.
  -verbose=level - enable verbose tracing, the following levels are implemented:
     3 - print information about data used to compute each record.
     4 - dump information about the gene mappings that were constructed
     5 - dump information about the gene mappings after split processing
     6 - dump information about the gene mappings after frame linking


================================================================
========   genePredToProt   ====================================
================================================================
### kent source version 362 ###
genePredToProt - create protein sequences by translating gene annotations
usage:
   genePredToProt genePredFile genomeSeqs protFa

This honors frame if genePred has frames, dropping partial codons.
genomeSeqs is a 2bit or directory of nib files.

options:
  -cdsFa=fasta - output FASTA with CDS that was used to generate protein.
                 This will not include dropped partial codons.
  -protIdSuffix=str - add this string to the end of the name for protein FASTA
  -cdsIdSuffix=str - add this string to the end of the name for CDS FASTA
  -translateSeleno - assume internal TGA code for selenocysteine and translate to `U'.
  -includeStop - If the CDS ends with a stop codon, represent it as a `*'
  -starForInframeStops - use `*' instead of `X' for in-frame stop codons.
                  This will result in selenocysteine's being `*', with only codons
                  containing `N' being translated to `X'.  This doesn't include terminal
                  stop

================================================================
========   gensub2   ====================================
================================================================
gensub2 - version 12.18
Generate condor submission file from template and two file lists.
Usage:
   gensub2 <file list 1> <file list 2> <template file> <output file>
This will substitute each file in the file lists for $(path1) and $(path2)
in the template between #LOOP and #ENDLOOP, and write the results to
the output.  Other substitution variables are:
       $(path1)  - Full path name of first file.
       $(path2)  - Full path name of second file.
       $(dir1)   - First directory. Includes trailing slash if any.
       $(dir2)   - Second directory.
       $(lastDir1) - The last directory in the first path. Includes trailing slash if any.
       $(lastDir2) - The last directory in the second path. Includes trailing slash if any.
       $(lastDirs1=<n>) - The last n directories in the first path.
       $(lastDirs2=<n>) - The last n directories in the second path.
       $(root1)  - First file name without directory or extension.
       $(root2)  - Second file name without directory or extension.
       $(ext1)   - First file extension.
       $(ext2)   - Second file extension.
       $(file1)  - Name without dir of first file.
       $(file2)  - Name without dir of second file.
       $(num1)   - Index of first file in list.
       $(num2)   - Index of second file in list.
The <file list 2> parameter can be 'single' if there is only one file list, or
'selfPair' if there is a single list but you want all
pairs of that list with itself.  By default the order is diagonal, meaning if
the first list is ABC and the second list is abc the combined
order is Aa Ba Ab Ca Bb Ac Cb Bc Cc.  This tends to put the
largest jobs first if the file lists are both sorted by size. 
The following options can change this:
    -group1 - write elements in order Aa Ab Ac Ba Bb Bc Ca Cb Cc
    -group2 - write elements in order Aa Ba Ca Ab Bb Cb Ac Bc Cc
template file syntax help for check statement: {check 'when' 'what' <file>}
 where 'when' is either 'in' or 'out'
 and 'what' is one of: 'exists' 'exists+' 'line' 'line+'
 'exists' means file exists, may be zero size
 'exists+' means file exists and is non-zero size
 'line' means file may have 0 or more lines of ascii data and is properly
        line-feed terminated
 'line+' means file is 1 or more lines of ascii data and is properly
        line-feed terminated
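
The three orderings described above can be reproduced with the short Python
sketch below. The function names are hypothetical and the code is written
only to match the ABC/abc example in the text. How gensub2 itself orders
lists of unequal length is not stated above, so that case is an assumption
here.

```python
def diagonal_pairs(list1, list2):
    """Pair elements along anti-diagonals: Aa Ba Ab Ca Bb Ac ..."""
    pairs = []
    for diag in range(len(list1) + len(list2) - 1):
        # Walk each diagonal from the largest first-list index down.
        for i in range(min(diag, len(list1) - 1), -1, -1):
            j = diag - i
            if j < len(list2):
                pairs.append((list1[i], list2[j]))
    return pairs


def group1_pairs(list1, list2):
    """Order as with -group1: Aa Ab Ac Ba Bb Bc ..."""
    return [(a, b) for a in list1 for b in list2]


def group2_pairs(list1, list2):
    """Order as with -group2: Aa Ba Ca Ab Bb Cb ..."""
    return [(a, b) for b in list2 for a in list1]


print(" ".join(a + b for a, b in diagonal_pairs("ABC", "abc")))
```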
================================================================
========   getRna   ====================================
================================================================
### kent source version 362 ###
getRna - Get mrna for GenBank or RefSeq sequences found in a database
usage:
   getRna [options] database accFile outfa

Get mrna for all accessions in accFile, writing to a fasta file. If accession
 has a version, that version is returned or an error is generated.

Options:
  -cdsUpper - lookup CDS and output it as upper case. If CDS annotation
    can't be obtained, the sequence is skipped with a warning.
  -cdsUpperAll - like -cdsUpper, except keep sequences without CDS
  -inclVer - include version with sequence id.
  -peptides - translate mRNAs to peptides


================================================================
========   getRnaPred   ====================================
================================================================
### kent source version 362 ###
getRnaPred - Get virtual RNA for gene predictions
usage:
   getRnaPred [options] database table chromosome output.fa
table can be a table or a file.  Specify chromosome of 'all' to
process all chromosomes.

options:
   -weird - only get ones with weird splice sites
   -cdsUpper - output CDS in upper case
   -cdsOnly - only output CDS
   -cdsOut=file - write CDS to this tab-separated file, in the form
      acc  start  end
    where start..end are genbank style, one-based coordinates
   -keepMasking - un/masked in upper/lower case.
   -pslOut=psl - output PSLs for the virtual mRNAs.  Allows virtual
    mRNA to be analyzed by tools that work on PSLs
   -suffix=suf - append suffix to each id to avoid confusion with mRNAs
    used to define the genes.
   -peptides - output the translation of the CDS to a peptide sequence.
    The newer program genePredToProt may produce better results in cases
    where there are frame-shifting indels in the CDS.
   -exonIndices - output indices of exon boundaries after sequence name,
    e.g., "103 243 290" says positions 1-103 are from the first exon,
    positions 104-243 are from the second exon, etc. 
   -maxSize=size - output a maximum of size characters.  Useful when
    testing gene predictions by RT-PCR.
   -genomeSeqs=spec - get genome sequences from the specified nib directory
    or 2bit file instead of going through the path found in chromInfo.
   -includeCoords - include the genomic coordinates as a comment in the
    fasta header.  This is necessary when there are multiple genePreds
    with the same name.
   -genePredExt - (for use with -peptides) use extended genePred format,
    and consider frame information when translating (Warning: only
    considers offset at 5' end, not frameshifts between blocks)
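
The -exonIndices example above ("103 243 290") is a running total of exon
lengths. A minimal Python sketch, with a hypothetical function name and toy
exon coordinates chosen to reproduce those numbers (strand handling is
ignored):

```python
from itertools import accumulate


def exon_indices(exon_starts, exon_ends):
    """Return cumulative exon lengths, one index per exon boundary."""
    lengths = [end - start for start, end in zip(exon_starts, exon_ends)]
    return list(accumulate(lengths))


# Three toy exons of 103, 140 and 47 bases.
print(*exon_indices([1000, 2000, 3000], [1103, 2140, 3047]))
```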

================================================================
========   gfClient   ====================================
================================================================
### kent source version 362 ###
gfClient v. 36x2 - A client for the genomic finding program that produces a .psl file
usage:
   gfClient host port seqDir in.fa out.psl
where
   host is the name of the machine running the gfServer
   port is the same port that you started the gfServer with
   seqDir is the path of the .nib or .2bit files relative to the current dir
       (note these are needed by the client as well as the server)
   in.fa is a fasta format file.  May contain multiple records
   out.psl is where to put the output
options:
   -t=type       Database type. Type is one of:
                   dna - DNA sequence
                   prot - protein sequence
                   dnax - DNA sequence translated in six frames to protein
                 The default is dna.
   -q=type       Query type. Type is one of:
                   dna - DNA sequence
                   rna - RNA sequence
                   prot - protein sequence
                   dnax - DNA sequence translated in six frames to protein
                   rnax - DNA sequence translated in three frames to protein
   -prot         Synonymous with -t=prot -q=prot.
   -dots=N       Output a dot every N query sequences.
   -nohead       Suppresses 5-line psl header.
   -minScore=N   Sets minimum score.  This is twice the matches minus the 
                 mismatches minus some sort of gap penalty.  Default is 30.
   -minIdentity=N   Sets minimum sequence identity (in percent).  Default is
                 90 for nucleotide searches, 25 for protein or translated
                 protein searches.
   -out=type     Controls output file format.  Type is one of:
                   psl - Default.  Tab-separated format without actual sequence
                   pslx - Tab-separated format with sequence
                   axt - blastz-associated axt format
                   maf - multiz-associated maf format
                   sim4 - similar to sim4 format
                   wublast - similar to wublast format
                   blast - similar to NCBI blast format
                   blast8 - NCBI blast tabular format
                   blast9 - NCBI blast tabular format with comments
   -maxIntron=N   Sets maximum intron size. Default is 750000.
================================================================
========   gfServer   ====================================
================================================================
### kent source version 362 ###
gfServer v 36x2 - Make a server to quickly find where DNA occurs in genome
   To set up a server:
      gfServer start host port file(s)
      where the files are .nib or .2bit format files specified relative to the current directory
   To remove a server:
      gfServer stop host port
   To query a server with DNA sequence:
      gfServer query host port probe.fa
   To query a server with protein sequence:
      gfServer protQuery host port probe.fa
   To query a server with translated DNA sequence:
      gfServer transQuery host port probe.fa
   To query server with PCR primers:
      gfServer pcr host port fPrimer rPrimer maxDistance
   To process one probe fa file against a .nib format genome (not starting server):
      gfServer direct probe.fa file(s).nib
   To test PCR without starting server:
      gfServer pcrDirect fPrimer rPrimer file(s).nib
   To figure out usage level:
      gfServer status host port
   To get input file list:
      gfServer files host port
options:
   -tileSize=N     Size of n-mers to index.  Default is 11 for nucleotides, 4 for
                   proteins (or translated nucleotides).
   -stepSize=N     Spacing between tiles. Default is tileSize.
   -minMatch=N     Number of n-mer matches that trigger detailed alignment.
                   Default is 2 for nucleotides, 3 for proteins.
   -maxGap=N       Number of insertions or deletions allowed between n-mers.
                   Default is 2 for nucleotides, 0 for proteins.
   -trans          Translate database to protein in 6 frames.  Note: it is best
                   to run this on RepeatMasked data in this case.
   -log=logFile    Keep a log file that records server requests.
   -seqLog         Include sequences in log file (not logged with -syslog).
   -ipLog          Include user's IP in log file (not logged with -syslog).
   -debugLog       Include debugging info in log file.
   -syslog         Log to syslog.
   -logFacility=facility  Log to the specified syslog facility - default local0.
   -mask           Use masking from nib file.
   -repMatch=N     Number of occurrences of a tile (n-mer) that triggers repeat masking the
                   tile. Default is 1024.
   -maxDnaHits=N   Maximum number of hits for a DNA query that are sent from the server.
                   Default is 100.
   -maxTransHits=N Maximum number of hits for a translated query that are sent from the server.
                   Default is 200.
   -maxNtSize=N    Maximum size of untranslated DNA query sequence.
                   Default is 40000.
   -maxAaSize=N    Maximum size of protein or translated DNA queries.
                   Default is 8000.
   -canStop        If set, a quit message will actually take down the server.
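
To make -tileSize, -stepSize and -repMatch more concrete, the Python sketch
below builds a toy n-mer index: tiles are taken every stepSize bases, and
tiles that occur repMatch times or more are dropped as repeats. The function
name and the exact threshold rule are assumptions of this sketch. It is not
the gfServer index and does no alignment.

```python
from collections import defaultdict


def build_tile_index(seq, tile_size=11, step_size=None, rep_match=1024):
    """Map each tile (n-mer) to the positions where it starts."""
    step = step_size if step_size else tile_size  # default: no overlap
    index = defaultdict(list)
    for pos in range(0, len(seq) - tile_size + 1, step):
        index[seq[pos:pos + tile_size].lower()].append(pos)
    # Drop over-represented tiles, in the spirit of -repMatch.
    return {tile: hits for tile, hits in index.items() if len(hits) < rep_match}


print(build_tile_index("ACGTACGTACGT", tile_size=4, step_size=2))
```

A smaller stepSize indexes more tiles, which costs memory but gives a query
more chances to hit the index.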

================================================================
========   gff3ToGenePred   ====================================
================================================================
### kent source version 362 ###
gff3ToGenePred - convert a GFF3 file to a genePred file
usage:
   gff3ToGenePred inGff3 outGp
options:
  -warnAndContinue - on bad genePreds being created, put out warning but continue
  -useName - rather than using 'id' as name, use the 'name' tag
  -rnaNameAttr=attr - If this attribute exists on an RNA record, use it as the genePred
   name column
  -geneNameAttr=attr - If this attribute exists on a gene record, use it as the genePred
   name2 column
  -attrsOut=file - output attributes of mRNA record to file.  These are per-genePred row,
   not per-GFF3 record. They are derived from GFF3 attributes, not the attributes themselves.
  -processAllGeneChildren - output genePred for all children of a gene regardless of feature
  -unprocessedRootsOut=file - output GFF3 root records that were not used.  This will not be a
   valid GFF3 file.  It's expected that many non-root records will not be used and they are not
   reported.
  -bad=file   - output genepreds that fail checks to file
  -maxParseErrors=50 - Maximum number of parsing errors before aborting. A negative
   value will allow an unlimited number of errors.  Default is 50.
  -maxConvertErrors=50 - Maximum number of conversion errors before aborting. A negative
   value will allow an unlimited number of errors.  Default is 50.
  -honorStartStopCodons - only set CDS start/stop status to complete if there are
   corresponding start_stop codon records
  -defaultCdsStatusToUnknown - default the CDS status to unknown rather
   than complete.
  -allowMinimalGenes - normally this program assumes that genes contain
   transcripts which contain exons.  If this option is specified, genes with exons
   as direct children of genes and stand alone genes with no exon or transcript
   children will be converted.

This converts:
   - top-level gene records with RNA records
   - top-level RNA records
   - RNA records that contain:
       - exon and CDS
       - CDS, five_prime_UTR, three_prime_UTR
       - only exon for non-coding
   - top-level gene records with transcript records
   - top-level transcript records
   - transcript records that contain:
       - exon
where RNA can be mRNA, ncRNA, or rRNA, and transcript can be either
transcript or primary_transcript
The first step is to parse the GFF3 file; up to 50 errors are reported before
aborting.  If the GFF3 file is successfully parsed, it is converted to gene
annotation.  Up to 50 conversion errors are reported before aborting.

Input file must conform to the GFF3 specification:
   http://www.sequenceontology.org/gff3.shtml

================================================================
========   gff3ToPsl   ====================================
================================================================
### kent source version 362 ###
gff3ToPsl - convert a GFF3 CIGAR file to a PSL file
usage:
   gff3ToPsl [options] queryChromSizes targetChromSizes inGff3 out.psl
arguments:
   queryChromSizes file with query (main coordinates) chromosome sizes.
               File formatted:  chromName<tab>chromSize
   targetChromSizes file with target (Target attribute) chromosome sizes.
   inGff3     GFF3 formatted file with Gap attribute in match records
   out.psl    PSL formatted output
options:
   -dropQ     drop record when query not found in queryChromSizes
   -dropT     drop record when target not found in targetChromSizes
The first step is to parse the GFF3 file; up to 50 errors are reported before
aborting.  If the GFF3 file is successfully parsed, it is converted to PSL.

Input file must conform to the GFF3 specification:
   http://www.sequenceontology.org/gff3.shtml
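
The chromosome sizes format described above (chromName<tab>chromSize) can be
read with a few lines of Python. This is a minimal sketch, not code from
gff3ToPsl itself; the function name read_chrom_sizes and the choice to skip
blank lines are assumptions made for illustration.

```python
from typing import Dict, Iterable


def read_chrom_sizes(lines: Iterable[str]) -> Dict[str, int]:
    """Parse 'chromName<tab>chromSize' lines into a name -> size mapping."""
    sizes: Dict[str, int] = {}
    for line_no, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line:
            continue  # assumption: blank lines are ignored
        fields = line.split("\t")
        if len(fields) < 2:
            raise ValueError(f"line {line_no}: expected chromName<tab>chromSize")
        sizes[fields[0]] = int(fields[1])
    return sizes


print(read_chrom_sizes(["chr1\t1000\n", "chrM\t16571\n"]))
```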

================================================================
========   gmtime   ====================================
================================================================
gmtime - convert unix timestamp to date string
usage: gmtime <time stamp>
	<time stamp> - integer 0 to 2147483647
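
For readers without the binary at hand, the same conversion can be sketched
in Python. This is an illustration only: the output format used here is an
assumption and may differ from what gmtime prints.

```python
import time

MAX_STAMP = 2147483647  # upper bound taken from the usage statement above


def stamp_to_utc(stamp: int) -> str:
    """Format a Unix timestamp as a UTC date string (format is an assumption)."""
    if not 0 <= stamp <= MAX_STAMP:
        raise ValueError("time stamp must be an integer from 0 to 2147483647")
    return time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(stamp))


print(stamp_to_utc(0))          # 1970-01-01 00:00:00
print(stamp_to_utc(MAX_STAMP))  # 2038-01-19 03:14:07
```
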
================================================================
========   gtfToGenePred   ====================================
================================================================
### kent source version 362 ###
gtfToGenePred - convert a GTF file to a genePred
usage:
   gtfToGenePred gtf genePred

options:
     -genePredExt - create an extended genePred, including frame
      information and gene name
     -allErrors - skip groups with errors rather than aborting.
      Useful for getting information about as many errors as possible.
     -ignoreGroupsWithoutExons - skip groups that contain no exons rather than
      generating an error.
     -infoOut=file - write a file with information on each transcript
     -sourcePrefix=pre - only process entries where the source name has the
      specified prefix.  May be repeated.
     -impliedStopAfterCds - implied stop codon is after CDS
     -simple    - just check column validity, not hierarchy, resulting genePred may be damaged
     -geneNameAsName2 - if specified, use gene_name for the name2 field
      instead of gene_id.
     -includeVersion - if gene_version and/or transcript_version attributes exist, include the version
      in the corresponding identifiers.

================================================================
========   headRest   ====================================
================================================================
### kent source version 362 ###
headRest - Return all *but* the first N lines of a file.
usage:
   headRest count fileName
You can use stdin for fileName
options:
   -xxx=XXX
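
The behaviour described above (everything except the first N lines) can be
sketched in Python as follows. The function name head_rest is hypothetical
and this is not the program's own implementation.

```python
import itertools
from typing import Iterable, Iterator


def head_rest(lines: Iterable[str], count: int) -> Iterator[str]:
    """Yield every line after the first `count` lines."""
    return itertools.islice(lines, count, None)


demo = ["first\n", "second\n", "third\n", "fourth\n"]
print(list(head_rest(demo, 2)))  # ['third\n', 'fourth\n']
```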

================================================================
========   hgBbiDbLink   ====================================
================================================================
### kent source version 362 ###
hgBbiDbLink - Add table that just contains a pointer to a bbiFile to database.  This program 
is used to add bigWigs and bigBeds.
usage:
   hgBbiDbLink database trackName fileName

================================================================
========   hgFakeAgp   ====================================
================================================================
### kent source version 362 ###
hgFakeAgp - Create fake AGP file by looking at N's
usage:
   hgFakeAgp input.fa output.agp
options:
   -minContigGap=N Minimum size for a gap between contigs.  Default 25
   -minScaffoldGap=N Min size for a gap between scaffolds. Default 50000
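
To make the two thresholds concrete, the sketch below finds runs of N in a
sequence and sorts them by size. It is only an illustration of the
thresholding idea: the function name classify_gaps and the labels "contig"
and "scaffold" are assumptions, and the sketch does not reproduce the AGP
lines that hgFakeAgp writes.

```python
import re
from typing import List, Tuple


def classify_gaps(seq: str, min_contig_gap: int = 25,
                  min_scaffold_gap: int = 50000) -> List[Tuple[int, int, str]]:
    """Return (start, end, kind) for each run of N that meets a threshold."""
    gaps = []
    for match in re.finditer(r"[Nn]+", seq):
        size = match.end() - match.start()
        if size >= min_scaffold_gap:
            kind = "scaffold"
        elif size >= min_contig_gap:
            kind = "contig"
        else:
            continue  # run too short to count as a gap
        gaps.append((match.start(), match.end(), kind))
    return gaps


seq = "ACGT" + "N" * 30 + "ACGT" + "N" * 5 + "AC"
print(classify_gaps(seq))  # [(4, 34, 'contig')]
```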

================================================================
========   hgFindSpec   ====================================
================================================================
### kent source version 362 ###
hgFindSpec - Create hgFindSpec table from trackDb.ra files.

usage:
   hgFindSpec [options] orgDir database hgFindSpec hgFindSpec.sql hgRoot

Options:
  -strict		Add spec to hgFindSpec only if its table(s) exist.
  -raName=trackDb.ra - Specify a file name to use other than trackDb.ra
   for the ra files.
  -release=alpha|beta|public - Include trackDb entries with this release tag only.

================================================================
========   hgGcPercent   ====================================
================================================================
### kent source version 362 ###
hgGcPercent - Calculate GC Percentage in 20kb windows
usage:
   hgGcPercent [options] database nibDir
     nibDir can be a .2bit file, a directory that contains a
     database.2bit file, or a directory that contains *.nib files.
     Loads gcPercent table with counts from sequence.
options:
   -win=<size> - change window size (default 20000)
   -noLoad - do not load mysql table - create bed file
   -file=<filename> - output to <filename> (stdout OK) (implies -noLoad)
   -chr=<chrN> - process only chrN from the nibDir
   -noRandom - ignore random chromosomes from the nibDir
   -noDots - do not display ... progress during processing
   -doGaps - process gaps correctly (default: gaps are not counted as GC)
   -wigOut - output wiggle ascii data ready to pipe to wigEncode
   -overlap=N - overlap windows by N bases (default 0)
   -verbose=N - display details to stderr during processing
   -bedRegionIn=input.bed   Read in a bed file for GC content in specific regions and write to bedRegionOut
   -bedRegionOut=output.bed Write a bed file of GC content in specific regions from bedRegionIn

example:
  calculate GC percent in 5 base windows using a 2bit assembly (dp2):
    hgGcPercent -wigOut -doGaps -win=5 -file=stdout -verbose=0 \
      dp2 /cluster/data/dp2 \
    | wigEncode stdin gc5Base.wig gc5Base.wib
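
The windowed calculation can be pictured with a short Python sketch. It is
an approximation under stated assumptions, not the program's algorithm: the
function name is hypothetical, overlapping windows are assumed to advance by
win minus overlap, and the handling of gaps, N bases, and output scaling in
hgGcPercent is not modelled.

```python
from typing import Iterator, Tuple


def gc_percent_windows(seq: str, win: int = 20000,
                       overlap: int = 0) -> Iterator[Tuple[int, int, float]]:
    """Yield (start, end, GC percent) for consecutive windows over seq."""
    step = win - overlap  # assumption: overlap shortens the step between windows
    if step <= 0:
        raise ValueError("overlap must be smaller than win")
    for start in range(0, len(seq), step):
        window = seq[start:start + win].upper()
        gc = window.count("G") + window.count("C")
        yield start, start + len(window), 100.0 * gc / len(window)


for row in gc_percent_windows("GGCCAATTGCAT", win=4):
    print(row)
```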
================================================================
========   hgLoadBed   ====================================
================================================================
### kent source version 362 ###
hgLoadBed - Load a generic bed file into database
usage:
   hgLoadBed database track file(s).bed
options:
   -noSort  don't sort (you better be sorting before this)
   -noBin   suppress bin field
   -oldTable add to existing table
   -onServer This will speed things up if you're running in a directory that
             the mysql server can access.
   -sqlTable=table.sql Create table from .sql file
   -renameSqlTable Rename table created with -sqlTable to match track
   -trimSqlTable   If sqlTable has n fields, and input has m fields, only load m fields, meaning the last n-m fields in the sqlTable are optional
   -type=bedN[+[P]] : 
                      N is between 3 and 15, 
                      optional (+) if extra "bedPlus" fields, 
                      optional P specifies the number of extra fields. Not required, but preferred.
                      Examples: -type=bed6 or -type=bed6+ or -type=bed6+3 
                      (see http://genome.ucsc.edu/FAQ/FAQformat.html#format1)
                      Recommended to use with -as option for better bedPlus validation.
   -as=fields.as   If you have extra "bedPlus" fields, it's great to put a definition
                     of each field in a row in AutoSql format here.
   -chromInfo=file.txt    Specify chromInfo file to validate chrom names and sizes.
   -tab       Separate by tabs rather than space
   -hasBin    Input bed file starts with a bin field.
   -noLoad     - Do not load database and do not clean up tab files
   -noHistory  - Do not add history table comments (for custom tracks)
   -notItemRgb - Do not parse column nine as r,g,b when commas seen (bacEnds)
   -bedGraph=N - wiggle graph column N of the input file as float dataValue
               - bedGraph N is typically 4: -bedGraph=4
   -bedDetail  - bedDetail format with id and text for hgc clicks
               - requires tab and sqlTable options
   -maxChromNameLength=N  - specify max chromName length to avoid
               - reference to chromInfo table
   -tmpDir=<path>  - path to directory for creation of temporary .tab file
                   - which will be removed after loading
   -noNameIx  - no index for the name column (default creates index)
   -ignoreEmpty  - no error on empty input file
   -noStrict  - don't perform coord sanity checks
              - by default we abort when: chromStart >= chromEnd
   -allowStartEqualEnd  - even when doing strict checks, allow
                          chromStart==chromEnd (zero-length e.g. insertion)
   -allowNegativeScores  - sql definition of score column is int, not unsigned
   -customTrackLoader  - turns on: -noNameIx, -noHistory, -ignoreEmpty,
                         -allowStartEqualEnd, -allowNegativeScores, -verbose=0
                         Plus, this turns on a 20 minute time-out exit.
   -fillInScore=colName - if every score value is zero, then use column 'colName' to fill in the score column (from minScore-1000)
   -minScore=N - minimum value for score field for -fillInScore option (default 100)
   -verbose=N - verbose level for extra information to STDERR
   -dotIsNull=N - if the specified field is a '.' then replace it with -1
   -lineLimit=N - limit input file to this number of lines
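
The -type=bedN[+[P]] syntax listed above can be checked with a small parser.
The following Python sketch is illustrative only; the names BedType and
parse_bed_type are hypothetical and the code is not taken from hgLoadBed.

```python
import re
from typing import NamedTuple, Optional


class BedType(NamedTuple):
    bed_fields: int              # N, the number of standard bed fields
    has_plus: bool               # True when "+" is present (bedPlus)
    extra_fields: Optional[int]  # P, the number of extra fields, if given


_TYPE_RE = re.compile(r"bed(\d+)(\+(\d+)?)?")


def parse_bed_type(text: str) -> BedType:
    """Parse a value such as bed6, bed6+ or bed6+3."""
    match = _TYPE_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not of the form bedN[+[P]]: {text!r}")
    n = int(match.group(1))
    if not 3 <= n <= 15:
        raise ValueError("N must be between 3 and 15")
    extra = int(match.group(3)) if match.group(3) else None
    return BedType(n, match.group(2) is not None, extra)


for value in ("bed6", "bed6+", "bed6+3"):
    print(value, parse_bed_type(value))
```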

================================================================
========   hgLoadChain   ====================================
================================================================
### kent source version 362 ###
hgLoadChain - Load a generic Chain file into database
usage:
   hgLoadChain database chrN_track chrN.chain
options:
   -tIndex  Include tName in indexes (for non-split chain tables)
   -noBin   suppress bin field, default: bin field is added
   -noSort  Don't sort by target (memory-intensive) -- input *must* be
            sorted by target already if this option is used.
   -oldTable add to existing table, default: create new table
   -sqlTable=table.sql Create table from .sql file
   -normScore add normalized score column to table, default: not added
   -qPrefix=xxx   prepend "xxx" to query name
   -test    suppress loading to database

================================================================
========   hgLoadMaf   ====================================
================================================================
### kent source version 362 ###
hgLoadMaf - Load a maf file index into the database
usage:
   hgLoadMaf database table
options:
   -warn            warn instead of error upon empty/incomplete alignments
   -WARN            warn instead of error, with detail for the warning
   -test=infile     use infile as input, and suppress loading
                    the database. Just create .tab file in current dir.
   -pathPrefix=dir  load files from specified directory 
                    (default /gbdb/database/table).
   -tmpDir=<path>   path to directory for creation of temporary .tab file
                    which will be removed after loading
   -loadFile=file   use file as input
   -maxNameLen=N    specify max chromosome name length to avoid
                    reference to chromInfo table
   -defPos=file     file to put default position in
                    default position is first block
   -custom          loading a custom track, don't use history
                    or extFile tables

NOTE: The maf files need to be in chromosome coordinates,
      the reference species must be the first component, and 
      the blocks must be correctly ordered and be on the
      '+' strand

================================================================
========   hgLoadMafSummary   ====================================
================================================================
### kent source version 362 ###
hgLoadMafSummary - Load a summary table of pairs in a maf into a database
usage:
   hgLoadMafSummary database table file.maf
options:
   -mergeGap=N   max size of gap to merge regions (default 500)
   -minSize=N         merge blocks smaller than N (default 10000)
   -maxSize=N         break up blocks larger than N (default 50000)
   -minSeqSize=N skip alignments when reference sequence is less than N
                 (default 1000000 -- match with hgTracks min window size for
                 using summary table)
   -test         suppress loading the database. Just create .tab file(s)
                     in current dir.

================================================================
========   hgLoadNet   ====================================
================================================================
### kent source version 362 ###
hgLoadNet - Load a generic net file into database
usage:
   hgLoadNet database track file(s).net
options:
   -noBin   suppress bin field
   -oldTable add to existing table
   -sqlTable=table.sql Create table from .sql file
   -qPrefix=xxx prepend "xxx-" to query name
   -warn load even with missing fields
   -test suppress loading table

================================================================
========   hgLoadOut   ====================================
================================================================
### kent source version 362 ###
hgLoadOut - load RepeatMasker .out files into database
usage:
   hgLoadOut database file(s).out
For multiple files chrN.out this will create the single table 'rmsk'
in the database, use the -split argument to obtain separate chrN_rmsk tables.
options:
   -tabFile=text.tab - don't actually load database, just create tab file
   -split - load chrN_rmsk separate tables even if a single file is given
   -table=name - use a different suffix other than the default (rmsk)
================================================================
========   hgLoadOutJoined   ====================================
================================================================
### kent source version 362 ###
hgLoadOutJoined - load new style (2014) RepeatMasker .out files into database
usage:
   hgLoadOutJoined database file(s).out
For multiple files chrN.out this will create the single table 'rmskOutBaseline'
in the database.
options:
   -tabFile=text.tab - don't actually load database, just create tab file
   -table=name - use a different suffix other than the default (rmskOutBaseline)
================================================================
========   hgLoadSqlTab   ====================================
================================================================
### kent source version 362 ###
hgLoadSqlTab - Load table into database from SQL and text files.
usage:
   hgLoadSqlTab database table file.sql file(s).tab
file.sql contains a SQL create statement for table
file.tab contains tab-separated text (rows of table)
The actual table name will come from the command line, not the sql file.
options:
  -warn - warn instead of abort on mysql errors or warnings
  -notOnServer - file is *not* in a directory that the mysql server can see
  -oldTable|-append - add to existing table

To load bed 3+ sorted tab files as hgLoadBed would do automatically
sort the input file:
  sort -k1,1 -k2,2n file(s).tab | hgLoadSqlTab database table file.sql stdin

================================================================
========   hgLoadWiggle   ====================================
================================================================
### kent source version 362 ###
hgLoadWiggle - Load a wiggle track definition into database
usage:
   hgLoadWiggle [options] database track file(s).wig
options:
   -noBin	suppress bin field
   -noLoad	do not load table, only create .tab file
   -noHistory	do not add history table comments (for custom tracks)
   -oldTable	add to existing table
   -tab		Separate by tabs rather than space
   -pathPrefix=<path>	.wib file path prefix to use (default /gbdb/<DB>/wib)
   -chromInfoDb=<DB>	database to extract chromInfo size information
   -maxChromNameLength=N  - specify max chromName length to avoid
               - reference to chromInfo table
   -tmpDir=<path>  - path to directory for creation of temporary .tab file
                   - which will be removed after loading
   -verbose=N	N=2 see # of lines input and SQL create statement,
		N=3 see chrom size info, N=4 see details on chrom size info
================================================================
========   hgSpeciesRna   ====================================
================================================================
### kent source version 362 ###
hgSpeciesRna - Create fasta file with RNA from one species
usage:
   hgSpeciesRna database Genus species output.fa
options:
   -est         - If set will get ESTs rather than mRNAs
   -filter=file - only read accessions listed in file

================================================================
========   hgTrackDb   ====================================
================================================================
### kent source version 362 ###
hgTrackDb - Create trackDb table from text files.

Note that the browser supports multiple trackDb tables, usually
in the form: trackDb_YourUserName. Which particular trackDb
table the browser uses is specified in the hg.conf file found
either in your home directory file '.hg.conf' or in the web server's
cgi-bin/hg.conf configuration file with the setting: db.trackDb=trackDb
see also: src/product/ex.hg.conf discussion of this setting.
usage:
   hgTrackDb [options] org database trackDb trackDb.sql hgRoot

Options:
  org - a directory name with a hierarchy of trackDb.ra files to examine
      - in the case of a single directory with a single trackDb.ra file use .
  database - name of database to create the trackDb table in
  trackDb  - name of table to create, usually trackDb, or trackDb_${USER}
  trackDb.sql  - SQL definition of the table to create, typically from
               - the source tree file: src/hg/lib/trackDb.sql
               - the table name in the CREATE statement is replaced by the
               - table name specified on this command line.
  hgRoot - a directory name to prepend to org to locate the hierarchy:
           hgRoot/trackDb.ra - top level trackDb.ra file processed first
           hgRoot/org/trackDb.ra - second level file processed second
           hgRoot/org/database/trackDb.ra - third level file processed last
         - for no directory hierarchy use .
  -strict - only include tables that exist (and complain about missing html files).
  -raName=trackDb.ra - Specify a file name to use other than trackDb.ra
   for the ra files.
  -release=alpha|beta|public - Include trackDb entries with this release tag only.
  -settings - for trackDb scanning, output table name, type line,
            -  and settings hash to stderr while loading everything.
================================================================
========   hgWiggle   ====================================
================================================================
### kent source version 362 ###
#	no database specified, using .wig files
#	doAscii option on, perform the default ascii output
hgWiggle - fetch wiggle data from database or file
usage:
   hgWiggle [options] <track names ...>
options:
   -db=<database> - use specified database
   -chr=chrN - examine data only on chrN
   -chrom=chrN - same as -chr option above
   -position=[chrN:]start-end - examine data in window start-end (1-relative)
             (the chrN: is optional)
   -chromLst=<file> - file with list of chroms to examine
   -doAscii - perform the default ascii output, in addition to other outputs
            - Any of the other -do outputs turn off the default ascii output
            - ***WARNING*** this ascii output is 0-relative offset which
            - *** is *not* the normal wiggle input format.  Use the -lift
            - *** argument -lift=1 to get 1-relative offset:
   -lift=<D> - lift ascii output positions by D (0 default)
   -rawDataOut - output just the data values, nothing else
   -htmlOut - output stats or histogram in HTML instead of plain text
   -doStats - perform stats measurement, default output text, see -htmlOut
   -doBed - output bed format
   -bedFile=<file> - constrain output to ranges specified in bed <file>
   -dataConstraint='DC' - where DC is one of < = >= <= == != 'in range'
   -ll=<F> - lowerLimit compare data values to F (float) (all but 'in range')
   -ul=<F> - upperLimit compare data values to F (float)
		(need both ll and ul when 'in range')

   -help - display more examples and extra options (to stderr)

   When no database is specified, track names will refer to .wig files

   example using the file chrM.wig:
	hgWiggle chrM
   example using the database table hg17.gc5Base:
	hgWiggle -chr=chrM -db=hg17 gc5Base
================================================================
========   hgsqldump   ====================================
================================================================
hgsqldump - Execute mysqldump using passwords from .hg.conf
usage:
   hgsqldump [OPTIONS] database [tables]
or:
   hgsqldump [OPTIONS] --databases [OPTIONS] DB1 [DB2 DB3 ...]
or:
   hgsqldump [OPTIONS] --all-databases [OPTIONS]
Generally anything in command line is passed to mysqldump
	after an implicit '-u user -ppassword'
See also: mysqldump
Note: directory for results must be writable by mysql.  i.e. 'chmod 777 .'
Which is a security risk, so remember to change permissions back after use.
e.g.: hgsqldump --all -c --tab=. cb1

================================================================
========   htmlCheck   ====================================
================================================================
### kent source version 362 ###
htmlCheck - Do a little reading and verification of html file
usage:
   htmlCheck how url
where how is:
   ok - just check for 200 return.  Print error message and exit -1 if no 200
   getAll - read the url (header and html) and print to stdout
   getHeader - read the header and print to stdout
   getCookies - print list of cookies
   getHtml - print the html, but not the header to stdout
   getForms - print the form structure to stdout
   getVars - print the form variables to stdout
   getLinks - print links
   getTags - print out just the tags
   checkLinks - check links in page
   checkLinks2 - check links in page and all subpages in same host
             (Just one level of recursion)
   checkLocalLinks - check local links in page
   checkLocalLinks2 - check local links in page and connected local pages
             (Just one level of recursion)
   submit - submit first form in page if any using 'GET' method
   validate - do some basic validations including TABLE/TR/TD nesting
   strictTagNestCheck - check tags are correctly nested
options:
   cookies=cookie.txt - Cookies is a two column file
           containing <cookieName><space><value><newLine>
note: url will need to be in quotes if it contains an ampersand or question mark.
================================================================
========   hubCheck   ====================================
================================================================
### kent source version 362 ###
hubCheck - Check a track data hub for integrity.
usage:
   hubCheck http://yourHost/yourDir/hub.txt
options:
   -noTracks             - don't check remote files for tracks, just trackDb (faster)
   -checkSettings        - check trackDb settings to spec
   -version=[v?|url]     - version to validate settings against
                                     (defaults to version in hub.txt, or current standard)
   -extra=[file|url]     - accept settings in this file (or url)
   -level=base|required  - reject settings below this support level
   -settings             - just list settings with support level
   -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs.
                                     Will create this directory if not existing
   -printMeta            - print the metadata for each track
   -cacheTime=N          - set cache refresh time in seconds, default 1
   -verbose=2            - output verbosely

================================================================
========   hubPublicCheck   ====================================
================================================================
### kent source version 362 ###
hubPublicCheck - checks that the labels in hubPublic match what is in the hub labels
   outputs SQL statements to put the table into compliance
usage:
   hubPublicCheck tableName 
options:
   -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs
   -addHub=url           - output statements to add url to table

================================================================
========   ixIxx   ====================================
================================================================
### kent source version 362 ###
ixIxx - Create indices for simple line-oriented file of format 
<symbol> <free text>
usage:
   ixIxx in.text out.ix out.ixx
Where out.ix is a word index, and out.ixx is an index into the index.
options:
   -prefixSize=N Size of prefix to index on in ixx.  Default is 5.
   -binSize=N Size of bins in ixx.  Default is 64k.

================================================================
========   lavToAxt   ====================================
================================================================
lavToAxt - Convert blastz lav file to an axt file (which includes sequence)
usage:
   lavToAxt in.lav tNibDir qNibDir out.axt
Where tNibDir/qNibDir are either directories full of nib files, or a single
twoBit file
options:
   -fa  qNibDir is interpreted as a fasta file of multiple dna seq instead of directory of nibs
   -tfa tNibDir is interpreted as a fasta file of multiple dna seq instead of directory of nibs
   -dropSelf  drops alignment blocks on the diagonal for self alignments
   -scoreScheme=fileName Read the scoring matrix from a blastz-format file.
                (only used in conjunction with -dropSelf, to rescore 
                alignments when blocks are dropped)

================================================================
========   lavToPsl   ====================================
================================================================
lavToPsl - Convert blastz lav to psl format
usage:
   lavToPsl in.lav out.psl
options:
   -target-strand=c set the target strand to c (default is no strand)
   -bed output bed instead of psl
   -scoreFile=filename  output lav scores to side file, such that
                        each psl line in out.psl is matched by a score line.

================================================================
========   ldHgGene   ====================================
================================================================
### kent source version 362 ###
ldHgGene - load database with gene predictions from a gff file.
usage:
     ldHgGene database table file(s).gff
options:
     -bin         Add bin column (now the default)
     -nobin       don't add binning (you probably don't want this)
     -exon=type   Sets type field for exons to specific value
     -oldTable    Don't overwrite what's already in table
     -noncoding   Forces whole prediction to be UTR
     -gtf         input is GTF, stop codon is not in CDS
     -predTab     input is already in genePredTab format
     -requireCDS  discard genes that don't have CDS annotation
     -out=gpfile  write output, in genePred format, instead of loading
                  table. Database is ignored.
     -genePredExt create an extended genePred, including frame
                  information and gene name
     -impliedStopAfterCds - implied stop codon in GFF/GTF after CDS

================================================================
========   liftOver   ====================================
================================================================
### kent source version 362 ###
liftOver - Move annotations from one assembly to another
usage:
   liftOver oldFile map.chain newFile unMapped
oldFile and newFile are in bed format by default, but can be in GFF and
maybe eventually others with the appropriate flags below.
The map.chain file has the old genome as the target and the new genome
as the query.

***********************************************************************
WARNING: liftOver was only designed to work between different
         assemblies of the same organism. It may not do what you want
         if you are lifting between different organisms. If there has
         been a rearrangement in one of the species, the size of the
         region being mapped may change dramatically after mapping.
***********************************************************************

options:
   -minMatch=0.N Minimum ratio of bases that must remap. Default 0.95
   -gff  File is in gff/gtf format.  Note that the gff lines are converted
         separately.  It would be good to have a separate check after this
         that the lines that make up a gene model still make a plausible gene
         after liftOver
   -genePred - File is in genePred format
   -sample - File is in sample format
   -bedPlus=N - File is bed N+ format
   -positions - File is in browser "position" format
   -hasBin - File has bin value (used only with -bedPlus)
   -tab - Separate by tabs rather than space (used only with -bedPlus)
   -pslT - File is in psl format, map target side only
   -ends=N - Lift the first and last N bases of each record and combine the
             result. This is useful for lifting large regions like BAC end pairs.
   -minBlocks=0.N Minimum ratio of alignment blocks or exons that must map
                  (default 1.00)
   -fudgeThick    (bed 12 or 12+ only) If thickStart/thickEnd is not mapped,
                  use the closest mapped base.  Recommended if using 
                  -minBlocks.
   -multiple               Allow multiple output regions
   -minChainT, -minChainQ  Minimum chain size in target/query, when mapping
                           to multiple output regions (default 0, 0)
   -minSizeT               deprecated synonym for -minChainT (ENCODE compat.)
   -minSizeQ               Min matching region size in query with -multiple.
   -chainTable             Used with -multiple, format is db.tablename,
                               to extend chains from net (preserves dups)
   -errorHelp              Explain error messages
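
The -minMatch threshold above can be read as a simple ratio test. The sketch below is only an illustration of that reading, not liftOver's actual code; the function name and arguments are hypothetical.

```python
def passes_min_match(total_bases, remapped_bases, min_match=0.95):
    """Return True when the remapped fraction meets the threshold.

    Assumption: "ratio of bases that must remap" means
    remapped_bases / total_bases for a single input record.
    """
    if total_bases <= 0:
        raise ValueError("total_bases must be positive")
    return remapped_bases / total_bases >= min_match


# A 1000-base record with 960 remapped bases clears the default 0.95;
# one with 900 remapped bases does not.
ok = passes_min_match(1000, 960)
rejected = not passes_min_match(1000, 900)
```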

================================================================
========   liftOverMerge   ====================================
================================================================
### kent source version 362 ###
liftOverMerge - Merge multiple regions in BED 5 files
                   generated by liftOver -multiple
usage:
   liftOverMerge oldFile newFile
options:
   -mergeGap=N    Max size of gap to merge regions (default 0)
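
The effect of -mergeGap can be pictured as ordinary interval merging. This is a simplified sketch, assuming plain (chrom, start, end) tuples; the real program works on BED 5 records and may group them differently (for example by name), which is not modeled here.

```python
def merge_regions(regions, merge_gap=0):
    """Merge (chrom, start, end) tuples whose gap is at most merge_gap bases."""
    merged = []
    for chrom, start, end in sorted(regions):
        if merged and merged[-1][0] == chrom and start - merged[-1][2] <= merge_gap:
            prev = merged[-1]
            # Extend the previous region instead of starting a new one.
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged
```

With merge_gap=0 only touching or overlapping regions are joined; a larger value also bridges gaps of up to that many bases.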

================================================================
========   liftUp   ====================================
================================================================
### kent source version 362 ###
liftUp - change coordinates of .psl, .agp, .gap, .gl, .out, .align, .gff, .gtf,
.bscore, .tab, .gdup, .axt, .chain, .net, .gp, .genepred, .wab, .bed, .bed3, or .bed8 files to
parent coordinate system.

usage:
   liftUp [-type=.xxx] destFile liftSpec how sourceFile(s)
The optional -type parameter tells what type of files to lift.
If omitted, the type is inferred from the suffix of destFile.
Type is one of the suffixes described above.
DestFile will contain the merged and lifted source files,
with the coordinates translated as per liftSpec.  LiftSpec
is tab-delimited with each line of the form:
   offset oldName oldSize newName newSize
LiftSpec may optionally have a sixth column specifying + or - strand,
but strand is not supported for all input types.
The 'how' parameter controls what the program will do with
items which are not in the liftSpec.  It must be one of:
   carry - Items not in liftSpec are carried to dest without translation
   drop  - Items not in liftSpec are silently dropped from dest
   warn  - Items not in liftSpec are dropped.  A warning is issued
   error - Items not in liftSpec generate an error
If the destination is a .agp file then a 'large inserts' file
also needs to be included in the command line:
   liftUp dest.agp liftSpec how inserts sourceFile(s)
This file describes where large inserts due to heterochromatin
should be added. Use /dev/null and set -gapsize if there is no inserts file.
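
The five-column liftSpec and the 'how' parameter described above can be sketched as follows. This is a minimal illustration, not liftUp's implementation: it handles only the offset translation for a generic (name, start, end) item, and it ignores the optional strand column and all format-specific details. The function names are hypothetical.

```python
import sys


def read_lift_spec(lines):
    """Parse tab-delimited liftSpec lines: offset oldName oldSize newName newSize."""
    spec = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        offset, old_name, old_size, new_name, new_size = line.split("\t")[:5]
        spec[old_name] = (int(offset), int(old_size), new_name, int(new_size))
    return spec


def lift_item(spec, name, start, end, how="warn"):
    """Translate one item to parent coordinates; return None if it is dropped."""
    if name in spec:
        offset, _old_size, new_name, _new_size = spec[name]
        return (new_name, start + offset, end + offset)
    if how == "carry":
        return (name, start, end)      # carried over without translation
    if how == "drop":
        return None                    # silently dropped
    if how == "warn":
        print(f"warning: {name} not in liftSpec", file=sys.stderr)
        return None
    if how == "error":
        raise KeyError(f"{name} not in liftSpec")
    raise ValueError(f"unknown how: {how}")
```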

options:
   -nohead  No header written for .psl files
   -dots=N Output a dot every N lines processed
   -pslQ  Lift query (rather than target) side of psl
   -axtQ  Lift query (rather than target) side of axt
   -chainQ  Lift query (rather than target) side of chain
   -netQ  Lift query (rather than target) side of net
   -wabaQ  Lift query (rather than target) side of waba alignment
   	(waba lifts only work with query side at this time)
   -nosort Don't sort bed, gff, or gdup files, to save memory
   -gapsize change contig gapsize from default
   -ignoreVersions - Ignore NCBI-style version number in sequence ids of input files
   -extGenePred lift extended genePred

================================================================
========   linesToRa   ====================================
================================================================
### kent source version 362 ###
linesToRa - generate .ra format from lines with pipe-separated fields
usage:
   linesToRa in.txt out.ra

================================================================
========   localtime   ====================================
================================================================
localtime - convert unix timestamp to date string
usage: localtime <time stamp>
	<time stamp> - integer 0 to 2147483647
================================================================
========   mafAddIRows   ====================================
================================================================
### kent source version 362 ###
mafAddIRows - add 'i' rows to a maf
usage:
   mafAddIRows mafIn twoBitFile mafOut
WARNING:  requires a maf with only a single target sequence
options:
   -nBeds=listOfBedFiles
       reads in list of bed files, one per species, with N locations
   -addN
       adds rows of N's into maf blocks (rather than just annotating them)
   -addDash
       adds rows of -'s into maf blocks (rather than just annotating them)

================================================================
========   mafAddQRows   ====================================
================================================================
### kent source version 362 ###
mafAddQRows - Add quality data to a maf
usage:
   mafAddQRows species.lst in.maf out.maf
where each species.lst line contains two fields
   1) species name
   2) directory where the .qac and .qdx files are located
options:
  -divisor=n is value to divide Q value by.  Default is 5.

================================================================
========   mafCoverage   ====================================
================================================================
### kent source version 362 ###
mafCoverage - Analyse coverage by maf files - chromosome by 
chromosome and genome-wide.
usage:
   mafCoverage database mafFile
Note: the maf file must be sorted by chromosome,tStart.
options:
   -restrict=restrict.bed Restrict to parts in restrict.bed
   -count=N Number of matching species to count coverage. Default = 3

================================================================
========   mafFetch   ====================================
================================================================
mafFetch - get overlapping records from an MAF using an index table
usage:
   mafFetch db table overBed mafOut

Select MAF records overlapping records in the BED, using the
database table to look up the file and record offset.
Only the first 3 columns are required in the bed.

Options:

================================================================
========   mafFilter   ====================================
================================================================
### kent source version 362 ###
mafFilter - Filter out maf files. Output goes to standard out
usage:
   mafFilter file(s).maf
options:
   -tolerate - Just ignore bad input rather than aborting.
   -minCol=N - Filter out blocks with fewer than N columns (default 1)
   -minRow=N - Filter out blocks with fewer than N rows (default 2)
   -maxRow=N - Filter out blocks with >= N rows (default 100)
   -factor - Filter out scores below -minFactor * (ncol**2) * nrow
   -minFactor=N - Factor to use with -factor (default 5)
   -minScore=N - Minimum allowed score (alternative to -minFactor)
   -reject=filename - Save rejected blocks in filename
   -needComp=species - all alignments must have species as one of the components
   -overlap - Reject overlapping blocks in reference (assumes ordered blocks)
   -componentFilter=filename - Filter out blocks without a component listed in filename 
   -speciesFilter=filename - Filter out blocks without a species listed in filename 

================================================================
========   mafFrag   ====================================
================================================================
### kent source version 362 ###
mafFrag - Extract maf sequences for a region from database
usage:
   mafFrag database mafTrack chrom start end strand out.maf
options:
   -outName=XXX  Use XXX instead of database.chrom for the name

================================================================
========   mafFrags   ====================================
================================================================
### kent source version 362 ###
mafFrags - Collect MAFs from regions specified in a 6 column bed file
usage:
   mafFrags database track in.bed out.maf
options:
   -orgs=org.txt - File with list of databases/organisms in order
   -bed12 - If set, in.bed is a bed 12 file, including exons
   -thickOnly - Only extract subset between thickStart/thickEnd
   -meFirst - Put native sequence first in maf
   -txStarts - Add MAF txstart region definitions ('r' lines) using BED name
    and output actual reference genome coordinates in MAF.
   -refCoords - output actual reference genome coordinates in MAF.

================================================================
========   mafGene   ====================================
================================================================
### kent source version 362 ###
mafGene - output protein alignments using maf and genePred
usage:
   mafGene dbName mafTable genePredTable species.lst output
arguments:
   dbName         name of SQL database
   mafTable       name of maf file table
   genePredTable  name of the genePred table
   species.lst    list of species names
   output         put output here
options:
   -useFile           genePredTable argument is a genePred file name
   -geneName=foobar   name of gene as it appears in genePred
   -geneList=foolst   name of file with list of genes
   -geneBeds=foo.bed  name of bed file with genes and positions
   -chrom=chr1        name of chromosome from which to grab genes
   -exons             output exons
   -noTrans           don't translate output into amino acids
   -uniqAA            put out unique pseudo-AA for every different codon
   -includeUtr        include the UTRs, use only with -noTrans
   -delay=N           delay N seconds between genes (default 0)
   -noDash            don't output lines with all dashes

================================================================
========   mafMeFirst   ====================================
================================================================
### kent source version 362 ###
mafMeFirst - Move component to top if it is one of the named ones.  
Useful in conjunction with mafFrags when you don't want the one with 
the gene name to be in the middle.
usage:
   mafMeFirst in.maf me.list out.maf
options:
   -xxx=XXX

================================================================
========   mafOrder   ====================================
================================================================
### kent source version 362 ###
mafOrder - order components within a maf file
usage:
   mafOrder mafIn order.lst mafOut
where order.lst has one species per line
options:

================================================================
========   mafRanges   ====================================
================================================================
### kent source version 362 ###
mafRanges - Extract ranges of target (or query) coverage from maf and 
            output as BED 3 (e.g. for processing by featureBits).
usage:
   mafRanges in.maf db out.bed
            db should appear in in.maf alignments as the first part of 
            "db.seqName"-style sequence names.  The seqName part will 
            be used as the chrom field in the range printed to out.bed.
options:
   -otherDb=oDb  Output ranges only for alignments that include oDb.
                 oDb can be a comma-separated list.
   -notAllOGap   Don't include bases for which all other species have a gap.
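
The "db.seqName" convention described above can be illustrated with a short sketch that reads MAF 's' lines and emits BED 3 ranges. It is an assumption-laden simplification, not the program's logic: it reports every component whose source starts with the given db, converts minus-strand starts to plus-strand coordinates, and ignores the -otherDb and -notAllOGap options.

```python
def maf_ranges(maf_lines, db):
    """Yield (chrom, start, end) for each 's' line whose source is db.seqName."""
    prefix = db + "."
    for line in maf_lines:
        fields = line.split()
        # MAF sequence lines: s src start size strand srcSize text
        if len(fields) < 7 or fields[0] != "s":
            continue
        src = fields[1]
        start, size = int(fields[2]), int(fields[3])
        strand, src_size = fields[4], int(fields[5])
        if not src.startswith(prefix):
            continue
        if strand == "-":
            # Minus-strand starts are relative to the reverse-complemented source.
            start = src_size - (start + size)
        yield (src[len(prefix):], start, start + size)
```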


================================================================
========   mafSpeciesList   ====================================
================================================================
### kent source version 362 ###
mafSpeciesList - Scan maf and output all species used in it.
usage:
   mafSpeciesList in.maf out.lst
options:
   -ignoreFirst - If true ignore first species in each maf, useful when this
                  is a mafFrags result that puts gene id there.

================================================================
========   mafSpeciesSubset   ====================================
================================================================
### kent source version 362 ###
mafSpeciesSubset - Extract a maf that just has a subset of species.
usage:
   mafSpeciesSubset in.maf species.lst out.maf
Where:
    in.maf is a file where the sequence sources are either simple species
           names, or species.something.  In practice the part before the dot
           is usually a genome database name rather than a species name.
    species.lst is a file with a list of species to keep
    out.maf is the output.  It will have columns that are all - or . in
           the reduced species set removed, as well as the lines representing
           species not in species.lst removed.
options:
   -keepFirst - If set, keep the first 'a' line in a maf no matter what
                Useful for mafFrag results where we use this for the gene name

================================================================
========   mafSplit   ====================================
================================================================
### kent source version 362 ###
mafSplit - Split multiple alignment files
usage:
   mafSplit splits.bed outRoot file(s).maf
options:
   -byTarget       Make one file per target sequence.  (splits.bed input
                   is ignored).
   -outDirDepth=N  For use only with -byTarget.
                   Create N levels of output directory under current dir.
                   This helps prevent NFS problems with a large number of
                   files in a directory.  Using -outDirDepth=3 would
                   produce ./1/2/3/outRoot123.maf.
   -useSequenceName  For use only with -byTarget.
                     Instead of auto-incrementing an integer to determine
                     output filename, expect each target sequence name to
                     end with a unique number and use that number as the
                     integer to tack onto outRoot.
   -useFullSequenceName  For use only with -byTarget.
                     Instead of auto-incrementing an integer to determine
                     output filename, use the target sequence name
                     to tack onto outRoot.
   -useHashedName=N  For use only with -byTarget.
                     Instead of auto-incrementing an integer or requiring
                     a unique number in the sequence name, use a hash
                     function on the sequence name to compute an N-bit
                     number.  This limits the max #filenames to 2^N and
                     ensures that even if different subsets of sequences
                     appear in different pairwise mafs, the split file
                     names will be consistent (due to hash function).
                     This option is useful when a "scaffold-based"
                     assembly has more than one sequence name pattern,
                     e.g. both chroms and scaffolds.
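
The interplay of -outDirDepth and -useHashedName can be sketched as below. Everything beyond the documented ./1/2/3/outRoot123.maf example is an assumption: the hash shown (CRC32 masked to N bits) is a stand-in and is not the hash mafSplit uses, and the zero-padding of short numbers is a guess made for illustration.

```python
import zlib


def hashed_number(seq_name, n_bits):
    """Map a sequence name to a stable N-bit number (stand-in hash, not mafSplit's)."""
    return zlib.crc32(seq_name.encode("utf-8")) & ((1 << n_bits) - 1)


def split_path(out_root, number, depth):
    """Build an output path with one directory level per digit."""
    if depth <= 0:
        return f"./{out_root}{number}.maf"
    digits = str(number).zfill(depth)    # assumption: pad short numbers with zeros
    dirs = "/".join(digits[-depth:])     # assumption: last `depth` digits name the dirs
    return f"./{dirs}/{out_root}{digits}.maf"
```

Because the number depends only on the sequence name, the same name always lands in the same file, which is the consistency property the option description relies on.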


================================================================
========   mafSplitPos   ====================================
================================================================
### kent source version 362 ###
mafSplitPos - Pick positions to split multiple alignment input files
usage:
   mafSplitPos database size(Mbp) out.bed
options:
   -chrom=chrN   Restrict to one chromosome
   -minGap=N     Split only on gaps >N bp, defaults to 100, specify -1 to disable
   -minRepeat=N  Split only on repeats >N bp, defaults to 100, specify -1 to disable

================================================================
========   mafToAxt   ====================================
================================================================
### kent source version 362 ###
mafToAxt - Convert from maf to axt format
usage:
   mafToAxt in.maf tName qName output
Where tName and qName are the names for the
target and query sequences respectively.
tName should be maf target since it must always
be oriented in "+" direction.
  Use 'first' for tName to always use first sequence
Options:
  -stripDb - Strip names from start to first period.

================================================================
========   mafToBigMaf   ====================================
================================================================
### kent source version 362 ###
mafToBigMaf - Put ucsc standard maf file into bigMaf format
usage:
   mafToBigMaf referenceDb input.maf out.bed
options:
   -xxx=XXX

================================================================
========   mafToPsl   ====================================
================================================================
### kent source version 362 ###
mafToPsl - Convert maf to psl format
usage:
   mafToPsl querySrc targetSrc in.maf out.psl

The query and target src can be either an organism prefix (hg17),
or a full src sequence name (hg17.chr11), or just the sequence name
if the MAF does not contain organism prefixes.


================================================================
========   mafToSnpBed   ====================================
================================================================
### kent source version 362 ###
mafToSnpBed - finds SNPs in MAF and builds a bed with their functional consequence
usage:
   mafToSnpBed database input.maf input.gp output.bed
options:
   -xxx=XXX

================================================================
========   mafsInRegion   ====================================
================================================================
### kent source version 362 ###
mafsInRegion - Extract MAFS in a genomic region
usage:
    mafsInRegion regions.bed out.maf|outDir in.maf(s)
options:
    -outDir - output separate files named by bed name field to outDir
    -keepInitialGaps - keep alignment columns at the beginning and end of a block that are gapped in all species

================================================================
========   makeTableList   ====================================
================================================================
### kent source version 362 ###
makeTableList - create/recreate tableList tables (cache of SHOW TABLES and DESCRIBE)
usage:
   makeTableList [assemblies]
options:
   -host               show tables: mysql host
   -user               show tables: mysql user
   -password           show tables: mysql password
   -toProf             optional: mysql profile to write table list to (target server)
   -toHost             alternative to toProf: mysql target host
   -toUser             alternative to toProf: mysql target user
   -toPassword         alternative to toProf: mysql target pwd
   -hgcentral          specify an alternative hgcentral db name when using -all
   -all                recreate tableList for all active assemblies in hg.conf's hgcentral
   -bigFiles           create table with tuples (track, name of bigfile)

================================================================
========   maskOutFa   ====================================
================================================================
### kent source version 362 ###
maskOutFa - Produce a masked .fa file given an unmasked .fa and
a RepeatMasker .out file, or a .bed file to mask on.
usage:
   maskOutFa in.fa maskFile out.fa.masked
where in.fa and out.fa.masked can be the same file, and
maskFile can end in .out (for RepeatMasker) or .bed.
MaskFile parameter can also be the word 'hard' in which case 
lower case letters are converted to N's.
options:
   -soft - puts masked parts in lower case, others in upper case.
   -softAdd - lower cases masked bits, leaves others unchanged
   -clip - clip out of bounds mask records rather than dying.
   -maskFormat=fmt - "out" or "bed" for when input does not have required extension.
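
The masking modes above can be illustrated on a plain string. This is a sketch, assuming zero-based half-open intervals as in BED; it does no FASTA or RepeatMasker .out parsing, and the function names are hypothetical.

```python
def soft_mask(seq, intervals, keep_other_case=False):
    """Lower-case the bases inside each (start, end) interval.

    keep_other_case=False follows the -soft description (other bases upper-cased);
    keep_other_case=True follows -softAdd (other bases left unchanged).
    """
    out = list(seq if keep_other_case else seq.upper())
    for start, end in intervals:
        # Clip out-of-bounds intervals rather than failing, in the spirit of -clip.
        for i in range(max(start, 0), min(end, len(out))):
            out[i] = out[i].lower()
    return "".join(out)


def hard_mask(seq):
    """Replace lower-case letters with N, as described for the 'hard' maskFile."""
    return "".join("N" if base.islower() else base for base in seq)
```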

================================================================
========   mktime   ====================================
================================================================
mktime - convert date string to unix timestamp
usage: mktime YYYY-MM-DD HH:MM:SS
valid dates: 1970-01-01 00:00:00 to 2038-01-19 03:14:07
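
The valid range quoted above matches a signed 32-bit timestamp interpreted in UTC, which the following sketch demonstrates. It uses UTC throughout so the result is deterministic; the mktime and localtime programs themselves may apply the local time zone, which is not modeled here.

```python
import calendar
import time

DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def to_timestamp(date_string):
    """Convert 'YYYY-MM-DD HH:MM:SS' (taken as UTC) to a unix timestamp."""
    return calendar.timegm(time.strptime(date_string, DATE_FORMAT))


def to_date_string(timestamp):
    """Convert a unix timestamp to 'YYYY-MM-DD HH:MM:SS' in UTC."""
    return time.strftime(DATE_FORMAT, time.gmtime(timestamp))


# The upper bound of the valid range is the largest signed 32-bit integer.
upper = to_timestamp("2038-01-19 03:14:07")
```
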
================================================================
========   mrnaToGene   ====================================
================================================================
### kent source version 362 ###
mrnaToGene - convert PSL alignments of mRNAs to gene annotations
usage:
   mrnaToGene [options] psl genePredFile

Convert PSL alignments with CDS annotation from GenBank to gene
annotations in genePred format.  Accessions without valid CDS are
optionally dropped. A best attempt is made to convert incomplete CDS
annotations.

The psl argument may either be a PSL file or a table in a database,
depending on options.  CDS may be obtained from the database or file.
Accessions in PSL files are tried with and without GenBank versions.

Options:
  -db=db - get PSLs and CDS from this database, psl specifies the table.
  -cdsDb=db - get CDS from this database, psl is a file.
  -cdsFile=file - get CDS from this file, psl is a file.
   File is tab-separated with name as the first column and
   NCBI CDS the second
  -insertMergeSize=8 - Merge inserts (gaps) no larger than this many bases.
   A negative size disables merging of blocks.  This differs from specifying zero
   in that adjacent blocks will not be merged, allowing tracking of frame for
   each block. Defaults to 8 unless -cdsMergeSize or -utrMergeSize are specified,
   if either of these are specified, this option is ignored.
  -smallInsertSize=n - alias for -insertMergeSize
  -cdsMergeSize=-1 - merge gaps in CDS no larger than this size.
   A negative value disables merging.
  -cdsMergeMod3 - only merge CDS gaps if their size is a multiple of 3
  -utrMergeSize=-1  - merge gaps in UTR no larger than this size.
   A negative value disables merging.
  -requireUtr - Drop sequences that don't have both 5' and 3' UTR annotated.
  -genePredExt - create an extended genePred, including frame information.
  -allCds - consider PSL to be all CDS.
  -noCds - consider PSL to not contain any CDS.
  -keepInvalid - Keep sequences with invalid CDS.
  -quiet - Don't print info about dropped sequences.
  -ignoreUniqSuffix - ignore all characters after last `-' in qName
   when looking up CDS. Used when a suffix has been added to make qName
   unique.  It is not removed from the name in the genePred.


================================================================
========   netChainSubset   ====================================
================================================================
### kent source version 362 ###
netChainSubset - Create chain file with subset of chains that appear in the net
usage:
   netChainSubset in.net in.chain out.chain
options:
   -gapOut=gap.tab - Output gap sizes to file
   -type=XXX - Restrict output to particular type in net file
   -splitOnInsert - Split chain when there is an insertion of another chain
   -wholeChains - Write entire chains referenced by net, don't split
    when a high-level net is encountered.  This is useful when nets
    have been filtered.
   -skipMissing - skip chains that are not found instead of generating
    an error.  Useful if chains have been filtered.

================================================================
========   netClass   ====================================
================================================================
### kent source version 362 ###
netClass - Add classification info to net
usage:
   netClass [options] in.net tDb qDb out.net
       tDb - database to fetch target repeat masker table information
       qDb - database to fetch query repeat masker table information
options:
   -tNewR=dir - Dir of chrN.out.spec files, with RepeatMasker .out format
                lines describing lineage specific repeats in target
   -qNewR=dir - Dir of chrN.out.spec files for query
   -noAr - Don't look for ancient repeats
   -qRepeats=table - table name for query repeats in place of rmsk
   -tRepeats=table - table name for target repeats in place of rmsk
                   - for example: -tRepeats=windowmaskerSdust
   -liftQ=file.lft - Lift in.net's query coords to chrom-level using
                     file.lft (for accessing chrom-level coords in qDb)
   -liftT=file.lft - Lift in.net's target coords to chrom-level using
                     file.lft (for accessing chrom-level coords in tDb)

================================================================
========   netFilter   ====================================
================================================================
### kent source version 362 ###
netFilter - Filter out parts of net.  What passes
filter goes to standard output.  Note a net is a
recursive data structure.  If a parent fails to pass
the filter, the children are not even considered.
usage:
   netFilter in.net(s)
options:
   -q=chr1,chr2 - restrict query side sequence to those named
   -notQ=chr1,chr2 - restrict query side sequence to those not named
   -t=chr1,chr2 - restrict target side sequence to those named
   -notT=chr1,chr2 - restrict target side sequence to those not named
   -minScore=N - restrict to those scoring at least N
   -maxScore=N - restrict to those scoring less than N
   -minGap=N  - restrict to those with gap size (tSize) >= N
   -minAli=N - restrict to those with at least given bases aligning
   -maxAli=N - restrict to those with at most given bases aligning
   -minSizeT=N - restrict to those at least this big on target
   -minSizeQ=N - restrict to those at least this big on query
   -qStartMin=N - restrict to those with qStart at least N
   -qStartMax=N - restrict to those with qStart less than N
   -qEndMin=N - restrict to those with qEnd at least N
   -qEndMax=N - restrict to those with qEnd less than N
   -tStartMin=N - restrict to those with tStart at least N
   -tStartMax=N - restrict to those with tStart less than N
   -tEndMin=N - restrict to those with tEnd at least N
   -tEndMax=N - restrict to those with tEnd less than N
   -qOverlapStart=N - restrict to those where the query overlaps a region starting here
   -qOverlapEnd=N - restrict to those where the query overlaps a region ending here
   -tOverlapStart=N - restrict to those where the target overlaps a region starting here
   -tOverlapEnd=N - restrict to those where the target overlaps a region ending here
   -type=XXX - restrict to given type, may be repeated to allow several types
   -syn        - do filtering based on synteny (tuned for human/mouse).  
   -minTopScore=N - Minimum score for top level alignments. default 300000
   -minSynScore=N - Min syntenic block score (def=200,000). 
                      Default covers 27,000 bases including 9,000 
                      aligning--a very stringent requirement. 
   -minSynSize=N - Min syntenic block size (def=20,000).
   -minSynAli=N  - Min syntenic alignment size(def=10,000).
   -maxFar=N     - Max distance to allow synteny (def=200,000). 
   -nonsyn     - do inverse filtering based on synteny (tuned for human/mouse).  
   -chimpSyn   - do filtering based on synteny (tuned for human/chimp).  
   -fill - Only pass fills, not gaps. Only useful with -line.
   -gap  - Only pass gaps, not fills. Only useful with -line.
   -line - Do this a line at a time, not recursing
   -noRandom      - suppress chains involving 'random' chromosomes
   -noHap         - suppress chains involving chromosome names including '_hap|_alt'
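
The recursive behavior noted in the description (a failing parent hides all of its children) can be sketched on a generic tree. The dictionary layout here is hypothetical and much simpler than a real net, which alternates fill and gap records; the point is only the order of evaluation.

```python
def filter_net(node, passes):
    """Return a filtered copy of node, or None if node itself fails.

    node is a dict with a "children" list; passes is a predicate on a node.
    Children of a failing node are never examined.
    """
    if not passes(node):
        return None
    kept = []
    for child in node.get("children", []):
        filtered = filter_net(child, passes)
        if filtered is not None:
            kept.append(filtered)
    result = dict(node)
    result["children"] = kept
    return result
```

In the example used for testing, a high-scoring grandchild is still removed because its parent scores below the threshold.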

================================================================
========   netSplit   ====================================
================================================================
netSplit - Split a genome net file into chromosome net files
usage:
   netSplit in.net outDir
options:
   -xxx=XXX

================================================================
========   netSyntenic   ====================================
================================================================
### kent source version 362 ###
netSyntenic - Add synteny info to net.
usage:
   netSyntenic in.net out.net
options:
   -xxx=XXX

================================================================
========   netToAxt   ====================================
================================================================
### kent source version 362 ###
netToAxt - Convert net (and chain) to axt.
usage:
   netToAxt in.net in.chain target.2bit query.2bit out.axt
note:
   directories full of .nib files (an older format)
   may also be used in place of target.2bit and query.2bit.
options:
   -qChain - net is with respect to the q side of chains.
   -maxGap=N - maximum size of gap before breaking. Default 100
   -gapOut=gap.tab - Output gap sizes to file
   -noSplit - Don't split chain when there is an insertion of another chain

================================================================
========   netToBed   ====================================
================================================================
### kent source version 362 ###
netToBed - Convert target coverage of net to a bed file.
usage:
   netToBed in.net out.bed
options:
   -maxGap=N - break up at gaps of given size or more
   -minFill=N - only include fill of given size or above.

================================================================
========   newProg   ====================================
================================================================
### kent source version 362 ###
newProg - make a new C source skeleton.
usage:
   newProg progName description words
This will make a directory 'progName' and a file in it 'progName.c'
with a standard skeleton

Options:
   -jkhgap - include jkhgap.a and mysql libraries as well as jkweb.a archives 
   -cgi    - create shell of a CGI script for web
================================================================
========   newPythonProg   ====================================
================================================================
### kent source version 362 ###
newPythonProg - Make a skeleton for a new python program
usage:
   newPythonProg programName "The usage statement"
options:
   -xxx=XXX

================================================================
========   nibFrag   ====================================
================================================================
### kent source version 362 ###
nibFrag - Extract part of a nib file as .fa (all bases/gaps lower case by default)
usage:
   nibFrag [options] file.nib start end strand out.fa
where strand is + (plus) or m (minus)
options:
   -masked       Use lower-case characters for bases meant to be masked out.
   -hardMasked   Use upper-case for not masked-out, and 'N' characters for masked-out bases.
   -upper        Use upper-case characters for all bases.
   -name=name    Use given name after '>' in output sequence.
   -dbHeader=db  Add full database info to the header, with or without -name option.
   -tbaHeader=db Format header for compatibility with tba, takes database name as argument.

================================================================
========   nibSize   ====================================
================================================================
### kent source version 362 ###
nibSize - print size of nibs
usage:
   nibSize nib1 [...]

================================================================
========   oligoMatch   ====================================
================================================================
oligoMatch - find perfect matches in sequence.
usage:
   oligoMatch oligos sequence output.bed
where "oligos" and "sequence" can be .fa, .nib, or .2bit files.
The oligos may contain IUPAC codes.
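
Perfect matching with IUPAC codes can be sketched by expanding each code into a character class. This is an illustration only: it searches the forward strand of an in-memory string, reports zero-based half-open coordinates, and does not read .fa, .nib, or .2bit files. Whether the program also reports reverse-strand matches is not stated above and is not modeled.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def find_matches(oligo, sequence):
    """Return (start, end) for every forward-strand match, overlaps included."""
    pattern = "".join(IUPAC[base] for base in oligo.upper())
    # A lookahead lets overlapping matches be reported.
    regex = re.compile("(?=(" + pattern + "))")
    return [(m.start(), m.start() + len(oligo)) for m in regex.finditer(sequence.upper())]
```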

================================================================
========   overlapSelect   ====================================
================================================================
### kent source version 362 ###
usage:
   overlapSelect [options] selectFile inFile outFile

Select records based on overlapping chromosome ranges.  The ranges are
specified in the selectFile, with each block specifying a range.
Records are copied from the inFile to outFile based on the selection
criteria.  Selection is based on blocks or exons rather than entire
range.

Options starting with -select* apply to selectFile and those starting
with -in* apply to inFile.

Options:
  -selectFmt=fmt - specify selectFile format:
          psl - PSL format (default for *.psl files).
          pslq - PSL format, using query instead of target
          genePred - genePred format (default for *.gp or
                     *.genePred files).
          bed - BED format (default for *.bed files).
                If BED doesn't have blocks, the bed range is used. 
          chain - chain file format (default from .chain files)
          chainq - chain file format, using query instead of target
  -selectCoordCols=spec - selectFile is tab-separated with coordinates
       as described by spec, which is one of:
            o chromCol - chrom in this column followed by start and end.
            o chromCol,startCol,endCol,strandCol,name - chrom, start, end, and
              strand in specified columns. Columns can be omitted from the end
              or left empty to not specify.
          NOTE: column numbers are zero-based
  -selectCds - Use only CDS in the selectFile
  -selectRange - Use entire range instead of blocks from records in
          the selectFile.
  -inFmt=fmt - specify inFile format, same values as -selectFmt.
  -inCoordCols=spec - inFile is tab-separated with coordinates specified by
      spec, in format described above.
  -inCds - Use only CDS in the inFile
  -inRange - Use entire range instead of blocks of records in the inFile.
  -nonOverlapping - select non-overlapping instead of overlapping records
  -strand - must be on the same strand to be considered overlapping
  -oppositeStrand - must be on the opposite strand to be considered overlapping
  -excludeSelf - don't compare records with the same coordinates and name.
      Warning: using only one of -inCds or -selectCds will result in different
      coordinates for the same record.
  -idMatch - only select overlapping records if they have the same id
  -aggregate - instead of computing overlap based on individual select entries,
      compute it based on the total number of inFile bases overlapped by
      selectFile records. -overlapSimilarity and -mergeOutput will not work
      with this option.
  -overlapThreshold=0.0 - minimum fraction of an inFile record that
      must be overlapped by a single select record to be considered
      overlapping.  Note that this is only coverage by a single select
      record, not total coverage.
  -overlapThresholdCeil=1.1 - select only inFile records with less than
      this amount of overlap with a single record, provided they are selected
      by other criteria.
  -overlapSimilarity=0.0 - minimum fraction of bases in inFile and selectFile
      records that overlap the same genomic locations.  This is computed
      by (2*overlapBases)/(inFileBases+selectFileBases).
      Note that this is only coverage by a single select record and this
      is bidirectional: inFile and selectFile must both overlap by this
      amount.  A value of 1.0 will select identical records (or CDS if
      both CDS options are specified).  Not currently supported with
      -aggregate.
  -overlapSimilarityCeil=1.1 - select only inFile records with less than this
      amount of similarity with a single record, provided they are selected by
      other criteria.
  -overlapBases=-1 - minimum number of bases of overlap, < 0 disables.
  -statsOutput - output overlap statistics instead of selected records.
      If no overlap criteria are specified, all overlapping entries are
      reported.  Otherwise only the pairs passing the criteria are
      reported. This results in a tab-separated file with the columns:
         inId selectId inOverlap selectOverlap overBases
      Where inOverlap is the fraction of the inFile record overlapped by
      the selectFile record and selectOverlap is the fraction of the
      select record overlapped by inFile records.  With -aggregate, output
      is:
         inId inOverlap inOverBases inBases
  -statsOutputAll - like -statsOutput, however output all inFile records,
      including those that are not overlapped.
  -statsOutputBoth - like -statsOutput, however output all selectFile and
      inFile records, including those that are not overlapped.
  -mergeOutput - output file will be a merge of the input file with the
      selectFile records that selected it.  The format is
         inRec<tab>selectRec.
      If multiple select records hit, inRec is repeated. This will increase
      the memory required. Not supported with -nonOverlapping or -aggregate.
  -idOutput - output a tab-separated file of pairs of
         inId selectId
      with -aggregate, only a single column of inId is written
  -dropped=file  - output rows that were dropped to this file.
  -verbose=n - verbose > 1 prints some details.
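
The fractions used by -overlapThreshold and -overlapSimilarity can be worked
through with a short Python sketch. It is a simplification under stated
assumptions: each record is a single half-open (start, end) range on the same
chromosome, whereas overlapSelect works on blocks or exons by default. The
function names are hypothetical.

```python
def overlap_bases(in_range, sel_range):
    """Number of bases shared by two half-open (start, end) ranges."""
    return max(0, min(in_range[1], sel_range[1]) - max(in_range[0], sel_range[0]))


def overlap_fraction(in_range, sel_range):
    """Fraction of the inFile range covered by one select range."""
    in_bases = in_range[1] - in_range[0]
    return overlap_bases(in_range, sel_range) / in_bases


def overlap_similarity(in_range, sel_range):
    """(2*overlapBases)/(inFileBases+selectFileBases), as in the option text."""
    in_bases = in_range[1] - in_range[0]
    sel_bases = sel_range[1] - sel_range[0]
    return 2 * overlap_bases(in_range, sel_range) / (in_bases + sel_bases)


# A select record that covers half of the inFile record but is itself shorter.
print(overlap_fraction((0, 100), (0, 50)))
print(overlap_similarity((0, 100), (0, 50)))
```

Because the similarity divides by the sizes of both records, a short select
record inside a long inFile record scores well below 1.0 even though it is
fully contained.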

================================================================
========   para   ====================================
================================================================
### kent source version 362 ###
para - version 12.18
Manage a batch of jobs in parallel on a compute cluster.
Normal usage is to do a 'para create' followed by 'para push' until
job is done.  Use 'para check' to check status.
usage:

   para [options] command [command-specific arguments]

The commands are:

para create jobList
   This makes the job-tracking database from a text file with the
   command line for each job on a separate line.
   options:
      -cpu=N  Number of CPUs used by the jobs, default 1.
      -ram=N  Number of bytes of RAM used by the jobs.
         Default is RAM on node divided by number of cpus on node.
         Shorthand expressions allow t,g,m,k for tera, giga, mega, kilo.
         e.g. 4g = 4 Gigabytes.
      -batch=batchDir - specify the directory path that is used to store the
       batch control files.  The batchDir can be an absolute path or a path
       relative to the current directory.  The resulting path is used as the
       batch name.  The directory is created if it doesn't exist.  When
       creating a new batch, batchDir should not have been previously used as
       a batch name.  The batchDir must be writable by the paraHub process.
       This does not affect the working directory assigned to jobs.  It defaults
       to the directory where para is run.  If used, this option must be specified
       on all para commands for the  batch.  For example to run two batches in the
       same directory:
          para -batch=b1 make jobs1
          para -batch=b2 make jobs2
para push 
   This pushes forward the batch of jobs by submitting jobs to parasol
   It will limit parasol queue size to something not too big and
   retry failed jobs.
   options:
      -retries=N  Number of retries per job - default 4.
      -maxQueue=N  Number of jobs to allow on parasol queue. 
         Default 2000000.
      -minPush=N  Minimum number of jobs to queue. 
         Default 1.  Overrides maxQueue.
      -maxPush=N  Maximum number of jobs to queue - default 100000.
      -warnTime=N  Number of minutes job runs before hang warning. 
         Default 4320 (3 days).
      -killTime=N  Number of minutes hung job runs before push kills it.
         By default killing is off for backwards compatibility.
      -delayTime=N  Number of seconds to delay before submitting next job 
         to minimize i/o load at startup - default 0.
      -priority=x  Set batch priority to high, medium, or low.
         Default medium (use high only with approval).
         If needed, use with make, push, create, shove, or try.
         Or, set batch priority to a specific numeric value - default 10.
         1 is emergency high priority, 
         10 is normal medium, 
         100 is low for bottomfeeders.
         Setting priority higher than normal (1-9) will be logged.
         Please keep low priority jobs short, they won't be pre-empted.
      -maxJob=x  Limit the number of jobs the batch can run.
         Specify number of jobs, for example 10 or 'unlimited'.
         Default unlimited displays as -1.
      -jobCwd=dir - specify the directory path to use as the current working
       directory for each job.  The dir can be an absolute path or a path
       relative to the current directory. It defaults to the directory where
       para is run.
para try 
   This is like para push, but only submits up to 10 jobs.
para shove
   Push jobs in this database until all are done or one fails after N retries.
para make jobList
   Create database and run all jobs in it if possible.  If one job
   fails repeatedly this will fail.  Suitable for inclusion in makefiles.
   Same as a 'create' followed by a 'shove'.
para check
   This checks on the progress of the jobs.
para stop
   This stops all the jobs in the batch.
para chill
   Tells system to not launch more jobs in this batch, but
   does not stop jobs that are already running.
para finished
   List jobs that have finished.
para hung
   List hung jobs in the batch (running > killTime).
para slow
   List slow jobs in the batch (running > warnTime).
para crashed
   List jobs that crashed or failed output checks the last time they were run.
para failed
   List jobs that crashed after repeated restarts.
para status
   List individual job status, including times.
para problems
   List jobs that had problems (even if successfully rerun).
   Includes host info.
para running
   Print info on currently running jobs.
para hippos time
   Print info on currently running jobs taking > 'time' (minutes) to run.
para time
   List timing information.
para recover jobList newJobList
   Generate a job list by selecting jobs from an existing list where
   the `check out' tests fail.
para priority 999
   Set batch priority. Values explained under 'push' options above.
para maxJob 999
   Set batch maxJob. Values explained under 'push' options above.
para ram 999
   Set batch ram usage. Values explained under 'push' options above.
para cpu 999
   Set batch cpu usage. Values explained under 'push' options above.
para resetCounts
   Set batch done and crash counters to 0.
para flushResults
   Flush results file.  Warns if batch has jobs queued or running.
para freeBatch
   Free all batch info on hub.  Works only if batch has nothing queued or running.
para showSickNodes
   Show sick nodes which have failed when running this batch.
para clearSickNodes
   Clear sick nodes statistics and consecutive crash counts of batch.

Common options
   -verbose=1 - set verbosity level.
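
The -ram shorthand described under 'para create' (t, g, m, k) can be
illustrated with a small Python sketch. The function name parse_ram is
hypothetical, and the use of binary multiples (1k = 1024 bytes) is an
assumption; the usage text above says only tera, giga, mega, kilo.

```python
def parse_ram(text):
    """Convert a value such as '4g' or '2000000' to a number of bytes."""
    # Assumption: suffixes are powers of 1024, not powers of 1000.
    multipliers = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3, "t": 1024 ** 4}
    text = text.strip().lower()
    if text and text[-1] in multipliers:
        return int(text[:-1]) * multipliers[text[-1]]
    return int(text)


print(parse_ram("4g"))
```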

================================================================
========   paraFetch   ====================================
================================================================
### kent source version 362 ###
paraFetch - try to fetch url with multiple connections
usage:
   paraFetch N R URL {outPath}
   where N is the number of connections to use
         R is the number of retries
   outPath is optional. If not specified, it will attempt to parse URL to discover output filename.
options:
   -newer  only download a file if it is newer than the version we already have.
   -progress  Show progress of download.

================================================================
========   paraHub   ====================================
================================================================
### kent source version 362 ###
paraHub - parasol hub server version 12.18
usage:
    paraHub machineList
Where machine list is a file with the following columns:
    name - Network name
    cpus - Number of CPUs we can use
    ramSize - Megabytes of memory
    tempDir - Location of (local) temp dir
    localDir - Location of local data dir
    localSize - Megabytes of local disk
    switchName - Name of switch this is on
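
A machineList row can be assembled in the column order listed above. The
Python sketch below is illustrative only: the helper name machine_row and all
of the values are hypothetical, and separating the fields with single spaces
is an assumption, since the usage text does not name the separator.

```python
COLUMNS = ("name", "cpus", "ramSize", "tempDir", "localDir", "localSize", "switchName")


def machine_row(**values):
    """Join the seven machineList fields in the documented order."""
    missing = [column for column in COLUMNS if column not in values]
    if missing:
        raise ValueError("missing columns: " + ", ".join(missing))
    # Assumption: fields are whitespace-separated.
    return " ".join(str(values[column]) for column in COLUMNS)


# Hypothetical node: 8 CPUs, 32768 MB of RAM, 500000 MB of local disk.
print(machine_row(name="node01.example.org", cpus=8, ramSize=32768,
                  tempDir="/tmp", localDir="/scratch", localSize=500000,
                  switchName="sw1"))
```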

options:
   -spokes=N  Number of processes that feed jobs to nodes - default 30.
   -jobCheckPeriod=N  Minutes between checking on job - default 10.
   -machineCheckPeriod=N  Minutes between checking on machine - default 20.
   -subnet=XXX.YYY.ZZZ  Only accept connections from subnet (example 192.168).
   -nextJobId=N  Starting job ID number.
   -logFacility=facility  Log to the specified syslog facility - default local0.
   -logMinPriority=pri minimum syslog priority to log, also filters file logging.
    defaults to "warn"
   -log=file  Log to file instead of syslog.
   -debug  Don't daemonize
   -noResume  Don't try to reconnect with jobs running on nodes.
   -ramUnit=N  Number of bytes of RAM in the base unit used by the jobs.
      Default is RAM on node divided by number of cpus on node.
      Shorthand expressions allow t,g,m,k for tera, giga, mega, kilo.
      e.g. 4g = 4 Gigabytes.
   -defaultJobRam=N Number of ram units used when a job has no specified ram usage.
      Defaults to 1.

================================================================
========   paraHubStop   ====================================
================================================================
paraHubStop - version 12.18
Shut down paraHub daemon.
usage:
   paraHubStop now

================================================================
========   paraNode   ====================================
================================================================
### kent source version 362 ###
paraNode - version 12.18
Parasol node server.
usage:
    paraNode start
options:
    -logFacility=facility  Log to the specified syslog facility - default local0.
    -logMinPriority=pri minimum syslog priority to log, also filters file logging.
     defaults to "warn"
    -log=file  Log to file instead of syslog.
    -debug  Don't daemonize
    -hub=host  Restrict access to connections from hub.
    -umask=000  Set umask to run under - default 002.
    -userPath=bin:bin/i386  User dirs to add to path.
    -sysPath=/sbin:/local/bin  System dirs to add to path.
    -env=name=value - add environment variable to jobs.  May be repeated.
    -randomDelay=N  Up to this many milliseconds of random delay before
        starting a job.  This is mostly to avoid swamping NFS with
        file opens when loading up an idle cluster.  Also it limits
        the impact on the hub of very short jobs. Default 5000.
    -cpu=N  Number of CPUs to use - default 1.

================================================================
========   paraNodeStart   ====================================
================================================================
### kent source version 362 ###
paraNodeStart - version 12.18
Start up parasol node daemons on a list of machines.
usage:
    paraNodeStart machineList
where machineList is a file containing a list of hosts.
Machine list contains the following columns:
     <name> <number of cpus>
It may have other columns as well.
options:
    -exe=/path/to/paraNode
    -logFacility=facility  Log to the specified syslog facility - default local0.
    -logMinPriority=pri minimum syslog priority to log, also filters file logging.
     defaults to "warn"
    -log=file  Log to file instead of syslog.
    -umask=000  Set umask to run under - default 002.
    -randomDelay=N  Set random start delay in milliseconds - default 5000.
    -userPath=bin:bin/i386  User dirs to add to path.
    -sysPath=/sbin:/local/bin  System dirs to add to path.
    -env=name=value - add environment variable to jobs.  May be repeated.
    -hub=machineHostingParaHub  Nodes will ignore messages from elsewhere.
    -rsh=/path/to/rsh/like/command.

================================================================
========   paraNodeStatus   ====================================
================================================================
paraNodeStatus - version 12.18
Check status of paraNode on a list of machines.
usage:
    paraNodeStatus machineList
options:
    -retries=N  Number of retries to get in touch with machine.
        The first retry is after 1/100th of a second. 
        Each retry after that takes twice as long up to a maximum
        of 1 second per retry.  Default is 7 retries and takes
        about a second.
    -long  List details of current and recent jobs.
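
The retry schedule described for -retries can be checked with a short Python
sketch. It only reproduces the arithmetic in the text above (first wait of
1/100th of a second, doubling each time, capped at 1 second); the function
name retry_delays is hypothetical and nothing here contacts a node.

```python
def retry_delays(retries=7, first=0.01, cap=1.0):
    """Return the wait in seconds before each retry."""
    delays = []
    delay = first
    for _ in range(retries):
        delays.append(min(delay, cap))
        delay *= 2
    return delays


delays = retry_delays()
print(delays)
# Total wait for the default of 7 retries.
print(round(sum(delays), 2))
```

With the default of 7 retries the waits add up to 1.27 seconds, which agrees
with the "about a second" stated above.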

================================================================
========   paraNodeStop   ====================================
================================================================
Couldn't open -verbose=2 , No such file or directory
================================================================
========   paraSync   ====================================
================================================================
### kent source version 362 ###
paraSync 1.0
paraSync - uses paraFetch to recursively mirror url to given path
usage:
   paraSync {options} N R URL outPath
   where N is the number of connections to use
         R is the number of retries
options:
   -A='ext1,ext2'  means accept only files with ext1 or ext2
   -newer  only download a file if it is newer than the version we already have.
   -progress  Show progress of download.

================================================================
========   paraTestJob   ====================================
================================================================
paraTestJob - version 12.18
A good test job to run on Parasol.  Can be configured to take a long time or crash.
usage:
   paraTestJob count
Run a relatively time consuming algorithm count times.
This algorithm takes about 1/10 of a second each time.
options:
   -crash  Try to write to NULL when done.
   -err  Return -1 error code when done.
   -output=file  Make some output in file as well.
   -heavy=n  Make output heavy: n extra lumberjack lines.
   -input=file  Make it read in a file too.
   -sleep=n  Sleep for N seconds.

================================================================
========   parasol   ====================================
================================================================
Parasol version 12.18
Parasol is the name given to the overall system for managing jobs on
a computer cluster and to this specific command.  This command is
intended primarily for system administrators.  The 'para' command
is the primary command for users.
Usage in brief:
   parasol add machine machineFullHostName localTempDir  - Add new machine to pool.
    or 
   parasol add machine machineFullHostName cpus ramSizeMB localTempDir localDir localSizeMB switchName
   parasol remove machine machineFullHostName "reason why"  - Remove machine from pool.
   parasol check dead - Check machines marked dead ASAP, some have been fixed.
   parasol add spoke  - Add a new spoke daemon.
   parasol [options] add job command-line   - Add job to list.
         options:
            -in=in - Where to get stdin, default /dev/null
            -out=out - Where to put stdout, default /dev/null
            -wait - If set wait for job to finish to return and return with job status code
            -err=outFile - set stderr to out file - only works with wait flag
            -verbose=N - set verbosity level, default level is 1
            -printId - prints jobId to stdout
            -dir=dir - set output results dir, default is current dir
            -results=resultFile fully qualified path to the results file, 
             or `results' in the current directory if not specified.
            -cpu=N  Number of CPUs used by the jobs, default 1.
            -ram=N  Number of bytes of RAM used by the jobs.
             Default is RAM on node divided by number of cpus on node.
             Shorthand expressions allow t,g,m,k for tera, giga, mega, kilo.
             e.g. 4g = 4 Gigabytes.
   parasol [options] clear sick  - Clear sick stats on a batch.
         options:
            -results=resultFile fully qualified path to the results file, 
             or `results' in the current directory if not specified.
   parasol remove job id  - Remove job of given ID.
   parasol ping [count]  - Ping hub server to make sure it's alive.
   parasol remove jobs userName [jobPattern]  - Remove jobs submitted by user that
         match jobPattern (which may include ? and * escaped for shell).
   parasol list machines  - List machines in pool.
   parasol [-extended] list jobs  - List jobs one per line.
   parasol list users  - List users one per line.
   parasol [options] list batches  - List batches one per line.
         option - 'all' if set include inactive
   parasol list sick  - List nodes considered sick by all running batches, one per line.
   parasol status  - Summarize status of machines, jobs, and spoke daemons.
   parasol [options] pstat2  - Get status of jobs queued and running.
         options:
            -results=resultFile fully qualified path to the results file, 
             or `results' in the current directory if not specified.
   parasol flushResults
         Flush results file.  Warns if batch has jobs queued or running.
         options:
            -results=resultFile fully qualified path to the results file, 
             or `results' in the current directory if not specified.
options:
   -host=hostname - connect to a paraHub process on a remote host instead of
                    localhost.
Important note:
  Options must precede positional arguments

================================================================
========   positionalTblCheck   ====================================
================================================================
### kent source version 362 ###
positionalTblCheck - check that positional tables are sorted
usage:
   positionalTblCheck db table

options:
  -verbose=n  n>=2, print tables as checked
This will check sorting of a table in a variety of formats.
It looks for commonly used names for chrom and chrom start
columns.  It also handles split tables

================================================================
========   pslCDnaFilter   ====================================
================================================================
### kent source version 362 ###
wrong # of args:  pslCDnaFilter [options] inPsl outPsl

Filter cDNA alignments in psl format.  Filtering criteria are
comparative, selecting near best in genome alignments for each
given cDNA and non-comparative, based only on the quality of an
individual alignment.

WARNING: comparative filters require that the input is sorted by
query name.  The command: 'sort -k 10,10' will do the trick.
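
Before running the comparative filters, the ordering of a PSL file can be
checked with a Python sketch like the one below. It assumes the standard
21-column tab-separated PSL layout with qName in column 10 (the column named
in the sort command above), and it tests only that all rows for a query are
adjacent, which is an assumption about what the filter needs. The function
name is hypothetical.

```python
def first_ungrouped_query(psl_lines):
    """Return the first qName that reappears after another query, else None."""
    seen = set()
    previous = None
    for line in psl_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 21 or not fields[0].isdigit():
            continue  # skip PSL header lines and anything malformed
        q_name = fields[9]  # column 10, counting from 1
        if q_name != previous:
            if q_name in seen:
                return q_name
            seen.add(q_name)
            previous = q_name
    return None
```

A typical use would be to pass an open file object, for example
first_ungrouped_query(open("in.psl")), where in.psl is a placeholder name.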

Each alignment is assigned a score that is based on identity and
weighted towards longer alignments and those with introns.  This
can do either global or local best-in-genome selection.  Local
near best in genome keeps fragments of an mRNA that align in
discontinuous locations from other fragments.  It is useful for
unfinished genomes.  Global near best in genome keeps alignments
based on overall score.

Options:
   -algoHelp - print message describing the filtering algorithm.

   -localNearBest=-1.0 - local near best in genome filtering,
    keeping alignments within this fraction of the top score for
    each aligned portion of the mRNA. A value of zero keeps only
    the best for each fragment. A value of -1.0 disables
    (default).

   -globalNearBest=-1.0 - global near best in genome filtering,
    keeping alignments within this fraction of the top score.  A
    value of zero keeps only the best alignment.  A value of -1.0
    disables (default).

   -ignoreNs - don't include Ns (repeat masked) while calculating the
    score and coverage. That is treat them as unaligned rather than
    mismatches.  Ns are still counted as mismatches when calculating
    the identity.

   -ignoreIntrons - don't favor apparent introns when scoring.

   -minId=0.0 - only keep alignments with at least this fraction
    identity.

   -minCover=0.0 - minimum fraction of query that must be
    aligned.  If -polyASizes is specified and the query is in
    the file, the poly-A is not included in coverage
    calculation.

   -decayMinCover  -  the minimum coverage is calculated
    per alignment from the query size using the formula:
       minCoverage = 1.0 - qSize / 250.0
    and minCoverage is bounded between 0.25 and 0.9.

   -minSpan=0.0 - keep only alignments whose target length is
    at least this fraction of the longest alignment passing the
    other filters.  This can be useful for removing possible
    retroposed genes.

   -minQSize=0 - drop queries shorter than this size

   -minAlnSize=0 - minimum number of aligned bases.  This includes
    repeats, but excludes poly-A/poly-T bases if available.

   -minNonRepSize=0 - Minimum number of matching bases that are not repeats.
    This does not include mismatches.
    Must use -repeats on BLAT if doing unmasked alignments.

   -maxRepMatch=1.0 - Maximum fraction of matching bases
    that are repeats.  Must use -repeats on BLAT if doing
    unmasked alignments.

   -repsAsMatch - treat matches in repeats just like other matches
 
   -maxAlignsDrop=-1 - maximum number of alignments for a given query. If
    exceeded, then all alignments of this query are dropped.
    A value of -1 disables (default)

   -maxAligns=-1 - maximum number of alignments for a given query. If
    exceeded, then alignments are sorted by score and only this number
    will be saved.  A value of -1 disables (default)

   -polyASizes=file - tab-separated file with information about
    poly-A tails and poly-T heads.  Format is outputted by
    faPolyASizes:

        id seqSize tailPolyASize headPolyTSize

   -usePolyTHead - if a poly-T head was detected and is longer
    than the poly-A tail, it is used when calculating coverage
    instead of the poly-A tail.

   -bestOverlap - filter overlapping alignments, keeping the best of
    alignments that are similar.  This is designed to be used with
    overlapping, windowed alignments, where one alignment might be truncated.
    Does not discard ones with weird overlap unless -filterWeirdOverlapped
    is specified.

   -hapRegions=psl - PSL format alignments of each haplotype pseudo-chromosome
    to the corresponding reference chromosome region.  This is used to map
    alignments between regions.

   -dropped=psl - save psls that were dropped to this file.

   -weirdOverlapped=psl - output weirdly overlapping PSLs to
    this file.

   -filterWeirdOverlapped - Filter weirdly overlapped alignments, keeping
    the single highest scoring one or an arbitrary one if multiple with
    the same high score.

   -alignStats=file - output the per-alignment statistics to this file

   -uniqueMapped - keep only cDNAs that are uniquely aligned after all
    other filters have been applied.

   -noValidate - don't run pslCheck validation.

   -statsOut=file - write filtering stats to this file, overrides -verbose=1

   -verbose=1 - 0: quiet
                1: output stats, unless -statsOut is specified
                2: list problem alignment (weird or invalid)
                3: list dropped alignments and reason for dropping
                4: list kept psl and info
                5: info about all PSLs

   -hapRefMapped=psl - output PSLs of haplotype to reference chromosome
    cDNA alignments mappings (for debugging purposes).

   -hapRefCDnaAlns=psl - output PSLs of haplotype cDNA to reference cDNA
    alignments (for debugging purposes).

   -hapLociAlns=outfile - output grouping of final alignments created by
    haplotype mapping process.  Each row will start with an integer haplotype
    group id number followed by a PSL record.  All rows with the same id are
    alignments of a given cDNA that were determined to be haplotypes of
    the same locus.  Alignments that are not part of a haplotype locus are not
    included.

   -alnIdQNameMode - add internal assigned alignment numbers to cDNA names
    on output.  Useful for debugging, as they are included in the verbose
    tracing as [#1], etc.  Will make a mess of normal production usage.

   -blackList=file.txt - adds a list of accession ranges to a black list.
    Any accession on this list is dropped. Black list file is two columns
    where the first column is the beginning of the range, and the second
    column is the end of the range, inclusive.


The default options don't do any filtering. If no filtering
criteria are specified, all PSLs will be passed through, except
those that are internally inconsistent.

THE INPUT MUST BE SORTED BY QUERY for the comparative filters.
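
The -decayMinCover formula given in the options above can be evaluated with a
short Python sketch. It applies only the stated formula and bounds; the
function name decay_min_cover is hypothetical, and how the tool applies the
result to an alignment is not shown.

```python
def decay_min_cover(q_size):
    """minCoverage = 1.0 - qSize / 250.0, bounded between 0.25 and 0.9."""
    raw = 1.0 - q_size / 250.0
    return min(0.9, max(0.25, raw))


# Short queries need high coverage; long queries fall to the 0.25 floor.
for q_size in (10, 50, 100, 500):
    print(q_size, decay_min_cover(q_size))
```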

================================================================
========   pslCat   ====================================
================================================================
pslCat - concatenate psl files
usage:
   pslCat file(s)
options:
   -check parses input.  Detects more errors but slower
   -nohead omit psl header
   -dir  files are directories (concatenate all in dirs)
   -out=file put output to file rather than stdout
   -ext=.xxx  limit files in directories to those with extension

================================================================
========   pslCheck   ====================================
================================================================
### kent source version 362 ###
pslCheck - validate PSL files
usage:
   pslCheck fileTbl(s)
options:
   -db=db - get targetSizes from this database, and if file doesn't exist,
    look for a table in this database.
   -prot - confirm psls are protein psls
   -noCountCheck - don't validate that match/mismatch counts match
    the total size of the alignment blocks
   -pass=pslFile - write PSLs without errors to this file
   -fail=pslFile - write PSLs with errors to this file
   -targetSizes=sizesFile - tab file with columns of target and size.
    If specified, psl is checked to have a valid target and target
    coordinates.
   -skipInsertCounts - Don't validate insert counts.  Useful for BLAT protein
    PSLs where these are not computed consistently.
   -querySizes=sizesFile - file with query sizes.
   -ignoreQUniq - ignore everything after the last `-' in the qName field, that
    is sometimes used to generate a unique identifier
   -quiet - don't write error messages, just filter
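
The kind of check that -targetSizes enables can be sketched in Python. This
is an illustration, not the pslCheck implementation: it assumes the standard
21-column PSL layout (tName, tSize, tStart, tEnd in columns 14 to 17) and a
two-column tab-separated sizes file, and the function names are hypothetical.

```python
def load_sizes(lines):
    """Read 'target<tab>size' rows into a dict."""
    sizes = {}
    for line in lines:
        line = line.strip()
        if line:
            name, size = line.split("\t")[:2]
            sizes[name] = int(size)
    return sizes


def target_problems(psl_fields, sizes):
    """Return a list of messages about the target side of one PSL row."""
    t_name = psl_fields[13]
    t_size, t_start, t_end = (int(psl_fields[i]) for i in (14, 15, 16))
    problems = []
    if t_name not in sizes:
        problems.append("unknown target " + t_name)
        return problems
    if sizes[t_name] != t_size:
        problems.append("tSize does not match sizes file")
    if not 0 <= t_start < t_end <= sizes[t_name]:
        problems.append("target coordinates out of range")
    return problems
```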

================================================================
========   pslDropOverlap   ====================================
================================================================
pslDropOverlap - deletes all overlapping self alignments. 
usage:
    pslDropOverlap in.psl out.psl

================================================================
========   pslFilter   ====================================
================================================================
pslFilter - filter out psl file
    pslFilter in.psl out.psl 
options
    -dir  Input files are directories rather than single files
    -reward=N (default 1) Bonus to score for match
    -cost=N (default 1) Penalty to score for mismatch
    -gapOpenCost=N (default 4) Penalty for gap opening
    -gapSizeLogMod=N (default 1.00) Penalty for gap sizes
    -minScore=N (default 15) Minimum score to pass filter
    -minMatch=N (default 30) Min match (including repeats) to pass
    -minUniqueMatch (default 20) Min non-repeats to pass
    -maxBadPpt (default 700) Maximum divergence in parts per thousand
    -minAli (default 600) Minimum ratio query in alignment in ppt
    -noHead  Don't output psl header
    -minAliT (default 0) Like minAli for target

================================================================
========   pslHisto   ====================================
================================================================
### kent source version 362 ###
wrong # of args:
pslHisto [options] what inPsl outHisto

Collect counts on PSL alignments for making histograms. These can
then be analyzed with R, textHistogram, etc.

The 'what' argument determines what data to collect, the following
are currently supported:

  o alignsPerQuery - number of alignments per query. Output is one
    line per query with the number of alignments.

  o coverSpread - difference between the highest and lowest coverage
    for alignments of a query.  Output line per query, with the difference.
    Only includes queries with multiple alignments

  o idSpread - difference between the highest and lowest fraction identity
    for alignments of a query.  Output line per query, with the difference.

Options:
   -multiOnly - omit queries with only one alignment from output.
   -nonZero - omit queries with zero values.
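
The alignsPerQuery count described above can be sketched in Python. This is an
illustration only, not the pslHisto implementation: the function name
aligns_per_query and its multi_only argument are hypothetical, and the sketch
assumes tab-separated PSL rows of 21 columns with qName in column 10.

```python
from collections import Counter


def aligns_per_query(psl_lines, multi_only=False):
    """Count alignments per query name (PSL column 10, index 9).

    Hypothetical helper; mirrors the idea of 'alignsPerQuery' only.
    """
    counts = Counter()
    for line in psl_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip header or malformed lines: a PSL row has 21 columns and
        # starts with an integer match count.
        if len(fields) < 21 or not fields[0].isdigit():
            continue
        counts[fields[9]] += 1
    if multi_only:
        # Same idea as -multiOnly: drop queries with a single alignment.
        return {q: n for q, n in counts.items() if n > 1}
    return dict(counts)
```

The resulting counts could be written one per line and passed to
textHistogram or R.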

================================================================
========   pslLiftSubrangeBlat   ====================================
================================================================
### kent source version 362 ###
pslLiftSubrangeBlat - lift PSLs from blat subrange alignments
usage:
   pslLiftSubrangeBlat inPsl outPsl

Lift a PSL with target coordinates from a blat subrange query
(e.g. blah/hg18.2bit:chr1:1000-20000) which has subrange
coordinates as the target name (e.g. chr1:1000-20000) to
actual target coordinates.

options:
  -tSizes=szfile - lift target side based on tName, using target sizes from
                   this tab separated file.
  -qSizes=szfile - lift query side based on qName, using query sizes from
                   this tab separated file.
Must specify at least one of -tSizes or -qSizes, or both.
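
The coordinate lift described above can be sketched in Python. This is a
minimal illustration, not the program's code: the function name lift_subrange
is hypothetical, and it assumes the start in "name:start-end" is a zero-based
offset to add to both coordinates.

```python
import re

# Matches names of the form "chr1:1000-20000".
SUBRANGE = re.compile(r"^(?P<name>.+):(?P<start>\d+)-(?P<end>\d+)$")


def lift_subrange(name, start, end):
    """Return (name, start, end) in full-sequence coordinates.

    Names without a subrange suffix are returned unchanged.
    """
    m = SUBRANGE.match(name)
    if m is None:
        return name, start, end
    # Assumption: the subrange start is a zero-based offset.
    offset = int(m.group("start"))
    return m.group("name"), start + offset, end + offset
```

For example, lift_subrange("chr1:1000-20000", 5, 105) gives
("chr1", 1005, 1105) under that assumption.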

================================================================
========   pslMap   ====================================
================================================================
### kent source version 362 ###
Error: wrong number of arguments
pslMap - map PSL alignments to new targets using alignments of
the old target to the new target.  Given inPsl and mapPsl, where
the target of inPsl is the query of mapPsl, create a new PSL
with the query of inPsl aligned to all the targets of mapPsl.
If inPsl is a protein to nucleotide alignment and mapPsl is a
nucleotide to nucleotide alignment, the resulting alignment is
nucleotide to nucleotide alignment of a hypothetical mRNA that
would code for the protein.  This is useful as it gives base
alignments of spliced codons.  A chain file may be used instead
of mapPsl.

usage:
   pslMap [options] inPsl mapFile outPsl

Options:
  -chainMapFile - mapFile is a chain file instead of a psl file
  -swapMap - swap query and target sides of map file.
  -swapIn - swap query and target sides of inPsl file.
  -suffix=str - append str to the query ids in the output
   alignment.  Useful with protein alignments, where the result
   is not actually an alignment of the protein.
  -keepTranslated - if either psl is translated, the output psl
   will be translated (both strands explicit).  Normally an
   untranslated psl will always be created.
  -mapFileWithInQName - The first column of the mapFile PSL records are a qName,
   the remainder is a standard PSL.  When an inPsl record is mapped, only
   mapping records are used with the corresponding qName.
  -mapInfo=file - output a file with information about each mapping.
   The file has the following columns:
     o srcQName, srcQStart, srcQEnd, srcQSize - qName, etc of
       psl being mapped (source alignment)
     o srcTName, srcTStart, srcTEnd - tName, etc of psl being
       mapped
     o srcStrand - strand of psl being mapped
     o srcAligned - number of aligned bases in psl being mapped
     o mappingQName, mappingQStart, mappingQEnd - qName, etc of
       mapping psl used to map alignment
     o mappingTName, mappingTStart, mappingTEnd - tName, etc of
       mapping psl
     o mappingStrand - strand of mapping psl
     o mappingId - chain id, or psl file row
     o mappedQName, mappedQStart, mappedQEnd - qName, etc of
       mapped psl
     o mappedTName, mappedTStart, mappedTEnd - tName, etc of
       mapped psl
     o mappedStrand - strand of mapped psl
     o mappedAligned - number of aligned bases that were mapped
     o qStartTrunc - aligned bases at qStart not mapped due to
       mapping psl/chain not covering the entire source psl.
       This is from the start of the query in the positive
       direction.
     o qEndTrunc - similarly for qEnd
   If the psl could not be mapped, the mapping* and mapped* columns are empty.
  -mappingPsls=pslFile - write mapping alignments that were used in
   PSL format to this file.  Transformations that were done, such as
   -swapMap, will be reflected in this file.  There will be a one-to-one
   correspondence of rows of this file to rows of the outPsl file.
  -simplifyMappingIds - simplify mapping ids (inPsl target
   name and mapFile query name) before matching them. This
   first drops everything after the last `-', and then drops
   everything after the last remaining `.'.
  -verbose=n  - verbose output
     2 - show each overlap and the mapping
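
The id simplification described for -simplifyMappingIds can be sketched in
Python. This is an illustration only: the function name simplify_mapping_id is
hypothetical, and it assumes the `-' and `.' delimiters are removed along with
the text that follows them.

```python
def simplify_mapping_id(name):
    """Drop a trailing '-suffix', then a trailing '.version'.

    Hypothetical helper following the two steps in the option text.
    """
    dash = name.rfind("-")
    if dash >= 0:
        # First drop everything after the last '-'.
        name = name[:dash]
    dot = name.rfind(".")
    if dot >= 0:
        # Then drop everything after the last remaining '.'.
        name = name[:dot]
    return name
```

Under these assumptions an id such as NM_000546.5-1 reduces to NM_000546, so
ids that differ only in version or uniqueness suffix would match.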

================================================================
========   pslMapPostChain   ====================================
================================================================
### kent source version 362 ###
wrong # of args:

postTransMapChain [options] inPsl outPsl

Post genomic pslMap (TransMap) chaining.  This takes transcripts
that have been mapped via genomic chains and adds back in
blocks that didn't get included in genomic chains due
to complex rearrangements or other issues.

This program has not seen much use and may not do what you want.

================================================================
========   pslMrnaCover   ====================================
================================================================
pslMrnaCover - Make histogram of coverage percentage of mRNA in psl.
usage:
   pslMrnaCover mrna.psl mrna.fa
options:
   -minSize=N  - default 100.  Minimum size of mRNA considered
   -listZero=zero.tab - List accessions that don't align in zero.tab

================================================================
========   pslPairs   ====================================
================================================================
pslPairs - join paired ends in psl alignments
usage: pslPairs <pslFile> <pairFile> <pslTableName> <outFilePrefix>
  creates: <outFilePrefix>.pairs file
  pslFile	- filtered psl alignments of ends from kluster run
  pairFile	- three column tab separated: forward reverse cloneId
		- forward and reverse columns can be comma separated end ids
  pslTableName	- table name the psl alignments have been loaded into
  outFilePrefix	- prefix used for each output file name
Options:
  -max=N	- maximum length of clone sequence (default=47000)
  -min=N	- minimum length of clone sequence (default=32000)
  -slopval=N	- deviation from max/min clone lengths allowed for slop report
		- (default=5000)
  -nearTop=N	- maximum deviation from best match allowed (default=0.001)
  -minId=N	- minimum pct ID of at least one end (default=0.96)
  -minOrphanId=N - minimum pct ID for orphan alignment (default=0.96)
  -tInsert=N	- maximum insert bases allowed in sequence alignment
		- (default=500)
  -hardMax=N	- absolute maximum clone length for long report (default=75000)
  -verbose	- display all informational messages
  -noBin	- do not include bin column in output file
  -noRandom	- do not include placements on random portions
		- {length(chr name) < 7}
  -slop		- create <outFilePrefix>.slop file of pairs that fall within
		- slop length
  -short	- create <outFilePrefix>.short file of pairs shorter than
		- min size
  -long		- create <outFilePrefix>.long file of pairs longer than
		- max size, but less than hardMax size
  -mismatch	- create <outFilePrefix>.mismatch file of pairs with
		- bad orientation of ends
  -orphan	- create <outFilePrefix>.orphan file of unmatched end sequences
================================================================
========   pslPartition   ====================================
================================================================
### kent source version 362 ###
Error: wrong # args
pslPartition - split PSL files into non-overlapping sets
usage:
   pslPartition [options] pslFile outDir

Split psl files into non-overlapping sets for use in cluster jobs,
limiting memory usage, etc. Multiple levels of directories can be
created under outDir to prevent slow access to huge directories.
The pslFile may be compressed and no ordering is assumed.

options:
  -outLevels=0 - number of output subdirectory levels.  0 puts all files
   directly in outDir; 2 will create files of the form outDir/0/0/00.psl
  -partSize=20000 - will combine non-overlapping partitions, while attempting
   to keep them under this number of PSLs.  This reduces the number of
   files that are created while ensuring that there are no overlaps
   between any two PSL files.  A value of 0 creates a PSL file per set of
   overlapping PSLs.
  -dropContained - drop PSLs that are completely contained in a block of
   another PSL.


================================================================
========   pslPosTarget   ====================================
================================================================
### kent source version 362 ###
pslPosTarget - flip psl strands so target is positive and implicit
usage:
   pslPosTarget inPsl outPsl

================================================================
========   pslPretty   ====================================
================================================================
pslPretty - Convert PSL to human-readable output
usage:
   pslPretty in.psl target.lst query.lst pretty.out
options:
   -axt             Save in format like Scott Schwartz's axt format.
                    Note gaps in both sequences are still allowed in the
                    output, which not all axt readers will expect.
   -dot=N           Output a dot every N records.
   -long            Don't abbreviate long inserts.
   -check=fileName  Output alignment checks to filename.
It's recommended that the psl file be sorted by target if it contains
multiple targets; otherwise, this will be extremely slow. The target and query
lists can be fasta, 2bit or nib files, or a list of these files, one per line.

================================================================
========   pslRc   ====================================
================================================================
### kent source version 362 ###
wrong # args:
pslRc [options] inPsl outPsl

reverse-complement psl

Options:

================================================================
========   pslRecalcMatch   ====================================
================================================================
### kent source version 362 ###
pslRecalcMatch - Recalculate match,mismatch,repMatch columns in psl file.
This can be useful if the psl went through pslMap, or if you've added 
lower-case repeat masking after the fact
usage:
   pslRecalcMatch in.psl targetSeq querySeq out.psl
where targetSeq is either a nib directory or a two bit file
and querySeq is a fasta file, nib file, two bit file, or list
of such files.  The psl's should be simple non-translated ones.
This will work faster if the in.psl is sorted on target.
options:
   -ignoreQUniq - ignore everything after the last `-' in the qName field, that
    is sometimes used to generate a unique identifier
   -ignoreQMissing - pass through the record if querySeq doesn't include qName

================================================================
========   pslReps   ====================================
================================================================
### kent source version 362 ###
pslReps - Analyze repeats and generate genome-wide best alignments from a
sorted set of local alignments
usage:
    pslReps in.psl out.psl out.psr
where:
    in.psl is an alignment file generated by psLayout and sorted by pslSort
    out.psl is the best alignment output
    out.psr contains repeat info
options:
    -nohead            Don't add PSL header.
    -ignoreSize        Will not weigh as much in favor of larger alignments.
    -noIntrons         Will not penalize for not having introns when calculating
                       size factor.
    -singleHit         Takes single best hit, not splitting into parts.
    -minCover=0.N      Minimum coverage to output.  Default is 0.
    -ignoreNs          Ignore Ns when calculating minCover.
    -minAli=0.N        Minimum alignment ratio.  Default is 0.93.
    -nearTop=0.N       How much can deviate from top and be taken.
                       Default is 0.01.
    -minNearTopSize=N  Minimum size of alignment that is near top
                       for alignment to be kept.  Default 30.
    -coverQSizes=file  Tab-separated file with effective query sizes.
                       When used with -minCover, this allows polyAs
                       to be excluded from the coverage calculation.

================================================================
========   pslScore   ====================================
================================================================
### kent source version 362 ###
pslScore - calculate web blat score from psl files
usage:
   pslScore <file.psl> [moreFiles.psl]
options:
   none at this time

columns in output:

#tName	tStart	tEnd	qName:qStart-qEnd	score	percentIdentity
================================================================
========   pslSelect   ====================================
================================================================
### kent source version 362 ###
pslSelect - select records from a PSL file.

usage:
   pslSelect [options] inPsl outPsl

Must specify a selection option

Options:
   -qtPairs=file - file is tab-separated qName and tName pairs to select
   -qPass        - pass all PSLs with queries that do not appear in qtPairs file at all
                   (default is to remove all PSLs for queries that are not in file)
   -queries=file - file has qNames to select
   -queryPairs=file - file is tab-separated pairs of qNames to select
    with new qName to substitute in output file
   -qtStart=file - file is tab-separated rows of qName,tName,tStart
   -qDelim=char  - use only the part of the query name before this character

================================================================
========   pslSomeRecords   ====================================
================================================================
### kent source version 362 ###
pslSomeRecords - Extract multiple psl records
usage:
   pslSomeRecords pslIn listFile pslOut
where:
   pslIn is the input psl file
   listFile is a file with a qName (rna accession usually)
          on each line
   pslOut is the output psl file
options:
   -not  - include psl if name is NOT in list
   -tToo - if set, the list file is two column, qName and tName.
           In this case only records matching on both q and t are
           output

================================================================
========   pslSort   ====================================
================================================================
pslSort - Merge and sort psCluster .psl output files
usage:
      pslSort dirs[1|2] outFile tempDir inDir(s)OrFile(s)

   This will sort all of the .psl input files or those in the directories
   inDirs in two stages - first into temporary files in tempDir
   and second into outFile.  The device on tempDir must have
   enough space (typically 15-20 gigabytes if processing whole genome).

      pslSort g2g[1|2] outFile tempDir inDir(s)

   This will sort a genome-to-genome alignment, reflecting the
   alignments across the diagonal.

   Adding 1 or 2 to the dirs or g2g option will limit the program to only
   the first or second pass respectively of the sort.

options:
   -nohead      Do not write psl header.
   -verbose=N   Set verbosity level, higher for more output. Default is 1.

================================================================
========   pslStats   ====================================
================================================================
### kent source version 362 ###
pslStats - collect statistics from a psl file.

usage:
   pslStats [options] psl statsOut

Options:
  -queryStats - output per-query statistics, the default is per-alignment stats
  -overallStats - output overall statistics.
  -queries=querySizeFile - tab separated file of expected qNames and sizes.
   If specified, statistics will include queries that didn't align.

================================================================
========   pslSwap   ====================================
================================================================
### kent source version 362 ###
wrong # args:
pslSwap [options] inPsl outPsl

Swap target and query in psls

Options:
  -noRc - don't reverse complement untranslated alignments to
   keep target positive strand.  This will make the target strand
   explicit.

================================================================
========   pslToBed   ====================================
================================================================
### kent source version 362 ###
pslToBed: transform a psl format file to a bed format file.
usage:
    pslToBed psl bed
options:
    -cds=cdsFile
cdsFile specifies an input cds tab-separated file which contains
genbank-style CDS records showing cdsStart..cdsEnd
e.g. NM_123456 34..305
These coordinates are assumed to be in the query coordinate system
of the psl, like those that are created from genePredToFakePsl
    -posName
changes the qName field to qName:qStart-qEnd
(can be used to create links to query position on details page)

================================================================
========   pslToBigPsl   ====================================
================================================================
### kent source version 362 ###
pslToBigPsl - converts psl to bigPsl input (bed format with extra fields)
usage:
  pslToBigPsl file.psl stdout | sort -k1,1 -k2,2n > file.bigPslInput
options:
  -cds=file.cds
  -fa=file.fasta
NOTE: to build bigBed:
   bedToBigBed -type=bed12+13 -tab -as=bigPsl.as file.bigPslInput chrom.sizes output.bb

================================================================
========   pslToChain   ====================================
================================================================
### kent source version 362 ###
pslToChain - Convert psl records to chain records 
usage:
   pslToChain pslIn chainOut
options:
   -xxx=XXX

================================================================
========   pslToPslx   ====================================
================================================================
### kent source version 362 ###
pslToPslx - Convert from psl to pslx format, which includes sequences
usage:
   pslToPslx [options] in.psl qSeqSpec tSeqSpec out.pslx

qSeqSpec and tSeqSpec can be nib directory, a 2bit file, or a FASTA file.
FASTA files should end in .fa, .fa.gz, .fa.Z, or .fa.bz2 and are read into
memory.

Options:
  -masked - if specified, repeats are in lower case, otherwise entire
            sequence is lower case.

================================================================
========   pslxToFa   ====================================
================================================================
### kent source version 362 ###
pslxToFa - convert pslx (with sequence) to fasta file
usage:
   pslxToFa in.psl out.fa
options:
   -liftTarget=liftTarget.lft
   -liftQuery=liftQuery.lft

================================================================
========   qaToQac   ====================================
================================================================
qaToQac - convert from uncompressed to compressed
quality score format.
usage:
   qaToQac in.qa out.qac
================================================================
========   qacAgpLift   ====================================
================================================================
### kent source version 362 ###
qacAgpLift - Use AGP to combine per-scaffold qac into per-chrom qac.
usage:
   qacAgpLift scaffoldToChrom.agp scaffolds.qac chrom.qac
options:
    -mScore=N - score to use for missing data (otherwise fail)
            range: 0-99, recommended values are 98 (low qual) or 99 (high)
================================================================
========   qacToQa   ====================================
================================================================
### kent source version 362 ###
qacToQa - convert from compressed to uncompressed
quality score format.
usage:
   qacToQa in.qac out.qa
	-name=name  restrict output to just this sequence name

================================================================
========   qacToWig   ====================================
================================================================
### kent source version 362 ###
qacToWig - convert from compressed quality score format to wiggle format.
usage:
   qacToWig in.qac outFileOrDir
	-name=name    restrict output to just this sequence name
	-fixed        output single file with wig headers and fixed step size
   If neither -name nor -fixed is used, outFileOrDir is a directory which
   will be created if it does not already exist.  If -name and/or -fixed is
   used, outFileOrDir is a file (or "stdout").

================================================================
========   raSqlQuery   ====================================
================================================================
### kent source version 362 ###
raSqlQuery - Do a SQL-like query on a RA file.
   raSqlQuery raFile(s) query-options
or
   raSqlQuery -db=dbName query-options
Where dbName is a UCSC Genome database like hg18, sacCer1, etc.
One of the following query-options must be specified
   -queryFile=fileName
   "-query=select list,of,fields from file where field='this'"
The queryFile just has a query in it in the same form as the query option.
The syntax of a query statement is very SQL-like. The most common commands are:
    select tag1,tag2,tag3 where tag1 like 'prefix%'
where the % is a SQL wildcard.  Sorry to mix wildcards. Another common query is
    select count(*) from * where tag = 'val'
The from list is optional.  If it exists it is a list of raFile names
    select track,type from *Encode* where type like 'bigWig%'
Other command line options:
   -addFile - Add 'file' field to say where record is defined
   -addDb - Add 'db' field to say where record is defined
   -strict - Used only with db option.  Only report tracks that exist in db
   -key=keyField - Use keyField as the key field for merges and parenting. Default name
   -parent - Merge together inheriting on parentField
   -parentField=field - Use field as the one that tells us who is our parent. Default subTrack
   -overrideNeeded - If set records are only overridden field-by-field by later records
               if 'override' follows the track name. Otherwise later record replaces
               earlier record completely.  If not set all records overridden field by field
   -noInheritField=field - If field is present don't inherit fields from parent
   -merge - If there are multiple raFiles, records with the same keyField will be
          merged together with fields in later files overriding fields in earlier files
   -restrict=keyListFile - restrict output to only ones with keys in file.
   -db=hg19 - Acts on trackDb files for the given database.  Sets up list of files
              appropriately and sets parent, merge, and override all.
              Use db=all for all databases

================================================================
========   raToLines   ====================================
================================================================
### kent source version 362 ###
raToLines - Output .ra file stanzas as single lines, with pipe-separated fields.

usage:
   raToLines in.ra out.txt

================================================================
========   raToTab   ====================================
================================================================
### kent source version 362 ###
raToTab - Convert ra file to table.
usage:
   raToTab in.ra out.tab
options:
   -cols=a,b,c - List columns in order to output in table
                 Only these columns will be output.  If you
                 don't give this option, all columns are output
                 in alphabetical order
   -head - Put column names in header
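
The conversion can be sketched in Python. This is an illustration only, not
raToTab itself: parse_ra and ra_to_tab are hypothetical names, and the sketch
assumes .ra stanzas are blank-line separated "tag value" lines, that lines
starting with '#' are comments, that missing fields become empty cells, and
that the header is a plain tab-separated row of column names.

```python
def parse_ra(text):
    """Parse .ra text into a list of {tag: value} dicts, one per stanza."""
    records, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line ends the current stanza.
            if current:
                records.append(current)
                current = {}
            continue
        if line.startswith("#"):
            continue
        tag, _, value = line.partition(" ")
        current[tag] = value.strip()
    if current:
        records.append(current)
    return records


def ra_to_tab(text, cols=None, head=False):
    """Render stanzas as tab-separated rows.

    With cols=None all tags are used in alphabetical order, as the
    usage text describes for the default.
    """
    records = parse_ra(text)
    if cols is None:
        cols = sorted({tag for rec in records for tag in rec})
    rows = []
    if head:
        rows.append("\t".join(cols))
    for rec in records:
        rows.append("\t".join(rec.get(c, "") for c in cols))
    return "\n".join(rows)
```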

================================================================
========   randomLines   ====================================
================================================================
### kent source version 362 ###
randomLines - Pick out random lines from file
usage:
   randomLines inFile count outFile
options:
   -seed=N - Set seed used for randomizing, useful for debugging.
   -decomment - remove blank lines and those starting with 

================================================================
========   rmFaDups   ====================================
================================================================
rmFaDup - remove duplicate records in FA file
usage
   rmFaDup oldName.fa newName.fa

================================================================
========   rowsToCols   ====================================
================================================================
### kent source version 362 ###
rowsToCols - Convert rows to columns and vice versa in a text file.
usage:
   rowsToCols in.txt out.txt
By default all columns are space-separated, and all rows must have the
same number of columns.
options:
   -varCol - rows may have varying numbers of columns.
   -tab - fields are separated by tab
   -fs=X - fields are separated by given character
   -fixed - fields are of fixed width with space padding
   -offsets=X,Y,Z - fields are of fixed width at given offsets
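
The default behavior (whitespace-separated fields, equal column counts) can be
sketched in Python as a transpose. This is an illustration only: the function
name rows_to_cols and its sep argument are hypothetical, and the -varCol,
-fixed and -offsets modes are not covered.

```python
def rows_to_cols(text, sep=None):
    """Transpose a text table so that rows become columns.

    sep=None splits on runs of whitespace; sep="\t" splits on tabs.
    """
    rows = [line.split(sep) for line in text.splitlines() if line.strip()]
    if len({len(r) for r in rows}) > 1:
        # The default mode requires the same number of columns per row.
        raise ValueError("rows have differing numbers of columns")
    out_sep = " " if sep is None else sep
    # zip(*rows) yields the columns of the original table.
    return "\n".join(out_sep.join(col) for col in zip(*rows))
```

Applying the function twice returns the original table when all fields are
non-empty.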

================================================================
========   sizeof   ====================================
================================================================
     type   bytes    bits
     char	1	8
unsigned char	1	8
short int	2	16
u short int	2	16
      int	4	32
 unsigned	4	32
     long	8	64
unsigned long	8	64
long long	8	64
u long long	8	64
   size_t	8	64
   void *	8	64
    float	4	32
   double	8	64
long double	16	128
LITTLE ENDIAN machine detected
byte order: normal order: 0x12345678 in memory: 0x78563412
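
The byte order line above can be reproduced with a short Python sketch. This is
an illustration only, using the standard struct module; the helper name
hex_bytes is hypothetical.

```python
import struct
import sys


def hex_bytes(data):
    """Format bytes in memory order as a single hex literal."""
    return "0x" + data.hex()


value = 0x12345678
# "<I" packs a 32-bit unsigned int little-endian, ">I" big-endian.
little = struct.pack("<I", value)
big = struct.pack(">I", value)

print("this machine is", sys.byteorder, "endian")
print("little endian in memory:", hex_bytes(little))
print("big endian in memory:   ", hex_bytes(big))
```

On a little-endian machine the least significant byte (0x78) is stored first,
which is why the memory image reads 0x78563412.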
================================================================
========   spacedToTab   ====================================
================================================================
### kent source version 362 ###
spacedToTab - Convert fixed width space separated fields to tab separated
Note this requires two passes, so it can't be done on a pipe
usage:
   spacedToTab in.txt out.tab
options:
   -sizes=X,Y,Z - Force it to have columns of the given widths.
                 The final char in each column should be space or newline

================================================================
========   splitFile   ====================================
================================================================
splitFile - Split up a file
usage:
   splitFile source linesPerFile outBaseName
options:
   -head=file - put head in front of each output
   -tail=file - put tail at end of each output
================================================================
========   splitFileByColumn   ====================================
================================================================
### kent source version 362 ###
splitFileByColumn - Split text input into files named by column value
usage:
   splitFileByColumn source outDir
options:
   -col=N      - Use the Nth column value (default: N=1, first column)
   -head=file  - Put head in front of each output
   -tail=file  - Put tail at end of each output
   -chromDirs  - Split into subdirs of outDir that are distilled from chrom
                 names, e.g. chr3_random -> outDir/3/chr3_random.XXX .
   -ending=XXX - Use XXX as the dot-suffix of split files (default: taken
                 from source).
   -tab        - Split by tab characters instead of whitespace.
Split source into multiple files in outDir, with each filename determined
by values from a column of whitespace-separated input in source.
If source begins with a header, you should pipe "tail +N source" to this
program where N is number of header lines plus 1, or use some similar
method to strip the header from the input.

================================================================
========   sqlToXml   ====================================
================================================================
### kent source version 362 ###
sqlToXml - dump out all or part of a relational database to XML, guided
by a dump specification.  See sqlToXml.doc for additional information.
usage:
   sqlToXml database dumpSpec.od output.xml
options:
   -topTag=name - Give the top level XML tag the given name.  By
               default it will be the same as the database name.
   -query=file.sql - Instead of dumping whole database, just dump those
                  records matching SQL select statement in file.sql.
                  This statement should be of the form:
           select * from table where ...
                   or
           select table.* from table,otherTables where ...
                   Where the table is the same as the table in the first
                   line of dumpSpec.
   -tab=N - number of spaces between tabs in xml.dumpSpec - by default it's 8.
            (It may be best just to avoid tabs in that file though.)
   -maxList=N - This will limit any lists in the output to no more than
                size N.  This is mostly just for testing.

================================================================
========   stringify   ====================================
================================================================
### kent source version 362 ###
stringify - Convert file to C strings
usage:
   stringify [options] in.txt
A stringified version of in.txt  will be printed to standard output.

Options:
  -var=varname - create a variable with the specified name containing
                 the string.
  -static - create the variable but put static in front of it.
  -array - create an array of strings, one for each line


================================================================
========   subChar   ====================================
================================================================
subChar - Substitute one character for another throughout a file.
usage:
   subChar oldChar newChar file(s)
oldChar and newChar can either be single letter literal characters,
or two digit hexadecimal ascii codes
================================================================
========   subColumn   ====================================
================================================================
### kent source version 362 ###
subColumn - Substitute one column in a tab-separated file.
usage:
   subColumn column in.tab sub.tab out.tab
Where:
    column is the column number (starting with 1)
    in.tab is a tab-separated file
    sub.tab is a file where the first column is old values, second new
    out.tab is the substituted output
options:
   -list - Column is a comma-separated list.  Substitute all elements in list
   -miss=fileName - Print misses to this file instead of aborting
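
The substitution can be sketched in Python. This is an illustration only, not
subColumn itself: the function name sub_column is hypothetical, and keeping the
old value when no substitution is found is an assumption made for the sketch
(the usage text says only that the program aborts unless -miss is given).

```python
def sub_column(rows, column, mapping, is_list=False):
    """Substitute values in one column of tab-separated rows.

    column is 1-based, mapping is {old: new} as read from sub.tab,
    and is_list treats the column as a comma-separated list.
    Returns (new_rows, misses).
    """
    out, misses = [], []
    idx = column - 1
    for row in rows:
        fields = row.split("\t")
        parts = fields[idx].split(",") if is_list else [fields[idx]]
        new_parts = []
        for part in parts:
            if part in mapping:
                new_parts.append(mapping[part])
            else:
                # Assumption: record the miss and keep the old value.
                misses.append(part)
                new_parts.append(part)
        fields[idx] = ",".join(new_parts)
        out.append("\t".join(fields))
    return out, misses
```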

================================================================
========   tailLines   ====================================
================================================================
tailLines - add tail to each line of file
usage:
   tailLines file tail
This will add tail to each line of file and print to stdout.
================================================================
========   tdbQuery   ====================================
================================================================
### kent source version 362 ###
tdbQuery - Query the trackDb system using SQL syntax.
Usage:
    tdbQuery sqlStatement
Where the SQL statement is enclosed in quotations to avoid the shell interpreting it.
Only a very restricted subset of a single SQL statement (select) is supported.   Examples:
    tdbQuery "select count(*) from hg18"
counts all of the tracks in hg18 and prints the results to stdout
   tdbQuery "select count(*) from *"
counts all tracks in all databases.
   tdbQuery "select  track,shortLabel from hg18 where type like 'bigWig%'"
prints to stdout a two-field .ra file containing just the track and shortLabels of bigWig
type tracks in the hg18 version of trackDb.
   tdbQuery "select * from hg18 where track='knownGene' or track='ensGene'"
prints the hg18 knownGene and ensGene track's information to stdout.
   tdbQuery "select *Label from mm9"
prints all fields that end in 'Label' from the mm9 trackDb.
OPTIONS:
   -root=/path/to/trackDb/root/dir
Sets the root directory of the trackDb.ra directory hierarchy to be given path. By default
this is ~/kent/src/hg/makeDb/trackDb.
   -check
Check that trackDb is internally consistent.  Prints diagnostic output to stderr and aborts if
there are problems.
   -strict
Mimic -strict option on hgTrackDb. Suppresses tracks where corresponding table does not exist.
   -release=alpha|beta|public
Include trackDb entries with this release tag only. Default is alpha.
   -noBlank
Don't print out blank lines separating records
   -oneLine
Print single ('|') pipe-separated line per record
   -noCompSub
Subtracks don't inherit fields from parents
   -shortLabelLength=N
Complain if shortLabels are over N characters
   -longLabelLength=N
Complain if longLabels are over N characters

================================================================
========   textHistogram   ====================================
================================================================
### kent source version 362 ###
textHistogram - Make a histogram in ascii
usage:
   textHistogram [options] inFile
Where inFile contains one number per line.
  options:
   -binSize=N - Size of bins, default 1
   -maxBinCount=N - Maximum # of bins, default 25
   -minVal=N - Minimum value to put in histogram, default 0
   -log - Do log transformation before plotting
   -noStar - Don't draw asterisks
   -col=N - Which column to use. Default 1
   -aveCol=N - A second column to average over. The averages
             will be output in place of counts of primary column.
   -real - Data input are real values (default is integer)
   -autoScale=N - autoscale to N # of bins
   -probValues - show prob-Values (density and cum.distr.) (sets -noStar too)
   -freq - show frequencies instead of counts
   -skip=N - skip N lines before starting, default 0
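
To make the roles of -binSize, -maxBinCount and -minVal concrete, here is a minimal Python sketch of the binning step. The function name bin_counts is hypothetical, and dropping values that fall below minVal or beyond the last bin is an assumption; the real program may treat out-of-range values differently.

```python
from collections import Counter


def bin_counts(values, bin_size=1, max_bin_count=25, min_val=0):
    """Count values per bin; bin b covers [min_val + b*bin_size, min_val + (b+1)*bin_size)."""
    counts = Counter()
    for v in values:
        if v < min_val:
            continue  # assumption: values below min_val are ignored
        b = int((v - min_val) // bin_size)
        if b >= max_bin_count:
            continue  # assumption: values past the last bin are ignored
        counts[b] += 1
    return [counts.get(b, 0) for b in range(max_bin_count)]


if __name__ == "__main__":
    # Draw one asterisk per counted value, loosely like an ascii histogram.
    for b, n in enumerate(bin_counts([0, 1, 1, 5, 30], bin_size=2, max_bin_count=4)):
        print(f"{b * 2:4d} {'*' * n} {n}")
```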

================================================================
========   tickToDate   ====================================
================================================================
tickToDate - Convert seconds since 1970 to time and date
usage:
   tickToDate ticks
Use 'now' for current ticks and date
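
For comparison, the same conversion can be sketched in Python. The function name tick_to_date is hypothetical, and both the UTC time zone and the output format are assumptions; the usage statement does not say which time zone or format the program prints.

```python
import time
from datetime import datetime, timezone


def tick_to_date(ticks):
    """Convert seconds since 1970-01-01 (or the word 'now') to a date string in UTC."""
    if ticks == "now":
        ticks = int(time.time())
    moment = datetime.fromtimestamp(int(ticks), tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


if __name__ == "__main__":
    print(tick_to_date("86400"))
    print(tick_to_date("now"))
```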

================================================================
========   toLower   ====================================
================================================================
toLower - Convert upper case to lower case in file. Leave other chars alone
usage:
   toLower inFile outFile
equivalent to the unix commands: cat inFile | tr '[A-Z]' '[a-z]' > outFile
================================================================
========   toUpper   ====================================
================================================================
toUpper - Convert lower case to upper case in file. Leave other chars alone
usage:
   toUpper inFile outFile
equivalent to the unix commands: cat inFile | tr '[a-z]' '[A-Z]' > outFile
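
The tr equivalents given for toLower and toUpper change only the ASCII letters A-Z and a-z. A minimal Python sketch of that behavior follows; the function names to_lower and to_upper are hypothetical, and the sketch mirrors the tr commands rather than the programs' actual source.

```python
import string

# Translation tables that touch only ASCII letters, like tr '[A-Z]' '[a-z]'.
_TO_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)
_TO_UPPER = str.maketrans(string.ascii_lowercase, string.ascii_uppercase)


def to_lower(text: str) -> str:
    """Lower-case ASCII letters and leave every other character alone."""
    return text.translate(_TO_LOWER)


def to_upper(text: str) -> str:
    """Upper-case ASCII letters and leave every other character alone."""
    return text.translate(_TO_UPPER)


if __name__ == "__main__":
    print(to_lower(">chr1\nACGTNacgtn"))
    print(to_upper(">chr1\nACGTNacgtn"))
```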
================================================================
========   transMapPslToGenePred   ====================================
================================================================
### kent source version 362 ###
transMapPslToGenePred - convert PSL alignments of mRNAs to gene annotations.

usage:
   transMapPslToGenePred [options] sourceGenePred mappedPsl mappedGenePred

Convert PSL alignments from transmap to genePred.  It specifically handles
alignments where the source genes are genomic annotations in genePred
format that are converted to PSL for mapping, with this program then used
to create a new genePred.

This is an alternative to mrnaToGene which determines CDS and frame from
the original annotation, which may have been imported from GFF/GTF.  This
was created because the genbankCds structure used by mrnaToGene doesn't
handle partial start/stop codons or programmed frame shifts.  This requires
handling the list of CDS regions and the /codon_start attribute.  At some
point, this program may be extended to handle genbank alignments correctly.

Options:
  -nonCodingGapFillMax=0 - fill gaps in non-coding regions up to this many bases
   in length.
  -codingGapFillMax=0 - fill gaps in coding regions up to this many bases
   in length.  Only coding gaps that are a multiple of three will be filled,
   with the max rounded down.
  -noBlockMerge - don't do any block merging of genePred, even of adjacent blocks.
   This is mainly for debugging.
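
One reading of the -codingGapFillMax rule above is sketched below in Python. The function name should_fill_coding_gap is hypothetical, and the interpretation (round the maximum down to a multiple of three, then fill only gaps whose length is a positive multiple of three within that limit) is an assumption drawn from the option text, not from the source code.

```python
def should_fill_coding_gap(gap_len: int, coding_gap_fill_max: int) -> bool:
    """Decide whether a coding gap of gap_len bases would be filled."""
    # "with the max rounded down": 8 becomes 6, 2 becomes 0.
    effective_max = (coding_gap_fill_max // 3) * 3
    # Only gaps that keep the reading frame (multiples of three) qualify.
    return gap_len > 0 and gap_len % 3 == 0 and gap_len <= effective_max


if __name__ == "__main__":
    for gap in (3, 4, 6, 9):
        print(gap, should_fill_coding_gap(gap, 8))
```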


================================================================
========   trfBig   ====================================
================================================================
### kent source version 362 ###
trfBig - Mask tandem repeats on a big sequence file.
usage:
   trfBig inFile outFile
This will repeatedly run trf to mask tandem repeats in inFile
and put masked results in outFile.  inFile and outFile can be .fa
or .nib format. outFile can be .bed as well. Sequence output is hard
masked, lowercase.

   -bed creates a bed file in current dir
   -bedAt=path.bed - create a bed file at explicit location
   -tempDir=dir Where to put temp files.
   -trf=trfExe explicitly specifies trf executable name
   -maxPeriod=N  Maximum period size of repeat (default 2000)
   -keep  don't delete tmp files
   -l=<n> when used here, for new trf v4.09 option:
          maximum TR length expected (in millions)
          (eg, -l=3 for 3 million), Human genome hg38 would need -l=6
================================================================
========   twoBitDup   ====================================
================================================================
### kent source version 362 ###
twoBitDup - check to see if a twobit file has any identical sequences in it
usage:
   twoBitDup file.2bit
options:
  -keyList=file - file to write a key list, two columns: md5sum and sequenceName
                   NOTE: use of keyList is very time expensive for 2bit files
                   with a large number of sequences (> 5,000).  Better to
                   use a cluster run with the doIdKeys.pl automation script.
  -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs

example: twoBitDup -keyList=stdout db.2bit \
          | grep -v 'are identical' | sort > db.idKeys.txt
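
The idea of keying sequences by md5sum to find identical ones can be sketched in Python as shown here. The function name find_identical is hypothetical, and hashing the sequence text exactly as given is an assumption; the program may normalize case or masking before computing its keys.

```python
import hashlib
from collections import defaultdict


def find_identical(seqs):
    """Group sequence names by the md5 of their sequence; keep groups of two or more.

    seqs: dict of sequence name -> sequence string
    """
    by_key = defaultdict(list)
    for name, seq in seqs.items():
        key = hashlib.md5(seq.encode("ascii")).hexdigest()
        by_key[key].append(name)
    return {key: names for key, names in by_key.items() if len(names) > 1}


if __name__ == "__main__":
    dups = find_identical({"chrA": "ACGT", "chrB": "ACGT", "chrC": "TTTT"})
    for key, names in dups.items():
        print(key, ",".join(names))
```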
================================================================
========   twoBitInfo   ====================================
================================================================
### kent source version 362 ###
twoBitInfo - get information about sequences in a .2bit file
usage:
   twoBitInfo input.2bit output.tab
options:
   -maskBed instead of seq sizes, output BED records that define 
           areas with masked sequence
   -nBed   instead of seq sizes, output BED records that define 
           areas with N's in sequence
   -noNs   outputs the length of each sequence, but does not count Ns 
   -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs
Output file has the columns:
   seqName size

The 2bit file may be specified in the form path:seq or path:seq1,seq2,seqN...
so that information is returned only on the requested sequence(s).
If the form path:seq:start-end is used, start-end is ignored.
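
The path:seq forms described above could be split apart roughly as follows. This Python sketch uses the hypothetical name parse_two_bit_spec and assumes a local path that contains no colon of its own; a URL would need different handling.

```python
def parse_two_bit_spec(spec: str):
    """Split 'path[:seq1,seq2,...[:start-end]]' into (path, list of sequence names).

    Assumption: the path itself contains no ':' (so not a URL).
    """
    parts = spec.split(":")
    path = parts[0]
    seqs = parts[1].split(",") if len(parts) > 1 and parts[1] else []
    # A third part (start-end) is ignored, matching the note in the usage text.
    return path, seqs


if __name__ == "__main__":
    print(parse_two_bit_spec("hg38.2bit:chr1,chr2"))
    print(parse_two_bit_spec("hg38.2bit:chr1:0-100"))
```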

================================================================
========   twoBitMask   ====================================
================================================================
### kent source version 362 ###
twoBitMask - apply masking to a .2bit file, creating a new .2bit file
usage:
   twoBitMask input.2bit maskFile output.2bit
options:
   -add   Don't remove pre-existing masking before applying maskFile.
   -type=.XXX   Type of maskFile is XXX (bed or out).
maskFile can be a RepeatMasker .out file or a .bed file.  It must not
contain rows for sequences which are not in input.2bit.

================================================================
========   twoBitToFa   ====================================
================================================================
### kent source version 362 ###
twoBitToFa - Convert all or part of .2bit file to fasta
usage:
   twoBitToFa input.2bit output.fa
options:
   -seq=name       Restrict this to just one sequence.
   -start=X        Start at given position in sequence (zero-based).
   -end=X          End at given position in sequence (non-inclusive).
   -seqList=file   File containing list of the desired sequence names 
                   in the format seqSpec[:start-end], e.g. chr1 or chr1:0-189
                   where coordinates are half-open zero-based, i.e. [start,end).
   -noMask         Convert sequence to all upper case.
   -bpt=index.bpt  Use bpt index instead of built-in one.
   -bed=input.bed  Grab sequences specified by input.bed. Will exclude introns.
   -bedPos         With -bed, use chrom:start-end as the fasta ID in output.fa.
   -udcDir=/dir/to/cache  Place to put cache for remote bigBed/bigWigs.
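
The half-open, zero-based coordinates used by -start, -end and -seqList behave like Python slices. The sketch below illustrates this; the names parse_seq_spec and extract are hypothetical and the code does not read .2bit files.

```python
import re

# seqSpec[:start-end], e.g. "chr1" or "chr1:0-189"
_RANGE = re.compile(r"^(.+):(\d+)-(\d+)$")


def parse_seq_spec(spec: str):
    """Return (name, start, end); start and end are None when no range is given."""
    m = _RANGE.match(spec)
    if m:
        return m.group(1), int(m.group(2)), int(m.group(3))
    return spec, None, None


def extract(seq: str, start=None, end=None) -> str:
    """Return bases in [start, end): start is included, end is not."""
    return seq[start:end]


if __name__ == "__main__":
    name, start, end = parse_seq_spec("chr1:0-189")
    print(name, len(extract("A" * 500, start, end)))
```

With these coordinates the length of a range is simply end minus start, so chr1:0-189 covers the first 189 bases.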

Sequence and range may also be specified as part of the input
file name using the syntax:
      /path/input.2bit:name
   or
      /path/input.2bit:name:start-end

================================================================
========   vai.pl   ====================================
================================================================

usage: vai.pl [options] db input.(vcf|pgsnp|pgSnp|txt)[.gz] > output.tab

Invokes hgVai (Variant Annotation Integrator) on a set of variant calls to
add functional effect predictions (e.g. does the variant fall within a
regulatory region or part of a gene) and other data relevant to function.

input.(...) must be a file or URL containing either variants formatted as VCF
or pgSnp, or a sequence of dbSNP rs# IDs, optionally compressed by gzip.
Output is printed to stdout.

options:
  --hgVai=/path/to/hgVai          Path to hgVai executable
                                  (default: /usr/local/apache/cgi-bin/hgVai)
  --position=chrX:N-M             Sequence name, start and end of range to query
                                  (default: genome-wide query)
  --rsId                          Attempt to match dbSNP rs# ID with variant
                                  position at the expense of performance.
                                  (default: don't attempt to match dbSNP rs# ID)
  --udcCache=/path/to/udcCache    Path to udc cache, overriding hg.conf setting
                                  (default: use value in hg.conf file)
  --geneTrack=track               Genome Browser track with transcript predictions
                                  (default: refGene)
  --hgvsBreakDelIns=on|off        HGVS delins: show "delAGinsTT" instead of "delinsTT"
                                  (default: off)
  --hgvsCN=on|off                 Include HGVS c./n. (coding/noncoding) terms in output (RefSeq transcripts only)
                                  (default: on)
  --hgvsG=on|off                  Include HGVS g. (genomic) terms in output (RefSeq transcripts only)
                                  (default: on)
  --hgvsP=on|off                  Include HGVS p. (protein) terms in output (RefSeq transcripts only)
                                  (default: on)
  --hgvsPAddParens=on|off         Add parentheses around HGVS p. predicted changes
                                  (default: off)
  --include_cdsNonSyn=on|off      Include CDS non-synonymous variants in output
                                  (default: on)
  --include_cdsSyn=on|off         Include CDS synonymous variants in output
                                  (default: on)
  --include_exonLoss=on|off       Include exon loss variants in output
                                  (default: on)
  --include_intergenic=on|off     Include intergenic variants in output
                                  (default: on)
  --include_intron=on|off         Include intron variants in output
                                  (default: on)
  --include_nmdTranscript=on|off  Include variants in NMD transcripts in output
                                  (default: on)
  --include_noVariation=on|off    Include "variants" with no observed variation in output
                                  (default: on)
  --include_nonCodingExon=on|off  Include non-coding exon variants in output
                                  (default: on)
  --include_splice=on|off         Include splice site and splice region variants in output
                                  (default: on)
  --include_upDownstream=on|off   Include upstream and downstream variants in output
                                  (default: on)
  --include_utr=on|off            Include 3' and 5' UTR variants in output
                                  (default: on)
  --variantLimit=N                Maximum number of variants to process
                                  (default: 10000)
  -n, --dry-run                   Display hgVai command, but don't execute it
  -h, --help                      Display this message
================================================================
========   validateFiles   ====================================
================================================================
### kent source version 362 ###
validateFiles - Validates the format of different genomic files.
                Exits with a zero status for no errors detected and non-zero for errors.
                Uses filename 'stdin' to read from stdin.
                Automatically decompresses files in .gz, .bz2, .zip, .Z format.
                Accepts multiple input files of the same type.
                Writes Error messages to stderr
usage:
   validateFiles -chromInfo=FILE -options -type=FILE_TYPE file1 [file2 [...]]

   -type=
       fasta        : Fasta files (only one line of sequence, and no quality scores)
       fastq        : Fasta with quality scores (see http://maq.sourceforge.net/fastq.shtml)
       csfasta      : Colorspace fasta (implies -colorSpace)
       csqual       : Colorspace quality (see link below)
                      See http://marketing.appliedbiosystems.com/mk/submit/SOLID_KNOWLEDGE_RD?_JS=T&rd=dm
       bam          : Binary Alignment/Map
                      See http://samtools.sourceforge.net/SAM1.pdf
       bigWig       : Big Wig
                      See http://genome.ucsc.edu/goldenPath/help/bigWig.html
       bedN[+P]     : BED N or BED N+ or BED N+P
                      where N is a number between 3 and 15 of standard BED columns,
                      optional + indicates the presence of additional columns
                      and P is the number of additional columns
                      Examples: -type=bed6 or -type=bed6+ or -type=bed6+3 
                      See http://genome.ucsc.edu/FAQ/FAQformat.html#format1
       bigBedN[+P]  : bigBED N  or bigBED N+ or bigBED N+P, similar to BED
                      See http://genome.ucsc.edu/goldenPath/help/bigBed.html
       tagAlign     : Alignment files, replaced with BAM
       pairedTagAlign  
       broadPeak    : ENCODE Peak formats
       narrowPeak     These are specialized bedN+P formats.
       gappedPeak     See http://genomewiki.cse.ucsc.edu/EncodeDCC/index.php/File_Formats
       bedGraph    :  BED Graph
       rcc         :  NanoString RCC
       idat        :  Illumina IDAT
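
The bedN[+P] and bigBedN[+P] naming scheme listed above can be decoded as in this Python sketch. The function name parse_bed_type is hypothetical, and the pattern is inferred only from the examples in the usage text (bed6, bed6+, bed6+3); it is not taken from the program's own parser.

```python
import re

_BED_TYPE = re.compile(r"^(bed|bigBed)(\d+)(\+(\d*))?$")


def parse_bed_type(type_name: str):
    """Return (family, n_standard, has_plus, n_extra) for names like 'bed6+3'.

    n_extra is None when no count follows the '+' (or there is no '+').
    """
    m = _BED_TYPE.match(type_name)
    if not m:
        raise ValueError(f"not a bedN[+P] type: {type_name!r}")
    n = int(m.group(2))
    if not 3 <= n <= 15:
        raise ValueError("N must be between 3 and 15")
    has_plus = m.group(3) is not None
    n_extra = int(m.group(4)) if m.group(4) else None
    return m.group(1), n, has_plus, n_extra


if __name__ == "__main__":
    for name in ("bed6", "bed6+", "bed6+3", "bigBed12+1"):
        print(name, parse_bed_type(name))
```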

   -as=fields.as                If you have extra "bedPlus" fields, it's great to put a definition
                                of each field in a row in AutoSql format here. Applies to bed-related types.
   -tab                         If set, expect fields to be tab separated, normally
                                expects white space separator. Applies to bed-related types.
   -chromDb=db                  Specify DB containing chromInfo table to validate chrom names
                                and sizes
   -chromInfo=file.txt          Specify chromInfo file to validate chrom names and sizes
   -colorSpace                  Sequences include colorspace values [0-3] (can be used 
                                with formats such as tagAlign and pairedTagAlign)
   -isSorted                    Input is sorted by chrom, only affects types tagAlign and pairedTagAlign
   -doReport                    Output report in filename.report
   -version                     Print version

For Alignment validations
   -genome=path/to/hg18.2bit    REQUIRED to validate sequence mappings match the genome specified
                                in the .2bit file. (BAM, tagAlign, pairedTagAlign)
   -nMatch                      N's do not count as a mismatch
   -matchFirst=n                Only check the first N bases of the sequence
   -mismatches=n                Maximum number of mismatches in sequence (or read pair) 
   -mismatchTotalQuality=n      Maximum total quality score at mismatching positions
   -mmPerPair                   Check that neither read of a pair exceeds the mismatch count when
                                  validating pairedTagAlign files (default is the total for the pair)
   -mmCheckOneInN=n             Check mismatches in only one in 'n' lines (default=1, all)
   -allowOther                  Allow chromosomes that aren't native in BAMs
   -allowBadLength              Allow chromosomes that have the wrong length in BAM
   -complementMinus             Complement the query sequence on the minus strand (for testing BAM)
   -bamPercent=N.N              Percentage of BAM alignments that must be compliant
   -privateData                 Private data so empty sequence is tolerated


================================================================
========   validateManifest   ====================================
================================================================
### kent source version 362 ###
validateManifest v1.9 - Validates the ENCODE3 manifest.txt file.
                Calls validateFiles on each file in the manifest.
                Exits with a zero status for no errors detected and non-zero for errors.
                Writes Error messages to stderr
usage:
   validateManifest

   -dir=workingDir, defaults to the current directory.
   -encValData=encValDataDir, relative to workingDir, defaults to encValData.

   Input files in the working directory: 
     manifest.txt - current input manifest file
     validated.txt - input from previous run of validateManifest

   Output file in the working directory: 
     validated.txt - results of validated input


================================================================
========   webSync   ====================================
================================================================
Usage: webSync [options] <url> - download from https server, using files.txt on their end to get the list of files

    To create files.txt on the remote end, run this command:
      du -ab > files.txt
    Or preferably this command (otherwise empty directories will lead to "transmit" errors):
      find . -type f -exec du -ab {} + > files.txt
    Or this one if you have symlinks:
      find -L . -type f -exec du -Lab {} + > files.txt

    Then run this in the download directory:
      webSync https://there.org/

    This will create a "webSyncLog" directory in the current directory, compare
    https://there.org/files.txt with the files in the current directory,
    transfer the missing files and write the changes to webSyncLog/transfer.log.

    The URL will be saved after the first run and is not necessary from then on. You can add
    cd xxx && webSync to your crontab. It will not start if it's already running (flagfile).

    Status files after a run:
    - webSyncLog/biggerHere.txt - list of files that are bigger here. These could be errors or OK.
    - webSyncLog/files.here.txt - the list of files here
    - webSyncLog/files.there.txt - the list of files there, current copy of https://there.org/files.txt
    - webSyncLog/missingThere.txt - the list of files not on https://there.org anymore but here
    - webSyncLog/transfer.log - big transfer log, each run, date and size of transferred file is noted here.
    

Options:
  -h, --help            show this help message and exit
  -d, --debug           show debug messages
  -x CONNECTIONS, --connections=CONNECTIONS
                        Maximum number of parallel connections to the server,
                        default 10
  -s, --skipScan        Do not scan local file sizes again, in case you know
                        it is up to date
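
The comparison between the remote files.txt and the local files can be sketched in Python as below. The names parse_du and compare are hypothetical, the tab-separated "size, path" layout is what GNU du -ab is expected to print, and the classification rules are assumptions based on the status file descriptions; the script's real logic may differ.

```python
def parse_du(text: str) -> dict:
    """Parse 'du -ab' style lines (size<TAB>path) into {path: size}."""
    sizes = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        size, path = line.split("\t", 1)
        sizes[path] = int(size)
    return sizes


def compare(here: dict, there: dict):
    """Return (to_transfer, missing_there, bigger_here) as sorted path lists."""
    to_transfer = sorted(p for p in there if p not in here)
    missing_there = sorted(p for p in here if p not in there)
    bigger_here = sorted(p for p in here if p in there and here[p] > there[p])
    return to_transfer, missing_there, bigger_here


if __name__ == "__main__":
    here = parse_du("10\t./a\n20\t./b\n")
    there = parse_du("8\t./a\n5\t./c\n")
    print(compare(here, there))
```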
================================================================
========   wigCorrelate   ====================================
================================================================
### kent source version 362 ###
wigCorrelate - Produce a table that correlates all pairs of wigs.
usage:
   wigCorrelate one.wig two.wig ... n.wig
This works on bigWig as well as wig files.
The output is to stdout
options:
   -clampMax=N - values larger than this are clipped to this value
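
The effect of -clampMax on a correlation can be sketched as follows. The names clamp_values and pearson are hypothetical, and the use of Pearson's r is an assumption: the usage statement says only that pairs of wigs are correlated and does not name the statistic.

```python
import math


def clamp_values(values, clamp_max=None):
    """Clip values larger than clamp_max down to clamp_max."""
    if clamp_max is None:
        return list(values)
    return [min(v, clamp_max) for v in values]


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined here; this sketch returns 0
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    a = [1, 2, 100]
    b = [1, 2, 3]
    print(pearson(a, b), pearson(clamp_values(a, 3), b))
```

Clamping limits the influence of a few very large values, which is why the two printed numbers differ.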

================================================================
========   wigEncode   ====================================
================================================================
### kent source version 362 ###
wigEncode - convert Wiggle ascii data to binary format

usage:
    wigEncode [options] wigInput wigFile wibFile
	wigInput - wiggle ascii data input file (stdin OK)
	wigFile - .wig output file to be used with hgLoadWiggle
	wibFile - .wib output file to be symlinked into /gbdb/<db>/wib/

This processes the three data input format types described at:
	http://genome.ucsc.edu/encode/submission.html#WIG
	(track and browser lines are tolerated, i.e. ignored)
options:
    -lift=<D> - lift all input coordinates by D amount, default 0
              - can be negative as well as positive
    -allowOverlap - allow overlapping data, default: overlap not allowed
              - only effective for fixedStep and if fixedStep declarations
              - are in order by chromName,chromStart
    -noOverlapSpanData - check for overlapping span data
    -wibSizeLimit=<N> - ignore rest of input when wib size is >= N

Example:
    hgGcPercent -wigOut -doGaps -file=stdout -win=5 xenTro1 \
        /cluster/data/xenTro1 | wigEncode stdin gc5Base.wig gc5Base.wib
load the resulting .wig file with hgLoadWiggle:
    hgLoadWiggle -pathPrefix=/gbdb/xenTro1/wib xenTro1 gc5Base gc5Base.wig
    ln -s `pwd`/gc5Base.wib /gbdb/xenTro1/wib
================================================================
========   wigToBigWig   ====================================
================================================================
### kent source version 362 ###
wigToBigWig v 4 - Convert ascii format wig file (in fixedStep, variableStep
or bedGraph format) to binary big wig format.
usage:
   wigToBigWig in.wig chrom.sizes out.bw
Where in.wig is in one of the ascii wiggle formats, but not including track lines
and chrom.sizes is a two-column file/URL: <chromosome name> <size in bases>
and out.bw is the output indexed big wig file.
If the assembly <db> is hosted by UCSC, chrom.sizes can be a URL like
  http://hgdownload.cse.ucsc.edu/goldenPath/<db>/bigZips/<db>.chrom.sizes
or you may use the script fetchChromSizes to download the chrom.sizes file.
If not hosted by UCSC, a chrom.sizes file can be generated by running
twoBitInfo on the assembly .2bit file.
options:
   -blockSize=N - Number of items to bundle in r-tree.  Default 256
   -itemsPerSlot=N - Number of data points bundled at lowest level. Default 1024
   -clip - If set just issue warning messages rather than dying if wig
                  file contains items off end of chromosome.
   -unc - If set, do not use compression.
   -fixedSummaries - If set, use a predefined sequence of summary levels.
   -keepAllChromosomes - If set, store all chromosomes in b-tree.
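
The two-column chrom.sizes format and the "off end of chromosome" condition that -clip refers to can be sketched in Python as below. The names parse_chrom_sizes and is_off_end are hypothetical, and the check shown (an item ends past the chromosome size) is an assumption about what counts as off the end.

```python
def parse_chrom_sizes(text: str) -> dict:
    """Parse lines of '<chromosome name> <size in bases>' into {name: size}."""
    sizes = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            sizes[fields[0]] = int(fields[1])
    return sizes


def is_off_end(chrom: str, end: int, sizes: dict) -> bool:
    """True when an item ending at 'end' runs past the chromosome size."""
    return end > sizes[chrom]


if __name__ == "__main__":
    sizes = parse_chrom_sizes("chr1\t1000\nchrM\t16571\n")
    print(sizes, is_off_end("chr1", 1001, sizes))
```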
================================================================
========   wordLine   ====================================
================================================================
### kent source version 362 ###
wordLine - chop up words by white space and output them with one
word to each line.
usage:
    wordLine inFile(s)
Output will go to stdout.
options:
    -csym - Break up words based on C symbol rules rather than white space
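
A minimal Python sketch of the two splitting modes follows. The function name word_line is hypothetical, and the "C symbol rules" shown here (identifiers and runs of digits stay together, every other non-space character becomes its own word) are an assumption about what -csym does.

```python
import re

# Assumed C-symbol tokens: identifier, run of digits, or any single non-space char.
_CSYM = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|[0-9]+|\S")


def word_line(text: str, csym: bool = False):
    """Return the words of text, one list entry per output line."""
    if csym:
        return _CSYM.findall(text)
    return text.split()


if __name__ == "__main__":
    print("\n".join(word_line("a = b_1+2")))
    print("\n".join(word_line("a = b_1+2", csym=True)))
```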

================================================================
========   xmlCat   ====================================
================================================================
### kent source version 362 ###
xmlCat - Concatenate xml files together, stuffing all records inside a single outer tag. 
usage:
   xmlCat XXX
options:
   -xxx=XXX

================================================================
========   xmlToSql   ====================================
================================================================
### kent source version 362 ###
xmlToSql - Convert XML dump into a fairly normalized relational database
   in the form of a directory full of tab-separated files and table
   creation SQL.  You'll need to run autoDtd on the XML file first to
   get the dtd and stats files.
usage:
   xmlToSql in.xml in.dtd in.stats outDir
options:
   -prefix=name - A name to prefix all tables with
   -textField=name - Name to use for text field (default 'text')
   -maxPromoteSize=N - Maximum size (default 32) for an element that
                       just defines a string to be promoted to a field
                       in parent table

================================================================