Added a section “ID classification” in the documentation for exported data catalog.row.order.
New argument suppress.discarded.variants.warnings in exported function AnnotateIDVCF with default value TRUE.
Added information on another paper in AddRunInformation: “Characterization of colibactin-associated mutational signature in an Asian oral squamous cell carcinoma and in other mucosal tumor types”, Genome Research 2020, https://doi.org/10.1101/gr.255620.119.
Changed the format of DOIs in DESCRIPTION according to CRAN policy.
Changed back the return value of ReadStrelkaIDVCFs, ReadStrelkaSBSVCFs and ReadMutectVCFs to a list of data frames with no variants discarded.
Combined all the discarded variants from ReadAndSplitMutectVCFs and ReadAndSplitStrelkaSBSVCFs under one element discarded.variants in the return value. An extra column discarded.reason was added to show the details.
Updated internal functions ReadVCF and ReadVCFs not to remove any discarded variants.
The “chr” prefix in the CHROM column is no longer removed when reading in VCFs.
CheckAndReturnSBSMatrix, CheckAndReturnDBSMatrix, CreateOneColSBSMatrix, CreateOneColDBSMatrix, VCFsToSBSCatalogs, VCFsToDBSCatalogs.
CalculateExpressionLevel for the edge case.
CreateOneColIDMatrix when the ID.class contains a non-canonical representation of the ID mutation type.
The return value of exported function ReadStrelkaIDVCFs now sometimes contains a new element, discarded.variants. This appears when there are variants that were discarded immediately after reading in the VCFs. At present these are variants that have duplicated chromosome/positions and variants that have illegal chromosome names. This means that the user must check the return value to see if discarded.variants is present and remove it before passing the return value to a function that expects a list of VCFs. Code in ICAMS that takes lists of VCFs already checks for this element and removes it if present.
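The check described above can be sketched in base R. This is only an illustration: the list below is a hypothetical stand-in for the return value of ReadStrelkaIDVCFs, and the sample names, column names and the internal layout of discarded.variants are assumptions, not taken from ICAMS.

```r
# Hypothetical stand-in for a return value that carries the extra element.
# Real VCF data frames have more columns; these are illustrative only.
vcfs <- list(
  sample1 = data.frame(CHROM = "1", POS = 100L, REF = "AT", ALT = "A"),
  sample2 = data.frame(CHROM = "2", POS = 200L, REF = "G", ALT = "GC"),
  discarded.variants = list(
    sample1 = data.frame(CHROM = "1", POS = 150L, REF = "C", ALT = "CT")
  )
)

# Keep a copy of the discarded variants (NULL if the element is absent).
# [[ ]] is used instead of $ to avoid partial matching of element names.
discarded <- vcfs[["discarded.variants"]]

# Remove the extra element so that only the per-sample VCFs remain.
vcfs.only <- vcfs
vcfs.only[["discarded.variants"]] <- NULL

names(vcfs.only)
```

Assigning NULL to a list element removes it, so vcfs.only can then be passed to code that expects a plain list of VCFs, while discarded remains available for inspection.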
Added argument return.annotated.vcfs to exported function VCFsToIDCatalogs. The default value for the argument is FALSE to be consistent with other functions.
Argument return.annotated.vcfs in functions VCFsToSBSCatalogs, VCFsToDBSCatalogs, VCFsToIDCatalogs, MutectVCFFilesToCatalog, MutectVCFFilesToCatalogAndPlotToPdf, MutectVCFFilesToZipFile, StrelkaSBSVCFFilesToCatalog, StrelkaSBSVCFFilesToCatalogAndPlotToPdf, StrelkaSBSVCFFilesToZipFile, StrelkaIDVCFFilesToCatalog, StrelkaIDVCFFilesToCatalogAndPlotToPdf and StrelkaIDVCFFilesToZipFile.
Argument suppress.discarded.variants.warnings in functions ReadAndSplitMutectVCFs, ReadAndSplitStrelkaSBSVCFs, VCFsToSBSCatalogs, VCFsToDBSCatalogs, VCFsToIDCatalogs, MutectVCFFilesToCatalog, MutectVCFFilesToCatalogAndPlotToPdf, MutectVCFFilesToZipFile, StrelkaSBSVCFFilesToCatalog, StrelkaSBSVCFFilesToCatalogAndPlotToPdf, StrelkaSBSVCFFilesToZipFile, StrelkaIDVCFFilesToCatalog, StrelkaIDVCFFilesToCatalogAndPlotToPdf and StrelkaIDVCFFilesToZipFile.
Added documentation to exported functions ReadAndSplitStrelkaSBSVCFs, StrelkaSBSVCFFilesToCatalog, StrelkaSBSVCFFilesToCatalogAndPlotToPdf and StrelkaSBSVCFFilesToZipFile.
Added information on the “ID classification” in documentation of functions generating ID catalogs, FindDelMH and FindMaxRepeatDel.
Minor changes to documentation of functions PlotCatalog, PlotCatalogToPdf, StrelkaSBSVCFFilesToZipFile, StrelkaIDVCFFilesToZipFile and MutectVCFFilesToZipFile.
Updated documentation for the return value of functions StrelkaIDVCFFilesToCatalog, StrelkaIDVCFFilesToCatalogAndPlotToPdf, StrelkaIDVCFFilesToZipFile and VCFsToIDCatalogs to make it clearer to the user.
Added new exported data of catalog row order for SBS96, SBS1536 and DBS78 in SigProfiler format to catalog.row.order.sp.
New internal functions ConvertICAMSCatalogToSigProSBS96, ReadVCF and ReadVCFs.
New exported function GetFreebayesVAF for calculating variant allele frequencies from Freebayes VCF.
New test data for Strelka mixed VCF.
Added time zone information to file “run-information.txt” when calling functions MutectVCFFilesToZipFile, StrelkaSBSVCFFilesToZipFile and StrelkaIDVCFFilesToZipFile.
Enabled “counts” -> “counts.signature” catalog transformation when the source catalog has NULL abundance.
Added legend for SBS192 plot and changed the legend text for SBS12 plot.
Added a second element plot.object to the return list from function PlotCatalog for catalog types “SBS192Catalog”, “DBS78Catalog”, “DBS144Catalog” and “IndelCatalog”. The second element is a numeric vector giving the coordinates of the bar midpoints, useful for adding to the graph.
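To illustrate why bar midpoints are useful for adding to a graph, here is a minimal sketch using base R's barplot, which also returns the midpoints of the bars it draws. It does not call PlotCatalog; the counts and labels are made up for illustration.

```r
# Draw to a null PDF device so no file is written; drop this line
# (and the dev.off() call) to see the plot interactively.
pdf(NULL)

counts <- c(5, 12, 3, 8)  # illustrative values only

# barplot() returns the x coordinates of the bar midpoints.
mids <- barplot(counts, names.arg = c("A", "B", "C", "D"), ylim = c(0, 15))

# Use the midpoints to place a label above each bar.
text(x = mids, y = counts, labels = counts, pos = 3)

invisible(dev.off())
```

Midpoints returned by a plotting function can be used in the same way, for example to overlay text, points or segments that line up with individual bars.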
Made the returns from PlotCatalog and PlotCatalogToPdf invisible.
Improved time performance of GetMutectVAF, CanonicalizeDBS and CanonicalizeQUAD.
if statements in GetCustomKmerCounts, GetStrandedKmerCounts and GetGenomeKmerCounts.
CreateOneColIDMatrix when there is NA ID category.
GetMutectVAF to check if the VCF is indeed a Mutect VCF.
CreateOneColDBSMatrix when the VCF does not have any variant in the transcribed region.
CalculatePValues when there is only a single expression value.
Created an internal function MakeDataFrameFromVCF to read in data lines of a VCF.
New argument name.of.VCF in internal function CheckAndFixChrNames to make the error message more informative.
New argument name.of.VCF in exported function AnnotateIDVCF to make the error message more informative.
ReadStrelkaIDVCF to make the error message more informative.
AnnotateIDVCF to a list. The first element annotated.vcf contains the annotated VCF. If there are rows that are discarded, the function will generate a warning and a second element discarded.variants will be included in the returned list.
flag.mismatches deprecated in exported function AnnotateIDVCF. If there are mismatches to references, the function will automatically discard these rows. Users can refer to the element discarded.variants in the return value for the discarded variants.
SplitStrelkaSBSVCF when there are no non.SBS mutations in the input.
MakeDataFrameFromMutectVCF when a Mutect VCF has no meta-information lines.
CreateOneColSBSMatrix when an annotated SBS VCF has variants on transcribed regions that all fall on transcripts on both strands.
CreateOneColDBSMatrix when an annotated DBS VCF has variants on transcribed regions that all fall on transcripts on both strands.
ReadAndSplitStrelkaSBSVCFs.
MutectVCFFilesToZipFile, StrelkaSBSVCFFilesToZipFile and StrelkaIDVCFFilesToZipFile.
trans.ranges to make it optional.
name.of.VCF in internal functions ReadStrelkaSBSVCF, ReadStrelkaIDVCF and exported function GetStrelkaVAF.
flag.mismatches in functions VCFsToIDCatalogs, MutectVCFFilesToCatalog, MutectVCFFilesToCatalogAndPlotToPdf, MutectVCFFilesToZipFile, StrelkaIDVCFFilesToCatalog, StrelkaIDVCFFilesToCatalogAndPlotToPdf and StrelkaIDVCFFilesToZipFile.
GetStrelkaVAF and GetMutectVAF to a data frame which contains the VAF and read depth information.
PlotCatalogToPdf a list. The first element is a logical value indicating whether the plot is successful. The second element is a list containing the strand bias statistics (only for SBS192Catalog with “counts” catalog.type and non-NULL abundance and argument plot.SBS12 = TRUE).
PlotCatalog and PlotCatalogToPdf: For class SBS96Catalog: (New) Allow setting ylim and cex. (New) For PlotCatalog (not PlotCatalogToPdf), allow plotting of a 96 x 2 catalog, in which case behavior is a stacked bar chart. (New) Plot x axis tick marks if xlabels is not TRUE; set par(tck = 0) to suppress. For class IndelCatalog: (New) Allow setting ylim.
GetCustomKmerCounts.
PlotTransBiasGeneExpToPdf so that ymax on the plot will be changed based on plot.type.
flat.abundance from “numeric” to “integer”.
TransformCatalog; see documentation for rationale.
TransformCatalog and updated its documentation for parameter target.abundance.
CheckAndFixChrNames and updated the automated tests.
TransformCatalog.
GetMutectVAF and updated the warning message to make it more informative.
cbind to check the attributes of the incoming catalogs and assign attributes accordingly.
TransformCatalog to check the attributes of the catalog to be transformed in the first place.
AnnotateSBSVCF, AnnotateDBSVCF and AnnotateIDVCF.
PlotTransBiasGeneExp and PlotTransBiasGeneExpToPdf.
names.of.VCFs in functions ReadAndSplitMutectVCFs, ReadAndSplitStrelkaSBSVCFs, ReadStrelkaIDVCFs, MutectVCFFilesToCatalog, MutectVCFFilesToCatalogAndPlotToPdf, StrelkaIDVCFFilesToCatalog, StrelkaIDVCFFilesToCatalogAndPlotToPdf, StrelkaSBSVCFFilesToCatalog and StrelkaSBSVCFFilesToCatalogAndPlotToPdf for users to specify the names of samples in the VCF files.
as.catalog.
gene.expression.data.HepG2 and gene.expression.data.MCF10A.
tumor.col.names in functions ReadAndSplitMutectVCFs, MutectVCFFilesToCatalog and MutectVCFFilesToCatalogAndPlotToPdf to specify the column of the VCF that contains sequencing statistics such as sequencing depth; this column is often called “unknown” in Mutect.
MutectVCFFilesToCatalog, MutectVCFFilesToCatalogAndPlotToPdf, StrelkaSBSVCFFilesToCatalog, StrelkaSBSVCFFilesToCatalogAndPlotToPdf, VCFsToSBSCatalogs, VCFsToDBSCatalogs, ReadCatalog informing the user how to change attributes of the generated catalog.
VCFsToIDCatalogs, StrelkaIDVCFFilesToCatalog and StrelkaIDVCFFilesToCatalogAndPlotToPdf a list; 1st element is the spectrum catalog (previously the only return); 2nd element is a list of VCFs with additional annotations.
PlotCatalog a list. The first element is a logical value indicating whether the plot is successful. The second element is a numeric vector giving the coordinates of all the bar midpoints drawn, useful for adding to the graph (only implemented for SBS96Catalog).
output.file argument in MutectVCFFilesToCatalogAndPlotToPdf, StrelkaSBSVCFFilesToCatalogAndPlotToPdf, and StrelkaIDVCFFilesToCatalogAndPlotToPdf so that an indicator of the catalog type plus “.pdf” is simply appended to the base output.file name. Also made this argument optional with sensible default behavior.
trans.ranges.GRCh37, trans.ranges.GRCh38 and trans.ranges.GRCm38.
FindDelMH, cryptic repeats (i.e. un-normalized deletions in a repeat such as GAGG deleted from CCCAGGGAGGGTCCC, which should be normalized to a deletion of AGGG) are now ignored with a warning rather than causing a stop.
FindDelMH, which previously did not flag the cryptic repeat in what is now the second example in the function documentation.
as.catalog supports creation of the catalog from a vector (interpreted as a 1-column matrix) and optionally infers the class from the number of rows in the input.
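The cryptic-repeat example in the FindDelMH entry above can be checked with a few lines of base R. This sketch only shows that the un-normalized deletion (GAGG) and the normalized one (AGGG) produce the same sequence; delete_at is a hypothetical helper written for this illustration and is not part of ICAMS, and the code does not reproduce how FindDelMH detects such cases.

```r
# Hypothetical helper: delete `len` bases starting at 1-based position `start`.
delete_at <- function(seq, start, len) {
  paste0(substr(seq, 1, start - 1),
         substr(seq, start + len, nchar(seq)))
}

context <- "CCCAGGGAGGGTCCC"

# The un-normalized representation removes GAGG (positions 7 to 10).
substr(context, 7, 10)
after.gagg <- delete_at(context, start = 7, len = 4)

# The normalized representation removes AGGG (positions 4 to 7).
substr(context, 4, 7)
after.aggg <- delete_at(context, start = 4, len = 4)

# Both deletions leave the same sequence, so they describe the same event.
identical(after.gagg, after.aggg)
```

Because the two representations are indistinguishable in the resulting sequence, a deletion reported as GAGG in this context is an un-normalized form of the AGGG repeat deletion.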