Package 'PlasmaMutationDetector2'

Title: Tumor Mutation Detection in Plasma using Barcoding
Description: Aims at detecting single nucleotide variation (SNV) and insertion/deletion (INDEL) in circulating tumor DNA (ctDNA), used as a surrogate marker for tumor, at each base position of a Next Generation Sequencing (NGS) analysis using barcoding. Mutations are assessed by comparing the minor-allele frequency at each position to the position-error rate (PER) measured in control samples. This package was used in Kjersti Tjensvoll, Morten Lapin, Bjørnar Gilje, Herish Garresori, Satu Oltedal, Rakel Brendsdal Forthun, Anders Molven, Yves Rozenholc and Oddmund Nordgård (2022).
Authors: Yves Rozenholc [cre, aut], Oddmund Nordgård [con, aut], Nicolas Pécuchet [con], Pierre Laurent-Puig [con]
Maintainer: Rozenholc <[email protected]>
License: MIT + file LICENSE
Version: 1.1.11
Built: 2025-02-02 08:23:11 UTC

Help Index

The package provides the SNV and INDEL position-error rates (PERs) computed for the Ion AmpliSeq™ Colon and Lung Cancer Panel v2 from 29 controls, as a table in the data file background_error_rate.txt.


This table contains 9 variables for each genomic position:

  • chrpos, char, of the form chrN:XXXXXXXXX defining the genomic position

  • N0, integer, the coverage in the controls

  • E0, integer, the number of errors in the controls

  • p.sain, numeric, the ratio E0/N0

  • up.sain, numeric, the 95th percentile of the Binomial distribution with parameters N0 and E0/N0

  • E0indel, integer, the number of indels in the controls

  • indel.p.sain, numeric, the ratio E0indel/N0

  • indel.up.sain, numeric, the 95th percentile of the Binomial distribution with parameters N0 and E0indel/N0

  • hotspot, char, either 'Non-hotspot' or 'Hotspot' depending on whether the genomic position is a known hotspot.
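
As a rough illustration of how the derived columns relate to the raw counts, the sketch below rebuilds p.sain, up.sain and their indel counterparts on toy data. The positions and counts are made up, and dividing the binomial quantile by N0 to express it as a rate is an assumption; the table shipped with the package may store this quantity differently.

```r
# Toy control counts for three positions (all values are illustrative only).
ber <- data.frame(
  chrpos  = c("chr1:115252190", "chr7:55227950", "chr12:25398284"),
  N0      = c(200000L, 150000L, 180000L),  # coverage summed over controls
  E0      = c(40L, 15L, 90L),              # SNV errors summed over controls
  E0indel = c(2L, 0L, 9L),                 # indel errors summed over controls
  stringsAsFactors = FALSE
)

# Observed error rates.
ber$p.sain       <- ber$E0 / ber$N0
ber$indel.p.sain <- ber$E0indel / ber$N0

# 95th percentile of Binomial(N0, rate), expressed as a rate.
# ASSUMPTION: the quantile (a count) is divided by N0.
ber$up.sain       <- qbinom(0.95, size = ber$N0, prob = ber$p.sain) / ber$N0
ber$indel.up.sain <- qbinom(0.95, size = ber$N0, prob = ber$indel.p.sain) / ber$N0

print(ber)
```

The upper bound is never below the observed rate here, which is what makes it usable as a conservative error ceiling for a position.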




N. Pécuchet, P. Laurent-Puig, O. Nordgård and Y. Rozenholc


Analysis of base-position error rate of next-generation sequencing to detect tumor mutations in circulating DNA N. Pécuchet, Y. Rozenholc, E. Zonta, D. Pietraz, A. Didelot, P. Combe, L. Gibault, J-B. Bachet, V. Taly, E. Fabre, H. Blons, P. Laurent-Puig in Clinical Chemistry

Novel hybridization- and tag-based error-corrected method for sensitive ctDNA mutation detection using ion semiconductor sequencing Kjersti Tjensvoll, Morten Lapin, Bjørnar Gilje, Herish Garresori, Satu Oltedal, Rakel Brendsdal Forthun, Anders Molven, Yves Rozenholc and Oddmund Nordgård in Scientific Reports



function BuildCtrlErrorRate


Computes the SNV and INDEL Position-Error Rates (PERs) from control samples located in the control directory ctrl.dir. This function requires MAF files, which are generated automatically if they are not present in the specified control folder. The SNV PER is the sum over control samples of SNV background counts divided by the sum over control samples of depths, where SNV background count = depth - major allele count. The INDEL PER is the sum over control samples of INDEL background counts divided by the sum over control samples of depths, where INDEL background count = insertion count + deletion count.
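
The two ratios described above can be sketched on toy count matrices as follows. This is a minimal illustration of the formulas only, not the package's internal code; the matrices (rows are positions, columns are control samples) and their values are hypothetical.

```r
# Rows = genomic positions, columns = control samples (toy values).
depth <- matrix(c(10000, 12000, 11000,
                   8000,  9000,  8500),
                nrow = 2, byrow = TRUE,
                dimnames = list(c("chr1:100", "chr1:101"),
                                paste0("ctrl", 1:3)))
major <- matrix(c(9990, 11995, 10985,
                  7999,  8990,  8499), nrow = 2, byrow = TRUE)
ins   <- matrix(c(1, 0, 2,
                  0, 0, 1), nrow = 2, byrow = TRUE)
del   <- matrix(c(0, 1, 0,
                  0, 2, 0), nrow = 2, byrow = TRUE)

# Background counts per position and per control.
snv.bg   <- depth - major   # depth - major allele count
indel.bg <- ins + del       # insertion count + deletion count

# Pool over controls: sum of background counts / sum of depths.
snv.per   <- rowSums(snv.bg)   / rowSums(depth)
indel.per <- rowSums(indel.bg) / rowSums(depth)

print(snv.per)
print(indel.per)
```

Pooling counts before dividing weights each control by its depth, so a deeply sequenced control contributes more to the rate than a shallow one.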


BuildCtrlErrorRate(
  ctrl.dir = "Plasma ctrl/",
  bai.ext = ".bai",
  pos_ranges.file = NULL,
  hotspot.file = NULL,
  cov.min = 5000,
  force = FALSE,
  output.dir = ctrl.dir,
  n.trim = 0
)



  • ctrl.dir, char, name of the folder containing the control files (default 'Plasma ctrl/'). The typical folder hierarchy will consist of 'Plasma ctrl/rBAM'.

  • bai.ext, char, filename extension of the bai files (default '.bai').

  • pos_ranges.file, char, name of the RData file containing the three variables pos_ind, pos_snp and pos_ranges as built by the function PrepareLibrary. Default NULL uses the provided positions_ranges.rda, which was used for our analysis.

  • hotspot.file, char, name of the text file containing a list of the genomic positions of the hotspots (default NULL reads the provided hotspot.txt, see hotspot).

  • cov.min, integer, minimal coverage required to take a position into account (default 5000).

  • force, boolean, if TRUE, forces all computations on all files, including already processed ones (default FALSE).

  • output.dir, char, name of the folder where results are saved (default ctrl.dir).

  • n.trim, integer, number of base positions trimmed at the ends of each amplicon (default 0).


the number of processed files


N. Pécuchet, P. Laurent-Puig, O. Nordgård and Y. Rozenholc


Analysis of base-position error rate of next-generation sequencing to detect tumor mutations in circulating DNA N. Pécuchet, Y. Rozenholc, E. Zonta, D. Pietraz, A. Didelot, P. Combe, L. Gibault, J-B. Bachet, V. Taly, E. Fabre, H. Blons and P. Laurent-Puig in Clinical Chemistry

Novel hybridization- and tag-based error-corrected method for sensitive ctDNA mutation detection using ion semiconductor sequencing Kjersti Tjensvoll, Morten Lapin, Bjørnar Gilje, Herish Garresori, Satu Oltedal, Rakel Brendsdal Forthun, Anders Molven, Yves Rozenholc and Oddmund Nordgård in Scientific Reports


## Not run: 
   ctrl.dir = system.file("extdata", "4test_only/ctrl/", package = "PlasmaMutationDetector2")
   if (substr(ctrl.dir,nchar(ctrl.dir),nchar(ctrl.dir))!='/')
     ctrl.dir = paste0(ctrl.dir,'/') # TO RUN UNDER WINDOWS
## End(Not run)

function DetectPlasmaMutation


This is the main function of the package. It calls mutations by comparing, at each genomic position, the SNV or INDEL frequencies computed in one tested sample to the SNV or INDEL Position-Error Rates computed from several control samples, using a binomial test. An outlier detection is then performed among all intra-sample p-values to call a mutation. Users wishing to develop their own analysis for other sequencing panels need recalibrated BAM files of control samples, which must be processed to compute the Position-Error Rates stored in the file specified in ber.ctrl.file.
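
To make the comparison concrete, here is a minimal sketch of a one-sided binomial test of a tested sample against control error rates. The data frame, its column names and its values are hypothetical. The outlier-detection step of the package is not reproduced: a Bonferroni correction at level alpha is used instead as a simple stand-in.

```r
# Toy tested sample: depth and non-major-allele count at four positions,
# with the position-error rate (PER) measured in controls (illustrative values).
tested <- data.frame(
  chrpos = c("chr1:100", "chr1:101", "chr1:102", "chr1:103"),
  depth  = c(20000L, 18000L, 22000L, 19000L),
  alt    = c(12L, 9L, 150L, 10L),
  per    = c(5e-4, 5e-4, 5e-4, 5e-4),
  stringsAsFactors = FALSE
)

# One-sided binomial test: P(X >= alt) when X ~ Binomial(depth, per).
tested$p.value <- pbinom(tested$alt - 1, size = tested$depth,
                         prob = tested$per, lower.tail = FALSE)

# STAND-IN for the package's outlier detection: Bonferroni correction
# across all positions of the sample at global level alpha.
alpha <- 0.05
tested$called <- p.adjust(tested$p.value, method = "bonferroni") < alpha

print(tested[, c("chrpos", "p.value", "called")])
```

Only the third position, whose alternate count is far above what the control error rate predicts, is called in this toy sample.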


DetectPlasmaMutation(
  patient.dir = "./", = NULL,
  pos_ranges.file = NULL,
  ber.ctrl.file = NULL,
  bai.ext = ".bai",
  alpha = 0.05,
  n.trim = 0,
  force = FALSE,
  show.more = FALSE,
  qcutoff.snv = 1,
  qcutoff.indel = 1, = Inf, =, =, = 0.9,
  hotspot.indel = "chr7:55227950:55249171",
  output.dir = patient.dir
)



  • patient.dir, char, name of the folder containing the rBAM folder of the patients. The typical folder hierarchy will consist of 'Plasma/rBAM'.

  • char, filename of the patient .bam file(s) (default NULL reads all patients in folder patient.dir).

  • pos_ranges.file, char, name of the RData file containing the three variables pos_ind, pos_snp and pos_ranges as built by the function PrepareLibrary. Default NULL uses the provided positions_ranges.rda, which was used for our analysis.

  • ber.ctrl.file, char, pathname of the file providing the background error rates obtained from the controls (default NULL uses the provided background error rates obtained from our 29 controls). See background_error_rate.txt data and BuildCtrlErrorRate function.

  • bai.ext, char, filename extension of the bai files (default '.bai').

  • alpha, num, global false positive rate = global test level (default 0.05).

  • n.trim, integer, number of base positions trimmed at the ends of each amplicon (default 0).

  • force, boolean, if TRUE, forces all computations on all files, including already processed ones (default FALSE).

  • show.more, boolean, if TRUE, additional annotations are given on result plots for non-significant mutations (default FALSE, show only detected positions).

  • qcutoff.snv, numeric, proportion of base positions kept, ranked by increasing SNV PER percentile in control samples (default 1).

  • qcutoff.indel, numeric, proportion of base positions kept, ranked by increasing INDEL PER percentile in control samples (default 1).

  • numeric, exclude hotspot positions without Symmetric Odds Ratio test < cutoff (default 1)

  • numeric, exclude non-hotspot positions without Symmetric Odds Ratio test < cutoff (default

  • numeric, exclude indel positions without Symmetric Odds Ratio test < cutoff (default

  • numeric, exclude ref positions without Symmetric Odds Ratio test < cutoff (default cutoff = 0.9)

  • hotspot.indel, char, a vector containing the known positions of hotspot deletions/insertions defined as chrX:start:end (default 'chr7:55227950:55249171').

  • output.dir, char, name of the folder where results are saved (default patient.dir).


the number of processed patients


N. Pécuchet, P. Laurent-Puig, O. Nordgård and Y. Rozenholc


Analysis of base-position error rate of next-generation sequencing to detect tumor mutations in circulating DNA N. Pécuchet, Y. Rozenholc, E. Zonta, D. Pietraz, A. Didelot, P. Combe, L. Gibault, J-B. Bachet, V. Taly, E. Fabre, H. Blons and P. Laurent-Puig in Clinical Chemistry

Novel hybridization- and tag-based error-corrected method for sensitive ctDNA mutation detection using ion semiconductor sequencing Kjersti Tjensvoll, Morten Lapin, Bjørnar Gilje, Herish Garresori, Satu Oltedal, Rakel Brendsdal Forthun, Anders Molven, Yves Rozenholc and Oddmund Nordgård in Scientific Reports


     if (substr(patient.dir,nchar(patient.dir),nchar(patient.dir))!='/')
       patient.dir = paste0(patient.dir,'/') # TO RUN UNDER WINDOWS

The package provides a list of known hotspot positions located on the amplicons of the Ion AmpliSeq™ Colon and Lung Cancer Panel v2 as a text file, hotspot.txt. The file contains a single variable named chrpos (first row), a vector of chars of the form chrN:XXXXXXXXX defining genomic positions.
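
A file with this layout can be read and used to label positions as in the sketch below. The file is a temporary stand-in written on the fly, and the two positions in it are illustrative; they are not claimed to be entries of the hotspot.txt shipped with the package.

```r
# Write a toy hotspot file: the first row holds the column name "chrpos".
hotspot.file <- tempfile(fileext = ".txt")
writeLines(c("chrpos", "chr7:55249071", "chr12:25398284"), hotspot.file)

# Read it back as a one-column table of character positions.
hotspot <- read.table(hotspot.file, header = TRUE, stringsAsFactors = FALSE)

# Label arbitrary positions the same way as the 'hotspot' column of the
# background error rate table.
positions <- c("chr12:25398284", "chr12:25398285", "chr7:55249071")
status <- ifelse(positions %in% hotspot$chrpos, "Hotspot", "Non-hotspot")

print(data.frame(chrpos = positions, hotspot = status))
```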






N. Pécuchet, P. Laurent-Puig, O. Nordgård and Y. Rozenholc


Analysis of base-position error rate of next-generation sequencing to detect tumor mutations in circulating DNA N. Pécuchet, Y. Rozenholc, E. Zonta, D. Pietraz, A. Didelot, P. Combe, L. Gibault, J-B. Bachet, V. Taly, E. Fabre, H. Blons, P. Laurent-Puig in Clinical Chemistry

Novel hybridization- and tag-based error-corrected method for sensitive ctDNA mutation detection using ion semiconductor sequencing Kjersti Tjensvoll, Morten Lapin, Bjørnar Gilje, Herish Garresori, Satu Oltedal, Rakel Brendsdal Forthun, Anders Molven, Yves Rozenholc and Oddmund Nordgård in Scientific Reports

function LoadBackgroundErrorRate


This function will load the background error rates created from the controls using the function BuildCtrlErrorRate


LoadBackgroundErrorRate(pos_ranges.file, ber.ctrl.file)



  • pos_ranges.file, char, name of the RData file containing the three variables pos_ind, pos_snp and pos_ranges as built by the function PrepareLibrary. Default NULL uses the provided positions_ranges.rda, which was used for our analysis.

  • ber.ctrl.file, char, pathname of the file providing the background error rates obtained from the controls (default NULL uses the provided background error rates obtained from our 29 controls). See background_error_rate.txt data and BuildCtrlErrorRate function.


the adapted background error rate


N. Pécuchet, P. Laurent-Puig, O. Nordgård and Y. Rozenholc


Analysis of base-position error rate of next-generation sequencing to detect tumor mutations in circulating DNA N. Pécuchet, Y. Rozenholc, E. Zonta, D. Pietraz, A. Didelot, P. Combe, L. Gibault, J-B. Bachet, V. Taly, E. Fabre, H. Blons and P. Laurent-Puig in Clinical Chemistry

Novel hybridization- and tag-based error-corrected method for sensitive ctDNA mutation detection using ion semiconductor sequencing Kjersti Tjensvoll, Morten Lapin, Bjørnar Gilje, Herish Garresori, Satu Oltedal, Rakel Brendsdal Forthun, Anders Molven, Yves Rozenholc and Oddmund Nordgård in Scientific Reports

function MAF_from_BAM


Reads BAM files and creates MAF files. BAM files are stored in a sub-folder '/rBAM'. MAF files are intermediate files stored in a sub-folder '/BER'. MAF files contain the raw counts of A, T, C, G, insertion, deletion, insertion >2bp and deletion >2bp for the plus strand and the minus strand. Note: we strongly recommend recalibrating BAM files externally using tools like GATK.
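
The sketch below shows how per-strand base counts of this kind can be turned into a depth, a major allele count and an SNV background count. The column names (A.plus, A.minus, ...) are hypothetical and are not the actual MAF file layout; taking the depth as the sum of the four base counts over both strands is also an assumption.

```r
# Toy per-strand base counts at two positions (hypothetical column names).
maf <- data.frame(
  chrpos  = c("chr1:100", "chr1:101"),
  A.plus  = c(5000L, 3L),  T.plus  = c(2L, 0L),
  C.plus  = c(1L, 4000L),  G.plus  = c(0L, 1L),
  A.minus = c(4990L, 2L),  T.minus = c(3L, 1L),
  C.minus = c(0L, 4100L),  G.minus = c(1L, 0L),
  stringsAsFactors = FALSE
)

bases <- c("A", "T", "C", "G")

# Sum the plus and minus strands for each base: one column per base.
totals <- sapply(bases, function(b) {
  maf[[paste0(b, ".plus")]] + maf[[paste0(b, ".minus")]]
})

# ASSUMPTION: depth is the sum of the four base counts on both strands.
depth       <- rowSums(totals)
major.count <- apply(totals, 1, max)
major.base  <- bases[max.col(totals, ties.method = "first")]

# SNV background count = depth - major allele count.
background <- depth - major.count

print(data.frame(chrpos = maf$chrpos, depth, major.base, background))
```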


MAF_from_BAM(
  study.dir = "Plasma/",
  input.filenames = NULL,
  bai.ext = ".bai",
  pos_ranges.file = NULL,
  force = FALSE,
  output.dir = study.dir,
  n.trim = 8
)



  • study.dir, char, name of the folder containing the rBAM directory (default 'Plasma/'). The typical folder hierarchy will consist of 'Plasma/rBAM'.

  • input.filenames, a vector of char, the names of the BAM files to process (default NULL). If NULL, all BAM files in the rBAM folder are processed.

  • bai.ext, char, filename extension of the bai files (default '.bai').

  • pos_ranges.file, char, name of the RData file containing the three variables pos_ind, pos_snp and pos_ranges as built by the function PrepareLibrary. Default NULL uses the provided positions_ranges.rda, which was used for our analysis.

  • force, boolean, if TRUE, forces all computations on all files, including already processed ones (default FALSE).

  • output.dir, char, name of the folder where results are saved (default study.dir).

  • n.trim, integer, number of base positions trimmed at the ends of each amplicon (default 8).


the path/names of the MAF files


N. Pécuchet, P. Laurent-Puig, O. Nordgård and Y. Rozenholc


Analysis of base-position error rate of next-generation sequencing to detect tumor mutations in circulating DNA N. Pécuchet, Y. Rozenholc, E. Zonta, D. Pietraz, A. Didelot, P. Combe, L. Gibault, J-B. Bachet, V. Taly, E. Fabre, H. Blons, P. Laurent-Puig in Clinical Chemistry

Novel hybridization- and tag-based error-corrected method for sensitive ctDNA mutation detection using ion semiconductor sequencing Kjersti Tjensvoll, Morten Lapin, Bjørnar Gilje, Herish Garresori, Satu Oltedal, Rakel Brendsdal Forthun, Anders Molven, Yves Rozenholc and Oddmund Nordgård in Scientific Reports


## Not run: 
     ctrl.dir = system.file("extdata", "4test_only/ctrl/",
       package = "PlasmaMutationDetector2")
     if (substr(ctrl.dir,nchar(ctrl.dir),nchar(ctrl.dir))!='/')
       ctrl.dir = paste0(ctrl.dir,'/') # TO RUN UNDER WINDOWS
## End(Not run)

The package provides the positions and ranges computed for the Ion AmpliSeq™ Colon and Lung Cancer Panel v2 as an RData file, positions_ranges.rda.


This file contains 3 variables:

  • pos_ind, vector of chars, of the form chrN:XXXXXXXXX defining genomic positions of the Ion AmpliSeq™ Colon and Lung Cancer Panel v2

  • pos_snp, vector of chars, of the form chrN:XXXXXXXXX defining the known SNP genomic positions

  • pos_ranges, GRanges object, describing the 92 amplicons of the Ion AmpliSeq™ Colon and Lung Cancer Panel v2
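
Position strings of this form are easy to manipulate with base R. The sketch below, on toy vectors standing in for pos_ind and pos_snp, splits the strings into chromosome and coordinate and removes known SNP positions; it does not load the packaged file, and the values are illustrative.

```r
# Toy stand-ins for the packaged vectors (illustrative values).
pos_ind <- c("chr1:115252190", "chr1:115252191",
             "chr1:115252192", "chr7:55227950")
pos_snp <- c("chr1:115252191")

# Split "chrN:XXXXXXXXX" into a chromosome name and an integer coordinate.
parts <- strsplit(pos_ind, ":", fixed = TRUE)
pos.table <- data.frame(
  chr = vapply(parts, `[`, character(1), 1),
  pos = as.integer(vapply(parts, `[`, character(1), 2)),
  stringsAsFactors = FALSE
)

# Positions kept for analysis once known SNP positions are excluded.
pos_kept <- setdiff(pos_ind, pos_snp)

print(pos.table)
print(pos_kept)
```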




N. Pécuchet, P. Laurent-Puig, O. Nordgård and Y. Rozenholc


Analysis of base-position error rate of next-generation sequencing to detect tumor mutations in circulating DNA N. Pécuchet, Y. Rozenholc, E. Zonta, D. Pietraz, A. Didelot, P. Combe, L. Gibault, J-B. Bachet, V. Taly, E. Fabre, H. Blons, P. Laurent-Puig in Clinical Chemistry

Novel hybridization- and tag-based error-corrected method for sensitive ctDNA mutation detection using ion semiconductor sequencing Kjersti Tjensvoll, Morten Lapin, Bjørnar Gilje, Herish Garresori, Satu Oltedal, Rakel Brendsdal Forthun, Anders Molven, Yves Rozenholc and Oddmund Nordgård in Scientific Reports



function PrepareLibrary


Defines the Genomic Ranges and Genomic Positions covered by the AmpliSeq™ Panel to include in the study, and defines the SNP positions to exclude from the study. Amplicon ends are trimmed if specified. This function is mostly useful if you want to add SNP positions that are not in the positions_ranges.rda file provided with the package. It also makes it possible to rebuild the positions_ranges.rda data.
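
The idea of trimming amplicon ends and enumerating the remaining positions can be sketched in base R as below. This is not the package's implementation: the BED-like table is a toy, its coordinates are treated as 1-based and inclusive (an assumption), and n.trim is a hypothetical local variable.

```r
# Toy BED-like table with two 20-base amplicons.
# ASSUMPTION: start and end are 1-based, inclusive coordinates.
bed <- data.frame(chr   = c("chr1", "chr7"),
                  start = c(115252190L, 55227950L),
                  end   = c(115252209L, 55227969L),
                  stringsAsFactors = FALSE)

n.trim <- 8L  # hypothetical number of bases removed at each amplicon end

# Trim both ends and drop amplicons that become empty.
trimmed <- transform(bed, start = start + n.trim, end = end - n.trim)
trimmed <- trimmed[trimmed$start <= trimmed$end, ]

# Enumerate every remaining position as "chrN:XXXXXXXXX".
pos_ind <- unlist(lapply(seq_len(nrow(trimmed)), function(i) {
  paste0(trimmed$chr[i], ":", seq(trimmed$start[i], trimmed$end[i]))
}))

print(pos_ind)
```

With 20-base amplicons and 8 bases trimmed at each end, 4 positions remain per amplicon.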


PrepareLibrary(
  info.dir = "Info/",
  bed.filename = "PACT-ACT_iDES_1_Regions.bed",
  snp.filename = "ExAC.r1.sites.vep.vcf.gz",
  snp.extra = NULL, = "positions_ranges.rda",
  output.dir = info.dir
)



  • info.dir, char, name of the folder containing the library information files (default 'Info/').

  • bed.filename, char, name of a BED table (tab-delimited) describing the Panel, with first 3 columns "chr" (ex: chr1), "start position" (ex: 115252190) and "end position" (ex: 115252305), i.e. the Ion AmpliSeq™ Colon and Lung Cancer Research Panel v2 (default 'lungcolonV2.bed.txt' as provided in the inst/extdata/Info folder of the package).

  • snp.filename, char, name of the vcf file describing known SNP positions, obtained from (default 'ExAC.r0.3.sites.vep.vcf.gz'). It requires a corresponding TBI file to be in the same folder (obtained from

  • snp.extra, a vector of char, extra known SNP positions manually curated (ex: "chrN:XXXXXXXXX").

  • char, filename to save pos_ind and pos_snp (default 'positions_ranges.rda').

  • output.dir, char, directory where pos_ind and pos_snp are saved (default info.dir).


Saves the following variables in an .rda file, in the folder defined by output.dir:

  • pos_ranges, a GRanges descriptor of amplicon positions

  • pos_ind, a vector of char "chrN:XXXXXXXXX", defining ALL index positions

  • pos_snp, a vector of char "chrN:XXXXXXXXX", defining SNP positions


N. Pécuchet, P. Laurent-Puig, O. Nordgård and Y. Rozenholc


Analysis of base-position error rate of next-generation sequencing to detect tumor mutations in circulating DNA N. Pécuchet, Y. Rozenholc, E. Zonta, D. Pietraz, A. Didelot, P. Combe, L. Gibault, J-B. Bachet, V. Taly, E. Fabre, H. Blons, P. Laurent-Puig in Clinical Chemistry

Novel hybridization- and tag-based error-corrected method for sensitive ctDNA mutation detection using ion semiconductor sequencing Kjersti Tjensvoll, Morten Lapin, Bjørnar Gilje, Herish Garresori, Satu Oltedal, Rakel Brendsdal Forthun, Anders Molven, Yves Rozenholc and Oddmund Nordgård in Scientific Reports




bad.pos = "chr7:15478"