createMutabilityMatrix - Builds a mutability model
createMutabilityMatrix builds a 5-mer nucleotide mutability model by counting
the number of mutations occurring in the center position for all 5-mer motifs.
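To make the 5-mer framing concrete, the following base R sketch enumerates every 5-mer over A, C, G and T and extracts the center (third) position, which is where mutations are counted. The variable names are illustrative and not part of the package, and restricting the alphabet to the four unambiguous nucleotides is an assumption.

```r
# All 5-mers over the four unambiguous nucleotides (4^5 motifs).
nucleotides <- c("A", "C", "G", "T")
grid <- expand.grid(rep(list(nucleotides), 5), stringsAsFactors = FALSE)
fivemers <- sort(do.call(paste0, grid))

# The center position of a 5-mer is its third nucleotide.
center <- substr(fivemers, 3, 3)

length(fivemers)  # number of distinct motifs
table(center)     # motifs grouped by their center nucleotide
```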
createMutabilityMatrix(db, substitutionModel, model = c("RS", "S"),
                       sequenceColumn = "SEQUENCE_IMGT",
                       germlineColumn = "GERMLINE_IMGT_D_MASK",
                       vCallColumn = "V_CALL",
                       multipleMutation = c("independent", "ignore"),
                       minNumSeqMutations = 500, numSeqMutationsOnly = FALSE,
                       returnSource = FALSE)
- db: data.frame containing sequence data.
- substitutionModel: matrix of 5-mer substitution rates built by createSubstitutionMatrix.
- model: type of model to create. The default model, "RS", creates a model by counting both replacement and silent mutations. The "S" specification builds a model by counting only silent mutations.
- sequenceColumn: name of the column containing IMGT-gapped sample sequences.
- germlineColumn: name of the column containing IMGT-gapped germline sequences.
- vCallColumn: name of the column containing the V-segment allele call.
- multipleMutation: string specifying how to handle multiple mutations occurring within the same 5-mer. If "independent", then multiple mutations within the same 5-mer are counted independently. If "ignore", then 5-mers with multiple mutations are excluded from the total mutation tally.
- minNumSeqMutations: minimum number of mutations in sequences containing each 5-mer required to compute the mutability rates. If the number is smaller than this threshold, the mutability for the 5-mer will be inferred. Default is 500. Not required if numSeqMutationsOnly=TRUE.
- numSeqMutationsOnly: when TRUE, return only a vector counting the number of observed mutations in sequences containing each 5-mer. This option can be used for parameter tuning for minNumSeqMutations during preliminary analysis using minNumSeqMutationsTune. Default is FALSE.
- returnSource: return the sources of 5-mer mutabilities (measured vs. inferred). Default is FALSE.
When numSeqMutationsOnly is FALSE, the function returns a named numeric vector of 1024 normalized mutability rates, one for each 5-mer motif, with names defining the 5-mer nucleotide sequence. When numSeqMutationsOnly is TRUE, it returns a named numeric vector of length 1024 counting the number of observed mutations in sequences containing each 5-mer.
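The difference between the two multipleMutation settings can be illustrated on a toy pair of sequences. The sketch below is base R and is not the package's implementation: count_center_mutations is a hypothetical helper, the sequences are invented, and IMGT gaps, silent/replacement classification and normalization are all left out. It assumes that "ignore" skips any 5-mer window holding more than one mutation.

```r
# Hypothetical helper: tally center-position mutations per germline 5-mer.
count_center_mutations <- function(germline, observed,
                                   multipleMutation = c("independent", "ignore")) {
  multipleMutation <- match.arg(multipleMutation)
  g <- strsplit(germline, "")[[1]]
  o <- strsplit(observed, "")[[1]]
  stopifnot(length(g) == length(o), length(g) >= 5)
  mutated <- g != o
  counts <- integer(0)
  for (i in 3:(length(g) - 2)) {
    window <- (i - 2):(i + 2)
    # Only a mutated center position contributes to the tally.
    if (!mutated[i]) next
    # Assumed meaning of "ignore": skip windows with more than one mutation.
    if (multipleMutation == "ignore" && sum(mutated[window]) > 1) next
    motif <- paste(g[window], collapse = "")
    prev <- if (motif %in% names(counts)) counts[[motif]] else 0L
    counts[motif] <- prev + 1L
  }
  counts
}

# Invented sequences: adjacent mutations at positions 3 and 4, isolated one at 8.
germline <- "ACGTACGTAC"
observed <- "ACTAACGAAC"
indep <- count_center_mutations(germline, observed, "independent")
ign   <- count_center_mutations(germline, observed, "ignore")
```

In this toy case the two adjacent mutations share their 5-mer windows, so "independent" tallies all three mutations while "ignore" keeps only the isolated one.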
- Yaari G, et al. Models of somatic hypermutation targeting and substitution based on synonymous mutations from high-throughput immunoglobulin sequencing data. Front Immunol. 2013 4(November):358.
# Subset example data to one isotype and sample as a demo
data(ExampleDb, package="alakazam")
db <- subset(ExampleDb, ISOTYPE == "IgA" & SAMPLE == "-1h")

# Create model using only silent mutations
sub_model <- createSubstitutionMatrix(db, model="S")
mut_model <- createMutabilityMatrix(db, sub_model, model="S",
                                    minNumSeqMutations=200,
                                    numSeqMutationsOnly=FALSE)
Warning: Insufficient number of mutations to infer some 5-mers. Filled with 0.
# Count the number of mutations in sequences containing each 5-mer
mut_count <- createMutabilityMatrix(db, sub_model, model="S",
                                    numSeqMutationsOnly=TRUE)
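One way to use such a count vector when choosing minNumSeqMutations is to check how many 5-mers would reach a candidate threshold and therefore be measured rather than inferred. The sketch below uses simulated counts (sim_count) as a stand-in for mut_count so that it runs without the example data. The candidate thresholds are arbitrary, and this is an illustration, not a description of what minNumSeqMutationsTune does internally.

```r
set.seed(1)

# Simulated stand-in for mut_count: one count per 5-mer, named by motif.
nucleotides <- c("A", "C", "G", "T")
fivemers <- do.call(paste0, expand.grid(rep(list(nucleotides), 5),
                                        stringsAsFactors = FALSE))
sim_count <- setNames(sample(0:1000, length(fivemers), replace = TRUE),
                      fivemers)

# For each candidate threshold, count the 5-mers whose tally is not
# smaller than the threshold (these would be measured, not inferred).
thresholds <- c(100, 200, 500)
measured <- sapply(thresholds, function(th) sum(sim_count >= th))
names(measured) <- thresholds
measured
```

Raising the threshold can only reduce the number of measured 5-mers, so the trade-off is between more reliable per-motif estimates and more motifs that must be inferred.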