testBaseline - Two-sided test of BASELINe PDFs
Description
testBaseline performs a two-sample significance test of BASELINe posterior probability density functions (PDFs).
Usage
testBaseline(baseline, groupBy)
Arguments
- baseline: Baseline object containing the db and grouped BASELINe PDFs returned by groupBaseline.
- groupBy: string defining the column in the db slot of the Baseline containing sequence or group identifiers.
Value
A data.frame with test results containing the following columns:
- region: sequence region, such as cdr and fwr.
- test: string defining the groups being compared. The string is formatted as the conclusion associated with the p-value, in the form GROUP1 != GROUP2. That is, the p-value is for rejecting the null hypothesis that GROUP1 and GROUP2 have equivalent distributions.
- pvalue: two-sided p-value for the comparison.
- fdr: FDR-corrected pvalue.
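To illustrate what an FDR-corrected p-value looks like in practice, the sketch below applies base R's p.adjust with the Benjamini-Hochberg method to the p-values shown in the Examples section of this page. The printed fdr values are consistent with that adjustment applied across all six comparisons. This is an observation about the example output, not a statement about the function's internals, and the variable names pvalue, fdr_printed and fdr_bh are illustrative only.

```r
# p-values copied from the testBaseline output in the Examples section
pvalue <- c(0.126536071, 0.040137450, 0.101394824,
            0.010416292, 0.006309629, 0.334456382)

# fdr values printed alongside them in the same output
fdr_printed <- c(0.15184329, 0.08027490, 0.15184329,
                 0.03124888, 0.03124888, 0.33445638)

# Benjamini-Hochberg adjustment over all six comparisons (stats::p.adjust)
fdr_bh <- p.adjust(pvalue, method = "BH")

# Compare, allowing for rounding in the printed values
matches <- isTRUE(all.equal(fdr_bh, fdr_printed, tolerance = 1e-6))
print(matches)
```

Because the adjustment depends on how many comparisons are made, the same p-value can map to a different fdr when the number of groups or regions changes.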
References
- Yaari G, et al. Quantifying selection in high-throughput immunoglobulin sequencing data sets. Nucleic Acids Res. 2012 40(17):e134. (Corrections at http://selection.med.yale.edu/baseline/correction/)
Examples
# Subset example data
data(ExampleDb, package="alakazam")
db <- subset(ExampleDb, c_call %in% c("IGHM", "IGHG", "IGHA"))
# Collapse clones
db <- collapseClones(db, cloneColumn="clone_id",
                     sequenceColumn="sequence_alignment",
                     germlineColumn="germline_alignment_d_mask",
                     method="thresholdedFreq", minimumFrequency=0.6,
                     includeAmbiguous=FALSE, breakTiesStochastic=FALSE)
# Calculate BASELINe
baseline <- calcBaseline(db,
                         sequenceColumn="clonal_sequence",
                         germlineColumn="clonal_germline",
                         testStatistic="focused",
                         regionDefinition=IMGT_V,
                         targetingModel=HH_S5F,
                         nproc=1)
calcBaseline will calculate observed and expected mutations for clonal_sequence using clonal_germline as a reference.
Calculating BASELINe probability density functions...
# Group PDFs by the isotype
grouped <- groupBaseline(baseline, groupBy="c_call")
Grouping BASELINe probability density functions...
Calculating BASELINe statistics...
# Visualize isotype PDFs
plot(grouped, "c_call")
# Perform test on isotype PDFs
testBaseline(grouped, groupBy="c_call")
  region         test      pvalue        fdr
1    cdr IGHM != IGHA 0.126536071 0.15184329
2    cdr IGHM != IGHG 0.040137450 0.08027490
3    cdr IGHA != IGHG 0.101394824 0.15184329
4    fwr IGHM != IGHA 0.010416292 0.03124888
5    fwr IGHM != IGHG 0.006309629 0.03124888
6    fwr IGHA != IGHG 0.334456382 0.33445638
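A common next step is to keep only the comparisons that pass a significance threshold and to recover the two group names from the test column. The sketch below does this in base R on a copy of the results printed above, so it runs without shazam installed. The 0.05 cutoff and the names res, sig, group1 and group2 are assumptions made for illustration.

```r
# Results copied from the output above, so the sketch is self-contained
res <- data.frame(
  region = rep(c("cdr", "fwr"), each = 3),
  test   = rep(c("IGHM != IGHA", "IGHM != IGHG", "IGHA != IGHG"), times = 2),
  pvalue = c(0.126536071, 0.040137450, 0.101394824,
             0.010416292, 0.006309629, 0.334456382),
  fdr    = c(0.15184329, 0.08027490, 0.15184329,
             0.03124888, 0.03124888, 0.33445638),
  stringsAsFactors = FALSE
)

# Keep comparisons below an assumed FDR threshold of 0.05
sig <- res[res$fdr < 0.05, ]

# Split each "GROUP1 != GROUP2" label into its two group names
groups <- do.call(rbind, strsplit(sig$test, " != ", fixed = TRUE))
sig$group1 <- groups[, 1]
sig$group2 <- groups[, 2]

print(sig)
```

With the values above, only the two fwr comparisons involving IGHM remain after filtering on fdr.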
See also
To generate the Baseline input object, see groupBaseline.