library(oncoPredict)
#This vignette demonstrates how to prepare predicted drug response and mutation
#data for mutation-based IDWAS with idwas(cnv=FALSE).
#Determine the parameters of the idwas() function...
#Set the drug_prediction parameter.
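#Make sure the final object is a data frame with samples as rownames() and drugs as colnames().
#The example file is read with samples in columns, so it is transposed after the sample identifiers are cleaned.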
drug_prediction<-as.data.frame(read.table(vignette_file("DrugPredictions.txt"), header=TRUE, row.names=1))
#In this example, replace '.' with '-' so the TCGA sample identifiers match the
#format used in the mutation data.
colnames(drug_prediction)<-gsub(".", "-", colnames(drug_prediction), fixed=TRUE)
#Make sure the sample identifiers in the 'drug prediction' data have the same form as the sample identifiers in the 'data' parameter.
#Here, the first two characters of each identifier are dropped.
cols<-colnames(drug_prediction)
colnames(drug_prediction)<-substring(cols, 3, nchar(cols))
drug_prediction<-as.data.frame(t(drug_prediction))

This vignette provides an example of how to prepare mutation data
from the GDC database for GBM (glioblastoma) and how to apply
idwas() to test predicted drug response against somatic
mutations.
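The identifier clean-up applied to the drug predictions above can be tried on a toy table first. This is only a sketch: the "X." prefix, the sample identifiers, and the drug names are made up for illustration, and your own prediction file may use a different prefix or none at all.

```r
# Toy drugs-by-samples table (all names and values are hypothetical).
toy <- data.frame(
  X.TCGA.02.0003.01A = c(1.2, 0.4),
  X.TCGA.02.0033.01A = c(0.9, 0.7),
  row.names = c("DrugA", "DrugB")
)

# Replace '.' with '-' literally, then drop the two-character prefix.
ids <- gsub(".", "-", colnames(toy), fixed = TRUE)
colnames(toy) <- substring(ids, 3, nchar(ids))

# Transpose so that samples become rows and drugs become columns.
toy <- as.data.frame(t(toy))

# Confirm the orientation before passing a table like this on.
stopifnot(is.data.frame(toy), all(startsWith(rownames(toy), "TCGA-")))
```

If the final check fails on real data, inspect the raw column names before deciding how many leading characters to drop.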
Because GDC and TCGAbiolinks access patterns can change over time, the download code is shown as non-executed guidance.
Download mutation data for your cancer of interest from the GDC database. The TCGAbiolinks mutation vignette describes the workflow:
https://bioconductor.org/packages/release/bioc/vignettes/TCGAbiolinks/inst/doc/mutation.html
https://rdrr.io/bioc/TCGAbiolinks/f/vignettes/mutation.Rmd
The code would look something like this:
library(TCGAbiolinks)
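#Note: recent TCGAbiolinks releases may no longer accept the legacy argument;
#check the current GDCquery() documentation before running this.
query_maf <- GDCquery(project = "TCGA-GBM",
                      data.category = "Simple Nucleotide Variation",
                      access = "open",
                      data.type = "Simple somatic mutation",
                      legacy = TRUE)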
GDCdownload(query_maf)
maf <- GDCprepare(query_maf)

After downloading the mutation data, format the mutation table before running IDWAS.
#Make sure this data is a data frame with mutation annotations in columns.
#For idwas(cnv=FALSE), the data should include Variant_Classification,
#Hugo_Symbol, and Tumor_Sample_Barcode.
data<-as.data.frame(maf)
#Make sure these sample ids are of the same form as the sample ids in your prediction data.
#Here, the last 12 characters of each barcode are dropped.
samps<-data$Tumor_Sample_Barcode
data$Tumor_Sample_Barcode<-substr(samps,1,nchar(samps)-12)
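Before going further, it can help to confirm that the trimmed barcodes really overlap with the prediction sample ids. The sketch below uses hypothetical barcodes and prediction ids; it assumes 28-character aliquot-level barcodes, so that dropping the last 12 characters leaves a 16-character sample-level id.

```r
# Hypothetical aliquot-level barcodes, 28 characters each.
barcodes <- c("TCGA-02-0003-01A-01D-1490-08",
              "TCGA-02-0033-01A-01D-1490-08")

# Drop the last 12 characters, as done for data$Tumor_Sample_Barcode.
trimmed <- substr(barcodes, 1, nchar(barcodes) - 12)

# Hypothetical sample ids taken from an already cleaned prediction table.
pred_ids <- c("TCGA-02-0003-01A", "TCGA-02-0047-01A")

# Samples present in both data sets; an empty result signals mismatched id formats.
shared <- intersect(trimmed, pred_ids)
if (length(shared) == 0) warning("No sample ids are shared between the two data sets.")
```

With real data, the same check would compare unique(data$Tumor_Sample_Barcode) against rownames(drug_prediction).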
#Set the minimum number of samples in which a mutation must occur. The default is 10.
n<-10
#Indicate whether or not you would like to test CNA amplification data. If TRUE, you will test CNA amplifications. If FALSE, you will test mutation data.
cnv<-FALSE
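To get a feel for how the choice of n interacts with a mutation table, the sketch below checks for the three required columns and counts how many distinct samples carry a mutation in each gene. The toy table, the sample labels, and the threshold toy_n are made up, and the "at least toy_n samples" rule is an assumption for illustration; it is not meant to reproduce the exact filtering done inside idwas().

```r
# Toy MAF-like table (hypothetical sample labels and calls).
toy_maf <- data.frame(
  Hugo_Symbol = c("TP53", "TP53", "TP53", "EGFR", "EGFR", "PTEN"),
  Tumor_Sample_Barcode = c("S1", "S2", "S2", "S1", "S3", "S4"),
  Variant_Classification = c("Missense_Mutation", "Nonsense_Mutation",
                             "Missense_Mutation", "Missense_Mutation",
                             "Missense_Mutation", "Frame_Shift_Del"),
  stringsAsFactors = FALSE
)

# Columns needed for the mutation-based analysis (cnv=FALSE).
required <- c("Variant_Classification", "Hugo_Symbol", "Tumor_Sample_Barcode")
missing_cols <- setdiff(required, colnames(toy_maf))
stopifnot(length(missing_cols) == 0)

# Number of distinct mutated samples per gene.
samples_per_gene <- tapply(toy_maf$Tumor_Sample_Barcode, toy_maf$Hugo_Symbol,
                           function(x) length(unique(x)))

# Genes mutated in at least toy_n samples (toy threshold, much smaller than n=10).
toy_n <- 2
genes_kept <- names(samples_per_gene)[samples_per_gene >= toy_n]
```

On a real cohort, a very small set of genes passing the threshold suggests lowering n or checking that the sample ids were trimmed correctly.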