An introduction to aPEAR

aPEAR is designed to help you notice the most important biological themes in your enrichment analysis results. It analyses the gene lists of the pathways and detects clusters of redundant overlapping gene sets.

Let’s begin by performing a simple gene set enrichment analysis with clusterProfiler:
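The code for this step is not shown here. A minimal sketch follows. It assumes the example `geneList` dataset shipped with the DOSE package and a GO cellular component analysis with `gseGO`. The dataset, the ontology, and the seed are assumptions, chosen to be consistent with the pathway names in the outputs further down, and your own analysis may differ.

```r
library(clusterProfiler)
library(org.Hs.eg.db)

# Assumption: the ranked example gene list shipped with DOSE
data(geneList, package = 'DOSE')

# Assumption: GSEA against GO cellular component terms; the seed is arbitrary
set.seed(42)
enrich <- gseGO(geneList, OrgDb = org.Hs.eg.db, ont = 'CC')

# The table that is later passed to aPEAR
head(enrich@result)
```

The object `enrich` is the one used in the calls that follow, and `enrich@result` is a plain data.frame with one row per pathway.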

Generate an enrichment network with `enrichmentNetwork()`

enrichmentNetwork is the most important function exported by aPEAR. It detects clusters of similar pathways and generates a ggplot2 visualization. The only thing it asks you to provide is your enrichment result:

set.seed(654824)
enrichmentNetwork(enrich@result)

Internally, enrichmentNetwork calls two functions, findPathClusters and plotPathClusters, which are described in more detail below.

What if I performed my enrichment analysis using another method, not `clusterProfiler`?

aPEAR currently recognizes input from clusterProfiler and gProfileR. However, if you have custom enrichment input, do not worry!

aPEAR accepts any kind of enrichment input as long as it is formatted correctly. The only requirement is that the gene list of each pathway is known. Format your data so that:

- It is a data.frame.
- It has a column titled Description. This column is used to label each node and to choose the title of each cluster.
- It has a column titled pathwayGenes which contains the gene list of each pathway. This column is used to calculate the similarities between the pathways. It can hold the leading edge genes or the full gene list. The ID type (Ensembl, gene symbol, etc.) does not matter, but it must be the same for all pathways. The genes must be separated by “/”.
- It has a column for colouring the nodes. Specify it with the parameter colorBy.
- It has a column for setting the node size. Specify it with the parameter nodeSize.

For example, you might format your data like this:

enrichmentData[ 1:5 ]
#>                                 Description         pathwayGenes      NES Size
#> 1:           chromosome, centromeric region 55143/1062/10403/... 2.646268  188
#> 2:                              kinetochore 1062/10403/55355/... 2.630240  130
#> 3: condensed chromosome, centromeric region 1062/10403/55355/... 2.625070  138
#> 4:                       nuclear chromosome 8318/55388/7153/2... 2.582163  175
#> 5:                       chromosomal region 55143/1062/10403/... 2.544742  305
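A table in this shape can be assembled with base R alone. The sketch below is only an illustration: `pathwayList`, `customEnrichment`, the gene IDs, and the scores are all made up, and the only parts taken from the requirements above are the column names Description and pathwayGenes and the “/” separator.

```r
# Hypothetical input: pathway names mapped to gene identifiers (made-up IDs)
pathwayList <- list(
  'pathway A' = c('1062', '10403', '55355'),
  'pathway B' = c('1062', '10403', '8318'),
  'pathway C' = c('7153', '55388')
)

customEnrichment <- data.frame(
  Description = names(pathwayList),
  # Collapse each gene vector into a single '/'-separated string
  pathwayGenes = vapply(pathwayList, paste, character(1), collapse = '/',
                        USE.NAMES = FALSE),
  NES = c(2.1, 1.8, -1.5),             # made-up scores, used for colouring
  Size = unname(lengths(pathwayList)), # number of genes, used for node size
  stringsAsFactors = FALSE
)

customEnrichment
```

The NES and Size column names are free to choose, since they are passed explicitly through colorBy and nodeSize.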

Then tell enrichmentNetwork which columns to use for the node colour and the node size:

p <- enrichmentNetwork(enrichmentData, colorBy = 'NES', nodeSize = 'Size', verbose = TRUE)
#> Validating parameters...
#> Validating enrichment data...
#> Detected enrichment type custom
#> Calculating pathway similarity using method jaccard
#> Using Markov Cluster Algorithm to detect pathway clusters...
#> Clustering done
#> Using Pagerank algorithm to assign cluster titles...
#> Pagerank scores calculated
#> Creating the enrichment network visualization...
#> Validating theme parameters...
#> Preparing enrichment data for plotting...
#> Detected enrichment type custom
#> Creating the enrichment graph...
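The log above reports that pathway similarity is calculated with the jaccard method. As a rough illustration of what a Jaccard index between two pathway gene lists means, here is a minimal base R sketch. The helper `jaccardIndex` is hypothetical and is not part of aPEAR, whose internal implementation may differ.

```r
# Hypothetical helper, not an aPEAR function:
# Jaccard index of two '/'-separated gene lists
jaccardIndex <- function(genesA, genesB) {
  a <- unique(strsplit(genesA, '/', fixed = TRUE)[[1]])
  b <- unique(strsplit(genesB, '/', fixed = TRUE)[[1]])
  # Shared genes divided by all genes found in either list
  length(intersect(a, b)) / length(union(a, b))
}

# Two shared genes out of four distinct genes in total
jaccardIndex('1062/10403/55355', '1062/10403/8318')
#> [1] 0.5
```

A value of 1 means the two gene lists are identical, and a value of 0 means they share no genes.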

What if I performed ORA and do not have the normalized enrichment score (NES)?

Good news: you can use the p-values to color the nodes! Just specify the colorBy column and colorType = 'pval':

set.seed(348934)
enrichmentNetwork(enrich@result, colorBy = 'pvalue', colorType = 'pval', pCutoff = -5)

Find pathway clusters with `findPathClusters()`

If your goal is only to obtain the clusters of redundant pathways, the function findPathClusters is the way to go. It accepts a data.frame with the enrichment results and returns a list of the pathway clusters and the similarity matrix:

clusters <- findPathClusters(enrich@result, cluster = 'hier', minClusterSize = 6)

clusters$clusters[ 1:5 ]
#>            Pathway     Cluster
#> 1: mitotic spindle microtubule
#> 2:         spindle microtubule
#> 3:         midbody microtubule
#> 4:      centrosome microtubule
#> 5:     microtubule microtubule

pathways <- clusters$clusters[ 1:5, Pathway ]
clusters$similarity[ pathways, pathways ]
#>                 mitotic spindle   spindle   midbody centrosome microtubule
#> mitotic spindle       1.0000000 0.4090909 0.3142857  0.1940299   0.2857143
#> spindle               0.4090909 1.0000000 0.2686567  0.2659574   0.3793103
#> midbody               0.3142857 0.2686567 1.0000000  0.1428571   0.2586207
#> centrosome            0.1940299 0.2659574 0.1428571  1.0000000   0.1630435
#> microtubule           0.2857143 0.3793103 0.2586207  0.1630435   1.0000000
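To give an intuition for how a similarity matrix like the one above can drive hierarchical clustering, the following base R sketch converts similarity to distance and cuts the resulting tree. It is an illustration only: the average linkage and the choice of two groups are assumptions, and this is not presented as aPEAR's internal procedure.

```r
# Similarity values copied from the output above
pathways <- c('mitotic spindle', 'spindle', 'midbody', 'centrosome', 'microtubule')
sim <- matrix(
  c(1.0000000, 0.4090909, 0.3142857, 0.1940299, 0.2857143,
    0.4090909, 1.0000000, 0.2686567, 0.2659574, 0.3793103,
    0.3142857, 0.2686567, 1.0000000, 0.1428571, 0.2586207,
    0.1940299, 0.2659574, 0.1428571, 1.0000000, 0.1630435,
    0.2857143, 0.3793103, 0.2586207, 0.1630435, 1.0000000),
  nrow = 5, byrow = TRUE, dimnames = list(pathways, pathways)
)

# Turn similarity into distance and build a tree (average linkage is an assumption)
tree <- hclust(as.dist(1 - sim), method = 'average')

# Cut the tree into two groups (the number 2 is arbitrary, for illustration)
groups <- cutree(tree, k = 2)
split(names(groups), groups)
```

Pathways with high pairwise similarity end up in the same branch of the tree, which is the general idea behind grouping redundant gene sets.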

For more information about available similarity metrics, clustering methods, cluster naming conventions, and other available parameters, see ?aPEAR.methods.

Ieva Kerseviciute

2023-06-02
