This repository was archived by the owner on Nov 8, 2021. It is now read-only.

Volcano plot for showing significant changes in reads per gene #35

@Gregory94


To indicate genes that have a fitness change between different libraries, I am creating a volcano plot.
Each dot represents a single gene. The x-axis shows the fold change in the number of reads or insertions between two libraries, and the y-axis shows the p-value (determined by an independent t-test). Note that the x-axis is on a log2 scale and the y-axis is on a -log10 scale.
So the interesting genes are the ones that are high on the y-axis and far away from 0 on the x-axis.
[Figure: volcanoplot_reads]
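To make the axis convention concrete, here is a minimal sketch of how one gene maps onto the plot. The function names and the cutoffs (`min_abs_log2_fc`, `max_p`) are hypothetical and chosen only for illustration; they are not taken from volcano.py.

```python
import math


def volcano_point(fold_change, p_value):
    """Return (x, y) for one gene: log2 fold change and -log10 p-value."""
    return math.log2(fold_change), -math.log10(p_value)


def is_interesting(fold_change, p_value, min_abs_log2_fc=1.0, max_p=0.05):
    """Flag genes that are far from 0 on x and high on y.

    Both thresholds are illustrative assumptions, not values from the issue.
    """
    x, y = volcano_point(fold_change, p_value)
    return abs(x) >= min_abs_log2_fc and y >= -math.log10(max_p)


# A 4-fold increase with p = 0.01 lands at roughly (2, 2) on the plot.
x, y = volcano_point(4.0, 0.01)
```

A fold change of exactly 0 or a p-value of 0 cannot be placed on these axes, since the logarithm is undefined there.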
When I compare my results with those found by Agnes (Kornmann lab), the numbers don't seem to match. The figure below uses the same dataset.
[Figure: volcanoplot_reads_Agnes]
To determine the fold change, I sum all reads of a specific gene over all datasets of a library (e.g. we have 2 wt datasets and 4 dNrp1 datasets) and then normalize by the total number of insertions in that library. So I end up with two values per gene: the normalized summed number of reads for wt and for dNrp1.
The p-value of the Student's t-test is computed in Python using scipy.stats.ttest_ind(wt_datasets, dNrp1_datasets).
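For a single gene, the calculation described above could look roughly like the sketch below. All numbers are made up, the variable names are hypothetical, and two things are assumptions because the description does not settle them: the fold change is taken as dNrp1 over wt, and the t-test is run on the raw per-dataset read counts.

```python
import math

from scipy import stats

# Hypothetical read counts for ONE gene, one value per dataset.
wt_reads = [120, 135]            # 2 wt datasets
dnrp1_reads = [40, 55, 38, 47]   # 4 dNrp1 datasets

# Hypothetical total number of insertions per library (summed over datasets).
wt_total_insertions = 50_000
dnrp1_total_insertions = 110_000

# Sum the reads over all datasets of a library, then normalize by the
# library's total insertion count.
wt_norm = sum(wt_reads) / wt_total_insertions
dnrp1_norm = sum(dnrp1_reads) / dnrp1_total_insertions

# Assumed direction: dNrp1 relative to wt.
log2_fc = math.log2(dnrp1_norm / wt_norm)

# Independent two-sample t-test on the per-dataset values.
t_stat, p_value = stats.ttest_ind(wt_reads, dnrp1_reads)

print(f"log2 fold change: {log2_fc:.2f}, p-value: {p_value:.3g}")
```

Note that with only 2 and 4 values per group the t-test has very few degrees of freedom, so the p-values are sensitive to single outlying datasets.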

One thing that stands out in my graph is the cluster with a large negative fold change. I checked the genes in this cluster, and they all have 0 reads in all datasets except for one. This probably distorts the fold change calculation, but I am not yet sure how to deal with it.
Another issue is that I don't know the total number of insertions for each dataset. For now I only use the insertion count within all genes (the only data I have), but this ignores all insertions outside the genes, which might explain some of the differences between my results and those from Agnes.
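One common workaround for zero counts, shown here only as a possible option and not as something this issue has settled on, is to add a small pseudocount to both summed read counts before taking the ratio. The function name and the default pseudocount of 1 are assumptions for illustration.

```python
import math


def log2_fold_change(reads_ref, reads_test, pseudocount=1.0):
    """log2 of (test / reference) with a pseudocount added to both counts.

    The pseudocount keeps the ratio finite when one of the counts is 0;
    its value (1.0 here) is an arbitrary illustrative choice.
    """
    return math.log2((reads_test + pseudocount) / (reads_ref + pseudocount))


# Without a pseudocount, 0 reads in one library gives log2(0), which is undefined.
fc_zero_test = log2_fold_change(100, 0)   # finite and negative
fc_both_zero = log2_fold_change(0, 0)     # exactly 0: no evidence of a change
```

The pseudocount caps how extreme the fold change of a low-count gene can get, but it also biases genes with few reads toward 0, so filtering out genes below a minimum read count may be a reasonable alternative.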
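To get a feeling for how much the choice of total matters, the sketch below compares two sets of hypothetical totals under the per-library normalization described earlier. All counts and names are invented for illustration. With this normalization, swapping the totals moves every gene's log2 fold change by the same constant, so it would shift the whole plot along the x-axis rather than rearrange individual genes.

```python
import math


def log2_fc(reads_wt, reads_mut, total_wt, total_mut):
    """log2 fold change (mutant over wt) after per-library normalization."""
    return math.log2((reads_mut / total_mut) / (reads_wt / total_wt))


# Hypothetical summed reads per gene: (wt, dNrp1).
genes = {"geneA": (200, 50), "geneB": (80, 160)}

# Hypothetical totals: insertions counted inside genes only vs. genome-wide.
in_gene_totals = (40_000, 90_000)
all_totals = (60_000, 150_000)

# Difference between the two normalizations, per gene.
shifts = {
    name: log2_fc(wt, mut, *in_gene_totals) - log2_fc(wt, mut, *all_totals)
    for name, (wt, mut) in genes.items()
}

for name, shift in shifts.items():
    print(f"{name}: shift in log2 fold change = {shift:.3f}")
```

If the differences with the other analysis vary from gene to gene, something other than this normalization constant is likely involved as well.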

I use volcano.py for creating the above plot.
