Commit 3095567

Merge pull request #43 from AlexandrovLab/development
Development
2 parents 3c258eb + 3eebbd0 commit 3095567

36 files changed (+16030, -13244 lines)

MANIFEST.in

Lines changed: 2 additions & 3 deletions
@@ -6,6 +6,5 @@ include SigProfilerAssignment/data/Reference_Signatures/GRCh38/*
 include SigProfilerAssignment/data/Reference_Signatures/mm9/*
 include SigProfilerAssignment/data/Reference_Signatures/mm10/*
 include SigProfilerAssignment/data/Reference_Signatures/rn6/*
-include SigProfilerAssignment/src/FormatFiles/*
-include SigProfilerAssignment/src/Fonts/*
-include SigProfilerAssignment/src/*
+include SigProfilerAssignment/DecompositionPlots/reference_files/Fonts/*
+include SigProfilerAssignment/DecompositionPlots/reference_files/*

README.md

Lines changed: 44 additions & 9 deletions
@@ -3,7 +3,7 @@
 
 
 
-<img src="SigProfilerAssignment/src/figures/SigProfilerAssignment.png" alt="drawing" width="1000"/>
+<img src="SigProfilerAssignment/figures/SigProfilerAssignment.png" alt="drawing" width="1000"/>
 
 # SigProfilerAssignment
 SigProfilerAssignment is a new mutational attribution and decomposition tool that performs the following functions:
@@ -72,7 +72,7 @@ spa_analyze( samples, output, signatures=None, signature_database=None,decompo
 ``` -->
 ### Decompose Fit
 Decomposes the De Novo Signatures into COSMIC Signatures and assigns COSMIC signatures into samples.
-<img src="SigProfilerAssignment/src/figures/decomp_pic.jpg" alt="drawing" width="600"/>
+<img src="SigProfilerAssignment/figures/decomp_pic.jpg" alt="drawing" width="600"/>
 
 ```python
 from SigProfilerAssignment import Analyzer as Analyze
@@ -86,9 +86,12 @@ Analyze.decompose_fit(samples,
 exclude_signature_subgroups=exclude_signature_subgroups,
 exome=False)
 ```
+
+## Analysis
+
 ### *De Novo* Fit
 Attributes mutations of given Samples to input denovo signatures.
-<img src="SigProfilerAssignment/src/figures/denovo_fit.jpg" alt="drawing" width="600"/>
+<img src="SigProfilerAssignment/figures/denovo_fit.jpg" alt="drawing" width="600"/>
 
 ```python
 from SigProfilerAssignment import Analyzer as Analyze
@@ -102,7 +105,7 @@ Analyze.denovo_fit( samples,
 ### COSMIC Fit
 Attributes mutations of given Samples to input COSMIC signatures. Note that penalties associated with denovo fit and COSMIC fits are different.
 
-<img src="SigProfilerAssignment/src/figures/cosmic_fit.jpg" alt="drawing" width="600"/>
+<img src="SigProfilerAssignment/figures/cosmic_fit.jpg" alt="drawing" width="600"/>
 
 ```python
 from SigProfilerAssignment import Analyzer as Analyze
@@ -121,23 +124,25 @@ Analyze.cosmic_fit( samples,
 ## Main Parameters
 | Parameter | Variable Type | Parameter Description |
 | --------------------- | -------- |-------- |
-| **samples** | String | Path to a tab delimilted file that contains the samples table where the rows are mutation types and colunms are sample IDs. or Path to VCF files directory if input files are VCF Files. |
+| **samples** | String | Path to input file for `input_type`:<ul><li>"matrix"</li><li>"seg:TYPE"</li></ul> Path to input folder for `input_type`:<ul><li>"vcf"</li></ul>|
 | **output** | String | Path to the output folder. |
-| **input_type** | String | The type of input:<br><ul><li>"vcf": used for vcf format inputs.</li><li>"matrix": used for table format inputs using a tab seperated file.</li></ul> Default value is "matrix"|
+| **input_type** | String | The type of input:<br><ul><li>"matrix": used for table format inputs using a tab-separated file where the rows are mutation types and the columns are sample IDs.</li><li>"vcf": used for mutation calling file inputs (VCFs, MAFs or simple text files).</li><li>"seg:TYPE": used for a multi-sample segmentation file for copy number analysis. The accepted callers for TYPE are the following {"ASCAT", "ASCAT_NGS", "SEQUENZA", "ABSOLUTE", "BATTENBERG", "FACETS", "PURPLE", "TCGA"}. For example, when using segmentation file from BATTENBERG then set input_type to "seg:BATTENBERG".</li></ul> The default value is "matrix".|
+| **context_type**| String| Required context type if `input_type` is "vcf". `context_type` takes which context type of the input data is considered for assignment. Valid options include "96", "288", "1536", "DINUC", and "INDEL". The default value is "96".|
 | **signatures** | String | Path to a tab delimited file that contains the signature table where the rows are mutation types and colunms are signature IDs. |
 | **genome_build** | String | The reference genome build. List of supported genomes: "GRCh37", "GRCh38", "mm9", "mm10" and "rn6". The default value is "GRCh37". If the selected genome is not in the supported list, the default genome will be used. |
 | **cosmic_version** | Float | Takes a positive float among 1, 2, 3, 3.1, 3.2 and 3.3. Defines the version of the COSMIC reference signatures. The default value is 3.3. |
 | **new_signature_thresh_hold**| Float | Parameter in cosine similarity to declare a new signature. Applicable for decompose_fit only. The default value is 0.8. |
 | **make_plots** | Boolean | Toggle on and off for making and saving all plots. Default value is True. |
 | **exclude_signature_subgroups** | List | Removes the signatures corresponding to specific subtypes for better fitting. The usage is given above. Default value is None. |
 | **exome** | Boolean | Defines if the exome renormalized signatures will be used. The default value is False. |
-| **context_type**| String| Reqd context type if "input_type" is "vcf". 'context_type' takes what context type of the mutation matrix to be considered for assignment. Valid options include '96', '6', '24', '4608', '288', '18','6144', '384', '1536', 'DINUC'. Default Value is '96'|
 | **verbose** | Boolean | Prints statements. Default value is False. |
 
 
 
 
-#### SPA analysis Example for a matrix
+## Examples
+
+### SPA analysis - Example for a matrix
 
 
 ```python
@@ -167,7 +172,7 @@ Analyze.cosmic_fit( samples,
 
 ```
 
-#### SPA analysis Example for input vcf files
+### SPA analysis - Example for input vcf files
 
 
 ```python
@@ -198,6 +203,36 @@ Analyze.cosmic_fit( samples,
 exome=False)
 
 ```
+
+### SPA analysis - Example for an input multi-sample segmentation file
+
+
+```python
+#import modules
+import SigProfilerAssignment as spa
+from SigProfilerAssignment import Analyzer as Analyze
+
+#set directories and paths to signatures and samples
+dir_inp = spa.__path__[0]+'/data/Examples/'
+samples = spa.__path__[0]+'/data/cnvtest/all.breast.ascat.summary.sample.tsv' # segmentation file
+output = "output_example/"
+
+#Analysis of SP Assignment
+Analyze.cosmic_fit( samples,
+output,
+input_type="seg:ASCAT_NGS",
+context_type="CNV48",
+signatures=None,
+signature_database=None,
+genome_build="GRCh37",
+cosmic_version=3.3,
+verbose=False,
+collapse_to_SBS96=False,
+make_plots=True,
+exclude_signature_subgroups=None,
+exome=False)
+```
+
 ## <a name="copyright"></a> Copyright
 This software and its documentation are copyright 2022 as a part of the SigProfiler project. The SigProfilerAssignment framework is free software and is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

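Note on the new `input_type` values: the updated README table in this commit accepts "matrix", "vcf", and "seg:TYPE" with a fixed set of callers. The following is a minimal sketch of how a caller might check such a string before passing it to `Analyze.cosmic_fit`. The helper `parse_input_type` and the constant `SEG_CALLERS` are hypothetical names introduced here for illustration and are not part of SigProfilerAssignment; only the accepted values are taken from the README text above.

```python
# Hypothetical helper, not part of SigProfilerAssignment.
# Accepted values are copied from the README parameter table in this commit.
SEG_CALLERS = {
    "ASCAT", "ASCAT_NGS", "SEQUENZA", "ABSOLUTE",
    "BATTENBERG", "FACETS", "PURPLE", "TCGA",
}


def parse_input_type(input_type="matrix"):
    """Split an input_type string into (kind, caller).

    Returns ("matrix", None), ("vcf", None) or ("seg", CALLER).
    Raises ValueError for anything the README does not list.
    """
    if input_type in ("matrix", "vcf"):
        return input_type, None

    # "seg:TYPE" carries the segmentation caller after the colon.
    kind, sep, caller = input_type.partition(":")
    if kind == "seg" and sep and caller in SEG_CALLERS:
        return kind, caller

    raise ValueError(
        f"unsupported input_type {input_type!r}; expected 'matrix', 'vcf' "
        f"or 'seg:TYPE' with TYPE in {sorted(SEG_CALLERS)}"
    )


if __name__ == "__main__":
    print(parse_input_type("seg:BATTENBERG"))
```

Checking the string up front is only a convenience for the caller under these assumptions; how the library itself handles an unsupported value is not shown in this diff.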