Germline SNV - WES Analysis

Germline SNV - WES analysis pipelines by the Genetic Predisposition to Gastrointestinal Cancer Group.

The folder structure changed on February 26th, 2019, but most workflows have not yet been updated to match it.

Two different models of inheritance are considered: autosomal dominant (heterozygosity) and recessive (homozygosity or compound heterozygosity). You can see an overview of the filtering criteria used in the workflow diagrams.
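As a rough illustration of the two models, the sketch below groups the variants of one sample by gene and flags dominant and recessive candidates. The record keys (`gene`, `variant`, `zygosity`) and the function name are hypothetical and are not taken from the pipeline. The sketch does not check phase, so two heterozygous variants in the same gene are only putative compound heterozygotes.

```python
from collections import defaultdict


def classify_by_inheritance(variants):
    """Split the variants of one sample into dominant and recessive candidates.

    `variants` is a list of dicts with the hypothetical keys
    'gene', 'variant' and 'zygosity' ('het' or 'hom').
    """
    by_gene = defaultdict(list)
    for record in variants:
        by_gene[record["gene"]].append(record)

    dominant, recessive = [], []
    for gene_variants in by_gene.values():
        hets = [v for v in gene_variants if v["zygosity"] == "het"]
        homs = [v for v in gene_variants if v["zygosity"] == "hom"]

        # Autosomal dominant model: any heterozygous variant is a candidate.
        dominant.extend(hets)

        # Recessive model: homozygous variants are candidates on their own.
        recessive.extend(homs)

        # Putative compound heterozygosity: two or more heterozygous
        # variants in the same gene (phase is NOT verified here).
        if len(hets) >= 2:
            recessive.extend(hets)

    return dominant, recessive
```

A gene with a single heterozygous variant therefore appears only under the dominant model, while a gene with two heterozygous variants appears under both.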

Different workflows have also been developed according to the structure of the input files. The input can be either one file per sample or a single large file containing the germline variants of all the samples in the cohort. In all cases, tab-separated text files are required to run these pipelines (not VCF files).

In all cases, an annotation folder is required. This folder is not provided on GitHub because of its size, but it should contain the following folders and files:

DBs_annotation
- filtering_terms_CRC.xls
- HGNC_ProteinAtlas_cancer.csv
- interactions
- gene2go
- HGNC_ProteinAtlas_normal_tissue.csv
- OMIM_morbidmap.txt
- generifs_basic
- HGNC_ProteinAtlas_subcellular_location.csv
- refSeqSummaryfilt_noalmohad.txt
- HGNC_entrez.txt
- HGNC_RefSeq.txt
- uniprot-all.tab
- HGNC_OMIM.txt
- HGNC_synonyms.txt
gene_families_&_pathways
- Functional_Families_con_genes_hereditarios.txt
- Gene_Families_con_genes_hereditarios.txt
- PATHWAYS_db.txt
genes_candidatos
- genes_candidatos_23.09.2016.txt
- Cancer_predisposing_genes_Nature_2014.txt
LD_regions
- LIST_Regions.txt
pLI
- fordist_cleaned_exac_r03_march16_z_pli_rec_null_data.txt
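Before running a workflow, it can help to confirm that a local annotation folder matches the listing above. The following is a minimal sketch, not part of the pipelines; the function name is hypothetical, and it assumes that each listed entry sits directly inside the folder it is listed under. Entries are checked only for existence, since the listing does not say whether items such as `interactions` are files or folders.

```python
from pathlib import Path

# Expected layout, copied from the listing above.
EXPECTED_ANNOTATION = {
    "DBs_annotation": [
        "filtering_terms_CRC.xls",
        "HGNC_ProteinAtlas_cancer.csv",
        "interactions",
        "gene2go",
        "HGNC_ProteinAtlas_normal_tissue.csv",
        "OMIM_morbidmap.txt",
        "generifs_basic",
        "HGNC_ProteinAtlas_subcellular_location.csv",
        "refSeqSummaryfilt_noalmohad.txt",
        "HGNC_entrez.txt",
        "HGNC_RefSeq.txt",
        "uniprot-all.tab",
        "HGNC_OMIM.txt",
        "HGNC_synonyms.txt",
    ],
    "gene_families_&_pathways": [
        "Functional_Families_con_genes_hereditarios.txt",
        "Gene_Families_con_genes_hereditarios.txt",
        "PATHWAYS_db.txt",
    ],
    "genes_candidatos": [
        "genes_candidatos_23.09.2016.txt",
        "Cancer_predisposing_genes_Nature_2014.txt",
    ],
    "LD_regions": ["LIST_Regions.txt"],
    "pLI": ["fordist_cleaned_exac_r03_march16_z_pli_rec_null_data.txt"],
}


def missing_annotation_entries(annotation_dir):
    """Return the 'folder/name' entries that do not exist under annotation_dir."""
    root = Path(annotation_dir)
    return [
        f"{folder}/{name}"
        for folder, names in EXPECTED_ANNOTATION.items()
        for name in names
        if not (root / folder / name).exists()
    ]
```

An empty list means every expected entry was found; otherwise the returned paths show what still needs to be added.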

One file per sample

For this input structure, the HERNANDEZ project is the most recent workflow. It requires two files to run the pipeline:

- Directories file (directories.txt): tab-delimited text file containing the absolute paths of the directories used to run the pipeline: the input directory, the output/results directory, the workflow directory (the home directory by default) and the annotation directory. It must contain 4 columns with the following mandatory headers: Input_dir, Output_dir, Workflow_dir and Annotation_dir.
- Files to analyze file (file_to_analyze.txt): tab-delimited text file listing the names of the files and samples to analyze. It must contain 2 columns with the following mandatory headers: FILE and NAME.
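The two configuration files can be checked before launching the workflow. The sketch below is an illustration only and is not taken from the pipeline: the function name is hypothetical, and it assumes that the column order does not matter as long as exactly the mandatory headers are present.

```python
import csv

# Mandatory headers as described above.
DIRECTORIES_HEADERS = ["Input_dir", "Output_dir", "Workflow_dir", "Annotation_dir"]
FILES_HEADERS = ["FILE", "NAME"]


def read_config_table(path, expected_headers):
    """Read a tab-delimited file and verify that it has the expected headers.

    Returns the data rows as a list of dicts keyed by header.
    Raises ValueError if the header line does not match.
    """
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        found = reader.fieldnames or []
        # Assumption: any column order is accepted.
        if sorted(found) != sorted(expected_headers):
            raise ValueError(
                f"{path}: expected headers {expected_headers}, found {found}"
            )
        return list(reader)
```

For example, `read_config_table("directories.txt", DIRECTORIES_HEADERS)` would return one dict per row of paths, and `read_config_table("file_to_analyze.txt", FILES_HEADERS)` one dict per file and sample pair.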
