@sacdallago Hi, thank you very much for the great data curation. I am planning to use the AAV dataset for my research.
I found that some deletion masks may not have been properly applied to the wild-type sequence: as the image below shows, there are 29 sequences that have different mutation_mask values but the same full_aa_sequence as the wild type. Is this the intended result?
Below is the code to reproduce this:
import pandas as pd
from Bio import SeqIO
# Load the wild-type sequence from its FASTA file.
wt_seq = str(next(SeqIO.parse("P03135.fasta", "fasta")).seq)
# Load the variant table.
variant_effects = pd.read_csv("full_data.csv")
# Select the rows whose full sequence is identical to the wild type.
wild_types = variant_effects.loc[variant_effects["full_aa_sequence"] == wt_seq]
wild_types
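To make the report easier to triage, the matching rows could also be grouped by mask, so it is clear which masks end up with an unchanged sequence and how often. This is only a minimal sketch using the standard library: the helper name `masks_matching_wild_type` is hypothetical, the toy rows and their mask strings are made up for illustration and are not taken from full_data.csv, and only the column names `mutation_mask` and `full_aa_sequence` come from the report above.

```python
from collections import Counter


def masks_matching_wild_type(rows, wt_seq):
    """Count each mutation_mask among rows whose sequence equals wt_seq.

    `rows` is any iterable of dict-like records that have the keys
    "mutation_mask" and "full_aa_sequence" (hypothetical helper).
    """
    return Counter(
        row["mutation_mask"]
        for row in rows
        if row["full_aa_sequence"] == wt_seq
    )


# Toy data, invented for illustration only; the mask format is an assumption.
toy_wt = "MAAD"
toy_rows = [
    {"mutation_mask": "____", "full_aa_sequence": "MAAD"},
    {"mutation_mask": "_*__", "full_aa_sequence": "MAAD"},
    {"mutation_mask": "_*__", "full_aa_sequence": "MAAD"},
    {"mutation_mask": "_K__", "full_aa_sequence": "MKAD"},
]

mask_counts = masks_matching_wild_type(toy_rows, toy_wt)
for mask, count in mask_counts.most_common():
    print(mask, count)
```

With the real data, the same helper could presumably be fed `variant_effects.to_dict("records")` together with `wt_seq` from the snippet above; any mask that describes a change yet appears in the result would be a candidate for the problem described here.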