csq module
CSQ
- class CSQ(dict=None, **kwargs)
Bases: collections.UserDict
Consequence of a variant. Each CSQ field can be accessed like a dict entry.
This class holds the annotation records of a Variant object. The list of CSQ objects, one per feature, is stored at Variant.parsed_csq.
Examples
>>> csq = variant.parsed_csq[0]; csq
CSQ(SYMBOL='FANCM', HGVSc='ENST00000267430.5:c.5101N>T', Consequence='stop_gained', …)
>>> list(csq.keys())[:5]
['Allele', 'Consequence', 'IMPACT', 'SYMBOL', 'Gene']
>>> list(csq.values())[:5]
['T', 'stop_gained', 'HIGH', 'FANCM', 'ENSG00000187790']
>>> csq['HGVSc']
'ENST00000267430.5:c.5101N>T'
- data
- REQUIRED_FIELDS: Set[str] = {'Allele', 'Amino_acids', 'BIOTYPE', 'CDS_position', 'Codons', 'Consequence', 'Existing_variation', 'Feature', 'Feature_type', 'Gene', 'HGVSc', 'HGVSp', 'Protein_position', 'STRAND', 'SYMBOL', 'cDNA_position'}
Required CSQ fields. A ValueError is raised if any of these fields is missing when a new CSQ object is created.
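The required-field check described above can be sketched as follows. This is a minimal illustration, not the library's implementation: CSQSketch and its initial parameter are hypothetical names, and only the REQUIRED_FIELDS set is taken from the listing above.

```python
from collections import UserDict

# Copied from the REQUIRED_FIELDS listing in this section.
REQUIRED_FIELDS = {
    'Allele', 'Amino_acids', 'BIOTYPE', 'CDS_position', 'Codons',
    'Consequence', 'Existing_variation', 'Feature', 'Feature_type',
    'Gene', 'HGVSc', 'HGVSp', 'Protein_position', 'STRAND', 'SYMBOL',
    'cDNA_position',
}


class CSQSketch(UserDict):
    """Hypothetical stand-in for CSQ that validates required fields."""

    def __init__(self, initial=None, **kwargs):
        super().__init__(initial, **kwargs)
        # Any required field absent from the stored data is an error.
        missing = REQUIRED_FIELDS - set(self.data)
        if missing:
            raise ValueError(
                f"Missing required CSQ fields: {sorted(missing)}"
            )


# Usage: a record carrying every required field is accepted, and
# optional fields such as IMPACT may be added freely.
record = {field: '' for field in REQUIRED_FIELDS}
record.update(SYMBOL='FANCM', Consequence='stop_gained', IMPACT='HIGH')
csq = CSQSketch(record)
```

Validating in the constructor means an incomplete record fails at creation rather than later, when a missing field is first read.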
- rank_consequence_type()
Rank the severity of the consequence type (CSQ column Consequence).
A more severe consequence type has a smaller rank (the smallest being 0). Ranking is based on the order in ALL_CONSEQUENCE_TYPES. When the CSQ has multiple consequence types separated by &, return the smallest rank of all the types. When the consequence type is not known, return the biggest possible rank + 1.
- Return type
- is_truncation_type()
Whether the consequence type is a truncation type.
See ALL_TRUNCATION_TYPES for the full list of consequence types.
- Return type
- is_inframe_type()
Whether the consequence type is an inframe type.
See ALL_INFRAME_TYPES for the full list of consequence types.
- Return type
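Both type checks amount to a membership test against a fixed collection of consequence types. The sketch below shows one plausible shape for such a test. Everything in it is an assumption: the two example sets are illustrative subsets and not the library's ALL_TRUNCATION_TYPES or ALL_INFRAME_TYPES (which are not listed in this section), has_type_in is a hypothetical helper, and matching any one of the &-separated types is a guess at how combined consequences might be handled.

```python
from typing import Set

# Assumption: illustrative subsets only, not the library's constants.
TRUNCATION_TYPES_EXAMPLE = {'stop_gained', 'frameshift_variant'}
INFRAME_TYPES_EXAMPLE = {'inframe_insertion', 'inframe_deletion'}


def has_type_in(consequence: str, type_set: Set[str]) -> bool:
    """True if any '&'-separated consequence type is in type_set."""
    return any(name in type_set for name in consequence.split('&'))


print(has_type_in('stop_gained', TRUNCATION_TYPES_EXAMPLE))  # True
print(has_type_in('inframe_deletion&splice_region_variant',
                  INFRAME_TYPES_EXAMPLE))  # True
print(has_type_in('missense_variant', INFRAME_TYPES_EXAMPLE))  # False
```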
Constants and helpers
- ALL_CONSEQUENCE_TYPES: List[str] = ['transcript_ablation', 'splice_acceptor_variant', 'splice_donor_variant', 'stop_gained', 'frameshift_variant', 'stop_lost', 'start_lost', 'transcript_amplification', 'inframe_insertion', 'inframe_deletion', 'missense_variant', 'protein_altering_variant', 'splice_region_variant', 'incomplete_terminal_codon_variant', 'start_retained_variant', 'stop_retained_variant', 'synonymous_variant', 'coding_sequence_variant', 'mature_miRNA_variant', '5_prime_UTR_variant', '3_prime_UTR_variant', 'non_coding_transcript_exon_variant', 'intron_variant', 'NMD_transcript_variant', 'non_coding_transcript_variant', 'upstream_gene_variant', 'downstream_gene_variant', 'TFBS_ablation', 'TFBS_amplification', 'TF_binding_site_variant', 'regulatory_region_ablation', 'regulatory_region_amplification', 'feature_elongation', 'regulatory_region_variant', 'feature_truncation', 'intergenic_variant']
All possible consequence types, fetched from Ensembl v99 (January 2020).
The consequence types are ordered by severity, most severe first.