variant module
Variant
- class Variant(chrom, start_pos, end_pos, ref_allele, alt_allele, id=None, filter=None, info=NOTHING, parsed_csq=None)
Bases: object
Biallelic variant.
For normal usage, consider using read_and_parse_vcf() to construct the objects from a VEP-annotated VCF.
Examples
>>> v = Variant('13', 32340300, 32340301, 'GT', 'G', id='rs80359550')
>>> v
Variant(13:32340300GT>G info: )
>>> v.is_snp()
False
>>> v.is_sv()
False
>>> v.is_indel()
True
>>> v.is_deletion()
True
Annotate it with online VEP, then read the annotated VCF:
>>> v = next(Variant.read_and_parse_vcf('rs80359550.vcf'))
>>> v
Variant(13:32340300GT>G info: CSQ[4 parsed])
- Parameters
- Return type
- chrom
Chromosome.
- start_pos
Start position (1-based closed). Same as POS in the VCF record.
- end_pos
End position (1-based closed).
- ref_allele
Reference allele sequence.
- alt_allele
Alternative allele sequence (currently only one alternative allele is allowed).
- id
ID in the VCF record. None when the original value is ".".
- filter
FILTER in the VCF record. None when the original value is PASS.
- info
INFO in the VCF record.
- parsed_csq
All parsed CSQ annotations of the variant as a list of CSQ objects. Use read_and_parse_vcf() to automatically parse CSQ while reading an annotated VCF.
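The 1-based closed coordinates and the variant type checks can be illustrated with a small standalone sketch. The helper names `end_position` and `classify` are hypothetical and not part of the Variant class, and the length-based rule is only an assumption about how such checks could work; it does not handle symbolic alleles such as structural variants.

```python
def end_position(start_pos: int, ref_allele: str) -> int:
    """Return the 1-based closed end position spanned by the reference allele."""
    return start_pos + len(ref_allele) - 1


def classify(ref_allele: str, alt_allele: str) -> str:
    """Classify a biallelic variant by allele length (illustrative rule only)."""
    if len(ref_allele) == 1 and len(alt_allele) == 1:
        return "snp"
    if len(ref_allele) > len(alt_allele):
        return "deletion"
    if len(ref_allele) < len(alt_allele):
        return "insertion"
    return "other"  # same length but longer than one base


# The rs80359550 example from this page: GT>G starting at 13:32340300.
print(end_position(32340300, "GT"))  # 32340301
print(classify("GT", "G"))           # deletion
```

With closed coordinates, a single-base variant has start_pos equal to end_pos.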
- get_most_severe_csq()
Get the most severe CSQ based on the consequence type.
If multiple CSQs have the same consequence type, the canonical CSQ determined by VEP will be selected.
- Return type
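The selection rule described for get_most_severe_csq() can be sketched as follows. This is not the library's implementation: CSQ annotations are modeled as plain dicts, the helper `most_severe` is hypothetical, the severity list is an abbreviated excerpt of the consequence ordering in the Ensembl VEP documentation, and the canonical transcript is assumed to be flagged with CANONICAL set to "YES".

```python
# Abbreviated severity ranking, most severe first (truncated for illustration).
SEVERITY_ORDER = [
    "transcript_ablation",
    "splice_acceptor_variant",
    "splice_donor_variant",
    "stop_gained",
    "frameshift_variant",
    "stop_lost",
    "start_lost",
    "missense_variant",
    "synonymous_variant",
    "intron_variant",
]
RANK = {name: i for i, name in enumerate(SEVERITY_ORDER)}


def most_severe(csqs):
    """Pick the CSQ with the most severe consequence; prefer canonical on ties."""
    def sort_key(csq):
        # A Consequence value may join several terms with '&'; use the worst one.
        terms = csq["Consequence"].split("&")
        rank = min(RANK.get(term, len(SEVERITY_ORDER)) for term in terms)
        is_canonical = csq.get("CANONICAL") == "YES"
        # False sorts before True, so canonical entries win a tie on rank.
        return (rank, not is_canonical)

    return min(csqs, key=sort_key) if csqs else None


# Made-up annotations for illustration.
csqs = [
    {"Feature": "tx1", "Consequence": "missense_variant", "CANONICAL": "YES"},
    {"Feature": "tx2", "Consequence": "stop_gained", "CANONICAL": ""},
    {"Feature": "tx3", "Consequence": "stop_gained", "CANONICAL": "YES"},
]
print(most_severe(csqs)["Feature"])  # tx3
```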
- classmethod from_cyvcf2(variant)
Create one Variant object based on the given cyvcf2.Variant VCF record.
- Parameters
variant (cyvcf2.cyvcf2.Variant) –
- Return type
charger.variant.V
- classmethod get_vep_csq_fields(vcf_raw_headers)
Extract the CSQ fields that VEP output in the given VCF.
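VEP records the CSQ field names in the Description of the CSQ INFO header line, after the text "Format:". A minimal sketch of pulling them out is shown below, assuming the raw headers arrive as one string; the function name `csq_fields_from_header` and the sample header are illustrative and not taken from the library.

```python
import re


def csq_fields_from_header(raw_header: str) -> list:
    """Return the pipe-separated CSQ field names declared in a VCF header."""
    for line in raw_header.splitlines():
        if line.startswith("##INFO=<ID=CSQ,"):
            # The field list follows 'Format: ' and ends at the closing quote.
            match = re.search(r'Format: ([^"]+)"', line)
            if match:
                return match.group(1).strip().split("|")
    raise ValueError("no VEP CSQ INFO header found")


# Shortened example header; a real VEP header lists many more fields.
header = (
    "##fileformat=VCFv4.2\n"
    '##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence annotations '
    'from Ensembl VEP. Format: Allele|Consequence|SYMBOL|HGVSc">\n'
)
print(csq_fields_from_header(header))
# ['Allele', 'Consequence', 'SYMBOL', 'HGVSc']
```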
- classmethod read_vcf(path)
Read VCF records from path.
This function walks through each variant record in the given VCF using cyvcf2.VCF, and yields the record as a Variant object.
See also
read_and_parse_vcf() to read and parse the VCF.
- Parameters
path (pathlib.Path) – Path to the VCF.
- Returns
A generator that walks through all variant records.
- Return type
- classmethod read_and_parse_vcf(path)
Read and parse VCF records with their VEP-annotated CSQ from path.
This function walks through each variant record in the given VCF using cyvcf2.VCF, and yields the record as a Variant object. The parsed CSQ will be stored in the generated Variant.parsed_csq.
- Parameters
path (pathlib.Path) – Path to the VCF.
- Returns
A generator that walks through all variant records.
- Return type
Examples
Read an annotated VCF:
>>> vcf_reader = Variant.read_and_parse_vcf('my.vcf')
>>> variant = next(vcf_reader)
>>> variant
Variant(14:45658326C>T info: CSQ[5 parsed])
>>> variant.parsed_csq[0]
CSQ(SYMBOL='FANCM', HGVSc='ENST00000267430.5:c.5101N>T', Consequence='stop_gained', …)
Iterate over all the VCF variant records:
>>> for variant in vcf_reader:
...     print(variant.chrom, variant.parsed_csq[0]['Allele'])
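To show what parsing a CSQ annotation involves, here is a rough sketch that splits a raw CSQ INFO value into one dict per annotation. It assumes the standard VEP layout (annotations separated by commas, fields by pipes); the function `parse_csq`, the short field list, and the sample value are made up for illustration and do not reflect how the library builds its CSQ objects.

```python
def parse_csq(csq_value: str, fields: list) -> list:
    """Split a raw CSQ INFO value into one {field: value} dict per annotation."""
    records = []
    for entry in csq_value.split(","):
        values = entry.split("|")
        records.append(dict(zip(fields, values)))
    return records


# Hypothetical field list and CSQ value, much shorter than real VEP output.
fields = ["Allele", "Consequence", "SYMBOL"]
value = "T|stop_gained|FANCM,T|intron_variant|FANCM"

parsed = parse_csq(value, fields)
print(len(parsed))               # 2
print(parsed[0]["Consequence"])  # stop_gained
```

Empty fields stay as empty strings in this sketch, so callers still need to decide how to treat missing values.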
GeneInheritanceMode
- class GeneInheritanceMode(value)
Bases: enum.Flag
All possible modes of the gene inheritance dominance.
Used by CharGerConfig.inheritance_gene_table.
- AUTO_DOMINANT = 1
The gene is autosomal dominant.
- AUTO_RECESSIVE = 2
The gene is autosomal recessive.
- X_LINKED_DOMINANT = 4
The gene is X-linked dominant.
- X_LINKED_RECESSIVE = 8
The gene is X-linked recessive.
- Y_LINKED = 16
The gene is Y-linked.
- classmethod parse(value)
Parse the inheritance modes from the given string. Multiple modes are comma separated.
>>> m = GeneInheritanceMode.parse("autosomal dominant, autosomal recessive")
>>> m
<GeneInheritanceMode.AUTO_RECESSIVE|AUTO_DOMINANT: 3>
>>> bool(m & GeneInheritanceMode.AUTO_RECESSIVE)
True
>>> bool(m & GeneInheritanceMode.Y_LINKED)
False
>>> GeneInheritanceMode.parse("unknown") is None
True
- Parameters
value (str) –
- Return type
Optional[charger.variant.GeneInheritanceMode]
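The comma-separated parsing can be approximated with a standalone enum.Flag, as sketched below. The class `InheritanceMode` and the function `parse_modes` are hypothetical stand-ins, not the library's code. Only the strings "autosomal dominant" and "autosomal recessive" appear in the examples above; the other accepted spellings, and the choice to skip unrecognized tokens, are assumptions.

```python
import enum
from typing import Optional


class InheritanceMode(enum.Flag):
    AUTO_DOMINANT = 1
    AUTO_RECESSIVE = 2
    X_LINKED_DOMINANT = 4
    X_LINKED_RECESSIVE = 8
    Y_LINKED = 16


# Assumed spellings; only the two autosomal strings come from the documentation.
_NAMES = {
    "autosomal dominant": InheritanceMode.AUTO_DOMINANT,
    "autosomal recessive": InheritanceMode.AUTO_RECESSIVE,
    "x-linked dominant": InheritanceMode.X_LINKED_DOMINANT,
    "x-linked recessive": InheritanceMode.X_LINKED_RECESSIVE,
    "y-linked": InheritanceMode.Y_LINKED,
}


def parse_modes(value: str) -> Optional[InheritanceMode]:
    """Combine all recognized modes with bitwise OR; return None if none match."""
    combined = None
    for token in value.split(","):
        mode = _NAMES.get(token.strip().lower())
        if mode is None:
            continue  # assumption: unrecognized tokens are skipped
        combined = mode if combined is None else combined | mode
    return combined


m = parse_modes("autosomal dominant, autosomal recessive")
print(m.value)                                  # 3
print(bool(m & InheritanceMode.AUTO_RECESSIVE)) # True
print(parse_modes("unknown") is None)           # True
```

Because the members are powers of two, any combination of modes maps to a unique integer, and membership is a single bitwise AND.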
ClinicalSignificance
- class ClinicalSignificance(value)
Bases: enum.Enum
All possible clinical significance types of a variant.
- PATHOGENIC = 'Pathogenic'
- LIKELY_PATHOGENIC = 'Likely Pathogenic'
- LIKELY_BENIGN = 'Likely Benign'
- BENIGN = 'Benign'
- UNCERTAIN = 'Uncertain Significance'