Biomina/MedGen VariantDB: Annotation and filtering of variants detected using next-generation sequencing (tutorial)

Documentation : Filtering Criteria Explained

Dynamic_Filters
De Novo	This filter automatically searches for De Novo variants in the selected sample. It retains only variants absent from parental samples (or present as homozygous reference). If a single parent is available, a warning is issued. If no parents are available, the query is not executed. The optional quality check applies the quality metrics defined for the selected sample to the parental samples as well. Note that this might increase the number of putative De Novo variants in case of low-quality parental data! Activating the option to exclude non_covered variants will only report a variant if the position is confidently called as reference in the parental VCF. Note that this triggers exclusion of positions of low parental quality (applying the criteria). If GVCF data is available, confident regions (GQ>20) are also counted as covered positions.
Dominant	This filter automatically searches for Dominant variants in the selected sample. It retains only variants present in all affected family members. Increase the unaffected-carrier value to simulate reduced penetrance. If no family is available, the query is not executed. The optional quality check applies the same quality metrics defined for the selected sample, to the family samples. Note, this might affect the number of variants in case of low quality family data! Excluding homozygous variants limits the candidate variants to heterozygous cosegregation.
Recessive (general)	This filter automatically searches for recessive variants in the selected sample. It retains only variants present as homozygous in all affected family members. Parental samples are not checked for heterozygous presence of the variants. Increase the unaffected-carrier value to allow homozygous unaffected carriers, simulating reduced penetrance. If no family is available, this query reduces to retrieving all homozygous variants. The optional quality check applies the same quality metrics defined for the selected sample, to the family samples. Note, this might affect the number of variants in case of low quality family data!
Recessive (biparental)	This filter automatically searches for recessive variants in the selected sample. It retains only variants present as homozygous in all affected family members AND heterozygous in both parents of each affected case. Increase the unaffected-carrier value to allow homozygous unaffected carriers, simulating reduced penetrance. If no family is available, or any affected sample does not have two parents associated, this query will not execute. The optional quality check applies the same quality metrics defined for the selected sample, to the family samples. Note, this might affect the number of variants in case of low quality family data!
Recessive (Compound Heterozygous)	This filter automatically searches for compound heterozygous variants in the set of remaining variants. It retains variant combinations in the same gene that are biparentally inherited. If parents are not available, the query is not executed. The optional quality check applies the same quality metrics defined for the selected sample, to the parental samples. Note, this might impact the number of putative variants in case of low quality parental data! Activating the option to exclude non_covered variants will only report a variant if the position is called as heterozygous in the parental VCF. By default non-called positions in the parents are considered as possibly heterozygous.
Pathogenic Prediction	Select variants predicted to be damaging by a minimum number of tools, out of a user-defined list. The cutoffs for a pathogenic prediction are: LJB-LRT: 'D'; LJB-MT: 0.95; LJB-PhyloP: 0.95; LJB-PP2: 0.95; LJB-SIFT: 0.95; CADD-Phred: 20; Web-Sift: 0.95; Provean: 2.5 (absolute value)
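
The tool-counting logic of the Pathogenic Prediction filter can be sketched roughly as follows. This is only an illustration, not VariantDB's actual code: the cutoffs are the ones listed above, but the comparison direction (a score at or above the cutoff counts as damaging), the handling of missing scores (not damaging), and the names `is_damaging`, `count_damaging` and `passes` are assumptions.

```python
# Cutoffs as listed in the Pathogenic Prediction entry.
# ASSUMPTION: a score >= cutoff counts as a "damaging" prediction.
CUTOFFS = {
    "LJB-MT": 0.95,
    "LJB-PhyloP": 0.95,
    "LJB-PP2": 0.95,
    "LJB-SIFT": 0.95,
    "CADD-Phred": 20.0,
    "Web-Sift": 0.95,
}


def is_damaging(tool, value):
    """Return True if a single tool predicts the variant as damaging."""
    if value is None:
        # ASSUMPTION: a missing score is never counted as damaging.
        return False
    if tool == "LJB-LRT":
        return value == "D"          # categorical prediction
    if tool == "Provean":
        return abs(value) >= 2.5     # cutoff applies to the absolute value
    return value >= CUTOFFS[tool]


def count_damaging(scores, selected_tools):
    """Count how many of the user-selected tools call the variant damaging."""
    return sum(is_damaging(tool, scores.get(tool)) for tool in selected_tools)


def passes(scores, selected_tools, min_tools):
    """Keep the variant if at least min_tools selected tools agree."""
    return count_damaging(scores, selected_tools) >= min_tools


# Hypothetical annotation values for one variant.
example = {"LJB-LRT": "D", "CADD-Phred": 25.3, "Provean": -3.1, "LJB-SIFT": 0.4}
tools = ["LJB-LRT", "CADD-Phred", "Provean", "LJB-SIFT"]
print(count_damaging(example, tools), passes(example, tools, 3))
```

Raising `min_tools` makes the filter stricter, while adding tools to the selection gives each variant more chances to reach the threshold.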

Family
In Parents	If parental samples are available, select one or more samples that should (not) contain variants seen in the current sample. Genotype can be specified to only consider heterozygous or homozygous variants. Note that selecting multiple samples in a single rule acts as an "OR" statement, while setting multiple rules, each containing one parent, acts as "AND" filtering.
In Siblings	If sibling samples are available, select one or more samples that should (not) contain variants seen in the current sample. Genotype can be specified to only consider heterozygous or homozygous variants. Note that selecting multiple samples in a single rule acts as an "OR" statement, while setting multiple rules, each containing one sample, acts as "AND" filtering.
In Siblings beta	If sibling samples are available, select one or more samples that should (not) contain variants seen in the current sample. Genotype can be specified to only consider heterozygous or homozygous variants. Note that selecting multiple samples in a single rule acts as an "OR" statement, while setting multiple rules, each containing one sample, acts as "AND" filtering.
In Children	If offspring samples are available, select one or more samples that should (not) contain variants seen in the current sample. Genotype can be specified to only consider heterozygous or homozygous variants. Note that selecting multiple samples in a single rule acts as an "OR" statement, while setting multiple rules, each containing one child, acts as "AND" filtering.
In Replica	If biological replicates are available, for example paired tumor/normal samples, select one or more samples that should (not) contain variants seen in the current sample. Genotype can be specified to only consider heterozygous or homozygous variants. Note that selecting multiple samples in a single rule acts as an "OR" statement, while setting multiple rules, each containing one replicate, acts as "AND" filtering.
In Custom A	Custom groups allow you to assign related (e.g. by phenotype), but non-familial, samples to an individual. Genotype can be specified to only consider heterozygous or homozygous variants. Note that selecting multiple samples in a single rule acts as an "OR" statement, while setting multiple rules, each containing one sample, acts as "AND" filtering.
In Custom B	Custom groups allow you to assign related (e.g. by phenotype), but non-familial, samples to an individual. Genotype can be specified to only consider heterozygous or homozygous variants. Note that selecting multiple samples in a single rule acts as an "OR" statement, while setting multiple rules, each containing one sample, acts as "AND" filtering.
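
The "OR within a rule, AND between rules" behaviour described for the Family filters can be illustrated with a small sketch. The data layout (a dictionary of per-sample genotype calls) and the names `rule_matches` and `family_filter` are hypothetical, and only the "should contain" direction is shown.

```python
def rule_matches(variant, rule, genotypes):
    """One rule: True if ANY of the rule's samples carries the variant ("OR")."""
    def carries(sample):
        call = genotypes.get(sample, {}).get(variant)  # "het", "hom" or None
        if call is None:
            return False
        return rule["genotype"] in ("any", call)

    return any(carries(sample) for sample in rule["samples"])


def family_filter(variants, rules, genotypes):
    """Several rules: a variant is kept only if ALL rules match ("AND")."""
    return [v for v in variants
            if all(rule_matches(v, rule, genotypes) for rule in rules)]


# Hypothetical parental calls.
genotypes = {
    "father": {"v1": "het", "v2": "hom"},
    "mother": {"v1": "het", "v3": "het"},
}
variants = ["v1", "v2", "v3", "v4"]

# One rule listing both parents: present in father OR mother.
either = [{"samples": ["father", "mother"], "genotype": "any"}]
# Two rules with one parent each: present in father AND mother.
both = [{"samples": ["father"], "genotype": "any"},
        {"samples": ["mother"], "genotype": "any"}]

print(family_filter(variants, either, genotypes))
print(family_filter(variants, both, genotypes))
```

With these example calls, the single two-parent rule keeps every variant seen in at least one parent, whereas the two single-parent rules keep only the variant shared by both.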

Occurrence
In All Control Samples	Select for variants that (do not) occur in any of the samples labeled as controls. Genotype specific filtering is available.
Abs.Occ Control Samples	Provide an absolute value ( >0 !! ) for the number of times a variant was seen in your control samples. Occurrence does not take quality filters into account for the control samples.
Rel.Occ Control Samples	Provide a relative value ( < 1 ) for the number of times a variant was seen in your control samples. Occurrence does not take quality filters into account for the control samples.
Abs.Occ All Samples	Provide an absolute value ( >0 !! ) for the number of times a variant was seen in your samples. Occurrence does not take quality filters into account for the cohort samples.
Rel.Occ All Samples	Provide a relative value ( < 1 ) for the number of times a variant was seen in your samples. Occurrence does not take quality filters into account for the cohort samples.
Abs.Occ Female Samples	Provide an absolute value ( >0 !! ) for the number of times a variant was seen in your female samples. Occurrence does not take quality filters into account for the female samples.
Rel.Occ Female Samples	Provide a relative value ( < 1 ) for the number of times a variant was seen in your female samples. Occurrence does not take quality filters into account for the female samples.
Abs.Occ Male Samples	Provide an absolute value ( >0 !! ) for the number of times a variant was seen in your male samples. Occurrence does not take quality filters into account for the male samples.
Rel.Occ Male Samples	Provide a relative value ( < 1 ) for the number of times a variant was seen in your male samples. Occurrence does not take quality filters into account for the male samples.
In Selected Control Samples	DEPRECATED Select for variants that (do not) occur in a selection of the samples labeled as controls. Genotype specific filtering is available.
Abs.Occ. Control Samples (Any Genotype)	Provide an absolute value ( >0 !! ) for the number of times a variant was seen in your control samples. Occurrence does not take quality filters into account for the control samples.
Abs.Occ. Control Samples (Heterozygous)	Provide an absolute value ( >0 !! ) for the number of times a variant was seen in your control samples as a heterozygous call. Occurrence does not take quality filters into account for the control samples.
Abs.Occ. Control Samples (Homozygous Alt.)	Provide an absolute value ( >0 !! ) for the number of times a variant was seen in your control samples as a homozygous call. Occurrence does not take quality filters into account for the control samples.
Rel.Occ. Control Samples (Any Genotype)	Provide the fraction (in the range 0-1) of your control samples in which the variant was seen. Occurrence does not take quality filters into account for the control samples.
Rel.Occ. Control Samples (Heterozygous)	Provide the fraction (in the range 0-1) of your control samples in which the variant was seen as a heterozygous call. Occurrence does not take quality filters into account for the control samples.
Rel.Occ. Control Samples (Homozygous Alt.)	Provide the fraction (in the range 0-1) of your control samples in which the variant was seen as a homozygous call. Occurrence does not take quality filters into account.
Abs.Occ. All Samples (Any Genotype)	Provide an absolute value ( >0 !! ) for the number of times a variant was seen in your samples. Occurrence does not take quality filters into account.
Abs.Occ. All Samples (Heterozygous)	Provide an absolute value ( >0 !! ) for the number of times a variant was seen in your samples as a heterozygous call. Occurrence does not take quality filters into account.
Abs.Occ. All Samples (Homozygous Alt.)	Provide an absolute value ( >0 !! ) for the number of times a variant was seen in your samples as a homozygous call. Occurrence does not take quality filters into account.
Rel.Occ. All Samples (Any Genotype)	Provide the fraction (in the range 0-1) of your samples in which the variant was seen. Occurrence does not take quality filters into account.
Rel.Occ. All Samples (Heterozygous)	Provide the fraction (in the range 0-1) of your samples in which the variant was seen as a heterozygous call. Occurrence does not take quality filters into account.
Rel.Occ. All Samples (Homozygous Alt.)	Provide the fraction (in the range 0-1) of your samples in which the variant was seen as a homozygous call. Occurrence does not take quality filters into account.
Abs.Occ. By Project (Any Genotype.All Samples)	Provide an absolute value ( >0 !! ) for the number of times a variant was seen in certain projects. Occurrence does not take quality filters into account.
Abs.Occ. By Project (Any Alternate.All Samples)	Provide an absolute value ( >0 !! ) for the number of times a variant was seen in certain projects. Occurrence does not take quality filters into account.
Abs.Occ. By Project (Any Genotype)	DEPRECATED Provide an absolute value ( >0 !! ) for the number of times a variant was seen in certain projects. Occurrence does not take quality filters into account.
Abs.Occ. By Project (Homozygous Ref)	Provide an absolute value ( >0 !! ) for the number of times a variant was seen in certain projects as a homozygous reference call. This filter takes GVCF entries with GQ > 20 into account as reference calls. Occurrence does not take quality filters into account.
Abs.Occ. By Project (Heterozygous)	Provide an absolute value ( >0 !! ) for the number of times a variant was seen in certain projects as a heterozygous call. Occurrence does not take quality filters into account.
Abs.Occ. By Project (Homozygous Alt.)	Provide an absolute value ( >0 !! ) for the number of times a variant was seen in certain projects as a homozygous alternative call. Occurrence does not take quality filters into account.
Rel.Occ. By Project (Any Genotype)	Provide the fraction (in the range 0-1) of samples in a selection of projects in which the variant was seen. Occurrence does not take quality filters into account.
Rel.Occ. By Project (Heterozygous)	Provide the fraction (in the range 0-1) of samples in a selection of projects in which the variant was seen as a heterozygous call. Occurrence does not take quality filters into account.
Rel.Occ. By Project (Homozygous Alt.)	Provide the fraction (in the range 0-1) of samples in a selection of projects in which the variant was seen as a homozygous call. Occurrence does not take quality filters into account.
Gene Hit In Other Cases (RefSeq VariantType)	Filter for variants affecting (RefSeq) genes, that are also affected by variants of the selected types in other cases you have access to. Other parameters such as quality are not taken into account for additional samples. Provide the minimal number of ADDITIONAL hits in the gene in the text field. Control samples are excluded as additional hits
Gene Hit In Other Cases (RefGene VariantType)	Filter for variants affecting (RefGene) genes, that are also affected by variants of the selected types in other cases you have access to. Other parameters such as quality are not taken into account for additional samples. Provide the minimal number of ADDITIONAL hits in the gene in the text field. Control samples are excluded as additional hits
Gene Hit In Other Cases (SnpEff Impact)	Filter for variants affecting (Ensembl) genes, that are also affected by variants of the selected SnpEff Impact types in other cases you have access to. Other parameters such as quality are not taken into account for additional samples. Provide the minimal number of ADDITIONAL hits in the gene in the text field. Control samples are excluded as additional hits
In dbSNP v130	DEPRECATED Include/Exclude all variants present in dbSNP v130
In dbSNP v135	DEPRECATED Include/Exclude all variants present in dbSNP v135
dbSNP v135 MAF	DEPRECATED Include/Exclude variants present in dbSNP v135 based on a minor allele frequency. Might be ambiguous for multi-allelic SNPs!
In dbSNP v137	DEPRECATED Include/Exclude all variants present in dbSNP v137
dbSNP v137 MAF	DEPRECATED Include/Exclude variants present in dbSNP v137 based on a minor allele frequency. Might be ambiguous for multi-allelic SNPs!
In dbSNP v138	Include/Exclude all variants present in dbSNP v138
dbSNP v138 MAF	Include/Exclude variants present in dbSNP v138 based on a minor allele frequency. Might be ambiguous for multi-allelic SNPs!
In dbSNP v142	Include/Exclude all variants present in dbSNP v142
dbSNP v142 MAF	Include/Exclude variants present in dbSNP v142 based on a minor allele frequency. Might be ambiguous for multi-allelic SNPs!
dbSNP v142 nrChr	Include/Exclude variants present in dbSNP v142 based on a population size (number of chromosomes). Note: Variants not listed in dbSNP have a chromosome count of zero.
In gnomAD g 2.1	Include/Exclude all variants present in the gnomAD v2.1 genome samples.
gnomAD g 2.1 AF (ALL)	Include/Exclude variants present in the gnomAD v2.1 genome samples based on an allele frequency. Might be ambiguous for multi-allelic SNPs!
gnomAD g 2.1 AN (ALL)	Include/Exclude variants present in the gnomAD v2.1 genome samples based on amount of covered alleles.
gnomAD g 2.1 nHomAlt (ALL)	Include/Exclude variants present in the gnomAD v2.1 genome samples based on amount of homozygous samples.
gnomAD g 2.1 AF (NFE)	Include/Exclude variants present in the gnomAD v2.1 genome NFE samples based on an allele frequency. Might be ambiguous for multi-allelic SNPs!
gnomAD g 2.1 AN (NFE)	Include/Exclude variants present in the gnomAD v2.1 genome NFE samples based on amount of covered alleles.
gnomAD g 2.1 nHomAlt (NFE)	Include/Exclude variants present in the gnomAD v2.1 genome NFE samples based on amount of homozygous samples.
gnomAD g 2.1 AF (Controls)	Include/Exclude variants present in the gnomAD v2.1 genome control samples based on an allele frequency. Might be ambiguous for multi-allelic SNPs!
gnomAD g 2.1 AN (Controls)	Include/Exclude variants present in the gnomAD v2.1 genome control samples based on amount of covered alleles.
gnomAD g 2.1 nHomAlt (Controls)	Include/Exclude variants present in the gnomAD v2.1 genome control samples based on amount of homozygous samples.
gnomAD g 2.1 AF (Female Controls)	Include/Exclude variants present in the gnomAD v2.1 genome female control samples based on an allele frequency. Might be ambiguous for multi-allelic SNPs!
gnomAD g 2.1 AN (Female Controls)	Include/Exclude variants present in the gnomAD v2.1 genome female control samples based on amount of covered alleles.
gnomAD g 2.1 nHomAlt (Female Controls)	Include/Exclude variants present in the gnomAD v2.1 genome female control samples based on amount of homozygous samples.
gnomAD g 2.1 AF (Male Controls)	Include/Exclude variants present in the gnomAD v2.1 genome male control samples based on an allele frequency. Might be ambiguous for multi-allelic SNPs!
gnomAD g 2.1 AN (Male Controls)	Include/Exclude variants present in the gnomAD v2.1 genome male control samples based on amount of covered alleles.
gnomAD g 2.1 nHomAlt (Male Controls)	Include/Exclude variants present in the gnomAD v2.1 genome male control samples based on amount of homozygous samples.
gnomAD g 2.1 AF (NFE Controls)	Include/Exclude variants present in the gnomAD v2.1 genome NFE control samples based on an allele frequency. Might be ambiguous for multi-allelic SNPs!
gnomAD g 2.1 AN (NFE Controls)	Include/Exclude variants present in the gnomAD v2.1 genome NFE control samples based on amount of covered alleles.
gnomAD g 2.1 nHomAlt (NFE Controls)	Include/Exclude variants present in the gnomAD v2.1 genome NFE control samples based on amount of homozygous samples.
gnomAD g 2.1 AF (non topmed)	Include/Exclude variants present in the gnomAD v2.1 genome non_topmed samples based on an allele frequency. Might be ambiguous for multi-allelic SNPs!
gnomAD g 2.1 AN (non topmed)	Include/Exclude variants present in the gnomAD v2.1 genome non_topmed samples based on amount of covered alleles.
gnomAD g 2.1 nHomAlt (non topmed)	Include/Exclude variants present in the gnomAD v2.1 genome non_topmed samples based on amount of homozygous samples.
gnomAD g 2.1 AF (Female non topmed)	Include/Exclude variants present in the gnomAD v2.1 genome female non_topmed samples based on an allele frequency. Might be ambiguous for multi-allelic SNPs!
gnomAD g 2.1 AN (Female non topmed)	Include/Exclude variants present in the gnomAD v2.1 genome female non_topmed samples based on amount of covered alleles.
gnomAD g 2.1 nHomAlt (Female non topmed)	Include/Exclude variants present in the gnomAD v2.1 genome female non_topmed samples based on amount of homozygous samples.
gnomAD g 2.1 AF (Male non topmed)	Include/Exclude variants present in the gnomAD v2.1 genome male non_topmed samples based on an allele frequency. Might be ambiguous for multi-allelic SNPs!
gnomAD g 2.1 AN (Male non topmed)	Include/Exclude variants present in the gnomAD v2.1 genome male non_topmed samples based on amount of covered alleles.
gnomAD g 2.1 nHomAlt (Male non topmed)	Include/Exclude variants present in the gnomAD v2.1 genome male non_topmed samples based on amount of homozygous samples.
gnomAD g 2.1 AF (NFE non topmed)	Include/Exclude variants present in the gnomAD v2.1 genome NFE non_topmed samples based on an allele frequency. Might be ambiguous for multi-allelic SNPs!
gnomAD g 2.1 AN (NFE non topmed)	Include/Exclude variants present in the gnomAD v2.1 genome NFE non_topmed samples based on amount of covered alleles.
gnomAD g 2.1 nHomAlt (NFE non topmed)	Include/Exclude variants present in the gnomAD v2.1 genome NFE non_topmed samples based on amount of homozygous samples.
gnomAD g 2.1 AF (non neuro)	Include/Exclude variants present in the gnomAD v2.1 genome non_neuro samples based on an allele frequency. Might be ambiguous for multi-allelic SNPs!
gnomAD g 2.1 AN (non neuro)	Include/Exclude variants present in the gnomAD v2.1 genome non_neuro samples based on amount of covered alleles.
gnomAD g 2.1 nHomAlt (non neuro)	Include/Exclude variants present in the gnomAD v2.1 genome non_neuro samples based on amount of homozygous samples.
gnomAD g 2.1 AF (Female non neuro)	Include/Exclude variants present in the gnomAD v2.1 genome female non_neuro samples based on an allele frequency. Might be ambiguous for multi-allelic SNPs!
gnomAD g 2.1 AN (Female non neuro)	Include/Exclude variants present in the gnomAD v2.1 genome female non_neuro samples based on amount of covered alleles.
gnomAD g 2.1 nHomAlt (Female non neuro)	Include/Exclude variants present in the gnomAD v2.1 genome female non_neuro samples based on amount of homozygous samples.
gnomAD g 2.1 AF (Male non neuro)	Include/Exclude variants present in the gnomAD v2.1 genome male non_neuro samples based on an allele frequency. Might be ambiguous for multi-allelic SNPs!
gnomAD g 2.1 AN (Male non neuro)	Include/Exclude variants present in the gnomAD v2.1 genome male non_neuro samples based on amount of covered alleles.
gnomAD g 2.1 nHomAlt (Male non neuro)	Include/Exclude variants present in the gnomAD v2.1 genome male non_neuro samples based on amount of homozygous samples.
In ESP5400 all	DEPRECATED Include/Exclude all variants present in the Exome Sequencing Project, release 5400, all populations
ESP5400 all MAF	DEPRECATED Include/Exclude variants present in ESP5400 all populations based on a minor allele frequency. Might be ambiguous for multi-allelic SNPs!
In ESP5400 ea	DEPRECATED Include/Exclude all variants present in the Exome Sequencing Project, release 5400, European Americans
ESP5400 ea MAF	DEPRECATED Include/Exclude variants present in ESP5400 ea population based on a minor allele frequency. Might be ambiguous for multi-allelic SNPs!
In ESP5400 aa	DEPRECATED Include/Exclude all variants present in the Exome Sequencing Project, release 5400, African Americans
ESP5400 aa MAF	DEPRECATED Include/Exclude variants present in ESP5400 aa population based on a minor allele frequency. Might be ambiguous for multi-allelic SNPs!
In ESP6500 all	Include/Exclude all variants present in the Exome Sequencing Project, release 6500, all populations
ESP6500 all MAF	Include/Exclude variants present in ESP6500 all populations based on a minor allele frequency. Might be ambiguous for multi-allelic SNPs!
In ESP6500 ea	Include/Exclude all variants present in the Exome Sequencing Project, release 6500, European Americans
ESP6500 ea MAF	Provide a minor allele frequency. Might be ambiguous for multi-allelic SNPs!
In ESP6500 aa	Include/Exclude all variants present in the Exome Sequencing Project, release 6500, African Americans
ESP6500 aa MAF	Provide a minor allele frequency. Might be ambiguous for multi-allelic SNPs!
In ExAC v02	DEPRECATED Include/Exclude all variants present in the ExAC database, release 02, All populations
ExAC v02 MAF	DEPRECATED Provide a minor allele frequency. Although values are similar, there are discrepancies between ANNOVAR ALL_population frequencies (used here) and the data available online. This is due to ANNOVAR using raw allele counts, versus the ExAC Browser using adjusted allele counts (DP >= 10 & GQ >= 20)
In ExAC v03 ALL	Include/Exclude all variants present in the ExAC database, release 03, All populations
ExAC v03 MAF ALL	Provide a minor allele frequency. Although values are similar, there are discrepancies between ANNOVAR ALL_population frequencies (used here) and the data available online. This is due to ANNOVAR using raw allele counts, versus the ExAC Browser using adjusted allele counts (DP >= 10 & GQ >= 20)
ExAC v03 nrChr ALL	Provide a minimal number of chromosomes included for genotyping the variant in ALL populations
In ExAC v03 AFR	Include/Exclude all variants present in the ExAC database, release 03, AFR population
ExAC v03 MAF AFR	Provide a minor allele frequency. Although values are similar, there are discrepancies between ANNOVAR AFR_population frequencies (used here) and the data available online. This is due to ANNOVAR using raw allele counts, versus the ExAC Browser using adjusted allele counts (DP >= 10 & GQ >= 20)
ExAC v03 nrChr AFR	Provide a minimal number of chromosomes included for genotyping the variant in AFR population
In ExAC v03 AMR	Include/Exclude all variants present in the ExAC database, release 03, AMR population
ExAC v03 MAF AMR	Provide a minor allele frequency. Although values are similar, there are discrepancies between ANNOVAR AMR_population frequencies (used here) and the data available online. This is due to ANNOVAR using raw allele counts, versus the ExAC Browser using adjusted allele counts (DP >= 10 & GQ >= 20)
ExAC v03 nrChr AMR	Provide a minimal number of chromosomes included for genotyping the variant in AMR population
In ExAC v03 EAS	Include/Exclude all variants present in the ExAC database, release 03, EAS population
ExAC v03 MAF EAS	Provide a minor allele frequency. Although values are similar, there are discrepancies between ANNOVAR EAS_population frequencies (used here) and the data available online. This is due to ANNOVAR using raw allele counts, versus the ExAC Browser using adjusted allele counts (DP >= 10 & GQ >= 20)
ExAC v03 nrChr EAS	Provide a minimal number of chromosomes included for genotyping the variant in EAS population
In ExAC v03 FIN	Include/Exclude all variants present in the ExAC database, release 03, FIN population
ExAC v03 MAF FIN	Provide a minor allele frequency. Although values are similar, there are discrepancies between ANNOVAR FIN_population frequencies (used here) and the data available online. This is due to ANNOVAR using raw allele counts, versus the ExAC Browser using adjusted allele counts (DP >= 10 & GQ >= 20)
ExAC v03 nrChr FIN	Provide a minimal number of chromosomes included for genotyping the variant in FIN population
In ExAC v03 NFE	Include/Exclude all variants present in the ExAC database, release 03, NFE population
ExAC v03 MAF NFE	Provide a minor allele frequency. Although values are similar, there are discrepancies between ANNOVAR NFE_population frequencies (used here) and the data available online. This is due to ANNOVAR using raw allele counts, versus the ExAC Browser using adjusted allele counts (DP >= 10 & GQ >= 20)
ExAC v03 nrChr NFE	Provide a minimal number of chromosomes included for genotyping the variant in NFE population
In ExAC v03 OTH	Include/Exclude all variants present in the ExAC database, release 03, OTH population
ExAC v03 MAF OTH	Provide a minor allele frequency. Although values are similar, there are discrepancies between ANNOVAR OTH_population frequencies (used here) and the data available online. This is due to ANNOVAR using raw allele counts, versus the ExAC Browser using adjusted allele counts (DP >= 10 & GQ >= 20)
ExAC v03 nrChr OTH	Provide a minimal number of chromosomes included for genotyping the variant in OTH population
In ExAC v03 SAS	Include/Exclude all variants present in the ExAC database, release 03, SAS population
ExAC v03 MAF SAS	Provide a minor allele frequency. Although values are similar, there are discrepancies between ANNOVAR SAS_population frequencies (used here) and the data available online. This is due to ANNOVAR using raw allele counts, versus the ExAC Browser using adjusted allele counts (DP >= 10 & GQ >= 20)
ExAC v03 nrChr SAS	Provide a minimal number of chromosomes included for genotyping the variant in SAS population
In Kaviar 150923	Include/Exclude all variants present in the Kaviar database, release 20150923
Kaviar 150923 MAF	Provide a minor allele frequency for the Kaviar database, release 20150923
Kaviar 150923 nrChr	Provide a minimal number of chromosomes included for genotyping the variant in Kaviar, release 20150923
In 1000g2012apr all	DEPRECATED Include/Exclude all variants present in the 1000 genomes project, release 2012-04, all populations
1000g2012apr all MAF	DEPRECATED Include/Exclude all variants present in the 1000 genomes project, release 2012-04, all populations, based on a minor allele frequency. Might be ambiguous for multi-allelic SNPs!
In 1000g2012apr afr	DEPRECATED Include/Exclude all variants present in the 1000 genomes project, release 2012-04, afr population
1000g2012apr afr MAF	DEPRECATED Include/Exclude all variants present in the 1000 genomes project, release 2012-04, afr population, based on a minor allele frequency. Might be ambiguous for multi-allelic SNPs!
In 1000g2012apr amr	DEPRECATED Include/Exclude all variants present in the 1000 genomes project, release 2012-04, amr population
1000g2012apr amr MAF	DEPRECATED Include/Exclude all variants present in the 1000 genomes project, release 2012-04, amr population, based on a minor allele frequency. Might be ambiguous for multi-allelic SNPs!
In 1000g2012apr asn	DEPRECATED Include/Exclude all variants present in the 1000 genomes project, release 2012-04, asn population
1000g2012apr asn MAF	DEPRECATED Include/Exclude all variants present in the 1000 genomes project, release 2012-04, asn population, based on a minor allele frequency. Might be ambiguous for multi-allelic SNPs!
In 1000g2012apr eur	DEPRECATED Include/Exclude all variants present in the 1000 genomes project, release 2012-04, eur population
1000g2012apr eur MAF	DEPRECATED Include/Exclude all variants present in the 1000 genomes project, release 2012-04, eur population, based on a minor allele frequency. Might be ambiguous for multi-allelic SNPs!
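
As a rough illustration of how the Abs.Occ and Rel.Occ filters in this section relate, the sketch below counts carriers of a variant in a group of samples, optionally restricted to one genotype. It assumes that the relative value is the carrier count divided by the number of samples in the group, and it applies no quality filters, in line with the notes above. The function name `occurrence` and the data layout are hypothetical.

```python
def occurrence(variant, sample_calls, genotype="any"):
    """Return (absolute, relative) occurrence of a variant in a sample group.

    sample_calls maps sample name -> {variant: "het" or "hom"}.
    genotype is "any", "het" or "hom".
    """
    carriers = 0
    for calls in sample_calls.values():
        call = calls.get(variant)
        if call is not None and genotype in ("any", call):
            carriers += 1
    total = len(sample_calls)
    # ASSUMPTION: relative occurrence = carriers / all samples in the group.
    relative = carriers / total if total else 0.0
    return carriers, relative


# Hypothetical control group of four samples.
controls = {
    "c1": {"v1": "het"},
    "c2": {"v1": "hom"},
    "c3": {"v1": "het"},
    "c4": {},
}

print(occurrence("v1", controls))          # any genotype
print(occurrence("v1", controls, "het"))   # heterozygous calls only
print(occurrence("v1", controls, "hom"))   # homozygous alt. calls only
```

A filter such as Abs.Occ. Control Samples (Heterozygous) would then compare the first returned value against the user-supplied threshold, and the matching Rel.Occ. filter would use the second.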

Location
Chromosomes	Include/Exclude variants from selected chromosomes
Position	Include/Exclude variants based on chromosomal position. Provide a chromosomal location without commas as thousands separators!
GenePanels	Include/Exclude variants affecting genes (according to ANNOVAR RefSeq annotations) listed in the selected gene panels
Public GenePanels	Include/Exclude variants affecting genes (according to ANNOVAR RefSeq annotations) listed in the selected gene panels. Presented panels were made available by other users.
RefSeq Is Hit	Select only variants that have annotation for RefSeq (according to ANNOVAR RefSeq annotations; exonic/intronic/..., but not intergenic)
RefSeq Gene Symbol	Include/Exclude variants affecting any of the listed genes (according to ANNOVAR RefSeq annotations). Provide a comma-separated list of gene symbols (GAPDH, etc.)
RefSeq Gene Transcript	Include/Exclude variants affecting any of the listed transcript IDs (according to ANNOVAR RefSeq annotations). Provide a comma-separated list of gene transcripts (NM_003242.2, etc.)
RefGene Is Hit	Select only variants that have annotation for RefGene (according to ANNOVAR RefGene annotations; exonic/intronic/..., but not intergenic)
RefGene Gene Symbol	Include/Exclude variants affecting any of the listed genes (according to ANNOVAR RefGene annotations). Provide a comma-separated list of gene symbols (GAPDH, etc.)
RefGene Gene Transcript	Include/Exclude variants affecting any of the listed transcript IDs (according to ANNOVAR RefGene annotations). Provide a comma-separated list of gene transcripts (NM_003242.2, etc.)
Ensembl GeneID	Include/Exclude variants affecting any of the listed Ensembl GeneIDs (according to ANNOVAR EnsGene annotations). Provide a comma-separated list of Ensembl geneIDs (ENSGxxx)
Ensembl Transcript	Include/Exclude variants affecting any of the listed Ensembl TranscriptIDs (according to ANNOVAR EnsGene annotations). Provide a comma-separated list of Ensembl transcripts (ENSTxxx)
UCSC Gene Symbol	Include/Exclude variants affecting any of the listed genes (according to ANNOVAR KnownGene annotations). Provide a comma-separated list of gene symbols (GAPDH, etc.)
UCSC Transcript	Include/Exclude variants affecting any of the listed transcripts (according to ANNOVAR KnownGene annotations). Provide a comma-separated list of UCSC transcripts (uc003cen.3, etc.)
In Genomic SuperDup	Include/Exclude variants located in Genomic SuperDups (UCSC Segmental Duplication Table)

Effect_On_Transcript
RefSeq VariantType	Include/Exclude variants matching any of the selected variant types (frameshift, (non-)synonymous, stopgain, ...). Variants are selected if they match at least one transcript variant.
RefSeq GeneLocation	Include/Exclude variants based on their position relative to genes (intron, exon, splice, intergenic, ...). Variants are selected if they match at least one transcript variant.
RefGene VariantType	Include/Exclude variants matching any of the selected variant types (frameshift, (non-)synonymous, stopgain, ...). Variants are selected if they match at least one transcript variant.
RefGene GeneLocation	Include/Exclude variants based on their position relative to genes (intron, exon, splice, intergenic, ...). Variants are selected if they match at least one transcript variant.
Ensembl VariantType	Include/Exclude variants matching any of the selected variant types (frameshift, (non-)synonymous, stopgain, ...). Variants are selected if they match at least one transcript variant.
Ensembl GeneLocation	Include/Exclude variants based on their position relative to genes (intron, exon, splice, intergenic, ...). Variants are selected if they match at least one transcript variant.
UCSC VariantType	Include/Exclude variants matching any of the selected variant types (frameshift, (non-)synonymous, stopgain, ...). Variants are selected if they match at least one transcript variant.
UCSC GeneLocation	Include/Exclude variants based on their position relative to genes (intron, exon, splice, intergenic, ...). Variants are selected if they match at least one transcript variant.
Splicing scSNV11 ADA	Ensemble splice alteration prediction, using Adaptive Boosting, based on PWM, MaxEntScan, NNSplice and HSF. pmid: 25416802. Score represents the probability of altered splicing (0-1); the proposed cutoff is 0.6.
Splicing scSNV11 RF	Ensemble splice alteration prediction, using Random Forest Classification, based on PWM, MaxEntScan, NNSplice and HSF. pmid: 25416802. Score represents the probability of altered splicing (0-1); the proposed cutoff is 0.6.
Splicing SPIDEX Zscore	Deep Learning prediction of alternate splicing. The observed value is ranked among all predictions; assuming a normal distribution, this rank is represented as a Zscore. The lower tail represents reduced splicing, the upper tail represents a new splice site. This annotation source is only available for non-profit, academic users. Please send a signed copy of the EULA to activate it.
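
The "at least one transcript variant" rule used by the VariantType and GeneLocation filters can be sketched as below. This is a minimal Python illustration, not VariantDB's own code: the function names, the list-of-strings layout for per-transcript annotations, and the reading of Exclude as "drop the variant when any transcript matches" are all assumptions.

```python
from typing import Iterable, Set


def matches_any_transcript(transcript_values: Iterable[str], selected: Set[str]) -> bool:
    """Return True if at least one per-transcript annotation is among the selected values."""
    return any(value in selected for value in transcript_values)


def keep_variant(transcript_values: Iterable[str], selected: Set[str], mode: str = "include") -> bool:
    """Apply the filter in Include or Exclude mode (hypothetical mode names)."""
    hit = matches_any_transcript(transcript_values, selected)
    # Assumption: Exclude drops a variant as soon as one transcript matches.
    return hit if mode == "include" else not hit


# Hypothetical variant annotated against three transcripts of the same gene.
annotations = ["synonymous", "stopgain", "nonsynonymous"]
print(keep_variant(annotations, {"stopgain", "frameshift"}, mode="include"))
print(keep_variant(annotations, {"stopgain", "frameshift"}, mode="exclude"))
```

Under these assumptions, a variant that is synonymous in one transcript but stopgain in another is retained by an Include filter on stopgain.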

Genotype_Composition
SNV or Indel	Include/Exclude single nucleotide variants, insertions or deletions only
Homozygous Or Heterozygous	Include/Exclude variants based on the allelic composition
Genotype Ratio	Allelic Ratio, as called by GATK Genotyper for samples of high ploidy. Use the Ploidy annotation to interpret the output. For Mutect/VarScan samples, this value equals the AllelicRatio (Mutect: FA, VarScan: FREQ).
Multi Allelic Position	Multi-Allelic positions are defined as positions with more than one alternative allele.
Compound Heterozygous RefSeq	Select genes (as defined by RefSeq) that are hit by more than one variant matching all set criteria. There is no check that the variants come from different parents
Compound Heterozygous RefGene	Select genes (as defined by RefGene) that are hit by more than one variant matching all set criteria. There is no check that the variants come from different parents
Compound Heterozygous Ensembl	Select genes (as defined by Ensembl) that are hit by more than one variant matching all set criteria. There is no check that the variants come from different parents
Somatic State	Select variants of a specific somatic state. This is only useful for Mutect/VarScan VCF imports. States are 0: reference, 1: germline, 2: somatic, 3: LOH, 4: post-translational, 5: unknown.
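
The Compound Heterozygous filters amount to grouping the variants that passed all other criteria by gene and keeping genes with more than one hit. The sketch below illustrates that idea in Python. The tuple layout, the function name and the example identifiers are hypothetical. As in the filter itself, nothing checks that the variants come from different parents.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical layout: (variant identifier, gene symbol) for variants that
# already passed every other active filter.
Variant = Tuple[str, str]


def compound_het_candidates(variants: Iterable[Variant]) -> Dict[str, List[str]]:
    """Return genes hit by more than one variant, with the variants per gene."""
    by_gene: Dict[str, List[str]] = defaultdict(list)
    for variant_id, gene in variants:
        by_gene[gene].append(variant_id)
    # No phasing or parental-origin check is performed here.
    return {gene: ids for gene, ids in by_gene.items() if len(ids) > 1}


passing = [
    ("chr1:1000A>G", "GENE_A"),
    ("chr1:2000C>T", "GENE_A"),
    ("chr2:500G>A", "GENE_B"),
]
print(compound_het_candidates(passing))
```

Here only GENE_A is reported, because GENE_B carries a single passing variant.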

Quality
Variant PhredScore	The Phred-scaled probability that a REF/ALT polymorphism exists at this site given the sequencing data. Because the Phred scale is -10 * log10(1-p), a value of 10 indicates a 1 in 10 chance of error, while a value of 100 indicates a 1 in 10^10 chance. The GATK values can grow very large when a lot of NGS data is used for calling.
Genotype PhredScore	The Genotype Quality, encoded as a Phred-scaled confidence that the true genotype is the one provided in GT. In the diploid case, if GT is 0/1, then GQ is really L(0/1) / (L(0/0) + L(0/1) + L(1/1)), where L is the likelihood of the NGS sequencing data under the model that the sample is 0/0, 0/1, or 1/1.
Reference Allele Depth	Number of reads passing GATK quality threshold, with the reference allele
Alternative Allele Depth	Number of reads passing GATK quality threshold, with the alternative allele
Total Depth	Sum of Reference and Alternative Depth
Allelic Ratio	Fraction of total reads called as alternative allele
Mapping Quality	Root Mean Square of the mapping quality of the reads across all samples.
BaseQuality RankSum	The u-based z-approximation from the Mann-Whitney Rank Sum Test for base qualities (ref bases vs. bases of the alternate allele). Note that the base quality rank sum test cannot be calculated for homozygous sites.
MappingQuality RankSum	The u-based z-approximation from the Mann-Whitney Rank Sum Test for mapping qualities (reads with ref bases vs. those with the alternate allele). Note that the mapping quality rank sum test cannot be calculated for homozygous sites.
ReadPosition RankSum	The u-based z-approximation from the Mann-Whitney Rank Sum Test for the distance from the end of the read for reads with the alternate allele; if the alternate allele is only seen near the ends of reads, this is indicative of error. Note that the read position rank sum test cannot be calculated for homozygous sites.
StrandBias	How much evidence is there for Strand Bias (the variation being seen on only the forward or only the reverse strand) in the reads? Higher SB values denote more bias (and therefore are more likely to indicate false positive calls).
Quality By Depth	Variant confidence (given as (AB+BB)/AA from the PLs) / unfiltered depth. (PL: Phred-scaled likelihood of the genotype). Low scores are indicative of false positive calls and artifacts. Note that QualByDepth requires sequencing reads associated with the samples with polymorphic genotypes.
Fisher Scaled StrandBias	Phred-scaled p-value using Fisher Exact Test to detect strand bias (the variation being seen on only the forward or only the reverse strand) in the reads. More bias is indicative of false positive calls. Note that the fisher strand test may not be calculated for certain complex indel cases or for multi-allelic sites.
VQSLOD	VQSLOD is the log odds ratio of being a true variant versus being false under the trained Gaussian mixture model when Variant Recalibration is applied.
DeltaPL	This field provides the likelihoods of the given genotypes (here, 0/0, 0/1, and 1/1). These are normalized, Phred-scaled likelihoods for each of 0/0, 0/1, and 1/1, without priors. To be concrete, for the heterozygous case, this is L(data given that the true genotype is 0/1). The most likely genotype (given in the GT field) is scaled so that its P = 1.0 (0 when Phred-scaled), and the other likelihoods reflect their Phred-scaled likelihoods relative to this most likely genotype.
Tranches Filter	Content from VCF FILTER field. In case of GATK VQSR this is the confidence tranche.
Variant In Stretch	Select variants located in a stretch of repetitive sequence, as indicated by recent GATK versions. The annotations StretchUnit and StretchLength give more information about the stretch.
Stretch Length	Select variants based on the stretch repeat length (e.g. homopolymer stretches). Both alleles must fulfill this criterion.
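
Several of the quality metrics above are simple arithmetic on Phred scores, read depths and genotype likelihoods. The Python sketch below works through three of them: converting a Phred score to an error probability, computing Total Depth and Allelic Ratio, and normalizing genotype likelihoods the way the DeltaPL entry describes. The function names are hypothetical, and rounding the normalized values to integers is an assumption that follows common VCF practice.

```python
import math
from typing import List, Sequence


def phred_to_error_probability(q: float) -> float:
    """A Phred score Q corresponds to an error probability of 10^(-Q/10)."""
    return 10 ** (-q / 10)


def allelic_ratio(ref_depth: int, alt_depth: int) -> float:
    """Fraction of total reads (reference + alternative) carrying the alternative allele."""
    total_depth = ref_depth + alt_depth
    if total_depth == 0:
        raise ValueError("no reads cover this position")
    return alt_depth / total_depth


def normalized_pl(likelihoods: Sequence[float]) -> List[int]:
    """Phred-scale genotype likelihoods and shift them so the most likely genotype is 0.

    Assumes every likelihood is strictly positive.
    """
    phred = [-10 * math.log10(value) for value in likelihoods]
    best = min(phred)
    return [round(value - best) for value in phred]


print(phred_to_error_probability(10))   # 1 in 10 chance of error
print(allelic_ratio(ref_depth=30, alt_depth=10))
print(normalized_pl([0.001, 0.9, 0.0001]))  # likelihoods for 0/0, 0/1, 1/1
```

In the last call the heterozygous genotype is the most likely one, so it receives 0 and the other two genotypes receive their Phred-scaled distance from it.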

Mutation_Effect_Predictions
CADDv1.4 Phred	CADD v1.4 Phred-Scores. Integrated pathogenicity prediction score, see publication for details (pmid: 24487276). The scores represent the likelihood that the variant is pathogenic, expressed as its rank among genome-wide variants: a score S places the variant in the top 10^(-S/10) fraction (e.g. a score of 20 corresponds to the top 1%).
dbSNP v135 Clinical	DEPRECATED Select Variants from dbSNPv135 that are tagged as clinically relevant. This is a recent feature and contains many unclear entries
dbSNP v137 Clinical	DEPRECATED Select Variants from dbSNPv137 that are tagged as clinically relevant. This is a recent feature and contains many unclear entries
dbSNP v138 Clinical	Select Variants from dbSNPv138 that are tagged as clinically relevant. This is a recent feature and contains many unclear entries
LJB GERP Score	DEPRECATED GERP is a conservation score. High score is indicative of constrained site. (See Davidov et al, 2010, plos computational biology)
LJB LRT Score	DEPRECATED Rescaled (0-1) likelihood ratio test of codon constrained. Higher score for more constrained codons. See dbNSFP paper for details (pmid: 21520341)
LJB Mutation Taster Score	DEPRECATED Mutation Taster has four categories: (A)utomatic and (D)isease causing, for which the score is p-value for a true prediction. (N)on-deleterious and (P)olymorphism known, for which the score is 1 - p-value for a true prediction. The higher the score, the more likely the variant is deleterious. See dbNSFP paper for details (pmid: 21520341)
LJB PhyloP Score	DEPRECATED Rescaled (0-1) PhyloP score. Higher Score for more conserved sites. Prediction is Rescaled Score higher than 0.95 for a conserved site. See dbNSFP paper for details (pmid: 21520341). Note: Vissers et al reported PhyloP threshold of 2.5, rescaled to 0.998 for deleterious variants.
LJB PolyPhen2 Score	DEPRECATED Polyphen2 score. Higher Score for more likely damaging sites. Prediction is probably damaging if higher than 0.85 for a damaging site. Possibly damaging between 0.85 and 0.15. See dbNSFP paper for details (pmid: 21520341).
LJB Sift Score	DEPRECATED Rescaled Sift score. Higher Score for more likely damaging sites. Prediction is damaging if higher than 0.95. See dbNSFP paper for details (pmid: 21520341).
CADD Raw Score	DEPRECATED CADD C-Scores. Integrated pathogenicity prediction score. Raw Scores, see publication for details (pmid: 24487276).
CADD Phred Score	DEPRECATED CADD C-Scores in Phred Scale. E.g.: Scores above 20 are in the 1% top scoring variants, scores above 30 are in the 0.1% top scoring variants (pmid: 24487276).
WebTool Sift Score	Rescaled Sift score. Higher Score for more likely damaging sites. Prediction is damaging if higher than 0.95. Scores are obtained from the PROVEAN WebTool.
WebTool PROVEAN Score	PROVEAN score. Score below -2.5 indicates likely damaging sites. Scores are obtained from the PROVEAN WebTool.
WebTool Grantham Score	Grantham score. Measures the physicochemical distance between the reference and the substituted amino acid; higher scores indicate more radical substitutions. Scores are obtained from the PROVEAN WebTool.
WebTool MutationTaster Score	Mutation Taster has four categories: (A)utomatic and (D)isease causing, for which the score is p-value for a true prediction. (N)on-deleterious and (P)olymorphism known, for which the score is 1 - p-value for a true prediction. The higher the score, the more likely the variant is deleterious. High score for class P, might indicate a false positive in reference databases such as dbSNP. Scores are obtained from the MTQE.
dbnsfp30a SIFT Score	dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between 0 and 1, with lower values being more likely to represent pathogenicity. The proposed threshold for SIFT is 0.05
dbnsfp30a pph2 HDIV Score	dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between 0 and 1, with higher values being more likely to represent pathogenicity. The proposed threshold for PolyPhen2_HDIV is 0.5 (Neutral vs non-neutral), or 0.957 and 0.453 (Probably damaging, possibly damaging, neutral)
dbnsfp30a pph2 HVAR Score	dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between 0 and 1, with higher values being more likely to represent pathogenicity. The proposed threshold for PolyPhen2_HVAR is 0.5 (Neutral vs non-neutral), or 0.909 and 0.447 (Probably damaging, possibly damaging, neutral)
dbnsfp30a LRT pred	dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between 0 and 1, with higher values being more likely to represent pathogenicity. The likelihood ratio test is a composite decision logic, providing benign, damaging or unclear classes.
dbnsfp30a MutationTaster Score	dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between 0 and 1, with higher values being more likely to represent pathogenicity. The proposed threshold for MutationTaster is 0.5
dbnsfp30a MutationAssessor Score	dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between -5.545 and 5.975, with higher values being more likely to represent pathogenicity. The proposed threshold for MutationAssessor is 0.65
dbnsfp30a FATHMM Score	dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between -16.13 and 10.64, with smaller values being more likely to represent pathogenicity. The proposed threshold for FATHMM is -1.5
dbnsfp30a FATHMM MKL Score	dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between 0 and 1, with higher values being more likely to represent pathogenicity. The proposed threshold for FATHMM_MKL_coding is 0.5
dbnsfp30a PROVEAN Score	dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between -14 and 14, with smaller values being more likely to represent pathogenicity. The proposed threshold for PROVEAN is -2.5
dbnsfp30a VEST3 Score	dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between 0 and 1, with higher values being more likely to represent pathogenicity. The proposed threshold for VEST3 is not provided
dbnsfp30a CADD Phred	dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Scores are Phred-scaled, with higher values being more likely to represent pathogenicity. A CADD_phred score S should be interpreted as placing the variant in the top 10^(-S/10) fraction of most pathogenic variants (e.g. a score of 20 corresponds to the top 1%). NOTE: dbNSFP only provides scores for EXONIC SNVs!
dbnsfp30a DANN Score	dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between 0 and 1, with higher values being more likely to represent pathogenicity. The proposed threshold for DANN is not provided
dbnsfp30a MetaSVM Score	dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between -2 and 3, with higher values being more likely to represent pathogenicity. The proposed threshold for MetaSVM is 0
dbnsfp30a MetaLR Score	dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between 0 and 1, with higher values being more likely to represent pathogenicity. The proposed threshold for MetaLR is 0.5
dbnsfp30a Integrated fitCons Score	dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between 0 and 1, with higher values being more likely to represent pathogenicity. The proposed threshold for fitCons is not provided
dbnsfp30a GERP RS	dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between -12.3 and 6.17. Higher values more likely represent pathogenicity. The proposed threshold for GERP is 4.4
dbnsfp30a PhyloP7 vert	dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between -5.172 and 1.062. Higher values more likely represent pathogenicity. The proposed threshold for PhyloP is not provided
dbnsfp30a PhyloP20 mam	dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions range between -13.282 and 1.199. Higher values more likely represent pathogenicity. The proposed threshold for PhyloP is not provided
dbnsfp30a SiPhy 29way	dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions range between 0 and 37.9718. Higher values more likely represent pathogenicity. The proposed threshold for SiPhy is not provided
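
The dbNSFP thresholds listed above differ in direction: for SIFT, FATHMM and PROVEAN a lower score is more damaging, for the other tools a higher one. The Python sketch below collects the thresholds quoted in this section into a lookup table and adds the three-class PolyPhen2_HDIV split and the CADD Phred conversion. The names are hypothetical, and whether a score exactly at the threshold counts as damaging is an assumption (strict comparison for the two-class cutoffs, inclusive for the PolyPhen2 classes).

```python
from typing import Dict, Tuple

# (threshold, direction in which scores become more damaging), as quoted above.
THRESHOLDS: Dict[str, Tuple[float, str]] = {
    "SIFT": (0.05, "lower"),
    "PolyPhen2_HDIV": (0.5, "higher"),
    "PolyPhen2_HVAR": (0.5, "higher"),
    "MutationTaster": (0.5, "higher"),
    "MutationAssessor": (0.65, "higher"),
    "FATHMM": (-1.5, "lower"),
    "FATHMM_MKL_coding": (0.5, "higher"),
    "PROVEAN": (-2.5, "lower"),
    "MetaSVM": (0.0, "higher"),
    "MetaLR": (0.5, "higher"),
    "GERP": (4.4, "higher"),
}


def is_predicted_damaging(tool: str, score: float) -> bool:
    """Compare a score with the proposed threshold, respecting its direction."""
    threshold, direction = THRESHOLDS[tool]
    return score < threshold if direction == "lower" else score > threshold


def polyphen2_hdiv_class(score: float) -> str:
    """Three-class split using the 0.957 and 0.453 cutoffs."""
    if score >= 0.957:
        return "probably damaging"
    if score >= 0.453:
        return "possibly damaging"
    return "neutral"


def cadd_phred_top_fraction(score: float) -> float:
    """Fraction of top-scoring variants implied by a CADD Phred score."""
    return 10 ** (-score / 10)


print(is_predicted_damaging("SIFT", 0.01), is_predicted_damaging("PROVEAN", -1.0))
print(polyphen2_hdiv_class(0.6))
print(cadd_phred_top_fraction(20))
```

Tools for which no threshold is proposed (VEST3, DANN, fitCons, PhyloP, SiPhy) are deliberately left out of the table.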

snpEff
Effect	SnpEff annotation on variant effect, based on ensembl (stop_lost, intronic, splice_site, UTR, exon, ...)
Effect Impact	SnpEff annotation on variant impact, based on ensembl (High, Moderate, Modifier, Low). For more information see : http://snpeff.sourceforge.net/SnpEff_manual.html#eff
Functional Class	SnpEff annotation on variant class, based on ensembl (None, Silent, Missense, Nonsense).
Gene Coding	Include/Exclude variants in Coding/NonCoding genes, according to SnpEff annotation, based on Ensembl
Transcript BioType	Various biological types, according to Ensembl based SnpEff annotations (gene, pseudogene, lncRNA, ...)
Gene Symbol	Provide a comma-separated list of gene symbols (GAPDH,etc)
Gene Transcript	Provide a comma-separated list of Ensembl transcripts (ENSTxxx)

ClinVar_SNPs
Match Type	Include/Exclude variants with exact or overlapping matches, at nucleotide level, to variants in ClinVar. Exact means same position, size, and alternative allele.
Amino Acid Match Type	Include/Exclude variants with exact or overlapping matches, at amino acid level, to variants in ClinVar. Exact means same position, size, and alternative amino acid.
Variant Effect	Include/Exclude variants based on effect annotation in ClinVar (stop-gain, UTR, synonymous, ...)
Classification	Include/Exclude variants overlapping variants of specified pathogenicity class in ClinVar (benign, risk factor, pathogenic, protective, ...)
Gene Symbol	Provide a comma-separated list of gene symbols (GAPDH,etc)
Disease	Provide a fragment of text to match the ClinVar disease linked to the variant.
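
The difference between exact and overlapping matches at nucleotide level can be sketched as follows. Exact follows the definition given above (same position, size, and alternative allele). The definition of overlapping as "the two intervals share at least one base on the same chromosome" is an assumption, as are the class and function names and the 1-based inclusive coordinates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    chrom: str
    start: int  # 1-based, inclusive (assumed convention)
    end: int    # 1-based, inclusive
    alt: str


def is_exact_match(query: Variant, clinvar: Variant) -> bool:
    """Same position, same size (start and end agree), same alternative allele."""
    return (
        query.chrom == clinvar.chrom
        and query.start == clinvar.start
        and query.end == clinvar.end
        and query.alt == clinvar.alt
    )


def is_overlapping_match(query: Variant, clinvar: Variant) -> bool:
    """Assumed definition: the two intervals share at least one base."""
    return (
        query.chrom == clinvar.chrom
        and query.start <= clinvar.end
        and clinvar.start <= query.end
    )


snv = Variant("chr1", 1000, 1000, "G")
deletion = Variant("chr1", 998, 1003, "-")
print(is_exact_match(snv, deletion), is_overlapping_match(snv, deletion))
```

Under these assumptions every exact match is also an overlapping match, but not the other way around.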

Gene_Ontology
Associated To Any GO ID	Return variants that affect genes associated to a Gene-Ontology term.
Associated To Specified GO ID	Provide a comma-separated list of GO-identifiers (GO:2001295) associated to the gene affected by the returned variant.
Associated To Text Matched Term	Provide a fragment of text to match GO_terms associated to the gene affected by the returned variant.

Oncology_Specific
In COSMIC v70	Include/Exclude all variants present in COSMICv70
COSMIC v70 Occurence in tissue	Include/Exclude all variants present in a specific tissue in COSMICv70, with a minimal/maximal occurrence count
Genotype Ratio	Include/Exclude variants based on the genotype ratio. Range is 0 to 1. Relates to the number of alternate alleles vs the ploidy.
Ploidy	Sample Ploidy used during genotyping. By default, this is two.

User_Provided
AutoClassified	Select variants that were automatically assigned the specified diagnostic class during import.
Diagnostic Class (Current Sample)	Retrieve variants previously assigned by a user to the requested diagnostic class.
Validation (Current Sample)	Retrieve variants validated by the requested methods.
Inheritance (Current Sample)	Retrieve variants previously set by a user to the requested inheritance.
Diagnostic Class (All Samples)	Retrieve variants previously assigned by a user to the requested diagnostic class.
Diagnostic Class (By Project)	Retrieve variants previously assigned by a user to the requested diagnostic class, limited to specific projects.
Validation (All Samples)	Retrieve variants validated by the requested methods.
Inheritance (All Samples)	Retrieve variants previously set by a user to the requested inheritance.

Custom_VCF_Fields
help	No description available
