Documentation : Filtering Criteria Explained

Dynamic_Filters
De Novo This filter automatically searches for de novo variants in the selected sample. It retains only variants that are absent from the parental samples (or present as homozygous reference). If only a single parent is available, a warning is issued. If no parents are available, the query is not executed. The optional quality check applies the quality metrics defined for the selected sample to the parental samples as well. Note that this might increase the number of putative de novo variants if the parental data is of low quality! Activating the option to exclude non_covered variants reports a variant only if the position is confidently called as reference in the parental VCF. Note that this excludes positions of low parental quality (applying the same quality criteria). If GVCF data is available, confident regions (GQ>20) are also counted as covered positions.
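The De Novo rule can be sketched as follows. This is a minimal illustration, not the tool's implementation: the function name, the genotype strings ("0/0", "0/1", "1/1") and the use of None for a non-called position are all assumptions made for the example.

```python
def is_de_novo(child_gt, parent_gts, require_covered=False):
    """Return True if the child's variant looks de novo.

    Genotypes are hypothetical strings: "0/0" (homozygous reference),
    "0/1", "1/1", or None when the position is not called in that sample.
    """
    if not parent_gts:
        # Mirrors the documented behaviour: without parents, no query.
        raise ValueError("no parental samples: the query is not executed")
    if child_gt not in ("0/1", "1/1"):
        return False
    for gt in parent_gts:
        if gt is None:
            # Position not called in this parent: accepted by default,
            # rejected when non-covered positions are excluded.
            if require_covered:
                return False
            continue
        if gt != "0/0":
            # The parent carries the variant, so it is inherited.
            return False
    return True
```

With require_covered=True, a variant is kept only when every available parent has an explicit reference call, which corresponds to the stricter option described above.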
Dominant This filter automatically searches for Dominant variants in the selected sample. It retains only variants present in all affected family members. Increase the unaffected-carrier value to simulate reduced penetrance. If no family is available, the query is not executed. The optional quality check applies the same quality metrics defined for the selected sample, to the family samples. Note, this might affect the number of variants in case of low quality family data! Excluding homozygous variants limits the candidate variants to heterozygous cosegregation.
Recessive (general) This filter automatically searches for recessive variants in the selected sample. It retains only variants present as homozygous in all affected family members. Parental samples are not checked for heterozygous presence of the variants. Increase the unaffected-carrier value to allow homozygous unaffected carriers, simulating reduced penetrance. If no family is available, this query reduces to retrieving all homozygous variants. The optional quality check applies the same quality metrics defined for the selected sample, to the family samples. Note, this might affect the number of variants in case of low quality family data!
Recessive (biparental) This filter automatically searches for recessive variants in the selected sample. It retains only variants present as homozygous in all affected family members AND heterozygous in both parents of each affected case. Increase the unaffected-carrier value to allow homozygous unaffected carriers, simulating reduced penetrance. If no family is available, or any affected sample does not have two parents associated, this query will not execute. The optional quality check applies the same quality metrics defined for the selected sample, to the family samples. Note, this might affect the number of variants in case of low quality family data!
Recessive (Compound Heterozygous) This filter automatically searches for compound heterozygous variants in the set of remaining variants. It retains variant combinations in the same gene that are biparentally inherited. If parents are not available, the query is not executed. The optional quality check applies the same quality metrics defined for the selected sample to the parental samples. Note, this might impact the number of putative variants in case of low quality parental data! Activating the option to exclude non_covered variants will only report a variant if the position is called as heterozygous in the parental VCF. By default, non-called positions in the parents are considered as possibly heterozygous.
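One possible reading of "biparentally inherited" is that, for a pair of variants in the same gene, one variant can come from the father and the other from the mother. The sketch below works under that assumption. The dictionary keys, function names and genotype encoding are hypothetical, and the actual pairing logic of the tool may differ.

```python
from collections import defaultdict
from itertools import combinations

def could_carry(gt, exclude_non_covered):
    """Can a parent with this genotype have transmitted the variant?"""
    if gt is None:
        # Non-called position: possibly heterozygous by default.
        return not exclude_non_covered
    return gt == "0/1"

def compound_het_pairs(variants, exclude_non_covered=False):
    """Return (gene, id_a, id_b) for pairs compatible with one paternal
    and one maternal variant in the same gene.

    Each variant is a hypothetical dict with the keys
    'id', 'gene', 'father_gt' and 'mother_gt'.
    """
    by_gene = defaultdict(list)
    for v in variants:
        by_gene[v["gene"]].append(v)

    pairs = []
    for gene, group in by_gene.items():
        for a, b in combinations(group, 2):
            a_pat = could_carry(a["father_gt"], exclude_non_covered)
            a_mat = could_carry(a["mother_gt"], exclude_non_covered)
            b_pat = could_carry(b["father_gt"], exclude_non_covered)
            b_mat = could_carry(b["mother_gt"], exclude_non_covered)
            if (a_pat and b_mat) or (a_mat and b_pat):
                pairs.append((gene, a["id"], b["id"]))
    return pairs
```

Setting exclude_non_covered=True removes pairs that rely on a non-called parental position, which matches the stricter option described above.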
Pathogenic Prediction Select variants predicted to be damaging by a minimal number of tools, out of a user-defined list. Pathogenic predictions are : LJB-LRT:'D' ; LJB-MT:0.95; LJB-PhyloP:0.95; LJB-PP2:0.95; LJB-SIFT:0.95; CADD-Phred:20; Web-Sift:0.95; Provean:2.5 (absolute value)
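The "minimal number of tools" logic can be illustrated with the thresholds listed above. This is only a sketch: the function names are hypothetical, and treating each threshold as inclusive (>=) is an assumption, since the text does not say whether the comparison is strict.

```python
# Numeric thresholds taken from the list above; LJB-LRT and Provean
# are handled separately because they are not plain ">= cutoff" checks.
THRESHOLDS = {
    "LJB-MT": 0.95,
    "LJB-PhyloP": 0.95,
    "LJB-PP2": 0.95,
    "LJB-SIFT": 0.95,
    "CADD-Phred": 20,
    "Web-Sift": 0.95,
}

def is_damaging(tool, value):
    """Return True if a single tool predicts the variant as damaging."""
    if value is None:
        return False              # no annotation, no vote
    if tool == "LJB-LRT":
        return value == "D"       # categorical prediction
    if tool == "Provean":
        return abs(value) >= 2.5  # threshold applies to the absolute value
    return value >= THRESHOLDS[tool]

def passes_prediction_filter(annotations, tools, min_tools):
    """Keep a variant if at least min_tools of the chosen tools agree."""
    hits = sum(is_damaging(tool, annotations.get(tool)) for tool in tools)
    return hits >= min_tools
```

For example, a variant annotated with LJB-LRT 'D', CADD-Phred 23.1 and Provean -3.0 collects three damaging votes, so it passes when the minimal number of tools is set to 3 or lower.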
 

Family
In Parents If parental samples are available, select one or more samples that should (not) contain variants seen in the current sample. Genotype can be specified to only consider heterozygous or homozygous variants. Note that selecting multiple samples in a single rule is equivalent to an "OR" statement, while setting multiple rules, each containing one parent, is equivalent to "AND" filtering.
In Siblings If sibling samples are available, select one or more samples that should (not) contain variants seen in the current sample. Genotype can be specified to only consider heterozygous or homozygous variants. Note that selecting multiple samples in a single rule is equivalent to an "OR" statement, while setting multiple rules, each containing one sample, is equivalent to "AND" filtering.
In Siblings beta If sibling samples are available, select one or more samples that should (not) contain variants seen in the current sample. Genotype can be specified to only consider heterozygous or homozygous variants. Note that selecting multiple samples in a single rule is equivalent to an "OR" statement, while setting multiple rules, each containing one sample, is equivalent to "AND" filtering.
In Children If offspring samples are available, select one or more samples that should (not) contain variants seen in the current sample. Genotype can be specified to only consider heterozygous or homozygous variants. Note that selecting multiple samples in a single rule is equivalent to an "OR" statement, while setting multiple rules, each containing one child, is equivalent to "AND" filtering.
In Replica If biological replicates are available, for example paired tumor/normal samples, select one or more samples that should (not) contain variants seen in the current sample. Genotype can be specified to only consider heterozygous or homozygous variants. Note that selecting multiple samples in a single rule is equivalent to an "OR" statement, while setting multiple rules, each containing one replicate, is equivalent to "AND" filtering.
In Custom A Custom groups allow you to assign related (e.g. by phenotype), but not familial, samples to an individual. Genotype can be specified to only consider heterozygous or homozygous variants. Note that selecting multiple samples in a single rule is equivalent to an "OR" statement, while setting multiple rules, each containing one sample, is equivalent to "AND" filtering.
In Custom B Custom groups allow you to assign related (e.g. by phenotype), but not familial, samples to an individual. Genotype can be specified to only consider heterozygous or homozygous variants. Note that selecting multiple samples in a single rule is equivalent to an "OR" statement, while setting multiple rules, each containing one sample, is equivalent to "AND" filtering.
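The "OR" within a rule versus "AND" across rules behaviour shared by the Family filters above can be sketched as below. The data layout (a dict mapping sample names to sets of variant keys) and the function names are assumptions made for the illustration.

```python
def rule_matches(variant, rule_samples, calls):
    """One rule: the variant is present in ANY sample of the rule (OR)."""
    return any(variant in calls[sample] for sample in rule_samples)

def passes_all_rules(variant, rules, calls):
    """Several rules: every rule has to match (AND)."""
    return all(rule_matches(variant, rule, calls) for rule in rules)

# Hypothetical calls: the variant is seen in the father only.
calls = {"father": {"chr1:100:A>G"}, "mother": set()}

# One rule listing both parents: father OR mother.
one_rule = [["father", "mother"]]
# Two rules with one parent each: father AND mother.
two_rules = [["father"], ["mother"]]
```

Here passes_all_rules("chr1:100:A>G", one_rule, calls) is True, while the same call with two_rules is False, because the mother does not carry the variant.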
 

Occurence
In All Control Samples Select for variants that (do not) occur in any of the samples labeled as controls. Genotype specific filtering is available.
Abs.Occ Control Samples Provide an absolute value ( >0 !! ) for the number of times a variant was seen in your control samples. Occurrence does not take quality filters into account for the control samples.
Rel.Occ Control Samples Provide a relative value ( < 1 ) for the number of times a variant was seen in your control samples. Occurrence does not take quality filters into account for the control samples.
Abs.Occ All Samples Provide an absolute value ( >0 !! ) for the number of times a variant was seen in your samples. Occurrence does not take quality filters into account for the cohort samples.
Rel.Occ All Samples Provide a relative value ( < 1 ) for the number of times a variant was seen in your samples. Occurrence does not take quality filters into account for the cohort samples.
Abs.Occ Female Samples Provide an absolute value ( >0 !! ) for the number of times a variant was seen in your female samples. Occurrence does not take quality filters into account for the female samples.
Rel.Occ Female Samples Provide a relative value ( < 1 ) for the number of times a variant was seen in your female samples. Occurrence does not take quality filters into account for the female samples.
Abs.Occ Male Samples Provide an absolute value ( >0 !! ) for the number of times a variant was seen in your male samples. Occurrence does not take quality filters into account for the male samples.
Rel.Occ Male Samples Provide a relative value ( < 1 ) for the number of times a variant was seen in your male samples. Occurrence does not take quality filters into account for the male samples.
In Selected Control Samples DEPRECATED Select for variants that (do not) occur in a selection of the samples labeled as controls. Genotype specific filtering is available.
Abs.Occ. Control Samples (Any Genotype) Provide an absolute value ( >0 !! ) for the number of times a variant was seen in your control samples. Occurrence does not take quality filters into account for the control samples.
Abs.Occ. Control Samples (Heterozygous) Provide an absolute value ( >0 !! ) for the number of times a variant was seen in your control samples as a heterozygous call. Occurrence does not take quality filters into account for the control samples.
Abs.Occ. Control Samples (Homozygous Alt.) Provide an absolute value ( >0 !! ) for the number of times a variant was seen in your control samples as a homozygous call. Occurrence does not take quality filters into account for the control samples.
Rel.Occ. Control Samples (Any Genotype) Provide a fraction in the range of 0-1 of your control samples that the variant was seen in. Occurrence does not take quality filters into account for the control samples.
Rel.Occ. Control Samples (Heterozygous) Provide a fraction in the range of 0-1 of your control samples that the variant was seen in as a heterozygous call. Occurrence does not take quality filters into account for the control samples.
Rel.Occ. Control Samples (Homozygous Alt.) Provide a fraction in the range of 0-1 of your control samples that the variant was seen in as a homozygous call. Occurrence does not take quality filters into account for the control samples.
Abs.Occ. All Samples (Any Genotype) Provide an absolute value ( >0 !! ) for the number of times a variant was seen in your samples. Occurrence does not take quality filters into account.
Abs.Occ. All Samples (Heterozygous) Provide an absolute value ( >0 !! ) for the number of times a variant was seen in your samples as a heterozygous call. Occurrence does not take quality filters into account.
Abs.Occ. All Samples (Homozygous Alt.) Provide an absolute value ( >0 !! ) for the number of times a variant was seen in your samples as a homozygous call. Occurrence does not take quality filters into account.
Rel.Occ. All Samples (Any Genotype) Provide a fraction in the range of 0-1 of your samples that the variant was seen in. Occurrence does not take quality filters into account.
Rel.Occ. All Samples (Heterozygous) Provide a fraction in the range of 0-1 of your samples that the variant was seen in as a heterozygous call. Occurrence does not take quality filters into account.
Rel.Occ. All Samples (Homozygous Alt.) Provide a fraction in the range of 0-1 of your samples that the variant was seen in as a homozygous call. Occurrence does not take quality filters into account.
Abs.Occ. By Project (Any Genotype) Provide an absolute value ( >0 !! ) for the number of times a variant was seen in a certain project. Occurrence does not take quality filters into account.
Abs.Occ. By Project (Homozygous Ref) Provide an absolute value ( >0 !! ) for the number of times a variant was seen in a certain project as a homozygous reference call. This filter takes GVCF entries with GQ > 20 into account as reference calls. Occurrence does not take quality filters into account.
Abs.Occ. By Project (Heterozygous) Provide an absolute value ( >0 !! ) for the number of times a variant was seen in a certain project as a heterozygous call. Occurrence does not take quality filters into account.
Abs.Occ. By Project (Homozygous Alt.) Provide an absolute value ( >0 !! ) for the number of times a variant was seen in a certain project as a homozygous alternative call. Occurrence does not take quality filters into account.
Rel.Occ. By Project (Any Genotype) Provide a fraction in the range of 0-1 of the samples, within a selection of projects, that the variant was seen in. Occurrence does not take quality filters into account.
Rel.Occ. By Project (Heterozygous) Provide a fraction in the range of 0-1 of the samples, within a selection of projects, that the variant was seen in as a heterozygous call. Occurrence does not take quality filters into account.
Rel.Occ. By Project (Homozygous Alt.) Provide a fraction in the range of 0-1 of the samples, within a selection of projects, that the variant was seen in as a homozygous call. Occurrence does not take quality filters into account.
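The difference between the absolute and relative occurrence values used by the filters above can be sketched as follows. The function name and genotype encoding are hypothetical, and dividing by the total number of samples in the group (including non-called ones) is an assumption about how the fraction is computed.

```python
def occurrence(genotypes, genotype=None):
    """Return (absolute, relative) occurrence of a variant in a sample group.

    genotypes: one call per sample, e.g. "0/1", "1/1", "0/0" or None.
    genotype:  restrict counting to one genotype ("0/1" or "1/1");
               None counts any non-reference genotype.
    No quality filter is applied, as stated for these filters.
    """
    if genotype is None:
        carriers = sum(gt in ("0/1", "1/1") for gt in genotypes)
    else:
        carriers = sum(gt == genotype for gt in genotypes)
    relative = carriers / len(genotypes) if genotypes else 0.0
    return carriers, relative
```

For a group with the calls ["0/1", "1/1", "0/0", None], the any-genotype occurrence is 2 samples (fraction 0.5), and the homozygous-alternative occurrence is 1 sample (fraction 0.25).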
Gene Hit In Other Cases (RefSeq VariantType) Filter for variants affecting (RefSeq) genes, that are also affected by variants of the selected types in other cases you have access to. Other parameters such as quality are not taken into account for additional samples. Provide the minimal number of ADDITIONAL hits in the gene in the text field. Control samples are excluded as additional hits
Gene Hit In Other Cases (SnpEff Impact) Filter for variants affecting (Ensembl) genes, that are also affected by variants of the selected SnpEff Impact types in other cases you have access to. Other parameters such as quality are not taken into account for additional samples. Provide the minimal number of ADDITIONAL hits in the gene in the text field. Control samples are excluded as additional hits
In dbSNP v130 DEPRECATED Include/Exclude all variants present in dbSNP v130
In dbSNP v135 DEPRECATED Include/Exclude all variants present in dbSNP v135
dbSNP v135 MAF DEPRECATED Include/Exclude variants present in dbSNP v135 based on a minor allele frequency. Might be ambiguous for multi-allelic SNPs!
In dbSNP v137 DEPRECATED Include/Exclude all variants present in dbSNP v137
dbSNP v137 MAF DEPRECATED Include/Exclude variants present in dbSNP v137 based on a minor allele frequency. Might be ambiguous for multi-allelic SNPs!
In dbSNP v138 Include/Exclude all variants present in dbSNP v138
dbSNP v138 MAF Include/Exclude variants present in dbSNP v138 based on a minor allele frequency. Might be ambiguous for multi-allelic SNPs!
In dbSNP v142 Include/Exclude all variants present in dbSNP v142
dbSNP v142 MAF Include/Exclude variants present in dbSNP v142 based on a minor allele frequency. Might be ambiguous for multi-allelic SNPs!
dbSNP v142 nrChr Include/Exclude variants present in dbSNP v142 based on a population size (number of chromosomes). Note: Variants not listed in dbSNP have a chromosome count of zero.
In ESP5400 all DEPRECATED Include/Exclude all variants present in the Exome Sequencing Project, release 5400, all populations
ESP5400 all MAF DEPRECATED Include/Exclude variants present in ESP5400, all populations, based on a minor allele frequency. Might be ambiguous for multi-allelic SNPs!
In ESP5400 ea DEPRECATED Include/Exclude all variants present in the Exome Sequencing Project, release 5400, European Americans
ESP5400 ea MAF DEPRECATED Include/Exclude variants present in ESP5400, ea population, based on a minor allele frequency. Might be ambiguous for multi-allelic SNPs!
In ESP5400 aa DEPRECATED Include/Exclude all variants present in the Exome Sequencing Project, release 5400, African Americans
ESP5400 aa MAF DEPRECATED Include/Exclude variants present in ESP5400, aa population, based on a minor allele frequency. Might be ambiguous for multi-allelic SNPs!
In ESP6500 all Include/Exclude all variants present in the Exome Sequencing Project, release 6500, all populations
ESP6500 all MAF Include/Exclude variants present in ESP6500, all populations, based on a minor allele frequency. Might be ambiguous for multi-allelic SNPs!
In ESP6500 ea Include/Exclude all variants present in the Exome Sequencing Project, release 6500, European Americans
ESP6500 ea MAF Provide a minor allele frequency. Might be ambiguous for multi-allelic SNPs!
In ESP6500 aa Include/Exclude all variants present in the Exome Sequencing Project, release 6500, African Americans
ESP6500 aa MAF Provide a minor allele frequency. Might be ambiguous for multi-allelic SNPs!
In ExAC v02 DEPRECATED Include/Exclude all variants present in the ExAC database, release 02, All populations
ExAC v02 MAF DEPRECATED Provide a minor allele frequency. Although values are similar, there are discrepancies between ANNOVAR ALL_population frequencies (used here) and the data available online. This is due to ANNOVAR using raw allele counts, versus the ExAC Browser using adjusted allele counts (DP >= 10 & GQ >= 20)
In ExAC v03 ALL Include/Exclude all variants present in the ExAC database, release 03, All populations
ExAC v03 MAF ALL Provide a minor allele frequency. Although values are similar, there are discrepancies between ANNOVAR ALL_population frequencies (used here) and the data available online. This is due to ANNOVAR using raw allele counts, versus the ExAC Browser using adjusted allele counts (DP >= 10 & GQ >= 20)
ExAC v03 nrChr ALL Provide a minimal number of chromosomes included for genotyping the variant in ALL populations
In ExAC v03 AFR Include/Exclude all variants present in the ExAC database, release 03, AFR population
ExAC v03 MAF AFR Provide a minor allele frequency. Although values are similar, there are discrepancies between ANNOVAR AFR_population frequencies (used here) and the data available online. This is due to ANNOVAR using raw allele counts, versus the ExAC Browser using adjusted allele counts (DP >= 10 & GQ >= 20)
ExAC v03 nrChr AFR Provide a minimal number of chromosomes included for genotyping the variant in AFR population
In ExAC v03 AMR Include/Exclude all variants present in the ExAC database, release 03, AMR population
ExAC v03 MAF AMR Provide a minor allele frequency. Although values are similar, there are discrepancies between ANNOVAR AMR_population frequencies (used here) and the data available online. This is due to ANNOVAR using raw allele counts, versus the ExAC Browser using adjusted allele counts (DP >= 10 & GQ >= 20)
ExAC v03 nrChr AMR Provide a minimal number of chromosomes included for genotyping the variant in AMR population
In ExAC v03 EAS Include/Exclude all variants present in the ExAC database, release 03, EAS population
ExAC v03 MAF EAS Provide a minor allele frequency. Although values are similar, there are discrepancies between ANNOVAR EAS_population frequencies (used here) and the data available online. This is due to ANNOVAR using raw allele counts, versus the ExAC Browser using adjusted allele counts (DP >= 10 & GQ >= 20)
ExAC v03 nrChr EAS Provide a minimal number of chromosomes included for genotyping the variant in EAS population
In ExAC v03 FIN Include/Exclude all variants present in the ExAC database, release 03, FIN population
ExAC v03 MAF FIN Provide a minor allele frequency. Although values are similar, there are discrepancies between ANNOVAR FIN_population frequencies (used here) and the data available online. This is due to ANNOVAR using raw allele counts, versus the ExAC Browser using adjusted allele counts (DP >= 10 & GQ >= 20)
ExAC v03 nrChr FIN Provide a minimal number of chromosomes included for genotyping the variant in FIN population
In ExAC v03 NFE Include/Exclude all variants present in the ExAC database, release 03, NFE population
ExAC v03 MAF NFE Provide a minor allele frequency. Although values are similar, there are discrepancies between ANNOVAR NFE_population frequencies (used here) and the data available online. This is due to ANNOVAR using raw allele counts, versus the ExAC Browser using adjusted allele counts (DP >= 10 & GQ >= 20)
ExAC v03 nrChr NFE Provide a minimal number of chromosomes included for genotyping the variant in NFE population
In ExAC v03 OTH Include/Exclude all variants present in the ExAC database, release 03, OTH population
ExAC v03 MAF OTH Provide a minor allele frequency. Although values are similar, there are discrepancies between ANNOVAR OTH_population frequencies (used here) and the data available online. This is due to ANNOVAR using raw allele counts, versus the ExAC Browser using adjusted allele counts (DP >= 10 & GQ >= 20)
ExAC v03 nrChr OTH Provide a minimal number of chromosomes included for genotyping the variant in OTH population
In ExAC v03 SAS Include/Exclude all variants present in the ExAC database, release 03, SAS population
ExAC v03 MAF SAS Provide a minor allele frequency. Although values are similar, there are discrepancies between ANNOVAR SAS_population frequencies (used here) and the data available online. This is due to ANNOVAR using raw allele counts, versus the ExAC Browser using adjusted allele counts (DP >= 10 & GQ >= 20)
ExAC v03 nrChr SAS Provide a minimal number of chromosomes included for genotyping the variant in SAS population
In Kaviar 150923 Include/Exclude all variants present in the Kaviar database, release 20150923
Kaviar 150923 MAF Provide a minor allele frequency for the Kaviar database, release 20150923
Kaviar 150923 nrChr Provide a minimal number of chromosomes included for genotyping the variant in Kaviar, release 20150923
In 1000g2012apr all DEPRECATED Include/Exclude all variants present in the 1000 Genomes Project, release 2012-04, all populations
1000g2012apr all MAF DEPRECATED Include/Exclude all variants present in the 1000 Genomes Project, release 2012-04, all populations, based on a minor allele frequency. Might be ambiguous for multi-allelic SNPs!
In 1000g2012apr afr DEPRECATED Include/Exclude all variants present in the 1000 Genomes Project, release 2012-04, afr population
1000g2012apr afr MAF DEPRECATED Include/Exclude all variants present in the 1000 Genomes Project, release 2012-04, afr population, based on a minor allele frequency. Might be ambiguous for multi-allelic SNPs!
In 1000g2012apr amr DEPRECATED Include/Exclude all variants present in the 1000 Genomes Project, release 2012-04, amr population
1000g2012apr amr MAF DEPRECATED Include/Exclude all variants present in the 1000 Genomes Project, release 2012-04, amr population, based on a minor allele frequency. Might be ambiguous for multi-allelic SNPs!
In 1000g2012apr asn DEPRECATED Include/Exclude all variants present in the 1000 Genomes Project, release 2012-04, asn population
1000g2012apr asn MAF DEPRECATED Include/Exclude all variants present in the 1000 Genomes Project, release 2012-04, asn population, based on a minor allele frequency. Might be ambiguous for multi-allelic SNPs!
In 1000g2012apr eur DEPRECATED Include/Exclude all variants present in the 1000 Genomes Project, release 2012-04, eur population
1000g2012apr eur MAF DEPRECATED Include/Exclude all variants present in the 1000 Genomes Project, release 2012-04, eur population, based on a minor allele frequency. Might be ambiguous for multi-allelic SNPs!
 

Location
Chromosomes Include/Exclude variants from selected chromosomes
Position Include/Exclude variants based on chromosomal position. Provide a chromosomal location without commas as thousands separators!
GenePanels Include/Exclude variants affecting genes (according to ANNOVAR RefSeq annotations) listed in the selected gene panels
Public GenePanels Include/Exclude variants affecting genes (according to ANNOVAR RefSeq annotations) listed in the selected gene panels. Presented panels were made available by other users.
RefSeq Is Hit Select only variants that have annotation for RefSeq (according to ANNOVAR RefSeq annotations; exonic/intronic/..., but not intergenic)
RefSeq Gene Symbol Include/Exclude variants affecting any of the listed genes (according to ANNOVAR RefSeq annotations). Provide a comma separated list of gene symbols (GAPDH,etc)
RefSeq Gene Transcript Include/Exclude variants affecting any of the listed transcript ids (according to ANNOVAR RefSeq annotations). Provide a comma separated list of gene transcripts (NM_003242.2,etc)
Ensembl GeneID Include/Exclude variants affecting any of the listed Ensembl GeneIDs (according to ANNOVAR EnsGene annotations). Provide a comma separated list of Ensembl geneIDs (ENSGxxx)
Ensembl Transcript Include/Exclude variants affecting any of the listed Ensembl TranscriptIDs (according to ANNOVAR EnsGene annotations). Provide a comma separated list of Ensembl transcripts (ENSTxxx)
UCSC Gene Symbol Include/Exclude variants affecting any of the listed Genes (according to ANNOVAR KnownGene annotations). Provide a comma separated list of gene symbols (GAPDH,etc)
UCSC Transcript Include/Exclude variants affecting any of the listed transcripts (according to ANNOVAR KnownGene annotations). Provide a comma separated list of UCSC transcripts (uc003cen.3,etc)
In Genomic SuperDup Include/Exclude variants located in Genomic SuperDups (UCSC Segmental Duplication Table)
 

Effect_On_Transcript
RefSeq VariantType Include/Exclude variants matching any of the selected variant types (frameshift, (non-)synonymous, stopgain, ...). Variants are selected if they match at least one transcript variant.
RefSeq GeneLocation Include/Exclude variants based on their position related to genes (intron, exon, splice, intergenic, ...). Variants are selected if they match at least one transcript variant.
Ensembl VariantType Include/Exclude variants matching any of the selected variant types (frameshift, (non-)synonymous, stopgain, ...). Variants are selected if they match at least one transcript variant.
Ensembl GeneLocation Include/Exclude variants based on their position related to genes (intron, exon, splice, intergenic, ...). Variants are selected if they match at least one transcript variant.
UCSC VariantType Include/Exclude variants matching any of the selected variant types (frameshift, (non-)synonymous, stopgain, ...). Variants are selected if they match at least one transcript variant.
UCSC GeneLocation Include/Exclude variants based on their position related to genes (intron, exon, splice, intergenic, ...). Variants are selected if they match at least one transcript variant.
Splicing scSNV11 ADA Ensemble splice alteration prediction, using Adaptive boosting, based on PWM, MaxEntScan, NNSplice and HSF. pmid: 25416802. Score represents probability of altered splicing (0-1), proposed cutoff is 0.6
Splicing scSNV11 RF Ensemble splice alteration prediction, using Random Forest Classification, based on PWM, MaxEntScan, NNSplice and HSF. pmid: 25416802. Score represents probability of altered splicing (0-1), proposed cutoff is 0.6
Splicing SPIDEX Zscore Deep Learning prediction of alternate splicing. Zscore is a ranking of the observed value among all predictions. Assuming normal distribution, this rank can be represented as a Zscore. Lower tail represents reduced splicing, upper tail represents a new splice site. This annotation source is only available for non-profit, academic users. Please send a signed copy of the EULA to activate it.
 

Genotype_Composition
SNV or Indel Include/Exclude single nucleotide variants, insertions or deletions only
Homozygous Or Heterozygous Include/Exclude variants based on the allelic composition
Genotype Ratio Allelic Ratio, as called by GATK Genotyper for samples of high ploidy. Use Ploidy annotation to interpret the output. For Mutect/VarScan samples, this value equals to the AllelicRatio based on (Mutect: FA, VarScan: FREQ)
Multi Allelic Position Multi-Allelic positions are defined as positions with more than one alternative allele.
Compound Heterozygous RefSeq Select genes (as defined by RefGene) that are hit by more than one variant matching all set criteria. There is no check that the variants come from different parents
Compound Heterozygous Ensembl Select genes (as defined by Ensembl) that are hit by more than one variant matching all set criteria. There is no check that the variants come from different parents
Somatic State Select variants of a specific somatic state. This is only useful for Mutect/VarScan VCF imports. States are 0:reference, 1:germline, 2:somatic, 3:LOH, 4:post-translational, 5:unknown
 

Quality
Variant PhredScore The Phred-scaled probability that a REF/ALT polymorphism exists at this site given the sequencing data. Because the Phred scale is -10 * log10(1-p), a value of 10 indicates a 1 in 10 chance of error, while a value of 100 indicates a 1 in 10^10 chance. The GATK values can grow very large when a lot of NGS data is used for calling.
Genotype PhredScore The Genotype Quality (GQ), a Phred-scaled confidence that the true genotype is the one provided in GT. In the diploid case, if GT is 0/1, then GQ is really L(0/1) / (L(0/0) + L(0/1) + L(1/1)), where L is the likelihood of the NGS sequencing data under the model that the sample is 0/0, 0/1, or 1/1.
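The Phred relation quoted for the Variant PhredScore can be checked with a small conversion sketch. The function names are hypothetical, and p denotes the probability that the call is correct, so 1-p is the error probability.

```python
import math

def phred_to_error_prob(q):
    """Error probability for a Phred-scaled score Q: 10^(-Q/10)."""
    return 10 ** (-q / 10)

def prob_to_phred(p_correct):
    """Phred score for a probability p of being correct: -10 * log10(1 - p)."""
    return -10 * math.log10(1 - p_correct)
```

A score of 10 corresponds to an error probability of 0.1 (1 in 10) and a score of 100 to 1e-10 (1 in 10^10), in line with the values given in the description.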
Reference Allele Depth Number of reads passing GATK quality threshold, with the reference allele
Alternative Allele Depth Number of reads passing GATK quality threshold, with the alternative allele
Total Depth Sum of Reference and Alternative Depth
Allelic Ratio Fraction of total reads called as alternative allele
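The relation between the depth fields above can be written out as a short sketch. The function name is hypothetical, and returning None for a position without reads is a choice made for this example only.

```python
def depth_metrics(ref_depth, alt_depth):
    """Return (total_depth, allelic_ratio) from the two allele depths.

    total_depth   = reference depth + alternative depth
    allelic_ratio = alternative depth / total depth
    """
    total = ref_depth + alt_depth
    ratio = alt_depth / total if total else None  # undefined without reads
    return total, ratio
```

For instance, 30 reference reads and 10 alternative reads give a total depth of 40 and an allelic ratio of 0.25.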
Mapping Quality Root Mean Square of the mapping quality of the reads across all samples.
BaseQuality RankSum The u-based z-approximation from the Mann-Whitney Rank Sum Test for base qualities (ref bases vs. bases of the alternate allele). Note that the base quality rank sum test can not be calculated for homozygous sites.
MappingQuality RankSum The u-based z-approximation from the Mann-Whitney Rank Sum Test for mapping qualities (reads with ref bases vs. those with the alternate allele) Note that the mapping quality rank sum test can not be calculated for homozygous sites.
ReadPosition RankSum The u-based z-approximation from the Mann-Whitney Rank Sum Test for the distance from the end of the read for reads with the alternate allele; if the alternate allele is only seen near the ends of reads this is indicative of error. Note that the read position rank sum test can not be calculated for homozygous sites.
StrandBias How much evidence is there for Strand Bias (the variation being seen on only the forward or only the reverse strand) in the reads? Higher SB values denote more bias (and therefore are more likely to indicate false positive calls).
Quality By Depth Variant confidence (given as (AB+BB)/AA from the PLs) / unfiltered depth. (PL: Phred-scaled likelihood of the genotype.) Low scores are indicative of false positive calls and artifacts. Note that QualByDepth requires sequencing reads associated with the samples with polymorphic genotypes.
Fisher Scaled StrandBias Phred-scaled p-value using Fisher Exact Test to detect strand bias (the variation being seen on only the forward or only the reverse strand) in the reads. More bias is indicative of false positive calls. Note that the fisher strand test may not be calculated for certain complex indel cases or for multi-allelic sites.
VQSLOD VQSLOD is the log odds ratio of being a true variant versus being false under the trained Gaussian mixture model when Variant Recalibration is applied.
DeltaPL This field provides the likelihoods of the given genotypes (here, 0/0, 0/1, and 1/1). These are normalized, Phred-scaled likelihoods for each of 0/0, 0/1, and 1/1, without priors. To be concrete, for the heterozygous case, this is L(data given that the true genotype is 0/1). The most likely genotype (given in the GT field) is scaled so that its P = 1.0 (0 when Phred-scaled), and the other likelihoods reflect their Phred-scaled likelihoods relative to this most likely genotype.
Tranches Filter Content from VCF FILTER field. In case of GATK VQSR this is the confidence tranche.
Variant In Stretch Select variants located in a stretch of repetitive sequence, as indicated by recent GATK versions. The annotations StretchUnit and StretchLength give more information about the stretch.
Stretch Length Select variants based on the stretch repeat length (e.g. homopolymer stretches). Both alleles must fulfill this criterion.
 

Mutation_Effect_Predictions
dbSNP v135 Clinical DEPRECATED Select Variants from dbSNPv135 that are tagged as clinically relevant. This is a recent feature and contains many unclear entries
dbSNP v137 Clinical DEPRECATED Select Variants from dbSNPv137 that are tagged as clinically relevant. This is a recent feature and contains many unclear entries
dbSNP v138 Clinical Select Variants from dbSNPv138 that are tagged as clinically relevant. This is a recent feature and contains many unclear entries
LJB GERP Score (DEPRECATED) GERP is a conservation score. A high score is indicative of a constrained site. (See Davydov et al, 2010, PLoS Computational Biology)
LJB LRT Score (DEPRECATED) Rescaled (0-1) likelihood ratio test of codon constraint. Higher scores indicate more constrained codons. See dbNSFP paper for details (pmid: 21520341)
LJB Mutation Taster Score (DEPRECATED) Mutation Taster has four categories: (A)utomatic and (D)isease causing, for which the score is the p-value for a true prediction; (N)on-deleterious and (P)olymorphism known, for which the score is 1 - p-value for a true prediction. The higher the score, the more likely the variant is deleterious. See dbNSFP paper for details (pmid: 21520341)
LJB PhyloP Score (DEPRECATED) Rescaled (0-1) PhyloP score. Higher scores indicate more conserved sites. A site is predicted as conserved if the rescaled score is higher than 0.95. See dbNSFP paper for details (pmid: 21520341). Note: Vissers et al reported a PhyloP threshold of 2.5, rescaled to 0.998, for deleterious variants.
LJB PolyPhen2 Score (DEPRECATED) Polyphen2 score. Higher scores indicate more likely damaging sites. Prediction is probably damaging if higher than 0.85, and possibly damaging between 0.15 and 0.85. See dbNSFP paper for details (pmid: 21520341).
LJB Sift Score (DEPRECATED) Rescaled Sift score. Higher scores indicate more likely damaging sites. Prediction is damaging if higher than 0.95. See dbNSFP paper for details (pmid: 21520341).
CADD Raw Score CADD C-Scores. Integrated pathogenicity prediction score. Raw scores, see publication for details (pmid: 24487276).
CADD Phred Score CADD C-Scores in Phred Scale. E.g.: Scores above 20 are in the 1% top scoring variants, scores above 30 are in the 0.1% top scoring variants (pmid: 24487276).
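The two examples above follow from the usual Phred relation, where a score s corresponds to the top 10^(-s/10) fraction of variants. A minimal sketch, with a hypothetical function name:

```python
def cadd_phred_to_top_fraction(phred_score: float) -> float:
    """Return the fraction of top-scoring variants implied by a Phred-scaled score."""
    return 10.0 ** (-phred_score / 10.0)
```

Multiplying the result by 100 gives the percentage quoted in the description, so a score of 20 maps to 1% and a score of 30 to 0.1%.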
WebTool Sift Score Rescaled Sift score. Higher Score for more likely damaging sites. Prediction is damaging if higher than 0.95. Scores are obtained from the PROVEAN WebTool.
WebTool PROVEAN Score PROVEAN score. Score below -2.5 indicates likely damaging sites. Scores are obtained from the PROVEAN WebTool.
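WebTool Grantham Score Grantham score, a measure of the physicochemical difference between the reference and the alternative amino acid. Higher scores indicate more radical substitutions. Scores are obtained from the PROVEAN WebTool.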
WebTool MutationTaster Score Mutation Taster has four categories: (A)utomatic and (D)isease causing, for which the score is the p-value for a true prediction; (N)on-deleterious and (P)olymorphism known, for which the score is 1 - p-value for a true prediction. The higher the score, the more likely the variant is deleterious. A high score for class P might indicate a false positive in reference databases such as dbSNP. Scores are obtained from the MTQE.
dbnsfp30a SIFT Score dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between 0 and 1, with lower values being more likely to represent pathogenicity. The proposed threshold for SIFT is 0.05
dbnsfp30a pph2 HDIV Score dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between 0 and 1, with higher values being more likely to represent pathogenicity. The proposed threshold for PolyPhen2_HDIV is 0.5 (neutral vs non-neutral), or 0.957 and 0.453 as the cutoffs separating three classes (probably damaging, possibly damaging, neutral).
dbnsfp30a pph2 HVAR Score dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between 0 and 1, with higher values being more likely to represent pathogenicity. The proposed threshold for PolyPhen2_HVAR is 0.5 (neutral vs non-neutral), or 0.909 and 0.447 as the cutoffs separating three classes (probably damaging, possibly damaging, neutral).
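The three-class cutoffs quoted for PolyPhen2 HDIV and HVAR can be applied as in the sketch below. The function and constant names are hypothetical, and treating a score exactly equal to a cutoff as belonging to the more severe class is an assumption.

```python
from typing import Tuple

# (probably damaging cutoff, possibly damaging cutoff), as quoted in the text.
HDIV_CUTOFFS: Tuple[float, float] = (0.957, 0.453)
HVAR_CUTOFFS: Tuple[float, float] = (0.909, 0.447)


def classify_polyphen2(score: float, cutoffs: Tuple[float, float] = HDIV_CUTOFFS) -> str:
    """Map a PolyPhen2 score to one of the three classes."""
    probably, possibly = cutoffs
    if score >= probably:  # assumption: the boundary value is inclusive
        return "probably damaging"
    if score >= possibly:
        return "possibly damaging"
    return "neutral"
```

The same score can fall into different classes under the two models, for instance 0.92 is above the HVAR cutoff but below the HDIV cutoff for "probably damaging".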
dbnsfp30a LRT pred dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between 0 and 1, with higher values being more likely to represent pathogenicity. The likelihood ratio test is a composite decision logic, providing benign, damaging or unclear classes.
dbnsfp30a MutationTaster Score dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between 0 and 1, with higher values being more likely to represent pathogenicity. The proposed threshold for MutationTaster is 0.5
dbnsfp30a MutationAssessor Score dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between -5.545 and 5.975, with higher values being more likely to represent pathogenicity. The proposed threshold for MutationAssessor is 0.65
dbnsfp30a FATHMM Score dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between -16.13 and 10.64, with smaller values being more likely to represent pathogenicity. The proposed threshold for FATHMM is -1.5
dbnsfp30a FATHMM MKL Score dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between 0 and 1, with higher values being more likely to represent pathogenicity. The proposed threshold for FATHMM_MKL_coding is 0.5
dbnsfp30a PROVEAN Score dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between -14 and 14, with smaller values being more likely to represent pathogenicity. The proposed threshold for PROVEAN is -2.5
dbnsfp30a VEST3 Score dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between 0 and 1, with higher values being more likely to represent pathogenicity. The proposed threshold for VEST3 is not provided
dbnsfp30a CADD Phred dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Higher values are more likely to represent pathogenicity. CADD_phred scores are Phred-scaled ranks: a score of s places a variant in the top 10^(-s/10) fraction of most pathogenic variants (e.g. a score of 20 corresponds to the top 1%). NOTE: dbNSFP only provides scores for EXONIC SNVs!
dbnsfp30a DANN Score dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between 0 and 1, with higher values being more likely to represent pathogenicity. The proposed threshold for DANN is not provided
dbnsfp30a MetaSVM Score dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between -2 and 3, with higher values being more likely to represent pathogenicity. The proposed threshold for MetaSVM is 0
dbnsfp30a MetaLR Score dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between 0 and 1, with higher values being more likely to represent pathogenicity. The proposed threshold for MetaLR is 0.5
dbnsfp30a Integrated fitCons Score dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between 0 and 1, with higher values being more likely to represent pathogenicity. The proposed threshold for fitCons is not provided
dbnsfp30a GERP RS dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between -12.3 and 6.17. Higher values more likely represent pathogenicity. The proposed threshold for GERP is 4.4
dbnsfp30a PhyloP7 vert dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions are between -5.172 and 1.062. Higher values more likely represent pathogenicity. The proposed threshold for PhyloP is not provided
dbnsfp30a PhyloP20 mam dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions range between -13.282 and 1.199. Higher values more likely represent pathogenicity. The proposed threshold for PhyloP is not provided
dbnsfp30a SiPhy 29way dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Predictions range between 0 and 37.9718. Higher values more likely represent pathogenicity. The proposed threshold for SiPhy is not provided
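Because some dbNSFP scores are damaging when low (SIFT, FATHMM, PROVEAN) and others when high, a filter has to track the direction together with the threshold. The sketch below collects the proposed thresholds listed above in one table. The dictionary keys, the function name, and the use of strict comparisons at the threshold are assumptions for illustration.

```python
from typing import Dict, Tuple

# tool -> (proposed threshold, True if HIGHER scores suggest pathogenicity)
PROPOSED_THRESHOLDS: Dict[str, Tuple[float, bool]] = {
    "SIFT": (0.05, False),
    "MutationTaster": (0.5, True),
    "MutationAssessor": (0.65, True),
    "FATHMM": (-1.5, False),
    "FATHMM_MKL": (0.5, True),
    "PROVEAN": (-2.5, False),
    "MetaSVM": (0.0, True),
    "MetaLR": (0.5, True),
    "GERP_RS": (4.4, True),
}


def predicted_damaging(tool: str, score: float) -> bool:
    """Apply the proposed threshold for a tool, respecting its direction."""
    threshold, higher_is_worse = PROPOSED_THRESHOLDS[tool]
    # Assumption: a score exactly at the threshold is not counted as damaging.
    return score > threshold if higher_is_worse else score < threshold
```

Tools without a proposed threshold (VEST3, DANN, fitCons, PhyloP, SiPhy) are left out of the table, so a lookup for them raises a KeyError rather than guessing a cutoff.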
 

snpEff
Effect SnpEff annotation on variant effect, based on Ensembl (stop_lost, intronic, splice_site, UTR, exon, ...)
Effect Impact SnpEff annotation on variant impact, based on Ensembl (High, Moderate, Modifier, Low). For more information see: http://snpeff.sourceforge.net/SnpEff_manual.html#eff
Functional Class SnpEff annotation on variant class, based on Ensembl (None, Silent, Missense, Nonsense).
Gene Coding Include/Exclude variants in Coding/NonCoding genes, according to SnpEff annotation, based on Ensembl
Transcript BioType Various biological types, according to Ensembl based SnpEff annotations (gene, pseudogene, lncRNA, ...)
Gene Symbol Provide a comma separated list of gene symbols (GAPDH, etc.)
Gene Transcript Provide a comma separated list of Ensembl transcripts (ENSTxxx)
 

ClinVar_SNPs
Match Type Include/Exclude variants with exact or overlapping matches, at nucleotide level, to variants in ClinVar. Exact means same position, size, and alternative allele.
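The difference between an exact and an overlapping match can be sketched as follows. The `Variant` class and `match_type` function are hypothetical, "size" is taken to mean the length of the reference allele, and VCF-style coordinates with a non-empty reference allele are assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    chrom: str
    start: int  # 1-based position of the first reference base
    ref: str
    alt: str

    @property
    def end(self) -> int:
        # Last reference base covered by the variant.
        return self.start + len(self.ref) - 1


def match_type(query: Variant, known: Variant) -> str:
    """Return 'exact', 'overlap' or 'none' for a query against a known variant."""
    if query.chrom != known.chrom:
        return "none"
    same_site = (query.start, len(query.ref)) == (known.start, len(known.ref))
    if same_site and query.alt == known.alt:
        return "exact"
    # Two closed intervals overlap when each starts before the other ends.
    if query.start <= known.end and known.start <= query.end:
        return "overlap"
    return "none"
```

A variant at the same position with a different alternative allele is therefore reported as an overlap, not as an exact match.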
Amino Acid Match Type Include/Exclude variants with exact or overlapping matches, at amino acid level, to variants in ClinVar. Exact means same position, size, and alternative amino acid.
Variant Effect Include/Exclude variants based on effect annotation in ClinVar (stop-gain, UTR, synonymous, ...)
Classification Include/Exclude variants overlapping variants of specified pathogenicity class in ClinVar (benign, risk factor, pathogenic, protective, ...)
Gene Symbol Provide a comma separated list of gene symbols (GAPDH, etc.)
Disease Provide a fragment of text to match the ClinVar disease linked to the variant.
 

Gene_Ontology
Associated To Any GO ID Return variants that affect genes associated to a Gene-Ontology term.
Associated To Specified GO ID Provide a comma separated list of GO-identifiers (GO:2001295) associated to the gene affected by the returned variant.
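A comma separated list of GO identifiers could be cleaned up before querying as in the sketch below. The function name is hypothetical, and how the filter itself handles whitespace or malformed entries is not documented here, so the validation step is an assumption. GO identifiers consist of the prefix GO: followed by seven digits.

```python
import re
from typing import List

GO_ID_PATTERN = re.compile(r"GO:\d{7}")


def parse_go_ids(raw: str) -> List[str]:
    """Split a comma separated string into validated GO identifiers."""
    # Trim whitespace around each entry and skip empty entries.
    ids = [item.strip() for item in raw.split(",") if item.strip()]
    invalid = [item for item in ids if not GO_ID_PATTERN.fullmatch(item)]
    if invalid:
        raise ValueError(f"Not valid GO identifiers: {invalid}")
    return ids
```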
Associated To Text Matched Term Provide a fragment of text to match GO terms associated to the gene affected by the returned variant.
 

Oncology_Specific
In COSMIC v70 Include/Exclude all variants present in COSMICv70
COSMIC v70 Occurence in tissue Include/Exclude all variants present in a specific tissue in COSMICv70, with a minimal/maximal occurrence count
Genotype Ratio Include/Exclude variants based on the genotype ratio. Range is 0 to 1. Relates to the number of alternate alleles vs the ploidy.
Ploidy Sample Ploidy used during genotyping. By default, this is two.
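Reading the genotype ratio as the number of alternate alleles divided by the ploidy, it can be derived from a VCF-style GT string as sketched below. The function name is hypothetical, and returning None for missing or inconsistent genotypes is an assumption.

```python
import re
from typing import Optional


def genotype_ratio(gt: str, ploidy: int = 2) -> Optional[float]:
    """Return the fraction of alternate alleles in a genotype such as '0/1'."""
    # Alleles are separated by '/' (unphased) or '|' (phased).
    alleles = re.split(r"[/|]", gt)
    if len(alleles) != ploidy or "." in alleles:
        return None
    # Any allele index other than 0 counts as an alternate allele.
    alt_count = sum(1 for allele in alleles if allele != "0")
    return alt_count / ploidy
```

With the default ploidy of two, the possible values are 0, 0.5 and 1. Higher ploidy settings allow intermediate ratios such as 0.25.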
 

User_Provided
Diagnostic Class (Current Sample) Retrieve variants previously assigned by a user to the requested diagnostic class.
Validation (Current Sample) Retrieve variants validated by the requested methods.
Inheritance (Current Sample) Retrieve variants previously set by a user to the requested inheritance.
Diagnostic Class (All Samples) Retrieve variants previously assigned by a user to the requested diagnostic class.
Diagnostic Class (By Project) Retrieve variants previously assigned by a user to the requested diagnostic class, limited to specific projects.
Validation (All Samples) Retrieve variants validated by the requested methods.
Inheritance (All Samples) Retrieve variants previously set by a user to the requested inheritance.
 

Custom_VCF_Fields
help No description available