E-values and Bit-scores in BLAST (2024)

Table of Contents
E-value Bit-score FAQs
  • E-value
  • Bit-score

E-value

The BLAST E-value is the number of expected hits of similar quality (score) that could be found just by chance.

An E-value of 10 means that up to 10 hits of this quality can be expected by chance alone in a random database of the same size.

The E-value can be used as a first quality filter for BLAST search results: the -evalue option keeps only hits whose E-value is equal to or better (smaller) than the given threshold. BLAST results are sorted by E-value by default (best hit on the first line).

blastn -query genes.ffn -subject genome.fna -evalue 1e-10 

The smaller the E-value, the better the match.

-evalue 1e-50  

small E-value: low number of hits, but of high quality

BLAST hits with an E-value smaller than 1e-50 are database matches of very high quality.

-evalue 0.01

BLAST hits with an E-value smaller than 0.01 can still be considered good hits for homology matches.

-evalue 10 (default)

large E-value: many hits, partly of low quality

A threshold of 10 lets through hits that cannot be considered significant, but that may give an idea of potential relationships.
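The same threshold filtering can also be applied after the search. The sketch below assumes the search was run with BLAST+ tabular output (`-outfmt 6`, which is not used in the command above) and its default twelve columns; the function name `filter_hits` and the example rows are made up for illustration.

```python
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
OUTFMT6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def filter_hits(lines, max_evalue):
    """Return hits (as dicts) whose E-value is <= max_evalue."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(OUTFMT6_COLUMNS, line.split("\t")))
        hit["evalue"] = float(hit["evalue"])
        hit["bitscore"] = float(hit["bitscore"])
        if hit["evalue"] <= max_evalue:
            hits.append(hit)
    return hits


# Hypothetical rows, for illustration only (not real search results).
EXAMPLE = (
    "geneA\tcontig1\t99.5\t900\t4\t0\t1\t900\t100\t999\t0.0\t1640\n"
    "geneB\tcontig2\t85.0\t200\t30\t0\t1\t200\t50\t249\t3e-40\t165\n"
    "geneC\tcontig3\t78.0\t40\t9\t0\t5\t44\t10\t49\t0.5\t30.2\n"
)

strict = filter_hits(io.StringIO(EXAMPLE), max_evalue=1e-50)
relaxed = filter_hits(io.StringIO(EXAMPLE), max_evalue=0.01)
print([h["qseqid"] for h in strict])   # ['geneA']
print([h["qseqid"] for h in relaxed])  # ['geneA', 'geneB']
```

A strict threshold keeps only the near-perfect hit, while 0.01 also keeps the weaker but still plausible one.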

The E-value (expectation value) is a corrected bit-score adjusted to the sequence database size. The E-value therefore depends on the size of the sequence database that is searched. Since large databases increase the chance of false positive hits, the E-value corrects for this higher chance; it is a correction for multiple comparisons. This means that the same sequence hit gets a better (smaller) E-value when it is found in a smaller database.

$E = m \cdot n / 2^{\text{bit-score}}$

        $m$ - query sequence length

        $n$ - total database length (sum of all sequences)
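As a rough illustration of this formula, the sketch below computes the E-value for one bit-score in two databases of different size. The function name and the lengths are made up for the example, and real BLAST output uses slightly shortened "effective" lengths, so reported values will differ somewhat.

```python
def evalue_from_bitscore(bit_score, query_len, db_len):
    """E = m * n / 2**bit_score, with m = query length, n = total database length."""
    return query_len * db_len / 2.0 ** bit_score


# Hypothetical numbers: a 300-letter query, bit-score 50.
small_db = evalue_from_bitscore(50, query_len=300, db_len=1_000_000)
large_db = evalue_from_bitscore(50, query_len=300, db_len=1_000_000_000)

print(f"small database: E = {small_db:.2e}")
print(f"large database: E = {large_db:.2e}")

# A database 1000 times larger gives an E-value 1000 times larger,
# and one extra bit of score halves the E-value.
one_more_bit = evalue_from_bitscore(51, query_len=300, db_len=1_000_000)
print(large_db / small_db, small_db / one_more_bit)
```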

Bit-score

The higher the bit-score, the better the sequence similarity

The bit-score indicates the required size of a sequence database in which the current match could be found just by chance. The bit-score is a log2-scaled and normalized raw score. Each increase by one doubles the required database size ($2^{\text{bit-score}}$).

The bit-score does not depend on database size. It gives the same value for hits in databases of different sizes and hence can be used for searching in a constantly growing database.

From http://www.metagenomics.wiki/tools/blast/evalue

The E-value provides information about the likelihood that a given sequence match is purely by chance. The lower the E-value, the less likely the database match is a result of random chance and therefore the more significant the match is. Empirical interpretation of the E-value is as follows. If E < 1e-50 (or 1 × 10^-50), there should be an extremely high confidence that the database match is a result of homologous relationships. If E is between 0.01 and 1e-50, the match can be considered a result of homology. If E is between 0.01 and 10, the match is considered not significant, but may hint at a tentative remote homology relationship. Additional evidence is needed to confirm the tentative relationship. If E > 10, the sequences under consideration are either unrelated or related by extremely distant relationships that fall below the limit of detection with the current method. Because the E-value is proportionally affected by the database size, an obvious problem is that as the database grows, the E-value for a given sequence match also increases.
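The empirical ranges in the paragraph above can be written as a small lookup. This is only a sketch of that rule of thumb: the function name is made up, and how the exact boundary values (1e-50, 0.01, 10) are assigned is an assumption, since the text does not say which side they belong to.

```python
def interpret_evalue(evalue):
    """Map an E-value to the empirical category described in the text."""
    if evalue < 0:
        raise ValueError("E-values are expected counts and cannot be negative")
    if evalue < 1e-50:
        return "extremely high confidence of homology"
    if evalue < 0.01:
        return "homology"
    if evalue <= 10:
        return "not significant, possible remote homology"
    return "unrelated or below the limit of detection"


for e in (0.0, 3e-60, 2e-12, 0.5, 25):
    print(e, "->", interpret_evalue(e))
```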

A bit score is another prominent statistical indicator used in addition to the E-value in a BLAST output. The bit score measures sequence similarity independent of query sequence length and database size and is normalized based on the raw pairwise alignment score. The bit score (S′) is determined by the following formula: S′ = (λ × S − ln K) / ln 2, where λ is the Gumbel distribution constant, S is the raw alignment score, and K is a constant associated with the scoring matrix used. Clearly, the bit score (S′) is linearly related to the raw alignment score (S). Thus, the higher the bit score, the more highly significant the match is. The bit score provides a constant statistical indicator for searching different databases of different sizes or for searching the same database at different times as the database enlarges.
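The bit score formula can be sketched in a few lines. The values of λ and K below are placeholders of the magnitude BLAST reports for a gapped protein search; they are an assumption for illustration and should be read from the statistics footer of an actual BLAST report. The sketch also checks that the raw-score form of the E-value, K · m · n · e^(−λS), agrees with m · n / 2^(bit score).

```python
import math

# Illustrative constants only (assumed); real values depend on the
# scoring matrix and gap penalties and are printed in the BLAST report.
LAMBDA = 0.267
K = 0.041


def bit_score(raw_score, lam=LAMBDA, k=K):
    """Bit score = (lambda * S - ln K) / ln 2 for raw alignment score S."""
    return (lam * raw_score - math.log(k)) / math.log(2)


raw = 100
bits = bit_score(raw)
print(f"raw score {raw} -> bit score {bits:.1f}")

# Both forms of the E-value give the same number for the same search space.
m, n = 300, 1_000_000  # hypothetical query and database lengths
e_from_raw = K * m * n * math.exp(-LAMBDA * raw)
e_from_bits = m * n / 2.0 ** bits
print(f"E from raw score: {e_from_raw:.3e}, E from bit score: {e_from_bits:.3e}")
```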

From https://www.biostars.org/p/187230/

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3820096/

The bit-score provides a better rule of thumb for inferring homology. For average-length proteins, a bit score of 50 is almost always significant. A bit score of 40 is only significant (E() < 0.001) in searches of protein databases with fewer than 7000 entries. Increasing the score by 10 b
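The numbers in this rule of thumb can be checked roughly with E = m · n / 2^(bit score). The sketch assumes an average protein length of 400 residues for both the query and the database entries; that figure is an assumption made here, not a value given in the text.

```python
AVG_PROTEIN_LEN = 400  # assumed average length, in residues


def evalue(bit_score, n_entries, avg_len=AVG_PROTEIN_LEN):
    """E-value for an average-length query against n_entries average-length proteins."""
    query_len = avg_len
    db_len = n_entries * avg_len
    return query_len * db_len / 2.0 ** bit_score


e40 = evalue(40, n_entries=7000)
e50 = evalue(50, n_entries=7000)
print(f"bit score 40, 7000 entries: E = {e40:.1e}")  # close to 0.001
print(f"bit score 50, 7000 entries: E = {e50:.1e}")  # about 1000 times smaller
```

Under this assumption a bit score of 40 sits right at E ≈ 0.001 for a database of about 7000 entries, which matches the statement above.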

From https://www.biostars.org/p/187230/


FAQs

What is the E-value and bit score in BLAST?

Bit scores are normalized, which means that the bit scores from different alignments can be compared, even if different scoring matrices have been used. The E-value gives an indication of the statistical significance of a given pairwise alignment and reflects the size of the database and the scoring system used.

How do you interpret the E-value in BLAST search results?

BLAST results are sorted by E-value by default (best hit on the first line). The smaller the E-value, the better the match. BLAST hits with an E-value smaller than 1e-50 are database matches of very high quality. BLAST hits with an E-value smaller than 0.01 can still be considered good hits for homology matches.

What does an E-value of 0.0 mean in BLAST?

The E-value is the number of alignments with an equivalent or greater score that are expected to occur in a database by chance. An E-value of 0.0 means that this expected number is so small that BLAST reports it as zero. The lower the E-value, the more significant the score, so such a hit is of the highest quality the search can report.

What is the E-value in PSI-BLAST?

The e-value is a parameter that describes the number of hits one can “expect” to see by chance when searching a database of a particular size. It decreases exponentially with the score (S) that is assigned to a match between two sequences.

What is a good E-value?

The E-value is the expectation value that indicates the number of alignments with a score ≥ S that one can expect to find by chance in a database of size N. Hence, the E-value is dependent on the database size and the query length. The closer the E-value is to 0, the better the alignment. For E < 1e−2 (= 1 × 10^−2 = 0.01), P ≈ E.

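What is considered a large E-value?

In BLAST, an E-value of about 1 or more is large: at least one hit of that score is expected by chance alone, so the match is not significant on its own. The statement that a large E-value implies considerable unmeasured confounding would be needed to explain away an effect estimate, and a small E-value implies little, refers to a different quantity of the same name used in sensitivity analysis of observational studies. It does not apply to BLAST.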

How to interpret BLAST results?

Interpreting BLAST Results. BLAST results show all of the taxa that share sequence similarity with the query sequence based on the selected database. The results page includes a search summary, hit description table, graphic summary, and alignments that can help determine the quality or accuracy of a given hit.

Can E-values be negative in BLAST?

E-values are expected counts, not probabilities, so they can be greater than 1, but they cannot be lower than 0. If NCBI BLAST returns an E-value of, say, "4e-19", it means 4⋅10^−19, which is a very small positive number, not a negative value.

How do you interpret E numbers?

In statistics, the symbol e is a mathematical constant approximately equal to 2.71828183. In a value such as 2.3e-5, however, the letter e marks scientific notation and is not that constant. Programs such as Prism switch to scientific notation when the values are very large or very small. For example, 2.3e-5 means 2.3 times ten to the minus five power, or 0.000023.

What is a bad E-value?

10e-10 < E-value < 1: could be a true homologue, but it is a gray area. E-value > 1: proteins are most likely not related. E-value > 10: hits are most likely junk unless the query sequence is very short.

What does an E-value of 1 mean?

Interpreting E-values

The E-value describes the number of hits we expect to see by chance when BLASTing a database. It helps us understand if our hits are relatively unique or not. For example, an E-value of 1 means that one expects by chance to see 1 match with a similar score.

What is the difference between P-value and E-value in BLAST?

The BLAST programs report E-value rather than P-values because it is easier to understand the difference between, for example, E-value of 5 and 10 than P-values of 0.993 and 0.99995. However, when E < 0.01, P-values and E-value are nearly identical.
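The numbers in this answer follow from the standard relation P = 1 − e^(−E), the probability of seeing at least one chance hit when E are expected. A minimal sketch (the function name is made up for the example):

```python
import math


def pvalue_from_evalue(evalue):
    """P = 1 - exp(-E); expm1 keeps precision for very small E."""
    return -math.expm1(-evalue)


for e in (10, 5, 0.01, 1e-10):
    print(f"E = {e:g} -> P = {pvalue_from_evalue(e):.6g}")
```

For E = 5 and E = 10 this gives P of about 0.993 and 0.99995, which are hard to tell apart, while for E = 0.01 the P-value is about 0.00995, nearly the same as E.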

What is the difference between E-value and expect threshold?

The expect threshold is the expected number of chance matches in a random model and is used as the cutoff for reporting hits. The E-value shows the expected number of chance hits with a given score for an individual match.

Why is PSI-BLAST better?

PSI-BLAST may be more sensitive than BLAST, meaning that it might be able to find distantly related sequences that are missed in a BLAST search.

What is the max score on BLAST?

Max[imum] Score: the highest alignment score calculated from the sum of the rewards for matched nucleotides and penalties for mismatches and gaps.

What is the bit score?

A bit score is another prominent statistical indicator used in addition to the E-value in a BLAST output. The bit score measures sequence similarity independent of query sequence length and database size and is normalized based on the raw pairwise alignment score.

What is an e-value in statistics?

In statistical hypothesis testing, e-values quantify the evidence in the data against a null hypothesis (e.g., "the coin is fair", or, in a medical context, "this new treatment has no effect"). They serve as a more robust alternative to p-values, addressing some shortcomings of the latter. These e-values are a different concept from the BLAST E-value, which is an expected number of chance hits.

What does the value 6e12 mean?

6e12 is shorthand for 6 x 10^12, which means 6 multiplied by 10 to the power of 12. To calculate the value of 6e12, first note that 10 to the power of 12 is equal to 1000000000000. Then, multiply 6 by 1000000000000 to get 6000000000000. Therefore, 6e12 is equal to 6000000000000. A BLAST E-value written as 6e-12 uses the same notation with a negative exponent: 6 x 10^-12.

Article information

Author: Lilliana Bartoletti
