Sensitivity (tests)
Latest revision as of 22:11, 9 January 2020
Editor-In-Chief: C. Michael Gibson, M.S., M.D.; Assistant Editor(s)-In-Chief: Kristin Feeney, B.S.
Overview
Sensitivity is a statistical measure of how well a binary classification test correctly identifies a condition[1]. In epidemiology, it describes how well medical screening tests detect preclinical disease. In quality control, it is referred to as the recall rate, whereby factories decide whether a new product is at an acceptable level to be mass-produced and sold for distribution.
Critical Considerations
- The results of the screening test are compared against an absolute reference (the gold standard). For example, for a medical test that determines whether a person has a certain disease, the sensitivity to the disease is the probability that the test will be positive if the person has the disease.
- The sensitivity is the proportion of true positives among all diseased cases in the population. It is a parameter of the test.
- High sensitivity is required when early diagnosis and treatment are beneficial, and when the disease is infectious.
Worked Example
Definition
- <math>{\rm sensitivity}=\frac{\rm number\ of\ True\ Positives}{{\rm number\ of\ True\ Positives}+{\rm number\ of\ False\ Negatives}}.</math>
A sensitivity of 100% means that the test recognizes all sick people as such.
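The definition above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation; the function name `sensitivity` and the example counts are assumptions chosen for the sketch.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Proportion of diseased cases that the test correctly flags as positive."""
    diseased = true_positives + false_negatives
    if diseased == 0:
        # With no diseased cases the ratio is undefined.
        raise ValueError("sensitivity is undefined when there are no diseased cases")
    return true_positives / diseased


# Hypothetical counts: 90 of 100 diseased people test positive.
print(sensitivity(90, 10))  # 0.9
```

A result of 1.0 corresponds to the 100% case described above: every sick person is recognized as such.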
Sensitivity alone does not tell us how well the test identifies the other class (that is, the negative cases). In binary classification, as illustrated above, this is measured by the corresponding specificity, or equivalently, the sensitivity for the other class.
Sensitivity is not the same as the positive predictive value (ratio of true positives to combined true and false positives), which is as much a statement about the proportion of actual positives in the population being tested as it is about the test.
The calculation of sensitivity does not take into account indeterminate test results. If a test cannot be repeated, the options are to exclude indeterminate samples from analyses (but the number of exclusions should be stated when quoting sensitivity), or, alternatively, indeterminate samples can be treated as false negatives (which gives the worst-case value for sensitivity and may therefore underestimate it).
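The two ways of handling indeterminate results can be compared side by side. The sketch below assumes that only indeterminate results from diseased individuals matter for sensitivity; the function name and the counts are hypothetical.

```python
def sensitivity_with_indeterminates(true_positives: int,
                                    false_negatives: int,
                                    indeterminate_diseased: int) -> tuple[float, float]:
    """Return (sensitivity excluding indeterminates, worst-case sensitivity).

    Assumption: `indeterminate_diseased` counts indeterminate results among
    people who truly have the disease according to the gold standard.
    """
    determinate = true_positives + false_negatives
    if determinate == 0:
        raise ValueError("no determinate results among diseased cases")
    # Option 1: drop indeterminate samples (their number should still be reported).
    excluded = true_positives / determinate
    # Option 2: count every indeterminate sample as a false negative.
    worst_case = true_positives / (determinate + indeterminate_diseased)
    return excluded, worst_case


# Hypothetical counts: 80 true positives, 10 false negatives, 10 indeterminate.
excluded, worst_case = sensitivity_with_indeterminates(80, 10, 10)
print(f"excluding indeterminates: {excluded:.3f}")
print(f"indeterminates as false negatives: {worst_case:.3f}")
```

Reporting both values, together with the number of exclusions, shows the reader how much the estimate depends on the handling of indeterminate samples.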
SPPIN and SNNOUT
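| | SPPIN | SNNOUT | Neither | Near-perfect |
|---|---|---|---|---|
| Proposed definition | Sp > 95% | Sn > 95% | Both < 95% | Both > 99% |
| Example | Many physical dx findings | Ottawa fracture rules[2] | Exercise treadmill test[3] | HIV-1/HIV-2 4th gen test[4] |
| Predictive values: | | | | |
| 10% pretest prob | PPV = 35%, NPV = 99% | PPV = 64%, NPV = 98% | PPV = 31%, NPV = 97% | PPV = 92%, NPV > 99% |
| 50% pretest prob | PPV = 94%, NPV = 83% | PPV = 83%, NPV = 94% | PPV = 80%, NPV = 80% | PPV = 99%, NPV = 99% |
| 90% pretest prob | PPV = 98%, NPV = 64% | PPV = 99%, NPV = 35% | PPV = 97%, NPV = 31% | PPV > 99%, NPV = 92% |
| Clinical messages | Accept test result when: (1) it confirms your suspicion; (2) maybe when pretest was a toss-up | Accept test result when: (1) it confirms your suspicion; (2) maybe when pretest was a toss-up | Accept test result when: (1) it confirms a strong suspicion | Accept test result unless: (1) it contradicts a strong suspicion |

Notes: Values that are more likely to be trustworthy (green font in the original table) are the NPVs at 10% pretest probability, the PPVs at 90% pretest probability, and both values for the near-perfect test at 50% pretest probability. SPPIN/SNNOUT errors, where you should be suspicious of a SPPIN/SNNOUT result (red font in the original table), are the SPPIN PPV of 35% at 10% pretest probability and the SNNOUT NPV of 35% at 90% pretest probability.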
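Predictive values such as those in the table follow from sensitivity, specificity, and pretest probability through Bayes' theorem. The sketch below is illustrative only: the function name is an assumption, and it uses the rough 80% sensitivity and specificity quoted for the exercise treadmill test (the "Neither" column).

```python
def predictive_values(sens: float, spec: float, pretest: float) -> tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given pretest probability."""
    if not 0 < pretest < 1:
        raise ValueError("pretest probability must be strictly between 0 and 1")
    # Expected fractions of the tested population in each cell of the 2x2 table.
    true_pos = sens * pretest
    false_neg = (1 - sens) * pretest
    true_neg = spec * (1 - pretest)
    false_pos = (1 - spec) * (1 - pretest)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Rough estimate for the "Neither" column: sensitivity = specificity = 80%.
for pretest in (0.10, 0.50, 0.90):
    ppv, npv = predictive_values(0.80, 0.80, pretest)
    print(f"{pretest:.0%} pretest: PPV = {ppv:.0%}, NPV = {npv:.0%}")
```

The same test gives very different predictive values as the pretest probability changes, which is why the clinical messages depend on prior suspicion and not on the test alone.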
Terminology in Information Retrieval
In information retrieval, positive predictive value is called precision, and sensitivity is called recall.
The F-measure can be used as a single measure of performance of the test. It is the harmonic mean of precision and recall:
- <math>F = 2 \times ({\rm precision} \times {\rm recall}) / ({\rm precision} + {\rm recall}).</math>
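As a minimal sketch, the F-measure formula translates directly into code. The function name is hypothetical, and returning 0.0 when both inputs are zero is an assumed convention rather than part of the definition.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision (PPV) and recall (sensitivity)."""
    if precision + recall == 0:
        # Assumed convention: treat the undefined 0/0 case as 0.0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical values: perfect recall cannot compensate for poor precision.
print(f"{f_measure(0.5, 1.0):.3f}")  # 0.667
```

Because the harmonic mean is pulled toward the smaller of the two values, a test scores well only when precision and recall are both high.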
In the traditional language of statistical hypothesis testing, the sensitivity of a test is called the statistical power of the test, although the word power in that context has a more general usage that is not applicable in the present context. A sensitive test will have fewer Type II errors.
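The link to Type II errors can be written out explicitly. Here <math>\beta</math> is notation assumed for this illustration: it denotes the probability of a Type II error, that is, the proportion of diseased cases that the test misses.
- <math>{\rm sensitivity}=1-\beta=1-\frac{\rm number\ of\ False\ Negatives}{{\rm number\ of\ True\ Positives}+{\rm number\ of\ False\ Negatives}}.</math>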
Related Chapters
- binary classification
- receiver operating characteristic
- specificity (tests)
- statistical significance
- Type I and type II errors
- Selectivity
Online Calculators
- MedCalc's Sensitivity/Specificity Calculator: https://www.medcalc.org/calc/diagnostic_test.php
References
- [1] Altman DG, Bland JM (1994). "Diagnostic tests. 1: Sensitivity and specificity". BMJ. 308 (6943): 1552. PMID 8019315.
- [2] Stiell, Ian. "The Ottawa Rules". University of Ottawa. Retrieved January 5, 2020.
- [3] Banerjee A, Newman DR, Van den Bruel A, Heneghan C (2012). "Diagnostic accuracy of exercise stress testing for coronary artery disease: a systematic review and meta-analysis of prospective studies". Int J Clin Pract. 66 (5): 477–92. doi:10.1111/j.1742-1241.2012.02900.x. PMID 22512607. Note that 80% is a rough estimate of sensitivity and specificity.
- [4] Malloch L, Kadivar K, Putz J, Levett PN, Tang J, Hatchette TF; et al. (2013). "Comparative evaluation of the Bio-Rad Geenius HIV-1/2 Confirmatory Assay and the Bio-Rad Multispot HIV-1/2 Rapid Test as an alternative differentiation assay for CLSI M53 algorithm-I". J Clin Virol. 58 Suppl 1: e85–91. doi:10.1016/j.jcv.2013.08.008. PMID 24342484.
External links
- Sensitivity and Specificity Medical University of South Carolina