How it works — and where it can be wrong
This tool is deliberately transparent. There is no machine learning and no black box — every score is a sum of hand-authored weights you can inspect right here.
The model: adding up evidence
Each mechanism starts at a base rate and accumulates log-odds from your answers:
logit(mechanism) = base rate
+ Σ (weights from your answers)
+ Σ (causal-web propagation)
+ Σ (exclusion-diagnosis boost)
probability = sigmoid(logit)
Because evidence simply adds up, a mechanism's score is exactly the list of things that pushed it up or down, which is why every result comes with a "why this scored" breakdown.
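The additive model can be sketched in a few lines of Python. This is an illustration only, not the tool's actual code: the function names, the 20% base rate, and the three example contributions are assumptions chosen for the demonstration, and the base rate is assumed to be stored as a probability that is converted to log-odds.

```python
import math

def sigmoid(x: float) -> float:
    """Map log-odds to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def score_mechanism(base_logit: float, contributions: list[tuple[str, float]]):
    """Sum log-odds contributions and return the probability with its breakdown."""
    logit = base_logit + sum(weight for _, weight in contributions)
    return sigmoid(logit), contributions

# Hypothetical mechanism: a 20% base rate expressed as log-odds.
base_logit = math.log(0.20 / 0.80)

# Hypothetical evidence; labels and weights are made up for illustration.
contributions = [
    ("answer: symptom A present", 1.4),   # strong for
    ("answer: symptom B absent", -0.3),   # weak against
    ("causal-web nudge", 0.3),            # weak for
]

probability, breakdown = score_mechanism(base_logit, contributions)
print(f"probability = {probability:.2f}")
for label, weight in breakdown:
    print(f"  {weight:+.1f}  {label}")
```

Because the breakdown is the same list that was summed, the explanation shown to the user cannot drift from the score itself.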
The strength scale
Authors never pick raw numbers — they choose a coarse strength, which maps to log-odds in one place. Here is that entire table (and what each does to a 50/50 starting point):
| Strength | Log-odds | From 50% → |
|---|---|---|
| pathognomonic | +3 | 95% |
| strong for | +1.4 | 80% |
| moderate for | +0.7 | 67% |
| weak for | +0.3 | 57% |
| weak against | -0.3 | 43% |
| moderate against | -0.7 | 33% |
| strong against | -1.4 | 20% |
| excludes | -5 | 1% |
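The right-hand column can be reproduced by applying the sigmoid to each log-odds value, since a 50/50 starting point has log-odds of zero. The sketch below assumes the table is held in a plain dictionary; the name `STRENGTH_LOG_ODDS` is hypothetical.

```python
import math

# Strength scale copied from the table above; the dictionary name is an assumption.
STRENGTH_LOG_ODDS = {
    "pathognomonic": 3.0,
    "strong for": 1.4,
    "moderate for": 0.7,
    "weak for": 0.3,
    "weak against": -0.3,
    "moderate against": -0.7,
    "strong against": -1.4,
    "excludes": -5.0,
}

def from_even_odds(log_odds: float) -> float:
    """Probability after adding log_odds to a 50/50 prior (log-odds 0)."""
    return 1.0 / (1.0 + math.exp(-log_odds))

percentages = {
    name: round(100 * from_even_odds(weight))
    for name, weight in STRENGTH_LOG_ODDS.items()
}

for name, pct in percentages.items():
    print(f"{name:<17} {STRENGTH_LOG_ODDS[name]:+.1f}  ->  {pct}%")
```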
Probability bands
We show bands rather than false-precision percentages by default:
- Unlikely — 0% and up
- Possible — 30% and up
- Likely — 60% and up
- Very likely — 85% and up
A mechanism with no relevant answers is shown as "not enough info yet" rather than "unlikely" — absence of evidence isn't evidence of absence.
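A minimal sketch of the banding logic, assuming the thresholds listed above and a count of relevant answers per mechanism. The function name `band` and the `n_relevant_answers` parameter are illustrative, not the tool's actual interface.

```python
# Thresholds from the list above, checked from highest to lowest.
BANDS = [
    (0.85, "Very likely"),
    (0.60, "Likely"),
    (0.30, "Possible"),
    (0.00, "Unlikely"),
]

def band(probability: float, n_relevant_answers: int) -> str:
    """Return the display band, or a placeholder when nothing was answered."""
    if n_relevant_answers == 0:
        # No evidence either way: do not report this as "Unlikely".
        return "Not enough info yet"
    for threshold, label in BANDS:
        if probability >= threshold:
            return label
    return "Unlikely"

print(band(0.72, n_relevant_answers=3))
print(band(0.10, n_relevant_answers=0))
```

Checking the answer count first keeps a low base rate from being shown as a negative finding.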
The causal web
Mechanisms influence each other. After the independent scores are computed, we run one bounded pass: if a mechanism is likely (>60%), it nudges the mechanisms it's known to drive. The pass reads the pre-propagation scores, so it can't run away even when two mechanisms point at each other. There are 11 such links — for example, sustained stress nudging sleep disruption, or autonomic dysfunction nudging deconditioning.
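The single bounded pass can be sketched as follows. The key detail is that the threshold check reads a snapshot taken before any nudges are applied. The link list, the nudge size of 0.3, and the reverse link from sleep disruption back to stress are assumptions for illustration; the actual 11 links and their weights are not reproduced here.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def propagate(logits: dict[str, float],
              links: list[tuple[str, str, float]],
              threshold: float = 0.60) -> dict[str, float]:
    """Apply one pass of causal-web nudges using pre-propagation scores."""
    # Snapshot of probabilities before any nudge is applied.
    before = {name: sigmoid(value) for name, value in logits.items()}
    after = dict(logits)
    for source, target, nudge in links:
        if before[source] > threshold:
            after[target] += nudge
    return after

# Hypothetical scores (log-odds) and links, including a two-way loop.
logits = {"sustained stress": 0.7, "sleep disruption": 0.7, "deconditioning": -1.0}
links = [
    ("sustained stress", "sleep disruption", 0.3),
    ("sleep disruption", "sustained stress", 0.3),  # assumed reverse link
    ("deconditioning", "sleep disruption", 0.3),    # source below threshold, no effect
]

updated = propagate(logits, links)
for name, value in updated.items():
    print(f"{name}: {sigmoid(logits[name]):.2f} -> {sigmoid(value):.2f}")
```

Each link fires at most once, so the two mechanisms that point at each other each receive a single nudge rather than amplifying one another.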
The "most informative test" calculation
For each test you haven't done, we simulate both outcomes against your current scores and estimate how much it would move your beliefs, weighted by how useful resolving that mechanism is, and divided by cost:
uncertainty = 4 · P · (1 − P)   (peaks at 50/50)
swing = expected change in P if you took the test
value = Σ uncertainty · swing · actionability
score = value / cost
This is why cheap, high-yield checks — ruling out look-alike conditions, reviewing your supplements, a standing-heart-rate test — tend to come first. You can toggle between "best value" and "most information" on the results page.
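One way to read the formulas above is the sketch below. It rests on several assumptions that the text does not spell out: each test shifts a mechanism's log-odds by a fixed amount per outcome, the chance of a positive result is approximated by the current belief P, and swing is the expected absolute change in P. The names `Effect` and `rate_test` are hypothetical.

```python
import math
from dataclasses import dataclass

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def logit(p: float) -> float:
    return math.log(p / (1.0 - p))

@dataclass
class Effect:
    mechanism: str
    shift_if_positive: float  # log-odds added on a positive result
    shift_if_negative: float  # log-odds added on a negative result
    actionability: float      # how useful it is to resolve this mechanism

def rate_test(beliefs: dict[str, float], effects: list[Effect], cost: float):
    """Return (value, score) for one candidate test."""
    value = 0.0
    for e in effects:
        p = beliefs[e.mechanism]
        p_pos = sigmoid(logit(p) + e.shift_if_positive)
        p_neg = sigmoid(logit(p) + e.shift_if_negative)
        # Assumption: probability of a positive result is taken to be p.
        swing = p * abs(p_pos - p) + (1.0 - p) * abs(p_neg - p)
        uncertainty = 4.0 * p * (1.0 - p)
        value += uncertainty * swing * e.actionability
    return value, value / cost

# Hypothetical beliefs and tests.
beliefs = {"mechanism A": 0.50, "mechanism B": 0.95}
cheap_value, cheap_score = rate_test(
    beliefs, [Effect("mechanism A", 1.4, -1.4, 1.0)], cost=1.0)
pricey_value, pricey_score = rate_test(
    beliefs, [Effect("mechanism B", 1.4, -1.4, 1.0)], cost=5.0)

print(f"cheap test:  value={cheap_value:.3f} score={cheap_score:.3f}")
print(f"pricey test: value={pricey_value:.3f} score={pricey_score:.3f}")
```

Ranking by `value` corresponds to "most information" and ranking by `score` to "best value". In this example the cheap test on an undecided mechanism wins on both.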
The post-exertional-malaise safety rule
One rule overrides the scoring entirely. If you report post-exertional malaise — a delayed crash after exertion — the tool will not recommend graded or push-through exercise, because that can cause lasting harm. This reflects strong evidence and guideline changes, and it's the single most important safety behavior here.
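As a sketch of how such an override might sit outside the scoring, the filter below removes flagged recommendations after ranking, whatever their scores. The `Recommendation` type, its `involves_graded_exercise` flag, and the example entries are assumptions for illustration, not the tool's actual data model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Recommendation:
    name: str
    score: float
    involves_graded_exercise: bool  # hypothetical flag set by the authors

def apply_pem_rule(recommendations: list[Recommendation],
                   reports_pem: bool) -> list[Recommendation]:
    """Drop graded or push-through exercise when PEM is reported."""
    if not reports_pem:
        return list(recommendations)
    # The score is deliberately ignored: this rule overrides ranking.
    return [r for r in recommendations if not r.involves_graded_exercise]

# Hypothetical ranked list in which the exercise option scores highest.
ranked = [
    Recommendation("graded exercise program", 0.9, True),
    Recommendation("supplement review", 0.4, False),
]

safe = apply_pem_rule(ranked, reports_pem=True)
print([r.name for r in safe])
```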
Where this can be wrong
- The weights are expert judgment, not fitted to data. There is no validated ground truth for Long COVID mechanisms, so these numbers are considered estimates and will be revised.
- It can only reason about what you tell it, and self-reported symptoms are imperfect.
- It is not a diagnosis and cannot replace a clinician's judgment, examination, or the tests themselves.
- The treatment evidence is summarized from the literature as of early 2026 and will go out of date.
Built on a review of 778 Long COVID clinical-trial extractions plus 2024–2026 literature. More about the sources and the project.