How we test calorie tracker apps
Same protocol on every app. 30+ days of daily logging. 240 weighed reference meals. Two testers per app, blind to each other's logs. The numbers are reproducible — and we publish them with the reasoning.
This page is the protocol document. It describes exactly how every BestCalorieApps review and ranking is built. It's longer than it strictly needs to be because we want you (and the apps we cover) to be able to reproduce what we did.
The six-axis rubric
Scoring weights
- Accuracy (25%) — MAPE against weighed reference meals on the 240-meal protocol
- Database quality (20%) — verification, USDA alignment, search variance per query
- AI photo recognition (20%) — per-plate accuracy on home + restaurant photos
- Macro tracking (15%) — granularity, custom macros, micronutrient depth
- User experience (10%) — friction-of-correction, ad density, daily-use feel
- Value (10%) — free-tier usability, Premium price-per-feature
Every score on the site is a number from 0 to 100. The number is the weighted sum of the six axis scores. Apps don't get partial credit for "almost" passing a category — if the photo AI doesn't exist, that axis scores 0; if the database is small, it scores low. We don't grade on a curve.
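To make the arithmetic concrete, here's a minimal sketch of the composite score in Python. The weights come from the rubric above; the example axis scores are invented for illustration.

```python
# Rubric weights from the list above, as fractions of the 0-100 composite.
WEIGHTS = {
    "accuracy": 0.25,
    "database_quality": 0.20,
    "ai_photo_recognition": 0.20,
    "macro_tracking": 0.15,
    "user_experience": 0.10,
    "value": 0.10,
}

def composite_score(axis_scores: dict[str, float]) -> float:
    """Weighted sum of the six axis scores, each on a 0-100 scale."""
    return sum(WEIGHTS[axis] * axis_scores[axis] for axis in WEIGHTS)

# Invented example: strong database, but no photo AI (that axis scores 0).
example = {
    "accuracy": 82,
    "database_quality": 90,
    "ai_photo_recognition": 0,
    "macro_tracking": 75,
    "user_experience": 60,
    "value": 70,
}
print(round(composite_score(example)))  # -> 63
```

A missing axis drags the composite down hard, which is the "no partial credit" rule in practice.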
The 240-meal weighed reference protocol
This is our gold-standard accuracy test. It is identical to the Dietary Assessment Initiative's published protocol, the closest thing the field has to an academic gold standard, and reproduces its numbers.
The meal panel
240 distinct meals across five categories:
- Whole foods, single ingredient (50 meals)
- Home-cooked composites — recipes with 3+ ingredients (60 meals)
- Packaged goods (40 meals)
- Restaurant chains, named (50 meals)
- Mixed bowls and salads — explicitly designed to stress photo AI (40 meals)
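The category counts are small enough to encode directly. A sketch (the 240 individual meal entries themselves aren't listed on this page):

```python
# The five panel categories and their meal counts, as listed above.
PANEL = {
    "whole_foods_single_ingredient": 50,
    "home_cooked_composites": 60,    # recipes with 3+ ingredients
    "packaged_goods": 40,
    "restaurant_chains_named": 50,
    "mixed_bowls_and_salads": 40,    # designed to stress photo AI
}

assert sum(PANEL.values()) == 240   # the full weighed reference panel
```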
The reference
Every meal is prepared in our test kitchen and weighed on a calibrated digital scale. The reference number for each meal is the sum of its ingredients' nutritional values from USDA FoodData Central. That's the ground-truth value we measure against.
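As a sketch of that arithmetic, assuming per-ingredient values expressed per 100 g (the ingredients and figures below are illustrative, not quoted entries from our panel):

```python
# Each ingredient: (grams weighed on the scale, kcal per 100 g from USDA).
# The figures below are illustrative, not quoted FoodData Central entries.
def reference_kcal(ingredients: list[tuple[float, float]]) -> float:
    """Sum of ingredient calories: weighed grams times the
    per-100 g value for that ingredient."""
    return sum(grams * kcal_per_100g / 100 for grams, kcal_per_100g in ingredients)

# Hypothetical home-cooked composite: rice, chicken breast, olive oil.
meal = [(180, 130), (120, 165), (10, 884)]
print(round(reference_kcal(meal)))  # -> 520
```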
The logging
Two trained testers log the same meal independently, on the same day, blind to each other's logs and blind to the reference value. Each tester uses the app's primary input mode (search-and-pick, photo AI, or barcode) and is allowed one correction round to mimic real-world use.
The math
Mean Absolute Percentage Error (MAPE) is the metric. For each meal, we compute |logged − reference| / reference. We average across all 240 meals to get the headline accuracy number. Lower is better.
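In code, the headline number looks like this. It's a minimal sketch over one set of logs with illustrative values; how the two testers' independent logs are combined into the headline figure is omitted for brevity.

```python
def mape(logged: list[float], reference: list[float]) -> float:
    """Mean Absolute Percentage Error over the meal panel.
    Per-meal error is |logged - reference| / reference; the
    headline number is the mean across all meals."""
    assert len(logged) == len(reference)
    errors = [abs(log_v - ref_v) / ref_v for log_v, ref_v in zip(logged, reference)]
    return sum(errors) / len(errors)

# Three illustrative meals: logged kcal vs. weighed USDA reference kcal.
logged = [540.0, 310.0, 820.0]
reference = [520.0, 350.0, 800.0]
print(f"{mape(logged, reference):.1%}")  # -> 5.9%
```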
Editorial standards
- We don't take affiliate compensation from any app reviewed.
- We don't accept sponsored placements, paid reviews, or "review copies" of Premium tiers — we pay for them out of our editorial budget.
- We don't accept testing grants from app vendors.
- Any clinical claim is reviewed by Dr. Othniel Brennan-Lee, MD, before publication.
- Any methodological change is signed off by Sienna Dvorak-Park, MA.
- We retest every flagship app every six months.
- Corrections are timestamped and visible in the page footer.
Who tests
The lead reviewer is Reuben Castelló-Frey, MS RD. The methodology owner is Sienna Dvorak-Park, MA. The medical reviewer is Dr. Othniel Brennan-Lee, MD. Read more on the about page.
Conflicts of interest
None of the BestCalorieApps editorial team holds equity in any of the calorie tracking apps reviewed on this site. None of us has consulting relationships, speaker fees, or grant funding from any of the parent companies. We disclose individual app usage on a per-author basis on the relevant author page.
Reproducibility
Every accuracy number on this site can be reproduced by following this protocol. If you reproduce one of our numbers and get a different result, write to us — the corrections inbox is staffed and we publish updates when the data changes.