18,696 held-out SMS samples
Detection accuracy, explained with evidence and caveats.
ScamGuard is a WebyStudio product. This page shows the numbers behind message and URL detection: held-out benchmarks, model checksums, Indian-script coverage, and the limits we attach to each claim.
35,999 held-out URL samples
9/9 (100%) scam recall, 6/6 (100%) legit precision
1,05,939 real + 36,178 augmented
How ScamGuard reaches a risk decision
Every scan is evaluated by several independent signals before ScamGuard produces a final risk explanation for the user.
Heuristic Signal Engine
Extracts risky entities, urgency language, payment cues, brand impersonation, and scam-category patterns before any model score is applied.
Machine Learning Classifier
Uses character n-grams and India-context SMS training data to catch altered wording, Hinglish, and regional scam phrasing.
NLP Assist Layer
Adds phishing and manipulation-tactic checks for polished messages that avoid obvious scam keywords.
Highest-Risk Wins
The strongest confident signal becomes the final risk explanation, so users see why ScamGuard raised the warning.
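The "highest-risk wins" rule can be sketched as follows. This is a minimal illustration, not ScamGuard's implementation: the `Signal` fields, the `MIN_CONFIDENCE` cut-off, and the example scores are all assumptions, and "strongest confident signal" is read here as "highest risk among signals that clear a confidence threshold".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Signal:
    source: str        # layer that produced the signal (hypothetical names)
    risk: float        # 0.0 (looks safe) to 1.0 (looks like a scam)
    confidence: float  # how much the layer trusts its own score
    reason: str        # human-readable explanation shown to the user


# Assumed cut-off for illustration; not a published ScamGuard value.
MIN_CONFIDENCE = 0.6


def final_decision(signals: List[Signal]) -> Optional[Signal]:
    """Return the highest-risk signal among those that are confident enough."""
    confident = [s for s in signals if s.confidence >= MIN_CONFIDENCE]
    if not confident:
        return None  # no layer is sure enough to raise a warning
    return max(confident, key=lambda s: s.risk)


# Made-up scores for one message.
signals = [
    Signal("heuristic", 0.55, 0.90, "urgency language plus a payment request"),
    Signal("ml_classifier", 0.92, 0.85, "character n-grams match known scam wording"),
    Signal("nlp_assist", 0.97, 0.30, "possible manipulation tactic"),
]
decision = final_decision(signals)
```

Here `decision` is the `ml_classifier` signal: the NLP layer reports a higher risk but falls below the assumed confidence cut-off, so its score is ignored. Returning the whole `Signal` rather than a bare number keeps the explanation attached to the verdict.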
Benchmarks that are easy to scan
Message and URL detection are measured separately, with test sizes, benchmark scores, and model artifacts shown beside the claim.
URL Classification Model v4
Headline URL accuracy includes many easy bare-domain examples. Path-rich legitimate URLs are harder, so ScamGuard treats the URL model's verdict as a strong signal, not a safety guarantee.
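One way to see how easy examples can lift a headline number is to report accuracy per URL type. The sketch below is an assumption about how such a breakdown could be computed; the `url_stratum` rule and the sample rows are hypothetical and are not ScamGuard benchmark data.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def url_stratum(url: str) -> str:
    """Label a URL as 'bare-domain' or 'path-rich' (illustrative rule)."""
    # urlsplit only recognises the host when a scheme or '//' is present.
    parts = urlsplit(url if "://" in url else "//" + url)
    has_path = parts.path not in ("", "/")
    return "path-rich" if has_path or parts.query else "bare-domain"


def accuracy_by_stratum(rows):
    """rows: iterable of (url, true_label, predicted_label) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for url, truth, predicted in rows:
        key = url_stratum(url)
        total[key] += 1
        correct[key] += int(truth == predicted)
    return {key: correct[key] / total[key] for key in total}


# Made-up rows for illustration only.
rows = [
    ("https://example.com", "legit", "legit"),
    ("example.org/", "legit", "legit"),
    ("http://phish.example", "scam", "scam"),
    ("https://example.com/account/settings?tab=billing", "legit", "scam"),
    ("https://example.net/docs/guide", "legit", "legit"),
]
per_stratum = accuracy_by_stratum(rows)
overall = sum(t == p for _, t, p in rows) / len(rows)
```

In this toy set the overall accuracy is 0.8, while the path-rich stratum alone scores 0.5. The more bare-domain rows a test set contains, the further the headline figure drifts from the harder case.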
Indian-script smoke test history
The smoke test tracks scam recall and legitimate-message precision across real Indian scripts, while keeping the small sample size visible.
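To make the small sample size concrete, the smoke-test counts reported on this page (9/9 scam recall, 6/6 legitimate precision, 15/15 overall) can be paired with a confidence interval. The sketch below uses the standard Wilson score interval; choosing it, and the 95% level, is an assumption made for illustration rather than part of ScamGuard's published method.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Counts taken from the smoke test reported on this page.
scam_recall = 9 / 9        # scams caught / all scam messages
legit_precision = 6 / 6    # truly legitimate / all messages predicted legitimate
low, high = wilson_interval(15, 15)
```

For 15 correct out of 15, the lower bound comes out at about 0.80. A perfect score on a sample this small is therefore still compatible with a true rate well below 100%, which is why the sample size stays next to the result.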
Indian language and scam-pattern coverage
ScamGuard separates smoke-verified languages, training-covered languages, and new v5.2 additions instead of treating all coverage as equal.
20 scam templates
What we can cite publicly
Published results include the sample size, the measurable outcome, and the limitation that should stay attached to the claim.
| Evidence | Samples | Result | Limitation |
|---|---|---|---|
| SMS v5.2 held-out benchmark | 18,696 | 96.54% accuracy, 95.84% recall | Good benchmark coverage, but real-world fraud wording will keep changing. |
| Indian-script smoke test | 15 | 15/15 correct across scam and legitimate banking SMS | Small hand-crafted proxy; not a replacement for large native-labeled corpora. |
| URL model v4 held-out benchmark | 35,999 | 99.33% accuracy, ROC AUC 0.9996 | Easy bare-domain samples inflate headline accuracy; path-rich legitimate URLs remain harder. |
| South Indian language coverage | Smoke test + synthetic training | Covered in pipeline, but benchmark depth varies by language | Needs larger native-labeled Tamil, Telugu, Kannada, and Malayalam validation sets. |
ScamGuard is a seatbelt, not a guarantee.
We publish checksums, sample sizes, and limitations. We do not claim 100% real-world scam detection, equal accuracy in every language, or guaranteed safety.
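A published checksum is only useful if readers can check it. The sketch below shows one way to compare a downloaded model artifact against a published digest. It assumes the checksum is a SHA-256 hex string, which this page does not state, and the function names are hypothetical.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: Path, published_hex: str) -> bool:
    """Compare the local digest with the published one (assumed SHA-256 hex)."""
    return hmac.compare_digest(sha256_of(path), published_hex.strip().lower())
```

A caller would pass the path of the downloaded artifact and the digest copied from the published results; a `False` result means the file differs from the one that was benchmarked.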