Accuracy proof page

Detection accuracy, explained with evidence and caveats.

ScamGuard is a WebyStudio product. This page shows the numbers behind message and URL detection: held-out benchmarks, model checksums, Indian-script coverage, and the limits we place on our own claims.

Message Model: 96.54%

18,696 held-out SMS samples

URL Model: 99.33%

35,999 held-out URL samples

Indian-Script Smoke Test: 15/15

9/9 (100%) scam recall, 6/6 (100%) legit precision

Training Samples: 1,42,117

1,05,939 real + 36,178 augmented

Detection Stack

How ScamGuard reaches a risk decision

Every scan is evaluated by several independent signals before ScamGuard produces a final risk explanation for the user.

01. Heuristic Signal Engine

Extracts risky entities, urgency language, payment cues, brand impersonation, and scam-category patterns before any model score is applied.

02. Machine Learning Classifier

Uses character n-grams and India-context SMS training data to catch altered wording, Hinglish, and regional scam phrasing.

03. NLP Assist Layer

Adds phishing and manipulation-tactic checks for polished messages that avoid obvious scam keywords.

04. Highest-Risk Wins

The strongest confident signal becomes the final risk explanation, so users see why ScamGuard raised the warning.
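The "highest-risk wins" step can be sketched roughly as follows. This is a minimal illustration, not ScamGuard's actual code: the `Signal` fields, the `highest_risk_wins` function, the 0.6 confidence threshold, and the sample scores are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Signal:
    """One detector's verdict. All field names here are hypothetical."""
    source: str        # e.g. "heuristic", "ml_classifier", "nlp_assist"
    risk: float        # 0.0 (safe) to 1.0 (scam)
    confidence: float  # how sure the detector is of its own score
    explanation: str   # human-readable reason shown to the user


def highest_risk_wins(signals: list[Signal], min_confidence: float = 0.6) -> Optional[Signal]:
    """Return the riskiest signal among those that clear the confidence bar."""
    confident = [s for s in signals if s.confidence >= min_confidence]
    if not confident:
        return None  # nothing confident enough to explain a warning
    return max(confident, key=lambda s: s.risk)


# Invented example scores, only to show the selection rule.
signals = [
    Signal("heuristic", 0.72, 0.90, "Urgent KYC-block wording with a payment link"),
    Signal("ml_classifier", 0.91, 0.85, "Character patterns match known UPI fraud SMS"),
    Signal("nlp_assist", 0.97, 0.40, "Possible manipulation tactic (low confidence)"),
]
winner = highest_risk_wins(signals)
print(winner.source, "-", winner.explanation)
# prints: ml_classifier - Character patterns match known UPI fraud SMS
```

In this sketch the NLP signal has the highest raw risk but is dropped for low confidence, so the classifier's explanation is the one a user would see.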

Model Reports

Benchmarks that are easy to scan

Message and URL detection are measured separately, with test sizes, benchmark scores, and model artifacts shown beside the claim.

Message Detection

SMS / Message Classification Model v5.2

Accuracy: 96.54%
Precision: 93.19%
Recall: 95.84%
F1 Score: 94.50%

True Negative (legit passed): 12,496
False Positive (legit flagged): 406
False Negative (scam missed): 241
True Positive (scam caught): 5,553

SHA-256: 48abaaa3c82c74707ddd1bfa485dc4294aa1aca80ff72ad3be9f387ff943b69d
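The headline SMS metrics can be re-derived from the four confusion-matrix counts published above. The short check below uses only those counts and the standard definitions of accuracy, precision, recall, and F1; the variable names are ours.

```python
# Confusion-matrix counts as published for SMS model v5.2.
tn, fp, fn, tp = 12_496, 406, 241, 5_553

total = tn + fp + fn + tp
accuracy = (tp + tn) / total          # share of all messages classified correctly
precision = tp / (tp + fp)            # share of flagged messages that were scams
recall = tp / (tp + fn)               # share of scams that were flagged
f1 = 2 * precision * recall / (precision + recall)

print(f"samples   {total:,}")         # samples   18,696
print(f"accuracy  {accuracy:.2%}")    # accuracy  96.54%
print(f"precision {precision:.2%}")   # precision 93.19%
print(f"recall    {recall:.2%}")      # recall    95.84%
print(f"f1        {f1:.2%}")          # f1        94.50%
```

The counts sum to the 18,696 held-out samples, and each derived figure matches the value reported on this page.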
URL Detection

URL Classification Model v4

Accuracy: 99.33%
Precision: 98.23%
Recall: 99.10%
ROC AUC: 0.9996

Headline URL accuracy includes many easy bare-domain examples. Path-rich legitimate URLs are harder to classify, so ScamGuard treats the URL model score as a strong signal, not a safety guarantee.
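One way to keep this caveat measurable is to report accuracy separately for bare-domain and path-rich URLs. The sketch below is an illustration only: the `stratum` rule, the `accuracy_by_stratum` helper, and the four sample rows are assumptions, not ScamGuard's evaluation code or real results.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def stratum(url: str) -> str:
    """Label a URL as 'bare-domain' or 'path-rich' (hypothetical split rule)."""
    parts = urlsplit(url if "://" in url else "http://" + url)
    has_path = parts.path not in ("", "/")
    return "path-rich" if has_path or parts.query else "bare-domain"


def accuracy_by_stratum(rows):
    """rows: iterable of (url, true_label, predicted_label) tuples."""
    hits, totals = defaultdict(int), defaultdict(int)
    for url, truth, predicted in rows:
        key = stratum(url)
        totals[key] += 1
        hits[key] += int(truth == predicted)
    return {key: hits[key] / totals[key] for key in totals}


# Made-up rows purely to show the shape of the report, not real results.
rows = [
    ("example.com", "legit", "legit"),
    ("https://example.org/", "legit", "legit"),
    ("https://example.com/account/settings?tab=billing", "legit", "scam"),
    ("http://example.net/verify/kyc", "scam", "scam"),
]
print(accuracy_by_stratum(rows))
# prints: {'bare-domain': 1.0, 'path-rich': 0.5}
```

A per-stratum report like this makes it visible when a strong overall score is driven mostly by the easier bare-domain samples.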

Training URLs: 2,03,993
Test URLs: 35,999
SHA-256: 301cb136006c33a9382bf8fcb9fe789d87f1886b72a9e13d49916aad2c2bd584

Progression

Indian-script smoke test history

The smoke test tracks scam recall and legitimate-message precision across real Indian scripts, while keeping the small sample size visible.

v4.1 NLLB: scam 5/9 · legit 2/6 · total 7/15
v4.2 M2M100: scam 9/9 · legit 0/6 · total 9/15
v5.0 multilingual: scam 9/9 · legit 6/6 · total 15/15
v5.1 India expanded: scam 9/9 · legit 6/6 · total 15/15
v5.2 synthetic: scam 9/9 · legit 6/6 · total 15/15
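To show why the small sample size matters, the sketch below computes a Wilson score interval for a few of the smoke-test totals. The Wilson interval is a standard statistical tool that we are applying here for illustration; it is not part of the published smoke-test methodology, and the `wilson_interval` helper is our own.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the extremes.
    return max(0.0, centre - half), min(1.0, centre + half)


for label, k, n in [("v4.1", 7, 15), ("v4.2", 9, 15), ("v5.2", 15, 15)]:
    low, high = wilson_interval(k, n)
    print(f"{label}: {k}/{n} correct, 95% interval {low:.1%} to {high:.1%}")
```

Even a perfect 15/15 yields a lower bound of roughly 80% under this interval, which is why the result is presented as a smoke test rather than a benchmark.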
Coverage

Indian language and scam-pattern coverage

ScamGuard separates smoke-verified languages, training-covered languages, and new v5.2 additions instead of treating all coverage as equal.

हिंदी (Hindi): Smoke verified
தமிழ் (Tamil): Smoke verified
বাংলা (Bengali): Smoke verified
తెలుగు (Telugu): Smoke verified
मराठी (Marathi): Training covered
ગુજરાતી (Gujarati): Training covered
ಕನ್ನಡ (Kannada): Training covered
മലയാളം (Malayalam): Training covered
ਪੰਜਾਬੀ (Punjabi): Training covered
اردو (Urdu): Training covered
ଓଡ଼ିଆ (Odia): Training covered
অসমীয়া (Assamese): New v5.2
नेपाली (Nepali): New v5.2
English: Benchmark covered

20 scam templates

KYC block, Lottery prize, UPI fraud, Government scheme, Income tax refund, Job fraud, Courier fee, Electricity bill, TRAI SIM block, Personal loan, Insurance bonus, Aadhaar link, RBI notice, EPF withdrawal, IRCTC refund, FasTag KYC, Legal notice, Passport issue, PAN misuse, LPG subsidy

Evidence

What we can cite publicly

Published results include the sample size, the measurable outcome, and the limitation that should stay attached to the claim.

Evidence: SMS v5.2 held-out benchmark
Samples: 18,696
Result: 96.54% accuracy, 95.84% recall
Limitation: Good benchmark coverage, but real-world fraud wording will keep changing.

Evidence: Indian-script smoke test
Samples: 15
Result: 15/15 correct across scam and legitimate banking SMS
Limitation: Small hand-crafted proxy; not a replacement for large native-labeled corpora.

Evidence: URL model v4 held-out benchmark
Samples: 35,999
Result: 99.33% accuracy, ROC AUC 0.9996
Limitation: Easy bare-domain samples inflate headline accuracy; path-rich legitimate URLs remain harder.

Evidence: South Indian language coverage
Samples: Smoke test + synthetic training
Result: Covered in pipeline, but benchmark depth varies by language
Limitation: Needs larger native-labeled Tamil, Telugu, Kannada, and Malayalam validation sets.

Honest Claim Policy

ScamGuard is a seatbelt, not a guarantee.

We publish checksums, sample sizes, and limitations. We do not claim 100% real-world scam detection, equal accuracy in every language, or guaranteed safety.

No guaranteed safety claims
Dataset size shown beside metrics
Model checksums included
Language limitations visible
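For readers who have a copy of a model artifact, the published checksums can be compared against a locally computed SHA-256 digest. The sketch below uses Python's standard `hashlib`; the file names in `PUBLISHED` are hypothetical placeholders, since this page lists the digests but not the artifact file names.

```python
import hashlib
from pathlib import Path

# SHA-256 digests as published on this page.
# The file names used as keys are hypothetical placeholders.
PUBLISHED = {
    "sms_model_v5.2.bin": "48abaaa3c82c74707ddd1bfa485dc4294aa1aca80ff72ad3be9f387ff943b69d",
    "url_model_v4.bin": "301cb136006c33a9382bf8fcb9fe789d87f1886b72a9e13d49916aad2c2bd584",
}


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large artifacts are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: Path) -> bool:
    """True only if the file name is known and its digest equals the published one."""
    expected = PUBLISHED.get(path.name)
    return expected is not None and sha256_of(path) == expected
```

A call such as `matches_published(Path("sms_model_v5.2.bin"))` returns `False` for an unknown file name or a modified file, and `True` only when the digest matches exactly.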