Anonymized decision-rule analysis report

Question: can stratified cutoffs or a small combined model improve the baseline decision rule?

Bottom line

Best approach: Logistic regression
Accuracy: 89.0%
CV AUC (logreg): 0.934
CV AUC (tree): 0.926

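Adding Sex and Delta days raises classification accuracy from the 73.0% single-cutoff baseline to 89.0% with logistic regression (5-fold CV). The clearest pattern: Delta days carries more information than Sex. The Delta-days split alone reaches 83.0% accuracy, while sex-specific cutoffs reach 77.0% and shift the cutoff only slightly (5.326 for females, 4.738 for males).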

Approach A — Sex-stratified cutoffs (one cutoff per sex)

Sex     n   # Positive  Cutoff (Result ≤)  Sens   Spec   AUC
Female  60  31          5.326              90.3%  55.2%  0.758
Male    40  9           4.738              88.9%  80.6%  0.878

Combined performance using sex-specific cutoffs: accuracy 77.0%, sensitivity 90.0%, specificity 68.3%.
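The combined figures follow from summing the per-sex confusion counts before computing the rates. The sketch below assumes counts that are back-calculated from the reported percentages (for example, 90.3% of 31 positive females is 28 true positives); the report itself states only n, # Positive, sensitivity and specificity.

```python
# Per-sex confusion counts, back-calculated from the reported rates.
# Assumption: these counts are inferred, not stated in the report.
strata = {
    "Female": {"tp": 28, "fn": 3, "tn": 16, "fp": 13},  # 31 positive, 29 negative
    "Male":   {"tp": 8,  "fn": 1, "tn": 25, "fp": 6},   # 9 positive, 31 negative
}


def pooled_metrics(strata):
    """Sum confusion counts across strata, then compute overall metrics."""
    tp = sum(s["tp"] for s in strata.values())
    fn = sum(s["fn"] for s in strata.values())
    tn = sum(s["tn"] for s in strata.values())
    fp = sum(s["fp"] for s in strata.values())
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


metrics = pooled_metrics(strata)
print({name: round(value, 3) for name, value in metrics.items()})
# {'accuracy': 0.77, 'sensitivity': 0.9, 'specificity': 0.683}
```

Pooling counts rather than averaging the two per-sex rates matters here because the strata differ in size and in prevalence.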

Approach B — Delta-days–stratified cutoffs (two cutoffs by time window)

We searched all candidate split points on Delta days that leave at least 10 subjects per side. The best split was Delta days = 0, giving:

Rule: If ΔDays < 0: predict Positive when Result ≤ 2.967. Otherwise: predict Positive when Result ≤ 5.326.

Performance: accuracy 83.0%, sensitivity 92.5%, specificity 76.7%.
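The two-window rule above can be written as a small function. This is a minimal sketch; the function name and argument names are illustrative, and only the split point and the two cutoffs come from the report.

```python
def predict_delta_split(result: float, delta_days: float) -> bool:
    """Return True for a Positive prediction under the two-window rule.

    Split point (Delta days = 0) and cutoffs (2.967, 5.326) are taken
    from the report; the function itself is an illustrative sketch.
    """
    cutoff = 2.967 if delta_days < 0 else 5.326
    return result <= cutoff


# Hypothetical inputs: the same Result value is classified differently
# depending on which time window the subject falls in.
print(predict_delta_split(4.0, -3))  # False (4.0 > 2.967)
print(predict_delta_split(4.0, 5))   # True  (4.0 <= 5.326)
```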

Approach C — Grey-zone (two cutoffs on Result alone)

Targets: ≥ 95% specificity for the rule-in (positive) zone, ≥ 95% sensitivity for the rule-out (negative) zone.

On the 44 classified subjects (the remaining 56 of 100 fall in the indeterminate band): accuracy 86.4%, sensitivity 81.2%, specificity 89.3%, F1 0.812.

Caveat: the indeterminate band is large because the distributions overlap. A grey-zone rule is best used as a triage step: classify the easy cases with confidence and send the indeterminate band for confirmatory testing.
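A triage rule of this kind can be sketched as a three-way classifier. The report does not state the two grey-zone cutoffs, so they are left as parameters here, and the values in the usage lines are hypothetical placeholders. The direction (low Result means Positive) follows the other rules in this report.

```python
def triage(result: float, rule_in_cutoff: float, rule_out_cutoff: float) -> str:
    """Three-way grey-zone classification on Result alone.

    rule_in_cutoff:  at or below this, call Positive (high-specificity zone).
    rule_out_cutoff: above this, call Negative (high-sensitivity zone).
    Both cutoffs are parameters because the report does not list them.
    """
    if rule_in_cutoff > rule_out_cutoff:
        raise ValueError("rule_in_cutoff must not exceed rule_out_cutoff")
    if result <= rule_in_cutoff:
        return "Positive"
    if result > rule_out_cutoff:
        return "Negative"
    return "Indeterminate"  # send for confirmatory testing


# Hypothetical cutoffs (3.0 and 6.0), for illustration only.
for value in (2.5, 4.5, 6.5):
    print(value, triage(value, rule_in_cutoff=3.0, rule_out_cutoff=6.0))
```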

Approach D — Logistic regression (Result + Sex + Delta days)

Feature      Coefficient
(Intercept)  +1.9673
Result       -0.7711
Sex_Female   +0.6103
DeltaDays    +0.0471

The decision boundary in score space corresponds to logit = -0.186 (probability cutoff = 0.454). Held-out (5-fold CV) AUC = 0.934; CV accuracy at the CV-tuned threshold = 89.0% (sens 80.0%, spec 95.0%).
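To show how the coefficients turn into a decision, here is a minimal scoring sketch. It assumes the coefficients apply to raw, unscaled features, that Sex_Female is a 0/1 indicator, and that a score at or above the logit cutoff is called Positive; the report does not confirm any of these, and the example subjects are hypothetical.

```python
import math

# Coefficients and logit cutoff as reported; everything else is illustrative.
COEF = {
    "intercept": 1.9673,
    "result": -0.7711,
    "sex_female": 0.6103,
    "delta_days": 0.0471,
}
LOGIT_CUTOFF = -0.186


def logit_score(result: float, is_female: bool, delta_days: float) -> float:
    """Linear predictor (log-odds of Positive), assuming unscaled inputs."""
    return (
        COEF["intercept"]
        + COEF["result"] * result
        + COEF["sex_female"] * (1.0 if is_female else 0.0)
        + COEF["delta_days"] * delta_days
    )


def probability(logit: float) -> float:
    """Logistic (sigmoid) transform from log-odds to probability."""
    return 1.0 / (1.0 + math.exp(-logit))


def predict_positive(result: float, is_female: bool, delta_days: float) -> bool:
    """Assumed tie rule: a score equal to the cutoff counts as Positive."""
    return logit_score(result, is_female, delta_days) >= LOGIT_CUTOFF


# The logit cutoff and the probability cutoff are the same boundary.
print(round(probability(LOGIT_CUTOFF), 3))  # 0.454

# Hypothetical subjects.
print(predict_positive(3.0, True, 0))   # True  (logit = 0.2643)
print(predict_positive(5.0, False, 0))  # False (logit = -1.8882)
```

The negative Result coefficient matches the cutoff rules elsewhere in the report: lower Result values push toward a Positive call.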

Approach E — Decision tree depth=3 (interpretable multi-cutoff rule)

|--- DeltaDays <= 24.00
|   |--- DeltaDays <= -1.50
|   |   |--- class: 0
|   |--- DeltaDays >  -1.50
|   |   |--- Result <= 5.43
|   |   |   |--- class: 1
|   |   |--- Result >  5.43
|   |   |   |--- class: 0
|--- DeltaDays >  24.00
|   |--- Result <= 4.07
|   |   |--- Result <= 3.70
|   |   |   |--- class: 1
|   |   |--- Result >  3.70
|   |   |   |--- class: 1
|   |--- Result >  4.07
|   |   |--- class: 1

Held-out (5-fold CV) AUC = 0.926; CV accuracy = 85.0%.
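The printed tree can be transcribed into a plain function, which also makes one property easy to see: every leaf under DeltaDays > 24.00 is class 1, so the Result splits on that side do not change the prediction. This is a sketch using the thresholds as printed (they may be rounded), and it assumes class 1 means Positive.

```python
def tree_predict(result: float, delta_days: float) -> int:
    """Transcription of the depth-3 tree; returns the predicted class (0 or 1).

    Thresholds are copied from the printed tree. Treating class 1 as
    Positive is an assumption consistent with the cutoff rules in the report.
    """
    if delta_days <= 24.0:
        if delta_days <= -1.5:
            return 0
        return 1 if result <= 5.43 else 0
    # All three leaves under DeltaDays > 24.00 are class 1, so the
    # Result splits at 4.07 and 3.70 collapse to a single outcome.
    return 1


# Hypothetical inputs.
print(tree_predict(4.0, -5))  # 0
print(tree_predict(4.0, 0))   # 1
print(tree_predict(6.0, 0))   # 0
print(tree_predict(6.0, 30))  # 1
```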

Side-by-side comparison

Approach                                          Accuracy  Sens   Spec   F1     AUC    Note
Single global cutoff (Result ≤ 5.326)             73.0%     92.5%  60.0%  0.733  0.803  iteration 1 baseline
Sex-stratified cutoffs (F: 5.326, M: 4.738)       77.0%     90.0%  68.3%  0.758  -      in-sample
Delta-days split @ 0 (lo: 2.967, hi: 5.326)       83.0%     92.5%  76.7%  0.813  -      in-sample
Logistic regression (Result+Sex+ΔDays) train      89.0%     80.0%  95.0%  0.853  0.938  in-sample
Logistic regression (Result+Sex+ΔDays) 5-fold CV  89.0%     80.0%  95.0%  0.853  0.934  honest out-of-fold
Decision tree depth=3 train                       88.0%     97.5%  81.7%  0.867  -      in-sample
Decision tree depth=3 5-fold CV                   85.0%     87.5%  83.3%  0.824  0.926  honest out-of-fold
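The F1 column can be cross-checked against the sensitivity and specificity columns once the class sizes are known. The sketch below assumes 40 positives and 60 negatives, which is derived from the Approach A table (31 + 9 positives among 100 subjects) rather than stated directly. Because the inputs are rounded percentages, the third decimal can differ slightly for some rows.

```python
def f1_from_rates(sens: float, spec: float, n_pos: int, n_neg: int) -> float:
    """Recover F1 from sensitivity, specificity and class sizes."""
    tp = sens * n_pos          # true positives
    fn = n_pos - tp            # missed positives
    fp = (1.0 - spec) * n_neg  # false alarms
    return 2 * tp / (2 * tp + fp + fn)


# Class sizes derived from the Approach A table (assumption).
N_POS, N_NEG = 40, 60

baseline_f1 = f1_from_rates(0.925, 0.600, N_POS, N_NEG)
logreg_f1 = f1_from_rates(0.800, 0.950, N_POS, N_NEG)
print(round(baseline_f1, 3))  # 0.733
print(round(logreg_f1, 3))    # 0.853
```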

Recommendation

All in-sample numbers (sex-stratified, delta-split, tree-train) are optimistic. The CV AUC of the logistic regression (0.934) is the most honest single-number summary of "what would this rule do on a fresh patient?"