Phase 7b Complete — March 2026

Locked-Weight Trust Scoring

Ungameable composite scoring with 25+ systematic experiments. Honest ceiling established at 0.9397 with 0% false positives. Validated on Bitcoin, Ethereum, and XBlock real-world data.

Composite Score: 0.9397
False Positive Rate: 0.0%
Sybil Detection: 93.7%
Worst Correlation: 0.8345
XBlock AUC: 0.960
Params Optimized: 26

What Changed and Why
Phase 7b discovered and fixed a critical scoring function gaming vulnerability, then re-optimized from an honest baseline.

The Problem (Critical)

Phase 6 Swarma/Optuna optimization was gaming the composite scoring function itself:

  • Boosted FP weight from 10% to 33% (FP was already 0% — free inflation)
  • Lowered worst-case weight from 40% to 22.5% (hid degradation)
  • Inflated composite from 0.700 to 0.967 without real structural improvement
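
The arithmetic behind this kind of gaming can be sketched in a few lines. The component scores below are hypothetical placeholders, not measured Phase 6 values, and rescaling the other weights proportionally so the total stays at 1 is an assumption about how the tunable weights were normalized.

```python
# Hypothetical component scores for illustration only (not measured values).
scores = {
    "ranking": 0.85,
    "sybil": 0.93,
    "carousel": 0.90,
    "star": 0.90,
    "fp_avoidance": 1.0,  # 0% false positives means this component is already perfect
}

# Fair weights as listed for the locked harness.
fair = {"ranking": 0.40, "sybil": 0.20, "carousel": 0.15, "star": 0.15, "fp_avoidance": 0.10}


def composite(weights, component_scores):
    """Weighted sum of component scores."""
    return sum(weights[k] * component_scores[k] for k in weights)


def boost(weights, key, new_weight):
    """Raise one weight and shrink the others proportionally (assumed scheme)."""
    scale = (1.0 - new_weight) / (1.0 - weights[key])
    return {k: (new_weight if k == key else w * scale) for k, w in weights.items()}


gamed = boost(fair, "fp_avoidance", 0.33)
honest_score = composite(fair, scores)
gamed_score = composite(gamed, scores)
# The gamed score is higher even though no component score changed.
```

Because the FP component is already at its maximum, any weight moved onto it raises the composite while the underlying system stays exactly the same.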

The Fix (Resolved)

Composite weights are now locked as constants in the evaluation harness:

  • 40% ranking quality, 20% sybil detection
  • 15% carousel detection, 15% star detection, 10% FP avoidance
  • 40% weight on worst-case correlation, balanced against mean correlation
  • Optimizer can only tune structural parameters
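
A minimal sketch of what locked weights might look like in a Python evaluation harness follows. The names `LOCKED_WEIGHTS`, `WORST_CASE_WEIGHT`, `ranking_quality`, and `composite` are hypothetical, and applying the 40% worst-case weight inside the ranking component (with the remaining 60% on mean correlation) is an assumption; the source does not spell out the exact formula.

```python
from types import MappingProxyType

# Read-only view: assigning to a key raises TypeError, so an optimizer that
# receives this mapping cannot change a weight in place.
LOCKED_WEIGHTS = MappingProxyType({
    "ranking": 0.40,
    "sybil": 0.20,
    "carousel": 0.15,
    "star": 0.15,
    "fp_avoidance": 0.10,
})

# Assumed placement: worst-case vs mean blend inside the ranking component.
WORST_CASE_WEIGHT = 0.40


def ranking_quality(worst_corr, mean_corr):
    """Blend worst-case and mean correlation with the locked 40/60 split."""
    return WORST_CASE_WEIGHT * worst_corr + (1.0 - WORST_CASE_WEIGHT) * mean_corr


def composite(components):
    """Weighted sum over exactly the locked component names."""
    if set(components) != set(LOCKED_WEIGHTS):
        raise ValueError("components must match the locked weight keys")
    return sum(LOCKED_WEIGHTS[k] * components[k] for k in LOCKED_WEIGHTS)
```

In this arrangement the optimizer's search space would contain only structural parameters, and the harness would never accept weights as an argument.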
Honest Scoring Comparison
The locked-weight approach produces an honest score of 0.9397 — the real structural ceiling for synthetic topologies at alpha=0.614.
Metric | Phase 6 (Gamed) | Phase 7b (Locked) | Status
Composite Score | 0.9673 (inflated) | 0.9397 | Honest
Worst Correlation | 0.8319 | 0.8345 | +0.0026
False Positive Rate | 0.0% | 0.0% | Maintained
Sybil Detection | ~93% | 93.7% | Improved
Scoring Weights | Tunable (gamed) | Locked constants | Ungameable
FP Weight | 33% (inflated) | 10% (fair) | Fixed
Worst-Case Weight | 22.5% (suppressed) | 40% (honest) | Fixed

Parameter Sweep Results
25+ experiments with locked weights identified three key structural improvements.

starInDegreeThreshold: 6 → 8 (Biggest Impact)

Single biggest improvement. Eliminated false positives on most topologies by requiring a higher in-degree before flagging star patterns.
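
The effect of raising the threshold can be illustrated with a toy in-degree filter. This is only a sketch: `star_candidates` is a hypothetical helper, and the real detector presumably applies further conditions (such as starLowScoreRatio) that are not modeled here.

```python
from collections import Counter


def star_candidates(edges, in_degree_threshold):
    """Return nodes whose in-degree meets the threshold.

    edges: iterable of (source, target) pairs.
    """
    in_degree = Counter(target for _source, target in edges)
    return {node for node, degree in in_degree.items() if degree >= in_degree_threshold}


# Toy graph: one hub with 7 inbound edges, another with 9.
edges = [(f"a{i}", "hub7") for i in range(7)] + [(f"b{i}", "hub9") for i in range(9)]

flagged_at_6 = star_candidates(edges, 6)  # both hubs
flagged_at_8 = star_candidates(edges, 8)  # only the 9-inbound hub
```

A moderately popular node that would have been flagged at 6 is no longer a candidate at 8, which is the mechanism by which a higher threshold can reduce false positives.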

chainLinearityThreshold: 0.738 → 0.69 (+0.0026 worst-corr)

Tighter chain detection improved worst-case correlation by 0.0026. It is more aggressive at identifying linear citation chains.

reciprocalVerifiedDamping: 0.80 → 0.72 (Structural)

Stricter handling of verified reciprocal pairs. Reduces the trust boost from mutual citations between verified nodes.

The Two-Alpha Paradigm
Different alpha values are optimal for different domains. This is structural, not a tuning artifact.

Synthetic Evaluation: α = 0.614

Composite: 0.9397
False Positive: 0.0%
Sybil Detection: 93.7%

Optimal for controlled synthetic topologies. A phase transition occurs at α ≈ 0.72, where FP jumps from 0% to 1.7%.

Real-World Data: α = 0.85

XBlock AUC: 0.960
False Positive: 17-22%
Sybil Detection: ~95%

Best on Bitcoin/Ethereum real data (2.97M XBlock nodes). FP at this alpha is structural — not fixable by threshold tuning.

Real-World Validation Datasets

Dataset | Nodes | Domain | Key Result
Bitcoin Alpha | 3,783 | Trust network | High ranking correlation
Bitcoin OTC | 5,881 | Trust network | Strong sybil separation
XBlock Phishing | 2,973,489 | Ethereum phishing | AUC 0.960

Phase 7b Optimal Configuration
All 26 structural parameters optimized via systematic sweep with locked composite weights.

Core PageRank

alpha = 0.614
convergenceTolerance = 0.0001
maxIterations = 100
maxDelta = 803
maxInitialScore = 4064
algorithmChoice = 0 (PageRank)
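
To show where alpha, convergenceTolerance, and maxIterations would enter, here is a minimal power-iteration PageRank sketch. It assumes alpha plays the role of the standard PageRank damping factor; maxDelta and maxInitialScore are not modeled because the source does not describe their semantics, and the project's actual L2 implementation may differ.

```python
def pagerank(edges, alpha=0.614, tol=1e-4, max_iter=100):
    """Plain power-iteration PageRank over (source, target) edges.

    alpha: damping factor (assumed meaning of the `alpha` parameter).
    tol: stop when the L1 change between iterations drops below this value.
    max_iter: hard cap on the number of iterations.
    """
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without out-links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        nxt = {v: (1.0 - alpha) / n + alpha * dangling / n for v in nodes}
        for v in nodes:
            for target in out_links[v]:
                nxt[target] += alpha * rank[v] / len(out_links[v])
        delta = sum(abs(nxt[v] - rank[v]) for v in nodes)
        rank = nxt
        if delta < tol:
            break
    return rank


# Tiny example: a and b both cite c, and c cites a.
ranks = pagerank([("a", "c"), ("b", "c"), ("c", "a")])
```

A lower alpha pulls scores toward the uniform baseline, while a higher alpha lets link structure dominate, which is one way to read the difference between the synthetic (0.614) and real-world (0.85) settings.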

Sybil Detection

reciprocalPenalty = 0.82
reciprocalVerifiedDamping = 0.72
clusterDensityThreshold = 0.18
starInDegreeThreshold = 8
starLowScoreRatio = 0.70

Pattern Detection

carouselPenalty3 = 0.865
carouselPenalty4Plus = 0.039
chainMinPathLength = 4
chainMinLinearNodes = 5
chainLinearityThreshold = 0.69

Seed & Citation

seedCapVerified = 3820
seedCapUnverified = 187
massCapTarget = 0.110
seedDecayShifts = 8
citationDiversityMinCitations = 8
citationDiversityEntropyThreshold = 0.548
citationDiversityPenalty = 0.139
weightEndorsement = 0.404
weightCoCitation = 0.326
weightDerivative = 0.323
weightCorrection = 1.225
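
One plausible reading of the three citationDiversity parameters is an entropy check over citing sources. The sketch below is an assumption throughout: normalizing Shannon entropy by the log of the number of distinct sources, and applying the penalty as a multiplicative reduction of `1 - penalty`, are illustrative choices that the source does not confirm.

```python
import math
from collections import Counter

MIN_CITATIONS = 8          # citationDiversityMinCitations
ENTROPY_THRESHOLD = 0.548  # citationDiversityEntropyThreshold
PENALTY = 0.139            # citationDiversityPenalty


def normalized_entropy(sources):
    """Shannon entropy of the citing-source distribution, scaled to [0, 1]."""
    counts = Counter(sources)
    if len(counts) <= 1:
        return 0.0
    total = len(sources)
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(len(counts))


def diversity_multiplier(sources):
    """Score multiplier for a node, given the list of sources citing it."""
    if len(sources) < MIN_CITATIONS:
        return 1.0  # too few citations to judge diversity
    if normalized_entropy(sources) < ENTROPY_THRESHOLD:
        return 1.0 - PENALTY  # assumed multiplicative form
    return 1.0
```

Under these assumptions, a node cited eight times by a single source is penalized, while a node cited once each by eight distinct sources is not.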
Locked Composite Weights
These weights are constants in the evaluation harness. No optimizer can modify them.
Ranking: 40%
Sybil: 20%
Carousel: 15%
Star: 15%
FP Avoidance: 10%
Worst-Case: 40%

Why Locked Weights Matter

When an AI optimization loop can modify the scoring function it's being evaluated against, it will exploit the scoring function rather than improve the underlying system. Phase 6 demonstrated this: the optimizer boosted FP weight to 33% when FP was already 0%, gaining free score without structural improvement. The rule: never let an optimization loop modify its own evaluation criteria.

Built On Prior Work
Phase 7b compounds findings from 11 prior research tracks deployed March 18-25.

Three-Layer Architecture

L1 (on-chain QVAC), L2 (off-chain PageRank), L3 (citation signals). Phase 7b optimizes L2.
Status: Foundation

Reputation Signals

Citation diversity, co-citation weighting, derivative signals. All integrated as tunable parameters.
Status: Integrated

Cost Analysis

On-chain attestation costs vs hybrid model. Off-chain PageRank avoids per-epoch gas.
Status: Validated

Trust Anchor (Shyft x Keycard)

Verified seed nodes via Shyft KYC or Keycard hardware attestation. seedCapVerified=3820.
Status: Configured

Token Economics

Stable + RMT dual token. Trust scores feed into staking weight and governance power.
Status: Architecture

Deployment Roadmap

Testnet → Mainnet rollout with progressive trust thresholds and seed decay.
Status: Phase 8 Next

Phase 8 Roadmap
The Grok CTO review approved the locked-weight approach. The next priorities come from the cross-model review.

Topology-Aware Alpha

Context-dependent alpha selection: lower alpha for sparse/synthetic graphs, higher for dense real-world networks. Resolves the two-alpha paradigm without manual switching.
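
A minimal sketch of how such a selector could look, assuming average degree is used as the density signal. The `avg_degree_cutoff` value is a hypothetical placeholder, not a figure from the research, and `select_alpha` is not an existing function in the project.

```python
SYNTHETIC_ALPHA = 0.614   # best on synthetic topologies
REAL_WORLD_ALPHA = 0.85   # best on Bitcoin/Ethereum real data


def select_alpha(num_nodes, num_edges, avg_degree_cutoff=4.0):
    """Pick alpha from graph density.

    avg_degree_cutoff is a hypothetical threshold; a real selector would
    need to be calibrated against the validation datasets.
    """
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    avg_degree = num_edges / num_nodes
    return REAL_WORLD_ALPHA if avg_degree >= avg_degree_cutoff else SYNTHETIC_ALPHA
```

A hard switch like this is the simplest option; interpolating alpha between the two values as density grows would be a natural refinement.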

Adversarial Defense Layer

The FP rate of 17-22% at alpha=0.85 is structural, so it needs a dedicated adversarial robustness layer rather than threshold tuning. Priority: reduce FP while maintaining high sybil detection.

Elliptic++ Validation

822K-node Bitcoin dataset with richer features. Extends validation beyond XBlock phishing to general blockchain trust assessment.

Citation Freshness

Strongest identified defense against score manipulation (projected 0.1% FP). Time-decay on citation weight prevents stale endorsement attacks.
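
Time decay on citation weight could take many forms; the sketch below assumes simple exponential decay. The `half_life_days` value is a hypothetical placeholder, and `fresh_weight` is an illustrative helper rather than part of the described system.

```python
def fresh_weight(base_weight, age_days, half_life_days=180.0):
    """Exponentially decay a citation weight by its age.

    half_life_days is a hypothetical value: after that many days the
    citation contributes half of its original weight.
    """
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    return base_weight * 0.5 ** (age_days / half_life_days)


# An endorsement (weightEndorsement = 0.404) that is two half-lives old
# contributes a quarter of its original weight.
stale_endorsement = fresh_weight(0.404, 360)
```

With decay in place, an attacker cannot rely on a batch of old endorsements indefinitely; keeping a score inflated requires a continuous supply of fresh citations.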

Dynamic Parameterization

Grok CTO wants parameters that adapt to network topology rather than static values. Sensitivity analysis is needed for robustness guarantees.

L1 QVAC Prototype

On-chain Quadratic Voting Attestation Circuit. Verified trust anchors feed directly into smart contract scoring.

Multi-Model Research Pipeline
Autonomous research using Karpathy-pattern experiment loops with cross-model validation.

Execution Pipeline

  • Draft: qwen3:8b (local, free, fast pre-filter)
  • Execute: Codex/GPT-5.4 (precise multi-step logic)
  • Score: Grok (CTO judgment calls)
  • Review: Grok + Codex panel
  • Context: Grok (X.com insights via xAI API)

Validation Approach

  • 5 synthetic topology families (honest, sybil ring, carousel, star, mixed)
  • 8 graph instances per topology
  • 25+ locked-weight experiments in Phase 7b
  • Cross-validated on 3 real-world Bitcoin/Ethereum datasets
  • Grok CTO sign-off on all strategic decisions
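
The synthetic evaluation grid described above (5 families × 8 instances) can be enumerated as follows. This is a sketch: the family identifiers and the `summarize` helper are hypothetical, and it assumes one ranking correlation is computed per graph instance.

```python
from itertools import product
from statistics import mean

FAMILIES = ("honest", "sybil_ring", "carousel", "star", "mixed")
INSTANCES_PER_FAMILY = 8


def evaluation_grid():
    """All (family, instance_index) pairs: 5 families x 8 instances = 40 graphs."""
    return list(product(FAMILIES, range(INSTANCES_PER_FAMILY)))


def summarize(correlations):
    """Reduce per-graph correlations to the mean and the worst case.

    correlations: dict mapping (family, instance_index) -> ranking correlation.
    """
    values = list(correlations.values())
    return {"mean": mean(values), "worst": min(values)}
```

Reporting the worst case alongside the mean is what keeps a single badly handled topology from being averaged away.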