
Wallet Safety Benchmarks

How do you know a wallet protects users instead of just claiming to?

Benchmark wallet safety on architectural guarantees, not feature checklists. A wallet that passes these benchmarks prevents known failure classes by design. A wallet that fails them relies on users making no mistakes — which is not safety.

Benchmark Checklist (Landing Surface)

This checklist is the deterministic entry point. Review starts here, then moves to chain-specific implementations.

| Dimension | Deterministic Check | Result |
| --- | --- | --- |
| Connection Safety | App never receives key material; signing fully delegated | Pass / Warn / Fail |
| Transaction Transparency | Preview shows full effect before sign (assets, gas, steps) | Pass / Warn / Fail |
| Destructive Protection | Irreversible actions require explicit informed consent | Pass / Warn / Fail |
| Asset Visibility | All owned assets are enumerable; nothing hidden | Pass / Warn / Fail |
| Asset Operations | Transfers/listings are safe by default; no key lifecycle side-effects | Pass / Warn / Fail |

If any critical check fails, the overall result is Fail.

Three Implementations (Comparison Track)

Use one checklist, three implementations:

| Track | Status | Primary Primitive | Comparison Purpose |
| --- | --- | --- | --- |
| Solana | Active reference | Runtime guards + simulation | Baseline implementation for account-based chain safety |
| Sui (Move) | Primary engineering target | Object model + Move constraints | Deterministic safety with stronger architectural guarantees |
| EVM | Planned | Account/storage model + contract-level guards | Cross-ecosystem comparability and broader adoption |

The benchmark is chain-agnostic; implementations are chain-specific.

Value Role

| Without Benchmarks | With Benchmarks |
| --- | --- |
| Safety claims are marketing | Safety claims are testable |
| Each team reinvents protections | Proven patterns are reusable |
| Failures are discovered by users | Failure classes are prevented by architecture |
| Trust depends on brand | Trust depends on evidence |

Core Benchmarks

Five dimensions, each derived from a known failure class in wallet engineering. Every dimension links to reference implementations for Sui and Solana.

1. Connection Safety

Can the wallet establish a session without exposing private keys?

| Criterion | Threshold | Test Method |
| --- | --- | --- |
| App never receives private key or seed phrase | Zero exposure | Code audit: no key material in app state or network calls |
| Wallet adapter delegates signing to user's wallet | 100% of transactions | Integration test: app requests signature, wallet signs |
| Seedless onboarding path exists (zkLogin or equivalent) | Available | Functional test: complete onboarding without seed phrase |
| Disconnection fully clears session state | Zero residual auth | State audit after disconnect: no tokens, keys, or session data |

Reference: Connection Patterns — Sui, Connection Patterns — Solana
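The "zero residual auth" check can be sketched as a recursive scan of the app state captured after disconnect. This is a minimal sketch, not the reference implementation: the marker list, the function name `residual_auth_entries`, and the shape of the state dictionary are all assumptions made for illustration.

```python
from collections.abc import Mapping

# Hypothetical name fragments that suggest auth material; a real audit
# would use the field names of the wallet adapter under test.
SENSITIVE_MARKERS = ("token", "key", "seed", "session", "secret")


def residual_auth_entries(state, prefix="", inherited=False):
    """Return dotted paths of non-empty entries that look like auth material."""
    found = []
    for name, value in state.items():
        path = f"{prefix}{name}"
        # A child of a sensitive container is treated as sensitive too.
        sensitive = inherited or any(m in name.lower() for m in SENSITIVE_MARKERS)
        if isinstance(value, Mapping):
            found.extend(residual_auth_entries(value, f"{path}.", sensitive))
        elif sensitive and value not in (None, "", [], {}):
            found.append(path)
    return found


# Example state snapshot taken after the user disconnects.
after_disconnect = {
    "theme": "dark",
    "wallet": {"address": None, "sessionToken": ""},
    "cache": {"authToken": "abc"},
}
leftovers = residual_auth_entries(after_disconnect)
```

An empty result means the snapshot passes; here `leftovers` names the one entry that survived the disconnect.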

2. Transaction Transparency

Can the user see exactly what will happen before signing?

| Criterion | Threshold | Test Method |
| --- | --- | --- |
| Transaction simulation available before signing | 100% of transaction types | Dry-run every supported operation, verify preview matches outcome |
| Object/balance changes shown in human-readable form | All affected assets visible | UI test: compare preview display against actual state change |
| Gas cost estimated before execution | Estimate within 10% of actual | Compare estimate to settled cost across 100 transactions |
| Multi-operation transactions show all steps | Every operation in batch visible | PTB/batch test: verify each sub-operation is individually listed |

Reference: Transaction Safety — Sui, Transaction Safety — Solana
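The gas-estimate threshold can be checked mechanically once estimated and settled costs are recorded side by side. The sketch below assumes the 10% bound applies to every sampled transaction individually (the benchmark text does not say whether it is per transaction or on average); the function name and the sample data are hypothetical.

```python
def gas_estimate_failures(samples, tolerance=0.10):
    """Return indexes of (estimated, actual) pairs outside the tolerance.

    Assumption: the bound is applied per transaction, relative to the
    settled (actual) cost.
    """
    failures = []
    for index, (estimated, actual) in enumerate(samples):
        if actual <= 0:
            raise ValueError("settled gas cost must be positive")
        relative_error = abs(estimated - actual) / actual
        if relative_error > tolerance:
            failures.append(index)
    return failures


# Illustrative pairs in gas units: (estimate shown to user, settled cost).
samples = [(1000, 1000), (1100, 1000), (1101, 1000), (890, 1000)]
failing = gas_estimate_failures(samples)
```

A run over the full 100-transaction sample passes the criterion when the returned list is empty.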

3. Destructive Operation Protection

Does the wallet prevent irreversible actions without explicit, informed consent?

| Criterion | Threshold | Test Method |
| --- | --- | --- |
| Destructive operations require multi-step confirmation | All irreversible actions gated | Attempt every destructive operation: verify confirmation dialog fires |
| Confirmation includes plain-language description of consequences | 100% of destructive dialogs | UX audit: can a non-expert understand what they will lose? |
| Typed confirmation required for high-severity actions | Key deletion, large transfers | Attempt without typing: verify action is blocked |
| Cooldown period for highest-severity operations | Configurable delay (default > 0) | Timer test: verify action cannot execute before cooldown expires |
| Notifications never trigger key lifecycle operations | Zero key mutations from notifications | Simulate every notification type: verify no key/seed state change |

Reference: Destructive Operations — Sui, Destructive Operations — Solana
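The typed-confirmation and cooldown criteria combine naturally into a single gate. The following is a sketch under stated assumptions: `DestructiveRequest`, `may_execute`, the 60-second default, and the confirmation phrase are hypothetical names and values, and the clock is passed in explicitly so the timer test stays deterministic.

```python
from dataclasses import dataclass


@dataclass
class DestructiveRequest:
    action: str            # e.g. "delete_key"
    requested_at: float    # timestamp in seconds when the user asked
    typed_phrase: str = "" # what the user typed into the confirmation box


def may_execute(request, now, required_phrase, cooldown_seconds=60.0):
    """Allow the action only after typed confirmation and an elapsed cooldown."""
    if cooldown_seconds <= 0:
        # The benchmark requires a default delay greater than zero.
        raise ValueError("cooldown must be greater than zero")
    if request.typed_phrase != required_phrase:
        return False
    return now - request.requested_at >= cooldown_seconds


request = DestructiveRequest("delete_key", requested_at=1000.0, typed_phrase="DELETE")
too_early = may_execute(request, now=1030.0, required_phrase="DELETE")
after_cooldown = may_execute(request, now=1060.0, required_phrase="DELETE")
```

Injecting `now` rather than reading the system clock lets the timer test assert both the blocked and the allowed case without sleeping.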

4. Asset Visibility

Does the wallet show everything the user owns, with nothing hidden?

| Criterion | Threshold | Test Method |
| --- | --- | --- |
| All owned assets enumerable in one view | 100% of owned objects/tokens | Compare wallet display against on-chain state query |
| Value-at-risk calculation before destructive operations | Total value shown | Pre-destruction audit: verify amount displayed matches chain state |
| Hidden or zero-display assets flagged | No silent omissions | Create edge-case assets (dust, unknown tokens): verify they appear |
| Asset type filtering and search available | Functional | UX test: filter by type, search by name, verify results |

Reference: Object Audit — Sui, Balance Guard — Solana
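Comparing the wallet display against an on-chain query reduces to a set difference over asset identifiers. This sketch assumes both sides can be exported as lists of ID strings; the function name and the sample IDs are illustrative, and how the IDs are fetched (RPC query, UI scrape) is left out.

```python
def visibility_gaps(on_chain_ids, displayed_ids):
    """Compare owned asset IDs against what the wallet UI lists."""
    owned, shown = set(on_chain_ids), set(displayed_ids)
    return {
        "hidden": sorted(owned - shown),   # owned but not displayed
        "phantom": sorted(shown - owned),  # displayed but not owned
    }


# Illustrative IDs: the dust token exists on-chain but is missing in the UI.
gaps = visibility_gaps(["0xa", "0xb", "0xdust"], ["0xa", "0xb"])
```

Any entry under `hidden` is a silent omission, which the scoring rules treat as a critical failure for this dimension.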

5. Asset Operations

Are transfers, swaps, and listings safe by default?

| Criterion | Threshold | Test Method |
| --- | --- | --- |
| Transfer previews recipient and amount before signing | 100% of transfer types | Initiate transfer: verify preview before confirmation |
| Asset operations never trigger key lifecycle changes | Zero key mutations | Execute every asset operation: audit key state before and after |
| Marketplace integration uses standard protocols | Kiosk, escrow, or equivalent | List/buy/delist: verify standard protocol used, not custom |
| Failed transactions revert cleanly with clear error | No partial state corruption | Force failure scenarios: verify state rollback and error message |

Reference: Asset Operations — Sui, Asset Handling — Solana
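The "audit key state before and after" test method can be sketched by fingerprinting key metadata around each operation. Everything here is hypothetical scaffolding: `FakeWallet` stands in for the wallet under test, and the fingerprint is assumed to cover only non-secret metadata (key IDs, versions), never raw key material.

```python
import hashlib
import json


def key_state_fingerprint(key_state):
    """Hash a canonical JSON form of non-secret key metadata."""
    canonical = json.dumps(key_state, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class FakeWallet:
    """Hypothetical in-memory wallet used only to exercise the audit."""

    def __init__(self):
        self.key_state = {"key_id": "k1", "version": 1}
        self.balances = {"SUI": 100}


def mutates_key_state(operation, wallet):
    """Run the operation and report whether key metadata changed."""
    before = key_state_fingerprint(wallet.key_state)
    operation(wallet)
    return key_state_fingerprint(wallet.key_state) != before


def transfer(wallet):
    wallet.balances["SUI"] -= 10      # touches assets only


def rogue_listing(wallet):
    wallet.key_state["version"] += 1  # violates the benchmark


wallet = FakeWallet()
transfer_mutates = mutates_key_state(transfer, wallet)
listing_mutates = mutates_key_state(rogue_listing, wallet)
```

The same helper could wrap simulated notifications to test the notification criterion in Dimension 3.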


Scoring

Each dimension scores Pass / Warn / Fail:

| Result | Condition | Action |
| --- | --- | --- |
| Pass | All thresholds met for the dimension | Promote: safe for production use |
| Warn | One non-critical threshold missed | Correct and re-test within one cycle |
| Fail | Any critical threshold missed | Hold deployment until resolved |

Critical thresholds, listed by the violation that triggers an automatic Fail:

  • App receives private key or seed phrase (Dimension 1)
  • Destructive operation executes without confirmation (Dimension 3)
  • Notification triggers key lifecycle change (Dimension 3)
  • Owned assets not visible to user (Dimension 4)
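The scoring rules can be expressed as a small function over per-threshold results. This is a sketch, not the benchmark's official scorer: `ThresholdResult` and `score_dimension` are hypothetical names, and because the rules above do not say what happens when two or more non-critical thresholds are missed, the sketch assumes the conservative outcome (Fail).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThresholdResult:
    name: str
    met: bool
    critical: bool = False


def score_dimension(results):
    """Map threshold results for one dimension to Pass, Warn, or Fail."""
    missed = [r for r in results if not r.met]
    if any(r.critical for r in missed):
        return "Fail"   # any critical miss is an automatic Fail
    if not missed:
        return "Pass"
    if len(missed) == 1:
        return "Warn"   # exactly one non-critical miss
    # Assumption: several non-critical misses are not defined by the
    # benchmark text; treated here as Fail.
    return "Fail"


connection = [
    ThresholdResult("no key exposure", met=True, critical=True),
    ThresholdResult("seedless onboarding", met=False),
]
connection_score = score_dimension(connection)
```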

Certification Mode (Deterministic)

Treat this as certifiable evidence, not narrative review.

| Requirement | Deterministic Rule |
| --- | --- |
| Evidence format | Each dimension must include reproducible test evidence (code path, test case, observed output) |
| Reviewer independence | Builder and commissioner cannot be the same actor |
| Repeatability | Same test inputs produce same result state |
| State model | Result is explicit: Pass / Warn / Fail with threshold reason |
| Audit trail | Results and decision traces are stored for re-verification |
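One way to make these rules checkable is to store each piece of evidence as a structured record with a validation step and a stable digest. The record layout, field names, and functions below are assumptions for illustration; the benchmark does not prescribe a schema, and the sample values (file path, test name, team names) are invented.

```python
import hashlib
import json
from dataclasses import asdict, dataclass

VALID_RESULTS = ("Pass", "Warn", "Fail")


@dataclass(frozen=True)
class Evidence:
    dimension: str
    result: str
    threshold_reason: str
    code_path: str
    test_case: str
    observed_output: str
    builder: str
    commissioner: str


def validation_problems(evidence):
    """Return rule violations; an empty list means the record is acceptable."""
    problems = []
    if evidence.result not in VALID_RESULTS:
        problems.append("result must be Pass, Warn, or Fail")
    if evidence.builder == evidence.commissioner:
        problems.append("builder and commissioner must differ")
    for field in ("threshold_reason", "code_path", "test_case", "observed_output"):
        if not getattr(evidence, field).strip():
            problems.append(f"{field} is required")
    return problems


def evidence_digest(evidence):
    """Stable SHA-256 over canonical JSON, usable for later re-verification."""
    canonical = json.dumps(asdict(evidence), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


base = dict(
    dimension="Connection Safety", result="Pass", threshold_reason="Zero exposure",
    code_path="src/connect.ts", test_case="test_no_key_in_state",
    observed_output="0 findings", builder="team-a", commissioner="team-b",
)
record = Evidence(**base)
```

Because the digest is computed over sorted, canonical JSON, identical evidence always yields the same digest, which supports the repeatability and audit-trail rows.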

Target direction: encode benchmark attestations onchain, beginning with Sui/Move.

Aggregate Score

| Level | Requirement | Meaning |
| --- | --- | --- |
| Level 0 | Untested | No benchmark evidence |
| Level 1 | 3 of 5 dimensions Pass, zero Fail | Minimum viable safety |
| Level 2 | All 5 dimensions Pass | Production-grade safety |
| Level 3 | Level 2 + proven across 2+ chains | Cross-chain safety standard |

The Sui Wallet Safety PRD targets Level 3 — patterns proven on both Sui and Solana.
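The level rules can be sketched as a function over the five dimension results. Two assumptions are made and flagged in the code: "3 of 5" is read as "at least 3", and a tested wallet that falls short of Level 1 is reported as Level 0 because the table defines no level in between. The names `DIMENSIONS` and `aggregate_level` are hypothetical.

```python
DIMENSIONS = (
    "Connection Safety",
    "Transaction Transparency",
    "Destructive Protection",
    "Asset Visibility",
    "Asset Operations",
)


def aggregate_level(results, chains_proven=1):
    """Map per-dimension results (name -> Pass/Warn/Fail) to a level 0-3."""
    if not results:
        return 0  # untested: no benchmark evidence
    outcomes = [results.get(name) for name in DIMENSIONS]
    passes = outcomes.count("Pass")
    fails = outcomes.count("Fail")
    if passes == len(DIMENSIONS):
        return 3 if chains_proven >= 2 else 2
    # Assumption: "3 of 5" is read as "at least 3 of 5".
    if passes >= 3 and fails == 0:
        return 1
    # Assumption: tested but below Level 1 is reported as Level 0.
    return 0


all_pass = {name: "Pass" for name in DIMENSIONS}
two_warn = {**all_pass, "Asset Visibility": "Warn", "Asset Operations": "Warn"}
one_fail = {**all_pass, "Asset Visibility": "Fail"}
```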


Chain-Specific Considerations

The same five dimensions apply across chains, but the architectural primitives differ:

| Dimension | Account-Based Chains (Solana, EVM) | Object-Based Chains (Sui) |
| --- | --- | --- |
| Connection Safety | Wallet adapter pattern | Wallet adapter + zkLogin (seedless) |
| Transaction Transparency | Simulation via RPC dry-run | PTB inspection (1024 ops, atomic) |
| Destructive Protection | Runtime checks in UI code | Compile-time guarantees (Move type system) |
| Asset Visibility | Must discover token accounts | All objects enumerable by default |
| Asset Operations | Chain-specific token standards | Unified object model (coins = NFTs = objects) |

Object-based chains have architectural advantages in dimensions 3 and 4 — the type system prevents failure classes that account-based chains must guard against in UI code.


Operating Cadence

| Cadence | Activity |
| --- | --- |
| Pre-deploy | Full benchmark suite against new wallet build |
| Per-release | Regression test on all five dimensions |
| Monthly | Edge-case audit (dust tokens, unknown assets, gas spikes) |
| Quarterly | Cross-chain benchmark comparison and threshold review |

Adoption Path

These benchmarks are designed to be extractable — any wallet team can adopt them:

| Stage | What Happens | Output |
| --- | --- | --- |
| 1. Self-audit | Team runs benchmarks against their wallet | Score card (Level 0-3) |
| 2. Publish results | Score card made public | Comparable safety claims |
| 3. Peer review | Independent team verifies score | Validated safety level |
| 4. Standard adoption | Multiple wallets benchmark to same spec | Industry safety standard |

The goal is not certification. The goal is comparable, evidence-based safety claims that users can evaluate before trusting a wallet with their assets.


Context