Part 2 of 3. ← Part 1 — the unreasonable reality. Part 3 — workflow, CLAUDE.md, MCP, .claude folder, cross-review with Codex →.
Accounts and billing — the pitfall-heaviest section

Sub2API account management: an iOS 20x account rate-limited (429), 5h window 100%, auto-recovers in 35 min.
For users outside the US, billing alone filters out half the people. Here's the path I walked.
IP is the first gate. Anthropic detects IPs aggressively; direct connections from many regions risk a ban. My approach: a US residential-IP VPS running a relay (sub2api, findable on GitHub). Must be a residential IP, not a datacentre IP — many cheap VPS providers give datacentre IPs that Anthropic identifies.
Three payment paths, each with a catch:
- Web direct. Use a US residential IP to subscribe on claude.ai directly, credit card or Google Pay. Cleanest. Price is the listed price, no extra fees.
- Google Play. Subscribe through the Google Play store. Convenient, but you get charged ~$50 extra in tax/fees. A tax-free-state address theoretically avoids it; I tried a few rounds and never fully solved it.
- Apple Store. iOS in-app purchase, same problem — ~$50 extra. Apple tax.
Which plan? I bought 2× Max x20.
Conclusion first: if you're a heavy user (8+ hours a day on Claude Code), one Max x20 ($200/mo) isn't enough. My current setup is two Max x20 accounts running in parallel. The reason: one x20's 5-hour window is ~900 messages, and one task averages 8 messages, so a 5-hour window covers roughly 100 tasks (900 / 8 ≈ 112). Sounds like a lot, but if you run multiple cmux panes in parallel plus occasional Opus-heavy requests, you'll drain a morning's quota by lunch.
Two x20s in rotation covers a full day of heavy development. One hits the wall, switch to the other, wait for the 5-hour window to reset, switch back. $400/mo isn't cheap, but a full-time junior dev's monthly salary is much more.
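The quota arithmetic can be sketched in a few lines of Python. The 900-message window and the 8 messages per task are the estimates given above; the per-pane message rate in the demo is a made-up illustrative figure, not a measurement, so plug in your own.

```python
# Back-of-the-envelope quota maths for one Max x20 account.
WINDOW_MESSAGES = 900      # approx. messages per 5-hour window (estimate from the text)
MESSAGES_PER_TASK = 8      # average messages per task (estimate from the text)


def tasks_per_window(window_messages: int = WINDOW_MESSAGES,
                     messages_per_task: int = MESSAGES_PER_TASK) -> int:
    """Whole tasks that fit into one 5-hour window."""
    return window_messages // messages_per_task


def hours_until_drained(panes: int, messages_per_pane_hour: float,
                        window_messages: int = WINDOW_MESSAGES) -> float:
    """Hours until the window is used up when several panes run in parallel.

    messages_per_pane_hour is a hypothetical rate; measure your own.
    """
    return window_messages / (panes * messages_per_pane_hour)


if __name__ == "__main__":
    print(tasks_per_window())            # 112
    # Assumed for illustration: 4 panes, each sending 75 messages per hour.
    print(hours_until_drained(4, 75.0))  # 3.0
```

Under those assumed numbers a single account is empty three hours into a five-hour window, which is the gap the second account fills.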
If you're unsure of your usage, don't open with x20. Start with Max x5 ($100/mo) for a week or two to feel your daily rhythm. Upgrade if it's not enough; Anthropic supports upgrade anytime, prorated. Most non-all-day users find one x5 is actually enough — provided you manage context well (covered below). If x5 hits the wall by afternoon regularly, consider x20 or the dual-account approach.
Tools — terminal and monitoring

Vibe Island: the MacBook notch as a real-time AI-agent control panel.

cmux: a native macOS terminal on Ghostty — vertical tabs, split panes, agent-done notifications.

CodexBar: a macOS menu-bar readout of each tool's quota and reset countdown.

claude-hud plugin: real-time context-consumption percentage in the status bar.
Vibe Island turns the MacBook notch into an AI console. The most pleasant surprise I've found recently — it turns the MacBook notch area into a real-time control panel for AI agents. Claude Code, Codex, Gemini CLI, Cursor — all agent statuses at a glance, approve permission requests right in the notch, answer Claude's questions, jump to the right terminal. You write docs or read requirements in the editor while Claude Code runs in the background; when it needs you, the notch pops out — click Allow or Deny, no window-switching. Zero config, writes all hooks on first launch. Native Swift, under 50MB memory — not another Electron wrapper. One-time $19.99, no subscription.
cmux as the terminal. If you run multiple Claude Code instances at once, you need a good terminal multiplexer. cmux is a native macOS app on Ghostty — vertical tabs, split panes, embedded browser, and crucially it auto-detects Claude Code task completion and sends desktop notifications. Lifesaving for parallel development. Better than tmux: no config file, no prefix key, native GUI integration. Run 3 agents on different features, cmux notifies you when any one needs human intervention.
CodexBar for usage. A macOS menu-bar tool showing your Claude Code quota in real time — how much of the 5-hour window used, weekly remaining, when it resets. Anthropic doesn't provide this (deliberately), so you install it yourself. Supports a dozen tools' usage monitoring (Codex, Cursor, Gemini), local parsing, no password. Free, open source.
Community plugins:
- /plugin install claude-hud: real-time context-consumption percentage in the status bar, so you can see at a glance how much space is left.
- /plugin install code-simplifier: official, auto-simplifies redundant code, indirectly saves tokens.
Config — my full settings.json
File location: ~/.claude/settings.json
{
  "alwaysThinkingEnabled": true,
  "model": "opus",
  "enabledPlugins": {
    "claude-hud@claude-hud": true
  },
  "statusLine": {
    "type": "command",
    "command": "(claude-hud status-line command, auto-configured on install)"
  },
  "env": {
    "ANTHROPIC_AUTH_TOKEN": "sk-your-token",
    "ANTHROPIC_BASE_URL": "http://your-relay",
    "CLAUDE_CODE_NO_FLICKER": "1",
    "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1",
    "CLAUDE_CODE_DISABLE_FILE_CHECKPOINTING": "1",
    "CLAUDE_CODE_DISABLE_FEEDBACK_SURVEY": "1",
    "CLAUDE_CODE_DISABLE_AUTO_MEMORY": "0",
    "CLAUDE_CODE_ADDITIONAL_DIRECTORIES_CLAUDE_MD": "1",
    "DISABLE_EXTRA_USAGE_COMMAND": "1",
    "MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES": "3"
  }
}
Line by line:
| Variable | Value | Effect |
|---|---|---|
| CLAUDE_CODE_NO_FLICKER | 1 | Stabilises the new Claude Code TUI |
| CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC | 1 | Important. Kills all non-essential network requests — telemetry, analytics, error reports. Mandatory if you use a relay; reduces traffic and detection risk |
| CLAUDE_CODE_DISABLE_FILE_CHECKPOINTING | 1 | Disables file checkpoints. By default Claude snapshots files for rollback on each edit; turning it off saves I/O and tokens |
| CLAUDE_CODE_DISABLE_FEEDBACK_SURVEY | 1 | Disables the satisfaction survey popup; don't interrupt my flow |
| CLAUDE_CODE_DISABLE_AUTO_MEMORY | 0 | Keep auto-memory (0 = don't disable). Claude organises your project preferences at night — worth keeping |
| CLAUDE_CODE_ADDITIONAL_DIRECTORIES_CLAUDE_MD | 1 | Loads CLAUDE.md from extra directories added via --add-dir. Mandatory for multi-repo work |
| DISABLE_EXTRA_USAGE_COMMAND | 1 | Disables extra usage-query requests; every bit helps |
| MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES | 3 | Circuit-breaker — stops after 3 consecutive auto-compact failures, preventing a token-burning loop |
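If you already have a settings.json, pasting a whole new file over it would drop whatever keys you had. Here is a minimal sketch of merging only the env block instead, assuming the file is plain JSON at the path given above. The helper names merge_env and update_settings_file are my own, not part of Claude Code.

```python
import json
from pathlib import Path

SETTINGS_PATH = Path.home() / ".claude" / "settings.json"

# A subset of the env block from the config above.
WANTED_ENV = {
    "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1",
    "CLAUDE_CODE_DISABLE_FEEDBACK_SURVEY": "1",
    "MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES": "3",
}


def merge_env(settings: dict, wanted: dict) -> dict:
    """Return a copy of settings with wanted merged into its "env" block.

    Existing top-level keys and unrelated env entries are preserved;
    keys present in both take the value from wanted.
    """
    merged = dict(settings)
    merged["env"] = {**settings.get("env", {}), **wanted}
    return merged


def update_settings_file(path: Path = SETTINGS_PATH) -> None:
    """Read the settings file (if any), merge the env block, write it back."""
    current = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(merge_env(current, WANTED_ENV), indent=2) + "\n")


if __name__ == "__main__":
    # Dry run on an in-memory dict; call update_settings_file() to touch disk.
    demo = {"model": "opus", "env": {"ANTHROPIC_BASE_URL": "http://your-relay"}}
    print(json.dumps(merge_env(demo, WANTED_ENV), indent=2))
```

The dry run leaves "model" and the relay URL untouched and only adds the three variables, so it is safe to inspect before writing anything.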
Launch — my daily alias
alias cc="claude --dangerously-skip-permissions"
One alias. --dangerously-skip-permissions skips every permission prompt, for file edits and shell commands alike; the efficiency gain in agent mode is significant. Note: this means you fully trust Claude not to do something dumb.
Daily flow: open cmux → split panes by task (one bug per pane, one feature per pane) → cc in each pane → /clear when done → next. cmux is your task manager; each pane is an independent agent workstation.
Mode — opusplan is the hidden optimum

opusplan mode: Opus reasons in the Plan phase, auto-switches to Sonnet for execution — ~68% token savings.
Claude Code has three permission modes, cycled with Shift+Tab: Normal (confirm each step) → Auto-Accept (auto-execute) → Plan (plan only, don't execute). Most people only know the first two.
The real killer is /model opusplan.
opusplan is a secret-menu model alias: Opus for architecture decisions and complex reasoning in Plan mode, auto-switching to Sonnet for code in the execution phase. You don't switch models manually; it decides.
Why optimal? Opus's reasoning is a tier above Sonnet, but its token consumption is a tier higher too. Most code generation is fine on Sonnet; only the "figure out how" step needs Opus. opusplan allocates automatically, like the head chef writing the recipe and the line cook executing it. Quality doesn't drop, cost drops a lot.
Real data: a 100K-token feature implementation, opusplan saves ~68% tokens versus all-Opus, with no architecture quality loss.
My daily routine: first thing on launch, /model opusplan. For a particularly nasty bug (race condition, type inference, weird third-party error), temporarily /model opus for full Opus thinking, then switch back.
Effort level is worth tuning too. /effort selects low / medium / high / max. Low and medium suit simple tasks (rename a variable, add a field) — fast and cheap. High is default. Max is Opus-only, no thinking-token budget cap, for the "I genuinely cannot figure out why this code is broken" moments.
Context management — 40% is the key number, cmux is the vehicle
The most important section.
Claude Code's context window defaults to 200K tokens (Max users on Opus can reach 1M). The system triggers auto-compaction at 87% usage. But waiting for 87% means quality has already started dropping — at high context occupancy, attention scatters and the model starts dropping instructions.
My experience: manually /compact at 40–50%.
But /compact is a tactic. The strategy is — use cmux for task isolation, one clean context per task.
My actual workflow: open multiple cmux panes, each for one independent task. Left pane runs "fix the auth module bug," right pane runs "add a new API endpoint," another runs "write unit tests." Each pane is its own Claude Code instance.
The key: when a task finishes, /clear right in that pane, then start the next. Don't run multiple unrelated tasks in one context — the biggest token waste. If you ask Claude to write a UI component right after fixing an auth bug, all the auth context (files read, plans discussed, tests run) sits uselessly in the window, and the model has to remember the useless while understanding the new.
So the core rhythm:
- Within one task. Manual /compact every 5–6 turns, trigger at 40% context. You can add retention instructions, e.g. /compact keep the API change list and modified file list.
- On task switch. /clear to wipe completely. The cmux pane stays, your file changes stay, but Claude's context resets: it re-reads CLAUDE.md for project background and starts the new task with minimal context.
- Parallel tasks. cmux multi-pane, each runs its own, no cross-pollution. Vibe Island shows each pane's status in the notch.
With claude-hud installed, the status bar shows context-occupancy percentage in real time. Glance at it to know whether to compact.
Advanced config: to trigger auto-compaction earlier than 87%, set CLAUDE_AUTOCOMPACT_PCT_OVERRIDE. Set it to 50 and the system auto-compacts at 50% — combine with manual compaction for a double safety net.
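The rhythm above boils down to a small decision rule. Here is a sketch of it, using the 40% manual threshold and the 87% default auto-compaction threshold from the text. The function and its return strings are purely illustrative and not something Claude Code exposes. The auto_compact_pct parameter stands in for whatever you set CLAUDE_AUTOCOMPACT_PCT_OVERRIDE to.

```python
MANUAL_COMPACT_PCT = 40   # manual /compact threshold recommended in the text
AUTO_COMPACT_PCT = 87     # default auto-compaction threshold cited in the text


def next_action(context_pct: float, task_finished: bool,
                auto_compact_pct: float = AUTO_COMPACT_PCT) -> str:
    """Suggest what to do in a pane, given its context occupancy in percent."""
    if task_finished:
        return "/clear"                  # task switch: wipe the context
    if context_pct >= auto_compact_pct:
        return "auto-compact imminent"   # the system will compact on its own
    if context_pct >= MANUAL_COMPACT_PCT:
        return "/compact"                # compact by hand before quality drops
    return "continue"


if __name__ == "__main__":
    print(next_action(45, task_finished=False))                       # /compact
    print(next_action(55, task_finished=False, auto_compact_pct=50))  # auto-compact imminent
    print(next_action(20, task_finished=True))                        # /clear
```

With the override at 50, the manual window shrinks to the 40–50% band, which is the "double safety net" in practice.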
The cmux + /clear combo essentially simulates assigning a fresh intern to each task. Each intern only knows their own thing, undisturbed by others' work. Far more efficient — and cheaper — than one intern doing everything from morning to night.
Off-peak usage
Anthropic has written peak-hour quota reduction into formal policy: US Eastern 8am–2pm is peak, quota shrinks. Weekday off-peak and all weekend are more generous. If you're in an Asian timezone, you have a natural advantage — your day is their overnight trough.
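To check whether a given moment falls inside that peak window, here is a small sketch using Python's standard zoneinfo module. It assumes the peak applies on weekdays only, which is my reading of the sentence above rather than a quoted policy. The function name is_peak is my own.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

try:
    EASTERN = ZoneInfo("America/New_York")
except ZoneInfoNotFoundError:
    # No tz database on this machine: fall back to fixed EST (ignores DST).
    EASTERN = timezone(timedelta(hours=-5))

PEAK_START_HOUR = 8    # 8am US Eastern, from the text
PEAK_END_HOUR = 14     # 2pm US Eastern, from the text


def is_peak(moment: datetime) -> bool:
    """True if moment is inside the weekday 8am-2pm US Eastern window.

    moment should be timezone-aware; a naive datetime is treated as
    local time by astimezone().
    """
    local = moment.astimezone(EASTERN)
    is_weekday = local.weekday() < 5          # Monday=0 ... Friday=4
    return is_weekday and PEAK_START_HOUR <= local.hour < PEAK_END_HOUR


if __name__ == "__main__":
    # Monday 6 Jan 2025, 10:00 at UTC+8 is still Sunday evening in New York.
    asia_morning = datetime(2025, 1, 6, 10, 0, tzinfo=timezone(timedelta(hours=8)))
    print(is_peak(asia_morning))  # False
```

A morning at UTC+8 lands in the US overnight, so the check returns False, which matches the timezone advantage described above.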