[Updated with Anthropic's latest response] Cache anomalies in official Claude subscriptions, and temporary workarounds

2026-04-11 12:23


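Problem description:

Folks subscribed to the official Claude Pro and Max plans may have noticed recently that quota is being deducted at an abnormal rate.

In a few other communities I frequent, there are even cases where a handful of simple "hi" messages burned through a large chunk of quota.

~~(Meanwhile, suspected victims on this site):~~

A/, you really are something: five "hi"s ate 12% of the Max 5x five-hour quota (category: 搞七捻三)

Maybe the context was a bit long, but that's still outrageous... [image] [image]

Claude quota bug, consumption abnormally fast? (category: 搞七捻三)

[image] Used claude opus 4.6 the whole time, occasionally using ccg to call codex+gemini, on max 20x, ran 3 projects, and hit 81% in two and a half hours? Is that right?

Claude Code quota consumption is too fast (category: 开发调优)

I'm on max*5, and over the last few days the quota has been going really fast. I've barely asked a few questions and 50% of one 5-hour limit is already gone; it used to be only about 10%. Such a huge gap, this is ridiculous.

Has cursor's quota been quietly changed? (category: 搞七捻三)

cursor's quota consumption has clearly gone up lately; even an ultra account can't take much. A day of heavy use used to cost 7%-8% at most, and now it easily reaches 20% of the API quota. Does anyone know what's going on?

Claude quota running out!!! (category: 开发调优)

[image] Still two days until reset. It feels like max 20x quota is draining hard right now and it's not enough... what do I do for these two days?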

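As it happens, a few days ago some ~~about-to-be-very-unlucky~~ Anthropic employee mishandled something and leaked the Claude Code source to the whole internet. A Redditor (skibidi-toaleta-2137) did a reverse-engineering analysis, concluded that this is most likely a cache bug, and laid out the probable causes of the cache anomaly along with the specific issues ^[1].

The issues mentioned in that Reddit post can be viewed here: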

github.com/anthropics/claude-code

[BUG] Conversation history invalidated on subsequent turns

Opened 08:59AM - 29 Mar 26 UTC by jmarianski. Labels: bug, has repro, platform:linux, area:cost, area:core, regression

### Preflight Checklist

- [x] I have searched [existing issues](https://github.…com/anthropics/claude-code/issues?q=is%3Aissue%20state%3Aopen%20label%3Abug) and this hasn't been reported yet
- [x] This is a single bug report (please file separate reports for different bugs)
- [x] I am using the latest version of Claude Code

### What's Wrong?

While investigating huge token usage I've noticed it come due to fact suddenly my conversation history gets invalidated and all subsequent turns revert to only caching system prompt and huge cache writes.

### What Should Happen?

Cache shouldn't drop due to history changes. History should not be updated. Or we shouldn't be charged for historical updates.

### Error Messages/Logs

```shell
Analysis of token usage from the start of my analysis:

time       cache_read cache_cr   input   out   model              stop
---------- ---------- ---------- ------- ----- ------------------ ------------
22:22:48   312377     1944       1       215   opus-4-6           end_turn
22:23:39   314321     493        3       159   opus-4-6           end_turn
22:24:19   314814     172        3       108   opus-4-6           end_turn
22:33:42   0          0          8       1     haiku-4-5-20251001 max_tokens  <-- resume
22:34:26   11428      305735     3       213   opus-4-6           tool_use  <-- irrelevant cache rewrite after restart
22:34:35   317163     579        1       239   opus-4-6           tool_use
22:34:43   317742     566        1       152   opus-4-6           end_turn
22:37:13   318308     245        3       96    opus-4-6           end_turn
07:55:55   0          0          8       1     haiku-4-5-20251001 max_tokens  <-- resume
07:56:22   11428      163547     3       143   opus-4-6           tool_use  <-- partial cache regenerate (wth?)
07:56:40   174975     358        1       90    opus-4-6           end_turn
07:57:25   11428      307626     3       87    opus-4-6           end_turn  <-- full cache regenerate
07:57:51   319054     108        3       89    opus-4-6           tool_use
07:58:05   319162     712        1       448   opus-4-6           tool_use
07:58:21   319874     833        1       367   opus-4-6           end_turn
07:59:21   320707     393        3       414   opus-4-6           tool_use
07:59:34   321100     609        1       560   opus-4-6           tool_use
07:59:47   321709     948        1       512   opus-4-6           tool_use
08:00:10   322657     615        1       348   opus-4-6           end_turn
08:03:00   323272     426        3       530   opus-4-6           tool_use
08:03:12   323698     972        1       468   opus-4-6           tool_use
08:03:22   324670     529        1       167   opus-4-6           end_turn
08:03:29   325199     215        3       28    opus-4-6           end_turn
08:05:30   0          0          8       1     haiku-4-5-20251001 max_tokens  <-- resume
08:05:48   11428      187695     3       155   opus-4-6           tool_use
08:06:06   199123     876        1       780   opus-4-6           tool_use
08:06:25   199999     2199       1       1285  opus-4-6           tool_use
08:06:38   202198     1633       1       302   opus-4-6           end_turn
08:08:06   203831     481        3       88    opus-4-6           tool_use
08:08:17   204312     408        1       175   opus-4-6           end_turn
08:09:16   204720     206        3       154   opus-4-6           end_turn
08:10:25   204926     228        3       503   opus-4-6           tool_use
08:10:34   205154     1193       1       507   opus-4-6           tool_use
08:10:45   206347     1007       1       247   opus-4-6           end_turn
08:10:54   0          0          8       1     haiku-4-5-20251001 max_tokens  <-- resume
08:11:16   11428      195983     3       136   opus-4-6           tool_use
08:11:37   207411     616        1       1207  opus-4-6           tool_use
08:11:49   208027     1457       1       290   opus-4-6           end_turn
08:12:02   209484     323        3       270   opus-4-6           end_turn
08:12:27   209807     284        3       190   opus-4-6           tool_use
08:12:39   210091     314        1       278   opus-4-6           tool_use
08:13:01   210405     728        1       1219  opus-4-6           tool_use
08:13:18   211133     1465       1       449   opus-4-6           end_turn
08:15:28   212598     599        3       325   opus-4-6           end_turn
08:16:26   213197     334        3       209   opus-4-6           end_turn
08:18:00   213531     288        3       137   opus-4-6           tool_use
08:18:07   213819     1131       1       122   opus-4-6           tool_use
08:18:15   214950     140        1       193   opus-4-6           tool_use
08:18:29   215090     1114       1       269   opus-4-6           tool_use
08:18:54   216204     10504      1       336   opus-4-6           tool_use  <-- cache starts breaking down due to history change*
08:19:07   216204     11815      1       228   opus-4-6           tool_use
08:19:17   216204     12990      1       134   opus-4-6           tool_use
08:19:38   216204     13341      1       301   opus-4-6           tool_use
08:20:04   216204     13758      1       426   opus-4-6           tool_use
08:20:18   216204     15278      1       154   opus-4-6           tool_use
08:20:46   216204     15778      1       508   opus-4-6           tool_use
08:22:25   216204     17092      1       208   opus-4-6           tool_use
08:22:51   216204     17894      1       660   opus-4-6           tool_use
08:23:22   11428      224502     1       315   opus-4-6           end_turn  <-- cache cannot get regenerated, reverting to full cache write
08:24:47   11428      224953     3       871   opus-4-6           tool_use
08:25:10   11428      227259     1       597   opus-4-6           tool_use
08:25:24   11428      228249     1       356   opus-4-6           tool_use
08:25:43   11428      228669     1       825   opus-4-6           tool_use
08:26:01   11428      229763     1       468   opus-4-6           tool_use
08:26:22   11428      230278     1       339   opus-4-6           end_turn
08:28:07   11428      230642     3       442   opus-4-6           end_turn
08:37:30   11428      231432     3       430   opus-4-6           end_turn
---
(Ignore hour, it's another day) When running "npx @anthropic-ai/claude-code"
21:28:59   11374      46622      1       473   opus-4-6           tool_use  <-- still on standalone binary
22:02:13   0          0          8       1     haiku-4-5-20251001 max_tokens  <-- I tried resuming a couple of times
22:02:23   0          0          340     11    haiku-4-5-20251001 end_turn
22:02:25   11374      15278      3       21    opus-4-6           end_turn
22:04:51   0          0          8       1     haiku-4-5-20251001 max_tokens
22:04:58   0          0          341     11    haiku-4-5-20251001 end_turn
22:05:00   11374      15194      3       20    opus-4-6           end_turn
22:09:20   0          0          8       1     haiku-4-5-20251001 max_tokens
22:09:35   0          0          8       1     haiku-4-5-20251001 max_tokens
22:12:16   0          0          341     11    haiku-4-5-20251001 end_turn
22:12:18   11374      15194      3       21    opus-4-6           end_turn
22:15:36   0          0          8       1     haiku-4-5-20251001 max_tokens
23:22:46   0          0          8       1     haiku-4-5-20251001 max_tokens
23:23:06   0          0          341     12    haiku-4-5-20251001 end_turn
23:23:09   11374      17262      3       19    opus-4-6           end_turn
23:23:26   28636      27         3       12    opus-4-6           end_turn
23:31:41   0          0          8       1     haiku-4-5-20251001 max_tokens
23:31:50   0          0          345     13    haiku-4-5-20251001 end_turn
23:31:54   11374      17188      3       32    opus-4-6           end_turn  <-- start of npx trials
23:32:25   28562      51         3       167   opus-4-6           end_turn
23:33:52   28613      320        3       666   opus-4-6           end_turn
23:34:55   0          0          8       1     haiku-4-5-20251001 max_tokens
23:35:12   0          0          355     15    haiku-4-5-20251001 end_turn
23:35:22   11374      17198      3       328   opus-4-6           end_turn
23:36:50   28572      367        3       500   opus-4-6           end_turn
23:37:15   28939      506        3       143   opus-4-6           tool_use
23:37:19   29445      523        1       91    opus-4-6           tool_use
23:37:56   29968      4869       1       1284  opus-4-6           tool_use
23:38:06   34837      1343       1       173   opus-4-6           end_turn
23:38:19   36180      219        3       151   opus-4-6           tool_use
23:38:27   36399      9511       1       341   opus-4-6           tool_use
23:38:33   45910      442        73      77    opus-4-6           tool_use
23:38:37   46352      250        1       77    opus-4-6           tool_use
23:38:42   46602      1161       1       134   opus-4-6           tool_use
23:38:59   47763      415        1       369   opus-4-6           tool_use
23:39:06   48178      427        1       96    opus-4-6           tool_use
23:39:09   48605      393        1       77    opus-4-6           tool_use
23:39:13   48605      639        1       152   opus-4-6           tool_use
23:40:17   11374      38207      3       362   opus-4-6           end_turn
23:41:35   49581      438        3       766   opus-4-6           end_turn
23:43:02   0          0          8       1     haiku-4-5-20251001 max_tokens  <-- another session
23:43:23   11374      40201      3       97    opus-4-6           tool_use
23:43:30   51575      122        1       310   opus-4-6           tool_use
23:43:35   51697      408        1       152   opus-4-6           tool_use
23:43:41   52105      219        93      170   opus-4-6           tool_use
23:43:49   52324      442        1       259   opus-4-6           tool_use
23:43:54   52766      558        1       102   opus-4-6           tool_use
23:44:08   53324      2593       1       403   opus-4-6           end_turn
23:51:33   0          0          8       1     haiku-4-5-20251001 max_tokens
23:52:30   55917      431        3       187   opus-4-6           tool_use
23:54:20   56348      292        37      284   opus-4-6           tool_use
23:54:29   56640      612        158     492   opus-4-6           tool_use
23:54:54   60657      13         3       508   opus-4-6           end_turn
23:58:58   60670      717        2       454   opus-4-6           tool_use
23:59:05   61387      847        1       336   opus-4-6           tool_use
23:59:23   62234      1282       1       674   opus-4-6           tool_use
23:59:34   63516      1024       1       506   opus-4-6           tool_use
23:59:45   64540      583        1       264   opus-4-6           tool_use
23:59:53   65123      284        1       393   opus-4-6           tool_use
00:00:10   65407      470        1       887   opus-4-6           tool_use
00:03:07   65877      1024       1       871   opus-4-6           tool_use
00:03:16   66901      2098       1       538   opus-4-6           tool_use
00:03:25   68999      1492       1       379   opus-4-6           tool_use
00:03:36   70491      1043       1       640   opus-4-6           tool_use
00:03:43   71534      704        1       233   opus-4-6           tool_use
00:03:51   72238      250        1       148   opus-4-6           tool_use
00:03:58   72488      355        1       249   opus-4-6           tool_use
00:04:03   72843      396        1       259   opus-4-6           tool_use
00:04:10   73239      435        1       278   opus-4-6           tool_use
00:04:31   73674      359        1       941   opus-4-6           tool_use
00:04:45   74033      1595       1       662   opus-4-6           tool_use
00:05:00   75628      914        1       830   opus-4-6           tool_use
00:05:18   76542      1610       1       963   opus-4-6           tool_use
00:05:31   78152      1379       1       640   opus-4-6           tool_use
00:05:41   79531      1374       1       549   opus-4-6           tool_use
00:05:51   80905      1576       1       550   opus-4-6           tool_use
00:06:06   82481      629        1       986   opus-4-6           tool_use
00:06:20   83110      1102       1       994   opus-4-6           tool_use
00:06:30   84212      1074       1       578   opus-4-6           tool_use
00:06:48   85286      2418       1       854   opus-4-6           tool_use
00:07:01   87704      880        1       555   opus-4-6           tool_use
00:07:13   88584      818        1       601   opus-4-6           tool_use
00:07:30   89402      2165       1       520   opus-4-6           end_turn
00:11:02   91567      542        3       691   opus-4-6           tool_use
00:11:12   92109      777        1       733   opus-4-6           tool_use
00:11:25   92886      1042       1       578   opus-4-6           tool_use
00:11:39   93928      3347       1       694   opus-4-6           tool_use
00:12:21   98211      39         3       442   opus-4-6           end_turn
00:13:24   98250      462        3       161   opus-4-6           tool_use
00:13:32   98712      274        1       178   opus-4-6           end_turn
00:15:06   98986      190        3       237   opus-4-6           tool_use
00:15:37   99176      362        1       1202  opus-4-6           tool_use
00:15:42   6637       16169      2       114   opus-4-6           tool_use
00:15:44   99538      1427       1       280   opus-4-6           tool_use
00:15:47   22806      3503       1       160   opus-4-6           tool_use
00:15:49   100965     2929       1       52    opus-4-6           end_turn  <-- my joy is great at this point
00:15:50   26309      191        1       91    opus-4-6           tool_use
00:15:54   26500      217        1       92    opus-4-6           tool_use
00:15:57   26717      152        1       88    opus-4-6           tool_use
00:16:02   26869      607        3       125   opus-4-6           tool_use
00:16:06   27476      146        1       166   opus-4-6           tool_use
00:16:09   27622      430        1       105   opus-4-6           tool_use
00:16:14   28052      123        1       109   opus-4-6           tool_use
00:16:19   28052      542        1       176   opus-4-6           tool_use
00:16:23   28594      440        1       95    opus-4-6           tool_use
00:16:28   29034      120        1       112   opus-4-6           tool_use
00:16:32   29154      139        1       180   opus-4-6           tool_use
00:16:37   29293      256        1       206   opus-4-6           tool_use
00:17:09   29549      1065       1       82    opus-4-6           tool_use
00:17:12   30614      131        1       117   opus-4-6           tool_use
00:17:24   30745      135        1       79    opus-4-6           tool_use
00:17:28   30745      237        1       100   opus-4-6           tool_use
00:17:35   30982      140        1       138   opus-4-6           tool_use
00:17:39   31122      222        1       112   opus-4-6           tool_use
00:17:46   31344      226        1       257   opus-4-6           tool_use
00:17:51   31570      288        1       125   opus-4-6           tool_use
00:17:56   31858      214        1       128   opus-4-6           tool_use
00:18:01   32072      316        1       139   opus-4-6           tool_use
00:18:04   32388      487        1       116   opus-4-6           tool_use
00:18:09   32875      134        1       125   opus-4-6           tool_use
00:19:29   33009      143        1       5925  opus-4-6           tool_use
00:19:35   33152      5912       1       113   opus-4-6           tool_use  <-- yup, was at 100% usage at this point
00:19:36   0          0          0       0     -                  -
00:19:36   0          0          0       0     -                  -
10:08:54   11374      58355      3       270   opus-4-6           tool_use  <-- costly resume, but cache TTL = 1h in claude code
10:09:00   69729      538        1       277   opus-4-6           tool_use
10:09:07   70267      7086       180     188   opus-4-6           tool_use
10:09:14   77353      436        1       201   opus-4-6           tool_use
10:09:18   77789      219        1       118   opus-4-6           tool_use
10:09:34   78008      518        1       183   opus-4-6           tool_use
10:10:40   78526      444        1       95    opus-4-6           tool_use
10:10:46   78970      1213       1       185   opus-4-6           tool_use
10:10:55   80183      996        1       270   opus-4-6           tool_use
10:13:03   81179      602        1       268   opus-4-6           tool_use
10:18:08   81781      675        1       121   opus-4-6           tool_use
10:18:14   82456      148        1       226   opus-4-6           tool_use
10:29:13   82604      823        1       184   opus-4-6           tool_use
10:29:19   83427      889        1       239   opus-4-6           tool_use
---------- ---------- ---------- ------- ----- ------------------ ------------

If you provide me with means, I can send you full request/response dumps

*- no idea if this cache breaking was due to me inspecting binary or some historical tool change happened on the background level.
```

### Steps to Reproduce

Write "cch=00000" in command line and ask claude what does he see. He still should see "cch=00000". And token usage should be all "cache read" mostly, not "cache write" for subsequent requests.

Step to temporarily fix: `npx @anthropic-ai/claude-code@2.1.34` // you need to fix it on older version to benefit from it

### Claude Model

Opus

### Is this a regression?

Yes, this worked in a previous version

### Last Working Version

Based on reports: 2.1.67

### Claude Code Version

2.1.86 (Claude Code)

### Platform

Anthropic API

### Operating System

Ubuntu/Debian Linux

### Terminal/Shell

Other

### Additional Information

Similar issue: https://github.com/anthropics/claude-code/issues/34629 - this one relates to immediate start of conversation

Tool I wrote for debugging: https://gitlab.com/treetank/cc-diag

Verification script: https://gitlab.com/treetank/cc-diag/-/raw/c126a7890f2ee12f76d91bfb1cc92612ae95284e/test_cache.py
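To see why a cache miss hurts so much, the rows of the log above can be weighted by price. This is a minimal sketch, assuming Anthropic's published prompt-caching multipliers for the API (cache reads at 0.1x the base input price, cache writes at 1.25x for the 5-minute TTL or 2x for the 1-hour TTL). How subscription quota is actually computed is not public, so the results are only relative "input-token units"; the function and constant names are made up for illustration.

```python
# Relative price multipliers versus one uncached input token.
# Assumption: Anthropic's published API prompt-caching pricing; the
# accounting behind Pro/Max quotas is not public and may differ.
CACHE_READ_MULT = 0.1
CACHE_WRITE_MULT_5M = 1.25
CACHE_WRITE_MULT_1H = 2.0


def input_cost_units(cache_read, cache_write, uncached_input,
                     write_mult=CACHE_WRITE_MULT_1H):
    """Return the input-side cost of one request in base-input-token units."""
    return (cache_read * CACHE_READ_MULT
            + cache_write * write_mult
            + uncached_input)


# Two consecutive rows from the log: (cache_read, cache_cr, input).
miss = input_cost_units(11428, 305735, 3)  # 22:34:26, right after a resume
hit = input_cost_units(317163, 579, 1)     # 22:34:35, cache warm again

print(f"miss: {miss:,.0f} units, hit: {hit:,.0f} units, "
      f"ratio: {miss / hit:.1f}x")
```

Under either write multiplier, the request that missed the cache costs more than ten times the warm one, even though both carry roughly the same context.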

github.com/anthropics/claude-code

[BUG] Prompt cache regression in --print --resume since v2.1.69(?): cache_read never grows, ~20x cost increase

Opened 12:42PM - 15 Mar 26 UTC, closed 01:26AM - 01 Apr 26 UTC, by cinniezra. Labels: bug, has repro, platform:linux, area:cost, regression

A few related issues the Reddit post doesn't mention:

github.com/anthropics/claude-code

[BUG] Client-side rate limiter blocks requests with zero API calls when conversation transcript is large (~74MB) — false rate_limit error with synthetic model and 0 input/output tokens

Opened 01:21PM - 29 Mar 26 UTC by rwp65. Labels: bug, has repro, area:core

### Preflight Checklist

- [x] I have searched [existing issues](https://github.…com/anthropics/claude-code/issues?q=is%3Aissue%20state%3Aopen%20label%3Abug) and this hasn't been reported yet
- [x] This is a single bug report (please file separate reports for different bugs)
- [x] I am using the latest version of Claude Code

### What's Wrong?

After hours of inactivity in a long-running session, every new message from the user immediately returns `"API Error: Rate limit reached"` without making any API call. The error is generated client-side by Claude Code, not by the Anthropic API. The user cannot proceed with any work — every message, including simple ones like "proceed", triggers the same error.

### What Should Happen?

After hours of inactivity, the rate limit budget should have fully reset. A simple message should be sent to the API and receive a normal response.

### Error Messages/Logs

```shell
Session log: `~/.claude/projects/-home-rich-RE6D/7137463d-be5d-4d5e-a97d-bb12b5e44b58.jsonl`

**Six consecutive blocked requests between 13:11:09 and 13:11:28 UTC on 2026-03-29:**

Each error entry has this structure:

{
  "type": "assistant",
  "message": {
    "model": "<synthetic>",
    "role": "assistant",
    "usage": {
      "input_tokens": 0,
      "output_tokens": 0,
      "cache_creation_input_tokens": 0,
      "cache_read_input_tokens": 0
    },
    "content": [
      { "type": "text", "text": "API Error: Rate limit reached" }
    ]
  },
  "error": "rate_limit",
  "isApiErrorMessage": true
}

Key observations:

| Field | Value | Significance |
|-------|-------|-------------|
| `model` | `"<synthetic>"` | NOT a real API response — generated by Claude Code client |
| `input_tokens` | `0` | No tokens were sent to the API |
| `output_tokens` | `0` | No tokens were received from the API |
| `cache_read_input_tokens` | `0` | No cache was accessed |
| `isApiErrorMessage` | `true` | Claude Code flagged this as an API error |
| `error` | `"rate_limit"` | Client-side classification |

**Contrast with the first successful request after the user persisted (13:11:37 UTC):**

{
  "model": "claude-opus-4-6",
  "usage": {
    "input_tokens": 3,
    "cache_creation_input_tokens": 1315,
    "cache_read_input_tokens": 668864,
    "output_tokens": 1,
    "service_tier": "standard"
  }
}
```

### Steps to Reproduce

1. Run a Claude Code session for multiple days with heavy agent usage (many subagent dispatches, large code changes)
2. Accumulate a conversation transcript of ~74MB (the `.jsonl` file grows as the session continues)
3. Leave the session idle for several hours
4. Send any message (e.g., "proceed")
5. Observe: immediate `"API Error: Rate limit reached"` with no actual API call

### Claude Model

Opus

### Is this a regression?

Yes, this worked in a previous version

### Last Working Version

_No response_

### Claude Code Version

2.1.81

### Platform

Anthropic API

### Operating System

Other Linux

### Terminal/Shell

Xterm

### Additional Information

# Bug Report: Client-side rate limiter blocks requests with zero API calls when conversation transcript is large

## Title

Client-side rate limiter blocks requests with zero API calls when conversation transcript is large (~74MB) — false rate_limit error with synthetic model and 0 input/output tokens

## Environment

- **Claude Code Version:** 2.1.81
- **OS:** Ubuntu Linux 6.17.0-19-generic
- **Shell:** bash
- **Model:** claude-opus-4-6 (1M context)
- **Platform:** CLI (`entrypoint: "cli"`)
- **Session ID:** 7137463d-be5d-4d5e-a97d-bb12b5e44b58

## Description

After hours of inactivity in a long-running session, every new message from the user immediately returns `"API Error: Rate limit reached"` without making any API call. The error is generated client-side by Claude Code, not by the Anthropic API. The user cannot proceed with any work — every message, including simple ones like "proceed", triggers the same error.

## Steps to Reproduce

1. Run a Claude Code session for multiple days with heavy agent usage (many subagent dispatches, large code changes)
2. Accumulate a conversation transcript of ~74MB (the `.jsonl` file grows as the session continues)
3. Leave the session idle for several hours
4. Send any message (e.g., "proceed")
5. Observe: immediate `"API Error: Rate limit reached"` with no actual API call

## Expected Behavior

After hours of inactivity, the rate limit budget should have fully reset. A simple message should be sent to the API and receive a normal response.

## Actual Behavior

Claude Code's client-side rate limiter blocks the request before it reaches the Anthropic API. The user sees `"API Error: Rate limit reached"` and cannot use the tool at all.

## Evidence from Logs

Session log: `~/.claude/projects/-home-rich-RE6D/7137463d-be5d-4d5e-a97d-bb12b5e44b58.jsonl`

**Six consecutive blocked requests between 13:11:09 and 13:11:28 UTC on 2026-03-29:**

Each error entry has this structure:

```json
{
  "type": "assistant",
  "message": {
    "model": "<synthetic>",
    "role": "assistant",
    "usage": {
      "input_tokens": 0,
      "output_tokens": 0,
      "cache_creation_input_tokens": 0,
      "cache_read_input_tokens": 0
    },
    "content": [
      { "type": "text", "text": "API Error: Rate limit reached" }
    ]
  },
  "error": "rate_limit",
  "isApiErrorMessage": true
}
```

Key observations:

| Field | Value | Significance |
|-------|-------|-------------|
| `model` | `"<synthetic>"` | NOT a real API response — generated by Claude Code client |
| `input_tokens` | `0` | No tokens were sent to the API |
| `output_tokens` | `0` | No tokens were received from the API |
| `cache_read_input_tokens` | `0` | No cache was accessed |
| `isApiErrorMessage` | `true` | Claude Code flagged this as an API error |
| `error` | `"rate_limit"` | Client-side classification |

**Contrast with the first successful request after the user persisted (13:11:37 UTC):**

```json
{
  "model": "claude-opus-4-6",
  "usage": {
    "input_tokens": 3,
    "cache_creation_input_tokens": 1315,
    "cache_read_input_tokens": 668864,
    "output_tokens": 1,
    "service_tier": "standard"
  }
}
```

This successful request shows `cache_read_input_tokens: 668,864` — the session context is approximately **668K tokens**. This is likely what the client-side rate limiter is counting against the budget.

## Root Cause Hypothesis

The client-side rate limiter appears to calculate the token cost of the next request by estimating the context size (668K+ tokens) and checking it against a per-minute or per-hour token budget. For very large sessions, the CONTEXT ALONE may exceed the rate limit budget — even though the user's actual message is just a few tokens.

This creates a situation where:

- The session grows over days of heavy use
- The context window fills with conversation history
- Eventually the context size exceeds the rate limit's per-window token budget
- Every subsequent request is blocked client-side, regardless of actual API availability
- The user is permanently locked out until they start a new session

## Session Size Data

| Metric | Value |
|--------|-------|
| Session transcript file | 74,019,933 bytes (74MB) |
| Estimated context tokens | 668,864 (from cache_read_input_tokens) |
| Session duration | ~4 days (2026-03-25 to 2026-03-29) |
| Subagents dispatched | 50+ over the session |
| Session compactions | Multiple (context was compressed during the session) |

## Impact

- **Severity:** High — user is completely blocked from using Claude Code
- **Workaround:** Start a new session (loses all conversation context)
- **User experience:** Extremely frustrating — the error message gives no indication that the session size is the problem, and retrying makes it worse (each retry attempt may count against the budget)

## Suggested Fix

1. **Don't count cached/context tokens against the rate limit budget** — the user isn't "using" more tokens by having a long session. The cache is already paid for.
2. **If rate limiting must include context, reset the budget after idle periods** — hours of inactivity should fully reset any per-minute/per-hour budget.
3. **Show a more helpful error message** — instead of "API Error: Rate limit reached", show "Session context is very large (668K tokens). Consider starting a new session with `/compact` or a fresh session."
4. **Distinguish client-side rate limiting from API rate limiting** — the current message is identical for both, making it impossible for the user to diagnose.
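The distinguishing fields listed in that issue can be checked mechanically. This is a minimal sketch, assuming each transcript record is a JSON object shaped like the error entry quoted above; the helper name `is_client_side_error` is hypothetical and not part of Claude Code.

```python
import json


def is_client_side_error(entry):
    """Return True if a transcript entry looks like a locally generated error.

    Heuristic taken from the issue: the entry is flagged as an API error,
    the model is "<synthetic>", and every numeric usage counter is zero,
    which suggests no request ever reached the API.
    """
    message = entry.get("message", {})
    usage = message.get("usage", {})
    return (
        entry.get("isApiErrorMessage") is True
        and message.get("model") == "<synthetic>"
        and all(value == 0 for value in usage.values()
                if isinstance(value, int))
    )


# The blocked entry quoted in the issue, as one JSON record.
blocked = json.loads("""
{"type": "assistant",
 "message": {"model": "<synthetic>", "role": "assistant",
             "usage": {"input_tokens": 0, "output_tokens": 0,
                       "cache_creation_input_tokens": 0,
                       "cache_read_input_tokens": 0},
             "content": [{"type": "text",
                          "text": "API Error: Rate limit reached"}]},
 "error": "rate_limit", "isApiErrorMessage": true}
""")
print(is_client_side_error(blocked))
```

A record with a real model name and non-zero token counts fails the check, so counting matches in a session log gives a rough idea of how many "rate limit" errors never left the client.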

github.com/anthropics/claude-code

[BUG] Silent context degradation — tool results cleared without notification on 1M context sessions this issue documents three separate mechanisms (microcompact, cached microcompact, session memory compact)

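Opened 11:50AM - 02 Apr 26 UTC by Sn3th. Labels: bug, has repro, platform:linux, area:core

On the same day, Lydia Hallie, an Anthropic employee, acknowledged on X/Twitter that quota is being consumed too quickly and said an investigation was under way:

https://x.com/lydiahallie/status/2038686571676008625

Anthropic also posted a similar official statement on the r/Anthropic subreddit:

https://www.reddit.com/r/Anthropic/comments/1s7zfap/investigating_usage_limits_hitting_faster_than/

After several days of build-up and a flood of user complaints on social media, the BBC picked up the story and reported on it yesterday: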

bbc.com

Claude Code users hitting usage limits 'way faster than expected'

Anthropic, the company behind the AI coding assistant, said it was fixing a problem blocking users.

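That's roughly the timeline.

But knowing where the problem lies isn't enough. As users, what can we do about it right now?

First, Anthropic released v2.1.91 yesterday, and this new version partially fixes #40524 and #34629. So step one is to upgrade to v2.1.91 as soon as possible (you can pin it).
~~Uninstall the officially recommended standalone binary (bun runtime, ELF) and install and run the NPM package instead, so that the sentinel replacement doesn't pollute the cache prefix and send consumption through the roof.~~
Start a new session regularly.
Avoid resuming sessions, including --continue, --continue --dangerously-skip-permissions, and /resume; resuming drives cache_read -> 0 and triggers cache_creation, so consumption takes off.
Avoid /dream and /insights; their background API calls also make consumption take off.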
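One way to check whether a session is affected is to scan its transcript for turns where the cached prefix collapses. This is a minimal sketch, assuming the `.jsonl` session log holds one JSON object per line with a `message.usage` block like the one quoted in the rate-limiter issue; the function name and the 50% threshold are arbitrary illustrative choices, not anything Claude Code provides.

```python
import json


def find_cache_drops(jsonl_lines, drop_ratio=0.5):
    """Yield (index, model, previous_read, read, written) for suspicious turns.

    A turn is flagged when its cache_read_input_tokens falls below
    drop_ratio times the previous turn of the same model while tokens
    are being written to the cache again.
    """
    last_read = {}  # model name -> cache_read_input_tokens of its last turn
    for index, line in enumerate(jsonl_lines):
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        message = entry.get("message")
        if not isinstance(message, dict) or not message.get("usage"):
            continue  # user messages and other records carry no usage
        usage = message["usage"]
        model = message.get("model", "unknown")
        read = usage.get("cache_read_input_tokens", 0)
        written = usage.get("cache_creation_input_tokens", 0)
        previous = last_read.get(model)
        if previous and written > 0 and read < previous * drop_ratio:
            yield index, model, previous, read, written
        last_read[model] = read


def _turn(model, read, written):
    """Build one synthetic transcript line for the demo below."""
    usage = {"cache_read_input_tokens": read,
             "cache_creation_input_tokens": written}
    return json.dumps({"type": "assistant",
                       "message": {"model": model, "usage": usage}})


# Four rows borrowed from the first issue's log (cache_read, cache_cr).
sample = [
    _turn("opus-4-6", 314814, 172),
    _turn("haiku-4-5-20251001", 0, 0),    # the "resume" row
    _turn("opus-4-6", 11428, 305735),     # cache rewritten after resume
    _turn("opus-4-6", 317163, 579),
]
for drop in find_cache_drops(sample):
    print(drop)
```

Tracking the last value per model keeps the small haiku calls that accompany a resume from masking the drop on the main model. Only the first turn of a sustained miss is flagged, which is enough to locate where a session went wrong.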

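Pray that A/ does the decent thing: fixes the bug promptly, resets quotas, and brings consumption back down.

Similar posts on this site:

Three days ago someone here also posted a summary of skibidi-toaleta-2137's Reddit post:

The cache invalidation problem after /resume in Claude Code (category: 开发调优)

Saw this and am reposting it. The original was written by claude, so I won't paste it here. https://www.reddit.com/r/ClaudeAI/comments/1s7mkn3/psa_claude_code_has_two_cache_bugs_that_can/ Verification script https://gitlab.com/treetank/cc-diag/-/raw/c126a7890f2ee12f76d91bfb…

Two days ago someone also proposed similar countermeasures:

Some optimization tips for Claude Code quota draining fast (category: 开发调优)

My own Claude Max quota has been draining extremely fast lately. From posts I've read online, it may be related to the cache invalidation bug, and part of it is probably the extra context load that came with Claude Opus moving to a 1M context. Right now a 5-hour quota feels used up in under 2 hours (max 5x). I chatted with Claude Code for a few rounds, had it research the cause online, and summarized the following tips (typed entirely by hand): 1) Try to avoid directly using /re…

Sources:

Part of the information above comes from the claude-code-cache-analysis report compiled by ArkNill.

If you're interested in the root cause analysis and the benchmarks, you can read the report here: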

github.com

GitHub - ArkNill/claude-code-cache-analysis: Measured analysis of Claude Code cache bugs...

Measured analysis of Claude Code cache bugs causing 10-20x token inflation on Max plans

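Hope this helps, folks~

– 04/03 morning update –

After finishing this post, I found that v2.1.91 had already been released. With the latest version you can keep using the standalone binary that Anthropic recommends; installing via NPM is no longer necessary.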
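Whether an installed build is at or past the release mentioned here can be checked with a plain tuple comparison. This is a minimal sketch; the helper names are hypothetical, and it assumes version strings of the form `MAJOR.MINOR.PATCH`, optionally followed by text such as `(Claude Code)` as in the version line quoted in the first issue.

```python
def parse_version(text):
    """Turn '2.1.91' or '2.1.86 (Claude Code)' into a tuple of ints."""
    first_token = text.strip().split()[0]
    return tuple(int(part) for part in first_token.lstrip("v").split("."))


def is_at_least(installed, required="2.1.91"):
    """Compare versions numerically, so 2.1.9 sorts before 2.1.91."""
    return parse_version(installed) >= parse_version(required)


print(is_at_least("2.1.86 (Claude Code)"))  # version reported in the first issue
print(is_at_least("v2.1.91"))
```

Comparing tuples of integers avoids the usual pitfall of comparing version strings character by character.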

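Meanwhile, after days of investigation, Anthropic has finally given its latest "response".

The official response on r/ClaudeAI on Reddit is here.

And Lydia Hallie's nearly identical X/Twitter post, with a translation: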

[image]

[image]

[image]

[image]

Unsurprisingly, the community is extremely satisfied with this ~~gaslighting~~ response:

https://reddit.com/r/claude/comments/1satc4f/the_biggest_gaslighting_in_ai_history_anthropic/

[image]

That's all I have to say. I can only leave these six words to express how I feel:

。

[1] Original post

Replies:

--【1】--:

。

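--【2】--:

It invalidates the cache, the cache drops to 0, and on top of that a resumed session carries a large context.

--【3】--:

Strange. I normally follow this issue closely, yet I've never come across any of the posts you mention.

--【4】--: paguro:

Avoid resuming sessions, including --continue, --continue --dangerously-skip-permissions, and /resume; resuming drives cache_read -> 0 and triggers cache_creation, so consumption takes off.

Why can't these be used?

--【5】--:

Actually, I also first noticed this problem in other communities.

Maybe it's because many folks here mainly use relay services rather than an official Claude subscription, so posts like these get less visibility and exposure?

After all, Claude's risk control for subscriptions from China is notoriously hard to deal with, especially for the Max plan.

--【6】--:

So why is cursor burning through quota so fast too?

--【7】--:

RNM, give me my money back!!!

--【8】--:

Gone after three sentences.

Tags: artificial intelligence, software development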

Yes, this worked in a previous version

### Last Working Version

Based on reports: 2.1.67

### Claude Code Version

2.1.86 (Claude Code)

### Platform

Anthropic API

### Operating System

Ubuntu/Debian Linux

### Terminal/Shell

Other

### Additional Information

Similar issue: https://github.com/anthropics/claude-code/issues/34629 - this one relates to immediate start of conversation

Tool I wrote for debugging: https://gitlab.com/treetank/cc-diag

Verification script: https://gitlab.com/treetank/cc-diag/-/raw/c126a7890f2ee12f76d91bfb1cc92612ae95284e/test_cache.py
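To make the log in this issue easier to read, here is a minimal sketch that parses rows of that shape and flags the turn where the cache breaks. It assumes the columns are time, cache read, cache creation, input tokens, output tokens, model and stop reason; that reading matches the issue's own annotations but is not confirmed by a header in this excerpt. The names `Turn`, `parse_turns` and `find_cache_busts` are made up for illustration and are not part of cc-diag or Claude Code.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    time: str
    cache_read: int
    cache_creation: int
    input_tokens: int
    output_tokens: int
    model: str
    stop_reason: str


def parse_turns(log: str) -> list[Turn]:
    """Parse whitespace-separated rows; the column order is an assumption."""
    turns = []
    for line in log.strip().splitlines():
        # Drop the trailing "<-- ..." annotation, if any.
        fields = line.split("<--")[0].split()
        if len(fields) != 7 or not all(f.isdigit() for f in fields[1:5]):
            continue  # skip separators and anything that is not a data row
        time, read, creation, inp, out, model, stop = fields
        turns.append(Turn(time, int(read), int(creation), int(inp), int(out), model, stop))
    return turns


def find_cache_busts(turns: list[Turn], drop_ratio: float = 0.5) -> list[Turn]:
    """Flag turns where cache_read collapses relative to the previous turn
    while cache_creation exceeds cache_read (the prefix is being rewritten)."""
    busts = []
    prev = None
    for turn in turns:
        if prev is not None and prev.cache_read > 0:
            collapsed = turn.cache_read < prev.cache_read * drop_ratio
            if collapsed and turn.cache_creation > turn.cache_read:
                busts.append(turn)
        prev = turn
    return busts


# Four rows copied from the issue's log.
SAMPLE = """
08:22:25 216204 17092 1 208 opus-4-6 tool_use
08:22:51 216204 17894 1 660 opus-4-6 tool_use
08:23:22 11428 224502 1 315 opus-4-6 end_turn <-- cache cannot get regenerated, reverting to full cache write
08:24:47 11428 224953 3 871 opus-4-6 tool_use
"""

turns = parse_turns(SAMPLE)
flagged = find_cache_busts(turns)
for turn in flagged:
    print(turn.time, turn.cache_read, turn.cache_creation)
```

The detector only flags the onset of the break (the 08:23:22 row here). The rows after it are just as expensive, since cache read stays stuck at 11428 while the whole conversation keeps being rewritten.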

github.com/anthropics/claude-code

[BUG] Prompt cache regression in --print --resume since v2.1.69(?): cache_read never grows, ~20x cost increase

Opened 12:42PM - 15 Mar 26 UTC · Closed 01:26AM - 01 Apr 26 UTC · cinniezra · bug, has repro, platform:linux, area:cost, regression

A few related issues that the Reddit post did not mention:

github.com/anthropics/claude-code

[BUG] Client-side rate limiter blocks requests with zero API calls when conversation transcript is large (~74MB) — false rate_limit error with synthetic model and 0 input/output tokens

Opened 01:21PM - 29 Mar 26 UTC · rwp65 · bug, has repro, area:core

github.com/anthropics/claude-code

[BUG] Silent context degradation — tool results cleared without notification on 1M context sessions this issue documents three separate mechanisms (microcompact, cached microcompact, session memory compact)

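Opened 11:50AM - 02 Apr 26 UTC · Sn3th · bug, has repro, platform:linux, area:core

That same day, Lydia Hallie, an Anthropic employee, acknowledged on X/Twitter that usage limits were being consumed too quickly and said an investigation was under way: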

https://x.com/lydiahallie/status/2038686571676008625

Anthropic also posted a similar official statement on the r/Anthropic subreddit:

https://www.reddit.com/r/Anthropic/comments/1s7zfap/investigating_usage_limits_hitting_faster_than/

After several days of the story spreading and a flood of user complaints on social media, the BBC picked it up and ran a report yesterday:

bbc.com

Claude Code users hitting usage limits 'way faster than expected'

Anthropic, the company behind the AI coding assistant, said it was fixing a problem blocking users.

That is roughly the timeline.

But knowing where the problem lies is not enough. As users, what can we actually do about it right now?

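First, Anthropic released v2.1.91 yesterday, which partially fixes #40524 and #34629. So step one is to upgrade to v2.1.91 as soon as possible (you can pin the version).
~~Uninstall the officially recommended standalone binary (the Bun-runtime ELF) and install the NPM package instead, so that sentinel replacement does not pollute the cache prefix and send consumption through the roof.~~
Start a new session regularly.
Avoid resuming sessions, including --continue, --continue --dangerously-skip-permissions, and /resume. Resuming drives cache_read to 0 and forces a full cache_creation, which sends consumption through the roof.
Avoid /dream and /insights: their background API calls also send consumption through the roof.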
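To put a rough number on why a resume that resets cache_read is so expensive, here is a back-of-the-envelope sketch. It uses two rows from jmarianski's log (one just before and one just after the cache broke) and weights them with the API's prompt caching multipliers as I understand them: cache reads at 0.1x the base input price and 1-hour cache writes at 2x. Both multipliers and the assumption that subscription quota tracks API pricing are mine; Anthropic has not published how Pro/Max quota is metered. The function name `input_cost_units` is made up for illustration.

```python
# Assumed multipliers relative to the base input-token price (API pricing,
# used here only as a proxy for how subscription quota might be weighted).
CACHE_READ_MULT = 0.1       # reading an already-cached prefix
CACHE_WRITE_MULT_1H = 2.0   # writing the cache with a 1-hour TTL
BASE_INPUT_MULT = 1.0       # uncached input tokens


def input_cost_units(cache_read: int, cache_creation: int, input_tokens: int,
                     write_mult: float = CACHE_WRITE_MULT_1H) -> float:
    """Input-side cost of one request, in units of 'base input tokens'."""
    return (cache_read * CACHE_READ_MULT
            + cache_creation * write_mult
            + input_tokens * BASE_INPUT_MULT)


# 08:22:51 row: cache still working (216204 read, 17894 written).
healthy = input_cost_units(cache_read=216204, cache_creation=17894, input_tokens=1)
# 08:23:22 row: cache broken (11428 read, 224502 rewritten).
broken = input_cost_units(cache_read=11428, cache_creation=224502, input_tokens=1)

print(f"healthy turn: {healthy:,.0f} units")
print(f"broken turn:  {broken:,.0f} units")
print(f"ratio: {broken / healthy:.1f}x")
```

Under these assumptions the broken turn costs several times as much as the healthy one on the input side, and that penalty is paid again on every subsequent turn for as long as the cache fails to rebuild. With the 5-minute write multiplier (1.25x) the gap shrinks but does not disappear.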

And pray that A/ does the decent thing: fix the bug promptly, reset everyone's quota, and bring consumption back down.

Similar posts on this site:

Three days ago another member also posted a summary of skibidi-toaleta-2137's Reddit post:

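The cache invalidation problem after /resume in Claude Code · Dev & Tuning

Saw this and am reposting it here. The original was written by Claude, so I won't paste it. https://www.reddit.com/r/ClaudeAI/comments/1s7mkn3/psa_claude_code_has_two_cache_bugs_that_can/ Verification script: https://gitlab.com/treetank/cc-diag/-/raw/c126a7890f2ee12f76d91bfb…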

Two days ago another member proposed similar countermeasures:

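Some optimization suggestions for Claude Code quota draining too fast · Dev & Tuning

My own Claude Max quota has been draining extremely fast lately. From posts I've read online, it may be related to the cache invalidation bug, and part of it is probably the extra context load since Claude Opus moved to a 1M context window. Right now the 5-hour quota feels exhausted in under 2 hours (max 5x). I went a few rounds with Claude Code, had it research the cause online, and summarized the following optimization suggestions (typed entirely by hand): 1) Try to avoid directly using /re…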

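Sources:

Some of the information above comes from the claude-code-cache-analysis report compiled by ArkNill.

If you are interested in the root cause analysis and benchmarks, you can read the report here: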

github.com

GitHub - ArkNill/claude-code-cache-analysis: Measured analysis of Claude Code cache bugs...

Measured analysis of Claude Code cache bugs causing 10-20x token inflation on Max plans

Hope this helps, folks~

– Update, morning of 04/03 –

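After finishing this post, I found that v2.1.91 had already been released. With the latest version you can go back to using the standalone binary that Anthropic recommends; installing via NPM is no longer necessary.

Meanwhile, after several days of investigation, Anthropic has finally given its latest "response".

The official response on Reddit's r/ClaudeAI is here.

And here is Lydia Hallie's nearly identical X/Twitter post, with a translation: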

[image]

[image]

[image]

[image]

Unsurprisingly, the community is extremely satisfied with this ~~gaslighting~~ response:

https://reddit.com/r/claude/comments/1satc4f/the_biggest_gaslighting_in_ai_history_anthropic/

[image]

That is all I have to say; I can only leave this six-character mantra as a token of my feelings:

。

Original post ↩︎

Replies:

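--【1】--:

。

--【2】--:

It invalidates the cache and resets it to zero, and a resumed session carries a large context on top of that.

--【3】--:

Strange. I normally follow this topic closely, but I never came across the posts you mentioned.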

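--【4】--: paguro:

Avoid resuming sessions, including --continue, --continue --dangerously-skip-permissions, and /resume. Resuming drives cache_read to 0 and forces a full cache_creation, which sends consumption through the roof

Why can't these be used?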

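--【5】--:

Actually, I first noticed this problem in other communities too.

Maybe it's because many members here mainly use relay services rather than Claude's official subscription, so posts like these get less visibility?

After all, Claude's risk controls on subscriptions from China are notoriously hard to get past, especially for the Max plan.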

--【6】--:

So why is Cursor burning through quota so fast too?

--【7】--:

RNM, give us our money back!!!

--【8】--:

Gone in three sentences.

Tags: Artificial Intelligence, Software Development
