Peter Lodri PRO

PeetPedro

https://protocol.vaked.dev

hey, I'm doing some experimenting, looping around :slight_smile:
---
**kompress-v6** *shipped* — trained on Claude Code agent patterns (bash output, file reads, stack traces, search results, JSON tool responses). 3k synthetic pairs + 2k existing, fine-tuned from v4, $0.20 on vast.ai.

Results:
- heretic exact_pct: 0.962 (v4: 0.967)
- keep_rate: 0.854 (v4: 0.823)
- override delta: 0

The model got more conservative: keep_rate is higher on structured technical content.

Real proxy, same session:
- v4 compressed 9.5%
- v6 compressed 4.2%

v6 is less aggressive and drops fewer must-keep tokens on paths and identifiers.

Interesting failure: self-labeling with v4+override collapsed mk_in_ref to 0.652.
`TokenExpiredError` splits into Token + Expired + Error, and none of those subtokens matches the must-keep regex on its own, so the force-keep never fires. As a result, generator references (mk_in_ref=1.0 by construction) ended up being better labels than v4's compressed output for agent data.
Fix for next run: slide a 2-3 subtoken window over the sequence instead of checking individual subtokens. That would let self-labeling work on agent content and potentially produce a more compression-aggressive v7.
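
Rough sketch of the window idea, not the actual kompress code. The `MUST_KEEP` pattern here is a made-up stand-in (CamelCase identifiers with two or more humps), the function names are hypothetical, and it assumes subtokens are plain strings with no tokenizer boundary markers:

```python
import re

# Hypothetical must-keep pattern (an assumption, not the real kompress regex):
# CamelCase identifiers made of at least two capitalized humps.
MUST_KEEP = re.compile(r"[A-Z][a-z0-9]+(?:[A-Z][a-z0-9]+)+")


def force_keep_per_subtoken(subtokens):
    """Current behaviour as described: test each subtoken on its own."""
    return [bool(MUST_KEEP.fullmatch(tok)) for tok in subtokens]


def force_keep_windowed(subtokens, max_window=3):
    """Proposed fix: test the joined text of every 1..max_window subtoken window."""
    keep = [False] * len(subtokens)
    for size in range(1, max_window + 1):
        for start in range(len(subtokens) - size + 1):
            joined = "".join(subtokens[start:start + size])
            if MUST_KEEP.fullmatch(joined):
                # Mark every subtoken inside a matching window as must-keep.
                for i in range(start, start + size):
                    keep[i] = True
    return keep


subtokens = ["raise", "Token", "Expired", "Error"]
print(force_keep_per_subtoken(subtokens))  # [False, False, False, False]
print(force_keep_windowed(subtokens))      # [False, True, True, True]
```

One thing to watch in a real version: a window can straddle two separate words, so it probably needs to respect the tokenizer's word-boundary information before joining.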

Models on HF:
- PeetPedro/kompress-v6
- PeetPedro/kompress-v4
- PeetPedro/kompress-v3

Write-up: https://pocoo.vaked.dev/posts/2026-06-25-kompress-v6-agent-distribution