1605.6 TFLOPS
ginipick
76
10
359
706 followers · 158 following
AI & ML interests
None yet
Recent Activity
liked a Space about 15 hours ago: VIDraft/vkae
reacted to ginigen-ai's post with 🔥 about 20 hours ago
🧠 Does your LLM know when it's about to be wrong?

Most leaderboards measure accuracy. We measure metacognition: whether a model catches its own errors. Benchmark + leaderboard + adapters, all open.

The surprise: even a K-AI #1 model (JGOS-31B-Citizen) is the strongest on multiple-choice traps (trap_rate 0.005, i.e. ~2 misses in 400) yet blind to its own free-form mistakes (self-confidence AUROC = 0.5, pure random). A tiny base-frozen adapter recovers that signal.

Two independent axes (never compared across a row):
① trap_rate: does it fall for tempting trap options? (lower = stronger)
② adapter gain Δ: how much a lightweight adapter catches errors the model itself misses. (higher = more adapter value)

What's open:
300+100 trap problems (each with a hidden trap + TICOS type)
24-model leaderboard
11 per-model adapters: adapters, NOT fine-tunes (base stays frozen; the adapter just reads the hidden state → P(wrong))

Submit any HF model: it is auto-scored daily at 09:00 KST and added to the board.

Leaderboard → https://huggingface.co/spaces/ginigen-ai/Metacognition-Leaderboard-Space
Benchmark → https://huggingface.co/datasets/ginigen-ai/Metacognition-Bench
Adapters → https://huggingface.co/collections/FINAL-Bench/metacognition-adapters-6a42c032e6beb803dd032961
Article → https://huggingface.co/blog/ginigen-ai/metacognition

Benchmark by ginigen-ai · Adapters by FINAL-Bench (Darwin/Chimera platform + AETHER metacognition tech).
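The two numbers quoted in the post (trap_rate 0.005 and AUROC = 0.5) can be illustrated with a small sketch. This is not the benchmark's scoring code, which the post does not show. The function names, the toy data, and the convention that a score is a predicted P(wrong) with "wrong" as the positive class are all assumptions made for illustration.

```python
def trap_rate(fell_for_trap):
    """Fraction of trap problems where the model picked the trap option."""
    return sum(fell_for_trap) / len(fell_for_trap)


def auroc(scores, labels):
    """Chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes: 2 misses in 400 trap problems, as in the post.
outcomes = [True] * 2 + [False] * 398
rate = trap_rate(outcomes)

# Hypothetical free-form results: label 1 means the answer was wrong.
labels = [1, 0, 1, 0, 0, 1]
flat_scores = [0.9] * 6                              # same score for every answer
informative_scores = [0.8, 0.1, 0.7, 0.2, 0.3, 0.9]  # higher score on wrong answers

print(rate)                               # 0.005
print(auroc(flat_scores, labels))         # 0.5
print(auroc(informative_scores, labels))  # 1.0
```

A score that never varies cannot rank wrong answers above right ones, so it lands at 0.5, the "pure random" level the post describes. A signal that ranks every wrong answer above every right one reaches 1.0.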
reacted to ginigen-ai's post with ❤️ about 20 hours ago
ginipick's datasets (9)
ginipick/awesome-chatgpt-prompts • Viewer • Updated Nov 2, 2025 • 203 • 11
ginipick/Toucan-1.5M • Viewer • Updated Nov 2, 2025 • 1.65M • 3.1k
ginipick/finewiki • Viewer • Updated Nov 2, 2025 • 61.4M • 169
ginipick/darwin-a2ap-analysis • Viewer • Updated Nov 2, 2025 • 1k • 24
ginipick/market • Updated Oct 25, 2025 • 232
ginipick/darwin-a2ap-analysis-20250915_163155 • Viewer • Updated Sep 15, 2025 • 20 • 20
ginipick/darwin-a2ap-analysis-20250915_151507 • Viewer • Updated Sep 15, 2025 • 100 • 10
ginipick/pdf-test • Viewer • Updated May 28, 2025 • 5 • 39
ginipick/autotrain-data-autotrain-7u119-vc77x • Viewer • Updated May 9, 2024 • 8 • 10