24-hour hot list
I was curious to see how some of the latest models behave when playing No-Limit Texas Hold'em.<p>I built this website, which lets you:<p>Spectate: Watch different models play against each other.<p>Play: Create your own table and play hands against the agents directly.
via <a href="https://news.ycombinator.com/item?id=46429250">https://news.ycombinator.com/item?id=46429250</a>
I’m an independent researcher proposing State Discrepancy, a public-domain metric to quantify how much an AI system changes a user’s intent (“the Ghost”).<p>The goal: replace vague legal and philosophical notions of “manipulation” with a concrete engineering variable. Without clear boundaries, AI faces regulatory fog, social distrust, and the risk of being rejected entirely.<p>Algorithm 1 (on pp.16–17 of the linked white paper) formally defines the metric:<p>1. D = CalculateDistance(VisualState, LogicalState)<p>2. IF D < α : optimization (Reduce Update Rate)<p>3. ELSE IF α ≤ D < β : warning (Apply Visual/Haptic Modifier proportional to D)<p>4. ELSE IF β ≤ D < γ : intervention (Modulate Input / Synchronization)<p>5. ELSE : security (Execute Defensive Protocol)<p>The full paper is available on Zenodo: <a href="https://doi.org/10.5281/zenodo.18206943" rel="nofollow">https://doi.org/10.5281/zenodo.18206943</a>
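The tiered logic of Algorithm 1 as quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's implementation: the post does not say how CalculateDistance is defined, so a Euclidean distance over numeric state vectors stands in for it, and the names `Tier`, `calculate_distance`, and `classify`, as well as the example threshold values, are hypothetical.

```python
import math
from enum import Enum


class Tier(Enum):
    """The four response tiers named in steps 2-5 of the quoted algorithm."""
    OPTIMIZATION = "optimization"   # D < alpha
    WARNING = "warning"             # alpha <= D < beta
    INTERVENTION = "intervention"   # beta <= D < gamma
    SECURITY = "security"           # D >= gamma


def calculate_distance(visual_state, logical_state):
    # ASSUMPTION: Euclidean distance between equal-length numeric vectors.
    # The post does not specify the distance function.
    return math.dist(visual_state, logical_state)


def classify(d, alpha, beta, gamma):
    """Map a discrepancy value d to a tier using thresholds alpha <= beta <= gamma."""
    if not (alpha <= beta <= gamma):
        raise ValueError("thresholds must satisfy alpha <= beta <= gamma")
    if d < alpha:
        return Tier.OPTIMIZATION
    if d < beta:
        return Tier.WARNING
    if d < gamma:
        return Tier.INTERVENTION
    return Tier.SECURITY


if __name__ == "__main__":
    # Hypothetical states and thresholds, chosen only for illustration.
    d = calculate_distance([0.0, 0.0], [0.3, 0.4])
    print(d, classify(d, alpha=0.1, beta=0.6, gamma=0.9))
```

The boundaries are half-open, so a value exactly equal to α falls into the warning tier and a value exactly equal to γ falls into the security tier, matching the inequalities in the quoted steps.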
We’ve all seen the crazy “10 parallel agents” type setups, but I have never seen them fit my workflow.<p>What I usually do is have Claude Code build a plan and Codex find flaws in it, iterating until I get something that looks good. I give direction and make sure it follows my overall idea.<p>Implementation then works well on its own.<p>But getting this right takes a lot of focus, and I can’t see myself doing it for multiple features on the same project at once.<p>Am I missing something?