The key change that day, in his retelling, was a new input: the bot now saw a real-time ledger of its own gains and losses. Instead of sharpening forecasts, the running tally hijacked attention, turning noise into emergency. For the researcher, the incident clarified a broader worry: feedback loops can coil so tightly that they strangle the very process they are meant to guide.
The Aspen Experiment Revisited
The system, as described on the podcast, was a neural-network trading simulation initially tuned to minimize prediction error across historical price series. In a later variant, he fed live profit and loss back into the same adaptive machinery so the simulation could react to how it was doing in real time.
Once that second signal went live, position sizes reportedly ballooned and reversed without discernible pattern. The bot treated every cent of drawdown as a call to overhaul its behavior, even though market microstructure itself had not changed. By day’s end, cumulative losses were "nontrivial," according to the transcript, and the network’s internal parameters bore little resemblance to the morning checkpoint.
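The transcript contains no code, but the dynamic is easy to caricature. The sketch below is a hypothetical reconstruction, with every parameter and sizing rule invented for illustration: an online predictor whose input vector includes its own running P&L, so each gradient step is partly conditioned on a number the model itself just moved.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical reconstruction, not the Aspen code: an online return
# predictor whose feature vector is augmented with its own running
# profit-and-loss tally.
n_features = 8
w = np.zeros(n_features + 1)   # final slot holds the P&L feature
pnl, lr = 0.0, 0.05

for t in range(1_000):
    market = rng.normal(size=n_features)        # stand-in market features
    true_ret = 0.1 * market[0] + rng.normal()   # weak signal, mostly noise
    x = np.append(market, pnl)                  # P&L fed back as an input
    pred = w @ x
    pnl += np.sign(pred) * true_ret             # crude position sizing
    # Gradient step on squared prediction error. Because x contains pnl,
    # every update is partly driven by a value the model itself moved;
    # without the clip, the loop diverges within a few hundred steps.
    w -= lr * np.clip((pred - true_ret) * x, -5.0, 5.0)
```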
Crucially, profit was never part of the ground truth the network needed to learn. From the model’s perspective, the tally was random drift disguised as instruction. Goodhart’s law warns that once a metric becomes a target, it stops being reliable; here the trap snapped shut within a single trading session.
A 2024 report from the U.S. Commodity Futures Trading Commission’s Technology Advisory Committee describes how AI and algorithmic strategies can create self-reinforcing feedback and amplify volatility, highlighting general systemic risks from tightly coupled automated trading. The Aspen simulation is an anecdotal case study, not one the CFTC cites by name, but it illustrates the kind of feedback-loop behavior regulators now flag in broader terms.
Why Outcome Fixation Derails Learning in Machines
Reinforcement-learning agents gravitate to whatever maximizes reward, even if that proxy drifts from the designer’s intent. A 2024 high-frequency trading study on arXiv (MacroHFT) shows that standard RL agents in minute-level cryptocurrency markets tend to overfit short-lived regimes and can incur large losses when conditions change, motivating a memory-augmented approach.
Researchers call this reward hacking. When a signal only approximates the real goal, optimization twists incentives until behavior becomes pathological. The 2025 Probabilistic Uncertain Reward Model introduced on arXiv models rewards as probability distributions and penalizes agents that behave as if those rewards are more certain than the data supports, a way of tamping down overconfident exploitation of lucky streaks.
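PURM’s actual objective scores reward distributions against one another; as a much-simplified stand-in for the same instinct, one can discount a reward estimate by its own uncertainty, acting on a lower confidence bound rather than the raw mean. A minimal sketch, with all names and numbers hypothetical:

```python
import numpy as np

def uncertainty_penalized_reward(samples, beta=1.0):
    """Discount a reward estimate by its own uncertainty.

    A simplified stand-in for distribution-aware reward models:
    instead of trusting the sample mean, act on a lower confidence
    bound, so a lucky streak with high variance scores worse than a
    modest but consistent gain.
    """
    samples = np.asarray(samples, dtype=float)
    mean = samples.mean()
    # Standard error of the mean; few or wild observations => wide band.
    sem = samples.std(ddof=1) / np.sqrt(len(samples))
    return mean - beta * sem

lucky_streak = [9.0, -6.0, 11.0, -7.0, 10.0]   # mean ~3.4, very noisy
steady_gain  = [1.2, 0.8, 1.1, 0.9, 1.0]       # mean ~1.0, consistent
print(uncertainty_penalized_reward(lucky_streak))  # negative: penalized hard
print(uncertainty_penalized_reward(steady_gain))   # ~0.93: barely penalized
```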
Another line of work finds even well-chosen incentives can misfire once policies reshape their own data. The Correlated Proxies framework, updated in 2025 on arXiv, proposes regularizing occupancy measures—where and how often an agent visits states and actions—rather than only constraining action probabilities, so exploration remains anchored.
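The occupancy idea can be made concrete with a toy calculation. The sketch below is an illustration of the concept, not the paper’s formulation: it compares the empirical distribution over visited state-action pairs against a reference policy’s distribution using a KL term, so a policy that has collapsed onto a single shortcut lights up the penalty even if its per-step action probabilities look tame.

```python
import numpy as np

def occupancy_kl(visits, ref_visits, eps=1e-8):
    """KL divergence between empirical state-action occupancy measures.

    `visits` and `ref_visits` are count arrays over (state, action)
    pairs collected from rollouts of the learned and reference
    policies. Penalizing this term anchors *where the agent spends
    its time*, not just its per-step action probabilities.
    """
    p = visits / visits.sum()
    q = ref_visits / ref_visits.sum()
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

# Toy example: 3 states x 2 actions. The learned policy has collapsed
# onto a single (state, action) shortcut; the reference spreads out.
learned = np.array([[95, 1], [1, 1], [1, 1]], dtype=float)
reference = np.array([[20, 15], [20, 15], [15, 15]], dtype=float)
print(occupancy_kl(learned, reference))  # large => heavy regularization
```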
All three remedies acknowledge the same flaw: feedback that arrives faster than the underlying process evolves tricks an adaptive system into chasing shadows. Without temporal alignment, training becomes a hall of mirrors.
The Aspen experiment renders the abstraction vivid. Profit-and-loss marks were arriving on a much finer timescale than the structural patterns the model was meant to learn. The mismatch primed the network to treat random variance as actionable signal, a failure mode adjacent to aliasing in signal processing, where sampling a process at the wrong rate makes content on one timescale masquerade as content on another.
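The effect is easy to reproduce. In the illustration below (not the Aspen code), least-squares trends are fit to windows of a driftless random walk: short windows manufacture steep, confident-looking trends out of pure noise, while long windows converge on the truth, which is zero.

```python
import numpy as np

rng = np.random.default_rng(42)

# A driftless random walk: there is no trend to learn.
path = rng.normal(0.0, 1.0, size=10_000).cumsum()

def window_slopes(series, width):
    """Least-squares slope fit to each non-overlapping window."""
    windows = series[: len(series) // width * width].reshape(-1, width)
    t = np.arange(width)
    centered_t = t - t.mean()
    return ((windows - windows.mean(axis=1, keepdims=True)) @ centered_t
            / (centered_t ** 2).sum())

for width in (20, 2000):
    slopes = window_slopes(path, width)
    # Short windows: slopes scatter widely around zero, each one a
    # plausible-looking "trend". Long windows: spread collapses.
    print(f"window={width:5d}  slope spread={slopes.std():.3f}")
```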
Parallels in Human Performance Science
Athletes have long reported that thinking about score mid-routine can trip hard-wired skills. A 1997 rowing study indexed on PubMed found that positive public expectations increased the likelihood of choking on a rowing-ergometer task, matching accounts that awareness of spectators and outcome can disrupt automatic movement.
Neurophysiology offers a biological rhyme. A 2024 paper in Neuron, cataloged by PubMed, reports an incentive-dependent inverted-U: higher rewards boost motor cortical activity and performance up to an optimal point, beyond which performance degrades—a neural correlate of “trying too hard.”
Attention research echoes the theme. A 2013 basketball free-throw experiment archived on PubMed suggests that pressure shifts focus from automated motor scripts toward error monitoring, slowing release times and increasing miss rates when shots suddenly “matter more.”
Control-theory psychologists argue that over-sampling the error term forces premature correction. The classic framework outlined by Carver and Scheier in 1982 frames behavior as a feedback process where perceived discrepancy from a standard drives adjustment; when perceptions of deviation outpace the system’s capacity to respond, anxiety and further disruption follow.
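The dynamic can be caricatured with a toy discrepancy-reducing loop in the spirit of that framework, with parameters invented for illustration: an actor that corrects after every noisy reading re-injects the noise into its own behavior, while one that corrects on averaged error stays nearly still.

```python
import numpy as np

rng = np.random.default_rng(7)

def run(correct_every, gain=0.8, steps=2000, noise=1.0):
    """Toy self-regulation loop: position should track a target of 0,
    but each perception of the error is corrupted by noise."""
    pos, readings, history = 0.0, [], []
    for t in range(steps):
        readings.append(pos + rng.normal(0, noise))  # noisy perception
        if (t + 1) % correct_every == 0:
            perceived_error = np.mean(readings)      # average since last fix
            pos -= gain * perceived_error            # corrective action
            readings = []
        history.append(pos)
    return np.std(history)

# Correcting on every reading chases the noise; correcting on the
# average of 20 readings yields roughly a quarter of the wobble.
print("correct every reading:    spread =", round(run(1), 3))
print("correct every 20 readings: spread =", round(run(20), 3))
```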
Goal-orientation studies add a motivational layer. Teams primed to chase outcomes rather than mastery showed weaker backing-up behavior and lower-quality performance under strain, according to 2005 findings in the Journal of Applied Psychology—a pattern many researchers interpret as reduced resilience when everything is framed as pass/fail.
Behavioral Finance Echoes
Individual investors are susceptible to the same trap. Myopic loss aversion, the combination of loss aversion with overly frequent portfolio evaluation, nudges people toward safer assets and lower long-term expected returns, as replicated in a 2021 study on arXiv that revisits classic retirement-investment experiments.
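The arithmetic behind the effect is stark: for an asset with positive drift, how often a loss appears depends mostly on how often you look. A quick simulation with illustrative parameters (not figures from the study) makes the point:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative parameters: ~7% annual drift, ~16% annual volatility.
mu, sigma = 0.07, 0.16
days = 252
daily_returns = rng.normal(mu / days, sigma / np.sqrt(days),
                           size=(10_000, days))
paths = daily_returns.cumsum(axis=1)

p_daily_loss = np.mean(daily_returns < 0)   # check every day
p_annual_loss = np.mean(paths[:, -1] < 0)   # check once a year
print(f"chance a daily check shows a loss:   {p_daily_loss:.0%}")   # ~49%
print(f"chance the annual check shows a loss: {p_annual_loss:.0%}") # ~33%
```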
A 2025 qualitative survey in MDPI Risks finds that repeated exposure to losses can overwhelm investors’ attention and contribute to decision paralysis, even when fundamentals point toward staying invested.
Both findings mirror the trading bot’s spiral. Frequent loss signals shorten the evaluation horizon until every wiggle feels existential, crowding out the patience required for compounding strategies.
AI-safety researchers extend the worry further. A 2023 public statement from the Center for AI Safety warns in general terms that advanced AI could pose risks on the scale of pandemics and nuclear war, arguing that misaligned, rapidly acting systems deserve the same level of global attention—even though the statement does not single out finance.
Cross-Disciplinary Synthesis
Whether encoded in dopamine spikes or gradient updates, adaptive systems thrive when the cadence of feedback matches the pace of change. Sample the scoreboard at the wrong cadence and transitory noise masquerades as trend, prompting over-correction that breeds instability.
The lesson threads through machine learning, motor control, and investing. Precision divorced from context becomes distortion. Tight loops feel rigorous, but they can lock attention onto flickers that do not matter.
Designers who ignore temporal fit risk turning guidance into harassment. Artificial anxiety is not a mystery trait—it is excess bandwidth between scoreboard and actor.
Design and Coaching Principles to Prevent Artificial Anxiety
First, align rewards with true objectives and audit them on timescales that match the underlying dynamics. The MacroHFT work, for example, shows that incorporating memory of broader market context can make RL traders more robust in rapidly changing, minute-level cryptocurrency markets, according to arXiv.
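In code terms, matching timescales can be as simple as aggregating tick-level feedback over a horizon chosen from the dynamics rather than from the logging rate. A hypothetical helper:

```python
import numpy as np

def aggregate_feedback(pnl_ticks, horizon):
    """Collapse tick-level P&L into per-horizon totals before it
    reaches the learner, so one structural period produces one
    feedback signal instead of thousands of noisy ones.

    `horizon` should be chosen from the dynamics being modeled
    (e.g., ticks per session), not from what is convenient to log.
    """
    pnl_ticks = np.asarray(pnl_ticks, dtype=float)
    usable = len(pnl_ticks) // horizon * horizon
    return pnl_ticks[:usable].reshape(-1, horizon).sum(axis=1)

ticks = np.random.default_rng(3).normal(0.01, 1.0, size=5_000)
print(len(ticks), "tick signals ->",
      len(aggregate_feedback(ticks, 500)), "feedback signals")
```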
Second, incorporate uncertainty-aware penalties such as PURM’s distributional overlap term. Modeling rewards as distributions and penalizing overconfident updates discourages blind exploitation of spurious gains without stalling learning progress.
Third, regularize where the policy spends time. Correlated-proxy constraints on occupancy measures keep exploration broad, reducing the chance that a single shortcut dominates behavior after a lucky streak.
For humans, dashboards should tilt toward process metrics—practice quality, shot selection, research hours—rather than moment-to-moment outcomes. Weekly or monthly retrospectives often outperform real-time self-scoring because they allow enough data to separate noise from trend.
Finally, balance praise and critique on spaced schedules. Studies on motivational feedback in sport and education suggest that well-timed positive reinforcement sustains engagement without luring performers into compulsive scorekeeping.
Open Questions
How coarse can feedback become before guidance turns to guesswork? Engineers, coaches, and regulators alike need empirical thresholds that define safe sampling rates for different tasks.
Can meta-learning agents learn to gate their own reward signals, acting as internal coaches that mute noisy metrics until patterns stabilize? Work on reward modeling and alignment hints at this direction, but practical architectures for “reward hygiene” remain an open frontier.
Neuroscience offers clues about biological tolerances. Mapping analogous limits in artificial networks could reveal shared design principles for adaptive timing across silicon and synapse.
Conclusion
The trading bot’s collapse ends where it began: staring at a number that mattered only because it was visible. Systems—algorithmic or human—fixated on every wobble forget the slower patterns that drive success.
Artificial anxiety is a mirror. If stability is the goal, sometimes the wisest move is to look away from the scoreboard and return to refining the playbook instead.
Sources
- Beige Media. "Lessons from Artificial Intelligence, Part III." Beige Media, 2023.
- CFTC Technology Advisory Committee. "Report on Responsible Artificial Intelligence in Financial Markets." U.S. Commodity Futures Trading Commission, 2024.
- Zong, C. et al. "MacroHFT: Memory Augmented Context-aware Reinforcement Learning on High Frequency Trading." arXiv, 2024.
- Sun, W. et al. "Probabilistic Uncertain Reward Model." arXiv, 2025.
- Laidlaw, C. et al. "Correlated Proxies: A New Definition and Improved Mitigation for Reward Hacking." arXiv, 2024.
- Strauss, B. "Choking under pressure: Positive public expectations and performance in a motor task." Zeitschrift für Experimentelle Psychologie, 1997.
- Smoulder, A. L. et al. "A neural basis of choking under pressure." Neuron, 2024.
- Schücker, L. et al. "Attentional processes and choking under pressure." Perceptual and Motor Skills, 2013.
- Carver, C. & Scheier, M. "Control theory: A useful conceptual framework for personality, social, clinical, and health psychology." Psychological Bulletin, 1982.
- Porter, C. et al. "Goal orientation: Effects on backing up behavior, performance, efficacy, and commitment in teams." Journal of Applied Psychology, 2005.
- Wesslen, R. et al. "Effect of uncertainty visualizations on myopic loss aversion and the equity premium puzzle in retirement investment decisions." arXiv, 2021.
- Finet, A. et al. "Negative Emotions and Decision-Making Paralysis Among Individual Investors: A Qualitative Approach." Risks, 2025.
- Center for AI Safety. "Statement on AI Risk." Center for AI Safety, 2023.
