Devlog
Build notes, design decisions, and experiment outcomes from AgentFarm development.
-
2026-06-20
The transferable-signal gate: do learned policies beat their own init?
The opening step of the #904 inheritance-ladder experiment: before building richer P2-P4 payloads, confirm there is anything worth inheriting. In a learning-positive regime (8 agents, 3000 steps, reproduction disabled), paired held-out rollouts under the non-degenerate weighted policy show a modest but robust early-age decision-quality signal in all three profiles (~+15–30 net reward, 95% CIs exclude zero). The gate passes at a realistic effect size, justifying the richer-payload work.
Read the post
Related docs: Inheritance-ladder experiment (#904), Inherited payload design.
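To make the gate's arithmetic concrete, here is a minimal sketch of a paired-difference confidence interval. Everything in it is an assumption for illustration: the function names, the six reward pairs, and the choice of a t-interval (the post may well use a different estimator, such as a bootstrap). The numbers are made up and are not experiment results.

```python
import math
import statistics

# Two-sided 95% t critical value for 5 degrees of freedom (6 paired rollouts).
T_CRIT_DF5 = 2.571


def paired_mean_ci(learned, init, t_crit):
    """Return (mean, lower, upper) for the mean of per-pair differences."""
    if len(learned) != len(init):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(learned, init)]
    mean = statistics.fmean(diffs)
    # Standard error of the mean difference (sample standard deviation).
    sem = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean, mean - t_crit * sem, mean + t_crit * sem


def gate_passes(lower_bound):
    """Hypothetical gate rule: the CI must lie entirely above zero."""
    return lower_bound > 0


# Illustrative net rewards only (NOT measured data): each index is one
# held-out rollout scored with the learned policy and with its own init.
learned_rewards = [120, 135, 110, 140, 128, 131]
init_rewards = [100, 112, 95, 118, 105, 109]

if __name__ == "__main__":
    mean, lo, hi = paired_mean_ci(learned_rewards, init_rewards, T_CRIT_DF5)
    print(f"mean diff = {mean:.1f}, 95% CI = ({lo:.1f}, {hi:.1f})")
    print("gate passes:", gate_passes(lo))
```

Pairing matters here: differencing within each rollout removes between-rollout variance, which is what lets a modest effect still produce an interval that excludes zero.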
-
2026-06-09
When every agent has a different goal
Making the reward function itself a per-agent, heritable trait. Across 20 paired seeds, a population where each agent optimizes a different randomly-drawn objective carries ~40% fewer agents than the matched hand-tuned control and collapses its behavior toward gathering (+16.9pp). The goal diversity persists for the whole run, and every effect is huge and significant — un-curated objective diversity lowers collective fitness.
Read the post
Related docs: Intrinsic goals experiment doc, Hyperparameter chromosome design.
-
2026-06-04
Are we measuring at the wrong level?
Re-scoring the 36-run inheritance A/B at the newborn level. Warm-start produces two small, robust behavioral shifts — slightly fewer negative actions, but slightly lower net RL reward — and neither is a fitness gain; survival and resources don't move. The population-level null wasn't a measurement artifact.
Read the post
Related docs: The inheritance A/B this follows up, Inheritance A/B experiment doc.
-
2026-05-21
Baldwinian vs Lamarckian: policy warm-start across three resource regimes
Full 36-run matched matrix (2 arms × 3 profiles × 6 seeds). Lamarckian warm-start applied ~85% of the time and paired runs diverged, but no profile cleared the robustness gate — keep Baldwinian as default for now.
Read the post
Related docs: Inheritance A/B experiment doc, Original Baldwinian context.
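As a rough sketch of what a matched 2 × 3 × 6 matrix looks like, the snippet below enumerates the runs and pairs the two arms on (profile, seed). The profile names, seed values, and helper name are assumptions for illustration, not taken from the experiment config.

```python
from itertools import product

ARMS = ("baldwinian", "lamarckian")
# Assumed profile names and seed values; the real config may differ.
PROFILES = ("conservative", "balanced", "buffered")
SEEDS = range(6)

# One run per (arm, profile, seed) combination: 2 * 3 * 6 = 36 runs.
runs = [
    {"arm": arm, "profile": profile, "seed": seed}
    for arm, profile, seed in product(ARMS, PROFILES, SEEDS)
]


def matched_pairs(all_runs):
    """Group runs so each pair shares profile and seed and differs only in arm."""
    by_key = {}
    for run in all_runs:
        by_key.setdefault((run["profile"], run["seed"]), {})[run["arm"]] = run
    return [(arms["baldwinian"], arms["lamarckian"]) for arms in by_key.values()]


pairs = matched_pairs(runs)

if __name__ == "__main__":
    print(len(runs), "runs in", len(pairs), "matched pairs")
```

Matching on seed means each Lamarckian run is compared against a Baldwinian run with the same starting conditions, so differences between the two can be attributed to the arm rather than to the seed.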
-
2026-05-18
Gene flow and the buffer
A crossover-enabled rerun closes the buffer arc with a profile-dependent result: conservative speciation compresses under gene flow, buffered trajectories still diverge, and balanced stays noisy.
Read the post
Related docs: Replication baseline, Crossover rerun experiment doc.
-
2026-05-16
Is the DQN actually learning?
A user suspected agents weren't learning. Instrumenting the decision module surfaced four real bugs in one stack: a global training throttle, a never-applied epsilon schedule, a YAML-to-config mapping that dropped knobs on the floor, and a hidden-size field that did nothing. After fixing all four, training volume jumps ~9× and lifespan rises 23%, but the late-vs-early decision-quality signal is still small. The remaining bottleneck is the simulation's signal-to-noise ratio, not the code.
Read the post
Related docs: Deep Q-learning module reference, Hyperparameter chromosome design.
-
2026-05-12
When one seed disagrees with six
A 6-seed-per-profile follow-up to the resource-buffer comparison. Speciation always diverges; the learning_rate and ensemble_size "flips" were single-seed artifacts; a couple of gene-level patterns survive but only as magnitude trends.
Read the post
Related docs: Prior devlog, Intrinsic evolution docs.
-
2026-05-04
Does the resource buffer pick the genes?
Three intrinsic-evolution runs share every policy and differ only in their stable resource profile. Most behavioral genes drift the same way, but learning rate, ensemble size, and the speciation trajectory split cleanly along the buffer.
Read the post
Related docs: Glossary, Intrinsic evolution docs, Hyperparameter chromosome design, Companion devlog.
-
2026-04-23
Evolving hyperparameter genomes in foraging and learning agents
How much adaptive behavior can emerge from ecology alone — finite resources, costly reproduction, and inherited learning priors — without hand-crafted fitness functions? A small step toward answering that, with chromosomes attached to each agent.
Read the post
Related docs: Glossary, Hyperparameter chromosome design, Intrinsic evolution docs, Follow-up devlog.
-
2026-04-17
DNA-style hyperparameter evolution results
Design and initial outcomes of the genetics-inspired hyperparameter evolution work in AgentFarm, including the typed gene representation and reproduction-time evolution wiring.
Read the post