AgentFarm

AgentFarm is in active development and still incomplete.

    Devlog

    Build notes, design decisions, and experiment outcomes from AgentFarm development.

    • 2026-06-20

      The transferable-signal gate: do learned policies beat their own init?

      The opening step of the #904 inheritance-ladder experiment: before building richer P2-P4 payloads, confirm there is anything worth inheriting. In a learning-positive regime (8 agents, 3000 steps, reproduction disabled), paired held-out rollouts under the non-degenerate weighted policy show a modest but robust early-age decision-quality signal in all three profiles (roughly +15 to +30 net reward; 95% CIs exclude zero). The gate passes at a realistic effect size, justifying the richer-payload work.

      Read the post

      Related docs: Inheritance-ladder experiment (#904), Inherited payload design.

    • 2026-06-09

      When every agent has a different goal

      Making the reward function itself a per-agent, heritable trait. Across 20 paired seeds, a population where each agent optimizes a different randomly drawn objective sustains ~40% fewer agents than the matched hand-tuned control and collapses its behavior toward gathering (+16.9pp). The goal diversity persists for the whole run, and every measured effect is large and statistically significant: un-curated objective diversity lowers collective fitness.

      Read the post

      Related docs: Intrinsic goals experiment doc, Hyperparameter chromosome design.

    • 2026-06-04

      Are we measuring at the wrong level?

      Re-scoring the 36-run inheritance A/B at the newborn level. Warm-start produces two small, robust behavioral shifts — slightly fewer negative actions, but slightly lower net RL reward — and neither is a fitness gain; survival and resources don't move. The population-level null wasn't a measurement artifact.

      Read the post

      Related docs: The inheritance A/B this follows up, Inheritance A/B experiment doc.

    • 2026-05-21

      Baldwinian vs Lamarckian: policy warm-start across three resource regimes

      Full 36-run matched matrix (2 arms × 3 profiles × 6 seeds). Lamarckian warm-start was applied ~85% of the time and paired runs diverged, but no profile cleared the robustness gate, so Baldwinian stays the default for now.

      Read the post

      Related docs: Inheritance A/B experiment doc, Original Baldwinian context.

    • 2026-05-18

      Gene flow and the buffer

      A crossover-enabled rerun closes the buffer arc with a profile-dependent result: conservative speciation compresses under gene flow, buffered trajectories still diverge, and balanced stays noisy.

      Read the post

      Related docs: Replication baseline, Crossover rerun experiment doc.

    • 2026-05-16

      Is the DQN actually learning?

      A user suspected agents weren't learning. Instrumenting the decision module surfaced four real bugs in one stack: a global training throttle, a never-applied epsilon schedule, a YAML-to-config mapping that dropped knobs on the floor, and a hidden-size field that did nothing. After fixing all four, training volume jumps ~9× and lifespan rises 23%, but the late-vs-early decision-quality signal is still small. The remaining bottleneck is the simulation's signal-to-noise ratio, not the code.

      Read the post

      Related docs: Deep Q-learning module reference, Hyperparameter chromosome design.

    • 2026-05-12

      When one seed disagrees with six

      A 6-seed-per-profile follow-up to the resource-buffer comparison. Speciation always diverges; the learning_rate and ensemble_size "flips" were single-seed artifacts; a couple of gene-level patterns survive but only as magnitude trends.

      Read the post

      Related docs: Prior devlog, Intrinsic evolution docs.

    • 2026-05-04

      Does the resource buffer pick the genes?

      Three intrinsic-evolution runs share every policy and differ only in their stable resource profile. Most behavioural genes drift the same way, but learning rate, ensemble size, and the speciation trajectory split cleanly along the buffer.

      Read the post

      Related docs: Glossary, Intrinsic evolution docs, Hyperparameter chromosome design, Companion devlog.

    • 2026-04-23

      Evolving hyperparameter genomes in foraging and learning agents

      How much adaptive behavior can emerge from ecology alone — finite resources, costly reproduction, and inherited learning priors — without hand-crafted fitness functions? A small step toward answering that, with chromosomes attached to each agent.

      Read the post

      Related docs: Glossary, Hyperparameter chromosome design, Intrinsic evolution docs, Follow-up devlog.

    • 2026-04-17

      DNA-style hyperparameter evolution results

      Design and initial outcomes of the genetics-inspired hyperparameter evolution work in AgentFarm, including the typed gene representation and reproduction-time evolution wiring.

      Read the post
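
    Several entries above report paired comparisons (paired held-out rollouts, paired seeds) summarized by 95% confidence intervals. As a rough illustration of what such a check involves, here is a minimal sketch of a percentile-bootstrap interval on paired differences. It is not taken from the AgentFarm codebase: the function name `paired_bootstrap_ci`, the choice of a percentile bootstrap, and the sample numbers are all assumptions made for illustration, and the posts themselves may use a different estimator.

```python
import random
import statistics


def paired_bootstrap_ci(treatment, control, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for the mean paired difference (hypothetical helper).

    `treatment[i]` and `control[i]` are assumed to come from the same seed,
    so only the per-pair differences are resampled.
    """
    if len(treatment) != len(control):
        raise ValueError("paired samples must have equal length")

    diffs = [t - c for t, c in zip(treatment, control)]
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(diffs)

    # Resample the differences with replacement and record each resample's mean.
    means = sorted(
        statistics.fmean(rng.choices(diffs, k=n)) for _ in range(n_resamples)
    )

    # Take the alpha/2 and 1 - alpha/2 percentiles of the bootstrap means.
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return statistics.fmean(diffs), lo, hi


# Illustrative numbers only, not AgentFarm results.
learned = [120.0, 135.5, 110.2, 142.0, 128.3, 131.1]
init = [100.0, 112.0, 95.5, 118.0, 109.0, 105.4]

mean_diff, ci_lo, ci_hi = paired_bootstrap_ci(learned, init)
excludes_zero = ci_lo > 0 or ci_hi < 0
print(f"mean diff = {mean_diff:.1f}, 95% CI = [{ci_lo:.1f}, {ci_hi:.1f}], "
      f"excludes zero: {excludes_zero}")
```

    Pairing by seed removes between-seed variance from the comparison, which is why a modest effect can still yield an interval that excludes zero. With only a handful of seeds the percentile bootstrap is coarse, so an interval like this is best read as a sanity check rather than a precise estimate.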
    AgentFarm

    A Python-first platform for agent-based simulation, reinforcement learning, and emergent-behavior research.

    © 2026 Dooders. Built with Jekyll & GitHub Pages.