Card
One research move.
Definition
A card is one move in the research game: a hypothesis, code change, diagnostic, ablation, rerun, or lateral idea that can gather evidence. Runs are executions. Cards are the ideas those executions test. This lets Picidae show research as a decision tree instead of a flat pile of logs.
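The card/run split and the decision tree can be sketched in a few lines of Python. This is a minimal illustration, not Picidae's actual data model: the class names `Run` and `Card`, the `branch` helper, and `render_tree` are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Run:
    """One execution (hypothetical shape): a single seed, phase, or rerun."""
    run_id: str
    score: Optional[float] = None  # None if the run produced no metric


@dataclass
class Card:
    """One research move (hypothetical shape): the idea a set of runs tests."""
    title: str
    hypothesis: str
    runs: List[Run] = field(default_factory=list)
    children: List["Card"] = field(default_factory=list)

    def branch(self, title: str, hypothesis: str) -> "Card":
        """Create a child card, so follow-up ideas form a tree."""
        child = Card(title=title, hypothesis=hypothesis)
        self.children.append(child)
        return child


def render_tree(card: Card, depth: int = 0) -> List[str]:
    """Return one indented line per card, depth-first."""
    lines = [f"{'  ' * depth}{card.title} [runs={len(card.runs)}]"]
    for child in card.children:
        lines.extend(render_tree(child, depth + 1))
    return lines
```

Runs hang off cards, and cards hang off their parents. A flat log would only have the `Run` objects; the tree comes from the `children` links.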
How It Looks
A typical card reads: try disagreement-weighted labels because MCTS-policy disagreement may indicate target quality; run three seeds; compare score, variance, and diagnostic plots; then decide whether to continue, branch, stop, or promote.
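The "run three seeds, compare score and variance, then decide" step could look roughly like the sketch below. The decision rule is an assumption chosen for illustration (the mean gap must exceed the combined standard deviations), not a rule Picidae prescribes, and `summarize_seeds` and `suggest_action` are hypothetical names.

```python
from statistics import mean, stdev
from typing import Dict, Sequence


def summarize_seeds(scores: Sequence[float]) -> Dict[str, float]:
    """Mean and sample standard deviation across seeds (needs >= 2 scores)."""
    return {"mean": mean(scores), "stdev": stdev(scores)}


def suggest_action(candidate: Sequence[float], baseline: Sequence[float]) -> str:
    """Illustrative rule only, assuming higher scores are better.

    promote  : candidate beats baseline by more than the combined spread
    stop     : candidate is worse by more than the combined spread
    continue : the difference is within noise, so gather more evidence
    """
    cand, base = summarize_seeds(candidate), summarize_seeds(baseline)
    gap = cand["mean"] - base["mean"]
    noise = cand["stdev"] + base["stdev"]
    if gap > noise:
        return "promote"
    if gap < -noise:
        return "stop"
    return "continue"
```

A real card would also weigh the diagnostic plots, which a threshold on the mean cannot capture.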
How To Use It
Use cards to preserve what the agent was trying to learn. A smart agent may run many commands, but the human needs the strategic unit: what idea was tested, what happened, and what decision it implies.
Not Just A Run
A run says something happened. A card says why it happened. One card may have many runs for seeds, ablations, phases, or reruns. A failed run can still support a useful card if it falsifies a hypothesis or reveals a boundary.
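As a sketch of how many runs can roll up into one card-level verdict, including the case where every run goes against the hypothesis: the `supports_hypothesis` field and the verdict labels below are assumptions for the example, not Picidae terminology.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Run:
    run_id: str
    purpose: str  # e.g. "seed", "ablation", "phase", "rerun"
    # True/False once the run has been judged; None if it crashed
    # or was inconclusive. (Hypothetical field.)
    supports_hypothesis: Optional[bool] = None


@dataclass
class Card:
    title: str
    hypothesis: str
    runs: List[Run] = field(default_factory=list)

    def verdict(self) -> str:
        """Collapse all judged runs into one card-level statement."""
        judged = [r.supports_hypothesis for r in self.runs
                  if r.supports_hypothesis is not None]
        if not judged:
            return "no evidence"
        if all(judged):
            return "supported"
        if not any(judged):
            return "falsified"  # still a useful card: the idea was ruled out
        return "mixed"
```

A card whose runs all contradict the hypothesis ends up `falsified`, which is a result, not a missing one.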
Phone-Sized Unit
Cards are designed for supervision from a phone. The user should be able to see title, claim, score movement, evidence quality, suspicion flags, and suggested action without opening terminal logs.
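One way to picture the phone-sized view is a renderer that prints the six items above as six short lines. This is a sketch under assumptions: `CardSummary`, `render_for_phone`, and the 40-character width are invented for the example.

```python
import textwrap
from dataclasses import dataclass, field
from typing import List


@dataclass
class CardSummary:
    """The fields a supervisor should see at a glance (hypothetical shape)."""
    title: str
    claim: str
    score_before: float
    score_after: float
    evidence_quality: str
    suspicion_flags: List[str] = field(default_factory=list)
    suggested_action: str = "investigate"


def render_for_phone(s: CardSummary, width: int = 40) -> str:
    """Render one line per field, each truncated to fit a narrow screen."""
    delta = s.score_after - s.score_before
    flags = ", ".join(s.suspicion_flags) or "none"
    lines = [
        s.title,
        s.claim,
        f"score: {s.score_before:.3f} -> {s.score_after:.3f} ({delta:+.3f})",
        f"evidence: {s.evidence_quality}",
        f"flags: {flags}",
        f"next: {s.suggested_action}",
    ]
    # textwrap.shorten truncates at a word boundary and appends the placeholder.
    return "\n".join(
        textwrap.shorten(line, width=width, placeholder="...") for line in lines
    )
```

Nothing in the output requires opening a terminal log; anything longer than the screen width is cut rather than wrapped.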
Agent Freedom
Picidae should not force the agent into a rigid workflow. The agent can invent experiments freely, but every meaningful attempt should eventually be compressed into a card so the research state is inspectable.
Examples
Autoresearch card
The agent changes the attention window pattern. The card links the diff, training runs, metric movement, VRAM cost, and whether the change should be kept.
card:
  title: switch to local-only attention
  hypothesis: shorter windows improve throughput without hurting val_bpb
  parent: baseline
  runs: [run_122, run_123, run_124]
  status: discarded
  decision_reason: worse val_bpb at similar memory
Ablation card
A high-performing idea can spawn cards for ablations. These cards check whether each component contributes a real gain or only adds complexity.
card:
  title: remove posterior tempering
  parent: em-posterior-v4
  type: ablation
  suggested_action: keep parent, reject ablation
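Spawning ablation cards from a parent can be sketched as one card per component. The `spawn_ablations` helper and the plain-dict card shape are assumptions for illustration; the keys mirror the example above.

```python
from typing import Dict, List


def spawn_ablations(parent_title: str, components: List[str]) -> List[Dict[str, str]]:
    """Return one ablation card per component of the parent idea.

    Each card removes exactly one component, so a drop in score can be
    attributed to that component alone.
    """
    return [
        {"title": f"remove {component}", "parent": parent_title, "type": "ablation"}
        for component in components
    ]
```

Calling `spawn_ablations("em-posterior-v4", ["posterior tempering"])` yields a single card titled `remove posterior tempering`. The suggested action is left out on purpose, since it can only be filled in after the ablation runs finish.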
Owns / Defines
Hypothesis, proposed change, parent card, linked runs, status, evidence summary, and next action.
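The owned fields map naturally onto a small record type. The sketch below is an assumption about shape, not Picidae's schema: `CardRecord` and its defaults are invented, and the allowed next actions are taken from the operator questions on this page.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional

# Action names as listed in the operator questions on this page.
NEXT_ACTIONS = {"continue", "branch", "stop", "rerun", "investigate", "promote"}


@dataclass
class CardRecord:
    hypothesis: str
    proposed_change: str
    parent: Optional[str] = None                   # title or id of the parent card
    runs: List[str] = field(default_factory=list)  # linked run ids
    status: str = "open"                           # hypothetical default
    evidence_summary: str = ""
    next_action: str = "investigate"               # hypothetical default

    def __post_init__(self) -> None:
        # Reject actions outside the known set so the tree stays consistent.
        if self.next_action not in NEXT_ACTIONS:
            raise ValueError(f"unknown next_action: {self.next_action!r}")


record = CardRecord(
    hypothesis="shorter windows improve throughput without hurting val_bpb",
    proposed_change="switch to local-only attention",
    parent="baseline",
    runs=["run_122", "run_123", "run_124"],
    status="discarded",
    evidence_summary="worse val_bpb at similar memory",
    next_action="stop",
)
```

`asdict(record)` gives a plain dictionary that can be serialized next to the linked runs.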
Questions Operators Should Answer
- What hypothesis or research move does this card represent?
- Which runs and artifacts support or contradict it?
- Is the card a continuation, branch, ablation, diagnostic, rerun, or promotion candidate?
- What should happen next: continue, branch, stop, rerun, investigate, or promote?
- Can the card be understood without reading raw logs?