AlignmentResearch/learned-planner

agaralon committed on Jul 15, 2024

Commit a986605 · unverified · 1 Parent(s): 9118e6e

Parameter counts and some explanation

Files changed (2)

README.md +44 -3
count_params.py +28 -0

README.md CHANGED

@@ -1,3 +1,44 @@
----
-license: apache-2.0
----

+---
+language: en
+tags:
+- machine-learning
+- reinforcement-learning
+- sokoban
+- planning
+license: apache-2.0
+---
+# Trained learned planners
+This repository contains the trained networks from the paper ["Planning behavior in a recurrent neural network that
+plays Sokoban"](https://openreview.net/forum?id=T9sB3S2hok), presented at the ICML 2024 Mechanistic Interpretability
+Workshop.
+To load and use the networks, see the [learned-planner
+repository](http://github.com/alignmentresearch/learned-planner) and, if you need it, the [training
+code](https://github.com/AlignmentResearch/train-learned-planner).
+# Model details
+**Hyperparameters:** see `model/*/cp_*/cfg.json` for the hyperparameters that were used to train a particular run.
+## Parameter counts:
+- DRC(3, 3):  1,285,125 (1.29M)
+- DRC(1, 1):  987,525 (0.99M)
+- ResNet:  3,068,421 (3.07M)
+# Citation
+If you use these neural networks, please cite our work:
+```bibtex
+@inproceedings{TODO: add your citation here,
+  title={Planning behavior in a recurrent neural network that plays Sokoban},
+  author={Your Authors},
+  booktitle={ICML 2024 Mechanistic Interpretability Workshop},
+  year={2024},
+  url={https://openreview.net/forum?id=T9sB3S2hok}
+}
+```
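
The README points to `model/*/cp_*/cfg.json` for per-run hyperparameters. As a minimal sketch of how such files could be located and read, the snippet below globs for them under one model directory. It assumes `model` stands for a top-level directory such as `drc33`, and that checkpoint directories are named `cp_*` as the README's pattern suggests. The helper names `find_configs` and `load_hyperparameters` are hypothetical and not part of the repository.

```python
import json
from pathlib import Path


def find_configs(root: Path) -> list[Path]:
    """Return every file matching <root>/*/cp_*/cfg.json, sorted for a stable order."""
    # Assumption: one directory level per run, then one per checkpoint (cp_*).
    return sorted(root.glob("*/cp_*/cfg.json"))


def load_hyperparameters(cfg_path: Path) -> dict:
    """Parse one cfg.json and return its contents as a plain dict."""
    with open(cfg_path, "r") as f:
        return json.load(f)
```

Calling `find_configs(Path("drc33"))` would then list the configuration files for that model, if the directory follows the assumed layout.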

count_params.py ADDED

@@ -0,0 +1,28 @@

+import json
+import os
+from pathlib import Path
+import farconf
+from cleanba.config import Args
+from cleanba.environments import SokobanConfig
+soko_env = SokobanConfig(
+    max_episode_steps=100, num_envs=1, dim_room=(10, 10), num_boxes=1, asynchronous=False, tinyworld_obs=True
+).make()
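+def parameter_count(root: Path) -> str:
+    """Return the parameter count of the first checkpoint found under `root`."""
+    # Sort so the chosen run and checkpoint do not depend on os.listdir order.
+    model_dir = sorted(os.listdir(root))[0]
+    cp_dir = sorted(os.listdir(root / model_dir))[0]
+    with open(root / model_dir / cp_dir / "cfg.json", "r") as f:
+        cfg = json.load(f)
+    # Rebuild the training Args from the saved config, then count the network's parameters.
+    args = farconf.from_dict(cfg["cfg"], Args)
+    num = args.net.count_params(soko_env)
+    return f"{num:,} ({num/1_000_000:.2f}M)"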
+print("- DRC(3, 3): ", parameter_count(Path("drc33")))
+print("- DRC(1, 1): ", parameter_count(Path("drc11")))
+print("- ResNet: ", parameter_count(Path("resnet")))
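
The formatting step in `parameter_count` can be checked without `cleanba` or `farconf` installed. The sketch below isolates the same f-string in a hypothetical helper, `format_param_count`, and applies it to the three raw counts listed in the README. It reproduces only the string formatting, not the counting itself.

```python
def format_param_count(num: int) -> str:
    """Format a raw count with thousands separators and a rounded value in millions."""
    return f"{num:,} ({num / 1_000_000:.2f}M)"


# Raw counts copied from the README's "Parameter counts" section.
readme_counts = {
    "DRC(3, 3)": 1_285_125,
    "DRC(1, 1)": 987_525,
    "ResNet": 3_068_421,
}

for name, count in readme_counts.items():
    print(f"- {name}: ", format_param_count(count))
```

The rounded values this produces (1.29M, 0.99M, 3.07M) agree with the ones in the README.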