milwright committed · Commit 8597f42 · 1 Parent: 978c61a

update documentation to reflect single-model architecture (gemma-3-27b)
README.md
CHANGED
@@ -25,7 +25,7 @@ But now the loop has closed in an unexpected way. BERT and its descendants—inc
-Cloze Reader uses open-weight Gemma-3
@@ -57,13 +57,13 @@ The experience raises more questions than it answers. Is this pedagogy or patter
-**
-**Chat as Scaffold:** Click the 💬 icon beside any blank to engage the
@@ -71,7 +71,7 @@ The system filters out dictionaries, technical documentation, and poetry—ensur
-**Open-Weight Models:** Uses Google's Gemma-3

## What This Game Explores

Cloze Reader uses Google's open-weight Gemma-3-27b model to transform Project Gutenberg literature into dynamically generated cloze exercises. The model scans passages, selects vocabulary to remove, generates contextual hints, and provides conversational guidance. Every passage is fresh, every blank algorithmically chosen, every hint synthesized in real time.
This isn't just automated test generation. It's an investigation into what happens when the twin histories of educational assessment and machine learning collapse into each other. Consider:
## How It Works
**Single-Model Architecture:** The system uses Google's Gemma-3-27b model for all operations—analyzing passages, selecting words to mask, generating contextual hints, and powering the chat interface. The model handles both assessment design and pedagogical guidance through the same algorithmic system.
**Progressive Levels:** The game implements a level system (1-5 with 1 blank, 6-10 with 2 blanks, 11+ with 3 blanks) that scaffolds difficulty through word length constraints, historical period selection, and hint disclosure. Early levels use 1900s texts and show first+last letters; advanced levels draw from any era and provide only first letters. Each round presents two passages from different books, requiring consistent performance across rounds before advancing.
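
The level progression above can be sketched as a pure lookup. The blank counts (1–5 → 1, 6–10 → 2, 11+ → 3) come straight from the description; the exact cutoffs for era and hint disclosure are assumptions, since the text only distinguishes "early" from "advanced" levels:

```javascript
// Minimal sketch of the level → difficulty mapping. Blank counts follow the
// README; the era/hint thresholds (early = levels 1–5) are assumptions.
function levelRules(level) {
  const blanks = level <= 5 ? 1 : level <= 10 ? 2 : 3;
  return {
    blanks,
    // Early levels stick to 1900s texts; later levels draw from any era.
    era: level <= 5 ? "1900s" : "any",
    // Early levels reveal first + last letters; advanced only the first.
    hint: level <= 5 ? "first+last" : "first-only",
  };
}
```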
**Serendipitous Selection:** Passages stream directly from Hugging Face's Project Gutenberg dataset. The model selects words based on its training rather than curricular logic—sometimes choosing obvious vocabulary, sometimes obscure terms, sometimes generating exercises that are trivially easy or frustratingly hard. This unpredictability is a feature: it reveals how algorithmic assessment differs from human-designed pedagogy.
**Chat as Scaffold:** Click the 💬 icon beside any blank to engage the model in conversation. It attempts to guide you through Socratic questioning, semantic clues, and contextual hints—replicating what a tutor might do, constrained by what a language model trained on text prediction can actually accomplish.
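
A minimal sketch of what such a per-blank tutoring request might look like, assuming an OpenAI-style messages payload; the function name, model id, and prompt wording are illustrative, not the project's actual code:

```javascript
// Hypothetical per-blank chat request. The system prompt carries the passage
// and the hidden word so the model can hint without revealing the answer.
function buildHintRequest(passage, answer, userMessage) {
  return {
    model: "google/gemma-3-27b-it", // OpenRouter-style model id (assumed)
    messages: [
      {
        role: "system",
        content:
          `You are a reading tutor. The passage is: "${passage}". ` +
          `The hidden word is "${answer}". Guide the learner with ` +
          `questions and contextual clues, but never state the word.`,
      },
      { role: "user", content: userMessage },
    ],
  };
}
```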
The system filters out dictionaries, technical documentation, and poetry—ensuring narrative prose where blanks are theoretically inferable from context, even if the model's choices sometimes suggest otherwise.
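
The README does not spell out the filtering rules, but a heuristic along these lines would do the job; the title keywords and line-length threshold below are invented for illustration only:

```javascript
// Invented heuristic sketch: reject reference works by title keyword, and
// reject verse-like texts, whose lines tend to be short, by average line length.
function looksLikeNarrativeProse(title, text) {
  if (/dictionary|glossary|encyclopedia|index|manual/i.test(title)) return false;
  const lines = text.split("\n").filter((l) => l.trim().length > 0);
  if (lines.length === 0) return false;
  const avgLineLength =
    lines.reduce((sum, l) => sum + l.length, 0) / lines.length;
  return avgLineLength > 40; // assumed threshold separating prose from verse
}
```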
**Vanilla JavaScript, No Build Step:** The application runs entirely in the browser using ES6 modules—no webpack, no bundler, no compilation. This architectural choice mirrors the project's conceptual interests: keeping the machinery visible and modifiable rather than obscured behind layers of tooling. A minimal FastAPI backend serves static files and injects API keys; everything else happens client-side.
**Open-Weight Models:** Uses Google's Gemma-3-27b model (27 billion parameters) via OpenRouter, or alternatively connects to local LLM servers (LM Studio, etc.) on port 1234 with smaller models like Gemma-3-12b. The choice of open-weight models is deliberate: these systems can be downloaded, inspected, run locally, modified. When assessment becomes algorithmic, transparency about the algorithm matters. You can examine exactly which model is generating your exercises, run the same models yourself, experiment with alternatives.
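
Assuming both backends speak the OpenAI-compatible chat-completions protocol (OpenRouter does, and LM Studio's local server on port 1234 does too), the hosted-vs-local switch might look like this; function names and defaults are illustrative:

```javascript
// Sketch of the endpoint fallback described above: hosted Gemma-3-27b via
// OpenRouter, or a smaller model on an OpenAI-compatible local server.
function chatEndpoint(useLocal) {
  return useLocal
    ? "http://localhost:1234/v1/chat/completions" // LM Studio default port
    : "https://openrouter.ai/api/v1/chat/completions";
}

async function complete(messages, { useLocal = false, apiKey = "" } = {}) {
  const res = await fetch(chatEndpoint(useLocal), {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // Local servers typically ignore the key; OpenRouter requires it.
      ...(apiKey ? { Authorization: `Bearer ${apiKey}` } : {}),
    },
    body: JSON.stringify({
      model: useLocal ? "gemma-3-12b" : "google/gemma-3-27b-it", // ids assumed
      messages,
    }),
  });
  const data = await res.json();
  return data.choices[0].message.content;
}
```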
**Streaming from Public Archives:** Book data streams directly from Hugging Face's mirror of Project Gutenberg's corpus—public domain texts, open dataset infrastructure, no proprietary content libraries. The entire pipeline from literature to exercises relies on openly accessible resources, making the system reproducible and auditable.
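
Hugging Face's datasets-server exposes a public `rows` endpoint that a browser can query directly, which is one way such streaming can work; the dataset id in the usage note is a placeholder, not necessarily the mirror the project uses:

```javascript
// Build a datasets-server "rows" URL for fetching a slice of a hosted
// dataset without downloading the corpus.
function rowsUrl(dataset, config, split, offset, length) {
  const params = new URLSearchParams({
    dataset,
    config,
    split,
    offset: String(offset),
    length: String(length),
  });
  return `https://datasets-server.huggingface.co/rows?${params}`;
}
```

Usage, with a hypothetical dataset id: `const { rows } = await (await fetch(rowsUrl("manu/project_gutenberg", "default", "train", 0, 10))).json();`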