Update model card: Add pipeline tag (#1)

- Update model card: Add pipeline tag (7cda26f29b25184410e39d65259cdb1c32ef9f88)

Co-authored-by: Niels Rogge <[email protected]>

Files changed (1)

README.md +14 -13

README.md CHANGED

@@ -1,15 +1,16 @@
 ---
-license: mit
 datasets:
 - monology/pile-uncopyrighted
 language:
 - en
 library_name: CALM
 tags:
 - large language models
 - language modeling
-metrics:
-- BrierLM
 ---
 # Continuous Autoregressive Language Models
@@ -25,19 +26,19 @@ Modern Large Language Models (LLMs) are constrained by a fundamental bottleneck:
 This is achieved through a two-stage process:
-1. **A high-fidelity autoencoder** learns to compress K tokens into a single vector and reconstruct them with near-perfect accuracy.
-2. **A continuous-domain language model** then performs autoregressive prediction in this vector space.
 ### Key Features
-* 🚀 **Ultra-Efficient by Design:** Dramatically improves training and inference efficiency by reducing the number of autoregressive steps by a factor of K.
-* 💡 **A New Scaling Axis:** Introduces a new scaling dimension for LLMs—semantic bandwidth (K). Instead of just scaling parameters and data, you can now scale the amount of information processed in a single step.
-* 🛠️ **A Comprehensive Likelihood-Free Toolkit:** Operating in a continuous domain requires new tools. This repository provides the full suite of algorithms that make CALM possible:
-  * **A Robust Autoencoder** to learn high-fidelity continuous representations of token chunks.
-  * **Energy-Based Training**, a principled and likelihood-free method for generative modeling.
-  * **BrierLM**, a new metric for calibrated, likelihood-free evaluation of language models.
-  * **Temperature Sampling** for controlled, high-quality text generation using only a black-box sampler.
 ## How to use
@@ -45,4 +46,4 @@ See our [GitHub README](https://github.com/shaochenze/calm), where we provide sc
 ## Contact
-If you have any questions, feel free to submit an issue or contact `[email protected]`.

 ---
 datasets:
 - monology/pile-uncopyrighted
 language:
 - en
 library_name: CALM
+license: mit
+metrics:
+- BrierLM
 tags:
 - large language models
 - language modeling
+pipeline_tag: text-generation
 ---
 # Continuous Autoregressive Language Models
 This is achieved through a two-stage process:
+1.  **A high-fidelity autoencoder** learns to compress K tokens into a single vector and reconstruct them with near-perfect accuracy.
+2.  **A continuous-domain language model** then performs autoregressive prediction in this vector space.
 ### Key Features
+*   🚀 **Ultra-Efficient by Design:** Dramatically improves training and inference efficiency by reducing the number of autoregressive steps by a factor of K.
+*   💡 **A New Scaling Axis:** Introduces a new scaling dimension for LLMs—semantic bandwidth (K). Instead of just scaling parameters and data, you can now scale the amount of information processed in a single step.
+*   🛠️ **A Comprehensive Likelihood-Free Toolkit:** Operating in a continuous domain requires new tools. This repository provides the full suite of algorithms that make CALM possible:
+  *   **A Robust Autoencoder** to learn high-fidelity continuous representations of token chunks.
+  *   **Energy-Based Training**, a principled and likelihood-free method for generative modeling.
+  *   **BrierLM**, a new metric for calibrated, likelihood-free evaluation of language models.
+  *   **Temperature Sampling** for controlled, high-quality text generation using only a black-box sampler.
 ## How to use
 ## Contact
+If you have any questions, feel free to submit an issue or contact `[email protected]`.
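
Reviewer note: the model card says that compressing K tokens into one vector reduces the number of autoregressive steps by a factor of K. The sketch below illustrates only that arithmetic. It is not code from the CALM repository: `chunk_tokens`, `autoregressive_steps`, and the `pad_id` padding of the final chunk are hypothetical names and assumptions chosen for illustration.

```python
import math


def chunk_tokens(token_ids, k, pad_id=0):
    """Split a token sequence into consecutive chunks of k tokens.

    The last chunk is right-padded with pad_id (an assumption made for
    this sketch; the real model may handle remainders differently).
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    chunks = []
    for start in range(0, len(token_ids), k):
        chunk = list(token_ids[start:start + k])
        chunk += [pad_id] * (k - len(chunk))  # pad only the final, short chunk
        chunks.append(chunk)
    return chunks


def autoregressive_steps(num_tokens, k):
    """Number of prediction steps when each step emits one chunk of k tokens."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    return math.ceil(num_tokens / k)


if __name__ == "__main__":
    tokens = [11, 12, 13, 14, 15]
    print(chunk_tokens(tokens, k=2))
    print(autoregressive_steps(1000, k=1), autoregressive_steps(1000, k=4))
```

With K = 1 the step count equals the token count, which matches ordinary token-by-token decoding. Larger K values divide the step count accordingly, with each chunk then being mapped to a single vector by the autoencoder described in the card.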