Dschobby committed on
Commit 776877d · verified · 1 Parent(s): c085019

Upload 14 files
.gitignore ADDED
@@ -0,0 +1,31 @@
+ example_evaluation.ipynb
+ test.py
+ forecast.csv
+ forecast.npy
+
+ __pycache__/
+ *.py[cod]
+ *$py.class
+ .pytest_cache/
+ .coverage
+ htmlcov/
+ .tox/
+ .nox/
+ *.so
+ .Python
+ env/
+ build/
+ develop-eggs/
+ dist/
+ downloads/
+ eggs/
+ .eggs/
+ lib/
+ lib64/
+ parts/
+ sdist/
+ var/
+ *.egg-info/
+ .installed.cfg
+ *.egg
+ .vscode/
README.md CHANGED
@@ -1,14 +1,66 @@
  ---
  title: DynaMix
- emoji: 🌖
- colorFrom: gray
- colorTo: yellow
  sdk: gradio
- sdk_version: 5.46.1
  app_file: app.py
  pinned: false
  license: cc-by-4.0
  short_description: Zero-shot forecasting of Dynamical Systems using DynaMix
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
  ---
  title: DynaMix
+ emoji: 🧨
+ colorFrom: blue
+ colorTo: red
  sdk: gradio
+ sdk_version: 5.43.1
  app_file: app.py
  pinned: false
  license: cc-by-4.0
  short_description: Zero-shot forecasting of Dynamical Systems using DynaMix
  ---

+ # DynaMix: Zero-shot Forecasting of Dynamical Systems
+
+ This DynaMix demo is an interactive Gradio app for zero-shot dynamical systems reconstruction using the DynaMix architecture (NeurIPS 2025 paper [![arXiv](https://img.shields.io/badge/arXiv-2505.13192-b31b1b.svg)](https://arxiv.org/abs/2505.13192)). It loads pretrained models from the Hugging Face Hub (see [DynaMix model](https://huggingface.co/DurstewitzLab/dynamix)) and generates forecasts from uploaded context data.
+
+ ### Key Features
+ - **Zero-shot forecasting**: Powered by the DynaMix model architecture
+ - **Custom Context Upload**: Upload your own CSV/NPY data or choose a preset (Lorenz63, Noisy Sine, Chua, Selkov)
+ - **Interactive Settings**: Configure forecast settings
+ - **Visualizations**: Plots of context data and forecast
+ - **Exports**: Download the forecast as CSV and NPY
+
+
+ ## Using the Application
+
+ ### Data Input
+ You can either upload your own data or choose a preset dataset from the left panel.
+
+ - **Upload**: Accepts `.csv` or `.npy` files
+ - **Presets**: `Noisy Sine`, `Lorenz63`, `Chua`, `Selkov`
+
+ Supported data formats:
+ - **NPY files**: NumPy array of shape `(time_steps, dimensions)`. 1D time series arrays are auto-expanded to `(time_steps, 1)`; otherwise the array must be 2D with at least 2 time steps and ≥1 dimension.
+
+ - **CSV files**: Each column is a dimension; each row is a time step. Only numeric columns are used. Data must be 2D with at least 2 time steps and ≥1 dimension.
+
+ Example CSV format:
+ ```csv
+ dim_1,dim_2,dim_3
+ 0.1,0.2,0.3
+ 0.4,0.5,0.6
+ 0.7,0.8,0.9
+ ```
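+
+ For reference, a minimal sketch of preparing a matching `.npy` upload with NumPy (array values and file name are illustrative):
+ ```python
+ import numpy as np
+
+ # 500 time steps of a 3-dimensional system -> shape (time_steps, dimensions)
+ context = np.random.randn(500, 3).astype(np.float32)
+ np.save("my_context.npy", context)  # upload my_context.npy in the app
+ ```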
+
+ ### Forecast Settings
+ - **Model Selection**: Select the pretrained model to use for forecasting.
+
+ - **Forecast Length**: Number of future steps to generate (`1`–`2001`, step `100`, default `512`)
+
+ - **Advanced Settings**
+   - **Preprocessing Method**: Embedding method used to preprocess the context data (for cases where input dims < model dims)
+   - **Standardize**: Normalize the context to zero mean and unit variance (default: enabled)
+   - **Fit Nonstationary**: Account for non-stationary trends in the data (default: disabled)
+   - **Context Steps**: Maximum number of trailing steps from the uploaded data to use as context. If your uploaded sequence is longer, it is truncated to the most recent `Context Steps`. (default `2048`)
+
+ ### Outputs
+ - **Interactive Plot**: Shows historical context (blue) and forecast (red) per dimension, up to 15 dimensions.
+ - **Files**:
+   - `forecast.csv`: Full forecast for all dimensions.
+   - `forecast.npy`: Full forecast ndarray including all dimensions.
+
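+ A minimal sketch of loading the downloaded outputs (file names as produced by the app; CSV columns are the input column names, or `dim_1, dim_2, ...` for NPY inputs):
+ ```python
+ import numpy as np
+ import pandas as pd
+
+ forecast = np.load("forecast.npy")         # shape (forecast_length, dimensions)
+ forecast_df = pd.read_csv("forecast.csv")  # one column per dimension
+ ```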
+
+ ## License
+ This project is released under the **CC BY 4.0** license.
app.py ADDED
@@ -0,0 +1,243 @@
+ import gradio as gr
+ import pandas as pd
+ import torch
+ import os
+ import numpy as np
+ from datetime import datetime
+
+ from dynamix.forecaster import DynaMixForecaster
+ from dynamix.utilities import load_hf_model, auto_model_selection
+ from dynamix.utilities import create_forecast_plot
+
+
+ # --- Gradio UI ---
+ with gr.Blocks(title="DynaMix 🧨 - Forecasting", theme=gr.themes.Soft()) as demo:
+     gr.Markdown("# DynaMix 🧨 - Forecasting")
+
+     with gr.Row():
+         # Left sidebar for configuration
+         with gr.Column(scale=1):
+             gr.Markdown("Upload your data or choose a preset, then generate forecasts.")
+
+             # Data upload section
+             gr.Markdown("## Data Selection")
+             with gr.Group():
+                 file_input = gr.File(
+                     file_types=[".csv", ".npy"],
+                     label="Upload CSV / NPY",
+                     height=200
+                 )
+
+                 preset_dropdown = gr.Dropdown(
+                     choices=["-- No preset selected --", "Noisy Sine", "Lorenz63", "Chua", "Selkov"],
+                     value="-- No preset selected --",
+                     label="Or choose a preset",
+                     info="Select from predefined datasets"
+                 )
+
+             # Forecast settings
+             gr.Markdown("## Forecast Settings")
+             with gr.Group():
+                 model_selection = gr.Dropdown(
+                     choices=["Auto"],
+                     value="Auto",
+                     label="Model Selection",
+                     info="Choose the DynaMix model to use for forecasting"
+                 )
+                 horizon_slider = gr.Slider(
+                     minimum=1,
+                     maximum=2001,
+                     value=512,
+                     step=100,
+                     label="Forecast Length",
+                     info="Choose how many future steps to forecast"
+                 )
+
+             # Advanced settings
+             with gr.Accordion("⚙️ Advanced Settings", open=False):
+                 preprocessing_method = gr.Dropdown(
+                     choices=["pos_embedding", "zero_embedding", "delay_embedding", "delay_embedding_random"],
+                     value="pos_embedding",
+                     label="Preprocessing Method",
+                     info="Select the embedding method for time series with dimension < model dimension"
+                 )
+                 standardize = gr.Checkbox(
+                     value=True,
+                     label="Standardize",
+                     info="Normalize the data to zero mean and unit variance"
+                 )
+                 fit_nonstationary = gr.Checkbox(
+                     value=False,
+                     label="Fit Nonstationary",
+                     info="Account for non-stationary trends in the data"
+                 )
+                 context_steps = gr.Number(
+                     value=2048,
+                     label="Context Steps",
+                     info="Maximum number of steps to use as context from provided data (default: 2048)"
+                 )
+
+             plot_btn = gr.Button("► Plot Forecasts", variant="primary", size="lg")
+
+             gr.Markdown("# Instructions")
+             instructions = gr.Markdown("""
+             **📊 Data Format Requirements**
+
+             **NPY Files**: Shape: `(time_steps, dimensions)` or `(time_steps,)`\n
+             **CSV Files**: Each column = one dimension, each row = one time step
+
+             **⚡ Quick Start**
+             1. **Upload** a single dynamical system or time series (CSV or NPY)
+             2. **Configure** forecast length and settings
+             3. **Generate** predictions with "Plot Forecasts" (up to 15 dims of data are plotted)
+             4. **Download** the forecast as CSV or NPY
+             """)
+
+         # Right section for plots and downloads
+         with gr.Column(scale=3):
+             gr.Markdown("# Forecast Plot")
+             plot_output = gr.Plot(show_label=False)
+             with gr.Row():
+                 csv_output = gr.File(label="Download Forecast CSV", visible=True)
+                 npy_output = gr.File(label="Download Forecast NPY", visible=True)
+
+     def load_preset_data(preset_name):
+         """Load preset data from the data folder"""
+         if preset_name == "-- No preset selected --":
+             return None
+
+         preset_files = {
+             "Lorenz63": "data/lorenz63.npy",
+             "Noisy Sine": "data/sine.npy",
+             "Chua": "data/chua.npy",
+             "Selkov": "data/selkov.npy"
+         }
+
+         if preset_name in preset_files:
+             file_path = preset_files[preset_name]
+             if os.path.exists(file_path):
+                 return file_path
+         return None
+
+     def run_forecast(file, horizon, model_selection, preprocessing_method, standardize, fit_nonstationary, context_steps, preset_selection):
+         try:
+
+             # 1. Load the data
+             # Check if preset is selected
+             preset_file_path = load_preset_data(preset_selection)
+
+             if not file and not preset_file_path:
+                 gr.Warning("Please upload a file or select a preset.")
+                 raise ValueError("Please upload a file or select a preset.")
+
+             # Use preset file if selected, otherwise use uploaded file
+             if preset_file_path:
+                 file_path = preset_file_path
+                 ext = ".npy"
+             else:
+                 file_path = file.name
+                 ext = os.path.splitext(file.name)[1].lower()
+
+             # Load input file (.csv or .npy)
+             if ext == ".csv":
+                 df = pd.read_csv(file_path)
+
+                 if 'series_name' in df.columns:
+                     gr.Warning("Unsupported CSV format: only column-per-dimension format is supported.")
+                     raise ValueError("Unsupported CSV format: only column-per-dimension format is supported.")
+
+                 # Keep only numeric columns
+                 df = df.select_dtypes(include=[np.number]).copy()
+                 if df.shape[1] == 0:
+                     gr.Warning("No numeric columns found in CSV file.")
+                     raise ValueError("No numeric columns found in CSV file.")
+                 values = df.values
+             elif ext == ".npy":
+                 values = np.load(file_path)
+                 # Defer DataFrame creation until after shape validation (handles 1D arrays)
+                 df = None
+             else:
+                 gr.Warning("Unsupported file format. Please upload .csv or .npy")
+                 raise ValueError("Unsupported file format. Please upload .csv or .npy")
+
+             # 2. Validate shape and dimensions, then construct context
+             if not isinstance(values, np.ndarray):
+                 values = np.asarray(values)
+             if values.ndim != 2:
+                 if values.ndim == 1:
+                     values = np.reshape(values, (-1, 1))
+                 else:
+                     gr.Warning("Input must be 2D with shape (time_steps, dimensions).")
+                     raise ValueError("Input must be 2D with shape (time_steps, dimensions).")
+             if values.shape[0] < 2:
+                 gr.Warning("Input must contain at least 2 time steps.")
+                 raise ValueError("Input must contain at least 2 time steps.")
+             if values.shape[1] < 1:
+                 gr.Warning("Input must contain at least 1 dimension.")
+                 raise ValueError("Input must contain at least 1 dimension.")
+             if values.shape[1] > 100:
+                 gr.Warning(f"Too many dimensions: {values.shape[1]} > 100. Reduce dimensions to ≤ 100.")
+                 raise ValueError(f"Too many dimensions: {values.shape[1]} > 100. Reduce dimensions to ≤ 100.")
+             if int(context_steps) < values.shape[0]:
+                 values = values[-int(context_steps):]  # Use only the last n steps (gr.Number yields a float)
+             values = values.astype(np.float32)
+             context_ts_tensor = torch.tensor(values, dtype=torch.float32)
+
+             # 3. Load the selected model
+             if model_selection == "Auto":
+                 current_model = load_hf_model(auto_model_selection(context_ts_tensor))
+             else:
+                 current_model = load_hf_model(model_selection)
+             forecaster = DynaMixForecaster(current_model)
+             if values.shape[1] > current_model.N:
+                 gr.Warning(f"Input dimension {values.shape[1]} > model dimension {current_model.N}. This may lead to performance degradation.")
+
+             # 4. Run forecast
+             with torch.no_grad():
+                 reconstruction_ts = forecaster.forecast(
+                     context=context_ts_tensor,
+                     horizon=int(horizon),
+                     preprocessing_method=preprocessing_method,
+                     standardize=standardize,
+                     fit_nonstationary=fit_nonstationary,
+                 )
+             reconstruction_ts_np = reconstruction_ts.cpu().numpy()
+
+             # 5. Create Plotly figure
+             fig = create_forecast_plot(values, reconstruction_ts_np, horizon)
+
+             # 6. Save forecast as CSV (all dimensions) and NPY (all dimensions)
+             if df is None:
+                 # Create column names for NPY input after shape normalization
+                 df = pd.DataFrame(values, columns=[f"dim_{i+1}" for i in range(values.shape[1])])
+             forecast_df = pd.DataFrame(reconstruction_ts_np, columns=df.columns.tolist())
+             csv_path = "forecast.csv"
+             forecast_df.to_csv(csv_path, index=False)
+
+             # 7. Save full forecast as NPY (all dimensions)
+             npy_path = "forecast.npy"
+             np.save(npy_path, reconstruction_ts_np)
+
+             # 8. Print success notification with timestamp
+             current_time = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
+             print(f"[{current_time}] - Forecast completed successfully!")
+
+             return fig, csv_path, npy_path
+
+         except Exception as e:
+             current_time = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
+             print(f"[{current_time}] - Forecast error: {str(e)}")
+             return None, None, None
+
+     plot_btn.click(
+         run_forecast,
+         inputs=[
+             file_input, horizon_slider, model_selection, preprocessing_method, standardize,
+             fit_nonstationary, context_steps, preset_dropdown
+         ],
+         outputs=[plot_output, csv_output, npy_output]
+     )
+
+ if __name__ == "__main__":
+     demo.launch()
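The validation logic in `run_forecast` boils down to a small contract on the uploaded array; a condensed standalone sketch of that contract (function name illustrative, not defined in the commit):

```python
import numpy as np

def normalize_context(values, context_steps=2048):
    """Mirror run_forecast's checks: 2D (time_steps, dims), >=2 steps, <=100 dims."""
    values = np.asarray(values)
    if values.ndim == 1:
        values = values.reshape(-1, 1)  # 1D series -> (time_steps, 1)
    if values.ndim != 2 or values.shape[0] < 2 or not (1 <= values.shape[1] <= 100):
        raise ValueError("Expected 2D (time_steps, dimensions) with >=2 steps and <=100 dims")
    return values[-int(context_steps):].astype(np.float32)  # keep most recent steps
```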
data/chua.npy ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bc0a1090e555f13aab17aa70feca3dd0fe64f50edbd33849105a84ce86f08d11
+ size 24128
data/lorenz63.npy ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:aafcc2759e8981b44b4cc9f335967934647b20e27b1952d89fb0f371e1a835a6
+ size 48128
data/selkov.npy ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2576908c5fd0e55267a41022f81b5b9c8f5f8fbec326a0977d8a702982ee4fef
+ size 1160
data/sine.npy ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:429315428d533a103b6d772cbb2d5d341f0a73145bb196758717cf73b0a655b2
+ size 4224
dynamix/__init__.py ADDED
@@ -0,0 +1,9 @@
+ """
+ Model module for Zero-shot DSR.
+ """
+
+ from .dynamix import *
+ from .preprocessing_utilities import *
+ from .preprocessing import *
+ from .forecaster import *
+ from .utilities import *
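Since the package re-exports its submodules at the top level, downstream imports can be flat; a minimal sketch (class names as defined in the modules below):

```python
from dynamix import DynaMix, DynaMixForecaster, DataPreprocessor
```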
dynamix/dynamix.py ADDED
@@ -0,0 +1,266 @@
+ import torch
+ import torch.nn as nn
+ import torch.nn.functional as F
+ import numpy as np
+
+ class GatingNetwork(nn.Module):
+     def __init__(self, N, M, Experts, dtype=torch.float32):
+         super().__init__()
+         self.conv = nn.Conv1d(N, N, kernel_size=2, padding=0, bias=True, dtype=dtype)
+         self.softmax_temp1 = nn.Parameter(torch.tensor([0.1], dtype=dtype))
+         self.D = nn.Parameter(torch.zeros(N, M, dtype=dtype))
+         self.D.data[:, :N] = torch.eye(N, dtype=dtype)
+         self.mlp_layer1 = nn.Linear(M + N, Experts, dtype=dtype)
+         self.mlp_layer2 = nn.Linear(Experts, Experts, dtype=dtype)
+         self.softmax_temp2 = nn.Parameter(torch.tensor([0.1], dtype=dtype))
+         self.sigma = nn.Parameter(torch.ones(N, dtype=dtype) * 0.05, requires_grad=True)
+
+     def forward(self, context, z, precomputed_cnn=None):
+         # context: (seq_length, batch_size, N)
+         # z: (M, batch_size)
+         # precomputed_cnn: Optional precomputed CNN features for inference (seq_length-1, batch_size, N)
+
+         seq_length, batch_size, N = context.shape
+         M = z.shape[0]
+
+         # Compute attention weights
+         z_obs = self.D @ z.detach()
+         z_current = z_obs + self.sigma.unsqueeze(1) * torch.randn(N, batch_size, dtype=z.dtype, device=z.device)
+
+         z_current_t = z_current.transpose(0, 1)
+         context_frames = context[:-1]
+
+         distances = torch.sum(torch.abs(context_frames - z_current_t.unsqueeze(0)), dim=2)
+         attention_weights = F.softmax(-distances / torch.abs(self.softmax_temp1[0]), dim=0)
+
+         # Process context with convolution
+         # Use precomputed CNN features if provided, otherwise compute them
+         if precomputed_cnn is not None:
+             encoded = precomputed_cnn
+         else:
+             context_for_conv = context.permute(1, 2, 0)
+             encoded = self.conv(context_for_conv)
+             encoded = encoded.permute(2, 0, 1)
+
+         # Build weighted embedding
+         weighted_encoded = encoded * attention_weights.unsqueeze(2)
+         embedding = torch.sum(weighted_encoded, dim=0)
+         embedding = embedding.transpose(0, 1)
+
+         # Predict expert weights
+         combined = torch.cat([embedding, z], dim=0)
+         combined_t = combined.transpose(0, 1)
+         mlp_output = self.mlp_layer2(F.relu(self.mlp_layer1(combined_t)))
+         w_exp = F.softmax(-mlp_output.transpose(0, 1) / torch.abs(self.softmax_temp2[0]), dim=0)
+         return w_exp
+
+     def gaussian_init(self, M, N, dtype=torch.float32):
+         return torch.randn(M, N, dtype=dtype) * 0.01
+
+ class ExpertNetwork(nn.Module):
+     """Base class for different expert architectures."""
+     def __init__(self, M, P=0, probabilistic=False, dtype=torch.float32):
+         super().__init__()
+         self.M = M
+         self.P = P
+         self.probabilistic = probabilistic
+         self.dtype = dtype
+
+         # Parameter for probabilistic experts
+         if probabilistic:
+             self.sigma = nn.Parameter(torch.ones(1, dtype=dtype) * 0.05, requires_grad=True)
+
+     def forward(self, z):
+         raise NotImplementedError("Subclasses must implement forward method")
+
+     def add_noise(self, z):
+         """Add stochasticity to the latent state if in probabilistic mode.
+
+         Args:
+             z: Input tensor
+         """
+         if self.probabilistic:
+             batch_size = z.shape[1]
+             noise = torch.randn(self.M, batch_size, dtype=z.dtype, device=z.device)
+             return z + self.sigma * noise
+         return z
+
+     def gaussian_init(self, M, N):
+         return torch.randn(M, N, dtype=self.dtype) * 0.01
+
+     def normalized_positive_definite(self, M):
+         R = np.random.randn(M, M).astype(np.float32)
+         K = R.T @ R / M + np.eye(M)
+         lambd = np.max(np.abs(np.linalg.eigvals(K)))
+         return K / lambd
+
+ class AlmostLinearRNN(ExpertNetwork):
+     """Almost linear RNN expert architecture."""
+     def __init__(self, M, P, probabilistic=False, dtype=torch.float32):
+         super().__init__(M, P, probabilistic, dtype=dtype)
+         self.A, self.W, self.h = self.initialize_A_W_h(M)
+
+     def forward(self, z):
+         # z: (M, batch_size)
+         # Split z into regular and ReLU parts
+         z1 = z[:-self.P, :]
+         z2 = F.relu(z[-self.P:, :])
+         zcat = torch.cat([z1, z2], dim=0)
+
+         output = self.A.unsqueeze(-1) * z + self.W @ zcat + self.h.unsqueeze(-1)
+
+         # Add stochasticity if probabilistic
+         if self.probabilistic:
+             output = self.add_noise(output)
+
+         return output
+
+     def initialize_A_W_h(self, M):
+         A = torch.nn.Parameter(torch.diag(torch.tensor(self.normalized_positive_definite(M), dtype=self.dtype)))
+         W = torch.nn.Parameter(self.gaussian_init(M, M))
+         h = torch.nn.Parameter(torch.zeros(M, dtype=self.dtype))
+         return A, W, h
+
+ class ClippedShallowPLRNN(ExpertNetwork):
+     """Clipped shallow PLRNN expert architecture."""
+     def __init__(self, M, hidden_dim=50, probabilistic=False, dtype=torch.float32):
+         super().__init__(M, hidden_dim, probabilistic, dtype=dtype)
+         self.A = torch.nn.Parameter(torch.diag(torch.tensor(self.normalized_positive_definite(M), dtype=self.dtype)))
+         self.W1 = torch.nn.Parameter(self.gaussian_init(M, hidden_dim))
+         self.W2 = torch.nn.Parameter(self.gaussian_init(hidden_dim, M))
+         self.h1 = torch.nn.Parameter(torch.zeros(M, dtype=self.dtype))
+         self.h2 = torch.nn.Parameter(torch.zeros(hidden_dim, dtype=self.dtype))
+
+     def forward(self, z):
+         # z: (M, batch_size)
+         W2z = self.W2 @ z
+         output = (self.A.unsqueeze(-1) * z +
+                   self.W1 @ (F.relu(W2z + self.h2.unsqueeze(-1)) - F.relu(W2z)) +
+                   self.h1.unsqueeze(-1))
+
+         # Add stochasticity if probabilistic
+         if self.probabilistic:
+             output = self.add_noise(output)
+
+         return output
+
+ class DynaMix(nn.Module):
+     def __init__(self, M, N, Experts, P=2, hidden_dim=50, expert_type="almost_linear_rnn",
+                  probabilistic_expert=False, dtype=torch.float32):
+         """
+         Initialize a DynaMix model.
+
+         Args:
+             M: Dimension of latent state
+             N: Dimension of observation space
+             Experts: Number of experts
+             P: Number of ReLU dimensions
+             hidden_dim: Hidden dimension for clipped shallow PLRNN
+             expert_type: Type of expert to use ("almost_linear_rnn" or "clipped_shallow_plrnn")
+             probabilistic_expert: Whether to use probabilistic experts
+             dtype: Data type for model parameters (default: torch.float32)
+         """
+         super().__init__()
+
+         self.expert_type = expert_type
+         self.probabilistic_expert = probabilistic_expert
+         self.experts = nn.ModuleList()
+         self.dtype = dtype
+
+         for _ in range(Experts):
+             if expert_type == "almost_linear_rnn":
+                 self.experts.append(AlmostLinearRNN(M, P, probabilistic=probabilistic_expert, dtype=dtype))
+             elif expert_type == "clipped_shallow_plrnn":
+                 self.experts.append(ClippedShallowPLRNN(M, hidden_dim, probabilistic=probabilistic_expert, dtype=dtype))
+             else:
+                 raise ValueError(f"Unknown expert type: {expert_type}")
+
+         self.gating_network = GatingNetwork(N, M, Experts, dtype=dtype)
+         self.B = nn.Parameter(self.uniform_init((N, M), dtype=dtype))
+         self.N = N
+         self.Experts = Experts
+         self.P = P
+         self.hidden_dim = hidden_dim
+         self.M = M
+
+     def step(self, z, context, precomputed_cnn=None):
+         # z: (M, batch_size)
+         # context: (seq_length, batch_size, N)
+         # precomputed_cnn: Optional precomputed CNN features
+
+         # Compute expert weights
+         w_exp = self.gating_network(context, z, precomputed_cnn=precomputed_cnn)  # (Experts, batch_size)
+         results = []
+
+         # Compute expert outputs
+         for i in range(self.Experts):
+             expert_output = self.experts[i](z)
+             results.append(expert_output * w_exp[i, :].unsqueeze(0))
+
+         # Combine expert outputs
+         return torch.sum(torch.stack(results, dim=0), dim=0)
+
+     def forward(self, z, context, precomputed_cnn=None):
+         """
+         Forward pass through the DynaMix model.
+
+         Args:
+             z: Latent state of shape (M, batch_size)
+             context: Context data of shape (seq_length, batch_size, N)
+             precomputed_cnn: Optional precomputed CNN features to avoid redundant computation for inference
+                 Shape should be (seq_length-1, batch_size, N)
+
+         Returns:
+             Updated latent state
+         """
+         return self.step(z, context, precomputed_cnn=precomputed_cnn)
+
+     def precompute_cnn(self, context):
+         """
+         Precompute CNN features for more efficient inference.
+
+         Args:
+             context: Context data of shape (seq_length, batch_size, N)
+
+         Returns:
+             Precomputed CNN features of shape (seq_length-1, batch_size, N)
+         """
+         # Process context with convolution
+         context_for_conv = context.permute(1, 2, 0)
+         encoded = self.gating_network.conv(context_for_conv)
+
+         return encoded.permute(2, 0, 1)
+
+     def uniform_init(self, shape, dtype=torch.float32):
+         din = shape[-1]
+         r = 1 / np.sqrt(din)
+         return (torch.rand(shape, dtype=dtype) * 2 - 1) * r
+
+     def gaussian_init(self, M, N):
+         return torch.randn(M, N, dtype=self.dtype) * 0.01
+
+ def print_model_parameters(model):
+     """Print simplified breakdown of model parameters by component."""
+     total_params = sum(p.numel() for p in model.parameters())
+
+     print("\n" + "-"*60)
+     print("Model Parameter Summary:")
+     print(f" Architecture: DynaMix with {model.expert_type} experts")
+     if model.expert_type == "almost_linear_rnn":
+         print(f" Dimensions: M={model.M}, N={model.N}, Experts={model.Experts}, P={model.P}")
+     else:
+         print(f" Dimensions: M={model.M}, N={model.N}, Experts={model.Experts}, Hidden dim={model.hidden_dim}")
+     print(f" Probabilistic experts: {model.probabilistic_expert}")
+
+     # Count parameters
+     gating_params = sum(p.numel() for p in model.gating_network.parameters())
+     expert_params = sum(p.numel() for expert in model.experts for p in expert.parameters())
+     b_params = model.B.numel()
+
+     # Print parameter counts
+     print(f"\nParameter counts:")
+     print(f" Gating Network: {gating_params:,} ({gating_params/total_params:.1%})")
+     print(f" Experts: {expert_params:,} ({expert_params/total_params:.1%})")
+     print(f" Observation matrix: {b_params:,} ({b_params/total_params:.1%})")
+     print(f" Total: {total_params:,} parameters")
+     print("-"*60)
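A minimal shape-check sketch for a single DynaMix latent step (dimensions illustrative; `forward` delegates to `step` as defined above):

```python
import torch
from dynamix.dynamix import DynaMix, print_model_parameters

model = DynaMix(M=8, N=3, Experts=4)  # latent dim 8, observation dim 3, 4 experts
context = torch.randn(64, 1, 3)       # (seq_length, batch_size, N)
z = torch.randn(8, 1)                 # (M, batch_size)
z_next = model(z, context)            # one latent step -> shape (8, 1)
print_model_parameters(model)         # parameter breakdown by component
```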
dynamix/forecaster.py ADDED
@@ -0,0 +1,199 @@
+ import torch
+ import torch.nn.functional as F
+ from .preprocessing import DataPreprocessor
+
+
+ class DynaMixForecaster:
+     """
+     Forecasting pipeline for DynaMix models with batch processing support.
+     """
+     def __init__(self, model):
+         """
+         Initialize the forecaster with a DynaMix model.
+
+         Args:
+             model: DynaMix model instance
+         """
+         self.model = model
+
+     def _init_latent_state(self, initial_condition):
+         """
+         Initialize the latent state from the initial condition.
+
+         Args:
+             initial_condition: Initial state of shape (batch_size, N)
+
+         Returns:
+             Initial latent state z
+         """
+         N = self.model.N
+
+         # Initialize latent state
+         z = torch.matmul(initial_condition, self.model.B).t()  # (M, batch_size)
+         z[:N, :] = initial_condition.t()
+
+         return z
+
+     def _reshape_for_model(self, context, initial_x=None, device=None):
+         """
+         Prepare and reshape input data for the model.
+         Handles tensor conversion, dimension adjustments, and reshaping when feature_dim > model_dim.
+
+         Args:
+             context: Context data tensor of shape (seq_length, batch_size, feature_dim) or (seq_length, feature_dim)
+             initial_x: Optional initial condition of shape (batch_size, feature_dim) or (feature_dim,)
+             device: Device to place tensors on
+
+         Returns:
+             Processed context, initial_x, dimensions, and reshaping metadata
+         """
+         # Get the dtype from model parameters
+         model_dtype = next(self.model.parameters()).dtype
+
+         # Convert to torch tensor if needed
+         if not isinstance(context, torch.Tensor):
+             context = torch.tensor(context, dtype=model_dtype, device=device)
+         elif context.device != device or context.dtype != model_dtype:
+             context = context.to(device=device, dtype=model_dtype)
+
+         if initial_x is not None and not isinstance(initial_x, torch.Tensor):
+             initial_x = torch.tensor(initial_x, dtype=model_dtype, device=device)
+         elif initial_x is not None and (initial_x.device != device or initial_x.dtype != model_dtype):
+             initial_x = initial_x.to(device=device, dtype=model_dtype)
+
+         # Check data dimensions and reshape if needed
+         original_dim = context.dim()
+         if original_dim == 2:
+             context = context.unsqueeze(1)  # (seq_length, feature_dim) -> (seq_length, 1, feature_dim)
+         elif original_dim != 3:
+             raise ValueError(f"Expected 2D or 3D tensor for context, got shape {context.shape} with {context.dim()} dimensions")
+         if initial_x is not None and initial_x.dim() == 1:
+             initial_x = initial_x.unsqueeze(0)  # (feature_dim,) -> (1, feature_dim)
+         if initial_x is not None and initial_x.shape[1] != context.shape[2]:
+             raise ValueError(f"Initial condition has {initial_x.shape[1]} features, but context has {context.shape[2]} features")
+
+         # Data shape
+         seq_length, batch_size, feature_dim = context.shape
+
+         # Check if reshaping is needed for model dimension
+         if feature_dim <= self.model.N:
+             return context, initial_x, (batch_size, feature_dim, False, None, None, original_dim)
+
+         print(f"Warning: Input feature dimension {feature_dim} exceeds model dimension {self.model.N}. "
+               f"This may lead to performance degradation. "
+               f"Reshaping data to treat each feature as a separate time series.")
+
+         # Store original dimensions for reshaping back later
+         original_batch_size = batch_size
+         original_feature_dim = feature_dim
+
+         # Reshape context to (seq_length, batch_size * feature_dim, 1)
+         transposed = context.permute(0, 2, 1)
+         new_batch_size = batch_size * feature_dim
+         reshaped_context = transposed.reshape(seq_length, new_batch_size, 1)
+
+         # Similarly reshape initial_x if provided
+         reshaped_initial_x = initial_x
+         if initial_x is not None:
+             # Reshape from (batch_size, feature_dim) to (batch_size * feature_dim, 1)
+             reshaped_initial_x = initial_x.transpose(0, 1).reshape(new_batch_size, 1)
+
+         return reshaped_context, reshaped_initial_x, (new_batch_size, 1, True, original_batch_size, original_feature_dim, original_dim)
+
+     def _reshape_to_original(self, output, reshape_metadata):
+         """
+         Reshape output back to original dimensions.
+         Handles both high-dimensional reshaping and 2D input restoration.
+
+         Args:
+             output: Model output of shape (T, batch_size, N)
+             reshape_metadata: Tuple containing (was_reshaped, original_batch_size, original_feature_dim, original_dim)
+
+         Returns:
+             Output with original shape restored
+         """
+         _, _, was_reshaped, original_batch_size, original_feature_dim, original_dim = reshape_metadata
+
+         # Step 1: Reshape back to original dimensions if needed
+         if was_reshaped:
+             # Current shape: (T, batch_size=original_batch_size*original_feature_dim, 1)
+             T = output.shape[0]
+
+             # First reshape to (T, original_feature_dim, original_batch_size)
+             # by treating the batch dimension as (original_feature_dim, original_batch_size)
+             reshaped = output.reshape(T, original_feature_dim, original_batch_size, -1)
+
+             # Then permute to (T, original_batch_size, original_feature_dim)
+             output = reshaped.permute(0, 2, 1, 3).squeeze(-1)
+
+         # Step 2: If input was 2D, remove batch dimension from output
+         if original_dim == 2 and output.shape[1] == 1:
+             output = output.squeeze(1)
+
+         return output
+
+     @torch.no_grad()
+     def forecast(self, context, horizon, preprocessing_method="pos_embedding",
+                  standardize=True, fit_nonstationary=False, initial_x=None):
+         """
+         Efficient batched forecasting with the DynaMix model.
+
+         This method implements a complete forecasting pipeline including:
+         - Data preprocessing (Box-Cox, detrending, standardization)
+         - Embedding techniques for dimensionality matching
+         - DynaMix model prediction
+         - Data postprocessing (inverse transformations)
+
+         Args:
+             context: Context data tensor of shape (seq_length, batch_size, feature_dim) or (seq_length, feature_dim)
+             horizon: Forecast horizon (number of steps to predict)
+             preprocessing_method: Data preprocessing method ('pos_embedding', 'zero_embedding',
+                 'delay_embedding', or 'delay_embedding_random') (default: 'pos_embedding')
+             standardize: Whether to standardize the data (default: True)
+             fit_nonstationary: Whether to fit a non-stationary time series (default: False)
+             initial_x: Optional initial condition of shape (batch_size, feature_dim) or (feature_dim,)
+
+         Returns:
+             Predicted sequence of shape (horizon, batch_size, feature_dim)
+         """
+         # Get model dimensions
+         M = self.model.M
+         N = self.model.N
+         device = context.device if isinstance(context, torch.Tensor) else self.model.B.device
+         model_dtype = next(self.model.parameters()).dtype
+
+         # Apply context reshaping if needed
+         context, initial_x, shape_metadata = self._reshape_for_model(context, initial_x, device)
+
+         # Create data preprocessor
+         preprocessor = DataPreprocessor(
+             standardize=standardize,
+             box_cox=fit_nonstationary,
+             detrending=fit_nonstationary,
+             preprocessing_method=preprocessing_method
+         )
+
+         # Step 1: Apply preprocessing pipeline
+         context_embedded, initial_condition = preprocessor.preprocess(context, self.model.N, initial_x)
+
+         # Step 2: Initialize latent state
+         z = self._init_latent_state(initial_condition)
+
+         # Step 3: Perform forecasting loop
+         Z_gen = torch.empty(horizon, M, shape_metadata[0], device=device, dtype=model_dtype)
+         with torch.amp.autocast(device_type='cuda' if device.type == 'cuda' else 'cpu', enabled=device.type == 'cuda'):
+             precomputed_cnn = self.model.precompute_cnn(context_embedded)
+             for t in range(horizon):
+                 z = self.model(z, context_embedded, precomputed_cnn=precomputed_cnn)
+                 Z_gen[t] = z
+
+         # Step 4: Apply observation generation
+         output = Z_gen[:, :shape_metadata[1], :].permute(0, 2, 1)  # (horizon, batch_size, feature_dim)
+
+         # Step 5: Apply inverse data transformations (e.g. standardization, ...)
+         output = preprocessor.postprocess(output)
+
+         # Step 6: Reshape back to original dimensions if needed
+         output = self._reshape_to_original(output, shape_metadata)
+
+         return output
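A minimal end-to-end sketch of the forecasting pipeline, mirroring how `app.py` wires it up (`load_hf_model` and `auto_model_selection` come from `dynamix.utilities`, which is not shown in this excerpt):

```python
import torch
from dynamix.forecaster import DynaMixForecaster
from dynamix.utilities import load_hf_model, auto_model_selection

context = torch.randn(512, 3)  # (seq_length, feature_dim); 2D input implies batch_size 1
model = load_hf_model(auto_model_selection(context))  # pick and load a pretrained model
forecaster = DynaMixForecaster(model)
forecast = forecaster.forecast(context, horizon=256)  # -> (256, 3) for 2D input
```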
dynamix/preprocessing.py ADDED
@@ -0,0 +1,262 @@
+ import torch
+ import numpy as np
+ from .preprocessing_utilities import (TimeSeriesProcessor, Embedding,
+                                       BoxCoxTransformer, Detrending, estimate_initial_condition)
+
+
+ class DataPreprocessor:
+     """
+     Main class for data preprocessing that orchestrates all transformations.
+     """
+     def __init__(self, standardize=True, box_cox=False, detrending=False, preprocessing_method="pos_embedding"):
+         """
+         Initialize the data preprocessor.
+
+         Args:
+             standardize: Whether to standardize the data
+             box_cox: Whether to apply Box-Cox transformation
+             detrending: Whether to apply exponential detrending
+             preprocessing_method: Method for embedding ('pos_embedding', 'zero_embedding',
+                 'delay_embedding', 'delay_embedding_random')
+         """
+         self.standardize = standardize
+         self.box_cox = box_cox
+         self.detrending = detrending
+         self.preprocessing_method = preprocessing_method
+
+         # Parameters for inverse transformations
+         self.box_cox_params_list = None
+         self.detrending_params_list = None
+         self.context_mean = None
+         self.context_std = None
+         self.original_context = None
+         self.batch_size = None
+         self.feature_dim = None
+
+     def _apply_transformations(self, context):
+         """
+         Apply Box-Cox transformation and/or detrending to each batch in the context data.
+
+         Args:
+             context: Context data tensor of shape (seq_length, batch_size, N_data)
+
+         Returns:
+             Transformed context data
+         """
+         # Store original context for inverse transformations
+         self.original_context = context.clone()
+
+         # Apply Box-Cox transformation for each batch
+         if self.box_cox:
+             transformed_context = torch.zeros_like(context)
+             self.box_cox_params_list = []
+
+             for b in range(self.batch_size):
+                 batch_context = context[:, b, :]
+                 transformed, params = BoxCoxTransformer.transform(batch_context)
+                 transformed_context[:, b, :] = transformed
+                 self.box_cox_params_list.append(params)
+
+             context = transformed_context
+
+         # Apply detrending for each batch
+         if self.detrending:
+             detrended_context = torch.zeros_like(context)
+             self.detrending_params_list = []
+
+             for b in range(self.batch_size):
+                 batch_context = context[:, b, :]
+                 detrended, params = Detrending.apply_detrending(batch_context)
+                 detrended_context[:, b, :] = detrended
+                 self.detrending_params_list.append(params)
+
+             context = detrended_context
+
+         return context
+
+     def _apply_transformations_inverse(self, output):
+         """
+         Apply inverse Box-Cox and detrending transformations.
+
+         Args:
+             output: Model output of shape (T, batch_size, N)
+
+         Returns:
+             Output with transformations reversed
+         """
+         # Apply inverse detrending for each batch
+         if self.detrending and self.detrending_params_list is not None:
+             for b in range(self.batch_size):
+                 batch_output = output[:, b, :]
+                 batch_context = self.original_context[:, b, :]
+                 batch_output = Detrending.apply_detrending_inverse(batch_context, batch_output, self.detrending_params_list[b])
+                 output[:, b, :] = batch_output
+
+         # Apply inverse Box-Cox transformation for each batch
+         if self.box_cox and self.box_cox_params_list is not None:
+             for b in range(self.batch_size):
+                 batch_output = output[:, b, :]
+                 batch_output = BoxCoxTransformer.inverse_transform(batch_output, self.box_cox_params_list[b])
+                 output[:, b, :] = batch_output
+
+         return output
+
+     def _standardize_data(self, context):
+         """
+         Standardize each batch in the context data.
+
+         Args:
+             context: Context data tensor of shape (seq_length, batch_size, N_data)
+
+         Returns:
+             Standardized context
+         """
+         if not self.standardize:
+             return context
+
+         # Calculate mean and std across time dimension for each batch separately
+         self.context_mean = torch.mean(context, dim=0)  # (batch_size, N_data)
+         self.context_std = torch.std(context, dim=0)    # (batch_size, N_data)
+         self.context_std = torch.clamp(self.context_std, min=1e-6)  # Avoid division by zero
+
+         # Standardize using broadcasting
+         context = (context - self.context_mean.unsqueeze(0)) / self.context_std.unsqueeze(0)
+
+         return context
+
+     def _unstandardize_data(self, output):
+         """
+         Undo standardization by applying the inverse transformation.
+
+         Args:
+             output: Model output of shape (T, batch_size, N)
+
+         Returns:
+             Output with standardization reversed
+         """
+         if self.standardize and self.context_mean is not None and self.context_std is not None:
+             return output * self.context_std.unsqueeze(0) + self.context_mean.unsqueeze(0)
+         return output
+
+     def _apply_embedding(self, context, model_dim):
+         """
+         Apply data preprocessing to each batch to reach model dimension.
+
+         Args:
+             context: Context data tensor of shape (seq_length, batch_size, N_data)
+             model_dim: Target model dimension
+
+         Returns:
+             Preprocessed context data tensor
+         """
+         context_embedded_batch = []
+
+         for b in range(self.batch_size):
+             batch_context = context[:, b, :]
+             batch_embedded = Embedding.apply_embedding(batch_context, model_dim, self.preprocessing_method)
+             context_embedded_batch.append(batch_embedded)
+
+         # Align sequence lengths across batches
+         seq_lengths = [emb.shape[0] for emb in context_embedded_batch]
+         min_seq_len = min(seq_lengths)
+         context_embedded_batch = [emb[-min_seq_len:] for emb in context_embedded_batch]
+
+         # Stack along batch dimension
+         return torch.stack(context_embedded_batch, dim=1)
+
+     def _prepare_initial_condition(self, context_embedded, initial_x, model_dim):
+         """
+         Prepare initial condition for forecasting.
+
+         Args:
+             context_embedded: Preprocessed context data
+             initial_x: Optional initial condition
+             model_dim: Model dimension
+
+         Returns:
+             Initial condition for forecasting
+
+         Raises:
+             ValueError: If initial condition is provided with Box-Cox or detrending enabled
+         """
+         if initial_x is None:
+             # Use last context value for each batch
+             return context_embedded[-1]
+
+         # Raise error if initial condition is provided with Box-Cox or detrending enabled
+         if (self.box_cox or self.detrending):
+             raise ValueError(
+                 "Using initial conditions with Box-Cox or detrending is not supported. "
+                 "Either disable Box-Cox and detrending or do not provide an initial condition."
+             )
+
+         # Process initial conditions for each batch
+         initial_x_processed = torch.zeros(self.batch_size, model_dim, device=context_embedded.device)
+         for b in range(self.batch_size):
+             batch_initial = initial_x[b]
+
+             # Apply standardization if enabled
+             if self.standardize and self.context_mean is not None and self.context_std is not None:
+                 batch_initial = (batch_initial - self.context_mean[b]) / (self.context_std[b] + 1e-8)
+
+             # If dimensions are smaller than model_dim, estimate full initial condition
+             if initial_x.shape[1] < model_dim:
+                 # Find matching state in context_embedded
+                 batch_initial = estimate_initial_condition(
+                     batch_initial,
+                     context_embedded[:, b, :],
+                 )
+
+             initial_x_processed[b] = batch_initial
+
+         return initial_x_processed
+
+     def preprocess(self, context, model_dim, initial_x=None):
+         """
+         Apply the complete preprocessing pipeline to the input data.
+
+         Args:
+             context: Context data tensor of shape (seq_length, batch_size, N_data) or (seq_length, N_data)
+             model_dim: Target model dimension
+             initial_x: Optional initial condition of shape (batch_size, N_data) or (N_data,)
+
+         Returns:
+             Preprocessed context data and initial condition
+         """
+         # Store dimensions
+         self.batch_size = context.shape[1]
+         self.feature_dim = context.shape[2]
+
+         # Apply transformations (Box-Cox, detrending)
+         context = self._apply_transformations(context)
+
+         # Standardize data if requested
+         context = self._standardize_data(context)
+
+         # Apply embedding to reach model dimension
+         context_embedded = self._apply_embedding(context, model_dim)
+
+         # Prepare initial condition
+         initial_condition = self._prepare_initial_condition(context_embedded, initial_x, model_dim)
+
+         return context_embedded, initial_condition
+
+     def postprocess(self, output):
+         """
+         Apply inverse transformations to restore original data scaling.
+
+         Args:
+             output: Model output of shape (T, batch_size, N)
+
+         Returns:
+             Output with inverse transformations applied
+         """
+         # Undo standardization
+         output = self._unstandardize_data(output)
+
+         # Apply inverse transformations (Box-Cox, detrending)
+         output = self._apply_transformations_inverse(output)
+
+         return output
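A minimal sketch of the preprocessing step on a univariate context (shapes illustrative):

```python
import torch
from dynamix.preprocessing import DataPreprocessor

pre = DataPreprocessor(standardize=True, preprocessing_method="zero_embedding")
context = torch.randn(256, 1, 1)  # (seq_length, batch_size, N_data)
embedded, x0 = pre.preprocess(context, model_dim=3)  # pad 1D series up to model dim 3
# embedded: (256, 1, 3); x0: (1, 3) -- last embedded state as initial condition
```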
dynamix/preprocessing_utilities.py ADDED
@@ -0,0 +1,536 @@
+ import torch
+ import numpy as np
+ from scipy import stats
+ from scipy.signal import find_peaks
+ import random
+ from statsmodels.tsa.stattools import acf
+ from scipy.ndimage import gaussian_filter1d
+ from scipy import optimize
+
+
+ class TimeSeriesProcessor:
+     """
+     Utility class for converting between numpy and torch.
+     """
+     @staticmethod
+     def to_numpy(data):
+         """Convert torch tensor to numpy array while preserving device and dtype info"""
+         is_torch = isinstance(data, torch.Tensor)
+         if is_torch:
+             device = data.device
+             dtype = data.dtype
+             return data.detach().cpu().numpy(), is_torch, device, dtype
+         return data, False, None, None
+
+     @staticmethod
+     def to_torch(data_np, is_torch, device=None, dtype=None):
+         """Convert numpy array back to torch tensor if original was a tensor"""
+         if is_torch:
+             return torch.tensor(data_np, device=device, dtype=dtype)
+         return data_np
+
+
+ class Embedding:
+     """
+     Class for embedding methods to transform time series to target dimension.
+     """
+     @staticmethod
+     def estimate_TDM_tau(data, acorr_threshold=1/np.e):
+         """
+         Estimate tau using autocorrelation function with threshold method
+
+         Args:
+             data: Input data tensor of shape (seq_length, N)
+             acorr_threshold: Autocorrelation threshold
+
+         Returns:
+             Maximum estimated tau across all dimensions
+         """
+         # Convert to numpy
+         data_np, _, _, _ = TimeSeriesProcessor.to_numpy(data)
+
+         seq_length, n_dims = data_np.shape
+         tau_vals = np.zeros(n_dims, dtype=int)
+
+         for dim in range(n_dims):
+             # Calculate autocorrelation
+             autocorr_vals = acf(data_np[:, dim] - np.mean(data_np[:, dim]), nlags=seq_length//2)
+
+             # Find first value below threshold (after lag 0)
+             below_threshold = np.where(autocorr_vals[1:] < acorr_threshold)[0]
+             if len(below_threshold) > 0:
+                 tau_vals[dim] = below_threshold[0] + 1  # +1 because skipping lag 0
+             else:
+                 tau_vals[dim] = 1  # Default if no value below threshold
+
+         return int(np.max(tau_vals))
+
+     @staticmethod
+     def estimate_pos_tau(data, max_lag=None, min_lag=None):
+         """
+         Estimate autocorrelation time for positional embedding
+
+         Args:
+             data: Input data tensor of shape (seq_length, N)
+             max_lag: Maximum lag to consider
+             min_lag: Minimum lag to consider
+
+         Returns:
+             Maximum autocorrelation time across dimensions
+         """
+         data_np, _, _, _ = TimeSeriesProcessor.to_numpy(data)
+         seq_length, n = data_np.shape
+
+         if max_lag is None:
+             max_lag = seq_length - 1
+         if min_lag is None:
+             min_lag = seq_length // 10
+
+         tau_vals = np.zeros(n, dtype=int)
+
+         for dim in range(n):
+             ts = data_np[:, dim] if not isinstance(data, torch.Tensor) else data[:, dim].cpu().numpy()
+             autocorr_vals = acf(ts - np.mean(ts), nlags=max_lag)
+
+             # Determine max autocorrelation with tau > tau_min
+             peaks, _ = find_peaks(autocorr_vals)
+             valid_peaks = [i for i in peaks if i > min_lag and i < len(autocorr_vals)]
+             if valid_peaks:
+                 peak_values = autocorr_vals[valid_peaks]
+                 max_peak_idx = np.argmax(peak_values)
+                 tau_vals[dim] = valid_peaks[max_peak_idx]
+             else:
+                 start_idx = min_lag + 1
+                 segment = autocorr_vals[start_idx:]
+                 tau_vals[dim] = start_idx + int(np.argmax(segment))
+
+         return np.max(tau_vals)
+
+     @staticmethod
+     def delay_embedding(data, model_dim, tau=None):
+         """
+         Standard delay embedding with optimal tau
+
+         Args:
+             data: Input data tensor of shape (seq_length, N)
+             model_dim: Target dimension
+             tau: Time delay (if None, estimated from autocorrelation)
+
+         Returns:
+             Delay embedded data of shape (shortened_length, model_dim)
+         """
+         seq_length, N_data = data.shape
+         needed_dims = model_dim - N_data
+
+         if needed_dims <= 0:
+             return data
+
+         processed_data = data.clone()
+
+         # Estimate tau if not provided
+         if tau is None:
+             tau = Embedding.estimate_TDM_tau(processed_data)
+
+         # Select the last column for embedding
+         ts = processed_data[:, -1].clone()
+
+         # Calculate starting index
+         start_idx = needed_dims * tau
+
+         # Handle case where start_idx is too large
+         if start_idx >= seq_length:
+             tau = max(1, seq_length // (needed_dims + 1))
+             start_idx = needed_dims * tau
+
+         # Create shortened data
+         shortened_data = processed_data[start_idx:].clone()
+         result = shortened_data
+
+         # Add delayed versions
+         for i in range(1, needed_dims + 1):
+             delayed = ts[start_idx - i * tau:seq_length - i * tau].unsqueeze(1)
+             result = torch.cat([result, delayed], dim=1)
+
+         return result
+
+     @staticmethod
+     def delay_embedding_random(data, model_dim, upper_tau=10, lower_tau=3):
+         """
+         Random delay embedding with random tau values
+
+         Args:
+             data: Input data tensor of shape (seq_length, N)
+             model_dim: Target dimension
+             upper_tau: Upper bound for random tau values
+             lower_tau: Lower bound for random tau values
+
+         Returns:
+             Random delay embedded data
+         """
+         seq_length, N_data = data.shape
+         needed_dims = model_dim - N_data
+
+         if needed_dims <= 0:
+             return data
+
+         processed_data = data.clone()
+
+         # Generate random tau values
+         taus = [random.randint(lower_tau, upper_tau) for _ in range(needed_dims)]
+         max_tau = max(taus)
+
+         # Select the first column for embedding
+         ts = processed_data[:, 0].clone()
+
+         # Create shortened data
+         result = processed_data[max_tau:].clone()
+
+         # Add delayed versions
+         for i in range(needed_dims):
+             delayed = ts[max_tau - taus[i]:seq_length - taus[i]].unsqueeze(1)
+             result = torch.cat([result, delayed], dim=1)
+
+         return result
+
+     @staticmethod
+     def zero_embedding(data, model_dim):
+         """
+         Zero embedding: appends zeros to reach model dimensions
+
+         Args:
+             data: Input data tensor of shape (seq_length, N)
+             model_dim: Target dimension
+
+         Returns:
+             Tensor with zeros appended to reach model_dim
+         """
+         seq_length, N_data = data.shape
+         needed_dims = model_dim - N_data
+
+         if needed_dims > 0:
+             zeros = torch.zeros(seq_length, needed_dims, device=data.device, dtype=data.dtype)
+             data = torch.cat([data, zeros], dim=1)
+
+         return data
+
+     @staticmethod
+     def positional_embedding(data, model_dim, tau=None):
+         """
+         Positional embedding: adds sinusoidal signals based on autocorrelation time
+
+         Args:
+             data: Input data tensor of shape (seq_length, N)
+             model_dim: Target dimension
+             tau: Optional fixed value for tau. If None, estimated from data.
+
+         Returns:
+             Data with positional embeddings added
+         """
+         seq_length, N_data = data.shape
+         needed_dims = model_dim - N_data
+
+         if needed_dims <= 0:
+             return data
+
+         if needed_dims != 1:
+             shifts = torch.linspace(0, np.pi/2, needed_dims, device=data.device)
+         else:
+             shifts = torch.tensor([0.0], device=data.device)
+
+         tau_val = tau if tau is not None else Embedding.estimate_pos_tau(data)
+         t = torch.arange(1, seq_length + 1, dtype=data.dtype, device=data.device)
+
+         result = data.clone()
+         for shift in shifts:
+             pos_feature = torch.sin(2 * np.pi / tau_val * t + shift).unsqueeze(1)
+             result = torch.cat([result, pos_feature], dim=1)
+
+         return result
+
+     @staticmethod
+     def apply_embedding(data, model_dim, method="pos_embedding", **kwargs):
+         """
+         Apply selected embedding method to the data
+
+         Args:
+             data: Input data tensor of shape (seq_length, N)
+             model_dim: Target dimension
+             method: Embedding method ('pos_embedding', 'zero_embedding',
+                 'delay_embedding', or 'delay_embedding_random')
+             **kwargs: Additional parameters to pass to the specific embedding method
+
+         Returns:
+             Embedded data
+         """
+         if method == "pos_embedding":
+             return Embedding.positional_embedding(data, model_dim, **kwargs)
+         elif method == "zero_embedding":
+             return Embedding.zero_embedding(data, model_dim)
+         elif method == "delay_embedding":
+             return Embedding.delay_embedding(data, model_dim, **kwargs)
+         elif method == "delay_embedding_random":
+             return Embedding.delay_embedding_random(data, model_dim, **kwargs)
+         else:
+             raise ValueError(f"Unsupported embedding method: {method}")
+
+
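+ # Illustrative usage sketch (assumed, matching the signatures above):
+ #   Embedding.apply_embedding(torch.randn(100, 1), model_dim=3, method="zero_embedding")
+ # returns a (100, 3) tensor; "pos_embedding" appends sinusoidal features instead,
+ # and the delay variants shorten the sequence by the chosen delays.
+
+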
277
+ class BoxCoxTransformer:
278
+ """
279
+ Applies Box-Cox transformation to data for variance stabilization.
280
+ """
281
+ def __init__(self, lambda_range=(-2, 2)):
282
+ """
283
+ Initialize BoxCoxTransformer.
284
+
285
+ Args:
286
+ lambda_range: Range for lambda parameter search
287
+ """
288
+ self.lambda_range = lambda_range
289
+ self.params = None
290
+
291
+ @staticmethod
292
+ def transform(data, lambda_range=(-2, 2)):
293
+ """
294
+ Apply Box-Cox transformation to data for stabilization
295
+
296
+ Args:
297
+ data: Input data tensor of shape (seq_length, N)
298
+ lambda_range: Range for lambda parameter search
299
+
300
+ Returns:
301
+ Transformed data and parameters for inverse transformation
302
+ """
303
+ # Convert to numpy
304
+ data_np, is_torch, device, dtype = TimeSeriesProcessor.to_numpy(data)
305
+
306
+ seq_length, n_dims = data_np.shape
307
+ transformed_data = np.zeros_like(data_np)
308
+ box_cox_params = []
309
+
310
+ for dim in range(n_dims):
311
+ # Add constant to ensure positivity
312
+ if np.min(data_np[:, dim]) <= 0:
313
+ offset = abs(np.min(data_np[:, dim])) + 1.2
314
+ data_shifted = data_np[:, dim] + offset
315
+ else:
316
+ offset = 1.2
317
+ data_shifted = data_np[:, dim] + offset
318
+
319
+ try:
320
+ # Find optimal lambda for Box-Cox transformation
321
+ transformed, lambda_param = stats.boxcox(data_shifted)
322
+
323
+ # Limit lambda to a reasonable range to prevent numerical issues
324
+ lambda_param = max(min(lambda_param, 2.0), -2.0)
325
+
326
+ # Recalculate transformation with bounded lambda for consistency
327
+ if abs(lambda_param) < 1e-8:
328
+ # For lambda near zero, use logarithmic transformation
329
+ transformed = np.log(data_shifted)
330
+ else:
331
+ transformed = (data_shifted ** lambda_param - 1) / lambda_param
332
+
333
+ # Store transformed data and parameters
334
+ transformed_data[:, dim] = transformed
335
+ except:
336
+ # If transformation fails, just use the original data
337
+ transformed_data[:, dim] = data_np[:, dim]
338
+ lambda_param = 1.0 # Identity transform
339
+
340
+ box_cox_params.append((lambda_param, offset))
341
+
342
+ # Convert back to torch if needed
343
+ return TimeSeriesProcessor.to_torch(transformed_data, is_torch, device, dtype), box_cox_params
344
+
345
+ @staticmethod
346
+ def inverse_transform(data, box_cox_params):
347
+ """
348
+ Apply inverse Box-Cox transformation
349
+
350
+ Args:
351
+ data: Transformed data tensor
352
+ box_cox_params: Parameters from Box-Cox transformation
353
+
354
+ Returns:
355
+ Original scale data
356
+ """
357
+ # Convert to numpy for computation
358
+ data_np, is_torch, device, dtype = TimeSeriesProcessor.to_numpy(data)
359
+
360
+ seq_length, n_dims = data_np.shape
361
+ inverse_data = np.zeros_like(data_np)
362
+
363
+ for dim in range(min(n_dims, len(box_cox_params))):
364
+ lambda_param, offset = box_cox_params[dim]
365
+
366
+ # Apply inverse transformation
367
+ if abs(lambda_param) < 1e-8:
368
+ # For lambda near zero, the transformation is logarithmic
369
+ inverse_data[:, dim] = np.exp(data_np[:, dim]) - offset
370
+ elif abs(lambda_param - 1.0) < 1e-8:
371
+ # For lambda=1 (identity transform), just subtract offset
372
+ inverse_data[:, dim] = data_np[:, dim] - offset
373
+ else:
374
+ # For other lambda values
375
+ base = lambda_param * data_np[:, dim] + 1
376
+
377
+ # Simple clipping approach to ensure base is positive
378
+ # This avoids complex numbers while preserving most data characteristics
379
+ base = np.maximum(base, 1e-10)
380
+
381
+ # Apply power transformation
382
+ result = base ** (1/lambda_param)
383
+ inverse_data[:, dim] = result - offset
384
+
385
+ # Convert back to torch if needed
386
+ return TimeSeriesProcessor.to_torch(inverse_data, is_torch, device, dtype)
387
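As a sanity check on the method pair above, a minimal round-trip sketch (import path again hypothetical; the round-trip is exact wherever the inverse's positivity clipping does not kick in):

```python
import torch
from dynamix.preprocessing import BoxCoxTransformer  # hypothetical module path

# A positive series whose variance grows over time
t = torch.arange(1.0, 201.0).unsqueeze(1)
data = t * (1.0 + 0.1 * torch.randn(200, 1))

stabilized, params = BoxCoxTransformer.transform(data)
restored = BoxCoxTransformer.inverse_transform(stabilized, params)

print(torch.max(torch.abs(restored - data)))  # should be close to zero
```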
+
+
+ class Detrending:
+     """
+     Removes a trend from time series data by fitting the model
+     a * t**b + c (a power law in t, called the "exponential model" below).
+     """
+     @staticmethod
+     def exp_model(t, params):
+         """
+         Trend model for detrending: a * t**b + c
+
+         Args:
+             t: Time points
+             params: Model parameters [a, b, c]
+
+         Returns:
+             Model values
+         """
+         a, b, c = params
+         return a * (t ** b) + c
+
+     @staticmethod
+     def fit_objective(params, data):
+         """
+         Objective function for fitting the trend model
+
+         Args:
+             params: Model parameters
+             data: Data to fit
+
+         Returns:
+             Sum of squared errors
+         """
+         t = np.arange(1, len(data) + 1)
+         predicted = Detrending.exp_model(t, params)
+         return np.sum((data - predicted) ** 2)
+
+     @staticmethod
+     def apply_detrending(data):
+         """
+         Apply detrending to the data
+
+         Args:
+             data: Input data tensor of shape (seq_length, N)
+
+         Returns:
+             Detrended data and the parameters needed to restore the trend
+         """
+         # Convert to numpy
+         data_np, is_torch, device, dtype = TimeSeriesProcessor.to_numpy(data)
+
+         seq_length, n_dims = data_np.shape
+         detrended_data = np.zeros_like(data_np)
+         detrending_params = []
+
+         for dim in range(n_dims):
+             # Objective function for this dimension
+             objective = lambda params: Detrending.fit_objective(params, data_np[:, dim])
+
+             # Initial parameter guess: no trend, starting at the first observation
+             initial_params = [0.0, 1.0, data_np[0, dim]]
+
+             # Bounds for the parameters (exponent restricted to a sensible range)
+             bounds = [(None, None), (0.1, 3.0), (None, None)]
+
+             # Optimize
+             result = optimize.minimize(
+                 objective,
+                 initial_params,
+                 method='L-BFGS-B',
+                 bounds=bounds,
+                 options={
+                     'maxiter': 1000,
+                     'gtol': 1e-6,
+                     'maxfun': 1500,
+                     'maxcor': 10
+                 }
+             )
+             optimal_params = np.round(result.x, 3)
+
+             # Compute the trend and subtract it from the data
+             t = np.arange(1, seq_length + 1)
+             trend = Detrending.exp_model(t, optimal_params)
+             detrended_data[:, dim] = data_np[:, dim] - trend
+
+             # Store the parameters for the inverse transformation
+             detrending_params.append(optimal_params)
+
+         # Convert back to torch if needed
+         return TimeSeriesProcessor.to_torch(detrended_data, is_torch, device, dtype), detrending_params
+
+     @staticmethod
+     def apply_detrending_inverse(context, data, detrending_params):
+         """
+         Restore the fitted trend on forecasted data
+
+         Args:
+             context: Original context data
+             data: Forecasted data
+             detrending_params: Parameters returned by apply_detrending()
+
+         Returns:
+             Forecasted data with the trend restored
+         """
+         # Convert to numpy for computation
+         data_np, is_torch, device, dtype = TimeSeriesProcessor.to_numpy(data)
+         context_np, _, _, _ = TimeSeriesProcessor.to_numpy(context)
+
+         # Get dimensions
+         forecast_length, n_dims = data_np.shape
+         context_length = len(context_np)
+
+         # Time points of the forecast horizon continue the context's time axis
+         t = np.arange(context_length + 1, context_length + forecast_length + 1)
+
+         # Add the trend back to each dimension
+         for dim in range(min(n_dims, len(detrending_params))):
+             params = detrending_params[dim]
+             trend = Detrending.exp_model(t, params)
+             data_np[:, dim] = data_np[:, dim] + trend
+
+         # Convert back to torch if needed
+         return TimeSeriesProcessor.to_torch(data_np, is_torch, device, dtype)
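A minimal sketch of the intended workflow: detrend the context, forecast in the detrended space, then restore the trend on the forecast. The import path is hypothetical, and the model call is replaced by a placeholder:

```python
import torch
from dynamix.preprocessing import Detrending  # hypothetical module path

# Context with a growing trend plus an oscillation
t = torch.arange(1.0, 301.0)
context = (0.05 * t ** 1.5 + torch.sin(0.2 * t)).unsqueeze(1)

detrended, params = Detrending.apply_detrending(context)

# ... run the model on `detrended` to obtain `forecast`, shape (horizon, 1) ...
forecast = torch.zeros(100, 1)  # placeholder for an actual model forecast

# Map the forecast back to the original scale; its time indices continue the context
restored = Detrending.apply_detrending_inverse(context, forecast, params)
```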
+
+
+ def estimate_initial_condition(initial_x, context_embedded):
+     """
+     Estimate the full initial condition from a partial observation
+
+     Args:
+         initial_x: Partial initial condition of shape (N_partial,)
+         context_embedded: Context data of shape (seq_length, N)
+
+     Returns:
+         Complete initial condition of shape (N,)
+     """
+     T, N = context_embedded.shape
+     N_partial = initial_x.shape[0]
+
+     assert N_partial <= N, "Initial condition dimension must be <= embedding dimension"
+
+     # Find the timestep whose first N_partial dimensions are closest (in squared
+     # Euclidean distance) to the partial initial condition
+     distances = torch.sum((context_embedded[:, :N_partial] - initial_x) ** 2, dim=1)
+     closest_t = torch.argmin(distances)
+
+     # Fill the unobserved dimensions from the closest matching context state
+     return torch.cat([initial_x, context_embedded[closest_t, N_partial:]])
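For instance, completing a partially observed initial state (shapes illustrative):

```python
import torch

# Embedded context with N = 3 dimensions
context_embedded = torch.randn(500, 3)

# Only the first two dimensions of the initial state are observed
initial_x = torch.tensor([0.1, -0.4])

# The third dimension is borrowed from the most similar context state
x0 = estimate_initial_condition(initial_x, context_embedded)
print(x0.shape)  # torch.Size([3])
```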
dynamix/utilities.py ADDED
@@ -0,0 +1,174 @@
+ import json
+ from huggingface_hub import hf_hub_download
+ from safetensors.torch import load_file
+ from dynamix.dynamix import DynaMix
+ import plotly.graph_objects as go
+ import plotly.subplots as sp
+ import numpy as np
+
+ """
+ Loading models from HuggingFace Hub
+ """
+
+ def load_hf_model_config(model_name):
+     """Load a model configuration from the HuggingFace Hub"""
+
+     config_path = hf_hub_download(
+         repo_id="DurstewitzLab/dynamix",
+         filename="config_" + model_name.replace("dynamix-", "") + ".json"
+     )
+
+     with open(config_path, 'r') as f:
+         model_config = json.load(f)
+
+     return model_config
+
+ def load_hf_model(model_name):
+     """Load a specific DynaMix model together with its configuration"""
+     try:
+         # Load the model configuration
+         model_config = load_hf_model_config(model_name)
+         architecture = model_config["architecture"]
+
+         # Extract hyperparameters from the config
+         M = architecture["M"]  # Latent state dimension
+         N = architecture["N"]  # Observation space dimension
+         EXPERTS = architecture["Experts"]  # Number of experts
+         P = architecture["P"]  # Number of ReLU dimensions
+         HIDDEN_DIM = architecture["hidden_dim"]
+         expert_type = architecture["expert_type"]
+         probabilistic_expert = architecture["probabilistic_expert"]
+
+         # Create the model with the config parameters
+         model = DynaMix(
+             M=M,
+             N=N,
+             Experts=EXPERTS,
+             expert_type=expert_type,
+             P=P,
+             hidden_dim=HIDDEN_DIM,
+             probabilistic_expert=probabilistic_expert,
+         )
+
+         # Load the model weights
+         model_path = hf_hub_download(
+             repo_id="DurstewitzLab/dynamix",
+             filename=model_name + ".safetensors",
+         )
+         model_state_dict = load_file(model_path)
+         model.load_state_dict(model_state_dict)
+         model.eval()
+
+     except Exception as e:
+         print(f"Error loading model {model_name}: {e}")
+         raise ValueError(f"Model {model_name} could not be loaded") from e
+
+     return model
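Putting the loaders together, a short usage sketch (the model names are those returned by `auto_model_selection` just below; both calls download from the Hub on first use, and the forecasting call itself lives outside this file, so it is omitted):

```python
import numpy as np
from dynamix.utilities import load_hf_model, load_hf_model_config, auto_model_selection

context = np.random.randn(256, 3)           # e.g., a 3-dimensional context
model_name = auto_model_selection(context)  # -> "dynamix-3d-alrnn-v1.0"

config = load_hf_model_config(model_name)   # architecture hyperparameters
model = load_hf_model(model_name)           # weights loaded, eval mode
```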
+
+
+ # Model selection function
+ def auto_model_selection(context):
+     """
+     Select the pretrained model to use for forecasting, based on the
+     dimensionality of the context
+     """
+     if context.shape[1] == 1:
+         return "dynamix-6d-alrnn-v1.0"
+     elif 2 <= context.shape[1] <= 3:
+         return "dynamix-3d-alrnn-v1.0"
+     else:
+         # Contexts with 4 or more dimensions use the 6d model
+         return "dynamix-6d-alrnn-v1.0"
+
+
+
+ """
84
+ Plotting functions
85
+ """
86
+
87
+ def create_forecast_plot(values, reconstruction_ts_np, horizon):
88
+ """
89
+ Create a Plotly figure with dark theme styling matching the reference image
90
+ """
91
+ dims = reconstruction_ts_np.shape[-1]
92
+ plot_dims = min(dims, 15) # plot up to 15 dimensions
93
+
94
+ context_time = np.arange(-len(values), 0)
95
+ forecast_time = np.arange(0, int(horizon))
96
+
97
+ # Create subplots
98
+ # Adjust spacing based on number of dimensions
99
+ if plot_dims <= 3:
100
+ vertical_spacing = 0.1
101
+ elif plot_dims <= 6:
102
+ vertical_spacing = 0.05
103
+ elif plot_dims <= 15:
104
+ vertical_spacing = 0.02
105
+
106
+ fig = sp.make_subplots(
107
+ rows=plot_dims,
108
+ cols=1,
109
+ vertical_spacing=vertical_spacing
110
+ )
111
+
112
+ # Add traces for each dimension
113
+ for d in range(plot_dims):
114
+ # Historical data
115
+ historical_trace = go.Scatter(
116
+ x=context_time,
117
+ y=values[:, d],
118
+ mode='lines',
119
+ line=dict(color='#4169E1', width=2.5),
120
+ name=f"context_{d+1}",
121
+ showlegend=False,
122
+ hovertemplate=f"context_{d+1}<br>x: %{{x}}<br>y: %{{y}}<extra></extra>"
123
+ )
124
+
125
+ # Forecast
126
+ forecast_trace = go.Scatter(
127
+ x=forecast_time,
128
+ y=reconstruction_ts_np[:, d],
129
+ mode='lines',
130
+ line=dict(color='#FF4242', width=2.5),
131
+ name=f"forecast_{d+1}",
132
+ showlegend=False,
133
+ hovertemplate=f"forecast_{d+1}<br>x: %{{x}}<br>y: %{{y}}<extra></extra>"
134
+ )
135
+
136
+ fig.add_trace(historical_trace, row=d+1, col=1)
137
+ fig.add_trace(forecast_trace, row=d+1, col=1)
138
+
139
+ fig.update_layout(
140
+ plot_bgcolor='#1f2937',
141
+ paper_bgcolor='#1f2937',
142
+ font=dict(color='white'),
143
+ showlegend=False,
144
+ title=None,
145
+ margin=dict(l=50, r=50, t=30, b=50),
146
+ xaxis=dict(
147
+ gridcolor='rgba(255, 255, 255, 0.2)',
148
+ zerolinecolor='rgba(255, 255, 255, 0.2)',
149
+ showgrid=True
150
+ ),
151
+ yaxis=dict(
152
+ gridcolor='rgba(255, 255, 255, 0.2)',
153
+ zerolinecolor='rgba(255, 255, 255, 0.2)',
154
+ showgrid=True,
155
+ ),
156
+ height=300 if plot_dims == 1 else 250 * plot_dims,
157
+ width=None
158
+ )
159
+
160
+ for i in range(plot_dims):
161
+ fig.update_xaxes(
162
+ gridcolor='rgba(255, 255, 255, 0.2)',
163
+ zerolinecolor='rgba(255, 255, 255, 0.2)',
164
+ showgrid=True,
165
+ row=i+1, col=1
166
+ )
167
+ fig.update_yaxes(
168
+ gridcolor='rgba(255, 255, 255, 0.2)',
169
+ zerolinecolor='rgba(255, 255, 255, 0.2)',
170
+ showgrid=True,
171
+ row=i+1, col=1
172
+ )
173
+
174
+ return fig
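To exercise the plot function standalone, a quick sketch with synthetic arrays shaped the way the function indexes them (note that `horizon` should match the forecast's length so the traces align):

```python
import numpy as np
from dynamix.utilities import create_forecast_plot

values = np.random.randn(200, 3)    # context, shape (time_steps, dims)
forecast = np.random.randn(512, 3)  # forecast, shape (horizon, dims)

fig = create_forecast_plot(values, forecast, horizon=512)
fig.show()  # or return the figure from a Gradio callback
```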
requirements.txt ADDED
@@ -0,0 +1,10 @@
+ torch>=1.10.0
+ numpy>=1.20.0
+ matplotlib>=3.4.0
+ plotly>=6.3.0
+ scipy>=1.7.0
+ pandas>=1.3.0
+ safetensors>=0.4.0
+ huggingface_hub>=0.19.0
+ statsmodels>=0.14.4
+ gradio>=5.43.1