What does `max_window_layers` do?

#29
by jonaskuebler - opened

In the config there is the parameter `max_window_layers`: https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct/blob/main/config.json#L14

For this model it is 28, whereas the number of layers is 48. Read literally, that suggests the last 20 layers would not use full attention, but that does not seem to be the case for this model. So I am wondering: does this parameter do anything else, or is it an artifact that could be removed?
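To make the question concrete, here is a minimal sketch of the reading I have in mind. It is an assumption about how such a parameter might be interpreted, not the confirmed behavior of the model or of any library: the helper `layer_attention_types` and the `use_sliding_window` gate are hypothetical names used for illustration only.

```python
def layer_attention_types(num_layers, max_window_layers, use_sliding_window):
    """Hypothetical helper: label each layer under the ASSUMED rule that
    layers with index >= max_window_layers use sliding-window attention,
    but only when sliding-window attention is enabled at all."""
    return [
        "sliding" if use_sliding_window and i >= max_window_layers else "full"
        for i in range(num_layers)
    ]


# Values from the question: 48 layers, max_window_layers = 28.
types = layer_attention_types(48, 28, use_sliding_window=True)
print(types.count("full"), types.count("sliding"))  # 28 20

# If the (assumed) gate is off, the threshold has no effect.
disabled = layer_attention_types(48, 28, use_sliding_window=False)
print(disabled.count("sliding"))  # 0
```

Under this reading the value 28 would only matter when sliding-window attention is switched on; with the gate off, all 48 layers use full attention, which would match what I observe.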
