Shiyunee and nielsr (HF Staff) committed
Commit 6910c17 · verified · 1 parent: 957dc9c

Add pipeline tag, library name, and prominent GitHub link (#1)


- Add pipeline tag, library name, and prominent GitHub link (1a58f8cb96f41fa78fd2ced477dc9f7f069be983)


Co-authored-by: Niels Rogge <[email protected]>

Files changed (1):
README.md (+15 -8)
README.md CHANGED
@@ -1,19 +1,26 @@
 ---
-license: apache-2.0
-language:
-- en
 base_model:
 - Qwen/Qwen2.5-7B-Instruct
+language:
+- en
+license: apache-2.0
+pipeline_tag: text-generation
+library_name: transformers
 ---
-# Introduction
 
-This is the official repo of the paper [Annotation-Efficient Universal Honesty Alignment](https://arxiv.org/abs/2510.17509)
+# Annotation-Efficient Universal Honesty Alignment
+
+This is the official repository for the paper [Annotation-Efficient Universal Honesty Alignment](https://arxiv.org/abs/2510.17509).
+
+Code: [https://github.com/Trustworthy-Information-Access/Annotation-Efficient-Universal-Honesty-Alignment](https://github.com/Trustworthy-Information-Access/Annotation-Efficient-Universal-Honesty-Alignment)
+
+## Introduction
 
 This repository provides modules that extend **Qwen2.5-7B-Instruct** with the ability to generate accurate confidence scores *before* response generation, indicating how likely the model is to answer a given question correctly across tasks. We offer two types of modules—**LoRA + Linear Head** and **Linear Head**—along with model parameters under three training settings:
 
-1. **Elicitation (greedy):** Trained on all questions (over 560k) using self-consistency-based confidence annotations.
-2. **Calibration-Only (right):** Trained on questions with explicit correctness annotations.
-3. **EliCal (hybrid):** Initialized from the Elicitation model and further trained on correctness-labeled data.
+1. **Elicitation (greedy):** Trained on all questions (over 560k) using self-consistency-based confidence annotations.
+2. **Calibration-Only (right):** Trained on questions with explicit correctness annotations.
+3. **EliCal (hybrid):** Initialized from the Elicitation model and further trained on correctness-labeled data.
 
 For both **Calibration-Only** and **EliCal** settings, we provide models trained with different amounts of annotated data (1k, 2k, 3k, 5k, 8k, 10k, 20k, 30k, 50k, 80k, 200k, 560k+). Since **LoRA + Linear Head** is the main configuration used in our paper, the following description is based on this setup.
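The updated README describes the confidence modules but does not yet include a loading snippet. Below is a minimal, hypothetical sketch of how the **LoRA + Linear Head** configuration could be assembled with `transformers` and `peft`. The adapter path, the head initialization, and pooling on the final token's hidden state are illustrative assumptions, not the repository's actual API; see the linked GitHub repo for the real loading code.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Base model and tokenizer.
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct", torch_dtype=torch.bfloat16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

# Attach the confidence LoRA adapter (hypothetical path; the released
# checkpoints define the actual layout).
model = PeftModel.from_pretrained(base, "path/to/elical-lora-adapter")
model.eval()

# Linear head mapping the last hidden state to a scalar confidence.
# Randomly initialized here for illustration; in practice, load the
# trained head weights shipped with the checkpoint.
head = torch.nn.Linear(base.config.hidden_size, 1).to(
    device=base.device, dtype=torch.bfloat16
)

# Score a question *before* generating any answer.
question = "What is the capital of Australia?"
inputs = tokenizer(question, return_tensors="pt").to(base.device)
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)
    last_hidden = out.hidden_states[-1][:, -1, :]         # final token's state
    confidence = torch.sigmoid(head(last_hidden)).item()  # in [0, 1]
print(f"Predicted confidence: {confidence:.3f}")
```

Scoring before generation is the point of the design: a single forward pass over the question yields the confidence, so a caller could, for example, route low-confidence questions to abstention or retrieval without paying for decoding.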