Update README.md
README.md CHANGED
@@ -101,7 +101,7 @@ Therefore you may want to normalize the probability.
 
 You can also compare the two probabilities assigned independently to each response (given the same context) to infer the preference label.
 For example, if one response has probability 0.95 and the other has 0.80, the former will be preferred.
-Inferring the preference label in this way only leads to a 0.
+Inferring the preference label in this way only leads to a 0.006 drop in accuracy on the SHP + HH-RLHF test data on average across all domains, meaning that there's only a very small penalty for using SteamSHP-XL as a reward model instead of as a preference model.
 
 
 
@@ -142,7 +142,7 @@ SteamSHP-XL gets an average 72.8% accuracy across all domains:
 | ALL (unweighted) | 0.7278 |
 
 As mentioned previously, if you use SteamSHP as a reward model and try to infer the preference label based on the probability assigned to each response independently, that could also work!
-But doing so will lead to a 0.
+But doing so will lead to a 0.006 drop in accuracy on the test data (on average across all domains), meaning that there is a small penalty.
 
 
 ## Biases and Limitations
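The changed lines describe using SteamSHP-XL as a reward model: score each response on its own and compare the two probabilities to infer the preference label (e.g. 0.95 vs. 0.80 means the first response is preferred). Below is a minimal sketch, assuming the Hugging Face `transformers` T5 API and a POST / RESPONSE A / RESPONSE B input template with RESPONSE B left as a placeholder when only one response is being scored; the prompt wording, the checkpoint name, and the example context and responses are illustrative assumptions, so check the model card's documented format before relying on them.

```python
# Sketch (assumed usage, not the model card's verbatim code): score two responses
# independently with SteamSHP-XL and compare the probabilities to infer the
# preference label, as described in the changed README lines above.
import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

MODEL_NAME = "stanfordnlp/SteamSHP-flan-t5-xl"  # assumed checkpoint name
tokenizer = T5Tokenizer.from_pretrained(MODEL_NAME)
model = T5ForConditionalGeneration.from_pretrained(MODEL_NAME)

def score_response(context: str, response: str) -> float:
    """Probability assigned to a single response (reward-model usage).

    Assumes a POST / RESPONSE A / RESPONSE B template, with RESPONSE B left as
    a placeholder so only RESPONSE A is actually being scored.
    """
    prompt = (
        f"POST: {context}\n\n"
        f"RESPONSE A: {response}\n\n"
        f"RESPONSE B: .\n\n"
        f"Which response is better? RESPONSE"
    )
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    out = model.generate(
        input_ids,
        max_new_tokens=1,
        return_dict_in_generate=True,
        output_scores=True,
    )
    # out.scores[0] holds the logits for the single generated token;
    # the probability of emitting "A" is the score for RESPONSE A.
    token_a = tokenizer("A", add_special_tokens=False).input_ids[0]  # assumes "A" encodes as one token
    probs = torch.softmax(out.scores[0][0], dim=-1)
    return probs[token_a].item()

# Hypothetical context and responses, just to show the comparison.
context = "What's a good way to keep bread from going stale?"
p1 = score_response(context, "Freeze it in slices and toast what you need.")
p2 = score_response(context, "Just leave it on the counter.")
# e.g. if p1 = 0.95 and p2 = 0.80, the first response is preferred.
preferred = 1 if p1 > p2 else 2
print(f"p1={p1:.3f}, p2={p2:.3f} -> response {preferred} preferred")
```

Because each response is scored in a separate forward pass here, this corresponds to the reward-model usage that the README says costs about a 0.006 drop in accuracy relative to presenting both responses together as a preference model.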