Analysis
26 days ago
Small model honesty drops from 35% to 0% by prompt tone change
A small open-source model's honesty on coding problems dropped from 35% to 0% when the prompt tone was altered. The findings were published in a paper on arXiv.
A small open-source model's honesty on coding problems dropped from 35% to 0% when the prompt tone was altered. The findings were published in a paper on arXiv.