Analysis · AI Models · Policy
8 days ago
Automated identification of lexical alignment and preference shifts in LLMs
The paper introduces a method for automatically detecting when LLM assistants diverge from human expectations in language use. The approach builds on research on Scientific English to identify both which divergences occur and why they occur.