Ethics by Design
Truth Filters for LLMs?
Emma's curiosity deepened as she wondered how these philosophical perspectives on truth could be applied in practice to help both users and developers interact more responsibly with LLMs. If truth could be seen through different lenses, then each lens might offer a distinct way to improve how people work with these models. She began to think of the theories as 'truth filters' - different ways of evaluating, interpreting and even designing model responses with specific goals in mind. These filters, she thought, could form the basis of an ethics by design framework: a system that could make interactions with LLMs more transparent, grounded and ultimately safer for all.
She sketched out ideas for how users and developers could apply each truth filter to guide their approach:
- Correspondence Filter
- Coherence Filter
- Pragmatic Filter
- Epistemic Filter
1. Correspondence Filter
- For users: Emma imagined a world where users approached AI responses with a filter that prioritised real-world facts. With this filter, users would be prompted to ask whether an AI response really reflects objective reality: is the information something that can be verified, such as a date, name or location? By drawing attention to real-world alignment, the filter could build a habit of cross-checking rather than accepting everything at face value.
- For developers: Developers, Emma thought, could put this filter to work by building in fact-checking mechanisms. Integrating real-time fact-checking tools - such as APIs that pull from regularly updated databases - could flag answers that deviate from established facts. This would help reduce the likelihood of hallucinations by keeping responses grounded in reality, especially on fact-sensitive topics; a sketch of such a check follows below.
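A minimal sketch of what such a correspondence check might look like, assuming a toy in-memory fact store and a claim that has already been extracted from the model's answer. The names FACT_STORE and check_claim are placeholders introduced here; a production system would instead query an external fact-checking service or an up-to-date database.

```python
from dataclasses import dataclass
from typing import Optional

# Toy reference data standing in for an external, regularly updated database.
FACT_STORE = {
    "capital_of_france": "Paris",
    "boiling_point_of_water_celsius": "100",
}

@dataclass
class FactCheckResult:
    claim_key: str
    model_value: str
    reference_value: Optional[str]
    verdict: str  # "supported", "contradicted" or "unverifiable"

def check_claim(claim_key: str, model_value: str) -> FactCheckResult:
    """Compare one claim extracted from a model answer against the reference store."""
    reference = FACT_STORE.get(claim_key)
    if reference is None:
        verdict = "unverifiable"                      # nothing to compare against
    elif reference.lower() == model_value.strip().lower():
        verdict = "supported"
    else:
        verdict = "contradicted"                      # candidate for a hallucination flag
    return FactCheckResult(claim_key, model_value, reference, verdict)

if __name__ == "__main__":
    # In a full pipeline, claims would be extracted from the answer automatically.
    print(check_claim("capital_of_france", "Lyon").verdict)   # -> contradicted
    print(check_claim("capital_of_france", "Paris").verdict)  # -> supported
```

The verdict could then drive the user-facing flag described above, so that contradicted claims are highlighted rather than silently shown.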
2. Coherence Filter
- For users: Emma knew that sometimes the truth lies in consistency rather than in raw facts. She imagined a coherence filter that would help users assess whether an answer fits logically with a previous conversation or with their own knowledge. If a model began to contradict itself in a conversation, users would be prompted to take a closer look, encouraging critical engagement rather than passive acceptance.
- For developers: Developers could implement this filter by designing models with an internal consistency check, allowing the LLM to 'remember' its previous responses during an ongoing session. Training models to avoid contradicting their earlier statements, especially in longer dialogues, would make the output more reliable and engaging and reduce the tendency to drift over time; a sketch of a simple in-session check follows below.
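A minimal sketch of such an in-session consistency check, assuming statements have already been tagged with a topic. SessionMemory is a hypothetical helper introduced here, and the plain string comparison only keeps the idea visible; a real system would use a natural language inference model to detect contradictions.

```python
class SessionMemory:
    """Remembers earlier statements per topic and flags later contradictions."""

    def __init__(self) -> None:
        self._statements: dict[str, str] = {}  # topic -> earlier statement

    def check_and_store(self, topic: str, statement: str) -> str:
        """Return 'new_topic', 'consistent' or 'contradiction' for a new statement."""
        previous = self._statements.get(topic)
        if previous is None:
            self._statements[topic] = statement
            return "new_topic"
        if previous.strip().lower() == statement.strip().lower():
            return "consistent"
        return "contradiction"  # prompt a closer look before the reply is shown

memory = SessionMemory()
print(memory.check_and_store("meeting_day", "The meeting is on Tuesday."))  # new_topic
print(memory.check_and_store("meeting_day", "The meeting is on Friday."))   # contradiction
```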
3. Pragmatic Filter
- For users: Emma recognised that not all tasks require strict factual accuracy, so the pragmatic filter could guide users to assess the practical utility of an output. If a model's answer serves the task at hand - such as generating ideas in a brainstorming session - then it has value, even if it isn't perfectly factual. This filter could remind users to think about context: is this a creative or exploratory scenario where a flexible, usable answer matters more than strict accuracy?
- For developers: Developers could use this filter to prioritise output according to the intended purpose of a user's prompt. Contextual fine-tuning could encourage models to produce relevant, pragmatic answers that meet user needs, especially in exploratory or creative applications. In this way, the model could focus on providing answers that are "good enough" for the context, rather than absolute truths, creating a more flexible interaction; a lightweight illustration of purpose-aware behaviour is sketched below.
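Fine-tuning itself happens offline, so as a lighter-weight illustration of the same idea, here is a sketch that adjusts generation settings to the prompt's purpose. The keyword heuristic and the names CREATIVE_MARKERS, detect_intent and generation_settings are assumptions introduced for this example; a real system might use a classifier, or ask the model itself, to judge a prompt's intent.

```python
# Keyword heuristic used only for illustration.
CREATIVE_MARKERS = ("brainstorm", "ideas", "imagine", "story", "slogan")

def detect_intent(prompt: str) -> str:
    """Roughly classify a prompt as 'creative' or 'factual'."""
    lowered = prompt.lower()
    return "creative" if any(word in lowered for word in CREATIVE_MARKERS) else "factual"

def generation_settings(prompt: str) -> dict:
    """Pick sampling settings that match the practical purpose of the prompt."""
    if detect_intent(prompt) == "creative":
        # Favour varied, usable suggestions over strict accuracy.
        return {"temperature": 0.9, "instruction": "Offer several options; exactness is secondary."}
    # Favour cautious, verifiable answers for fact-seeking prompts.
    return {"temperature": 0.2, "instruction": "Answer precisely and say if you are unsure."}

print(generation_settings("Brainstorm ideas for a workshop title."))
print(generation_settings("What year did the Berlin Wall fall?"))
```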
4. Epistemic Filter
- For users: Emma could see the epistemic filter acting as a guide for users to critically evaluate responses, especially where evidence and justification are essential. This filter would encourage users to look for answers that are supported by clear reasoning or sources, helping them to develop a natural scepticism towards answers that lack justification.
- For developers: Developers could improve the LLM's output by implementing confidence scoring and, where possible, citation generation. By allowing the model to signal how confident it is in an answer, developers could give users insight into how "sure" the model is. In addition, training the model to draw on reputable data sources or cite references would let users assess the basis for an answer, leading to more informed judgements; a sketch of how such an annotated answer might be presented follows below.
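A minimal sketch of how an answer might be delivered together with a confidence score and its supporting sources. JustifiedAnswer is a hypothetical container introduced here; the score is a placeholder that a real pipeline might derive from token probabilities or a self-evaluation step, and the sources would come from a retrieval component.

```python
from dataclasses import dataclass, field

@dataclass
class JustifiedAnswer:
    text: str
    confidence: float                    # 0.0 (very unsure) to 1.0 (very sure)
    sources: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Format the answer so users can see how well it is supported."""
        if self.confidence >= 0.75:
            label = "high"
        elif self.confidence >= 0.4:
            label = "moderate"
        else:
            label = "low"
        cited = "\n".join(f"  [{i + 1}] {s}" for i, s in enumerate(self.sources))
        return (
            f"{self.text}\n\n"
            f"Confidence: {label} ({self.confidence:.2f})\n"
            f"Sources:\n{cited if cited else '  (no sources found)'}"
        )

answer = JustifiedAnswer(
    text="Water boils at 100 °C at standard atmospheric pressure.",
    confidence=0.92,
    sources=["Retrieved reference entry on the boiling point of water"],
)
print(answer.render())
```

Presenting the label alongside the raw score keeps the signal readable for non-technical users while still exposing the underlying number.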
Emma realised that embedding these truth filters into both user interfaces and model development processes could shape a responsible and user-conscious approach to interacting with AI. By guiding both sides - users to evaluate and developers to implement mindful strategies - these filters could improve transparency, usability and accountability in AI applications.
She imagined a future where every interaction with an LLM included subtle reminders of these filters, encouraging users to engage critically and ensuring that developers took proactive steps to minimise the risk of error and unintended consequences. In this future, truth wasn't just about producing correct answers; it was a dynamic process that allowed people to interact with LLMs in a meaningful and responsible way. With truth filters as her blueprint, Emma saw the possibility of an AI landscape where technology and human values were thoughtfully aligned.