2025

Chain of Thought (CoT) prompting

Chain-of-Thought prompting is a technique for getting better results from large language models by asking them to show their reasoning step by step. This post explains how we use it in our Diversity Tokenism research.

Read more →
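For readers new to the technique, here is a minimal sketch of what a Chain-of-Thought prompt looks like. The wording, the helper name `build_cot_prompt`, and the sample question are illustrative assumptions, not the exact setup used in the research.

```python
# Minimal Chain-of-Thought sketch: wrap a question so the model is
# explicitly asked to reason step by step before answering.
# (Illustrative only; prompt wording and helper name are assumptions.)
def build_cot_prompt(question: str) -> str:
    """Return a prompt that elicits step-by-step reasoning."""
    return (
        f"Q: {question}\n"
        "A: Let's think step by step."
    )

prompt = build_cot_prompt("A shop sells pens at 3 for $2. How much do 12 pens cost?")
print(prompt)
```

The string produced here would be sent to the model in place of the bare question; the added "Let's think step by step" cue is what distinguishes CoT from direct prompting.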

2025

Can you trust GenAI with numbers?

Why LLMs ace easy sums but fail at 4-digit multiplication, and what that means for finance, audit, and tax teams using GenAI.

Read more →
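Teams who want to check this claim on their own model can use a small harness like the one below: generate exact 4-digit multiplication cases in code, then score the model's answers against the true products. The function names and the simulated answers are assumptions for illustration; `answers` would come from your own GenAI API in practice.

```python
import random

# Hedged sketch: spot-check GenAI arithmetic against exact products.
# In real use, `answers` would be parsed from model responses.
def make_cases(n: int, seed: int = 0) -> list[tuple[int, int, int]]:
    """Generate n (a, b, a*b) cases with 4-digit operands."""
    rng = random.Random(seed)
    return [(a, b, a * b)
            for a, b in ((rng.randint(1000, 9999), rng.randint(1000, 9999))
                         for _ in range(n))]

def score(answers: list[int], cases: list[tuple[int, int, int]]) -> float:
    """Fraction of answers matching the exact product."""
    return sum(ans == truth for ans, (_, _, truth) in zip(answers, cases)) / len(cases)

cases = make_cases(5)
# Simulate a model that is exactly right on some cases, off by one on others:
answers = [t if i % 2 == 0 else t + 1 for i, (_, _, t) in enumerate(cases)]
print(score(answers, cases))  # 3 of 5 correct -> 0.6
```

Because the ground truth is computed deterministically, this kind of harness gives an auditable accuracy number rather than an impression.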

2025

Is my LLM getting dumber, or is it just me?

43 carefully designed tests show GPT variants drifting in accuracy over time. Teams need continuous evaluation, not blind trust in a model label.

Read more →

2025

Don't tell AI what time it is

Adding timestamps for audit trails drops accuracy by 10%. The compliance mechanism undermines the output being audited.

Read more →