Applied Language Understanding

Quality

A collection of 4 posts

Evaluating GPT-4-Turbo

Last Monday (November 6, 2023) OpenAI held their first developer day conference and unveiled several new features. To us, one of the most interesting announcements was the launch of the new GPT-4-Turbo LLM. If you've used GPT-4 for any length of time you know that, at least through
13 Nov 2023 4 min read

Evaluating Atlassian Intelligence

Back in April at their annual TEAM conference, Atlassian announced Atlassian Intelligence, a suite of AI-enabled features across the entire product line. This vision and level of commitment to AI's role in the future of every Atlassian product is very exciting. After a few months of eager anticipation,
04 Oct 2023 6 min read

Gaucho: our evaluation tool

We talked recently about the importance of evaluation in continuously improving Connie AI and keeping it honest. At the beginning, we were running our evaluations through command-line scripts. We would check the output files into git, and as part of the commit we would diff the results and visually check that
16 Aug 2023 3 min read

Zen and the art of Q.A.

Our initial RAG-based question answering system worked surprisingly well out of the box, a testament to how good foundation models are. They are also relatively flimsy: a change in a prompt may cause a whole category of questions to suddenly not be answered correctly. In order to make sure we
09 Aug 2023 4 min read
Applied Language Understanding © 2026