# 06-technical-discussion
d
Are there any better ways to weigh GPT-4 responses (e.g., getting confidence scores) based on domain knowledge? I'm trying something along the lines of generating multiple messages and judging which one is better. Would love it if someone could share resources, examples, or experiences.
o
Do you know what the right answers should be?
GPT-4 lacks logprobs, but I guess you could try to hack around that haha
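One possible workaround, sketched here as an assumption rather than anything proposed in the thread: sample the same prompt several times and treat the share of samples that agree as a rough confidence proxy. The helper name `agreement_confidence` and the sample strings are hypothetical, and exact-match agreement only makes sense for short or categorical answers, not free-form messages.

```python
from collections import Counter


def agreement_confidence(responses):
    """Return (most common normalized answer, fraction of samples that agree).

    `responses` is a list of strings, e.g. several completions sampled
    for the same prompt. This is a proxy for confidence, not a logprob.
    """
    if not responses:
        raise ValueError("need at least one response")
    # Normalize case and whitespace so trivial differences do not split votes.
    normalized = [" ".join(r.lower().split()) for r in responses]
    answer, count = Counter(normalized).most_common(1)[0]
    return answer, count / len(normalized)


# Hypothetical samples standing in for repeated model calls.
samples = ["Paris", "paris ", "Lyon", "Paris"]
best, confidence = agreement_confidence(samples)
print(best, confidence)
```

A low agreement fraction would flag a response for human review; the threshold is something to tune against your own data.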
d
Yes @Olabode Adedoyin, we do have human edits for those messages that can serve as ground truth. Curious to learn what you have in mind.
o
One quick fix might be to use vector embeddings to compute a cosine similarity score between your ground truth and what the model generates.
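A minimal sketch of that idea, under loud assumptions: `toy_embed` is a hypothetical word-count stand-in for a real embedding model, used only so the example runs on its own, and `rank_candidates` and the sample messages are likewise made up for illustration. With real embeddings you would swap in dense vectors from your provider and keep the same cosine step.

```python
import math
from collections import Counter


def toy_embed(text):
    # Hypothetical stand-in for a real embedding model: sparse word counts.
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    # Dot product over shared keys divided by the product of vector norms.
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(ground_truth, candidates):
    # Score each generated message against the human-edited reference,
    # highest similarity first.
    ref = toy_embed(ground_truth)
    scored = [(cosine_similarity(ref, toy_embed(c)), c) for c in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


human_edit = "your order ships on monday"
candidates = [
    "we cannot find your account",
    "your order ships on monday morning",
]
ranked = rank_candidates(human_edit, candidates)
for score, text in ranked:
    print(f"{score:.3f}  {text}")
```

The score only measures closeness to the human edit, so it works as an offline evaluation signal on messages where a ground truth exists, not as a confidence score for new messages.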
d
I see, good idea, will try these