I think HumanLoop just launched tracing.
Baserun.ai, from a friend, is also getting into this space. Weights & Biases also has a solution for waterfall-style trace tracking. Looks like someone already beat me to mentioning Arize.
One of the biggest challenges I've seen with tools in the PromptOps space is that many of them assume you talk directly to OpenAI, with no Retrieval Augmented Generation in the loop. Or, for example, there might be a distributed flow across multiple models or systems that you want to track. To solve some of these problems across the entire lifecycle of a prompt, I've been working on PromptStash (see the "Deploying the Variant in Production" section for the traces/stashing). We are agnostic to chaining libraries and model vendors. If you're open to it, I'd appreciate feedback.
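
To make the RAG / multi-system case concrete, here is a rough sketch of what I mean by tracing a flow end to end instead of just the final model call. This is not PromptStash's API or any other tool's; `Trace`, `retrieve`, `call_model`, and `answer` are all made-up names, and the retrieval and model calls are stubs.

```python
import time
import uuid
from contextlib import contextmanager


# Hypothetical, vendor-neutral trace recorder (illustration only,
# not the API of any tool mentioned in this thread).
class Trace:
    def __init__(self, name):
        self.trace_id = uuid.uuid4().hex
        self.name = name
        self.spans = []

    @contextmanager
    def span(self, name, **attrs):
        # Each span records one step of the flow, whatever system runs it.
        record = {"name": name, "attrs": attrs, "start": time.perf_counter()}
        try:
            yield record
        finally:
            record["duration_s"] = time.perf_counter() - record["start"]
            self.spans.append(record)


def retrieve(query):
    # Stand-in for a vector-store lookup.
    return ["doc about tracing", "doc about prompts"]


def call_model(prompt):
    # Stand-in for any model vendor's completion call.
    return f"answer based on {len(prompt)} chars of prompt"


def answer(query):
    trace = Trace("rag_answer")
    with trace.span("retrieval", query=query) as s:
        docs = retrieve(query)
        s["attrs"]["num_docs"] = len(docs)
    with trace.span("generation", model="any-model") as s:
        prompt = "\n".join(docs) + "\n\nQ: " + query
        s["attrs"]["prompt"] = prompt
        output = call_model(prompt)
        s["attrs"]["output"] = output
    return output, trace
```

The point is that the retrieval step and the generation step land in the same trace, so you can see which documents fed which prompt regardless of which library or vendor handled each step.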