# 06-technical-discussion
a
What are the workarounds for the OpenAI completion API's 4k-token limit? I am asking the LLM to summarize content in a particular format (i.e. a JSON schema), which takes 2k tokens in the prompt, not counting the actual text. I found that JSON is not very token-efficient and have already switched to YAML for both input and output, but I still need to keep the structured format. Sample of the schema:
```yaml
attr_1:
  type: number
  minimum: 0
  maximum: 30
attr_2:
  type: string
  enum: [value_1, value_2, etc]
```
and 50 more attributes like that. How are you optimizing long, structured schema prompts? What are the potential workarounds? Would love to hear from the community 🤖
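A quick way to sanity-check how much the YAML switch actually saves is to count tokens for both renderings. A minimal sketch, assuming tiktoken and PyYAML are installed and using only the two sample attributes above:

```python
# Minimal sketch: compare token counts of a JSON vs. YAML rendering of the schema.
# Assumes `tiktoken` and `PyYAML` are installed; the two-attribute schema below
# is just the sample from above, not the full 50-attribute schema.
import json
import yaml
import tiktoken

schema = {
    "attr_1": {"type": "number", "minimum": 0, "maximum": 30},
    "attr_2": {"type": "string", "enum": ["value_1", "value_2"]},
}

enc = tiktoken.get_encoding("cl100k_base")

json_text = json.dumps(schema, indent=2)
yaml_text = yaml.safe_dump(schema, sort_keys=False)

print("JSON tokens:", len(enc.encode(json_text)))
print("YAML tokens:", len(enc.encode(yaml_text)))
```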
d
If you need to read the entirety of the schema in a single prompt, then you could explore prompt compression. Example technique: if you have something that looks like "attr_2" repeated many times, tell the LLM that the entirety of "attr_2: {...}" will be referred to as "attribute_2_placeholder" from now on. You can also consider stripping irrelevant text on a trial-and-error basis. Another approach would be segmenting the input and summarizing recursively (Transformer-XL does something similar: https://arxiv.org/abs/1901.02860). That said, are you sure you need to have the entirety of the schema in a single prompt?
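A rough sketch of the placeholder idea, for illustration only; the `compress_schema` helper and the placeholder names are made up here, not an existing library:

```python
# Sketch of placeholder-style prompt compression for a repetitive YAML schema.
# Any attribute whose definition is identical to one already seen is replaced
# by a short placeholder, and a legend maps placeholders back to definitions.
import yaml

def compress_schema(schema: dict) -> tuple[dict, dict]:
    """Replace repeated attribute definitions with placeholders.

    Returns (compressed_schema, legend), where legend maps placeholder -> definition.
    """
    seen: dict[str, str] = {}      # serialized definition -> placeholder name
    legend: dict[str, dict] = {}
    compressed: dict = {}
    for name, definition in schema.items():
        key = yaml.safe_dump(definition, sort_keys=True)
        if key in seen:
            compressed[name] = seen[key]          # reuse the existing placeholder
        else:
            placeholder = f"def_{len(legend) + 1}"
            seen[key] = placeholder
            legend[placeholder] = definition
            compressed[name] = placeholder
    return compressed, legend

# Example: attr_2 and attr_3 share a definition, so it is sent only once.
schema = {
    "attr_1": {"type": "number", "minimum": 0, "maximum": 30},
    "attr_2": {"type": "string", "enum": ["value_1", "value_2"]},
    "attr_3": {"type": "string", "enum": ["value_1", "value_2"]},
}
compressed, legend = compress_schema(schema)
prompt = (
    "Definitions:\n" + yaml.safe_dump(legend, sort_keys=False) +
    "\nAttributes (values refer to the definitions above):\n" +
    yaml.safe_dump(compressed, sort_keys=False)
)
print(prompt)
```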
👀 1
a
thank you @Daniel Hsu, the prompt compression makes sense. I was able to compress it by 50% with a few small tweaks. As for Transformer-XL, are you aware of any implementations?
> That said, are you sure you need to have the entirety of the schema in a single prompt?
My goal is to evaluate products described in the documents against specific attributes and then rank the products based on those attributes in memory (for now). The same product might be mentioned in multiple documents, and I am trying to get an aggregate of the attributes across those documents. If I'm not passing the schema into the prompt, how would I get structured data in the output? What are the options here? So the simplified pipeline is: documents (multiple products per document) -> products with attributes -> user input on attributes (structured) -> ranking of products based on the attributes that matter. I am not sure if that's the best pipeline, but I'm curious if you have any thoughts on it.
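For context, a minimal sketch of the aggregation and ranking steps being described here, assuming per-document attribute extraction has already happened; all the names, the averaging rule, and the weighting choice are illustrative, not part of any existing API:

```python
# Sketch of the aggregation + ranking steps, assuming attribute extraction per
# document has already been done (e.g. via an LLM call).
from collections import defaultdict
from statistics import mean

# Per-document extractions: product -> {attribute -> value}
extractions = [
    {"product_a": {"attr_1": 12, "attr_2": "value_1"}},
    {"product_a": {"attr_1": 18}},
    {"product_b": {"attr_1": 25, "attr_2": "value_2"}},
]

# Aggregate numeric attributes by averaging across documents;
# for categorical attributes, keep the last value seen.
aggregated: dict[str, dict] = defaultdict(dict)
numeric_values: dict[tuple, list] = defaultdict(list)
for doc in extractions:
    for product, attrs in doc.items():
        for attr, value in attrs.items():
            if isinstance(value, (int, float)):
                numeric_values[(product, attr)].append(value)
            else:
                aggregated[product][attr] = value
for (product, attr), values in numeric_values.items():
    aggregated[product][attr] = mean(values)

# Rank products by the attributes the user cares about (here: attr_1, higher is better).
user_weights = {"attr_1": 1.0}

def score(attrs: dict) -> float:
    return sum(w * attrs.get(a, 0) for a, w in user_weights.items())

ranking = sorted(aggregated.items(), key=lambda kv: score(kv[1]), reverse=True)
print(ranking)
```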
d
Is the set of attributes static, i.e. a finite, well-defined set of possible attributes?
a
yes, attributes are the same across all products and they are well defined: ranges for numbers, possible values for enums, etc
d
Perhaps you could preprocess segments of the documents into just the attributes themselves, e.g. chunk the document first, then process each segment to extract all the attributes it exhibits. That'd compress your input drastically.
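A small sketch of that chunk-then-extract step, with the LLM call stubbed out; `call_llm` and the chunk size are placeholders, not a real client or a recommended setting:

```python
# Sketch of the chunk-then-extract preprocessing step. The completion call is a stub.
def call_llm(prompt: str) -> dict:
    """Stub: replace with an actual completion call plus YAML parsing of the response."""
    raise NotImplementedError

def chunk(text: str, max_chars: int = 4000) -> list[str]:
    """Naive fixed-size chunking; swap in sentence- or section-aware splitting as needed."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def extract_attributes_from_chunk(chunk_text: str) -> dict:
    """Ask the model for only the attributes visible in this chunk."""
    prompt = (
        "Extract any of the known product attributes present in the text below "
        "and return them as YAML. Omit attributes that do not appear.\n\n" + chunk_text
    )
    return call_llm(prompt)

def extract_attributes(document: str) -> dict:
    """Union of per-chunk extractions; later chunks overwrite earlier ones on conflict."""
    merged: dict = {}
    for part in chunk(document):
        merged.update(extract_attributes_from_chunk(part))
    return merged
```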
I believe this is the code for transformer-xl: https://github.com/kimiyoung/transformer-xl
❤️ 2
a
thank you Daniel!
v
this is exactly what i was looking for as well!
❤️ 1
h
Anthropic Claude now supports 100k tokens https://www.anthropic.com/index/100k-context-windows
d
whoa! I'm so curious how they did this...
Maybe they incorporated a chunking/summarizing step
h
I suspect something similar. There is also new research coming out that supports effectively unlimited context windows by offloading attention computation across all layers to a k-nearest-neighbor index: https://arxiv.org/pdf/2305.01625.pdf
😮 1
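Roughly, the idea in that paper is to put the long input's keys and values into a nearest-neighbor index and let each attention query retrieve only its top-k keys instead of attending over everything. A toy numpy sketch of that retrieve-then-attend step, not the paper's actual implementation:

```python
# Toy sketch of kNN-retrieved attention: keys/values for a very long input sit
# in an index; each query attends only over its top-k nearest keys.
import numpy as np

rng = np.random.default_rng(0)
d = 64             # head dimension
n_keys = 100_000   # keys for the entire long input
k = 32             # each query retrieves only this many keys

keys = rng.standard_normal((n_keys, d)).astype(np.float32)
values = rng.standard_normal((n_keys, d)).astype(np.float32)
query = rng.standard_normal(d).astype(np.float32)

# Retrieval step: in practice this is a kNN index (e.g. faiss); brute force here.
scores = keys @ query
top = np.argpartition(scores, -k)[-k:]

# Standard softmax attention, but only over the retrieved keys.
logits = scores[top] / np.sqrt(d)
weights = np.exp(logits - logits.max())
weights /= weights.sum()
output = weights @ values[top]
print(output.shape)  # (64,): one attended vector computed from k of the n_keys entries
```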