Reducing GPT-4 API Cost by using Prompt Decompression (2024)

Martin Khristi

AI and machine learning advocate | BI and data visualization specialist at CA Karrierepartner og a-kasse | Microsoft Fabric enthusiast | Engineer & curator of AI insights | Python programming for AI and data science.

Published Mar 10, 2024

To reduce the size of a prompt, and with it the number of tokens you pay for, you can use compression techniques. One way to do this is to use GPT's own ability to compress text and later decompress it.

A recent tweet from @VictorTaelin suggests that compressing the original prompt leaves GPT more room to generate tokens. @VictorTaelin initially discovered GPT's ability to compress and decompress text, as seen on his GitHub page.

Compression and Decompression Prompt

Here are the steps to create a compressed prompt:

Paste the text after the colon in the following prompt: “compress the following text in a way that fits in a tweet (ideally) and such that you (GPT-4) can reconstruct the intention of the human who wrote text as close as possible to the original intention. This is for yourself. It does not need to be human readable or understandable. Abuse of language mixing, abbreviations, symbols (unicode and emoji), or any other encodings or internal representations is all permissible, as long as it, if pasted in a new inference cycle, will yield near-identical results as the original text:”

Use trial and error to create a compressed version of the text. This may involve using abbreviations, symbols, or other encoding techniques to reduce the number of tokens.
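The steps above can be sketched in Python. This is a minimal illustration, not the author's code: the helper name `build_compression_prompt` is hypothetical, and the instruction string is copied from the prompt quoted above. It only builds the text to send; how you send it to GPT-4 (API client or chat window) is left out.

```python
# Instruction copied from the compression prompt quoted in the article.
COMPRESSION_INSTRUCTION = (
    "compress the following text in a way that fits in a tweet (ideally) "
    "and such that you (GPT-4) can reconstruct the intention of the human "
    "who wrote text as close as possible to the original intention. "
    "This is for yourself. It does not need to be human readable or "
    "understandable. Abuse of language mixing, abbreviations, symbols "
    "(unicode and emoji), or any other encodings or internal "
    "representations is all permissible, as long as it, if pasted in a "
    "new inference cycle, will yield near-identical results as the "
    "original text:"
)


def build_compression_prompt(text: str) -> str:
    """Return the instruction followed by the text to compress.

    Hypothetical helper: the text is pasted after the final colon,
    as the first step describes.
    """
    return f"{COMPRESSION_INSTRUCTION} {text.strip()}"


if __name__ == "__main__":
    # Placeholder input; replace with the prompt you want to compress.
    print(build_compression_prompt("Your long prompt goes here."))
```

Because the second step is trial and error, you would typically run the resulting prompt several times and keep the shortest output that still decompresses to the intended meaning.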

The Results

I tried to compress a newsletter prompt and reduce its token count. Here are the calculations:

Original Newsletter Prompt:

Original Token Size: 207
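The calculation itself can be sketched as follows. Only the original size (207 tokens) appears in the text; the compressed count is not shown, so the `COMPRESSED_TOKENS` value below is a placeholder for illustration and not a measured result. The function name `token_savings` is hypothetical.

```python
def token_savings(original_tokens: int, compressed_tokens: int) -> tuple[int, float]:
    """Return (tokens saved, percentage saved relative to the original)."""
    if original_tokens <= 0:
        raise ValueError("original_tokens must be positive")
    saved = original_tokens - compressed_tokens
    return saved, 100.0 * saved / original_tokens


ORIGINAL_TOKENS = 207    # original token size reported in the article
COMPRESSED_TOKENS = 100  # placeholder only, NOT a result from the article

saved, percent = token_savings(ORIGINAL_TOKENS, COMPRESSED_TOKENS)
print(f"Saved {saved} tokens ({percent:.1f}% of the original prompt)")
```

Keep in mind that the compression request itself also consumes tokens, so the saving pays off mainly when the compressed prompt is reused many times.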

Conclusion

By using compression techniques, you can reduce the size of a prompt while largely preserving its meaning and intent. This is particularly useful when working with large amounts of text or when token limits and API cost are a concern.

More Advanced Techniques:

Please take a look at Sparse Priming Representation (SPR)

In case of any queries, feel free to reach out to me

[email protected]
