I am working with the Anthropic API to process text prompts, but I keep encountering the following error when my prompt exceeds the maximum token limit:
```
Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'prompt is too long: 200936 tokens > 199999 maximum'}}
```
I need to ensure my prompts are within the 199,999 token limit before sending them to the API. Here's what I have so far:
- I generate a long prompt with approximately 150K words.
- I use the count_tokens method to check the token count.
- If the token count exceeds the limit, I trim the prompt and retry.

Here's the code I'm using:
```python
from anthropic_bedrock import AnthropicBedrock
import anthropic
import random
import string

# Function to generate a random word
def generate_random_word(length: int) -> str:
    return ''.join(random.choices(string.ascii_lowercase, k=length))

# Generate ~150K words
words = [generate_random_word(random.randint(3, 10)) for _ in range(150_000)]
print(f'Number of words: {len(words)}')
test_prompt = ' '.join(words)

# Count tokens with both clients; -1 signals a failed count
def count_number_tokens(prompt: str, verbose: bool = False) -> tuple[int, int]:
    bedrock_client = AnthropicBedrock()
    anthropic_client = anthropic.Client()
    try:
        token_count_bedrock = bedrock_client.count_tokens(prompt)
    except Exception as e:
        token_count_bedrock = -1
        if verbose:
            print(f"Error counting tokens with Bedrock: {e}")
    try:
        token_count_anthropic = anthropic_client.count_tokens(prompt)
    except Exception as e:
        token_count_anthropic = -1
        if verbose:
            print(f"Error counting tokens with Anthropic: {e}")
    if verbose:
        print(f'token_count_bedrock={token_count_bedrock}, '
              f'token_count_anthropic={token_count_anthropic}')
    return token_count_bedrock, token_count_anthropic

# Maximum token limit
max_tokens = 199_999

# Function to trim the prompt until it fits within the token limit
def trim_prompt(prompt: str, max_tokens: int) -> str:
    while True:
        _, token_count = count_number_tokens(prompt)
        if token_count < 0:
            # A failed count should not be mistaken for a small one
            raise RuntimeError('Token counting failed')
        if token_count <= max_tokens:
            break
        previous_length = len(prompt)
        # Reduce the size of the prompt
        prompt = prompt[:len(prompt) - 1000]
        if len(prompt) == previous_length:
            # Avoid an infinite loop if the length stops shrinking
            prompt = prompt[:len(prompt) // 2]
    return prompt

# Trim the prompt to fit within the token limit
trimmed_prompt = trim_prompt(test_prompt, max_tokens)

# Final check
final_token_count_bedrock, final_token_count_anthropic = count_number_tokens(trimmed_prompt, verbose=True)
print(f'Final prompt length: {len(trimmed_prompt)} characters')
print(f'Final token count (Bedrock): {final_token_count_bedrock}')
print(f'Final token count (Anthropic): {final_token_count_anthropic}')
```
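On the efficiency question, one idea I've been considering: instead of shaving off 1,000 characters per API call, binary-search over character cutoffs, so the counter is called O(log n) times instead of O(n / 1000). This is just a sketch, not tested against the real API; `count_fn` is a stand-in for whichever `count_tokens` call you use, and the `fake_count` below is a made-up ~4-characters-per-token counter used only for illustration.

```python
def trim_prompt_bisect(prompt: str, max_tokens: int, count_fn) -> str:
    """Return the longest prefix of `prompt` whose token count fits.

    count_fn(text) -> int is a stand-in for an API token counter.
    Makes O(log(len(prompt))) calls to count_fn instead of one per
    fixed-size trim step.
    """
    if count_fn(prompt) <= max_tokens:
        return prompt
    # Invariant: prompt[:lo] fits; prompt[:hi + 1] does not.
    lo, hi = 0, len(prompt) - 1
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if count_fn(prompt[:mid]) <= max_tokens:
            lo = mid
        else:
            hi = mid - 1
    return prompt[:lo]

# Demo with a fake counter (~4 characters per token, an assumption):
fake_count = lambda s: len(s) // 4
trimmed = trim_prompt_bisect('x' * 1_000_000, 199_999, fake_count)
print(len(trimmed))  # largest prefix length whose fake count fits
```

For a 1M-character prompt this needs roughly 20 count calls rather than hundreds, at the cost of each call still sending a large payload.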
Questions:
- Is there a more efficient way to handle prompts that are too long for the Anthropic API?
- Are there any best practices or recommended approaches for trimming prompts to fit within token limits?
- How can I ensure my approach does not inadvertently lead to an infinite loop or excessive API calls?
Any guidance or suggestions would be greatly appreciated!
Note: it's basically impossible to deduce the exact character index at which to truncate the string, since those companies don't return that, as far as I know.
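Since the exact cut point can't be recovered from the API, a workaround I've seen suggested (again, only a sketch) is to derive a characters-per-token ratio from the one count you already have, make a single coarse cut with a small safety margin, and then verify with one more count call. The 2% margin here is an arbitrary assumption; the ratio varies by model and by text.

```python
def coarse_trim(prompt: str, token_count: int, max_tokens: int) -> str:
    """Cut `prompt` to roughly `max_tokens` using the measured ratio.

    `token_count` is the token count already measured for `prompt`.
    The result should still be re-counted once before sending.
    """
    if token_count <= max_tokens:
        return prompt
    chars_per_token = len(prompt) / token_count
    # 2% safety margin (assumption) so the verify step rarely fails
    keep = int(max_tokens * chars_per_token * 0.98)
    return prompt[:keep]
```

This gets within a few hundred tokens of the limit in one step; if the verify count still exceeds the limit, a couple of small follow-up trims finish the job.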
Ref:
- Anthropic's Discord Channel: https://discord.com/channels/1072196207201501266/1268741091377676309