verbalized-sampling

glnds/verbalized-sampling

AI & ML

About

SKILL.md

verbalized-sampling

glnds/verbalized-sampling

AI & ML

About

Prompt engineering technique to overcome mode collapse in LLM responses by generating multiple answers with probabilities...

SKILL.md

Verbalized Sampling

Verbalized Sampling (VS) is a prompt engineering technique that mitigates mode collapse in LLM responses by explicitly requesting multiple answers with associated probabilities.

Core Concept

Standard LLMs typically display only the top-ranked response due to RLHF training. This "mode collapse" hides alternative viable answers. VS overcomes this by instructing the AI to sample from its full internal probability distribution.

Base Template

You are a helpful assistant. For each question that I ask, generate a set of 5 possible responses. Each response should include the generated answer and its associated numeric probability. Show me all five responses. Please sample at random from the full distribution.

Usage Patterns

Standard Distribution (5 responses)

Use the base template to see a representative sample of possible answers with their probabilities.

Expanded Distribution (10+ responses)

Generate a set of 10 possible responses.

Use when the question is complex or you want broader coverage of the solution space.

Highest Probability Only

Please only show me the response that has the highest probability.

Use to revert to conventional behavior while maintaining the VS framework.

Lowest Probability Only

Please only show me the response that has the lowest probability.

Use to discover unexpected or unconventional solutions.

Tail Distribution

Please sample from the tails of the distribution such that the probability of each response is less than 0.10.

Use to explore edge cases or novel approaches.

Custom Probability Range

Please show me responses with probabilities between 0.30 and 0.60.

Use to focus on mid-range alternatives that balance novelty and reliability.

Important Caveats

Fabricated Responses

If you request more responses than genuinely exist, the AI may invent additional answers to satisfy your request. Always critically evaluate all responses.

Approximate Probabilities

Probabilities are approximations, not exact values. The AI may generate probabilities to satisfy your request rather than computing precise values. Use them as relative indicators, not absolute measures.

Validation Required

The burden of verifying response validity rests with you. Cross-check answers, especially those with lower probabilities.

Performance Characteristics

Latency

Expect slightly increased response time due to additional processing required to generate multiple responses.

Cost

If using a paid API, VS prompts will increase costs due to longer responses and additional processing. Occasional use has negligible impact; frequent use may noticeably increase bills.

Compatibility

Works across major LLMs including ChatGPT, Claude, Gemini, Llama, and Grok. The technique is training-free and model-agnostic.

Effective Applications

Creative writing tasks (poems, stories, jokes)
Complex decision-making where multiple valid approaches exist
Brainstorming and ideation sessions
Discovering alternatives to conventional solutions
Question-answering where nuance matters
Dialogue simulation
Synthetic data generation

When Not to Use

Simple factual queries with single correct answers
Time-sensitive queries where speed matters
Cost-constrained scenarios with frequent repetitive queries
Tasks where only the most reliable answer is acceptable

Integration with Other Prompts

VS can be combined with other prompting techniques. Place the VS instruction at the beginning of your prompt, followed by your specific request:

You are a helpful assistant. For each question that I ask, generate a set of 5 possible responses. Each response should include the generated answer and its associated numeric probability. Show me all five responses. Please sample at random from the full distribution.

[Your specific instruction or question here]

Underlying Research

Based on "Verbalized Sampling: How To Mitigate Mode Collapse And Unlock LLM Diversity" by Zhang et al. (arXiv, October 2025). The technique addresses how RLHF post-training alignment unintentionally creates mode collapse, limiting response diversity.

About

SKILL.md

About

Prompt engineering technique to overcome mode collapse in LLM responses by generating multiple answers with probabilities...

SKILL.md

Verbalized Sampling

Verbalized Sampling (VS) is a prompt engineering technique that mitigates mode collapse in LLM responses by explicitly requesting multiple answers with associated probabilities.

Core Concept

Base Template

You are a helpful assistant. For each question that I ask, generate a set of 5 possible responses. Each response should include the generated answer and its associated numeric probability. Show me all five responses. Please sample at random from the full distribution.

Usage Patterns

Standard Distribution (5 responses)

Use the base template to see a representative sample of possible answers with their probabilities.

Expanded Distribution (10+ responses)

Generate a set of 10 possible responses.

Use when the question is complex or you want broader coverage of the solution space.

Highest Probability Only

Please only show me the response that has the highest probability.

Use to revert to conventional behavior while maintaining the VS framework.

Lowest Probability Only

Please only show me the response that has the lowest probability.

Use to discover unexpected or unconventional solutions.

Tail Distribution

Please sample from the tails of the distribution such that the probability of each response is less than 0.10.

Use to explore edge cases or novel approaches.

Custom Probability Range

Please show me responses with probabilities between 0.30 and 0.60.

Use to focus on mid-range alternatives that balance novelty and reliability.

Important Caveats

Fabricated Responses

If you request more responses than genuinely exist, the AI may invent additional answers to satisfy your request. Always critically evaluate all responses.

Approximate Probabilities

Validation Required

The burden of verifying response validity rests with you. Cross-check answers, especially those with lower probabilities.

Performance Characteristics

Latency

Expect slightly increased response time due to additional processing required to generate multiple responses.

Cost

If using a paid API, VS prompts will increase costs due to longer responses and additional processing. Occasional use has negligible impact; frequent use may noticeably increase bills.

Compatibility

Works across major LLMs including ChatGPT, Claude, Gemini, Llama, and Grok. The technique is training-free and model-agnostic.

Effective Applications

Creative writing tasks (poems, stories, jokes)
Complex decision-making where multiple valid approaches exist
Brainstorming and ideation sessions
Discovering alternatives to conventional solutions
Question-answering where nuance matters
Dialogue simulation
Synthetic data generation

When Not to Use

Simple factual queries with single correct answers
Time-sensitive queries where speed matters
Cost-constrained scenarios with frequent repetitive queries
Tasks where only the most reliable answer is acceptable

Integration with Other Prompts

VS can be combined with other prompting techniques. Place the VS instruction at the beginning of your prompt, followed by your specific request:

You are a helpful assistant. For each question that I ask, generate a set of 5 possible responses. Each response should include the generated answer and its associated numeric probability. Show me all five responses. Please sample at random from the full distribution.

[Your specific instruction or question here]