    itsmostafa/bedrock
    AI & ML
    About

    AWS Bedrock foundation models for generative AI. Use when invoking foundation models, building AI applications, creating embeddings, configuring model access, or implementing RAG patterns.

    SKILL.md

    AWS Bedrock

    Amazon Bedrock provides access to foundation models (FMs) from AI companies through a unified API. Build generative AI applications with text generation, embeddings, and image generation capabilities.

    Table of Contents

    • Core Concepts
    • Common Patterns
    • CLI Reference
    • Best Practices
    • Troubleshooting
    • References

    Core Concepts

    Foundation Models

    Pre-trained models available through Bedrock:

    • Claude (Anthropic): Text generation, analysis, coding
    • Titan (Amazon): Text, embeddings, image generation
    • Llama (Meta): Open-weight text generation
    • Mistral: Efficient text generation
    • Stable Diffusion (Stability AI): Image generation

    Model Access

    Models must be enabled in your account before use:

    • Request access in Bedrock console
    • Some models require acceptance of EULAs
    • Access is region-specific

    Inference Types

    | Type | Use Case | Pricing |
    |------|----------|---------|
    | On-Demand | Variable workloads | Per token |
    | Provisioned Throughput | Consistent high-volume | Hourly commitment |
    | Batch Inference | Async large-scale | Discounted per token |

    Common Patterns

    Invoke Model (Text Generation)

    AWS CLI:

    # Invoke Claude (AWS CLI v2 needs --cli-binary-format to accept a raw JSON body)
    aws bedrock-runtime invoke-model \
      --model-id anthropic.claude-3-sonnet-20240229-v1:0 \
      --content-type application/json \
      --accept application/json \
      --cli-binary-format raw-in-base64-out \
      --body '{
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 1024,
        "messages": [
          {"role": "user", "content": "Explain AWS Lambda in 3 sentences."}
        ]
      }' \
      response.json
    
    # Print the generated text from the saved response
    jq -r '.content[0].text' response.json
    

    boto3:

    import boto3
    import json
    
    bedrock = boto3.client('bedrock-runtime')
    
    def invoke_claude(prompt, max_tokens=1024):
        response = bedrock.invoke_model(
            modelId='anthropic.claude-3-sonnet-20240229-v1:0',
            contentType='application/json',
            accept='application/json',
            body=json.dumps({
                'anthropic_version': 'bedrock-2023-05-31',
                'max_tokens': max_tokens,
                'messages': [
                    {'role': 'user', 'content': prompt}
                ]
            })
        )
    
        result = json.loads(response['body'].read())
        return result['content'][0]['text']
    
    # Usage
    response = invoke_claude('What is Amazon S3?')
    print(response)
    

    Streaming Response

    import boto3
    import json
    
    bedrock = boto3.client('bedrock-runtime')
    
    def stream_claude(prompt):
        response = bedrock.invoke_model_with_response_stream(
            modelId='anthropic.claude-3-sonnet-20240229-v1:0',
            contentType='application/json',
            accept='application/json',
            body=json.dumps({
                'anthropic_version': 'bedrock-2023-05-31',
                'max_tokens': 1024,
                'messages': [
                    {'role': 'user', 'content': prompt}
                ]
            })
        )
    
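        # Each stream event wraps one JSON-encoded chunk of the response
        for event in response['body']:
            chunk = json.loads(event['chunk']['bytes'])
            # Only content_block_delta chunks carry generated text
            if chunk['type'] == 'content_block_delta':
                yield chunk['delta'].get('text', '')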
    
    # Usage
    for text in stream_claude('Write a haiku about cloud computing.'):
        print(text, end='', flush=True)
    

    Generate Embeddings

    import boto3
    import json
    
    bedrock = boto3.client('bedrock-runtime')
    
    def get_embedding(text):
        response = bedrock.invoke_model(
            modelId='amazon.titan-embed-text-v2:0',
            contentType='application/json',
            accept='application/json',
            body=json.dumps({
                'inputText': text,
                'dimensions': 1024,
                'normalize': True
            })
        )
    
        result = json.loads(response['body'].read())
        return result['embedding']
    
    # Usage
    embedding = get_embedding('AWS Lambda is a serverless compute service.')
    print(f'Embedding dimension: {len(embedding)}')
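
Embeddings are typically compared with cosine similarity, which is the core of the retrieval step in a RAG pattern. The following is a minimal sketch in plain Python; `cosine_similarity`, `rank_documents`, and the `docs` dictionary are hypothetical names introduced for illustration, and the toy 3-dimensional vectors stand in for vectors returned by `get_embedding`.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError('Vectors must have the same dimension')
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_documents(query_embedding, doc_embeddings, top_k=3):
    """Return (doc_id, score) pairs sorted by descending similarity.

    doc_embeddings is a hypothetical dict mapping a document ID to its vector.
    """
    scored = [
        (doc_id, cosine_similarity(query_embedding, vector))
        for doc_id, vector in doc_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy vectors stand in for real embeddings
docs = {
    'lambda': [1.0, 0.0, 0.0],
    's3': [0.0, 1.0, 0.0],
    'fargate': [0.8, 0.6, 0.0],
}
print(rank_documents([1.0, 0.0, 0.0], docs, top_k=2))
```

When embeddings are requested with `normalize` set to `True`, the vectors are unit length, so the dot product alone gives the same ranking. For more than a few thousand documents, a vector store is likely a better fit than a linear scan like this one.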
    

    Conversation with History

    import boto3
    import json
    
    bedrock = boto3.client('bedrock-runtime')
    
    class Conversation:
        def __init__(self, system_prompt=None):
            self.messages = []
            self.system = system_prompt
    
        def chat(self, user_message):
            self.messages.append({
                'role': 'user',
                'content': user_message
            })
    
            body = {
                'anthropic_version': 'bedrock-2023-05-31',
                'max_tokens': 1024,
                'messages': self.messages
            }
    
            if self.system:
                body['system'] = self.system
    
            response = bedrock.invoke_model(
                modelId='anthropic.claude-3-sonnet-20240229-v1:0',
                contentType='application/json',
                accept='application/json',
                body=json.dumps(body)
            )
    
            result = json.loads(response['body'].read())
            assistant_message = result['content'][0]['text']
    
            self.messages.append({
                'role': 'assistant',
                'content': assistant_message
            })
    
            return assistant_message
    
    # Usage
    conv = Conversation(system_prompt='You are an AWS solutions architect.')
    print(conv.chat('What database should I use for a chat application?'))
    print(conv.chat('What about for time-series data?'))
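
The `Conversation` class above appends every turn to `self.messages`, so long sessions send an ever-growing prompt. One possible mitigation, sketched here under the assumption that a simple message count is an acceptable proxy for prompt size, is to keep only the most recent turns. `trim_history` and `max_messages` are hypothetical names, not part of the original skill.

```python
def trim_history(messages, max_messages=10):
    """Keep the most recent messages, starting on a 'user' turn."""
    if max_messages < 1:
        raise ValueError('max_messages must be at least 1')
    recent = messages[-max_messages:]
    # Drop leading assistant turns so the window opens with a user message
    while recent and recent[0]['role'] != 'user':
        recent = recent[1:]
    return recent

history = [
    {'role': 'user', 'content': 'Q1'},
    {'role': 'assistant', 'content': 'A1'},
    {'role': 'user', 'content': 'Q2'},
    {'role': 'assistant', 'content': 'A2'},
    {'role': 'user', 'content': 'Q3'},
]
print([m['content'] for m in trim_history(history, max_messages=4)])
# prints ['Q2', 'A2', 'Q3']
```

Inside `chat`, the request body could then use `trim_history(self.messages)` instead of the full list. A token-based budget would be more precise, but it needs a tokenizer or usage data that this sketch does not assume.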
    

    List Available Models

    # List all foundation models
    aws bedrock list-foundation-models \
      --query 'modelSummaries[*].[modelId,modelName,providerName]' \
      --output table
    
    # Filter by provider
    aws bedrock list-foundation-models \
      --by-provider anthropic \
      --query 'modelSummaries[*].modelId'
    
    # Get model details
    aws bedrock get-foundation-model \
      --model-identifier anthropic.claude-3-sonnet-20240229-v1:0
    

    Request Model Access

    # List the agreement offers (for example, EULA terms) available for a model
    aws bedrock list-foundation-model-agreement-offers \
      --model-id anthropic.claude-3-sonnet-20240229-v1:0
    

    CLI Reference

    Bedrock (Control Plane)

    | Command | Description |
    |---------|-------------|
    | aws bedrock list-foundation-models | List available models |
    | aws bedrock get-foundation-model | Get model details |
    | aws bedrock list-custom-models | List fine-tuned models |
    | aws bedrock create-model-customization-job | Start fine-tuning |
    | aws bedrock list-provisioned-model-throughputs | List provisioned capacity |

    Bedrock Runtime (Data Plane)

    | Command | Description |
    |---------|-------------|
    | aws bedrock-runtime invoke-model | Invoke model synchronously |
    | aws bedrock-runtime invoke-model-with-response-stream | Invoke with streaming |
    | aws bedrock-runtime converse | Multi-turn conversation API |
    | aws bedrock-runtime converse-stream | Streaming conversation |
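
The `converse` operation uses one message format across providers, so the request does not need a model-specific body such as `anthropic_version`. Below is a minimal sketch of a single-turn call through boto3. `converse_once` is a hypothetical helper, and the client is passed in so the function can be exercised without AWS credentials.

```python
def converse_once(client, model_id, prompt, system_prompt=None, max_tokens=1024):
    """Send one user message through the Converse API and return the reply text.

    client is expected to be boto3.client('bedrock-runtime').
    """
    request = {
        'modelId': model_id,
        'messages': [
            {'role': 'user', 'content': [{'text': prompt}]}
        ],
        'inferenceConfig': {'maxTokens': max_tokens},
    }
    if system_prompt:
        request['system'] = [{'text': system_prompt}]

    response = client.converse(**request)
    # The reply content is a list of blocks; join the text blocks
    blocks = response['output']['message']['content']
    return ''.join(block.get('text', '') for block in blocks)

# Usage (requires AWS credentials and model access):
# import boto3
# client = boto3.client('bedrock-runtime')
# print(converse_once(client, 'anthropic.claude-3-sonnet-20240229-v1:0',
#                     'What is Amazon S3?'))
```

For multi-turn use, append the returned assistant message to the `messages` list in the same content-block format before the next call.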

    Bedrock Agent Runtime

    | Command | Description |
    |---------|-------------|
    | aws bedrock-agent-runtime invoke-agent | Invoke a Bedrock agent |
    | aws bedrock-agent-runtime retrieve | Query knowledge base |
    | aws bedrock-agent-runtime retrieve-and-generate | RAG query |
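
As a rough illustration of the RAG query listed above, the sketch below calls `retrieve_and_generate` through boto3 and returns the answer together with the locations of the cited sources. `ask_knowledge_base` is a hypothetical helper; the knowledge base ID and model ARN are placeholders that must come from your own account.

```python
def ask_knowledge_base(client, knowledge_base_id, model_arn, question):
    """Run a RAG query; client is expected to be boto3.client('bedrock-agent-runtime')."""
    response = client.retrieve_and_generate(
        input={'text': question},
        retrieveAndGenerateConfiguration={
            'type': 'KNOWLEDGE_BASE',
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': knowledge_base_id,
                'modelArn': model_arn,
            },
        },
    )
    answer = response['output']['text']
    # Collect the source location of every retrieved reference
    sources = []
    for citation in response.get('citations', []):
        for ref in citation.get('retrievedReferences', []):
            sources.append(ref.get('location', {}))
    return answer, sources

# Usage (requires AWS credentials and an existing knowledge base):
# import boto3
# client = boto3.client('bedrock-agent-runtime')
# answer, sources = ask_knowledge_base(
#     client, 'YOUR_KB_ID',
#     'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
#     'How do I rotate credentials?')
```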

    Best Practices

    Cost Optimization

    • Use appropriate models: Smaller models for simple tasks
    • Set max_tokens: Limit output length when possible
    • Cache responses: For repeated identical queries
    • Batch when possible: Use batch inference for bulk processing
    • Monitor usage: Set up CloudWatch alarms for cost
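
The "cache responses" advice can be sketched as a small in-memory cache keyed on the model ID and the exact request body. Everything here is illustrative: `ResponseCache` and `fake_invoke` are hypothetical names, and a real deployment would more likely wrap `invoke_model` and use a shared store with expiry.

```python
import hashlib
import json

class ResponseCache:
    """In-memory cache keyed on model ID plus the exact request body."""

    def __init__(self, invoke_fn):
        # invoke_fn(model_id, body_dict) -> parsed response (hypothetical callable)
        self._invoke_fn = invoke_fn
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model_id, body):
        # sort_keys makes logically identical bodies hash to the same key
        canonical = json.dumps({'model': model_id, 'body': body}, sort_keys=True)
        return hashlib.sha256(canonical.encode('utf-8')).hexdigest()

    def invoke(self, model_id, body):
        key = self._key(model_id, body)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = self._invoke_fn(model_id, body)
        self._store[key] = result
        return result

# Demo with a stand-in for a real Bedrock call
calls = []

def fake_invoke(model_id, body):
    calls.append(model_id)
    return {'content': [{'text': 'cached answer'}]}

cache = ResponseCache(fake_invoke)
body = {'max_tokens': 100, 'messages': [{'role': 'user', 'content': 'Hi'}]}
cache.invoke('model-a', body)
cache.invoke('model-a', dict(body))  # identical request, served from the cache
print(cache.hits, cache.misses, len(calls))  # prints: 1 1 1
```

Caching only helps when repeating a previous answer is acceptable, which is worth checking for prompts that rely on sampling variety.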

    Performance

    • Use streaming: For better user experience with long outputs
    • Connection pooling: Reuse boto3 clients
    • Regional deployment: Use closest region to reduce latency
    • Provisioned throughput: For consistent high-volume workloads

    Security

    • Least privilege IAM: Only grant needed model access
    • VPC endpoints: Keep traffic private
    • Guardrails: Implement content filtering
    • Audit with CloudTrail: Track model invocations

    IAM Permissions

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "bedrock:InvokeModel",
            "bedrock:InvokeModelWithResponseStream"
          ],
          "Resource": [
            "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0",
            "arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0"
          ]
        }
      ]
    }
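
When the set of allowed models changes often, a policy like the one above could be generated rather than hand-edited. This is a sketch only: `build_invoke_policy` is a hypothetical helper that reproduces the same statement shape for a given region and list of model IDs.

```python
import json

def build_invoke_policy(region, model_ids):
    """Return an IAM policy dict allowing invocation of the given models only."""
    if not model_ids:
        raise ValueError('Provide at least one model ID')
    resources = [
        f'arn:aws:bedrock:{region}::foundation-model/{model_id}'
        for model_id in model_ids
    ]
    return {
        'Version': '2012-10-17',
        'Statement': [{
            'Effect': 'Allow',
            'Action': [
                'bedrock:InvokeModel',
                'bedrock:InvokeModelWithResponseStream',
            ],
            'Resource': resources,
        }],
    }

policy = build_invoke_policy('us-east-1', [
    'anthropic.claude-3-sonnet-20240229-v1:0',
    'amazon.titan-embed-text-v2:0',
])
print(json.dumps(policy, indent=2))
```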
    

    Troubleshooting

    AccessDeniedException

    Causes:

    • Model access not enabled in console
    • IAM policy missing bedrock:InvokeModel
    • Wrong model ID or region

    Debug:

    # Confirm the model ID is offered in the current region
    # (this does not show whether access has been granted)
    aws bedrock list-foundation-models \
      --query 'modelSummaries[?modelId==`anthropic.claude-3-sonnet-20240229-v1:0`]'
    
    # Test IAM permissions
    aws iam simulate-principal-policy \
      --policy-source-arn arn:aws:iam::123456789012:role/my-role \
      --action-names bedrock:InvokeModel \
      --resource-arns "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0"
    

    ModelNotReadyException

    Cause: Model is still being provisioned or temporarily unavailable.

    Solution: Implement retry with exponential backoff:

    import json
    import time
    from botocore.exceptions import ClientError
    
    def invoke_with_retry(bedrock, body, max_retries=3):
        for attempt in range(max_retries):
            try:
                return bedrock.invoke_model(
                    modelId='anthropic.claude-3-sonnet-20240229-v1:0',
                    body=json.dumps(body)
                )
            except ClientError as e:
                # Retry only while the model is not ready; re-raise anything else
                if e.response['Error']['Code'] == 'ModelNotReadyException':
                    # Exponential backoff: 1s, 2s, 4s, ...
                    time.sleep(2 ** attempt)
                else:
                    raise
        raise Exception('Max retries exceeded')
    

    ThrottlingException

    Causes:

    • Exceeded on-demand quota
    • Too many concurrent requests

    Solutions:

    • Request quota increase
    • Implement exponential backoff
    • Consider provisioned throughput
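
One way to implement the backoff suggested above is exponential backoff with full jitter, which spreads retries from concurrent callers. The sketch below is not tied to boto3: `call_with_backoff`, `error_code`, and `RETRYABLE_CODES` are hypothetical names, and the error code is read from a botocore-style `response` attribute on the exception.

```python
import random
import time

RETRYABLE_CODES = {'ThrottlingException', 'ModelNotReadyException'}

def error_code(exc):
    """Extract the AWS error code from a botocore-style exception, if present."""
    response = getattr(exc, 'response', None) or {}
    return response.get('Error', {}).get('Code')

def call_with_backoff(fn, max_retries=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call fn(), retrying retryable errors with exponential backoff and full jitter."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception as exc:
            # Give up on non-retryable errors or once retries are exhausted
            if error_code(exc) not in RETRYABLE_CODES or attempt == max_retries:
                raise
            # Full jitter: random delay up to the capped exponential bound
            bound = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, bound))

# Usage (requires AWS credentials):
# result = call_with_backoff(
#     lambda: bedrock.invoke_model(modelId=model_id, body=json.dumps(body)))
```

The `sleep` parameter exists so the function can be tested without waiting. botocore also has built-in retry modes that may be sufficient on their own.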

    ValidationException

    Common issues:

    • Invalid model ID
    • Malformed request body
    • max_tokens exceeds model limit

    Debug:

    # Check which inference types (for example ON_DEMAND or PROVISIONED) the model supports
    aws bedrock get-foundation-model \
      --model-identifier anthropic.claude-3-sonnet-20240229-v1:0 \
      --query 'modelDetails.inferenceTypesSupported'
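
Some of the issues listed above can be caught before the request is sent. The following sketch checks an Anthropic Messages request body for a few common mistakes. `validate_claude_body` is a hypothetical helper, the rules it applies are assumptions based on the request format used in this document, and the 4096 limit in the demo is only an illustrative value to replace with the documented limit for your model.

```python
def validate_claude_body(body, max_output_tokens):
    """Return a list of problems found in an Anthropic Messages request body.

    max_output_tokens is supplied by the caller; look up the real limit for your model.
    """
    problems = []
    if 'anthropic_version' not in body:
        problems.append('missing anthropic_version')

    max_tokens = body.get('max_tokens')
    # bool is a subclass of int, so exclude it explicitly
    if not isinstance(max_tokens, int) or isinstance(max_tokens, bool) or max_tokens < 1:
        problems.append('max_tokens must be a positive integer')
    elif max_tokens > max_output_tokens:
        problems.append(f'max_tokens {max_tokens} exceeds limit {max_output_tokens}')

    messages = body.get('messages')
    if not isinstance(messages, list) or not messages:
        problems.append('messages must be a non-empty list')
    else:
        for i, message in enumerate(messages):
            if message.get('role') not in ('user', 'assistant'):
                problems.append(f'messages[{i}] has invalid role')
            if not message.get('content'):
                problems.append(f'messages[{i}] has empty content')
        # Assumption: the conversation must open with a user turn
        if messages[0].get('role') != 'user':
            problems.append('first message must have role "user"')
    return problems

body = {'anthropic_version': 'bedrock-2023-05-31', 'max_tokens': 8192, 'messages': []}
print(validate_claude_body(body, max_output_tokens=4096))
# prints ['max_tokens 8192 exceeds limit 4096', 'messages must be a non-empty list']
```

An empty list means none of these checks failed. The service can still reject the request for reasons this sketch does not cover.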
    

    References

    • Bedrock User Guide
    • Bedrock API Reference
    • Bedrock Runtime API
    • Model Parameters
    • Bedrock Pricing
    Repository
    itsmostafa/aws-agent-skills