Skip to main content

[Preview] v1.81.12 - Guardrail Policy Templates & Action Builder

Krrish Dholakia
CEO, LiteLLM
Ishaan Jaff
CTO, LiteLLM

Deploy this version​

docker run litellm
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.81.12.rc.1

Key Highlights​


New Providers and Endpoints​

New Providers (2 new providers)​

ProviderSupported LiteLLM EndpointsDescription
Scaleway/chat/completionsScaleway Generative APIs for chat completions
Sarvam AI/chat/completions, /audio/transcriptions, /audio/speechSarvam AI STT and TTS support for Indian languages

New Models / Updated Models​

New Model Support (19 highlighted models)​

ProviderModelContext WindowInput ($/1M tokens)Output ($/1M tokens)
AWS Bedrockdeepseek.v3.2164K$0.62$1.85
AWS Bedrockminimax.minimax-m2.1196K$0.30$1.20
AWS Bedrockmoonshotai.kimi-k2.5262K$0.60$3.00
AWS Bedrockmoonshotai.kimi-k2-thinking262K$0.73$3.03
AWS Bedrockqwen.qwen3-coder-next262K$0.50$1.20
AWS Bedrocknvidia.nemotron-nano-3-30b262K$0.06$0.24
Azure AIazure_ai/kimi-k2.5262K$0.60$3.00
Vertex AIvertex_ai/zai-org/glm-5-maas200K$1.00$3.20
MiniMaxminimax/MiniMax-M2.51M$0.30$1.20
MiniMaxminimax/MiniMax-M2.5-lightning1M$0.30$2.40
Dashscopedashscope/qwen3-max258KTiered pricingTiered pricing
Perplexityperplexity/preset/pro-search-Per-requestPer-request
Perplexityperplexity/openai/gpt-4o-Per-requestPer-request
Perplexityperplexity/openai/gpt-5.2-Per-requestPer-request
Vercel AI Gatewayvercel_ai_gateway/anthropic/claude-opus-4.6200K$5.00$25.00
Vercel AI Gatewayvercel_ai_gateway/anthropic/claude-sonnet-4200K$3.00$15.00
Vercel AI Gatewayvercel_ai_gateway/anthropic/claude-haiku-4.5200K$1.00$5.00
Sarvam AIsarvam/sarvam-m8KFree tierFree tier
Anthropicfast/claude-opus-4-61M$30.00$150.00

Note: AWS Bedrock models are available across multiple regions (us-east-1, us-east-2, us-west-2, eu-central-1, eu-north-1, ap-northeast-1, ap-south-1, ap-southeast-3, sa-east-1). 54 regional model entries were added in total.

Features​

  • Anthropic

    • Enable non-tool structured outputs on Claude Opus 4.5 and 4.6 using output_format param - PR #20548
    • Add support for anthropic_messages call type in prompt caching - PR #19233
    • Managing Anthropic Beta Headers with remote URL fetching - PR #20935, PR #21110
    • Remove x-anthropic-billing block - PR #20951
    • Use Authorization Bearer for OAuth tokens instead of x-api-key - PR #21039
    • Filter unsupported JSON schema constraints for structured outputs - PR #20813
    • New Claude Opus 4.6 features for /v1/messages - PR #20733
    • Fix reasoning_effort=None and "none" should return None for Opus 4.6 - PR #20800
  • AWS Bedrock

    • Extend model support with 4 new beta models - PR #21035
    • Add Claude Opus 4.6 to _supports_tool_search_on_bedrock - PR #21017
    • Correct Bedrock Claude Opus 4.6 model IDs (remove :0 suffix) - PR #20564, PR #20671
    • Add output_config as supported param - PR #20748
  • Vertex AI

    • Add Vertex GLM-5 model support - PR #21053
    • Propagate extra_headers anthropic-beta to request body - PR #20666
    • Preserve usageMetadata in _hidden_params - PR #20559
    • Map IMAGE_PROHIBITED_CONTENT to content_filter - PR #20524
    • Add RAG ingest for Vertex AI - PR #21120
  • OCI / Cohere

    • OCI Cohere responseFormat/Pydantic support - PR #20663
    • Fix OCI Cohere system messages by populating preambleOverride - PR #20958
  • Perplexity

    • Perplexity Research API support with preset search - PR #20860
  • MiniMax

    • Add MiniMax-M2.5 and MiniMax-M2.5-lightning models - PR #21054
  • Kimi / Moonshot

  • Dashscope

    • Add dashscope/qwen3-max model with tiered pricing - PR #20919
  • Vercel AI Gateway

    • Add new Vercel AI Anthropic models - PR #20745
  • Azure AI

    • Add azure_ai/kimi-k2.5 to Azure model DB - PR #20896
    • Support Azure AD token auth for non-Claude azure_ai models - PR #20981
    • Fix Azure batches issues - PR #21092
  • DeepSeek

    • Sync DeepSeek model metadata and add bare-name fallback - PR #20938
  • Gemini

    • Handle image in assistant message for Gemini - PR #20845
    • Add missing tpm/rpm for Gemini models - PR #21175
  • General

    • Add 30 missing models to pricing JSON - PR #20797
    • Cleanup 39 deprecated OpenRouter models - PR #20786
    • Standardize endpoint display_name naming convention - PR #20791
    • Fix and stabilize model cost map formatting - PR #20895
    • Export PermissionDeniedError from litellm.__init__ - PR #20960

Bug Fixes​


LLM API Endpoints​

Features​

  • Responses API

    • Add server-side context management (compaction) support - PR #21058
    • Add Shell tool support for OpenAI Responses API - PR #21063
    • Preserve tool call argument deltas when streaming id is omitted - PR #20712
    • Preserve interleaved thinking/redacted_thinking blocks during streaming - PR #20702
  • Chat Completions

    • Add Web Search support using LiteLLM /search (web search interception hook) - PR #20483
    • Preserved nullable object fields by carrying schema properties - PR #19132
    • Support prompt_cache_key for OpenAI and Azure chat completions - PR #20989
  • Pass-Through Endpoints

    • Add support for langchain_aws via LiteLLM passthrough - PR #20843
    • Add custom_body parameter to endpoint_func in create_pass_through_route - PR #20849
  • Vector Stores

    • Add target_model_names for vector store endpoints - PR #21089
  • General

    • Add output_config as supported param - PR #20748
    • Add managed error file support - PR #20838

Bugs​

  • General
    • Stop leaking Python tracebacks in streaming SSE error responses - PR #20850
    • Fix video list pagination cursors not encoded with provider metadata - PR #20710
    • Handle metadata=None in SDK path retry/error logic - PR #20873
    • Fix Spend logs pickle error with Pydantic models and redaction - PR #20685
    • Remove duplicate PerplexityResponsesConfig from LLM_CONFIG_NAMES - PR #21105

Management Endpoints / UI​

Features​

  • Access Groups

    • New Access Groups feature for managing model, MCP server, and agent access - PR #21022
    • Access Groups table and details page UI - PR #21165
    • Refactor model_ids to model_names for backwards compatibility - PR #21166
  • Policies

    • Allow connecting Policies to Tags, simulating Policies, viewing key/team counts - PR #20904
    • Guardrail pipeline support for conditional sequential execution - PR #21177
    • Pipeline flow builder UI for guardrail policies - PR #21188
  • SSO / Auth

    • New Login With SSO Button - PR #20908
    • M2M OAuth2 UI Flow - PR #20794
    • Allow Organization and Team Admins to call /invitation/new - PR #20987
    • Invite User: Email Integration Alert - PR #20790
    • Populate identity fields in proxy admin JWT early-return path - PR #21169
  • Spend Logs

    • Show predefined error codes in filter with user definable fallback - PR #20773
    • Paginated searchable model select - PR #20892
    • Sorting columns support - PR #21143
    • Allow sorting on /spend/logs/ui - PR #20991
  • UI Improvements

    • Navbar: Option to hide Usage Popup - PR #20910
    • Model Page: Improve Credentials Messaging - PR #21076
    • Fallbacks: Default configurable to 10 models - PR #21144
    • Fallback display with arrows and card structure - PR #20922
    • Team Info: Migrate to AntD Tabs + Table - PR #20785
    • AntD refactoring and 0 cost models fix - PR #20687
    • Zscaler AI Guard UI - PR #21077
    • Include Config Defined Pass Through Endpoints - PR #20898
    • Rename "HTTP" to "Streamable HTTP (Recommended)" in MCP server page - PR #21000
    • MCP server discovery UI - PR #21079
  • Virtual Keys

    • Allow Management keys to access user/daily/activity and team - PR #20124
    • Skip premium check for empty metadata fields on team/key update - PR #20598

Bugs​

  • Logs: Fix Input and Output Copying - PR #20657
  • Teams: Fix Available Teams - PR #20682
  • Spend Logs: Reset Filters Resets Custom Date Range - PR #21149
  • Usage: Request Chart stack variant fix - PR #20894
  • Add Auto Router: Description Text Input Focus - PR #21004
  • Guardrail Edit: LiteLLM Content Filter Categories - PR #21002
  • Add null guard for models in API keys table - PR #20655
  • Show error details instead of 'Data Not Available' for failed requests - PR #20656
  • Fix Spend Management Tests - PR #21088
  • Fix JWT email domain validation error message - PR #21212

AI Integrations​

Logging​

  • PostHog

    • Fix JSON serialization error for non-serializable objects - PR #20668
  • Prometheus

    • Sanitize label values to prevent metric scrape failures - PR #20600
  • Langfuse

    • Prevent empty proxy request spans from being sent to Langfuse - PR #19935
  • OpenTelemetry

    • Auto-infer otlp_http exporter when endpoint is configured - PR #20438
  • CloudZero

    • Update CBF field mappings per LIT-1907 - PR #20906
  • General

    • Allow MAX_CALLBACKS override via env var - PR #20781
    • Add standard_logging_payload_excluded_fields config option - PR #20831
    • Enable verbose_logger when LITELLM_LOG=DEBUG - PR #20496
    • Guard against None litellm_metadata in batch logging path - PR #20832
    • Propagate model-level tags from config to SpendLogs - PR #20769

Guardrails​

  • Policy Templates

    • New Policy Templates: pre-configured guardrail combinations for specific use-cases - PR #21025
    • Add NSFW policy template, toxic keywords in multiple languages, child safety content filter, JSON content viewer - PR #21205
    • Add toxic/abusive content filter guardrails - PR #20934
  • Pipeline Execution

    • Add guardrail pipeline support for conditional sequential execution - PR #21177
    • Agent Guardrails on streaming output - PR #21206
    • Pipeline flow builder UI - PR #21188
  • Zscaler AI Guard

    • Zscaler AI Guard bug fixes and support during post-call - PR #20801
    • Zscaler AI Guard UI - PR #21077
  • ZGuard

    • Add team policy mapping for ZGuard - PR #20608
  • General

    • Add logging to all unified guardrails + link to custom code guardrail templates - PR #20900
    • Forward request headers + litellm_version to generic guardrails - PR #20729
    • Empty guardrails/policies arrays should not trigger enterprise license check - PR #20567
    • Fix OpenAI moderation guardrails - PR #20718
    • Fix /v2/guardrails/list returning sensitive values - PR #20796
    • Fix guardrail status error - PR #20972
    • Reuse get_instance_fn in initialize_custom_guardrail - PR #20917

Spend Tracking, Budgets and Rate Limiting​

  • Prevent shared backend model key from being polluted by per-deployment custom pricing - PR #20679
  • Avoid in-place mutation in SpendUpdateQueue aggregation - PR #20876

MCP Gateway (12 updates)​

  • MCP M2M OAuth2 Support - Add support for machine-to-machine OAuth2 for MCP servers - PR #20788
  • MCP Server Discovery UI - Browse and discover available MCP servers from the UI - PR #21079
  • MCP Tracing - Add OpenTelemetry tracing for MCP calls running through AI Gateway - PR #21018
  • MCP OAuth2 Debug Headers - Client-side debug headers for OAuth2 troubleshooting - PR #21151
  • Fix MCP "Session not found" errors - Resolve session persistence issues - PR #21040
  • Fix MCP OAuth2 root endpoints returning "MCP server not found" - PR #20784
  • Fix MCP OAuth2 query param merging when authorization_url already contains params - PR #20968
  • Fix MCP SCOPES on Atlassian issue - PR #21150
  • Fix MCP StreamableHTTP backend - Use anyio.fail_after instead of asyncio.wait_for - PR #20891
  • Inject NPM_CONFIG_CACHE into STDIO MCP subprocess env - PR #21069
  • Block spaces and hyphens in MCP server names and aliases - PR #21074

Performance / Loadbalancing / Reliability improvements (8 improvements)​

  • Remove orphan entries from queue - Fix memory leak in scheduler queue - PR #20866
  • Remove repeated provider parsing in budget limiter hot path - PR #21043
  • Use current retry exception for retry backoff instead of stale exception - PR #20725
  • Add Semgrep & fix OOMs - Static analysis rules and out-of-memory fixes - PR #20912
  • Add Pyroscope for continuous profiling and observability - PR #21167
  • Respect ssl_verify with shared aiohttp sessions - PR #20349
  • Fix shared health check serialization - PR #21119
  • Change model mismatch logs from WARNING to DEBUG - PR #20994

Database Changes​

Schema Updates​

TableChange TypeDescriptionPRMigration
LiteLLM_VerificationTokenNew IndexesAdded indexes on user_id+team_id, team_id, and budget_reset_at+expiresPR #20736Migration
LiteLLM_PolicyAttachmentTableNew ColumnAdded tags text array for policy-to-tag connectionsPR #21061Migration
LiteLLM_AccessGroupTableNew TableAccess groups for managing model, MCP server, and agent accessPR #21022Migration
LiteLLM_AccessGroupTableColumn ChangeRenamed access_model_ids to access_model_namesPR #21166Migration
LiteLLM_ManagedVectorStoreTableNew TableManaged vector store tracking with model mappings-Migration
LiteLLM_TeamTable, LiteLLM_VerificationTokenNew ColumnAdded access_group_ids text arrayPR #21022Migration
LiteLLM_GuardrailsTableNew ColumnAdded team_id text column-Migration

Documentation Updates (14 updates)​

  • LiteLLM Observatory section added to v1.81.9 release notes - PR #20675
  • Callback registration optimization added to release notes - PR #20681
  • Middleware performance blog post - PR #20677
  • UI Team Soft Budget documentation - PR #20669
  • UI Contributing and Troubleshooting guide - PR #20674
  • Reorganize Admin UI subsection - PR #20676
  • SDK proxy authentication (OAuth2/JWT auto-refresh) - PR #20680
  • Forward client headers to LLM API documentation fix - PR #20768
  • Add docs guide for using policies - PR #20914
  • Add native thinking param examples for Claude Opus 4.6 - PR #20799
  • Fix Claude Code MCP tutorial - PR #21145
  • Add API base URLs for Dashscope (International and China/Beijing) - PR #21083
  • Fix DEFAULT_NUM_WORKERS_LITELLM_PROXY default (1, not 4) - PR #21127
  • Correct ElevenLabs support status in README - PR #20643

New Contributors​

  • @iver56 made their first contribution in PR #20643
  • @eliasaronson made their first contribution in PR #20666
  • @NirantK made their first contribution in PR #19656
  • @looksgood made their first contribution in PR #20919
  • @kelvin-tran made their first contribution in PR #20548
  • @bluet made their first contribution in PR #20873
  • @itayov made their first contribution in PR #20729
  • @CSteigstra made their first contribution in PR #20960
  • @rahulrd25 made their first contribution in PR #20569
  • @muraliavarma made their first contribution in PR #20598
  • @joaokopernico made their first contribution in PR #21039
  • @datzscaler made their first contribution in PR #21077
  • @atapia27 made their first contribution in PR #20922
  • @fpagny made their first contribution in PR #21121
  • @aidankovacic-8451 made their first contribution in PR #21119
  • @luisgallego-aily made their first contribution in PR #19935

Full Changelog​

v1.81.9.rc.1...v1.81.12.rc.1