Skip to content

fix(stdio_client): tolerate invalid UTF-8 from child stdout#2456

Open
shaun0927 wants to merge 2 commits intomodelcontextprotocol:mainfrom
shaun0927:fix/stdio-client-invalid-utf8
Open

fix(stdio_client): tolerate invalid UTF-8 from child stdout#2456
shaun0927 wants to merge 2 commits intomodelcontextprotocol:mainfrom
shaun0927:fix/stdio-client-invalid-utf8

Conversation

@shaun0927
Copy link
Copy Markdown

Fixes #2454

Summary

stdio_client() currently decodes child stdout with encoding_error_handler="strict", so a malformed UTF-8 line can raise during TextReceiveStream(...) iteration and crash the transport task group.

This PR makes the client follow the same strategy that stdio_server() already uses after #2302:

  • decode with errors="replace"
  • let malformed lines fail JSON validation
  • surface those failures as in-stream exceptions instead of killing the transport

It also treats abrupt child shutdowns (BrokenResourceError / ConnectionResetError) as normal shutdown-time conditions in the background stdio tasks.

Motivation and Context

PR #2302 fixed the server side so invalid UTF-8 on stdin becomes a parse error instead of a transport crash. The client side still behaved asymmetrically: a buggy or non-compliant child process could kill the Python client with a single malformed stdout line even if the next line was valid JSON-RPC.

That makes the client less robust than the server for the same class of malformed stdio input.

How Has This Been Tested?

Added a focused regression test in tests/client/test_stdio.py that spawns a child process which writes:

  1. one malformed UTF-8 line (b"\xff\xfe\n")
  2. one valid JSON-RPC line

Without this patch, the transport fails before the valid line is delivered.

With this patch:

  • the first item from the read stream is an Exception
  • the second item is the valid SessionMessage

Local verification:

  • uv run --frozen pytest tests/client/test_stdio.py
  • uv run --frozen ruff check src/mcp/client/stdio.py tests/client/test_stdio.py
  • uv run --frozen pyright src/mcp/client/stdio.py

Breaking Changes

None.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

The stdio server path already degrades invalid UTF-8 into parse errors instead of killing the transport, but the client side still defaulted to strict decoding and could crash the whole task group on a single bad line from a child process. This changes the default decode strategy to replacement, broadens shutdown-time exception handling for abrupt child exits, and adds a regression test that proves malformed output becomes an in-stream error while the next valid JSON-RPC message still arrives.

Constraint: Must preserve stdio_client cleanup behavior when subprocesses exit early or reset pipes
Rejected: Add a new configuration flag only for the regression test | leaves the default transport behavior asymmetric and crash-prone
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep client and server stdio malformed-UTF-8 handling symmetric unless the protocol deliberately diverges
Tested: uv run --frozen pytest tests/client/test_stdio.py; uv run --frozen ruff check src/mcp/client/stdio.py tests/client/test_stdio.py; uv run --frozen pyright src/mcp/client/stdio.py
Not-tested: Full multi-platform matrix outside local macOS / Python 3.10 run
The malformed-UTF8 regression test now executes the JSON-parse failure branch in stdio_client, so the old pragma no longer reflects reality and breaks CI's strict-no-cover gate. This follow-up removes the stale pragma without changing behavior.

Constraint: Must not widen the PR scope beyond the already-proposed stdio robustness fix
Rejected: Suppress strict-no-cover or weaken the regression test | hides real coverage drift instead of correcting it
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: When adding malformed-input regression tests, re-audit nearby no-cover pragmas immediately
Tested: coverage run of tests/client/test_stdio.py plus strict-no-cover locally up to the updated branch execution path
Not-tested: Full upstream CI rerun after push
@shaun0927
Copy link
Copy Markdown
Author

for additional signal: mcp-claude[bot] independently reproduced this on both main and v1.x and arrived at the same one-line default change (issue #2454 comment). all 25 CI checks are green; happy to address review feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

stdio_client crashes on malformed UTF-8 from child stdout instead of surfacing parse error

1 participant