knut

Posted on Sep 24, 2023

A Cautionary Rust Tale About IO Redirection

#rust #debugging

Yesterday I hit an absolutely baffling bug with some Rust code I'm working on. Illustrating it is a bit contrived, but it's a cautionary tales, and there's not much I like more than sharing cautionary tales. I'm going to illustrate the bug I hit in a toy executable instead of in the context of netcrab, which is unnecessarily complicated.

So there I was, trying to add a feature to netcrab to allow piping input and output to an executed command on the local machine. I'm getting input from a network socket and want to transfer that over to the child process's stdin; and the child process is producing stdout output that I want to send back over the network.

So I start by creating the child process. In my case I was trying to use an interactive program like cmd.exe.

let mut child = std::process::Command::new("cmd.exe")
    .stdin(std::process::Stdio::piped())
    .stdout(std::process::Stdio::piped())
    .stderr(std::process::Stdio::null())
    .spawn()
    .expect("Failed to execute command");

I set the input and output to be piped so I can get access to them. Next I spawn a thread to read from the child process's stdin and write it out somewhere--in the example it'll just go to the main program's stdout.

// Extract the ChildStdout object and send it to be drained.
spawn_child_stdout_reader(child.stdout.take().unwrap());

// ...

fn spawn_child_stdout_reader(mut child_stdout: ChildStdout) -> JoinHandle<()> {
    // Create a new thread.
    std::thread::Builder::new()
        // In the thread, loop forever.
        .spawn(move || loop {
            let mut buf = [0u8];
            // Keep trying to read from the child's stdout.
            match child_stdout.read(&mut buf) {
                // If we get a 0-byte read, the child's stdout ended.
                Ok(0) => {
                    eprintln!("Child stdout exited!");
                    return;
                }
                Ok(_num_bytes_read) => {
                    // If we got data, write it to the main stdout.
                    std::io::stdout().write(&buf);
                }
                Err(e) => eprintln!("err reading from child stdout: {}", e),
            }
        }).unwrap()
}

I ran this, and the child stdout exited right after printing out the initial prompt. It didn't make sense why it would close stdout so early.

At some point I decided to try adding in the stdin handling to see if it would help. Yes, I know this could be written much better with templating. I'll include that at the end.

spawn_child_stdin_writer(child.stdin.take().unwrap());

// ...

fn spawn_child_stdin_writer(mut child_stdin: ChildStdin) -> JoinHandle<()> {
    std::thread::Builder::new()
        .spawn(move || loop {
            let mut buf = [0u8];
            match std::io::stdin().read(&mut buf) {
                Ok(0) => { eprintln!("stdin exited!"); return; }
                Ok(_num_bytes_read) => {
                    child_stdin.write(&buf);
                }
                Err(e) => eprintln!("err writing to child stdin: {}", e),
            }
        }).unwrap()
}

And this actually helped. Why?? In my quest to figure it out, I started commenting out bits and pieces to narrow it down. But I was sleepy, so I didn't do a good job of it. I ended up at some point with this slight change.

fn spawn_child_stdin_writer(mut child_stdin: ChildStdin) -> JoinHandle<()> {
    std::thread::Builder::new()
        .spawn(move || loop {
            let mut buf = [0u8];
            match std::io::stdin().read(&mut buf) {
                Ok(0) => { eprintln!("stdin exited!"); return; }
                Ok(_num_bytes_read) => {
                    // TODO: re-enable
                    //child_stdin.write(&buf);
                }
                Err(e) => eprintln!("err writing to child stdin: {}", e),
            }
        }).unwrap()
}

And this version of the code I sort of forgot about and left in, becoming more and more confused about why my child stdout was exiting so fast.

The reveal

So what was happening here? Here's what's up.

Fundamentally, if you close the child's stdin stream and the child process tries to read from it, then the child process will close the stdout stream too. This should only happen with programs that actually try to read from stdin. Or maybe programs that gracefully accept stdin being closed without responding by also closing stdout. I admit I don't have 100% understanding of this. With interactive programs, you typically get both streams or neither, bucko.

So how is the child stdin getting dropped? This is is where it gets contrived and painful.

First, I started off only implementing the stdout reader portion, so when the child process object went out of scope, the stdin handle, not having been extracted using take(), got dropped along with it. Oops.

Next, when I added the stdin writer thread, it started working again, because something was now using the child stdin.

Finally, in my blundering attempts to understand what was happening, when I commented out the child_stdin.write(&buf); line in spawn_child_stdin_writer, even though child_stdin was moved into the function as an argument, because I didn't reference it in the thread's closure, the move || syntax didn't move it, so it got dropped outside of the thread. OH COME ON.

Caution

You know, when I write C++ code, I want to be real explicit about everything. For example, I insist on avoiding auto at all times because I've seen how it makes code ambiguous. But in Rust, I trust the language a lot and happily let it type deduce and do all kinds of other things automatically because it seems to have its act together.

But dang now Rust, I'm going to have to start being more caref--what? What's that? Huh? No, I didn't see any warning messag--yeah, of course I look at the warning messages. Fine, I'll go look.

warning: unused variable: "child_stdin"
  --> src\main.rs:24:33
   |
24 | fn spawn_child_stdin_writer(mut child_stdin: ChildStdin) -> JoinHandle<()> {
   |                                 ^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: "_child_stdin"

Shut up.

Appendix: Better reader/writer thread

As promised, here's a cleaner way to express the concept of spinning up a thread that drains a source and sends it all to a sink. Not efficient, but illustrative. We could all use some illustrations in our lives.

fn spawn_read_write_thread<TSource, TSink>(mut source: TSource, mut sink: TSink) -> JoinHandle<()>
where
    TSource: Read + Send + 'static,
    TSink: Write + Send + 'static,
{
    std::thread::Builder::new()
        .spawn(move || loop {
            let mut buf = [0u8];
            match source.read(&mut buf) {
                Ok(0) => { eprintln!("Source closed!"); return; }
                Ok(_) => { sink.write(&buf); }
                Err(e) => eprintln!("Err reading from source: {}", e),
            }
        })
        .unwrap()
}

DEV Community

A Cautionary Rust Tale About IO Redirection

The reveal

Caution

Appendix: Better reader/writer thread

Top comments (0)

Read next

ALKANES: Smart Contracts on Bitcoin UTXOs

Starting to Rust

Interview with Prabhu Kiran Konda, Creator of Snail AI!

📦 Comparing Rust’s Smart Pointers: Box, Rc, and Arc