Understanding I/O in the zkVM

In the Hello World Tutorial, we had a brief introduction to how to perform I/O operations in the zkVM. Now we'll dive deeper into the subject. Keep reading to learn more about:

The different types of data in the zkVM
How to handle inputs and outputs
Best practices for handling I/O in the zkVM

Setting the Stage

We can think of programming in the zkVM as moving between two worlds: the host, where computation works the same as in any regular program, and the guest, where computation is done in a zero-knowledge environment.

Since the guest runs in a zk environment, it has fewer ways to obtain data than the host does. The only way to send data between the host and guest is through the Executor Environment, and that transfer happens through file descriptors. The zkVM specifies four default file descriptors: stdin, stdout, stderr, and journal. They're defined in the fileno module.

The zkVM Data Model

The zkVM has a data model that distinguishes between public and private data. By "public" we mean data that is included in the journal and becomes part of the proof, while "private" data is only accessible by the host and guest.

If your application handles sensitive data, it's important to know exactly which data is committed to the journal, so that no sensitive data ends up in the proof.
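To make the distinction concrete, here is a minimal std-only sketch that models the zkVM rather than using it. The `GuestOutput` struct and the `toy_age_guest` function are hypothetical names invented for this illustration, and plain byte vectors stand in for the journal and the private stdout channel.

```rust
/// Toy stand-in for what a guest run produces. These names are
/// hypothetical; they are not part of the risc0_zkvm API.
#[derive(Debug, Default, PartialEq)]
pub struct GuestOutput {
    /// Public: ends up in the proof, visible to anyone holding the receipt.
    pub journal: Vec<u8>,
    /// Private: visible only to the host that ran the guest.
    pub stdout: Vec<u8>,
}

/// Models a guest that receives a private age and makes public only the
/// yes/no answer to "is this person at least 18?".
pub fn toy_age_guest(private_age: u8) -> GuestOutput {
    let mut out = GuestOutput::default();
    // Only the boolean fact is made public (1 = yes, 0 = no).
    out.journal.push(u8::from(private_age >= 18));
    // The raw age goes back to the host privately and never reaches the journal.
    out.stdout.push(private_age);
    out
}
```

In this model, anyone who sees `journal` learns the answer but not the age, while the host that ran the guest still sees everything in `stdout`.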

Sending Data from the Host to the Guest

The stdin file descriptor is used to send input data from the host to the guest. In the host, it's possible to set the input data in the Executor Environment through the methods write and write_slice. The guest has corresponding functions read and read_slice to read the input data.

Writing to the guest's stdin is as simple as the code below. For a real example, see the Voting Machine example.

src/main.rs
use risc0_zkvm::ExecutorEnv;

let input = "Hello, guest!";
let env = ExecutorEnv::builder().write(&input)?.build()?;

Since we mentioned the read/write methods and their _slice variants, let's take a moment to understand the difference between them.

A Note on Performance

When sending data from host to guest and vice-versa, we can either (de)serialize the data or send raw bytes. It's a trade-off between convenience and performance. By using the standard functions read, write and commit (which we'll cover in the following sections), the zkVM performs automatic (de)serialization of the data. This enables easy handling of complex data structures, but it comes with a performance cost. Using the _slice variants, on the other hand, allows for sending raw bytes, which is faster but usually requires less ergonomic code.

It is good practice to use the standard functions first, switching to the _slice variants when performance becomes an issue or when optimizing the code to save on cycles. Since both approaches can be used side by side, moving from one to the other shouldn't be a problem. We have a more detailed explanation of guest code optimization if you want to learn more about this topic.
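The trade-off can be illustrated with a toy comparison. The sketch below uses an invented "typed" encoding (a length prefix plus one 32-bit word per byte) next to a raw pass-through. This format is an assumption chosen only to show why a structured encoding can be larger than raw bytes; it is not the zkVM's actual serialization format, and all function names are hypothetical.

```rust
/// Toy "typed" encoding: a length prefix followed by one 32-bit word per
/// byte. This is an illustrative format, not the zkVM's actual one.
pub fn encode_words(payload: &[u8]) -> Vec<u32> {
    let mut words = Vec::with_capacity(payload.len() + 1);
    words.push(payload.len() as u32);
    words.extend(payload.iter().map(|&b| u32::from(b)));
    words
}

/// Inverse of `encode_words`; returns `None` if the input is malformed.
pub fn decode_words(words: &[u32]) -> Option<Vec<u8>> {
    let (&len, rest) = words.split_first()?;
    if rest.len() != len as usize {
        return None;
    }
    // Every word must fit in a single byte to be a valid encoding.
    rest.iter()
        .map(|&w| if w <= u32::from(u8::MAX) { Some(w as u8) } else { None })
        .collect()
}

/// Raw path: the bytes are passed through untouched, so the receiver
/// must already know how to interpret them.
pub fn encode_raw(payload: &[u8]) -> &[u8] {
    payload
}
```

For a 4-byte payload, the toy typed encoding takes five 32-bit words (20 bytes), while the raw path takes 4 bytes. The price of the raw path is that the layout has to be agreed on and decoded by hand.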

Sending Private Data from the Guest

Back to where we were: after getting data from the host and performing some transformations on it, we might want to send private data back. Both the stdout and stderr file descriptors are used to send data from the guest to the host privately, and a convenient way to send data to the host's stdout is the write method.

Writing to the host's stdout is as simple as the code below. For a real example, see the Voting Machine example.

methods/guest/src/main.rs
use risc0_zkvm::guest::env;

let data = "Hello, host!";
// Serialize `data` and send it privately to the host's stdout.
env::write(&data);

On the host side, it's possible to read data coming from the guest by reading the buffer that was originally passed to the Executor Environment through its methods stdout and stderr.
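The buffer hand-off described above can be modelled with the standard library alone. In this sketch, `toy_hello_guest` and `host_round_trip` are hypothetical functions, and a `Vec<u8>` plays the role of the buffer the host would register as the guest's stdout. It shows the shape of the pattern, not the risc0_zkvm API.

```rust
use std::io::{self, Write};

/// Hypothetical stand-in for a guest: it takes its input and writes a
/// private result to whatever sink the host supplied as "stdout".
pub fn toy_hello_guest(input: &str, stdout: &mut impl Write) -> io::Result<()> {
    let reply = format!("Hello, host! You sent: {}", input);
    stdout.write_all(reply.as_bytes())
}

/// Host side: hand the guest a buffer, then read what it wrote.
pub fn host_round_trip(input: &str) -> io::Result<String> {
    let mut output = Vec::new(); // the buffer the guest fills
    toy_hello_guest(input, &mut output)?;
    // The host owns the buffer, so it can inspect the bytes once the guest is done.
    String::from_utf8(output).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}
```

The key point is ownership: the host creates the buffer before execution and reads it afterwards, so nothing written there needs to appear in the journal.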

info

The private data alluded to here is not included in the proof, but it is accessible to the host. This means that the party generating the proof can access the data, so you should take this into consideration. If you don't want to let private data leak to any other party, it's possible to achieve full secrecy by proving locally.

tip

A good practice for handling sensitive data is to use proof composition: split the proving process into smaller parts, prove the part that touches sensitive data locally, and combine it with the rest of the program later through composition in a capable proving service like Bonsai to speed up proof generation.

Sending Public Data from the Guest

We saw how to send private data directly to the host, but we might also want to commit public data, attesting to some fact that we want to share with the world. We can do so by sending this data to the journal file descriptor. This data will be included in the proof and can be accessed by any party through the Receipt after the proving process. Writing to the journal is done through the methods commit and commit_slice.

Writing to the journal is as simple as the code below. For a real example, see the Voting Machine example.

methods/guest/src/main.rs
use risc0_zkvm::guest::env;

let data = "Hello, journal!";
// Serialize `data` and commit it to the public journal.
env::commit(&data);

On the host side (or in any other regular program that has access to the Receipt), reading from the journal can be achieved by simply calling the Journal's decode method.

Reading Private Data in the Host

Once we have sent data from the guest, we can read it back in the host with the from_slice method, which deserializes the data from a buffer into the desired type.

Reading from the host's stdout is as simple as the code below. For a real example, see the Voting Machine example.

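src/main.rs
use risc0_zkvm::serde::from_slice;
// `output` is the buffer that was passed to the Executor Environment's stdout.
let result: Type = from_slice(&output)?;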

If the data was sent in raw form using a _slice variant, you'll need to decode the bytes manually.
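As an illustration of what that manual decoding can look like, here is a std-only sketch. The `Tally` struct and its 6-byte little-endian layout are assumptions made up for this example; any real program would define its own layout and must use the same one on both sides.

```rust
/// Hypothetical result type, used only for this illustration.
#[derive(Debug, PartialEq)]
pub struct Tally {
    pub count: u32,
    pub winner: u16,
}

/// Guest-side view: lay the fields out by hand as little-endian bytes.
pub fn tally_to_bytes(t: &Tally) -> [u8; 6] {
    let mut buf = [0u8; 6];
    buf[..4].copy_from_slice(&t.count.to_le_bytes());
    buf[4..].copy_from_slice(&t.winner.to_le_bytes());
    buf
}

/// Host-side view: check the length, then rebuild each field by hand.
pub fn tally_from_bytes(bytes: &[u8]) -> Option<Tally> {
    if bytes.len() != 6 {
        return None;
    }
    let count = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
    let winner = u16::from_le_bytes([bytes[4], bytes[5]]);
    Some(Tally { count, winner })
}
```

Compared with the single from_slice call, every field offset and width is now the programmer's responsibility, which is the ergonomic cost of the raw path.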

Reading Public Data in the Host

Reading public data is done by accessing the Journal that is contained in the resulting Receipt after the proving process. This can be done by calling the decode method on the journal instance.

src/main.rs
// Produce a receipt by proving the specified ELF binary.
let receipt = prover.prove(env, ELF).unwrap().receipt;
// Decode the journal to access the public data; `Type` is the type the guest committed.
let public_data: Type = receipt.journal.decode()?;

Sharing Data Structures Between Host and Guest

A good pattern to follow when handling shared data structures between the host and guest is to have a common core module that contains the shared data structures. This way, both host and guest can import common data structures and consume them as needed.

A good example of this pattern being used is the JWT Validator. In its core module, it defines common structures that will be later used in the host and guest modules. Similarly, the Chess example does the same with its core being used by the host and guest.

Other examples leveraging this pattern can be found on the examples page.
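The shape of this pattern can be sketched in a single file. In a real project the three parts below would live in three separate crates (core, host, and guest); here they are only separated by comments so the sketch stays self-contained. `MoveRequest`, `host_build_input`, and `guest_describe` are hypothetical names and are not taken from any of the examples mentioned above.

```rust
// --- would live in the shared `core` crate ---

/// The one definition both sides agree on.
#[derive(Debug, Clone, PartialEq)]
pub struct MoveRequest {
    pub board: String,
    pub mv: String,
}

// --- would live in the host crate ---

/// Builds the shared type that the host would send as guest input.
pub fn host_build_input(board: &str, mv: &str) -> MoveRequest {
    MoveRequest {
        board: board.to_owned(),
        mv: mv.to_owned(),
    }
}

// --- would live in the guest crate ---

/// Consumes the very same type on the guest side.
pub fn guest_describe(req: &MoveRequest) -> String {
    format!("{} on {}", req.mv, req.board)
}
```

Because both sides depend on one definition, a change to the struct is picked up by host and guest at compile time instead of surfacing as a decoding error at run time.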

Putting It All Together

Now that we've covered some details about I/O in the zkVM, let's see how a real program implements it in practice.

We'll cover the Voting Machine example. This example is a simple voting machine that allows users to vote for a candidate. We'll link to relevant parts of the code as we go along, and it's expected that you open the linked files in a separate tab to follow along.

The program is a state machine that supports three operations:

Init: configures the initial state
Submit: allows a user to submit a vote
Freeze: reveals the result of the election and closes the voting
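The three operations listed above can be modelled as a small state machine in plain Rust. This is a simplified sketch under stated assumptions: `ToyVotingState`, `Op`, and `step` are hypothetical names, and the rules (for example, rejecting votes after a freeze) are choices made for this illustration rather than a description of the real example's code.

```rust
/// Hypothetical state for a toy voting machine; the field and type names
/// are assumptions and do not mirror the real example's core module.
#[derive(Debug, Clone, PartialEq)]
pub struct ToyVotingState {
    pub frozen: bool,
    pub tallies: Vec<u32>, // one counter per candidate
}

/// The three operations of the state machine.
pub enum Op {
    Init { candidates: usize },
    Submit { candidate: usize },
    Freeze,
}

/// Applies one operation and returns the next state, or an error message
/// if the operation is not allowed in the current state.
pub fn step(state: Option<ToyVotingState>, op: Op) -> Result<ToyVotingState, &'static str> {
    match (state, op) {
        (None, Op::Init { candidates }) => Ok(ToyVotingState {
            frozen: false,
            tallies: vec![0; candidates],
        }),
        (Some(_), Op::Init { .. }) => Err("already initialized"),
        (None, _) => Err("not initialized"),
        (Some(s), _) if s.frozen => Err("voting is closed"),
        (Some(mut s), Op::Submit { candidate }) => {
            let slot = s.tallies.get_mut(candidate).ok_or("unknown candidate")?;
            *slot += 1;
            Ok(s)
        }
        (Some(mut s), Op::Freeze) => {
            s.frozen = true;
            Ok(s)
        }
    }
}
```

Each call to `step` corresponds to one guest execution: the previous state goes in as input, and the next state comes out.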

First, we can see that all common data structures are defined in the core module.

The host has functions for each of the operations, and on each of them some input is sent to the guest. In the submit and freeze functions the host also passes a buffer to the guest to be filled with the result of the operation, but we'll get there in time.

Analyzing the init function first, we can see that the host simply sends the initial state to the guest using the write method. That data is then read by the init guest program and immediately committed to the journal. Note how easy it is to operate on data structures when using the standard read and commit functions: no bit manipulation is needed. It'd be a different story if we were using the _slice variants. Since we don't have to worry about performance-critical code here, we can safely use the standard functions.

Moving on to the submit function, we can see that in the host an output buffer is passed to the stdout file descriptor of the guest. It'll be filled with values produced by the guest and then read by calling the from_slice method on the buffer. This can be seen in this line. The result that was filled in the buffer came from the write method call in the guest. Remember, the write method is used to send data to the host's stdout file descriptor.

Still in the submit function, note how the private output from the guest is used, and how it's relevant to the distinction between public and private data in this case. In the example presented, the VotingMachineState struct is changed during the guest's execution. But we don't want to commit (make public) the state of the voting machine, so we use the stdout file descriptor to send the result back to the host. This way, we can update the voting machine state at each iteration while preserving its privacy.

Finally, in the freeze function, the same patterns of sending and receiving data are repeated.

Conclusions

In this guide, we've covered the basics of I/O in the zkVM. We've seen how to send data from the host to the guest and vice-versa, how private and public data are distinguished, and how to commit data to the journal. We also covered the trade-offs between using the standard functions and their _slice variants and showed through the Voting Machine example how to implement I/O in practice. There are more examples available on the examples page that you can use as reference if you wish.

Happy coding!
