An Introduction to gRPC

gRPC is an open source inter-process communication framework used in high-performance applications in cloud computing, Internet of Things (IoT), mobile computing, and microservices environments.

This article examines how gRPC works, how to use it, and how it compares to other popular API architectures. It also discusses a unique use case where gRPC excels.

gRPC: A brief history

In the early 2000s, Google developed an RPC framework named Stubby for internal use. Stubby was built around a powerful RPC core capable of handling billions of requests per second. In 2015, Google released an open source version of the project called gRPC (originally short for Google Remote Procedure Call, though the meaning of the “g” now changes with each release). It’s a lightweight, language-agnostic way to define and enforce service contracts in a heterogeneous production environment.

Due to gRPC’s extensive applicability and growing popularity, the Cloud Native Computing Foundation (CNCF) accepted the framework in February 2017. It is currently at the incubating project level.

How gRPC works

Like other RPCs, gRPC lets us call a method that executes remotely as if we were executing it locally. Unlike other RPCs, however, gRPC uses Google’s protocol buffers (protobuf) as its default interface definition language instead of JSON or XML. This makes it easier to build, connect, operate, and debug applications that use multiple first- and third-party components and technologies.

Data structure

Protocol buffers and HTTP/2 are the primary components that give gRPC an advantage over other RPC frameworks. Protocol buffers let you define your data structure once and let the gRPC software handle the implementation details.

You define a service in a .proto text file, which determines how you want the data to be structured. gRPC lets you define four kinds of service methods:

  1. A unary RPC resembles a function call and consists of a single client request and a single server response.
  2. A server streaming RPC consists of a single client request to which the server responds by providing a stream containing a sequence of messages.
  3. In a client streaming RPC, the client uses a stream to send a sequence of messages to the server. The server then reads the messages and returns a response to the waiting client.
  4. In a bidirectional streaming RPC, the client and server may independently read and write to the same stream in any order. The order of messages in each direction is preserved.
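All four method types can be declared in a single service definition. The following .proto sketch uses invented names (SensorService, Reading, and so on) purely for illustration:

```protobuf
syntax = "proto3";

package demo;

// A hypothetical service showing all four gRPC method types.
service SensorService {
  // 1. Unary: one request, one response.
  rpc GetReading(ReadingRequest) returns (Reading);
  // 2. Server streaming: one request, a stream of responses.
  rpc WatchReadings(ReadingRequest) returns (stream Reading);
  // 3. Client streaming: a stream of requests, one response.
  rpc UploadReadings(stream Reading) returns (UploadSummary);
  // 4. Bidirectional streaming: both sides stream independently.
  rpc Calibrate(stream Reading) returns (stream Reading);
}

message ReadingRequest { string sensor_id = 1; }
message Reading { string sensor_id = 1; double value = 2; }
message UploadSummary { int32 count = 1; }
```

The only syntactic difference between the four shapes is the stream keyword on the request type, the response type, or both.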

Then, you use a protoc compiler plugin to generate code for the client and server APIs. This code consists of access classes in the officially or third-party-supported language you specify, plus methods to serialize the data into a binary format and parse it back.
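To make that binary format concrete, protocol buffers encode integers as base-128 varints: seven payload bits per byte, with the high bit marking continuation. A minimal pure-Python sketch of the scheme (illustrative only, not gRPC's generated code):

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a protobuf-style base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F              # take the low 7 bits
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow: set continuation bit
        else:
            out.append(byte)         # last byte: continuation bit clear
            return bytes(out)

def decode_varint(data: bytes) -> int:
    """Decode a base-128 varint back into an integer."""
    result = shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:          # continuation bit clear: done
            return result
        shift += 7
    raise ValueError("truncated varint")

print(encode_varint(300).hex())      # ac02
print(decode_varint(b"\xac\x02"))    # 300
```

Small values fit in a single byte, which is one reason protobuf messages are so much more compact on the wire than their JSON equivalents.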

On the server side, the API consists of two main components: a service that implements the RPC methods a client calls and a gRPC server that decodes incoming requests and encodes outgoing responses.

On the client side, a stub object represents the API, which implements the RPC methods defined in the service. The client wraps the method parameters in a protocol buffer message type for handling by gRPC.

Stream lifecycle

Once the client calls a stub method, the basic gRPC lifecycle takes the following pattern:

  1. gRPC notifies the server, providing client metadata and the method name. The client can specify a deadline that states how long it will wait for the RPC to complete.
  2. The server can either wait for the client’s request message or return its metadata in preparation for its response. The application determines which happens first.
  3. The server processes the request and creates a response.
  4. If it successfully processes the request, the server returns a response to the client along with status details (a status code and any optional messages or metadata). This completes the call on the server side.
  5. The client receives the response if the status is OK, or a detailed error message if it’s not. This completes the call on the client side.

In a server streaming RPC, the server returns a stream of messages in response to the client request and sends its status details after sending all messages in the stream.

In a client streaming RPC, the client instead sends a stream of messages. The server typically sends its single response message after it has received all the client’s messages.

In a bidirectional streaming RPC, the client and server may independently send and receive messages at any time. The server must still return its metadata before responding to the first client request, but it can send data anytime. This permits the client and server to use a “conversational” approach to messaging, where the client can construct subsequent requests based on the responses it receives from the server.

The call’s success is determined locally and independently by the server and client, either of which can cancel the call at any time. For example, a server may send its responses after the deadline specified by the client (in which case the call fails on the client side), or it may be unable to parse the client’s request (in which case the call fails on the server side first).

In general, gRPC’s combination of protocol buffers and HTTP/2 ensures fast data encoding and decoding. In most cases, gRPC performs better than GraphQL and REST. The gRPC framework is also flexible, generating client and server code in many programming languages.

gRPC versus REST and GraphQL

gRPC versus REST

REST APIs can use the well-supported HTTP/2 protocol, but they often still use the older HTTP/1.1 protocol.

gRPC uses HTTP/2 as the transport protocol, which is more efficient than plain text-based HTTP/1.1. The HTTP/1.1 responses must come back in the order received, which can cause a processing bottleneck. You can use multiple TCP connections to get around this, but this method is resource-expensive.

HTTP/2 supports multiplexed communication, which reduces network use by dividing a single TCP connection into multiple concurrent streams. Streams are divided into frames, which are encoded in binary rather than the plain text used in HTTP/1.1. The HTTP/2 frames are tagged, so they can be prioritized for efficient processing.
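Concretely, every HTTP/2 frame begins with a fixed 9-byte header whose 31-bit stream identifier lets the receiver demultiplex interleaved frames on a single connection. A small stdlib Python parser of that header, per RFC 7540 (illustrative, not part of gRPC itself):

```python
def parse_frame_header(header: bytes):
    """Parse the fixed 9-byte HTTP/2 frame header (RFC 7540, section 4.1)."""
    if len(header) != 9:
        raise ValueError("HTTP/2 frame headers are exactly 9 bytes")
    length = int.from_bytes(header[0:3], "big")          # 24-bit payload length
    frame_type, flags = header[3], header[4]             # 8-bit type, 8-bit flags
    stream_id = int.from_bytes(header[5:9], "big") & 0x7FFFFFFF  # 31-bit stream id
    return length, frame_type, flags, stream_id

# A DATA frame (type 0x0) with the END_STREAM flag (0x1),
# carrying an 8-byte payload on stream 1:
header = bytes([0x00, 0x00, 0x08, 0x00, 0x01, 0x00, 0x00, 0x00, 0x01])
print(parse_frame_header(header))  # (8, 0, 1, 1)
```

Because the stream identifier travels with every frame, responses no longer have to come back in request order, which is what eliminates the HTTP/1.1 bottleneck described above.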

The REST architecture can use various data formats — including HTML, JSON, XML, and other text-based formats — to encode data. This makes it flexible and widely adopted, but the underlying mechanisms are inefficient for service-to-service communication.

The REST architectural style decouples a system’s front end from its back end. This decoupling makes REST a popular way to implement web applications, as you can scale, build, or deploy each service independently without affecting the others.

Compared to XML and JSON, protocol buffers make implementing new services and features easier. The older proto2 syntax can sometimes make maintenance difficult because it supports the required field type, but the newer proto3 syntax drops required fields and is more forward-compatible, and protocol buffers in general are designed to maintain backward compatibility.

REST-based microservices communicate through uniform interfaces. All applications interface using a standardized protocol, and clients can access an app’s available resources without knowing them in advance. In contrast, gRPC interfaces are not self-describing: you must rely on a message’s .proto file to decode it, although that file can live outside your codebase and be made publicly available.

gRPC versus GraphQL

GraphQL is a specification for describing server-side data as a graph of nodes connected by edges. Following a path from node to node never loops back on itself: the result is a connected, acyclic graph.

GraphQL specifies that you define a schema describing the data available for a client to query. A schema is written in a human-readable schema definition language and serves as a clearly defined contract between the server and the client. The contract lets the client predict the nature of responses despite knowing little about the server.
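For comparison, a minimal schema in GraphQL's schema definition language might look like the following; the User and Post types here are invented for illustration:

```graphql
type User {
  id: ID!
  name: String!
  posts: [Post!]!   # edge from a User node to its Post nodes
}

type Post {
  id: ID!
  title: String!
  author: User!     # edge back to the authoring User
}

type Query {
  user(id: ID!): User
}
```

A client can then ask for exactly the fields it needs, such as a user's name together with the titles of their posts, in one request.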

This well-defined structure lets a client send complex requests that return exactly the data it needs in a single response. Requests are typically served over HTTP, although you can also use WebSockets as the underlying protocol. Responses may be large, whereas gRPC performs best as a low-latency communication method handling streams of small messages.

A gRPC framework use case

gRPC is particularly well suited to Kubernetes, where it functions as the default protocol for the Container Runtime Interface (CRI). The CRI handles communication between each node’s kubelet (the agent that manages the pods on a node) and the container runtime, which directly controls the containers running in those pods. To do its job, the kubelet must communicate with the container runtime through a reliable channel.

The channel must also be extensible enough to support the varied ecosystem of Kubernetes runtimes. Because the gRPC software generates code, developers can focus on higher-level data structure concerns without worrying about the implementation details of the rapidly changing Kubernetes specification.

The kubelet uses gRPC to communicate directly with the container runtime or with a CRI shim that corresponds to the runtime. The runtime or shim hosts a gRPC server, and the kubelet hosts a client.
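For reference, the CRI contract itself is expressed as a gRPC service in a .proto file. The abridged sketch below is modeled on Kubernetes' runtime.proto; consult the release you target for the exact, current definitions:

```protobuf
// Abridged sketch modeled on the CRI's runtime.proto; the real file
// defines many more methods (and a separate ImageService).
service RuntimeService {
  // Sandbox (pod-level) lifecycle.
  rpc RunPodSandbox(RunPodSandboxRequest) returns (RunPodSandboxResponse) {}
  rpc StopPodSandbox(StopPodSandboxRequest) returns (StopPodSandboxResponse) {}

  // Container lifecycle within a sandbox.
  rpc CreateContainer(CreateContainerRequest) returns (CreateContainerResponse) {}
  rpc StartContainer(StartContainerRequest) returns (StartContainerResponse) {}
}
```

Any runtime that implements this gRPC service can plug into the kubelet, which is what makes the runtime ecosystem interchangeable.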

gRPC provides a lightweight, fast, and efficient framework. The streaming binary format of messages encoded using protocol buffers is less CPU-intensive to parse and more suitable for communication between internal components. In this case, the “remote” in “remote procedure call” may only be millimeters, so packet latency is negligible, and reduced processing overhead makes a more significant difference.

In addition to its compact serialized format, gRPC’s multiplexed streaming further reduces connection overhead and makes container management seamless even on devices with less powerful CPUs.

Final thoughts on gRPC

Initially used exclusively on Google services, gRPC has grown in popularity and applicability to become a widely accepted open source RPC framework. Like REST and GraphQL, the gRPC framework supports multiple programming languages. However, while REST focuses on resources and GraphQL focuses on entities and their relationships, gRPC excels in facilitating action-based communication.

The gRPC framework offers a low-overhead communication method that explicitly supports multiplexed, bidirectional streaming, verbose error handling, and strongly typed code generation for servers and clients. It uses protocol buffers to encode and decode data for transport over HTTP/2.

Its built-in security, reliability, and ease of use apply to many situations demanding high performance, low latency, and flexible data handling at endpoints. Its proven functionality as Google’s internal messaging technology is enhanced by its open source status and support from the Cloud Native Computing Foundation, making it a popular choice for cloud and microservice applications. This includes a vital role as a protocol used for Kubernetes container management.

Want to learn more about managing your Kubernetes instance? Check out these articles about Kubernetes management on the Mattermost blog. 
This blog post was created as part of the Mattermost Community Writing Program and is published under the CC BY-NC-SA 4.0 license. To learn more about the Mattermost Community Writing Program, check this out.

Bridget Mwikali is an experienced technical writer interested in sharing knowledge in an easy-to-understand language. She has written several articles and tutorials for different audiences.