Apache Thrift is a surprisingly popular data serialization and RPC library. I say surprisingly because, at the time of this writing, there is hardly any decent documentation that explains the elements of Thrift serialization. It is, however, easy to find tutorials that get you up and running quickly. This article assumes you are already familiar with such simple examples.

TBase objects

Data structures are defined using the Thrift IDL. The Thrift compiler then generates code from this IDL for a target programming language. The generated class usually inherits from a type known as TBase, defined in the Thrift library. Objects of type TBase are the ones that can be serialized or deserialized.

Protocols (Serializers)

Strictly speaking, modern Thrift (0.5 at the time of this writing) is a pluggable serialization library rather than one specific serialization format. The original format is simply called the binary protocol (class name TBinaryProtocol). Other available protocols are the compact protocol and the JSON protocol. The job of a protocol is to convert a TBase object to and from a byte stream.
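To make the "pluggable" idea concrete, here is a minimal sketch in Python. It does not use the Thrift library and does not reproduce Thrift's actual wire formats. Point, ToyBinaryCodec and ToyJsonCodec are hypothetical names invented for illustration. The only point is that the same object can be handed to interchangeable encoders.

```python
import json
import struct
from dataclasses import dataclass


class ToyBinaryCodec:
    """Hypothetical encoder: two big-endian 32-bit signed integers."""

    def encode(self, fields: dict) -> bytes:
        return struct.pack(">ii", fields["x"], fields["y"])

    def decode(self, data: bytes) -> dict:
        x, y = struct.unpack(">ii", data)
        return {"x": x, "y": y}


class ToyJsonCodec:
    """Hypothetical encoder: UTF-8 encoded JSON text."""

    def encode(self, fields: dict) -> bytes:
        return json.dumps(fields, sort_keys=True).encode("utf-8")

    def decode(self, data: bytes) -> dict:
        return json.loads(data.decode("utf-8"))


@dataclass
class Point:
    """Stand-in for a generated class; it knows its fields, not the format."""

    x: int
    y: int

    def to_bytes(self, codec) -> bytes:
        return codec.encode({"x": self.x, "y": self.y})

    @classmethod
    def from_bytes(cls, codec, data: bytes) -> "Point":
        fields = codec.decode(data)
        return cls(fields["x"], fields["y"])


point = Point(3, 4)
for codec in (ToyBinaryCodec(), ToyJsonCodec()):
    data = point.to_bytes(codec)
    # The round trip works regardless of which codec was plugged in.
    assert Point.from_bytes(codec, data) == point
```

The object never needs to know which format is in use, which is what lets Thrift swap the binary, compact and JSON protocols freely.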

Transports

In Thrift speak, a transport is the place a serialized representation of an object is written to (or read from, in the case of deserialization). Since the motivation for serialization tends to be either persistence or a network transfer, it is not surprising to find a transport for a generic stream (the I/O stream transport) and more specialized versions of it for various kinds of network endpoints. A memory transport is also available for applications that wish to work with the serialized bytes in memory. While most transports pass on the output of a serializer as is, a transport may alter the byte stream as it pleases. The most common use of this is adding bytes for message framing. Serializers, in general, do not produce message boundary markers. If multiple objects are written to the same stream, boundary markers make it much easier to tell where one message ends and the next begins. The framed transport is a good example of a transport that does exactly this kind of byte addition.
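The framing idea can be sketched in a few lines of Python using only the standard library. This is not the Thrift framed transport itself. write_frame and read_frame are hypothetical helpers, and the 4-byte big-endian length prefix is an assumption chosen for illustration.

```python
import io
import struct


def write_frame(stream, payload: bytes) -> None:
    """Write one message preceded by its length (assumed 4-byte big-endian)."""
    stream.write(struct.pack(">I", len(payload)))
    stream.write(payload)


def read_frame(stream):
    """Return the next message, or None when the stream is exhausted."""
    header = stream.read(4)
    if not header:
        return None
    if len(header) < 4:
        raise EOFError("truncated frame header")
    (length,) = struct.unpack(">I", header)
    payload = stream.read(length)
    if len(payload) < length:
        raise EOFError("truncated frame payload")
    return payload


# Write several serialized messages back to back into one stream.
buffer = io.BytesIO()
for message in (b"first", b"", b"third message"):
    write_frame(buffer, message)

# Read them back; the length prefixes recover the message boundaries.
buffer.seek(0)
frames = []
while True:
    frame = read_frame(buffer)
    if frame is None:
        break
    frames.append(frame)
```

Without the length prefix, the reader would see one undifferentiated run of bytes and would have to rely on the serializer to know where each object ends.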

Piecing it all together

A transport object is created and associated with a protocol object. This protocol object can then be used with TBase objects, which have read() and write() functions that take a protocol object as an argument. Thrift also comes with the tempting utility classes TDeserializer and TSerializer. They are, however, not the best choice, since they tend to restrict the choice of transports and protocols. Here is a pseudocode sample to illustrate the usage pattern:

class Point inherits TBase
{
  int x
  int y
}

Point obj

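// Deserialize: wrap the input file in a transport, then the transport in a protocol
FileStream ifs = open("path to file", "r")
TBinaryProtocol tp1(TIOStream(ifs))
obj.read(tp1)

obj.y = 100

// Serialize: the same layering over an output file
FileStream ofs = open("path to file", "w")
TBinaryProtocol tp2(TIOStream(ofs))
obj.write(tp2)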
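For readers who want something they can actually run, here is a rough Python analogue of the same read, modify, write pattern. It uses only the standard library, not Thrift. ToyProtocol and Point are hypothetical stand-ins for a protocol and a generated TBase class, io.BytesIO plays the role of a memory transport, and the byte layout is made up for the example.

```python
import io
import struct


class ToyProtocol:
    """Stand-in for a protocol: moves 32-bit integers to and from a transport."""

    def __init__(self, transport):
        self.transport = transport  # any file-like object with read()/write()

    def write_i32(self, value: int) -> None:
        self.transport.write(struct.pack(">i", value))

    def read_i32(self) -> int:
        return struct.unpack(">i", self.transport.read(4))[0]


class Point:
    """Stand-in for a generated class with read()/write() taking a protocol."""

    def __init__(self, x: int = 0, y: int = 0):
        self.x = x
        self.y = y

    def write(self, protocol) -> None:
        protocol.write_i32(self.x)
        protocol.write_i32(self.y)

    def read(self, protocol) -> None:
        self.x = protocol.read_i32()
        self.y = protocol.read_i32()


# Produce some serialized bytes to start from.
source_transport = io.BytesIO()
Point(3, 4).write(ToyProtocol(source_transport))

# Deserialize from a transport over those bytes, then modify the object.
obj = Point()
obj.read(ToyProtocol(io.BytesIO(source_transport.getvalue())))
obj.y = 100

# Serialize the modified object into a second transport.
final_transport = io.BytesIO()
obj.write(ToyProtocol(final_transport))
```

Swapping io.BytesIO for an open file or a socket's file object changes where the bytes go without touching Point or ToyProtocol, which is the separation the transport and protocol layers are meant to give you.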