A Complete Guide to Buffers in Node.js
In Node.js, buffers are a special type of object that can store raw binary data. A buffer represents a chunk of memory - typically RAM - allocated in your computer. Once set, the size of a buffer cannot be changed.
A buffer stores bytes. A byte is a sequence of eight bits. Bits are the most basic unit of storage on your computer; each one holds a value of either 0 or 1.
Node.js exposes the Buffer class in the global scope (you don’t need to import or require it like other modules). With this API, you get a series of functions and abstractions to manipulate raw binary data.
A buffer in Node.js looks like this:
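As a minimal sketch, assuming an arbitrary 10-character ASCII string as input (the string is my own pick for illustration), this is what Node.js prints when you log a buffer:

```javascript
// The input string is an arbitrary choice; any 10 ASCII characters give a 10-byte buffer.
const buf = Buffer.from('hey there!');

console.log(buf);
// <Buffer 68 65 79 20 74 68 65 72 65 21>
```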
In this example, you can see 10 pairs of letters and numbers. Each pair represents a byte stored in the buffer. The total size of this particular buffer is 10 bytes.
You might be asking yourself: “if these are bits and bytes, where are the 0s and 1s?”
That’s because Node.js displays bytes using the hexadecimal system. This way, every byte can be represented using just two digits - a pair of numbers and letters from 0-9 and “a” to “f”.
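To make the relationship concrete, here is a small sketch that prints one byte in binary, hexadecimal, and decimal. The value 0x68 (the letter "h" in ASCII) is just an example I picked:

```javascript
// One byte, three notations. 0x68 is an arbitrary example value.
const byte = 0x68;

const asBinary = byte.toString(2).padStart(8, '0'); // eight bits
const asHex = byte.toString(16).padStart(2, '0');   // two hex digits

console.log(asBinary); // 01101000
console.log(asHex);    // 68
console.log(byte);     // 104
```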
Why buffers?
Before buffers were introduced, there was no easy way of handling binary data in JavaScript. You would have to resort to primitives such as strings, which are slower and have no specialized tools to handle binary data. Buffers were created to provide a proper set of APIs to manipulate bits and bytes in an easy and performant way.
Working with buffers
Let’s see some of the things we can do with buffers.
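You will notice that handling buffers is a bit similar to the way we handle arrays in JavaScript. For example, you can .slice() a buffer, join several buffers with the static Buffer.concat() method, and get the .length of a buffer. One difference to keep in mind: unlike Array.prototype.slice(), buf.slice() returns a view over the same memory rather than a copy, and recent Node.js versions recommend buf.subarray() instead. Buffers are also iterable and can be used within constructs such as for-of.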
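Here is a short sketch of those array-like operations. The variable names and strings are mine; I use buf.subarray() here, which (like buf.slice()) returns a view over the same memory rather than a copy:

```javascript
const a = Buffer.from('hey');
const b = Buffer.from(' you');

// Buffer.concat() is a static method that takes an array of buffers.
const joined = Buffer.concat([a, b]);
console.log(joined.length);     // 7
console.log(joined.toString()); // hey you

// subarray() returns a view on the first three bytes, not a copy.
const firstThree = joined.subarray(0, 3);
console.log(firstThree.toString()); // hey

// Writing through the view changes the underlying buffer as well.
firstThree[0] = 0x48; // 'H'
console.log(joined.toString()); // Hey you
```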
If you’re following the examples on your computer, keep in mind that the Buffer class is exposed globally. You don’t need to import or require it as a separate module.
Creating buffers
Buffers are created using these three methods:
- Buffer.from()
- Buffer.alloc()
- Buffer.allocUnsafe()
In the past, buffers were created using the Buffer class constructor (e.g., new Buffer()). This syntax is now deprecated.
Buffer.from()
This method is the most straightforward way to create a buffer. It accepts a string, an array, an ArrayBuffer, or another buffer instance. Depending on which parameters you pass, Buffer.from() will create a buffer in a slightly different way.
When passing a string, a new buffer object will be created containing that string. By default, it will parse your input using utf-8 as the encoding (the Node.js documentation lists all supported encodings):
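A minimal sketch of the string form, using the string "heya!" as input. The second call passes an explicit encoding ('hex') to show how the same bytes can be parsed from a different representation:

```javascript
// Default encoding is utf-8.
const buf = Buffer.from('heya!');
console.log(buf); // <Buffer 68 65 79 61 21>

// Same bytes, parsed from a hexadecimal string instead.
const fromHex = Buffer.from('6865796121', 'hex');
console.log(fromHex.toString()); // heya!
```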
You can also pass an array of bytes to Buffer.from(). Here I am passing the same string as before (“heya!”), but represented as an array of hexadecimal byte values:
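As a sketch, the five bytes of "heya!" written out one by one:

```javascript
// Each array element is one byte: h, e, y, a, !
const buf = Buffer.from([0x68, 0x65, 0x79, 0x61, 0x21]);

console.log(buf);            // <Buffer 68 65 79 61 21>
console.log(buf.toString()); // heya!
```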
If you’re not familiar with the 0xNN syntax, it means that the characters after 0x should be interpreted as hexadecimal values.
When passing a buffer to Buffer.from(), Node.js will copy that buffer into a new one. The new buffer is allocated in a different area of memory, so you can modify it independently:
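A small sketch showing that the copy and the original do not share memory (variable names are my own):

```javascript
const original = Buffer.from('heya!');
const copy = Buffer.from(original);

// Change the first byte of the copy to an uppercase 'H' (0x48).
copy[0] = 0x48;

console.log(copy.toString());     // Heya!
console.log(original.toString()); // heya!
```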
These should cover most of the cases where you use Buffer.from(). Refer to the docs for other ways to use it.
Buffer.alloc()
The .alloc() method is useful when you want to create empty buffers, without necessarily filling them with data. By default, it accepts a number and returns a buffer of that given size filled with 0s:
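For instance, a sketch with an arbitrary size of 6 bytes:

```javascript
// Six bytes, all initialized to zero.
const buf = Buffer.alloc(6);

console.log(buf); // <Buffer 00 00 00 00 00 00>
```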
You can later on fill the buffer with any data you want:
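One way to do that, sketched with buf.fill() and buf.write() (the values are arbitrary examples):

```javascript
const buf = Buffer.alloc(6);

// Set every byte to the same value.
buf.fill(1);
console.log(buf); // <Buffer 01 01 01 01 01 01>

// Or overwrite the content with a string that fits in 6 bytes.
buf.write('buffer');
console.log(buf.toString()); // buffer
```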
You can also fill the buffer with content other than 0 and specify the encoding of that content:
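A sketch of both optional arguments. The fill values are my own examples; 'YQ==' is the base64 encoding of the single letter "a":

```javascript
// Fill with a repeated character, decoded as utf-8.
const filled = Buffer.alloc(6, 'x', 'utf-8');
console.log(filled); // <Buffer 78 78 78 78 78 78>

// Fill with a value given in another encoding (base64 here).
const fromBase64 = Buffer.alloc(4, 'YQ==', 'base64');
console.log(fromBase64.toString()); // aaaa
```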
Buffer.allocUnsafe()
With .allocUnsafe(), the process of sanitizing and filling the buffer with 0s is skipped. The buffer will be allocated in an area of memory that may contain old data (that’s where the “unsafe” part comes from). For example, the following code will most likely print some random pieces of data every time you run it:
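A minimal sketch, with an arbitrary size of 10000 bytes. The content is whatever happened to be in that memory, so no expected output is shown:

```javascript
// The bytes are NOT initialized; they may hold leftover data from earlier allocations.
const buf = Buffer.allocUnsafe(10000);

// console.log() on a buffer prints only the first bytes, which keeps the output short.
console.log(buf);
```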
A good use case for .allocUnsafe() is when you are copying a buffer that was safely allocated. Since you will completely overwrite the copied buffer, all the old bytes will be replaced by predictable data:
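Sketched with buf.copy(), assuming the destination is sized exactly like the source so that no uninitialized byte survives:

```javascript
// A safely created source buffer.
const source = Buffer.from('hey there');

// Uninitialized destination of exactly the same size.
const copy = Buffer.allocUnsafe(source.length);

// copy() overwrites every byte of the destination and returns the number of bytes copied.
const copied = source.copy(copy);

console.log(copied);          // 9
console.log(copy.toString()); // hey there
```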
In general, .allocUnsafe() should only be used if you have a good reason (e.g., performance optimizations). Whenever using it, make sure you never return the allocated buffer without completely filling it with new data, otherwise you could be potentially leaking sensitive information.
Writing to buffers
The way to write data into buffers is the .write() method, called on a buffer instance (buf.write()). By default, it will write a string encoded in utf-8 with no offset (it starts writing from the first position of the buffer). It returns a number, which is the number of bytes that were written in the buffer:
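A short sketch, assuming a 10-byte buffer; the second call passes an offset so that it continues where the first one stopped:

```javascript
const buf = Buffer.alloc(10);

// Write at the start of the buffer; returns the number of bytes written.
const firstWrite = buf.write('hey');
console.log(firstWrite); // 3
console.log(buf);        // <Buffer 68 65 79 00 00 00 00 00 00 00>

// Write again, starting at byte offset 3.
const secondWrite = buf.write(' you', 3);
console.log(secondWrite);                  // 4
console.log(buf.toString('utf-8', 0, 7));  // hey you
```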
Keep in mind that not all characters occupy a single byte in the buffer (!):
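For example, a sketch using the character "é" (my own pick, written as the escape \u00e9 to avoid ambiguity), which takes two bytes in utf-8:

```javascript
const buf = Buffer.alloc(4);

// One character, but two bytes in utf-8.
const written = buf.write('\u00e9'); // é

console.log(written); // 2
console.log(buf);     // <Buffer c3 a9 00 00>
```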
Also notice that 2 is not the highest number of bytes a character can have. For example, the utf-8 encoding supports characters with up to 4 bytes. Since you cannot modify the size of the buffer, you always need to be mindful of what you are writing and how much space it will take (size of the buffer vs. size of your content).
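One way to check sizes up front is Buffer.byteLength(). The sketch below uses a few sample characters of my own choosing, then shows what happens when the content is larger than the buffer:

```javascript
// Byte sizes of single characters in utf-8.
console.log(Buffer.byteLength('a'));         // 1
console.log(Buffer.byteLength('\u00e9'));    // 2 (é)
console.log(Buffer.byteLength('\u20ac'));    // 3 (€)
console.log(Buffer.byteLength('\u{1F44B}')); // 4 (waving hand emoji)

// Writing more than fits: only the part that fits is written.
const small = Buffer.alloc(2);
const written = small.write('hey');

console.log(written);          // 2
console.log(small.toString()); // he
```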
Another way to write into buffers is through an array-like syntax, where you add bytes to a specific position of the buffer. It’s important to notice that any data with more than 1 byte needs to be broken down and set on each position of the buffer:
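A sketch of index-based writes. The first three bytes are single-byte characters; the last two positions together hold the two-byte character "é" (0xc3 0xa9 in utf-8):

```javascript
const buf = Buffer.alloc(5);

// Single-byte characters: one index each.
buf[0] = 0x68; // h
buf[1] = 0x65; // e
buf[2] = 0x79; // y

// A two-byte character has to be split across two positions.
buf[3] = 0xc3;
buf[4] = 0xa9;

console.log(buf.toString()); // heyé
```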
While it’s appreciated that you can write to buffers using an array-like syntax, I suggest sticking to .write() when you can. Managing the length of inputs by hand can be a hard task and will bring complexity to your code. With .write(), Node.js takes care of the encoding for you, and you can detect input that did not fit by checking the number of bytes it returns (it returns 0 when nothing was written).
Iterating over buffers
You can use modern JavaScript constructs to iterate over a buffer the same way you would with an array. For example, with for-of:
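A minimal sketch; each iteration yields one byte as a number (the helper array is only there to collect the values):

```javascript
const buf = Buffer.from('hey');
const seen = [];

for (const byte of buf) {
  // byte is a number between 0 and 255
  console.log(byte, byte.toString(16));
  seen.push(byte);
}
// 104 68
// 101 65
// 121 79
```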
Other iterator helpers such as .entries(), .values() and .keys() are also available for buffers. For example, using .entries():
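Sketched below: .entries() yields [index, byte] pairs, which destructure nicely in a for-of loop (the pairs array is only there to collect the results):

```javascript
const buf = Buffer.from('hey');
const pairs = [];

for (const [index, byte] of buf.entries()) {
  console.log(index, byte);
  pairs.push([index, byte]);
}
// 0 104
// 1 101
// 2 121
```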
Going further: Buffers and TypedArrays
In JavaScript (I mean JavaScript in general, not restricted to Node.js), memory can be allocated using the special ArrayBuffer class. We rarely manipulate ArrayBuffer objects directly. Instead, we use a set of “view” objects which reference the underlying array buffer. Some of the view objects are:
Int8Array, Uint8Array, Uint8ClampedArray, Int16Array, Uint16Array, Int32Array, etc. The TypedArray reference documentation has the full list.
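To illustrate, here is a sketch with two views over the same 4-byte ArrayBuffer. The byte values are chosen so that the result does not depend on the byte order of your machine:

```javascript
// 4 bytes of raw memory.
const arrayBuffer = new ArrayBuffer(4);

// Two views over the same memory.
const bytes = new Uint8Array(arrayBuffer);  // 4 elements of 1 byte each
const words = new Uint16Array(arrayBuffer); // 2 elements of 2 bytes each

// Writing through one view is visible through the other.
bytes[0] = 1;
bytes[1] = 1;

console.log(bytes.length, words.length); // 4 2
console.log(words[0]);                   // 257 (0x0101)
```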
And then there is TypedArray, which is an umbrella term to refer to all of the view objects listed above. All view objects inherit methods from TypedArray via prototypes. The TypedArray constructor is not exposed globally; you always have to use one of the view constructors. If you see some tutorial or documentation using new TypedArray(), it means it’s using any of the view objects (Uint8Array, Float64Array, etc).
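If you want to see this for yourself, a quick sketch: there is no global named TypedArray, but the shared parent of the view constructors can be reached through the prototype chain.

```javascript
// No global binding called TypedArray exists.
const globalType = typeof TypedArray;
console.log(globalType); // undefined

// All view constructors share the same hidden parent constructor.
const parent = Object.getPrototypeOf(Uint8Array);
console.log(parent === Object.getPrototypeOf(Float64Array)); // true
console.log(parent.name);                                    // TypedArray
```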
In Node.js, objects created from the Buffer class are also instances of Uint8Array. There are a few small differences between them, which are described in the Node.js Buffer documentation.
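A short sketch of that relationship. The second half shows that Buffer.from() with an ArrayBuffer creates a view over the same memory instead of copying it (variable names are my own):

```javascript
const buf = Buffer.from('hey');
console.log(buf instanceof Uint8Array); // true

// A Buffer and a Uint8Array looking at the same ArrayBuffer.
const arrayBuffer = new ArrayBuffer(3);
const shared = Buffer.from(arrayBuffer);
const view = new Uint8Array(arrayBuffer);

// A write through the Uint8Array shows up in the Buffer.
view[0] = 0x68;
console.log(shared[0]); // 104
```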
Conclusion
As a beginner, buffers were a topic in Node.js that got me very confused (another one was streams, but that deserves its own post). Hopefully I was able to demystify some of the concepts around buffers and give an overview of the Buffer API.
For questions, you can DM me on Twitter! Thanks for reading 👋