12 Jul 2021

A Complete Guide to Buffers in Node.js

In Node.js, buffers are a special type of object that can store raw binary data. A buffer represents a chunk of memory - typically RAM - allocated in your computer. Once set, the size of a buffer cannot be changed.

A buffer stores bytes. A byte is a sequence of eight bits. Bits are the most basic unit of storage on your computer; each one holds a value of either 0 or 1.

Node.js exposes the Buffer class in the global scope (you don’t need to import or require it like other modules). With this API, you get a series of functions and abstractions to manipulate raw binaries.

A buffer in Node.js looks like this:

<Buffer 61 2e 71 3b 65 2e 31 2f 61 2e>

In this example, you can see 10 pairs of letters and numbers. Each pair represents a byte stored in the buffer, so the total size of this particular buffer is 10 bytes.

You might be asking yourself: “if these are bits and bytes, where are the 0s and 1s?”

That’s because Node.js displays bytes using the hexadecimal system. This way, every byte can be represented using just two digits instead of eight 0s and 1s, with each digit drawn from 0-9 and “a” to “f”.
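
To make the link between the hexadecimal output and the underlying bits concrete, here is a small sketch that prints every byte of the buffer above in both notations. The variable names (buf, rows) are only illustrative:

```javascript
// The same 10 bytes shown in the example above.
const buf = Buffer.from([0x61, 0x2e, 0x71, 0x3b, 0x65, 0x2e, 0x31, 0x2f, 0x61, 0x2e])

// Render every byte as two hexadecimal digits and as eight bits.
const rows = [...buf].map((byte) => ({
  hex: byte.toString(16).padStart(2, '0'),
  bits: byte.toString(2).padStart(8, '0'),
}))

for (const { hex, bits } of rows) {
  console.log(`${hex} = ${bits}`)
}
// The first line printed is: 61 = 01100001
```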

Why buffers?

Before buffers were introduced, there was no easy way of handling binary data in JavaScript. You would have to resort to primitives such as strings, which are slower and have no specialized tools to handle binaries. Buffers were created to provide a proper set of APIs to manipulate bits and bytes in an easy and performant way.

Working with buffers

Let’s see some of the things we can do with buffers.

You will notice that handling buffers is a bit similar to the way we handle arrays in JavaScript. For example, you can .slice() a buffer, join several buffers with Buffer.concat(), and read the .length of a buffer. Buffers are also iterable and can be used within constructs such as for-of.
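
As a quick sketch of those array-like operations (the variable names are illustrative), note one difference from arrays: taking a slice of a buffer does not copy the bytes, it returns a view over the same memory. The example uses .subarray(), which behaves this way:

```javascript
const hello = Buffer.from('hello ')
const world = Buffer.from('world')

// Buffer.concat() is a static method that takes an array of buffers.
const joined = Buffer.concat([hello, world])
console.log(joined.length) // --> 11
console.log(joined.toString()) // --> 'hello world'

// The result of subarray() shares memory with the original buffer,
// so a change made through the view is visible in `joined` too.
const view = joined.subarray(0, 5)
view[0] = 0x4a // 0x4a is the letter "J"

console.log(view.toString()) // --> 'Jello'
console.log(joined.toString()) // --> 'Jello world'
```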

If you’re following the examples on your computer, keep in mind that the Buffer class is exposed globally. You don’t need to import or require it as a separate module.

Creating buffers

Buffers are created using these three methods:

  • Buffer.from()
  • Buffer.alloc()
  • Buffer.allocUnsafe()

In the past, buffers were created using the Buffer class constructor (e.g., new Buffer()). This syntax is now deprecated.

Buffer.from()

This method is the most straightforward way to create a buffer. It accepts a string, an array, an ArrayBuffer, or another buffer instance. Depending on which params you pass, Buffer.from() will create a buffer in a slightly different way.

When passing a string, a new buffer object will be created containing that string. By default, it will parse your input using utf-8 as the encoding (see here all encoding types supported):

// Creates a new buffer with the string 'heya!'
// If no encoding is passed in the second parameter, defaults to 'utf-8'.
Buffer.from('heya!')
 
// Creates the same buffer as the above, but passes 'heya!' as a hex encoded string
Buffer.from('6865796121', 'hex')

You can also pass an array of bytes to Buffer.from(). Here I am passing the same string as before (“heya!”), but represented as an array of hexadecimal numbers:

// Also writes 'heya!' to the buffer, but passes an array of bytes
Buffer.from([0x68, 0x65, 0x79, 0x61, 0x21])

If you’re not familiar with the 0xNN syntax, it means that the characters after 0x should be interpreted as a hexadecimal value.
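
A short sketch to show that the 0xNN notation is only a different way of writing the same numbers (the variable names are illustrative):

```javascript
// 0x68 and 104 are the same number written in two notations.
console.log(0x68 === 104) // --> true

const fromHex = Buffer.from([0x68, 0x65, 0x79, 0x61, 0x21])
const fromDecimal = Buffer.from([104, 101, 121, 97, 33])

// Both buffers hold exactly the same bytes.
console.log(fromHex.equals(fromDecimal)) // --> true
console.log(fromDecimal.toString()) // --> 'heya!'
```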

When passing a buffer to Buffer.from(), Node.js will copy that buffer into a new one. The new buffer is allocated in a different area of memory, so you can modify it independently:

const buffer1 = Buffer.from('cars')
 
// Creates a buffer from `buffer1`
const buffer2 = Buffer.from(buffer1)
 
// Modify `buffer2`
buffer2[0] = 0x6d // 0x6d is the letter "m"
 
console.log(buffer1.toString()) // --> "cars"
console.log(buffer2.toString()) // --> "mars"

These should cover most of the cases where you use Buffer.from(). Refer to the docs for other ways to use it.
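
The ArrayBuffer case is not shown above, so here is a minimal sketch of it. Unlike the buffer-to-buffer case, passing an ArrayBuffer does not copy anything: the new buffer is a view over the same memory. The variable names are illustrative:

```javascript
const arrayBuffer = new ArrayBuffer(4)
const bytes = new Uint8Array(arrayBuffer)

// No copy is made here: `shared` is a view over `arrayBuffer`.
const shared = Buffer.from(arrayBuffer)

// Writing through the Uint8Array is visible through the buffer.
bytes[0] = 0x68 // 0x68 is the letter "h"
bytes[1] = 0x69 // 0x69 is the letter "i"

console.log(shared) // --> <Buffer 68 69 00 00>
console.log(shared.toString('utf-8', 0, 2)) // --> 'hi'
```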

Buffer.alloc()

The .alloc() method is useful when you want to create empty buffers, without necessarily filling them with data. By default, it accepts a number and returns a buffer of that given size filled with 0s:

Buffer.alloc(6)
// --> <Buffer 00 00 00 00 00 00>

You can later on fill the buffer with any data you want:

// Creates a buffer of size 1 filled with 0s (<Buffer 00>)
const buff = Buffer.alloc(1);
 
// Fill the first (and only) position with content
buff[0] = 0x78 // 0x78 is the letter "x"
 
console.log(buff.toString('utf-8'));
// --> 'x'

You can also fill the buffer with content other than 0, using a given encoding:

Buffer.alloc(6, 'x', 'utf-8')
// --> <Buffer 78 78 78 78 78 78>

Buffer.allocUnsafe()

With .allocUnsafe(), the process of sanitizing and filling the buffer with 0s is skipped. The buffer will be allocated in an area of memory that may contain old data (that’s where the “unsafe” part comes from). For example, the following code may print leftover data from whatever previously occupied that memory:

// Allocates an uninitialized area of memory with size 10000
// Does not sanitize it (fill with 0s), so it may contain old data
const buff = Buffer.allocUnsafe(10000)
 
// May print loads of leftover data
console.log(buff.toString('utf-8'))

A good use case for .allocUnsafe() is when you are copying a buffer that was safely allocated. Since you will completely overwrite the copied buffer, all the old bytes will be replaced by predictable data:

// Creates a buffer from a string
const buff = Buffer.from('hi, I am a safely allocated buffer')
 
// Creates a new empty buffer with `allocUnsafe` of the same
// length as the previous buffer. It will be initially filled with old data.
const buffCopy = Buffer.allocUnsafe(buff.length)
 
// Copies the original buffer into the new, unsafe buffer.
// Old data will be overwritten with the bytes
// from 'hi, I am a safely allocated buffer' string.
buff.copy(buffCopy)
 
console.log(buffCopy.toString())
// --> 'hi, I am a safely allocated buffer'

In general, .allocUnsafe() should only be used if you have a good reason (e.g., performance optimizations). Whenever using it, make sure you never return the allocated buffer without completely filling it with new data, otherwise you could be potentially leaking sensitive information.
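
When the new data does not cover the whole buffer, one way to follow that advice is to clear the remainder explicitly. The sketch below assumes a small 8-byte buffer and an illustrative message; the names packet and written are not part of any API:

```javascript
const message = 'ok'
const packet = Buffer.allocUnsafe(8)

// write() returns how many bytes were written...
const written = packet.write(message)

// ...so everything from that offset onwards may still hold old data
// and is overwritten with 0s before the buffer is handed to anyone.
packet.fill(0, written)

console.log(packet) // --> <Buffer 6f 6b 00 00 00 00 00 00>
```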

Writing to buffers

The way to write data into buffers is using the .write() method of a buffer instance. By default, it will write a string encoded in utf-8 with no offset (it starts writing from the first position of the buffer). It returns the number of bytes that were written to the buffer:

const buff = Buffer.alloc(9)
 
buff.write('hey there') // returns 9 (number of bytes written)
 
// If you write more bytes than the buffer supports,
// your data will be truncated to fit the buffer.
buff.write('hey christopher') // returns 9 (number of bytes written)
 
console.log(buff.toString())
// -> 'hey chris'

Keep in mind that not all characters occupy a single byte in the buffer (!):

const buff = Buffer.alloc(2)
 
// The copyright symbol ('©') occupies two bytes,
// so the following operation will completely fill the buffer.
buff.write('©') // returns 2
 
// If the buffer is too small to store the character, it will not write it.
const tinyBuff = Buffer.alloc(1)
 
tinyBuff.write('©') // returns 0 (nothing was written)
 
console.log(tinyBuff)
// --> <Buffer 00> (empty buffer)

Also notice that 2 is not the highest number of bytes a character can have. For example, the utf-8 encoding type supports characters with up to 4 bytes. Since you cannot modify the size of the buffer, you always need to be mindful of what you are writing and how much space it will take (size of the buffer vs. size of your content).
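
To find out how much space a string needs before writing it, you can use Buffer.byteLength(). The sketch below uses escape sequences for the sample characters so that it does not depend on how the file is saved; the samples object and the other names are illustrative:

```javascript
const samples = {
  ascii: 'a', // U+0061
  copyright: '\u00a9', // the © symbol
  euro: '\u20ac', // the € symbol
  emoji: '\u{1F600}', // a grinning face emoji
}

for (const [name, char] of Object.entries(samples)) {
  console.log(name, Buffer.byteLength(char))
}
// Prints:
// --> ascii 1
// --> copyright 2
// --> euro 3
// --> emoji 4

// Allocating by byte length (not string length) guarantees the text fits.
const text = 'caf\u00e9' // 'café': 4 characters, 5 bytes
const fitted = Buffer.alloc(Buffer.byteLength(text))
const written = fitted.write(text)

console.log(text.length, written) // --> 4 5
```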

Another way to write into buffers is through an array-like syntax, where you add bytes to a specific position of the buffer. It’s important to note that any data with more than 1 byte needs to be broken down and set on each position of the buffer:

const buff = Buffer.alloc(5)
 
buff[0] = 0x68 // 0x68 is the letter "h"
buff[1] = 0x65 // 0x65 is the letter "e"
buff[2] = 0x6c // 0x6c is the letter "l"
buff[3] = 0x6c // 0x6c is the letter "l"
buff[4] = 0x6f // 0x6f is the letter "o"
 
console.log(buff.toString())
// --> 'hello'
 
// ⚠️ Warning: if you try setting a character with more than 1 byte
// to a single position, only the lowest byte (0xa9) is stored:
buff[0] = 0xc2a9 // 0xc2a9 is the symbol '©' in utf-8
 
console.log(buff.toString())
// --> '�ello'
 
// But if you write each byte separately...
buff[0] = 0xc2
buff[1] = 0xa9
 
console.log(buff.toString())
// --> '©llo'

While it’s handy that you can write to buffers using an array-like syntax, I suggest sticking to .write() when you can. Managing the length of inputs by hand can be a hard task and will bring complexity to your code. With .write(), you can write strings into a buffer without splitting them into bytes yourself, and detect input that does not fit by checking the number of bytes it returns (0 means nothing was written).
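
One way to avoid silent truncation when writing strings is to compare the byte length of the input against the remaining space before calling .write(). The writeAll helper below is hypothetical (it is not part of the Buffer API) and is only a sketch of that idea:

```javascript
// Hypothetical helper: writes `text` at `offset` and throws
// instead of truncating when the text does not fit entirely.
function writeAll(buff, text, offset = 0) {
  const needed = Buffer.byteLength(text)
  if (offset + needed > buff.length) {
    throw new RangeError(
      `need ${needed} bytes at offset ${offset}, buffer has ${buff.length}`
    )
  }
  // write() accepts an offset as its second argument.
  return buff.write(text, offset)
}

const target = Buffer.alloc(9)

// Keep track of the next free position using the returned byte counts.
let offset = writeAll(target, 'hey ')
offset += writeAll(target, 'chris', offset)

console.log(target.toString()) // --> 'hey chris'

// The buffer is now full, so one more byte is rejected.
let error = null
try {
  writeAll(target, '!', offset)
} catch (err) {
  error = err
}
console.log(error instanceof RangeError) // --> true
```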

Iterating over buffers

You can use modern JavaScript constructs to iterate over a buffer the same way you would with an array. For example, with for-of:

const buff = Buffer.from('hello!')
 
for (const b of buff) {
  // `.toString(16)` returns the content in hexadecimal format.
  console.log(b.toString(16))
}
 
// Prints:
// --> 68
// --> 65
// --> 6c
// --> 6c
// --> 6f
// --> 21

Other iterator helpers such as .entries(), .values() and .keys() are also available for buffers. For example, using .entries():

const buff = Buffer.from('hello!')
const copyBuff = Buffer.alloc(buff.length)
 
for (const [index, b] of buff.entries()) {
  copyBuff[index] = b
}
 
console.log(copyBuff.toString())
// -> 'hello!'
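
The other two helpers work the same way. A brief sketch, with illustrative variable names, that collects their results into arrays:

```javascript
const word = Buffer.from('hey')

// .keys() yields the positions, .values() yields the bytes.
const indexes = [...word.keys()]
const bytes = [...word.values()]

console.log(indexes) // --> [ 0, 1, 2 ]
console.log(bytes) // --> [ 104, 101, 121 ]
```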

Going further: Buffers and TypedArrays

In JavaScript (I mean JavaScript in general, not restricted to Node.js), memory can be allocated using the special ArrayBuffer class. We rarely manipulate ArrayBuffer objects directly. Instead, we use a set of “view” objects which reference the underlying array buffer. Some of the view objects are:

Int8Array, Uint8Array, Uint8ClampedArray, Int16Array, Uint16Array, Int32Array, etc. See the full list here.

And then there is TypedArray, which is an umbrella term to refer to all of these view objects listed above. All view objects inherit methods from TypedArray via prototypes. The TypedArray constructor is not exposed globally, so you always have to use one of the view constructors. If you see some tutorial or documentation using new TypedArray(), it means it’s using any of the view objects (Uint8Array, Float64Array, etc).

In Node.js, objects created from the Buffer class are also instances of Uint8Array. There are a few small differences between them, which you can read here.
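
A minimal sketch of that relationship, with illustrative variable names. It checks the inheritance and then creates a plain Uint8Array view over the same bytes as a buffer:

```javascript
const greeting = Buffer.from('hi')

console.log(greeting instanceof Uint8Array) // --> true
console.log(greeting.buffer instanceof ArrayBuffer) // --> true

// A second view over the same bytes. Small buffers may live inside a larger
// shared ArrayBuffer, so byteOffset and length are passed explicitly.
const view = new Uint8Array(greeting.buffer, greeting.byteOffset, greeting.length)

view[0] = 0x48 // 0x48 is the letter "H"

console.log(greeting.toString()) // --> 'Hi'
```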

Conclusion

As a beginner, buffers were a topic in Node.js that got me very confused (another one was streams, but that deserves its own post). Hopefully I was able to demystify some of the concepts around buffers and give an overview of the Buffer API.

For questions, you can DM me on Twitter! Thanks for reading 👋