The vector::reserve fallacy

While reading through some code I wrote for a raytracing assignment, I noticed a peculiar function that had never caused any issues, but really looked like it should. After asking a bunch of people, I present this blog post to you!

Ah, C++ standard containers. So delightfully intuitive to work with. The most versatile has to be std::vector, whose job is to wrap a dynamic “C-style” array and manage its capacity for us as we grow and shrink the vector's size. We can simply call push_back on the vector to add as many elements as we want, and the vector will grow its capacity when needed to fit our new elements.

If you understand how a std::vector works, feel free to skip to the code.

But is it that simple?

Resizing the vector's internal array is not cheap! It incurs allocating a whole new (bigger) block of memory, copying all the elements to it, and finally freeing the old block (note that this copy may be a move, see here). Because we add elements one by one, this would trigger a lot of resizes, as the vector keeps having to guess how many elements we plan to add and reallocating a bigger and bigger internal array every time we push_back past its capacity! So, a conforming std::vector implementation will usually try to get ahead of us and secretly allocate a bigger block when it sees we start pushing to it, and then it can just keep track of the size of the vector (how many elements we've pushed to it) separately from its capacity (how many elements it can grow to before it needs to resize the internal array again).

std::vector kindly exposes this internal functionality to us through some functions. For example, the capacity() function returns the current capacity of the vector's internal array. If we know the size it will grow up to ahead of time, we can use the reserve(size_type capacity) function to have it pre-allocate this capacity for us. This avoids reallocating a lot when doing a bunch of push_backs, which can let us gain a precious bit of performance (see the example here for some actual numbers).

The code

Now that we understand std::vector::reserve, let's take a look at some C++:

std::vector<int> myVec{}; // create a vector of size 0
myVec.reserve(1); // reserve a capacity of 1
myVec[0] = 42; // write 42 to the first element of our empty(!!) vector
std::cout << myVec[0];

When run, the above prints 42. I hope I'm not the only one who's surprised this works! I'm overwriting the value of the first element in a vector... which has no elements. This is an out of bounds write, and should definitely not work. Not only that, but on my machine I can replace index 0 with up to index 15187 and it still works fine! Index 15188 segfaults, though, so at least that's sane behavior (so long as I get far enough away from the start of the vector...). So what the peck is going on??

The peck (it's going on)

Okay, okay, I'll say the thing. We've found what in C++ is called “undefined behavior” (UB). This is a magical realm where anything could happen. Your computer might replace every window title with your username, or your program might send an order to all pizza restaurants in a 5km radius. If you're lucky, your program will just crash. More likely though, your code will do exactly what you intended it to do, and either subtly break something later on, or never signal anything on your machine... and break on someone else's.

Why is this undefined behavior, you ask? We told our vector to reserve a size of 1, so 0 is a perfectly valid index in the its internal array. However, the C++ standard never states that vector should have an internal array! It only asks for vector implementations to be able to grow and shrink, and for reserve() to “ensure a capacity” up to which no reallocations need to happen.

NOTE: after lots of research (and asking the smart folks of the #include C++ community), I've been unable to find an implementation where this does break. That doesn't mean it's okay to rely on this behavior! It's still UB!

Why it works for us

Despite this being undefined behavior, it works consistently in my program. Why is this? When we run the line myVec[0] = 42, the std::vector::operator[] function is called with an argument of 0, to return a reference to the location in memory at index 0 for this vector. Let's look at the source code for this function in GCC's libstdc++ (which I used for my testing, though the same issue applies on clang and MSVC):

/**
 *  @brief  Subscript access to the data contained in the %vector.
 *  @param __n The index of the element for which data should be
 *  accessed.
 *  @return  Read/write reference to data.
 *
 *  This operator allows for easy, array-style, data access.
 *  Note that data access with this operator is unchecked and
 *  out_of_range lookups are not defined. (For checked lookups
 *  see at().)
 */
_GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
reference
operator[](size_type __n) _GLIBCXX_NOEXCEPT
{
    __glibcxx_requires_subscript(__n);
    return *(this->_M_impl._M_start + __n);
}

Looking past all the macros (the subscript thing expands to an empty line by default, we'll look into it later), this simply takes the pointer to the start of the internal array (_M_impl._M_start), adds our argument __n, and returns it as a reference. As long as _M_start points to some valid allocated address, we should be fine accessing it within bounds of the array (note, of course, that this is only true for this implementation of libstdc++! Other implementations may do different things; we're in UB-land here). This explains why our index outside of the vector's size worked: we're indexing the internal array, not the vector! As long as we call reserve on the vector first, and our index is within that reserved array's size the data should be perfectly okay being written to and read from an out-of-bounds-but-within-capacity index of a vector (on this specific version of GCC's libstdc++). If we remove the myVec.reserve(1) line, the program does crash as expected, since _M_impl._M_start is not initialized and thus points to invalid memory.

Array out of bounds

The reason why accessing an index higher than the array's size works is covered here, but a tl;dr is that you are indeed overwriting memory you shouldn't be, and by chance nothing bad is happening. If we run it through the valgrind memory error detector, it indeed detects our error for any index outside the array. Here's the log for a write at index 1, after a call to reserve(1):

Invalid write of size 4
   at 0x1091FC: main (ub.cpp:8)
 Address 0x4e21084 is 0 bytes after a block of size 4 alloc'd
   at 0x4841F11: operator new(unsigned long) (vg_replace_malloc.c:434)
   by 0x109825: std::__new_allocator<int>::allocate(unsigned long, void const*) (new_allocator.h:147)
   by 0x109604: allocate (alloc_traits.h:482)
   by 0x109604: std::_Vector_base<int, std::allocator<int> >::_M_allocate(unsigned long) (stl_vector.h:378)
   by 0x1093FF: std::vector<int, std::allocator<int> >::reserve(unsigned long) (vector.tcc:79)
   by 0x1091EA: main (ub.cpp:6)

Let's dissect this output: 1. The first line indicates that we wrote 4 bytes somewhere that's “invalid.” That's the size of a 64-bit int, which is the type we're writing into index 1. 2. The big call stack tells us where the array that we're accessing out of bounds was allocated. The penultimate line points us to that std::vector::reserve call we made, which creates a “block of size 4” (the vector's internal array, with the capacity for a single 4-byte int).

This indicates that we are indeed accessing the internal array out of bounds, and that it is a memory error that will cause UB even on this implementation of std::vector. So that answers that!

Speed at the cost of safety

Although on my GCC install, using this as actual storage “works” “fine,” it has... issues. When we try to do a range-based loop, it will never get the elements we wrote out of bounds. If the vector gets copied, it will only bring over the data within its size, and leave behind everything else. These kinds of issues would be super hard to diagnose had I not spotted the UB here!

Shouldn't std::vector::operator[] warn us that we're accessing an element outside of the vector's size? Let's check the C++ standard on vector functions.

Only at() performs range checking. If the index is out of range, at() throws an out_of_range exception. All other functions do not check.

- The C++ Standard Library: A Tutorial and Reference by Nicolai M. Josuttis (2012), pages 274-275

Well, darn. I can understand why, though. When writing code in C++, we expect to have the lowest possible performance overhead, yet still get to use all these nice abstractions. Performing bounds checks, even if cheap, can really add up if we have to do it for every vector access. Changing it to at(0) does indeed print a (relatively) helpful crash message:

terminate called after throwing an instance of 'std::out_of_range'
  what():  vector::_M_range_check: __n (which is 1) >= this->size() (which is 0)

As I was writing this, an excellent relevant post by @saagar@saagarjha.com graced my Mastodon timeline:

Original source.

That's not all, though! Remember that curious __glibcxx_requires_subscript(__n); macro in the GCC implementation of operator[], which I said we'd look at later? Now is before's later, so let's take a look at the definition:

#ifndef _GLIBCXX_ASSERTIONS
  # define __glibcxx_requires_subscript(_N)
#else
  # define __glibcxx_requires_subscript(_N)	\
  __glibcxx_assert(_N < this->size())
#endif

So it does do something! You just have to have _GLIBCXX_ASSERTIONS defined. Indeed, if we define that macro with the -D_GLIBCXX_ASSERTIONS compiler flag, we get this wonderful totally-readable error when the code tries to index out of bounds:

/usr/include/c++/13.2.1/bits/stl_vector.h:1125: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](size_type) [with _Tp = int; _Alloc = std::allocator<int>; reference = int&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.

Okay, it's no “you're accessing this vector out of bounds, please stop,” but it certainly is better than dealing with the potential mess of undefined behavior that awaits otherwise. I guess I'll be adding this flag to all my debug builds from now on!

If you're curious, this is my original code where I found the issue.


Thanks for reading! Feel free to contact me if you have any suggestions or comments. Find me on Mastodon and Matrix.

You can follow the blog through: – ActivityPub by inputting @mat@blog.allpurposem.at – RSS/Atom: Copy this link into your reader: https://blog.allpurposem.at

My website: https://allpurposem.at