Iterator Invalidation of STL Containers

STL容器的迭代器失效

The size of a container refers to the number of elements in the container; the capacity of a container refers to the number of elements that the container can hold before reallocating more memory. When resizing or changing capacity, the elements may move to new storage locations. This means that iterators (as well as pointers or references) pointing to the elements may become invalid (i.e., point to the old element locations).
Iterators pointing to elements of associative containers only become invalid when the pointed element is removed from the container (erased). In contrast, iterators pointing to elements of sequential containers may become invalid when memory is reallocated (resize()/reverse() or push_back()) or when the pointed element moves within the container (such as by performing an erase() or insert() at its previous position).

Let’s take a look at how erase causes iterator invalidation in sequential containers.
When we use erase to remove an element from a container, it invalidates the iterator, and the effect is not as intuitive as one might think.

1
2
3
4
5
6
string coll{"aaaabbbccaadcvc"};
for(auto pos=coll.begin();pos<coll.end();++pos){
if(*pos=='a')
coll.erase(pos);
}
// result: aabbbccadcvc

The above code seems to remove all the character ‘a’ from str, but the actual result is not so; some ‘a’ characters are “missed.”
First, let’s look at the implementation of erase in SGI STL stl/string:

1
2
3
4
5
6
7
iterator erase(iterator __position) {
// The move includes the terminating null.
_Traits::move(__position, __position + 1, _M_finish - __position);
destroy(_M_finish);
--_M_finish;
return __position;
}

It will move all elements after the iterator forward by one position. Although we did not change the iterator __position itself, after executing erase, it points to the next element after the one it pointed to before (Note that the exact behavior also depends on the STL implementation version; what’s listed here is just one implementation of SGI STL, older implementations before C++11 may not have a return value).

Unfortunately, before C++11, it was a design decision not to return the following position, because if not needed, it costs unnecessary time.

Since after executing erase, __position essentially behaves as if it points to the next iterator (according to this SGI implementation), incrementing it in the loop will cause it to skip checking one element. (In fact, operating on an invalidated iterator is undefined behavior.)

Calling erase() for the element to which you are referring with pos invalidates pos as an iterator of coll. Thus, if you use pos after removing its element without any reinitialization, all bets are off. In fact, calling ++pos results in undefined behavior.

Thus, the correct writing for (before C++11) is:

1
2
3
4
5
6
7
8
string coll{"aaaabbbccaadcvc"};
for(auto pos=coll.begin();pos<coll.end();){
if(*pos=='a')
coll.erase(pos); // After execution, it is equivalent to pos having incremented
else{
++pos;
}
}

You can find several common container erase prototypes in the following links:

Since C++11, a solution is easy because erase() always returns the value of the following element.

So in C++11, it can be written this way:

1
2
3
4
5
6
7
8
string coll{"aaaabbbccaadcvc"};
for(auto pos=coll.begin();pos<coll.end();){
if(*pos=='a')
pos=coll.erase(pos); // Returns the current iterator after erase
else{
++pos;
}
}

[TC++PL4] An iterator to an element of an associative container (e.g., a map) is only invalidated if the element to which it points is removed from the container (erased; §31.3.7). To contrast, an iterator to an element of a sequence container (e.g., a vector) is invalidated if the elements are relocated (e.g., by a resize(), reserve(), or push_back()) or if the element to which it points is moved within the container (e.g., by an erase() or insert() of an element with a lower index).
It is tempting to assume that reserve() improves performance, but the standard growth strategies for vector (§31.4.1.1) are so effective that performance is rarely a good reason to use reserve(). Instead, see reserve() as a way of increasing the predictability of performance and for avoiding invalidation of iterators.

Vector:

Inserting or removing elements invalidates references, pointers, and iterators that refer to the following elements. An insertion that causes reallocation invalidates all references, iterators, and pointers.

Deque:

An insertion or deletion of elements might cause a reallocation. Thus, any insertion or deletion invalidates all pointers, references, and iterators that refer to other elements of the deque. The exception is when elements are inserted at the front or the back. In this case, references and pointers to elements stay valid, but iterators don’t.

The article is finished. If you have any questions, please comment and communicate.

Scan the QR code on WeChat and follow me.

Title:Iterator Invalidation of STL Containers
Author:LIPENGZHA
Publish Date:2017/03/17 11:43
Word Count:3.4k Words
Link:https://en.imzlp.com/posts/3276/
License: CC BY-NC-SA 4.0
Reprinting of the full article is prohibited.
Your donation will encourage me to keep creating!