Buğra Hasbek

Posted on Jan 5, 2021

Assignment vs. Initialization in C++

#cpp #todayilearned

This is my first post, so I chose a rather simple concept: assignment vs. initialization in C++. I will try to keep the post as practical as possible and share keywords in case you want to do in-depth research. So buckle up and enjoy the ride, folks!

int x;     // Define x
x = 3;     // Assign 3 to x

int y{3}; // Define and initialize y with 3

After the statements above, both x and y hold the value 3, which leads to the common misconception that assignment and initialization are identical.
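One quick way to see that the two are not interchangeable is to look at things that can only be initialized. The sketch below is illustrative only; the function names are made up for this example and are not part of the post's code.

```cpp
#include <cassert>

// Hypothetical helper: a const object can only receive its value
// through initialization.
int initialized_const() {
    const int limit{3};   // OK: initialization
    // limit = 4;         // error if uncommented: assignment to a const object
    return limit;
}

// Hypothetical helper: a reference is bound when it is defined; a later
// assignment writes through it instead of rebinding it.
int assign_through_reference() {
    int a = 1;
    int b = 2;
    int& ref{a};  // initialization: ref now names a
    ref = b;      // assignment: copies b's value into a, ref still names a
    b = 10;       // does not change a, because ref was never bound to b
    return a;
}
```

Here initialized_const() returns 3 and assign_through_reference() returns 2, so the same = sign means different things depending on whether an object is being created or already exists.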

Let's wear the ISO C++ Standards Committee hat
According to the recently published C++20 standard, initialization is covered in the "9.4 Initializers" section, whereas assignment is covered in the "11.4.5 Assignment operator" section. How dare you call them identical, you peasant!

That was a little bit harsh. Perhaps we should wear the C++ compiler implementer hat
GCC 10.2 produces identical output for the two functions below, with or without optimizations.

int getX(){
    int x {3};
    return x;
}

int getY(){
    int y;
    y = 3;
    return y;
}

get():
        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR [rbp-4], 3
        mov     eax, DWORD PTR [rbp-4]
        pop     rbp
        ret

That was a bit anticlimactic, I guess. My guess is that for a scalar type the compiler can simply store the value directly, as cppreference.com suggests, but I couldn't find the relevant section (direct assignment) in the C++ standard. Perhaps we should try a non-scalar type, for example std::string.

#include <string>
std::string getX(){
    std::string x {"bugra"};
    return x;
}

std::string getY(){
    std::string x;
    x = "bugra";
    return x;
}

Let's see what GCC 10.2 produces for getX and getY with optimizations enabled.

getX[abi:cxx11]():
        lea     rdx, [rdi+16]
        mov     BYTE PTR [rdi+20], 97
        mov     rax, rdi
        mov     QWORD PTR [rdi], rdx
        mov     DWORD PTR [rdi+16], 1919382882
        mov     QWORD PTR [rdi+8], 5
        mov     BYTE PTR [rdi+21], 0
        ret

.LC0:
        .string "bugra"
getY[abi:cxx11]():
        push    r12
        mov     r8d, 5
        mov     ecx, OFFSET FLAT:.LC0
        xor     edx, edx
        push    rbp
        xor     esi, esi
        mov     r12, rdi
        push    rbx
        lea     rbx, [rdi+16]
        mov     QWORD PTR [rdi], rbx
        mov     QWORD PTR [rdi+8], 0
        mov     BYTE PTR [rdi+16], 0
        call    std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_replace(unsigned long, unsigned long, char const*, unsigned long)
        mov     rax, r12
        pop     rbx
        pop     rbp
        pop     r12
        ret
        mov     rbp, rax
        jmp     .L2
getY[abi:cxx11]() [clone .cold]:

I am no expert, but I think the code for getX (initialization) is a lot better than the code for getY (assignment): getX just stores the characters and the length inline, while getY first builds an empty string and then calls _M_replace. Since we have established that assignment and initialization can produce different machine code, let's try to understand the difference between them. We need to wear the formal hat again.

C++20 standard "6.7.7 Temporary objects" section states that the expression a = f() requires a temporary object for the result of f(), which is materialized so that the reference parameter of X::operator=(const X&) can bind to it.
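To make that sentence concrete, here is a minimal sketch that counts how many objects get constructed in each case. The type X, the function f and the counters are assumptions made up for this illustration, and the count for initialization assumes C++17 or later, where the result of f() initializes the variable directly.

```cpp
#include <cassert>

int constructed = 0;  // number of X objects created by any constructor
int assigned = 0;     // number of calls to X::operator=

// Hypothetical type that records how its objects come into existence.
struct X {
    X() { ++constructed; }
    X(const X&) { ++constructed; }
    X& operator=(const X&) { ++assigned; return *this; }
};

X f() { return X{}; }

// Counts the objects constructed by: X a = f();
int objects_for_initialization() {
    constructed = 0;
    X a = f();   // the result of f() initializes a directly, no temporary
    (void)a;
    return constructed;
}

// Counts the objects constructed by: X a; a = f();
int objects_for_assignment() {
    constructed = 0;
    assigned = 0;
    X a;         // first object: a itself
    a = f();     // second object: the temporary bound to operator='s parameter
    assert(assigned == 1);
    return constructed;
}
```

With this sketch, initialization creates one object, while assignment creates two: the variable and the temporary that operator= reads from.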

The C++ Core Guidelines also advise preferring initialization to assignment (rule C.49, which is about constructors).

Let's finish with the regular developer hat

TLDR: Prefer initialization to assignment!

#include <string>
using std::string;

class A {   // Good
    string s1;
public:
    A(string p) : s1{p} { }    // GOOD: directly construct s1 from p
};

class B {   // BAD
    string s1;
public:
    B(const char* p) { s1 = p; }   // BAD: default constructor followed by assignment
};
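If you want to watch the difference rather than take it on faith, a rough sketch like the following counts the operations performed on the member. CountingString and the two helper functions are hypothetical names invented for this example; they stand in for std::string so the calls can be counted.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for std::string that counts the operations it performs.
struct CountingString {
    static int operations;   // constructor calls plus assignments
    std::string value;

    CountingString() { ++operations; }
    CountingString(const char* p) : value{p} { ++operations; }
    CountingString& operator=(const char* p) {
        value = p;
        ++operations;
        return *this;
    }
};
int CountingString::operations = 0;

class A {
    CountingString s1;
public:
    A(const char* p) : s1{p} {}     // one operation: construct from p
};

class B {
    CountingString s1;
public:
    B(const char* p) { s1 = p; }    // two operations: default construct, then assign
};

int operations_for_A() {
    CountingString::operations = 0;
    A a{"bugra"};
    (void)a;
    return CountingString::operations;
}

int operations_for_B() {
    CountingString::operations = 0;
    B b{"bugra"};
    (void)b;
    return CountingString::operations;
}
```

In this sketch the member of A is touched once and the member of B twice, which is the extra work the "BAD" comment above is pointing at.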

Keywords: copy assignment, Builtin direct assignment for scalar types, copy initialization, direct initialization, list-initialization, temporary objects

ISO C++20 Standard

Top comments (3)

Sandor Dargo • Jan 6 '21 • Edited

Just to complement on this part:

C++20 standard "6.7.7 Temporary objects" section states that the expression a = f() requires a temporary object for the result of f(), which is materialized so that the reference parameter of X::operator=(const X&) can bind to it.

That's not the only problem we face in getY for the string example.

    std::string x;
    x = "bugra";

In fact, we have to understand how local variables are initialized. When you create a local integer (int i;), memory is reserved for the new variable, but nothing is written to it: default initialization of a primitive type performs no initialization at all. The variable has an indeterminate value, in practice whatever happened to be in that memory (on the stack), and reading it before assigning to it is undefined behavior.

On the other hand, std::string is not a primitive data type, and objects of class type are default initialized by calling their default constructor. If the class doesn't have one, well, such code won't compile.

class MyClass {
public:
  explicit MyClass(int num) : m_num(num) {}
private:
  int m_num;
};

int main() {
  MyClass mc;  // ERROR: no matching function for call to 'MyClass::MyClass()'
}
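Both situations go away once the variable is initialized where it is declared. The following is only a sketch: the num() accessor and the two free functions are additions for this illustration and are not part of the MyClass shown above.

```cpp
#include <cassert>

class MyClass {
public:
  explicit MyClass(int num) : m_num(num) {}
  int num() const { return m_num; }   // accessor added for this sketch
private:
  int m_num;
};

// Hypothetical helper: empty braces value-initialize the int.
int value_initialized_int() {
  int i{};          // guaranteed to be 0, unlike a plain "int i;"
  return i;
}

// Hypothetical helper: supplying the argument selects MyClass(int).
int initialized_object() {
  MyClass mc{42};   // compiles, because no default constructor is needed
  return mc.num();
}
```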

So getting back to getY for the string example.
The line std::string x; creates a variable x which is initialized to an empty string. Then on the next line, x = "bugra";, you assign a new value to x. So x receives a value twice: once through default initialization and once through assignment! (The integer i received a value only once!)

Yet another problem is that "bugra" is not a std::string. It's an array of const char that decays to const char*, so std::string has to handle it through its const char* assignment path (the _M_replace call in the listing above), and you pay for it. Hence the immense difference in the ASM code. If we want to avoid that cost, and we have access to C++14, we should use a std::string literal (the s suffix).
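The snippet that went with this remark is not reproduced here. A minimal sketch of what a std::string literal looks like, assuming the s suffix from std::string_literals is what is meant, could be the following; the function name getY_with_literal is made up for this example.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical variant of getY that assigns a std::string literal.
std::string getY_with_literal() {
    using namespace std::string_literals;   // enables the "..."s suffix (C++14)

    // The literal is a std::string, not a const char array.
    static_assert(std::is_same<decltype("bugra"s), std::string>::value,
                  "\"bugra\"s has type std::string");

    std::string x;
    x = "bugra"s;   // the right-hand side is a std::string temporary, moved into x
    return x;
}
```

Note that x is still default constructed first and assigned afterwards, so this changes the type of the right-hand side but not the two-step pattern itself.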

Then the generated ASM code becomes similar:

But no matter which optimizations are turned on, there is no reason in circumstances like these to split the declaration from the initialization. For one thing, splitting them prevents you from declaring your variables const.
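As a small illustration of the const point, here is a sketch in which the value depends on a condition and the variable is still initialized in one step. The function greeting and its strings are invented for this example; the immediately invoked lambda is one common way to do it, not the only one.

```cpp
#include <cassert>
#include <string>

// Hypothetical example: the text depends on a condition, yet the variable
// can be const because it gets its value through initialization.
std::string greeting(bool formal) {
    const std::string text = [&] {
        if (formal) return std::string{"Good day"};
        return std::string{"Hi"};
    }();                  // immediately invoked lambda produces the initial value

    // text = "changed";  // error if uncommented: text is const
    return text;
}
```

With "std::string text;" followed by assignments in an if/else, the const would have to be dropped.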

Here is a great talk on this

Buğra Hasbek • Jan 6 '21

Thanks for the great feedback Sandor. I really appreciate it :)

Maresia • Jan 6 '21

Cool, before I got involved with move semantics I had no idea of the difference between assignment and initialization. I believe that many people don't even try to understand :3
constructor -> initialization
operator= -> assignment

[google translator]
