Buğra Hasbek

Assignment vs. Initialization in C++

This is going to be my first post so I chose a rather simple concept: Assignment vs Initialization in C++. I will try to keep the post as practical as possible and share keywords in case the reader wants to do in-depth research. So buckle up and enjoy the ride folks!

int x;     // Define x
x = 3;     // Assign 3 to x

int y{3}; // Define and initialize y with 3

Both sets of statements above leave x and y with a value of 3, which leads to the common misconception that assignment and initialization are identical.
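To make the difference visible, here is a minimal sketch that uses a hypothetical Tracker type (not part of the standard library or of the examples above) which records the operations performed on it. Assuming nothing beyond C++11, it suggests that "define, then assign" runs two operations, while "define and initialize" runs one.

```cpp
#include <string>

// Hypothetical helper type for illustration: records which operations ran on it.
struct Tracker {
    int value = 0;
    std::string history;  // e.g. "default-ctor;assign;"

    Tracker() : history{"default-ctor;"} {}
    explicit Tracker(int v) : value{v}, history{"value-ctor;"} {}

    Tracker& operator=(int v) {
        value = v;
        history += "assign;";
        return *this;
    }
};

// Define first, assign later: two separate operations.
Tracker make_by_assignment() {
    Tracker t;  // default construction (this is an initialization)
    t = 3;      // assignment to an object that already exists
    return t;
}

// Define and initialize in one step: a single operation.
Tracker make_by_initialization() {
    Tracker t{3};  // direct-list-initialization
    return t;
}
```

Both functions return an object whose value is 3, but the history strings differ, which is the whole point: the end state can match while the path taken does not.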

Let's wear the ISO C++ Standards Committee hat (the formal hat)
According to the recently published C++20 standard, initialization is explained in section "9.4 Initializers", whereas assignment is explained in section "11.4.5 Assignment operator". How dare you call them identical, you peasant!

That was a little bit harsh. Perhaps we should wear the C++ compiler implementer hat (the fedora hat)
GCC 10.2 produces identical output for the two functions below, with or without optimizations.

int getX(){
    int x {3};
    return x;
}

int getY(){
    int y;
    y = 3;
    return y;
}
get():
        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR [rbp-4], 3
        mov     eax, DWORD PTR [rbp-4]
        pop     rbp
        ret

That was a bit anticlimactic. My guess is that when the data type is scalar, the compiler can assign the value directly, as cppreference.com suggests, but I couldn't find the relevant section (direct assignment) in the C++ standard. Perhaps we should try a non-scalar data type, for example std::string.

#include <string>
std::string getX(){
    std::string x {"bugra"};
    return x;
}

std::string getY(){
    std::string x;
    x = "bugra";
    return x;
}

Let's see what GCC 10.2 produces for getX and getY with optimizations enabled.

getX[abi:cxx11]():
        lea     rdx, [rdi+16]
        mov     BYTE PTR [rdi+20], 97
        mov     rax, rdi
        mov     QWORD PTR [rdi], rdx
        mov     DWORD PTR [rdi+16], 1919382882
        mov     QWORD PTR [rdi+8], 5
        mov     BYTE PTR [rdi+21], 0
        ret
.LC0:
        .string "bugra"
getY[abi:cxx11]():
        push    r12
        mov     r8d, 5
        mov     ecx, OFFSET FLAT:.LC0
        xor     edx, edx
        push    rbp
        xor     esi, esi
        mov     r12, rdi
        push    rbx
        lea     rbx, [rdi+16]
        mov     QWORD PTR [rdi], rbx
        mov     QWORD PTR [rdi+8], 0
        mov     BYTE PTR [rdi+16], 0
        call    std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_replace(unsigned long, unsigned long, char const*, unsigned long)
        mov     rax, r12
        pop     rbx
        pop     rbp
        pop     r12
        ret
        mov     rbp, rax
        jmp     .L2
getY[abi:cxx11]() [clone .cold]:

I am no expert, but getX (initialization) looks a lot better than getY (assignment): getX writes the five characters straight into the string's internal buffer, while getY first builds an empty string and then calls _M_replace to overwrite it. Now that we have established that assignment and initialization can produce different output, let's try to understand the difference between them. We need to wear the formal hat again.

The "6.7.7 Temporary objects" section of the C++20 standard states that the expression a = f() requires a temporary object for the result of f(), which is materialized so that the reference parameter of X::operator=(const X&) can bind to it.
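The temporary can be made observable by counting constructions. The sketch below uses a hypothetical type X and function f (names borrowed from the standard's wording, bodies invented here) and assumes C++17, where initializing from the result of f() is guaranteed not to create an extra object.

```cpp
// Counts every X object that gets constructed (illustration only).
int constructed = 0;

// Hypothetical type standing in for the X of the standard's example.
struct X {
    int payload = 0;

    X() { ++constructed; }
    explicit X(int p) : payload{p} { ++constructed; }
    X(const X& other) : payload{other.payload} { ++constructed; }

    X& operator=(const X& other) {
        payload = other.payload;  // copies state, constructs nothing
        return *this;
    }
};

X f() { return X{42}; }

int objects_built_by_assignment() {
    constructed = 0;
    X a;      // object 1: default-constructed
    a = f();  // object 2: temporary materialized so operator='s const X& can bind to it
    return constructed;
}

int objects_built_by_initialization() {
    constructed = 0;
    X a = f();  // the result of f() initializes a directly, no separate temporary
    return constructed;
}
```

Under the stated assumption, the assignment path builds two X objects and the initialization path builds one.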

The C++ Core Guidelines also advise preferring initialization to assignment (rule C.49, "Prefer initialization to assignment in constructors").

Let's finish with a regular developer hat (the dev hat)

TLDR: Prefer initialization to assignment!

#include <string>

class A {   // Good
    std::string s1;
public:
    A(std::string p) : s1{p} { }    // GOOD: directly construct
};

class B {   // BAD
    std::string s1;
public:
    B(const char* p) { s1 = p; }   // BAD: default constructor followed by assignment
};
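Besides the extra work, there is a correctness angle: a const member can only be initialized, never assigned, so the member initializer list is the only place it can receive its value. A minimal sketch with a hypothetical Employee class (the class and its members are made up for illustration):

```cpp
#include <string>
#include <utility>

// Hypothetical class for illustration.
class Employee {
    const int id_;      // const: must be initialized, can never be assigned
    std::string name_;
public:
    // Members are initialized in the member initializer list,
    // before the constructor body starts running.
    Employee(int id, std::string name)
        : id_{id}, name_{std::move(name)} {}

    // The assignment style would not even compile for id_:
    // Employee(int id, std::string name) { id_ = id; name_ = name; }

    int id() const { return id_; }
    const std::string& name() const { return name_; }
};
```

Constructing `Employee e{7, "bugra"};` then gives an object whose id() is 7 and whose name() is "bugra", with each member written exactly once.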

Keywords: copy assignment, Builtin direct assignment for scalar types, copy initialization, direct initialization, list-initialization, temporary objects

ISO C++20 Standard

Top comments (3)

Sandor Dargo • Edited

Just to add to this part:

C++20 standard "6.7.7 Temporary objects" section states that the expression a = f() requires a temporary object for the result of f(), which is materialized so that the reference parameter of X::operator=(const X&) can bind to it.

That's not the only problem we face in getY for the string example.

    std::string x;
    x = "bugra";

In fact, we have to understand how local variables are initialized. When you create a local integer (int i;), memory is reserved for the new variable, but default initialization of a primitive type does nothing. The variable holds whatever value happens to be in that memory (on the stack), and reading it before writing to it is undefined behavior.
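A short sketch of the safe alternatives for primitive types (the function names are made up for illustration). Empty braces request value-initialization, which zeroes a scalar; the uninitialized case is left as a comment because reading it would be undefined behavior.

```cpp
// Value-initialization: empty braces zero-initialize a scalar.
int value_initialized() {
    int i{};
    return i;
}

// Initialization with an explicit value at the point of definition.
int initialized_at_definition() {
    int i{3};
    return i;
}

// Not shown as live code on purpose:
//   int i;      // default-initialized: no value is written
//   return i;   // reads an indeterminate value (undefined behavior)
```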

On the other hand, std::string is not a primitive data type. Objects of class type are default initialized by calling their default constructor. If the class has no default constructor, such code won't compile.

class MyClass {
public:
  explicit MyClass(int num) : m_num(num) {}
private:
  int m_num;
};

int main() {
  MyClass mc;  // ERROR: no matching function for call to 'MyClass::MyClass()'
}

So getting back to getY for the string example.
The line std::string x; creates a variable x which is initialized to an empty string. Then, on the next line, x = "bugra"; assigns a new value to x. x is given a value twice: once by initialization and once by assignment! (The integer i was written only once!)

Another problem is that "bugra" is not a std::string. It is a char array that decays to const char*, so its characters have to be copied into the already constructed x at run time (the _M_replace call in the ASM above), and you pay for it. Hence the immense difference in the ASM code. If we want to avoid that cost, and we have access to C++14, we should use a std::string literal ("bugra"s).

Then the generated ASM code becomes similar:
[Image: ASM code]
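For readers who have not used it, this is roughly what the C++14 std::string literal looks like in code. The function names getZ and getW are hypothetical, and no claim is made here about the assembly a particular compiler generates for them.

```cpp
#include <string>

// Assignment from a std::string literal (requires C++14).
std::string getZ() {
    using namespace std::string_literals;  // enables the s suffix
    std::string x;
    x = "bugra"s;  // right-hand side is already a std::string, so this is a move assignment
    return x;
}

// Initialization from a std::string literal: still the simpler choice.
std::string getW() {
    using namespace std::string_literals;
    auto x = "bugra"s;  // x is deduced as std::string
    return x;
}
```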

But whatever optimizations are turned on, there is no reason in circumstances like these to split declaration from initialization. For example, splitting them means you cannot declare your variables const.
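A small sketch of the const point, with made-up function names. When the initial value needs branching, a conditional expression or an immediately invoked lambda keeps everything in the initializer, so the variable can still be const.

```cpp
#include <string>

// A conditional expression keeps the initialization in one statement,
// so prefix can be const.
std::string greeting(bool formal) {
    const std::string prefix = formal ? "Dear " : "Hi ";
    return prefix + "reader";
}

// For more involved logic, an immediately invoked lambda does the same job.
int clamp_percent(int raw) {
    const int percent = [&] {
        if (raw < 0) return 0;
        if (raw > 100) return 100;
        return raw;
    }();
    return percent;
}
```

With the declaration split from the first write (std::string prefix; followed by an if/else that assigns), neither variable could be const.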

Here is a great talk on this

Buğra Hasbek

Thanks for the great feedback Sandor. I really appreciate it :)

Maresia

Cool, before I got involved with move semantics I had no idea of the difference between assignment and initialization. I believe that many people don't even try to understand :3
constructor -> initialization
operator= -> assignment

[google translator]
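One detail worth adding to that mapping: an = sign inside a declaration is still initialization and calls a constructor, not operator=. A minimal sketch with a hypothetical Probe type that remembers the last special member function used on it:

```cpp
// Hypothetical type for illustration.
// last_op: 'd' = default ctor, 'c' = copy ctor, 'a' = copy assignment.
struct Probe {
    char last_op = 'd';

    Probe() = default;
    Probe(const Probe&) : last_op{'c'} {}
    Probe& operator=(const Probe&) {
        last_op = 'a';
        return *this;
    }
};

char op_for_declaration_with_equals() {
    Probe a;
    Probe b = a;  // copy-initialization: calls the copy constructor despite the '='
    return b.last_op;
}

char op_for_assignment() {
    Probe a;
    Probe b;
    b = a;  // b already exists, so this calls operator=
    return b.last_op;
}
```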