Hi there! In my previous post I told you that I’m looking for my first job as a software developer working abroad. After a few failed technical interviews, I realized that my self-taught learning had left some gaps: concepts and topics that I now need to learn to improve my programming skills.
While searching for study material I found CS50: Introduction to Computer Science, which seemed like a good starting point to review concepts from scratch. The course is intended for people with or without prior programming knowledge and covers topics from beginner level to more advanced.
The intent is to teach students not how to use a programming language, but how to think algorithmically and solve problems efficiently, no matter the tool. The course is divided into weeks, each having a lecture, related materials, labs, and a set of problems to be solved.
In the first lecture (Week 0) the instructor explains the basics of programming, like variables, conditionals, and loops, using Scratch, a tool developed to teach programming. But what I want to highlight is what David Malan says about our career: computer science is fundamentally problem solving. It sounds silly, but many times we programmers start coding without having fully understood the problem and without a clear idea of its solution, which is the most important thing of all.
The instructor gives a very clear explanation of how computers and the binary system work, which was what I liked the most about the lecture.
Computers use the binary system as their only “language” to store and process data. This system uses just 0 and 1 to represent numbers. But why do computers use it? Since computers run on electricity, which can be turned on or off, a bit can be represented by turning a switch on or off. And what is a bit, you may be thinking? It's just a binary digit: a single 0 or 1.
But computers don't use bits separately; they work with groups of 8 bits at a time, known as bytes, like 00000011, which represents the number 3. With 8 bits we can count from 0 to 255, which is why this range is widely used in different contexts, like RGB colors or IP addresses.
So, computers use the binary system to represent information, and the binary system represents numbers. What happens with letters and other characters? The computer uses the ASCII table to map numbers to characters. The ASCII standard includes uppercase and lowercase letters, digits, punctuation, and other special characters. For example, the bytes 01001000, 01001001, and 00100001 represent the decimal numbers 72, 73, and 33, which map to the text HI!
But computers also represent other characters that are not included in the ASCII standard, like emojis. For those characters they use another standard called Unicode. Its most common encoding, UTF-8, does not use a single byte for every character like ASCII does, but from 1 to 4 bytes per character. For example, the four bytes 11110000 10011111 10011000 10110111 represent the emoji 😷.
To represent colors, one of the most used systems is RGB, which uses 3 bytes (from 0 to 255), each one for red, green, and blue respectively. The combination of these three values gives us a variety of shades of colors.
Each pixel has a color; we can represent an image on the screen using millions of pixels, and with thousands or millions of images we get a video. Yes, videos are just a sequence of images. Music can be represented with bits, too. MIDI is one such format that represents music with numbers for each note.
The important thing here is that what a byte means is given by the context. In Microsoft Word, the number 72 could represent the letter H, but in Photoshop it may be the value of one of the RGB channels of a pixel.
This is where typed programming languages come in handy: the programmer declares the data type of the value stored in a variable, like a number or a string, and that tells the computer how to interpret the bytes.
In the Week 1 lecture, David starts with a few words about code quality, which I found very helpful for understanding the goals every programmer should pursue each time they write code. Quality can be evaluated based on 3 aspects:
- correctness, or whether our code solves our problem correctly
- design, or how well-written our code is, based on how efficient and readable it is
- style, or how well-formatted our code is visually
Once we are deep into the world of C, it is important to know its data types. This is useful not just because the concept of a data type is common to every programming language, but also to understand later how the computer's memory works.
In C, these are the data types and their typical sizes in memory (the exact sizes depend on the platform and compiler):
- char → 1 byte (8 bits) → Range (from -128 to 127)
- unsigned char → 1 byte (8 bits) → Range (from 0 to 255)
- int → 4 bytes (32 bits) → Range (from -2^31 to 2^31 - 1, which means about -2 billion to 2 billion)
- unsigned int → 4 bytes (32 bits) → Range (from 0 to about 4 billion)
- long → 8 bytes (64 bits) → Big integers
- float → 4 bytes (32 bits) → About 7 significant decimal digits of precision
- double → 8 bytes (64 bits) → About 15 to 16 significant decimal digits of precision
It is important to know the range of each data type, because going beyond the allowed range causes an overflow. For example, if we try to store a number bigger than about 2 billion in an int variable, an integer overflow occurs: C does not raise an error, and the result is simply not the value we expected.
Limits in the size of data types also bring other kinds of problems, like floating-point imprecision: the inability of computers to represent all possible real numbers with a finite number of bits, like the 32 bits of a float. So our computer has to store the closest value it can, leading to imprecision.
The Y2038 problem is an integer overflow that is going to happen in the year 2038, when systems that store time in a signed 32-bit integer are going to run out of bits to track it. Many years ago, some humans decided to use 32 bits to measure time with Unix time (or Unix epoch time), which is the number of seconds since January 1st, 1970. But since a signed 32-bit integer can only count up to about two billion, in 2038 we’ll reach that limit.
The 32 bits of an integer representing 2147483647 (the maximum value for an int) look like this:
01111111 11111111 11111111 11111111
When we increase that by 1, the bits will actually look like:
10000000 00000000 00000000 00000000
But the first bit in a signed integer represents whether or not it’s a negative value, so the decimal value will be -2147483648, the lowest possible negative value of an int. So computers might actually think it’s sometime in 1901.
Fortunately, these days we have powerful computers, so we can start allocating more and more bits to store higher and higher values.
This is the code for the Problem set of Week 1.
So far the course is great, with lots of information and very clear explanations. It is always good to review the basics, because you find things that you did not remember or did not even know about.
I hope you find what I’m sharing about my learning path interesting. If you’re also taking CS50, leave me your opinion about it.
See you soon! 😉
Thanks for reading. If you are interested in knowing more about me, you can give me a follow or contact me on Instagram, Twitter, or LinkedIn ❤.