Grigory Petrov from Evrone has a couple of fascinating talks on YouTube about programming languages. He has learned a lot about programming language design and how compilers work under the hood, and he has found answers to questions about programming language performance.
Here is my overview of his lectures (1, 2) in English. If you are looking for an answer to the question of why Python or Ruby is slow, this article is for you.
The main question
To describe how programming languages work, we must begin with the CPU. A modern CPU has many cores to improve performance, and modern CPU architecture is challenging to learn and describe. In simple terms, we can say that code execution speed is the number of machine code instructions that one CPU core can execute per unit of time.
The CPU reads each instruction from memory, and reading from main memory is always slow. Therefore every modern CPU has a multi-level cache (L1, L2, ..., Ln) and processor registers. They help the CPU avoid reading data from main memory.
In simple terms, code performance comes down to how effectively our code (machine code) works with memory. Can the data stay in the CPU cache, or does it have to be read from main memory?
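The cache question above can be made tangible with a small experiment. The following is only a rough sketch, not something taken from the lectures: the function names are made up for illustration, and it assumes CPython, where a list holds references to separately allocated integer objects. It sums the same data twice, once visiting elements in order and once in a shuffled order that is less friendly to the CPU cache.

```python
import random
import time


def sum_in_order(values, order):
    """Add up values[i] for every index i in `order`."""
    total = 0
    for i in order:
        total += values[i]
    return total


def time_traversal(values, order):
    """Return (result, elapsed_seconds) for a single traversal."""
    start = time.perf_counter()
    result = sum_in_order(values, order)
    return result, time.perf_counter() - start


def compare_access_patterns(n=1_000_000, seed=0):
    """Sum the same list sequentially and in shuffled order (hypothetical helper)."""
    # Offset by 1000 so every element is a distinct heap object in CPython.
    values = [i + 1000 for i in range(n)]
    sequential = list(range(n))
    shuffled = sequential[:]
    random.Random(seed).shuffle(shuffled)  # same indices, cache-unfriendly order

    seq_result, seq_time = time_traversal(values, sequential)
    rnd_result, rnd_time = time_traversal(values, shuffled)
    return seq_result, rnd_result, seq_time, rnd_time
```

Calling `compare_access_patterns()` returns the two sums, which are always equal, and the two timings. On many machines the shuffled traversal takes longer, but the size of the gap depends on the hardware, the list size and the interpreter version, so treat any numbers as indicative only.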
Take a look at languages
What do programming languages such as C, C++, Rust, Objective-C and Go have in common? When programmers write code in them, they always think about memory: they must specify the data type of each variable, allocate memory on the heap, and keep track of pointers, blocks, etc.
The advantage is that the source code is compiled to machine code, which executes very fast. Nevertheless, everyone knows that writing code in C, C++ or Rust is challenging, because the syntax is tricky and you always have to care about memory.
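To make "specifying the data type" concrete without leaving Python, here is a minimal sketch that uses the standard `array` module as a stand-in for the typed, contiguous storage that C-like languages give you by default. The helper names are hypothetical, and the byte counts assume CPython on a common 64-bit platform.

```python
import sys
from array import array


def describe_storage(count=1000):
    """Compare a plain list with a typed array holding the same integers."""
    boxed = list(range(count))        # list of references to int objects
    typed = array("q", range(count))  # C signed long long values stored back to back
    return {
        "list_container_bytes": sys.getsizeof(boxed),
        "bytes_per_int_object": sys.getsizeof(boxed[-1]),
        "array_container_bytes": sys.getsizeof(typed),
        "array_bytes_per_item": typed.itemsize,
    }


def fits_in_typed_array(value):
    """Return True if `value` fits into one fixed-width slot of an array('q')."""
    try:
        array("q", [value])
        return True
    except OverflowError:
        # Fixed-width storage cannot grow the way a Python int can.
        return False
```

The typed array trades flexibility for a compact, predictable layout: each element occupies a fixed number of bytes, and a value that does not fit is rejected. A plain Python list accepts anything, at the cost of one separately allocated object per element.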
If a developer doesn’t want to worry about memory, they can delegate that routine to a compiler. A compiler is a tool that transforms your source code into machine code, and it tries to do so efficiently. Source code converted to machine code is easily handled by the CPU, and the data is kept in cache and registers.
The third way is to delegate the routine to a runtime, a virtual machine (VM). This means that the language does not compile source code to machine code; it compiles the source code to bytecode and executes the bytecode in the VM. Programming languages such as Python, Ruby and PHP follow this approach. These languages have an awesome, sweet high-level syntax and make it quick and easy to write extensions. But the price of using a virtual machine is speed: the code runs slowly.
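The bytecode step described above can be inspected directly in CPython with the standard `dis` module. This is a small illustrative sketch; the `add` function and the `bytecode_summary` helper are invented for the example, and the exact instruction names differ between Python versions.

```python
import dis


def add(a, b):
    return a + b


def bytecode_summary(func):
    """Return the raw bytecode bytes and the instruction names of `func`."""
    raw = func.__code__.co_code  # what the VM actually executes
    names = [ins.opname for ins in dis.get_instructions(func)]
    return raw, names


if __name__ == "__main__":
    # Print a human-readable listing of the VM instructions for add().
    dis.dis(add)
```

Each listed instruction is dispatched by the interpreter loop at run time rather than executed directly by the CPU, which is one source of the overhead the paragraph above refers to.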
We can imagine each programming language as a character in a role-playing video game (RPG). In a typical RPG, when we create a character we have a limited number of points and a few skills to put them into (strength, defence, intelligence, agility, etc.). Each language has three skills to put points into:
- Speed - speed of execution
- Syntax - enjoyable and elegant high-level syntax
- Extensibility - memory compatibility that makes it easy to write and use third-party code or extensions.
When a person or a group of people starts to design a new programming language, they can choose only two skills as the base. The third one will always be hard to achieve. For example, C, C++, Rust, Objective-C and Go are fast and extensible, but they have low-level, inelegant syntax.
Ruby, Python and PHP have chosen the third way. They have awesome, sweet syntax, and the code is easy to write. When you are writing the code you think only about business logic and don’t worry about memory. Also, it’s easy to use third-party code or write an extension such as NumPy and SciPy in Python, or Nokogiri, Ruby-openCV and Redcarpet in Ruby. The price is performance: these languages are slower than the languages above.
Whenever we say or hear that language N is slow, remember that it was a deliberate decision in the language design. You can ask: “How can language N become fast? Is it possible?” Yes, it is possible, but it is not easy to achieve.
On the other hand, slow languages are flexible. It’s easy to write an extension in C++ and make some type of operation faster, as NumPy and SciPy do in Python. When we do ML in Python or use Ruby to write business logic, we use the advantages of a high-level language and write the code fast. We can also delegate slow operations to fast extensions in third-party libraries in order to work with data, do math operations, parse text, etc.
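The idea of delegating a slow operation to compiled code can be tried with nothing but the standard library. In this sketch the built-in `sum`, which CPython implements in C, stands in for a third-party extension such as NumPy; that substitution is an assumption made to keep the example dependency-free, and the helper names are hypothetical.

```python
import timeit


def python_sum(values):
    """Add numbers with a loop that runs entirely in the bytecode interpreter."""
    total = 0
    for v in values:
        total += v
    return total


def compare(values, repeats=5):
    """Return the best-of-`repeats` time for the Python loop and for built-in sum."""
    loop_time = min(timeit.repeat(lambda: python_sum(values), number=1, repeat=repeats))
    builtin_time = min(timeit.repeat(lambda: sum(values), number=1, repeat=repeats))
    return loop_time, builtin_time
```

For example, `compare(list(range(1_000_000)))` returns two timings in seconds. Both functions compute the same result; the built-in usually finishes sooner because the loop runs in compiled code, though the exact ratio varies by machine and Python version.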
Top comments (17)
First of all, you keep saying that syntax is bad in C, C++ and Rust, but that is very debatable. Some may say that it is better because it is more verbose/expressive, so don't just say that they have bad syntax; what you might have wanted to say is that they have a steep learning curve because of the syntax. Second of all, Java and C# compile to bytecode, not Python and the other languages you typed there. Python code runs on a program called an interpreter, which reads "raw" Python code and executes it at the same time; that's why Python is slow. Java and C#, on the other hand, use a compiler that produces bytecode, and you as the programmer distribute that bytecode instead of the Java or C# code you wrote. The difference between Java-like languages and C-like languages is the code that is produced: the bytecode is cross-platform but needs a VM to run, while the binary produced by C-like languages is not cross-platform but doesn't need a VM to run. That's why C-like languages are the fastest, Java-like languages are second fastest and Python-like languages are the slowest.
And I just want to clarify one point:
Modern interpreted languages don't read and execute the source code line by line. Honestly, they aren't "pure" compiled languages either.
Starting with Ruby 1.9 the official Ruby interpreter implementation switched to YARV ("Yet Another Ruby VM"). It pre-compiles Ruby into bytecodes. Once Ruby source code is converted to bytecode, a VM executes the bytecode. Converting source code to bytecode gave significant speed advantages to Ruby.
Python source code is also compiled to bytecode. In Python, you can directly observe it in the .pyc files. Ruby 2.6.0 introduced a Just-In-Time (JIT) compiler. It compiles frequently used instructions into an optimized binary that runs faster.
So these languages are a hybrid of compiled and interpreted code.
Thank you for your comment. Yes, I agree that the syntax isn't "bad" (I didn't even use the word "bad"). At university I enjoyed writing code in C, before I saw PHP and Python. But some people agree that it's challenging to write such code and that you always have to care about memory.
Grigory Petrov clarified it in the comment
You are saying both here and in your post that you always have to think about memory in C, C++, Rust and Objective-C, but that is not true for C++ anymore. With so many abstractions it has gotten hard to actually think about memory in that language. In Rust it's a similar story, but you have to worry about references instead of "memory".
I always wondered about the syntax factor. If somehow we were able to make C syntax simpler and easier to understand, there would be nothing like it! I think the Julia programming language is attempting the same. What are your thoughts on this?
Also one more factor while choosing the language I feel important is the legacy of the language. Sometimes some programmers choose one language over the other simply because they are more acquainted with one, or compatibility issues.
In the original lectures I elaborate on the "syntax" topic, unwinding it as "syntax that does not require the developer to care about memory". In the end, "high level" and "low level" is about how much you, as a software developer, need to "care" about memory. C and C++, obviously, require such care. Go and Rust, on the other hand, require a DIFFERENT type of care. Still, they require it. Meanwhile Python, Ruby, PHP, JS, etc. are "infinite clay": you can operate on arrays, dicts and other data without worrying about how they are laid out in memory. The same goes for C# and Java.
Hi. Thank you for your question.
The article is only about language design and I accept the fact that there are a lot of reasons to choose a language. One more reason is a job factor. I had to learn PHP because many years ago I knew one company that hires junior developers.
Unfortunately I'm not familiar with Julia, but thanks to your reference I will read more about it.
Only someone with little to no experience in writing Rust could ever claim its syntax is inelegant.
Rust's style is VERY different from C or C++, not only due to its unique rules but because it strongly pushes the programmer to use what is commonly considered to be best practices (in a good way). A language that won't let you (easily) write crappy code isn't very approachable for juniors. But there is currently no more elegant language out there.
I'm sorry for this. Honestly, I had little experience with Rust. Grigory Petrov explained the point about syntax in the comment
I think that you entirely missed the point of the question posed in the title while building your article.
Why, you might wonder?
First, you're trying to judge programming languages from the perspective of how good looking their syntax is. This is completely subjective and has no relevance. When you work long enough with a tool, you'll get used to its looks and whatever you thought about the looks in the first place will get dulled, even to the point of becoming beautiful.
Third, you missed the mark with JNI (or however its equivalent in other languages is called). Using those is actually very fast and is also the reason why Java was able to achieve performances close to C/C++. Sure, writing such stuff is not as easy as pouring Java code, but, after all, it's not meant for every user.
Fourth, there are books out there teaching how to write a compiler. They are a very good read, and, daresay, a must for every developer, and they can help understand in better detail and in more depth what is going on in the insides.
My answer to your question is that in 9 out of 10 cases the guilty party is the programmer. Today we have crash courses claiming to teach you to write code in a matter of hours (some are more honest and make it weeks or months), but what they actually deliver is just another code gluer. Not everyone can be a musician, for example, and likewise not everyone can be a programmer, although it is, by far, easier to become one.
Speed is not always the most important aspect, but choosing the right tool for your work definitely is. Besides, in programming, when speed is paramount, we always have machine code :)
"First, you're trying to judge programming languages from the perspective of how good looking their syntax is. This is completely subjective and has no relevance." Actually there are some languages that have horrible looking syntax that I would honestly avoid. Perl, Brainfck, Intercal, and Python come to mind.
Your usage of the word syntax is just wrong. You could argue that the semantics of a language has an impact on speed, because semantics puts constraints on compiler-generated code. But to say that syntax has anything to do with speed is just wrong. Just look at a for loop:
Completely different syntax. But which one is faster? The one with better compiler because semantics is the same and compilers don't care about syntax. Syntax is used to ease reading for humans. It is literally stripped out of code in compile pass. It means nothing in performance game.
Anyway all other points in article are wrong or badly phrased:
So this article is just wrong. The complete body of the article is built on wording and assumptions that are just wrong, with facts and conclusions that are wrong.
Simplified and very explanatory. Thanks for sharing
That is a real nice post!