Jason C. McDonald


Compiled! An Unobfuscated Glossary

Interpreted. Compiled. Assembled. Linker. Bytecode. Machine code. Assembly language.

GAAAH!

The terminology surrounding building and shipping has gotten so tangled, it hurts! Terms are misunderstood and misapplied. New processes are invented, and old terms recycled to describe them. One group feels the need to denigrate another language by labelling it "interpreted." Another group changes what terms mean in an attempt to "defend" their favorite language, if only to themselves.

It's a jungle out there, folks.

I've been involved in a ton of conversations about this lately. Drawing on those, plus more research than is probably healthy for one day, here is an unobfuscated glossary of terms relating to building and deploying software.

(P.S. I'm not touching web development here.)

A Disclaimer

None of this should ever be taken to imply that Language A is better or worse than Language B. There's no One Right Way™ to ship a project.

Real Programmers Write Code. I don't want to hear one word about "butterflies". ;-)

Updates

This is meant to be a living post. As I find more terms to define, I'll add them. I'll also try to keep the existing definitions up to date.

If you have any suggestions or technical corrections, please leave a comment. I would appreciate sources for corrections when possible.

The Glossary

ABI [Application Binary Interface]

An interface allowing one binary to interact with another binary. If a binary program being executed by the user is interacting with the operating system, it is using an ABI.

API [Application Programming Interface]

An interface allowing the code of one project to interact with another program or library (or other unit of code). APIs are usually used when writing source code.

(Contrast with ABI.)

Architecture

Officially called the instruction set architecture or ISA, this is the set of machine code instructions that a particular CPU uses.

The architecture refers to the abstract instruction set, whereas the machine code is the actual concrete instruction set itself.

AMD64 is an example of an ISA; Intel 64-bit microprocessors and AMD 64-bit microprocessors both support it, but may implement it differently internally.
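
As a rough illustration, Python's standard `platform` module can report the architecture name the current machine identifies itself with. The exact string depends on the operating system (for example `x86_64`, `AMD64`, or `arm64`), so treat the output as informational only.

```python
import platform

# Ask the standard library which architecture this machine reports.
# The value is platform-dependent and may be an empty string if unknown.
machine = platform.machine()
print(f"Reported architecture: {machine!r}")
```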

Archive

A file format wherein multiple files are packaged into one file, generally using some form of a compression algorithm to reduce the size.

Examples: *.tar.gz, *.exe, *.deb, *.dmg.

Assemble (Assembling, Assembled)

The official term for "compiling" an assembly language to machine code. (See also, assembler.)

AOT [Ahead-Of-Time] Compiler

A compiler which compiles (and typically assembles) to machine code just before the code is executed.

(Contrast with assembler, compiler, interpreter, JIT compiler.)

Assembled Language

A programming language which is compiled and assembled to a binary executable file (consisting of machine code), which is then shipped to the end-user.

Examples: C, C++, Ada, COBOL, FORTRAN

Non-Examples: Java, Python

Traditionally (and academically), this is actually called a compiled language, but due to the confusion regarding the modern, commonplace meaning of compiled, I coined this term to unobfuscate the meaning.

The term "Assembled Language" comes from assembler and assembling.

(See also, compiled language; contrast with assembly language.)

Assembler

A tool responsible for assembling an assembly language into machine code.

Assembler Language

(See assembly code)

Assembly Code

(See assembly language)

Assembly Language

A programming language which abstracts machine code into human-readable text-based commands. There is usually an assembly language for each machine language.

Examples: X86, X64, ARM

Binary

{1} A numeric system (base/radix) consisting solely of the digits 0 and 1.

{2} An executable program or library which has been compiled to machine code, and is contained within a binary file.
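
For sense {1}, here is a minimal sketch of the same quantity written in base 10 and in base 2, using only Python built-ins. The variable names are just for illustration.

```python
# Sense {1}: the same quantity written in base 10 and base 2.
value = 10
as_binary_text = bin(value)      # text form of the base-2 digits, prefixed with "0b"
round_trip = int("1010", 2)      # parse base-2 digits back into an integer

print(as_binary_text, round_trip)
```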

Binary File

A file containing machine code encoded in binary.

Bundled

Packaging multiple files together, often into an archive, so they can be more easily moved (distributed) or executed together.

For example, sometimes bytecode is bundled with its interpreter into a single executable, which is then shipped to the end-user.

Bytecode

Code which often superficially resembles assembly language, but which is optimized for processing by an interpreter.

Many interpreted languages will compile the source code to bytecode, which is then shipped to the end-user either by itself, in an archive, or bundled with the corresponding interpreter.
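
To see why bytecode "superficially resembles assembly language," here is a small sketch using Python's standard `dis` module. This shows CPython's bytecode specifically; the instruction names differ between Python versions and other implementations, so no particular listing is promised. The function `add` is a made-up example.

```python
import dis

def add(a, b):
    return a + b

# Each instruction is an operation name plus an optional argument,
# which is why a listing looks a little like assembly language.
opnames = []
for instruction in dis.get_instructions(add):
    opnames.append(instruction.opname)
    print(instruction.opname, instruction.argrepr)

# The raw bytecode itself is just a sequence of bytes.
raw = add.__code__.co_code
```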

Compile (Compiling, Compiled)

To translate from one form of code to another. It is possible to compile to bytecode, object code, assembly code, or machine code.

Academically, this should refer to compiling to machine code, but the distinction has been almost entirely lost to common knowledge; the term assemble should be preferred when referring to producing machine code.

One never "compiles" from one language of source code to another! This is called transpiling.

(Contrast with assemble, lexical analysis, parse, transpile; see also, compiled language, compiler.)
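
As one concrete instance of "translating from one form of code to another," Python exposes its own compile step through the built-in `compile()` function. The sketch below compiles a line of source text to a code object (bytecode, in CPython) and then executes it as a separate step. The filename `"<example>"` is a placeholder that only appears in error messages.

```python
source = "result = 6 * 7"

# compile() translates source text into a code object (bytecode in CPython).
# "<example>" is a placeholder filename used only in error messages.
code_object = compile(source, "<example>", "exec")

# Executing the compiled form is a separate step.
namespace = {}
exec(code_object, namespace)
print(namespace["result"])
```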

Compile-Time

An action which is performed during compiling (or assembling), ahead of executing the program.

If an action is performed as part of AOT compiling or JIT compiling, it is said to be "ahead-of-time [AOT]" or "just-in-time [JIT]", and never "compile-time".

Compiled Language

Any language which is compiled before being shipped. By the modern usage, this can mean many things.

  • Java is a compiled language, because it is compiled to bytecode, which is later assembled to machine code on the target machine.

  • Python is a compiled language, because it is compiled to bytecode ahead of execution.

  • C++ is a compiled language, because it is compiled to machine code.

Traditionally, this had the same definition as assembled language, but due to the shifting definition of compile, this has taken on a new meaning.

This term cannot be applied to any language which is only transpiled to another language.

Examples: C, C++, Java, Python

Non-Examples: Bash

(Contrast with assembled language, transpile; see also, compile, interpreted language, interactive language.)

Compiler

A tool responsible for compiling. Should not be confused with assembler.

This term usually refers to a compiler which runs on the developer's machine, in contrast with an AOT compiler or JIT compiler, which run on the target machine.

This term is sometimes confused with toolchain.

(Contrast with assembler, linker, interpreter, toolchain; see also, AOT compiler, JIT compiler.)

CPU [Central Processing Unit]

The computer hardware responsible for executing instructions provided by machine code. See also, microprocessor.

Dependency

A separate unit of shipped code, usually a library, which a project depends on.

A language's handling of dependencies has no bearing on whether it is an assembled language, compiled language, interpreted language, interactive language, or scripting language.

(See dynamically-linked, standard library, statically-linked.)

Disassembler

A tool which converts a binary file into assembly code.

Deployment Environment

The hardware (or virtualized hardware), system, and runtime environment that the project is running on. There are multiple deployment environments:

  • Development environment, where the project is developed (such as a programmer's individual computer).

  • Testing environment, which is designed for robust, controlled testing of the project.

  • Staging environment, which is meant to resemble the production environment, but is used for final testing before shipping.

  • Production environment, where the finished project runs. This might be a public-facing web server (for web deployment), or an end-user computer (in which case, you'd have multiple different production environments.)

Dynamic Program Analysis

Analysis of a program by executing it, often in its final compiled or assembled form, in a monitored, modified, controlled, or emulated environment.

Memory error detectors, concurrency error detectors, security analysers, and code coverage tools often employ dynamic program analysis.

(Contrast with static program analysis.)
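
Here is a toy sketch of the idea, assuming Python's standard `sys.settrace` hook: the program is actually run, and a monitor records which Python functions get called along the way. Real coverage tools and error detectors are far more sophisticated; `trace_calls`, `helper`, and `main` are hypothetical names.

```python
import sys

def trace_calls(func, *args):
    """Run func(*args) and record the name of every Python function called."""
    called = []

    def tracer(frame, event, arg):
        if event == "call":
            called.append(frame.f_code.co_name)
        return None  # no per-line tracing inside the frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)  # always restore the earlier hook
    return result, called

def helper(x):
    return x * 2

def main(x):
    return helper(x) + 1

result, called = trace_calls(main, 3)
print(result, called)
```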

Dynamic Analyser

A tool which employs dynamic program analysis.

(Contrast with static analyser.)

Dynamically-Linked

Resolution of a dependency at run-time. This requires the dependency to be available (usually in its shipped form) on the target machine.

(Contrast with statically-linked; see also, standard library.)

Environment

(See deployment environment and runtime environment.)

Execute (Executing, Executed)

To run an executable on a target system.

In the case of binary files/machine code, this is called loading.

This can involve a multitude of various steps in the case of an interpreted language, including invoking an AOT compiler, interpreter, or JIT compiler.

In the case of an archive, this step may also involve unpacking the archive in some fashion, such as into a temporary directory or into memory.

(See also, executable, loader, program.)

Executable

A file which is configured to be directly run (executed) as a program, especially by an end-user on a target system. This may be an archive, a binary file, or a modified source code file that provides the runtime environment with instructions on how to invoke the necessary interpreter.
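
The "modified source code file" case usually means a script with a "shebang" line. A minimal sketch, assuming a Unix-like system with `python3` on the path; the filename `hello.py` mentioned in the comment is hypothetical.

```python
#!/usr/bin/env python3
# The first line above is the "shebang": on Unix-like systems it tells the
# system which interpreter to invoke for this file. The file must also be
# marked executable (for example with `chmod +x hello.py`).

def main():
    return "Hello from an executable script"

if __name__ == "__main__":
    print(main())
```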

Grammar

The set of rules for how the source code text of a programming language should be structured. The grammar defines the syntax of the language.

Implementation

In a generic sense, "how something is done." You can have one abstract idea or set of rules, but many implementations.

For example, you can have one programming language (such as Python), but you can build a working version of that language in many different ways (CPython, PyPy, IronPython).
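
If you're curious which implementation is running your own Python code, the standard `platform` module will tell you. The result depends entirely on what you have installed.

```python
import platform
import sys

# Name of the implementation running this code, such as "CPython" or "PyPy".
implementation = platform.python_implementation()
print(implementation, sys.version_info[:2])
```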

Interface

In a generic sense, an interface is a clearly defined means by which two things interact with one another. The implementation of one thing is abstracted by the interface, such that changing the implementation does not (should not) change the interface in any way.

ABI and API are two practical examples of this.
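
A small sketch of one interface with two implementations, using Python's standard `abc` module. All the names here (`Storage`, `DictStorage`, `ListStorage`, `remember`) are hypothetical; the point is only that the calling code does not change when the implementation does.

```python
from abc import ABC, abstractmethod

class Storage(ABC):
    """The interface: callers rely only on these two methods."""

    @abstractmethod
    def save(self, key, value): ...

    @abstractmethod
    def load(self, key): ...

class DictStorage(Storage):
    """One implementation, backed by a dictionary."""

    def __init__(self):
        self._items = {}

    def save(self, key, value):
        self._items[key] = value

    def load(self, key):
        return self._items[key]

class ListStorage(Storage):
    """A different implementation, backed by a list of pairs."""

    def __init__(self):
        self._pairs = []

    def save(self, key, value):
        self._pairs.append((key, value))

    def load(self, key):
        # Search newest-first so later saves win.
        for stored_key, stored_value in reversed(self._pairs):
            if stored_key == key:
                return stored_value
        raise KeyError(key)

def remember(storage):
    # This caller only knows about the interface, not the implementation.
    storage.save("answer", 42)
    return storage.load("answer")

print(remember(DictStorage()), remember(ListStorage()))
```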

Interpreted Language

A language whose shipped files are either source code or bytecode, and which are either (a) executed directly on the target machine by an interpreter (traditional) or virtual machine, or (b) ONLY assembled to machine code on the target machine.

Some interpreted languages use an AOT Compiler or JIT Compiler on the target machine. Other languages use an interpreter (traditional) to directly execute the code without storing the machine code. Often, this comes down to implementation details; to the end-user, there is little difference.

Interpreted languages ship as either source code or bytecode, and require an interpreter or virtual machine to either be shipped to the end-user (either standalone or bundled), or to be pre-installed on the machine.

Examples: Java, Python, JavaScript, Ruby, Perl.

(Contrast with assembled language; see also compiled language, interactive language.)

Interpreter

As a generic term in relation to an interpreted language, can implement or include an AOT-compiler, interpreter (traditional), JIT-compiler, REPL shell, or virtual machine.

In the broadest sense, this is the software that must be installed on the target machine, which is responsible for the ultimate execution of the bytecode or source code shipped to the end-user. You should prefer the more specific term whenever possible.

(Contrast with interpreter (traditional), REPL shell; see also, interpreted language).

Interpreter (Traditional)

A computer program which processes source code or bytecode and directly executes it on the target machine, without first needing to compile/assemble it to machine code.

(Contrast with AOT compiler, interpreter, JIT compiler, virtual machine.)
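
To make "directly executes it" concrete, here is a toy tree-walking evaluator for arithmetic expressions. It borrows Python's own `ast` module to do the parsing, then walks the resulting tree and computes the answer as it goes, never producing machine code. It is a sketch of the idea, not how any production interpreter is built; `OPERATORS`, `evaluate`, and `interpret` are hypothetical names.

```python
import ast
import operator

# Map syntax-tree operator nodes to the functions that carry them out.
OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def evaluate(node):
    """Walk the tree and compute the value of each node directly."""
    if isinstance(node, ast.Expression):
        return evaluate(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in OPERATORS:
        left = evaluate(node.left)
        right = evaluate(node.right)
        return OPERATORS[type(node.op)](left, right)
    raise ValueError(f"unsupported expression: {ast.dump(node)}")

def interpret(source):
    return evaluate(ast.parse(source, mode="eval"))

print(interpret("2 + 3 * 4"))
```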

Interactive Language

A language which is primarily used, executed, and interpreted within the context of an interactive shell.

This is another term I coined to unobfuscate a component of "interpreted language," wherein it is helpful to distinguish between a language which is principally used through an interpreter (traditional) with an interactive shell, and one which passes through an AOT compiler or JIT compiler.

Examples: Bash, Python

(Contrast with interpreted language.)

Interactive Shell

A shell wrapped around an interpreter (traditional), which is used to either execute source code directly, or to execute a file containing source code on demand.

JIT [Just-In-Time] Compiler

A compiler which compiles (and typically assembles) to machine code as the code is executed, often one function, loop, or other frequently executed section at a time.

(Contrast with AOT compiler, assembler, compiler, interpreter.)

Language (Programming Language)

A single concrete set of rules (grammar, syntax, semantics) that can be used to write computer code.

One language can have multiple implementations.

Lexer

A tool that performs lexical analysis. This is often a part of the compiler or the interpreter.

Lexical Analysis (Lex, Lexed, Lexing)

The process by which a string of text, such as source code, is broken up into tokens to be parsed.

This is usually followed by syntactic analysis.

(See also, semantic analysis, syntactic analysis.)

(Contrast with parse; see also, lexer, syntax.)
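
Python ships its own lexer in the standard `tokenize` module, which makes for a handy demonstration. The sketch below breaks one line of source into tokens and keeps only the names, operators, and numbers (dropping the bookkeeping tokens such as newline and end-of-input markers). The sample line is made up.

```python
import io
import tokenize

source = "total = price * 2\n"

# generate_tokens() wants a readline-style callable, so wrap the text.
tokens = [
    (tokenize.tok_name[tok.type], tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type in (tokenize.NAME, tokenize.OP, tokenize.NUMBER)
]
print(tokens)
```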

Library

An independent, often compiled, collection of code which is intended only to be used as a dependency.

In assembled languages, a library can be statically-linked or dynamically-linked. In interpreted languages, a library can be resolved as a dependency in any number of ways.

A library usually has a defined API.

Linter

A static analyser which checks for bugs and errors in source code.

Linker

A tool which is responsible for linking. Part of the toolchain.

Linking

Resolution of dependencies, especially as part of the compiling process.

Loader

A tool which performs loading of the machine code into memory for execution.

Loading

The process by which machine code is loaded into memory on the target system for execution by the CPU.

Machine Code

The instruction set that is directly executed by the CPU. This is entirely made up of numeric codes, and is usually encoded in binary. (See also, architecture; contrast with assembly code.)

Machine Language

(See machine code; see also architecture.)

Microprocessor

A central processing unit that exists as a single integrated circuit (what we'd usually call a "chip"), or a few connected integrated circuits. Most modern computers have microprocessors. (See also, CPU.)

Object Code

Machine code which has not yet been linked to its dependencies by the linker.

(See also, machine code; contrast bytecode.)

Operating System

Essentially a collection of programs, most of which are in machine code, which manages and runs the computer hardware and software on a particular machine. An operating system also provides multiple ABIs to allow interacting with it, and thus, with the hardware and software it manages.

(See also, runtime; contrast with virtual machine.)

Optimization

The process of making code run more efficiently. This can be done manually by the programmer, especially to source code, or automatically by any number of tools, to any form of code.
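
One small automatic optimization you can observe yourself is constant folding. In the sketch below, CPython works out `60 * 60 * 24` while compiling, so the finished code object carries the product as a constant. This is an implementation detail of CPython rather than a guarantee of the Python language, and other implementations or versions may behave differently.

```python
# CPython folds constant arithmetic while compiling, so the product is
# computed once at compile time instead of every time the line runs.
code_object = compile("seconds_per_day = 60 * 60 * 24", "<example>", "exec")

# co_consts holds the constants the compiled code refers to.
print(code_object.co_consts)
folded = 86400 in code_object.co_consts
```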

Optimizer

A tool which performs optimization, especially on bytecode, object code, or (occasionally) machine code.

Package {noun}

A collection of files comprising a single project, and sometimes (although not always) bundled with the tools to execute the project; a package seldom contains its dependencies, however.

A package often takes the form of an archive.

Package (Packaging, Packaged) {verb}

The process by which a project is prepared to be distributed to an end-user as a package.

Parse

(See syntactic analysis.)

Parser

A tool which performs syntactic analysis (also known as parsing).

Processor

See CPU.

Preprocessor

A tool which transforms the source code in various ways in preparation for compilation. Different languages may or may not have this tool, and it may perform a variety of functions.

This is usually the tool that handles macros and "include" directives in languages which support those features.

Program

The complete executable end-result of a project, especially as shipped to the end-user. May have external dependencies. In the case of interpreted languages, may also be bundled.

Programs are generally intended to be interacted with by users, as opposed to libraries.

Project

The complete body of source code which will be shipped as a single program or library.

This term is mainly included for clarity. This may also be called a "solution".

REPL (Read-Evaluate-Print Loop) Shell

See interactive shell.

Run-time

Occurring at the time of execution on the target system.

(Contrast with AOT, compile-time, JIT; see also runtime environment.)

Runtime Environment (Runtime System)

The environment provided to a program which is being executed. In a non-academic, simplified sense, this is everything that the program can "see," including APIs, dependencies, and interfaces (including those provided by the operating system, libraries, and other programs).

(See also, deployment environment, operating system, virtual environment.)

Scripting Language

Traditionally refers to a language which is intended to be used only within a special runtime environment.

Typically, a scripting language is not intended to write an independent project, but rather to provide runtime instructions to another program. However, this distinction is vague, and difficult to determine.

Examples: ActionScript, Bash, FraggleScript, VBScript.

Non-Examples: Python, Java, Ruby, JavaScript (these are interpreted languages, but not scripting languages; again, the distinction is difficult.)

NOTE: Unfortunately, this term has all but entirely lost its definition in the common knowledge, and is usually inappropriately substituted for interpreted language. Culturally, this term is usually used to mean "not a real programming language," and thus has become a derogatory term for disliked languages. Thus, more specific terms should be preferred when possible.

(See also, interactive language; contrast with interpreted language.)

Semantics

What the syntax represents computationally in a programming language, and how it behaves.

In other words, syntax is the structure, but semantics are the meaning.

Run-time errors and warnings usually result from semantic errors.

(Contrast syntax.)
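
A quick sketch of the difference in Python: the expression below is structured correctly, so it parses without complaint, but it means "add a string to an integer," which Python refuses to do at run-time.

```python
import ast

source = '"1" + 1'

# The syntax is valid, so parsing succeeds...
ast.parse(source)

# ...but the meaning is not: Python cannot add a str and an int.
try:
    eval(source)
    outcome = "ok"
except TypeError as error:
    outcome = f"TypeError: {error}"

print(outcome)
```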

Semantic Analyser

A tool which performs semantic analysis.

Semantic Analysis

The process of checking the actual meaning (semantics) of the "parse tree" generated by the syntactic analyser.

This is often a part of the compiler or the interpreter.

(See also, syntactic analysis, lexical analysis.)

Shell

A program which interacts with a user through text-based input and output in real time.

See interactive shell.

Source Code

Text files written following the grammar, syntax, and semantics of a particular programming language.

(Contrast with bytecode, machine code; see also, assembly code.)

Static Analyser

A tool which performs static program analysis.

Static Program Analysis

The process of automatically analysing source code for various purposes, including detecting errors, bugs, inefficiencies, or inconsistencies with particular programming standards or style guides.

(See also, linters; contrast with dynamic program analysis.)
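
Here is a toy linter-style check as a sketch of the idea: it inspects the source without ever running it, and reports the line numbers of `except:` clauses that name no exception type. The function name `find_bare_excepts` and the sample code are hypothetical, and real linters check far more than this.

```python
import ast

def find_bare_excepts(source):
    """Return line numbers of `except:` clauses that name no exception type."""
    tree = ast.parse(source)
    return [
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.ExceptHandler) and node.type is None
    ]

# The sample is only parsed, never executed, so risky() need not exist.
sample = """\
try:
    risky()
except:
    pass
"""
print(find_bare_excepts(sample))
```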

Standalone

Executable by itself, especially without the need of external dependencies or additional tools or programs to be available in the runtime environment.

(Contrast bundled.)

Standard Library

The library which is used by most or all projects written in a particular programming language. A standard library is often distributed as a dynamically-linked library, although in some cases it may be included in the interpreter, or in rarer cases, statically-linked to binaries using it.

(See also, dependencies, library.)

Statically-Linked

Resolution of a dependency at compile-time. This means the dependency does NOT need to be available in some other form on the target machine; the dependency's machine code is contained within the resulting machine code of the project using it.

(Contrast with dynamically-linked; see also, standard library.)

Syntax

The tokens that make up the source code of a programming language, as well as the order in which they should appear, as defined by the grammar.

Compile-time errors and warnings usually result from syntax errors.

(Contrast semantics; see also, grammar.)

Syntactic Analysis

The process of validating the tokens (or "token stream") output from lexical analysis against the syntax (or grammar) rules of the language.

Often, this step involves building a "parse tree", which is easier to compile, especially to bytecode or machine code.

This is often a part of the compiler or the interpreter.

After this step (usually) comes semantic analysis.

(See also, lexical analysis, semantic analysis.)

Target Machine

The physical computer that the finished code is being executed on; usually, this is the computer the end-user is running the shipped software on.

Token

A single unit of meaning in a programming language, as defined by the grammar and syntax of the language. Includes symbols and keywords.

(See also, lexical analysis, grammar, syntax.)

Tokenization

(See lexical analysis.)

Transpile

To compile from one programming language to another. This is technically the correct term for this situation; "compile" in this context is wrong.

For example, TypeScript is transpiled to JavaScript.

Transpiling NEVER relates to compiling to bytecode, assembly code, object code, or machine code.

(See also, assemble, compile.)

Toolchain

A collection of tools which transforms source code into machine code, especially through a single, unbroken process. Also called the "compiler toolchain."

In assembled languages, the toolchain is typically composed of the compiler, linker, and assembler.

Virtual Machine

A program which virtualizes a CPU, and sometimes additional hardware. May execute machine code, or may execute bytecode as if it were machine code.

(See also, interpreter; contrast interpreter (traditional), compiler.)
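
To illustrate "execute bytecode as if it were machine code," here is a toy stack machine. The instruction set (`PUSH`, `ADD`, `MUL`) is invented for this sketch and does not correspond to any real virtual machine; it simply steps through instructions the way a CPU steps through machine code.

```python
def run(program):
    """Execute a list of (opcode, argument) pairs on a tiny stack machine."""
    stack = []
    for opcode, argument in program:
        if opcode == "PUSH":
            stack.append(argument)
        elif opcode == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()

# Made-up "bytecode" for the expression 2 + 3 * 4.
program = [("PUSH", 2), ("PUSH", 3), ("PUSH", 4), ("MUL", None), ("ADD", None)]
print(run(program))
```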

Virtualize (Virtualization, Virtualized)

The process of simulating a CPU through software, often such that machine code can be executed, or that bytecode can be executed as if it were machine code.

Top comments (4)

Timkor

Nice write up! Maybe there is also some space for Intermediate Representation?

Mujeeb Ishaque

Thank you so much for sharing with us.

Eric Ahnell

This really does cut through the chaos of definitions! Great work!

Jason C. McDonald • Edited

Phew. Really glad you liked it. That represents a mind-numbing five hours of my life.

I'm going to go find coffee.