Good programmers copy; Greater programmers steal?
Good artists copy; Great artists steal
- Pablo Picasso
Pablo Picasso is often credited with the above quote. I'm not going to provide my own interpretation of it, as there are loads of interpretations available online for those interested. However, I've often wondered how this quote could apply to programming and to Open Source development.
Is our programming knowledge really our own?
The Internet is full of resources to teach you almost anything programming related. Many of my classmates and I have learned practically all of our programming knowledge from textbooks, online resources/documentation, and YouTube tutorials. These resources have taught us the programming knowledge we have and apply today, but is all of that knowledge stolen? Are you stealing code if you learned to write your first Hello World program from a YouTube tutorial thousands of others have watched? What are the chances the video uploader also learned to write a Hello World program using external resources? Is the uploader's knowledge stolen?
I can't say whether this knowledge is stolen, but I believe this kind of knowledge is necessary for us to build on and expand our skills. No one person can know every perfect implementation or program every line of code for every project.
Open Source software is made to be used!
Open Source software does not exist to be ignored; it is meant to be used! If an Open Source project provides the functionality you need, why implement it yourself when others have painstakingly done it for you? As long as the library is actively maintained and does what you need it to, you should use and study the Open Source library. Should you find any bugs, issues, or missing features, try to create your own solution and contribute it to the project to make it better for everyone! Similarly, if an Open Source library provides functionality you need but you cannot integrate the library, implement the feature in your own project using the Open Source project as inspiration.
Taking inspiration from other projects
In my previous blog post on Code Reading, I read the codebase of Docusaurus to research how the project implements Syntax Highlighting for fenced code blocks. My research taught me that Docusaurus actually uses Prism React Renderer, a third-party library, to provide Syntax Highlighting. This knowledge was useful because I wanted to add syntax highlighting to ctil, my Markdown-to-HTML converter, but didn't want to implement the feature from scratch. Although I can't use Prism React Renderer in my own project, researching Docusaurus gave me the idea to find an Open Source library I could use.
Syntax Highlighting with highlight.js
My search for a third-party syntax highlighter brought me to highlight.js. ctil converts text (.txt) and Markdown (.md) files to generated HTML (.html) files, so I want the generated HTML files to support syntax highlighting. highlight.js can be loaded through HTML tags that point to a Content Delivery Network (CDN), so I was able to add highlight.js by adding the following lines to the generated HTML files:
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/styles/default.min.css">
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/highlight.min.js"></script>
<script>hljs.highlightAll();</script>
With these lines, I was able to add syntax highlighting.
Comparison to Docusaurus approach
In my previous blog post, I described how Docusaurus used Prism React Renderer to add syntax highlighting. Like Docusaurus, I used a third-party library, highlight.js, to add syntax highlighting. However, in Docusaurus, configuration files were added and modified to set up Prism React Renderer, and syntax highlighting was added by using the <Highlight /> component. Prism React Renderer also provides highlighting themes that users can configure in their projects. For my project, I just added highlight.js to my generated HTML files as HTML tags to be delivered through a CDN. For now I'm content with basic syntax highlighting, so I'm not concerned about having a specific highlighting theme. One disadvantage of using highlight.js via CDN is that the syntax highlighting likely won't work if the user is offline. In the future I may add highlight.js to the project itself so syntax highlighting will work offline.
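As a rough illustration of the CDN approach described above, here is a minimal Python sketch of how a converter could wrap its generated body in a page that loads highlight.js. This is not ctil's actual code: the `wrap_page` helper and the `HIGHLIGHT_TAGS` constant are hypothetical names used only for this example, and only the three tags themselves come from this post.

```python
from html import escape

# The three tags shown earlier in this post, copied verbatim.
HIGHLIGHT_TAGS = (
    '<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/styles/default.min.css">\n'
    '<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/highlight.min.js"></script>\n'
    "<script>hljs.highlightAll();</script>"
)


def wrap_page(title: str, body_html: str) -> str:
    """Hypothetical helper: wrap already-converted body HTML in a full page.

    The highlight.js tags go in <head>, so every generated page loads
    the library from the CDN without any per-page configuration.
    """
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n'
        "<head>\n"
        '<meta charset="utf-8">\n'
        f"<title>{escape(title)}</title>\n"
        f"{HIGHLIGHT_TAGS}\n"
        "</head>\n"
        "<body>\n"
        f"{body_html}\n"
        "</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    print(wrap_page("Demo", '<pre><code class="language-html">&lt;p&gt;hi&lt;/p&gt;</code></pre>'))
```

Keeping the tags in a single constant would also make it easier to swap the CDN URLs for locally bundled files later, if offline support is added.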
Next steps.
This feature is still in development. In the current iteration, html is used as the default language class for the <code>...</code> block. This is acceptable for now, but this solution will ignore any language class settings in the original Markdown file(s). I want ctil to parse language tags from the Markdown files to determine which language to use for highlighting. That will be an issue to fix in the future. The issue is available here for those interested.
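One possible way to parse language tags is sketched below in Python. It is only an illustration under a few assumptions, not ctil's implementation: it handles backtick fences only, uses a simplified regular expression rather than a full Markdown parser, and the names `render_code_blocks`, `FENCE_RE`, and `DEFAULT_LANGUAGE` are made up for this example. It relies on highlight.js picking up the language from a `language-` prefixed class on the `<code>` element.

```python
import re
from html import escape

FENCE = "`" * 3  # three backticks, built this way to keep the example readable

# Simplified pattern: an opening fence with an optional language tag,
# the code itself, then a closing fence on its own line.
FENCE_RE = re.compile(
    r"^" + FENCE + r"[ \t]*([\w+#.-]*)[^\n]*\n(.*?)^" + FENCE + r"[ \t]*$",
    re.MULTILINE | re.DOTALL,
)

# Fallback when a fence has no language tag, matching the current default.
DEFAULT_LANGUAGE = "html"


def render_code_blocks(markdown: str) -> str:
    """Replace fenced code blocks with <pre><code class="language-..."> HTML."""

    def replace(match: re.Match) -> str:
        language = match.group(1) or DEFAULT_LANGUAGE
        # Escape the code so characters like < and > display literally.
        code = escape(match.group(2), quote=False)
        return f'<pre><code class="language-{escape(language)}">{code}</code></pre>'

    return FENCE_RE.sub(replace, markdown)


if __name__ == "__main__":
    sample = "Intro\n" + FENCE + "python\nprint('<hi>')\n" + FENCE + "\n"
    print(render_code_blocks(sample))
```

Text outside the fences is left untouched here, so the rest of the Markdown conversion would still need to run separately.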
So is copying code from Open Source projects stealing?
As long as copying is allowed under the project's license and you follow the requirements of the project's license you aren't stealing. Similarly, I would argue looking to open source projects for inspiration is not stealing.
Top comments (22)
Hasn't all the open source code already been stolen by the AI though? :)
The Gen Z in me: based 🗿.
That code is protected by licenses and is meant to be used within their terms.
Copying their code is misconduct and misuse of licence.
Hi Felix,
Thanks for your comment!
I agree with you 100%
I agree, but at the same time GitHub did use the code for training LLMs. It's in their Privacy Policy or somewhere, though; I remember such an argument sometime in early 2023.
Hence Roy's last paragraph:
"As long as copying is allowed under the project's license and you follow the requirements of the project's license you aren't stealing."
Clearly Microsoft / OpenAI etc consider the individual parts of an Open Source project to be fair game. I consider it that way too. If I read some code, that is open source, and as a result I understand a principle, then I can't unlearn that - so just typing it out again is another form of copying.
Software patents exist for protecting whole algorithms, that's fine, I don't like it much but it's fine. Copyright is about exact replication. The terms of a permissive Open Source license are by nature permissive, I'll take what I like. If I take most of it then I'll credit the author and include their license. If it's one function, I probably won't bother and I hope no one else does too.
Feel free to take any part or whole of anything I've made open source and use it as you will. Commercial use - fine, private use - fine. Take bits of it and use it - fine. Publish it as your own - fine. Just don't hold me liable for it working. This is how it should be, not some horrendous race to protect and own knowledge, but rather a freedom to use it.
If license allows then it's not stealing
Legally you can reuse open source code, but ethically you should minimize copying, understand the license terms, and always attribute the original source.
Your workflow is absolutely right. IMHO this is one of the main reasons why we share our code with the public: it gives others the chance to read it, copy it, get a new idea, fork it, or contribute. I do not count that as stealing.
Even Oracle gives us permission to use the source code in the examples of the Java Tutorial they provide, as long as we follow the conditions. In all of the examples, the conditions are always mentioned at the top of the source code as follows:
Although I use the examples mainly for the purpose of learning, chances are I will use parts of the source code, with some modification, in my application.
The four freedoms mean you can copy, modify, look at, and run source code that is under an OSI-certified license. So it's not stealing, it's a freedom that developers enjoy thanks to open source.
Mere duplication does not require an artist or software engineer. Creative forms can take existing work without destroying or otherwise changing the original. But the quote raises the question of what copying and stealing are in the context of creative work.
When an artist copies, they are making derivative works, which might violate the copyright of another. When great artists steal, they take the fundamentals from others but use them to make their own art, which is not copyright infringement.
Similarly, if you merely research open source code to inform your own, you are copying the approach. It is not until your programming changes to incorporate the fundamentals that you have "stolen" something. That's my interpretation of the quote and how it can inform us.
This is why I pair all of my OSS projects with an MIT license. It makes these discussions trivial. Is it always a good solution? Probably not, but it works most of the time.
Whatever I release as OSS, I have already accepted that it will be reused, relabelled, "stolen", copied, etc. It's usually fine; that's why I release it as an OSS project instead of as a product.