The digital age is an age of opportunity, but opportunities come with risks. Developing modern software is a complex task that demands attention to detail and broad knowledge of many aspects of programming. Even experienced programmers make mistakes that lead to bugs, performance issues, or security vulnerabilities. Let’s take a look at some of the most common mistakes developers make when working with GitHub. Whether you’re a novice programmer taking your first steps or a seasoned expert looking to improve your skills, this article should help you avoid common pitfalls when working with the popular Git hosting platform. With this knowledge, we can create solutions that are more reliable, efficient, and secure.
In recent years, GitHub has become an essential tool for developers to collaborate and manage source code. However, misusing it can have a significant impact on the development process. One of the most frequent problems is simply the improper use of version control, which can lead to lost code and, most importantly, to wasting our most precious resource: time.
One example is not committing and pushing changes regularly, which leads to confusion and a lack of insight into the development process. Working on an outdated copy of the repository also invites merge conflicts. A conflict by itself is not a terrible thing, but a poorly managed merge policy means time wasted on resolving conflicts. We also need to remember that mishandling sensitive information in repositories can lead to security vulnerabilities. Being aware of such common GitHub errors is the first step in avoiding them, and with the right knowledge, developers can keep their code organized, secure, and manageable.
None of us is infallible. However, the most common reason for developers’ mistakes is a lack of knowledge. Increasing awareness and proper education allow us to minimize risks and create better solutions. So let’s check out some common slip-ups that happen even to experienced programmers.
- branch deletion
The risk comes from deleting branches that contain important code changes which have not been merged. Deleting a branch removes the reference to its commits; commits that are not reachable from any other branch or tag become orphaned and can eventually be garbage-collected. Recovery is sometimes possible, for example through the local reflog or, for a branch tied to a pull request, through GitHub’s option to restore it, but it should not be relied upon. It’s important to carefully review the code changes and consider the potential risks before deleting any branches on GitHub. In addition, it’s a good practice to regularly back up code changes and ensure that important changes are properly merged into the main codebase.
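A pre-deletion check can be sketched in a few lines of Python. This is a minimal sketch, not a definitive tool: it assumes `git` is installed, that the main branch is called `main`, and that the helper names `parse_branch_list` and `is_safe_to_delete` are purely illustrative. It relies on `git branch --merged <base>`, which lists the branches whose tips are reachable from the base branch.

```python
import subprocess


def parse_branch_list(output: str) -> set[str]:
    """Turn default `git branch` output into a set of branch names."""
    names = set()
    for line in output.splitlines():
        # Each line starts with a two-character marker: "* ", "+ " or "  ".
        name = line[2:].strip()
        # Skip empty lines and the "(HEAD detached at ...)" pseudo-entry.
        if name and not name.startswith("("):
            names.add(name)
    return names


def is_safe_to_delete(branch: str, base: str = "main", repo: str = ".") -> bool:
    """Return True if `branch` is already fully merged into `base`.

    `base` defaults to "main", which is an assumption about the project.
    """
    result = subprocess.run(
        ["git", "-C", repo, "branch", "--merged", base],
        capture_output=True, text=True, check=True,
    )
    return branch in parse_branch_list(result.stdout)
```

Git offers a similar safety net itself: `git branch -d` refuses to delete a branch that is not fully merged, while `git branch -D` forces the deletion.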
- removal of the old repository
Deleting a repository, on purpose or by mistake, can result in the permanent loss of its code, history, and related project data if it was not properly backed up or archived. This can be quite dangerous to a project or organization. Additionally, if the repository contained any sensitive information, such as passwords or access keys, deleting it does not make that information safe: copies may still exist in clones or forks, so such secrets should be revoked or rotated. Otherwise we put ourselves and our organizations at risk of security breaches and data leaks. It is important to carefully consider the potential consequences before deleting any old GitHub repositories.
- losing a local copy
At first glance, this seems like a minor problem. After all, we have a remote repository; we can clone it again at any time and continue working. That is true, but even then we will lose some time (depending on the size and complexity of the project) setting up the environment again. Still, this is only a minor inconvenience.
A much greater risk arises when, for example, we build a PoC from scratch and check certain things before we share or show our work to others. Our local copy may then be the only existing version.
It doesn’t matter whether this happens due to hardware failure, accidental deletion, or any other reason. The result can be the loss of important code changes, documentation, or other project-related data. In turn, this can mean time and effort spent recreating lost work, missed deadlines, and potential damage to the organization’s reputation. If the repository has not been pushed to a remote or backed up to another device or cloud storage service, it may not be possible to recover the lost data. And it will be entirely our fault.
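One way to see how much work exists only on our machine is to list the commits that no remote knows about. The sketch below is an illustration under stated assumptions: `git` is installed, and the helper names `parse_log_lines` and `unpushed_commits` are hypothetical. It uses `git log --branches --not --remotes`, which shows commits reachable from local branches but from no remote-tracking branch.

```python
import subprocess


def parse_log_lines(output: str) -> list[tuple[str, str]]:
    """Split "<hash> <subject>" lines into (hash, subject) pairs."""
    commits = []
    for line in output.splitlines():
        if not line.strip():
            continue
        sha, _, subject = line.partition(" ")
        commits.append((sha, subject))
    return commits


def unpushed_commits(repo: str = ".") -> list[tuple[str, str]]:
    """List commits on local branches that are on no remote-tracking branch."""
    result = subprocess.run(
        ["git", "-C", repo, "log", "--branches", "--not", "--remotes",
         "--format=%h %s"],
        capture_output=True, text=True, check=True,
    )
    return parse_log_lines(result.stdout)
```

In a repository with no remote at all, such as a fresh PoC, every commit shows up in this list, which is exactly the warning sign described above. Uncommitted changes in the working tree are not covered by this check.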
- hardware loss
This topic has already been partially covered above: hardware failures can result in the loss of our local data. But even if we do everything right on our side, a failure can affect the external server that hosts our repository. A total failure of GitHub is unlikely to threaten us, but what if we host our repositories ourselves? Usually, a separate department or dedicated administrators take care of this, and it is not a programmer’s responsibility. However, in a small startup, who knows? We may be responsible for many things, and our own hosting can always fail, so we need to have a backup prepared for such situations.
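A very basic backup of a hosted repository can be sketched with a mirror clone. This is only a sketch under assumptions: `git` is installed, the URL and destination path are placeholders, and the helper names `backup_command` and `run_backup` are hypothetical. It does not replace a proper backup solution, since a mirror clone covers Git data only and not issues, pull requests, or other metadata.

```python
import subprocess
from pathlib import Path


def backup_command(url: str, dest: Path) -> list[str]:
    """Choose the git command that creates or refreshes a mirror of `url`."""
    if dest.exists():
        # Refresh an existing mirror. Pruning is deliberately not requested,
        # so refs deleted on the server are, by default, kept in the backup.
        return ["git", "-C", str(dest), "remote", "update"]
    # First run: copy all refs (branches, tags, notes) into a bare repository.
    return ["git", "clone", "--mirror", url, str(dest)]


def run_backup(url: str, dest: Path) -> None:
    """Run the backup; raises CalledProcessError if git fails."""
    subprocess.run(backup_command(url, dest), check=True)
```

Running `run_backup` on a schedule, with the destination on a different machine or storage service than the one hosting the repository, is what makes such a copy useful when the server fails.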
- problems with credentials or authentication
When working with GitHub, it is very important to ensure that access to repositories is properly managed and controlled. If credentials are compromised or authentication protocols are not properly implemented, unauthorized individuals can access sensitive information or modify code without proper permissions. This may have very serious consequences for companies, organizations, or individuals. To mitigate this risk, it is important to follow access control best practices, implement two-factor authentication, and regularly monitor access logs to detect any suspicious activity.
- committed secrets
This is my “favorite” part. Unfortunately, it is still a very common problem. It usually concerns credentials for databases in various environments, but not only that. It may come as a surprise to many, but the phenomenon is not only widespread; it has actually been growing in recent years. This is confirmed by reports such as “The State of Secrets Sprawl 2023” by GitGuardian. For example, according to Cybernews, around 18K out of 30K investigated Android apps were leaking secrets. Not enough? One in ten GitHub users who made a push in 2022 accidentally exposed a secret.
This is doubly important in today’s IT world, full of cloud services like GCP or AWS. Such platforms charge based on usage: the number of servers, requests, and so on. If we carelessly or unknowingly expose our cloud platform credentials, someone else may consume resources on our account, and we end up with a hefty bill for services we never intended to use. This is one of the most painful developer mistakes.
GitHub addresses this with a feature called “secret scanning.” It doesn’t solve the problem completely, but it makes it easier for us to detect and track secrets accidentally placed in our repositories. You can find more about this feature in GitHub Docs.
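To give a feel for how pattern-based detection works, here is a deliberately tiny sketch in Python. It is not how GitHub’s secret scanning or any particular tool is implemented; the three patterns and the names `SECRET_PATTERNS` and `find_secrets` are illustrative assumptions, and real scanners use far larger rule sets plus extra checks to limit false positives.

```python
import re

# Illustrative patterns only; a real scanner ships hundreds of rules.
SECRET_PATTERNS = {
    # AWS access key IDs commonly start with "AKIA" followed by 16 characters.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # PEM headers such as "-----BEGIN RSA PRIVATE KEY-----".
    "private_key_header": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
    # A quoted value assigned to something named "password".
    "hardcoded_password": re.compile(
        r"(?i)\bpassword\s*[:=]\s*['\"][^'\"]{4,}['\"]"
    ),
}


def find_secrets(text: str) -> list[tuple[int, str]]:
    """Return (line number, rule name) for every line matching a pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings
```

A check like this could run before a commit is created, so the secret never enters the history in the first place. Once a secret has been pushed, removing it from the file is not enough; it should be treated as compromised and rotated.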
- access control and protected areas
One of the key advantages of using GitHub is the ability to control access to repositories, which lets us easily manage and restrict access to the code. At the repository level, we can set permissions to define who has read or write access to the code. This can be configured on a per-user or per-team basis, providing fine-grained control over who can do what in the codebase. In addition, we can control access to specific branches within a repository, or limit who can merge changes or make modifications to certain areas of the code.
GitHub also supports organization-level permissions, enabling centralized control over access to multiple repositories. This allows organizations to manage access at a higher level and ensure consistent permissions across all their repositories.
We can also control access through deploy keys: SSH keys that grant a server or automated system access to a single repository. This provides an additional layer of security, ensuring that only authorized systems can access sensitive code or data.
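Branch-level rules can also be set programmatically. The sketch below only builds a request for GitHub’s REST endpoint for branch protection and does not send it. The owner, repository, branch, token placeholder, and the helper name `branch_protection_request` are all hypothetical, and the payload reflects our understanding of the endpoint’s required fields, so it should be checked against the current GitHub Docs before use.

```python
import json

API_ROOT = "https://api.github.com"


def branch_protection_request(owner: str, repo: str, branch: str,
                              approvals: int = 1) -> tuple[str, dict, str]:
    """Build (url, headers, body) for a PUT that protects `branch`."""
    url = f"{API_ROOT}/repos/{owner}/{repo}/branches/{branch}/protection"
    headers = {
        "Accept": "application/vnd.github+json",
        # Placeholder: supply a real token with admin rights on the repository.
        "Authorization": "Bearer <token>",
    }
    body = {
        # No required CI checks in this sketch.
        "required_status_checks": None,
        # Apply the rules to administrators as well.
        "enforce_admins": True,
        # Require reviewed pull requests before merging.
        "required_pull_request_reviews": {
            "required_approving_review_count": approvals,
        },
        # No per-user or per-team push restrictions.
        "restrictions": None,
    }
    return url, headers, json.dumps(body)
```

The same rules can be configured by hand in the repository settings; scripting them mainly helps when many repositories should share one policy.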
- use of private GitHub accounts
The main problem with private accounts is that their use can lead to a lack of transparency and visibility of the code base. This makes it difficult for managers or other team members to track progress or provide feedback. Additionally, as with the previous point, private accounts make it harder to enforce access controls and security protocols, potentially leading to data breaches or other security issues. Using private accounts, or not configuring them properly, also weakens accountability, as it is more difficult to track who made which changes.
- leaving things out of repositories
This one is very common and also difficult to track. People sometimes treat the GitHub repository as a place only for storing their code. But what about graphics? What about the configuration? Source code is just one component of a project; maybe the biggest, but not the only one. A complex configuration or setup required to build a project should be accompanied by documentation on the installation process and stored with the code. Code that cannot be built on another machine is of little use.
The above list is just a sample of common developer mistakes when working with GitHub. Some of them seem so trivial that we are certain that we (or our employees) don’t make them, yet you can never be sure of anything. What matters is that our mistakes can have serious consequences, like accidentally losing an important piece of code or exposing sensitive information in a repository.
These errors can cause us a lot of damage. To name only a few: delays in project timelines, loss of productivity, a compromise of the project’s security, or exposure to ransomware attacks. Nowadays, security certifications matter a great deal to us and our customers, so such situations are unacceptable. We don’t want our company or project to be cited as an anti-pattern in reports such as GitGuardian’s. Therefore, every developer needs to take the necessary precautions and do as much as possible to minimize the impact of mistakes on GitHub.
So what can we do in this situation? Of course, the first and most important step is to educate ourselves and become aware of the risks associated with the above errors and their consequences. A good knowledge of Git is the foundation for avoiding popular Git errors. But it is not enough. Knowledge of the GitHub platform, and of how to configure our projects and repositories, is another key issue. The aforementioned secret scanning feature by GitHub, or third-party tools like Gitleaks, can be very helpful in keeping secrets out of our code.
Considering all the dangers and their potential consequences, it is crucial to have proper DevOps backup solutions in place. Being able to recover our data in case of any failure, regardless of the cause, is something essential in today’s IT world. Additionally, proper education on the use of GitHub and the risks associated with it can help us avoid common mistakes. By taking these steps, we and our teams can work more confidently and effectively with GitHub while minimizing the potential for mistakes and data loss.