Software engineer explains: Git

Meet Git

Git is software for tracking changes in any set of files. It is mostly used to coordinate work among software engineers who develop source code together. At Sofico, we use Bitbucket, a Git repository hosting service, for software development.

What is version control?

Version control is quite literally about how you can maintain and access different versions of the items you are working on. It allows you to revert files back to a previous state, revert an entire project back to a previous state, or compare changes over time.
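As a minimal sketch of these ideas (not how Git stores data internally), the Python snippet below keeps every saved version of a text in a list. That is enough to read an older version back and to compare two versions. The class `TinyHistory` and its methods are hypothetical names invented for this illustration.

```python
import difflib


class TinyHistory:
    """Toy version store: keeps every saved version of one text file."""

    def __init__(self):
        self.versions = []  # versions[0] holds version 1, and so on

    def save(self, text):
        """Store a new version and return its 1-based version number."""
        self.versions.append(text)
        return len(self.versions)

    def get(self, number):
        """Return the text exactly as it was at the given version."""
        return self.versions[number - 1]

    def compare(self, old, new):
        """Return a line-by-line diff between two versions."""
        diff = difflib.unified_diff(
            self.get(old).splitlines(),
            self.get(new).splitlines(),
            fromfile=f"version {old}",
            tofile=f"version {new}",
            lineterm="",
        )
        return list(diff)


history = TinyHistory()
history.save("color = black\nsize = 10")
history.save("color = grey\nsize = 10")

# Compare changes over time: which lines differ between version 1 and 2?
changes = history.compare(1, 2)

# Revert: version 1 is still available, unchanged.
restored = history.get(1)
```

A real version control system adds much more on top of this, such as who made each change, when, and why.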

In software development, the items that need version control are usually code files, configuration files and, if you are lucky, documentation. Many people already have their very own version control system, maybe without even knowing it. If you submitted your thesis as FinalFinalVersion2, you have been practicing version control. Besides renaming files, another popular strategy is copying files into different directories, for example one directory per date, to keep track of the versions created over time. Unfortunately, this method is error-prone. When you work on multiple projects, and especially when you work with other people on the same content, it soon becomes unmanageable.

Version control is quite literally about how you can maintain and access different versions of the items you are working on. In software development: code files, configuration files and, if you are lucky, documentation.

Nadia

Centralized Version Control Systems

To collaborate on the same code base in a more efficient and reliable way, developers started using Centralized Version Control Systems (CVCSs). In a CVCS (such as Subversion or CVS), a single server contains all versioned items. Everyone who wants to work on the items needs to download them from the main server. Whenever a piece of work is finished, it needs to be uploaded back to the main server. A CVCS provides reliable version control and the ability to see who last modified something and when.

The most important downside of CVCSs is that the main server is a single point of failure. If the server is lost, all that remains are the specific versions that were downloaded to the clients; the full version history is gone. If the server is only temporarily unavailable (or you don’t have an internet connection), there is usually very little work you can do on the project.

A Centralized Version Control System provides reliable version control and the ability to see who last modified something and when. The most important downside is the single point of failure in the main server.

Nadia

Distributed Version Control Systems

For many years, CVCSs were the standard for version control. Because of these downsides, the standard today is to use Distributed Version Control Systems (DVCSs) such as Git or Mercurial. In a DVCS, the single point of failure is avoided because clients fully mirror the server, including the full history. When the server dies, any of the client repositories can serve as a backup and be copied to the server for a full restore. This also makes most operations a lot faster: you act on your local copy instead of communicating with the server for each action. When you can’t reach the server, you can continue to work because you have a local copy of the full project.
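The mirroring idea can be sketched in a few lines of Python. This is a toy model under simplifying assumptions: a repository is just a list of commit messages, and the `Repository` class with its `commit` and `clone` methods is hypothetical, named after the concepts rather than taken from any Git API.

```python
class Repository:
    """Toy repository: an ordered list of commit messages."""

    def __init__(self, history=None):
        self.history = list(history or [])

    def commit(self, message):
        """Record a new version in this repository only."""
        self.history.append(message)

    def clone(self):
        """A clone copies the full history, not only the latest version."""
        return Repository(self.history)


server = Repository()
server.commit("version 1")
server.commit("version 2")

# A developer clones the server and receives the complete history.
laptop = server.clone()

# Working offline: committing only touches the local copy.
laptop.commit("version 3 (made without a connection)")

# The server is lost. Any clone can be used to rebuild it.
server = None
restored_server = laptop.clone()
```

In this sketch the restored server even contains the work that was done offline, because the laptop never depended on the server to record it.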

Why Git?

At Sofico, we have migrated from Subversion to Git as our version control system. Git was first released in 2005 and has evolved into a tool that is easy to use, fast, efficient with large projects, small, secure, and open-source, with strong support for non-linear development via branching. This makes Git a perfect fit for Sofico, where more than 150 contributors commit code on a daily basis for projects all around the world.

Git is a perfect fit for Sofico, where we have more than 150 contributors committing code on a daily basis for projects from all around the world.

Nadia

Branching?

Branching is a term you will soon encounter when working with Git. A branch is an abstraction for an independent line of development. The figure below depicts two branches:

  • The main line of development (often referred to as ‘master’) starts with versions 1, 2, and 3.
  • At version 3, a new branch is started for project A. Notice how it resembles the start of a branch of a tree (with some imagination).
  • While developers are working hard on project A, somebody creates version 4 on the master branch.
  • At this point, another team starts a new branch for project B.

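A branch is nothing more than a pointer to a commit. When the branch for project A is created, it points to its starting point: commit number 3 on master. As work is committed on a branch, its pointer moves along to the newest commit, and every commit remembers its parent. The branch for project B has no knowledge of project A’s changes. It simply starts from its own starting point on master: commit number 4.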
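The pointer idea can be sketched in Python as follows. This is a simplified model, not Git's actual storage format: each commit only remembers its parent, and a branch is just a name that refers to one commit. When a branch is created, it refers to its starting point; each later commit moves it forward. The names `Commit`, `history`, and the label `A1` for the first commit of project A are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    label: str
    parent: Optional["Commit"]  # None for the very first commit


def history(tip):
    """Walk from a branch tip back to the first commit, newest first."""
    labels = []
    while tip is not None:
        labels.append(tip.label)
        tip = tip.parent
    return labels


# The main line of development: versions 1, 2 and 3.
c1 = Commit("1", None)
c2 = Commit("2", c1)
c3 = Commit("3", c2)
branches = {"master": c3}

# Starting project A only adds one more pointer to commit 3.
branches["project-a"] = branches["master"]

# Work on project A moves that pointer; master is untouched.
branches["project-a"] = Commit("A1", branches["project-a"])

# Somebody creates version 4 on master, then project B starts there.
branches["master"] = Commit("4", branches["master"])
branches["project-b"] = branches["master"]
```

Calling `history(branches["project-b"])` walks back through 4, 3, 2 and 1 and never visits `A1`, which is why project B knows nothing about project A's work.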

Pretty soon after it started, project B is finished. The tester verifies the quality of the changes on the branch, which means the changes for project B can now be merged back into the master branch. Git offers multiple merge strategies, and its documentation on them is excellent and worth reading before you start working with Git. In many cases, Git can merge commits automatically. As nothing has changed on the master branch since project B was created, project B can be auto-merged back into the master branch: commit number 5.

  • After project B is merged back into the master branch, project A is also finished, tested, and ready to be merged back into the master branch.
  • But when project A started at commit 3, the master branch was colored black.
  • In the meantime, it has changed: from black to grey in commit 4, and from grey to light blue in merge commit 5 from project B.

What color should project A’s merge commit number 6 be? It could become dark blue, but project A’s dark blue was derived directly from black, so the grey and light blue changes would be lost. We probably don’t want to make it light blue either, as we want the dark blue change to become part of master.

In most cases, the commit will become a blend of dark blue and light blue.

When merging, you have to make sure both the dark blue and the light blue changes are kept. Most of the time, the merging of light blue and dark blue can still be automated. Only when the changes occur in the same places will you need to resolve the conflicts manually.
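The color example can be sketched as a small three-way merge in Python. This is a simplification under stated assumptions: Git merges the lines of files, while this sketch merges named settings in a dictionary, and the function `three_way_merge` and the setting names `background` and `border` are hypothetical. The rule is the same in spirit: compare both sides with their common starting point, keep whatever only one side changed, and report a conflict when both sides changed the same thing differently.

```python
def three_way_merge(base, ours, theirs):
    """Merge two sets of changes that started from the same base.

    Each argument maps a setting name to its value.
    Returns (merged, conflicts).
    """
    merged = {}
    conflicts = []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:      # both sides agree, or neither side changed it
            value = o
        elif o == b:    # only their side changed it
            value = t
        elif t == b:    # only our side changed it
            value = o
        else:           # both sides changed it differently
            conflicts.append(key)
            continue
        if value is not None:  # None means the setting was removed
            merged[key] = value
    return merged, conflicts


commit_3 = {"background": "black", "border": "black"}        # common start
master_5 = {"background": "light blue", "border": "black"}   # after project B
project_a = {"background": "black", "border": "dark blue"}   # project A's work

# The changes touch different places, so the merge is automatic.
blend, no_conflicts = three_way_merge(commit_3, master_5, project_a)

# If project A had also changed the background, a human has to decide.
project_a_clash = {"background": "dark blue", "border": "black"}
partial, clashes = three_way_merge(commit_3, master_5, project_a_clash)
```

In the first merge, `blend` keeps the light blue background from master and the dark blue border from project A. In the second, `clashes` lists `background` as the place that needs manual resolution.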

Is Git here to stay?

I think Git is here to stay for a long time to come – for humans. As a programmer, I create only the static part of the software: the code. I upload my work to the version control system – a snapshot in time. When the code is deployed it joins up with the dynamic part: the state of the application. In our case, different databases. People change the state by interacting with the application. The static code is only changed when developers deploy a new version of the code.

I expect the software of the future to operate quite differently. Powered by AI and modeled after the human brain, the whole concept is different. Humans do not seem to have brain code that operates on brain data. The data is intertwined with the ‘code’: a brain can only be compared to thinking, adapting software. AI development does not tend to focus on timed releases but on continuous learning. Interactions with the software do not only change the state but can also trigger the code to adapt itself to the situations it has encountered.

Git is not built to keep track of changes that occur every second. The concepts will probably remain: ‘thinking software’ also needs branches and needs to be able to roll back to a ‘good’ state, for example when your nice chatbot becomes an offensive racist within 16 hours, like Microsoft’s Tay. Currently, there are tools such as DVC (Data Version Control), which works alongside Git and is tailored more to the structure of data science and machine learning projects. But this is still tailored to humans working on machine learning projects. I’m curious to see what the future of AI version control will bring. With a bit of luck, we humans don’t even have to design it ourselves…
