Introduction

Git plays a crucial role in the toolset of many software developers today. Both experts and beginners can greatly increase their productivity by appropriately leveraging a versioning system as powerful as git. Unfortunately, online tutorials and references dealing with the topic are often either too shallow, leaving readers with a sense of only having touched the tip of the iceberg, or are too long-winded, diving too deeply into details which needn't interest everybody. Therefore, this article aims at providing a way for git beginners as well as those looking for a refresher to become confident with git as fast as possible without missing many relevant details.

This article should be seen as part tutorial and part reference. The first part is a short introduction where the purpose and inner workings of git are quickly discussed. The second section presents a glossary of sorts where the terms I consider most important are explained and relevant examples given. It starts by defining what a git repository and a commit are but delves quickly into more interesting topics such as the differences between an arbitrary head and the HEAD, dirty working trees among others. Finally, the last part is a collection of useful git commands. Although the snippets shown at the end range from simple to a bit more complex, they are quite helpful and getting a feeling for what each does is the key to a good start with git.

Beware, this article is a bit long but that is, in my opinion, a necessary evil. As mentioned before, I've tried to make it neither too shallow nor too deep. I hope you enjoy it. Let me know if you found it useful or have any suggestions.

The Intuitive File Workflow

While programming there are four basic tasks which are constantly performed and usually lead to changes in the codebase: creating, modifying, deleting and checking the status of files. Everyone who writes computer programs knows this, even if only intuitively. This basic workflow will be our reference point throughout the article. It is in this context that one would like to have a system which helps keep track of the codebase and the operations performed on the objects that comprise it, and it is here that git comes into play.


Git Glossary

Git

Git is a distributed version control system that emphasizes speed, data integrity and support for non-linear workflows. Git was created by Linus Torvalds in 2005 for development of the Linux kernel, with other kernel developers contributing to its initial development [1].

Git aims at providing answers for questions such as the following to anyone with access to a so-called git repository.

  • What files exist currently in the codebase?
  • Of those files, which ones comprise the latest version of the codebase?
  • Which files have been modified but not saved?
  • Which have been saved but not shared with the team?
  • What are the changes that led to a particular file looking the way it does?
  • What alternative versions of the files are there?
  • Who was the last person to edit a given file?

The Git Workflow

A key point when discussing how to work with git is that there's no single valid workflow. There are ways of working with git which resemble version control systems such as Subversion [2], and there are more git-like workflows which fully exploit the distributed and dynamic capabilities of git.

In any case, the purpose of this writing is not to thoroughly compare different git workflows; a good overview of some common ways teams leverage git can be found in the footnotes [3]. Instead, we have a look at one workflow which hopefully fits every beginner or intermediate git user.

Having said that, for local repositories there is pretty much only one way to use git. The difference between local and remote repositories is explained further below. For the time being we focus on the local git workflow, which consists of a short one-time setup followed by four recurring steps. The recurring steps can be mapped, to some extent, to those constituting the "intuitive workflow" described previously. The steps are:

  1. Initialize a repository.
  2. Configure the repository.
  3. Create the initial line of development in the repository.
  4. Create files or perform changes.
  5. Collect one or more of the changes performed in a unit called a commit.
  6. Save that set of changes to the history of a given development line.
  7. Make changes available to others.

Steps one to three are usually performed only once, so they are best understood as setup rather than as part of the recurring workflow.

That is also the usual lifecycle of any given file while working with git locally.
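The steps listed above can be sketched as a short shell session. This is only an illustration under a few assumptions: the repository lives in a throwaway directory created with mktemp, the file name and identity are the ones used elsewhere in this article, and step seven is left out because it requires a remote repository.

```shell
set -eu
repo=$(mktemp -d)                         # throwaway directory standing in for ~/gitrepo
cd "$repo"

# Steps 1-3: initialize, configure and name the initial line of development.
git init -q .
git symbolic-ref HEAD refs/heads/master   # call the first branch master, whatever the local default is
git config user.name "sentheon"
git config user.email "contact@sentheon.com"

# Step 4: create a file.
echo "hello" > mynewfile

# Step 5: collect the change in the index.
git add mynewfile

# Step 6: save the staged change to the history of master.
git commit -q -m "add mynewfile"

# Step 7 (making changes available to others) needs a remote and is omitted here.
commit_count=$(git rev-list --count HEAD)
echo "commits on master: $commit_count"
```

After running it, the history of master contains exactly one commit.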

The following image illustrates the lifecycle of a file in the git world more clearly:

Figure: Lifecycle of files in git (taken from https://git-scm.com/book/en/v2/book/02-git-basics/images/lifecycle.png)

Once a file has been created or moved into a git repository, a directory managed by git, it starts as untracked. That means the repository hasn't been told to monitor the changes done to that file.

One can tell git to start monitoring a file by using the command git add mynewfile. At this point git knows that the file mynewfile should be tracked. What actually happened, however, is a bit more complex. By issuing git add mynewfile one actually tells git to put the current version of mynewfile in a list of changes made to the codebase. If one were to add another file called myotherfile, the current version of that file would similarly be added to that list of changes.

After the user has grouped several changes in the aforementioned list of changes, called the stage or index, they can decide to save them to the current version of the codebase and its history.

In other words, by staging a file of which git has no knowledge, we tell git to put the change "addition of a new file to the working directory" in the index. In this way we group that change with our other changes.

A change or group of changes which have been put in the index, that is grouped together and prepared for storage, can then be saved to the history of the codebase. The act of saving one or several changes is called committing and it results in a commit or entry in the project's history.

Files whose current state has been saved in the project's history are deemed unmodified and require no further action from the user. As shown by the diagram above, they can still be removed or edited if desired. If removed from the repository, they become untracked and we are back at square one. If edited, the files are now considered modified. A file being in the modified state means that its current working copy differs from the snapshot git has of it in the history of the repository. If the user so desires, the changes can once again be staged and committed.

This is the basic idea of working locally with git. We tell git to track files, perform changes and tell it at which point in time to take snapshots of the codebase by saving the changes made.
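The states a file passes through can be observed with git status --porcelain, which prints a two-character code per file. The following is a minimal sketch, assuming a throwaway repository and the file name mynewfile from above; the variable names are only illustrative.

```shell
set -eu
repo=$(mktemp -d)                         # throwaway repository
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "sentheon"
git config user.email "contact@sentheon.com"

echo "one" > mynewfile
state_untracked=$(git status --porcelain)   # "?? mynewfile": git does not track the file yet

git add mynewfile
state_staged=$(git status --porcelain)      # "A  mynewfile": addition recorded in the index

git commit -q -m "add mynewfile"
state_unmodified=$(git status --porcelain)  # empty: working copy matches the last commit

echo "two" >> mynewfile
state_modified=$(git status --porcelain)    # " M mynewfile": working copy differs from the snapshot

printf '%s\n' "$state_untracked" "$state_staged" "$state_modified"
```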

Up until now there are many questions open but the next sections will clarify the matter further.

In order to keep an overview of what is being discussed, the following image might come in handy. It shows several things but the details are, at least for now, not so important. We observe that commit objects, formerly described as sets of changes, are connected with each other. Furthermore, the working directory or working tree consists of those files which can be seen by the user in the directory. The stage or index keeps the changes which will be saved as a single commit in the commit chain of the project later on. Finally, lines of development which look like sub-chains made out of commits are also shown, and these are called branches.

Figure: These are some git objects which are dealt with quite often while in the git world (taken from https://marklodato.github.io/visual-git-guide/conventions.svg)

That was a superficial, albeit needed, overview of git. We now move into more interesting territory.

User

In the context of this article, a user is an entity, human or otherwise, which performs actions on the codebase or other artifacts related to or being tracked by git.

Working Tree

The contents of the directory where a git repository has been initialized, e.g. ~/gitrepo or C:\Users\sentheon\gitrepo.

Repository

A git repository is any directory which has been initialized to be one. It can be said that every git repository starts as an empty one only composed of its configuration. As time progresses, it becomes a compilation of all the branches and commits, loosely speaking lines of development and changes respectively, that have led to the codebase being what it is.

Object Identifier

As already mentioned, a git repository is an abstract term describing the grouping of all branches and commits comprising a codebase or project. In order to keep track of such items, git gives each of them a name. Every object stored by git has an object identifier, a hash computed from its contents, while branches carry human-readable names. Branches and commits, for example, are all uniquely identified by strings.

A branch name can be master or feature42 and commits are usually identified by hashes, e.g. 9d8e3d229cc6bb57b156848b1fb4df6bc5fda1c1.
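To see how names and identifiers relate, one can ask git to resolve them with git rev-parse. The sketch below assumes a throwaway repository with a single commit; the variable names are hypothetical.

```shell
set -eu
repo=$(mktemp -d)                         # throwaway repository
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "sentheon"
git config user.email "contact@sentheon.com"
echo "hello" > myfile
git add myfile
git commit -q -m "initial commit"

full_id=$(git rev-parse HEAD)             # full identifier of the latest commit
short_id=$(git rev-parse --short HEAD)    # abbreviated form of the same identifier
branch_id=$(git rev-parse master)         # a branch name resolves to a commit identifier
object_type=$(git cat-file -t "$full_id") # ask git what kind of object the identifier names

echo "$short_id is a $object_type"
```

The branch name and HEAD resolve to the same identifier here, since master is checked out and has one commit.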

Configuration

Git is quite powerful and can be greatly customized. To make this possible it relies on numerous configuration files. The local configuration for a given git repository lives in the .git subdirectory. This, however, is not the only place where information related to git's configuration and the repository is stored.

Git looks for its configuration in the following places:

  • /etc/gitconfig
  • ~/.gitconfig
  • .git/config

Additionally, files such as .gitignore and .gitmodules exist for particular purposes. The former defines which files git should not track and should not be seen by it as part of the codebase although they exist in the directory tree. The latter deals with git submodules, a way to reference git repositories inside git repositories.

Upon initialization, a git directory looks as follows:

[~/gitrepo]$ git init .
Initialized empty Git repository in ~/gitrepo/.git/
[~/gitrepo]$ ls -la
drwxrwxr-x  7 user user .git

The files used by git for storing its configuration can be modified manually, but it is more comfortable to use the tool itself to perform some common operations such as modifying the email and username used by git for new commits.

[~/gitrepo]$ git config --system user.name

[~/gitrepo]$ git config --global user.name
sentheon

[~/gitrepo]$ git config --local user.name

[~/gitrepo]$ git config --global user.email "contact@sentheon.com"

[~/gitrepo]$ git config --global user.email
contact@sentheon.com

[~/gitrepo]$ git config color.ui true

In the output shown above, the git config command has been issued for several scopes. The system scope is the one associated with /etc/gitconfig. --global refers to the one present in the user's home directory, ~/.gitconfig. Finally, --local means the configuration file in ~/gitrepo/.git/config will be used.

As shown previously, git config can be used to show and modify the current state of git's configuration.

Git is a really well documented piece of software. For a more detailed listing of configuration options one can use git config -h. The -h option works similarly for other commands and is worth keeping in mind.
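As a small sketch of the scopes in action, the following session writes two values with --local and reads them back. It assumes a throwaway repository; the identity is the one used throughout this article. When no scope is given for reading, git returns the most specific value it finds, so a local setting takes precedence over a global or system one.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q .

# Write to the repository's own configuration file, .git/config.
git config --local user.name "sentheon"
git config --local user.email "contact@sentheon.com"

local_name=$(git config --local user.name)   # read from the local scope only
effective_email=$(git config user.email)     # no scope given: the most specific value wins

echo "$local_name <$effective_email>"
cat .git/config                              # the values are stored as plain text
```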

Index

The index, also called the staging area or cache, is the place where changes the user wants git to know about are kept in order to be committed later on. When the user creates a file in their working tree, it isn't automatically known to git as a file whose state should be tracked and recorded in the repository's history. For that to happen, the user must add the change to the index.

In the following example we can see that, although the working directory contains two files, the index, shown using git ls-files -s, only contains one. That is, the user has told git to keep track of myfile but not of README.md.

Worded differently, the change "creation of myfile" was added to the stage but "creation of README.md" was not.

[~/gitrepo]$ ls -la
drwxrwxr-x  7 user user .git
-rw-rw-r--  1 user user myfile
-rw-rw-r--  1 user user README.md

[~/gitrepo]$ git ls-files -s
100644 d082433a347e0a1518124642561f537b122bda71 0       myfile
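The same situation can be reproduced in a few lines. This is a sketch under the assumption of a throwaway repository holding the two files from the example; git ls-files lists what is in the index, and with --others --exclude-standard it lists files git does not know about yet.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q .

touch myfile README.md                     # two files in the working tree
git add myfile                             # only one of them goes into the index

tracked=$(git ls-files)                                 # contents of the index
untracked=$(git ls-files --others --exclude-standard)   # files unknown to git

echo "in the index: $tracked"
echo "not in the index: $untracked"
```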

Commits

A commit is a change or series of changes which have been 'saved' or committed to the local repository by telling git to do so. Commits are also referred to as snapshots, since each commit, together with its parents, captures the state of the project at the time it was added to the history. Among other things, a commit 'contains' the identity of the user who committed the set of changes, an object identifier for the commit, the date on which the changes were committed and the changes which were made.

A commit looks like this:

[~/gitrepo]$ git show 19d8e3d229cc6bb57b156848b1fb4df6bc5fda1b

commit 19d8e3d229cc6bb57b156848b1fb4df6bc5fda1b
Author: user <contact@sentheon.com>
Date:   Fri May 20 21:15:48 2016 +0200

    initial commit

    diff --git a/.gitmodules b/.gitmodules
    new file mode 100644
    index 0000000..45dfaa2
    --- /dev/null
    +++ b/.gitmodules
    @@ -0,0 +1,9 @@
    +[submodule "plugins"]
    +       path = plugins
    +       url = https://github.com/getpelican/pelican-plugins
    +[submodule "themes"]
    +       path = themes
    +       url = https://github.com/getpelican/pelican-themes

Feel free to ignore the output of the command; we will deal with commits and their structure in more depth later on.

Branches

Figure: Two git branches with one of them, iss53, merged into the master branch (source: https://git-scm.com/book/en/v2/book/03-git-branching/images/basic-merging-2.png)

A branch is an independent line of development in a given repository. Each branch has its own history, while the working tree and the staging area belong to the repository and are updated to match whichever branch is checked out. Branches exist inside the repository itself and are usually based on a commit from another branch in the repository. A freshly initialized repository starts with one 'main' branch, traditionally called master. New developers coming to a project usually base their work on the master branch; new branches are then created where modifications are introduced, and these branches are later merged into the master branch.

[~/gitrepo]$ git branch -a
* master
  new
  remotes/origin/master

The example above shows that the repository which exists inside the directory ~/gitrepo contains three branches; in other words three development lines of the project. The first one and the one being used or checked out, as pointed out by the asterisk, is the master branch. The second one is another local branch called new. The third is a so-called remote branch described by remotes/origin/master. Remote branches will be discussed later on.
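Creating a branch is cheap, because a branch is essentially a name pointing at a commit. The following sketch, which assumes a throwaway repository with one commit, creates a branch called new and shows that it starts out at the same commit as master.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "sentheon"
git config user.email "contact@sentheon.com"
echo "hello" > myfile
git add myfile
git commit -q -m "initial commit"

git branch new                             # create a second line of development

master_tip=$(git rev-parse master)         # commit the master branch points at
new_tip=$(git rev-parse new)               # commit the new branch points at
current=$(git rev-parse --abbrev-ref HEAD) # creating a branch does not check it out

git branch -v                              # the asterisk marks the checked-out branch
```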

Heads and the HEAD

In git a head is a pointer to the tip of a branch or a revision; more on this topic later. The takeaway at this point is, however, that there is a special head, the HEAD, which simply gives a name to the commit on which the current working tree is based. The HEAD can be understood as a pointer to the most recently committed version of the codebase in the branch currently in use.

In the following example it can be observed that the git history of the current branch master contains only three commits, the latest being the one with hash 518defa8757902f3be2b17cca0dbfe48fae87465. One can see from the output of the show command that the HEAD points at the last commit made in the master branch.

[~/gitrepo]$ git log --pretty=oneline --max-count=4

518defa8757902f3be2b17cca0dbfe48fae87465 fix README.md
f37e6f56a322cd22f9e367f4e1f76d7c9b8b4575 add README.md
19d8e3d229cc6bb57b156848b1fb4df6bc5fda1b initial commit

[~/gitrepo]$ git show HEAD

commit 518defa8757902f3be2b17cca0dbfe48fae87465
Author: sentheon <contact@sentheon.com>
Date:   Fri May 20 21:35:35 2016 +0200

                    ...

The working directory, that is, the files the user can see, is based on the commit pointed at by the HEAD. Modifications committed by the user at this point will be put on top of the commit to which the HEAD points. The newly created commit will then have the old HEAD as its parent in the project's history.
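The relationship between the HEAD, the current branch and a new commit's parent can be checked directly. This is a minimal sketch assuming a throwaway repository; the variable names are only illustrative.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "sentheon"
git config user.email "contact@sentheon.com"
echo "hello" > myfile
git add myfile
git commit -q -m "initial commit"

head_ref=$(git symbolic-ref HEAD)          # the branch the HEAD currently refers to
head_commit=$(git rev-parse HEAD)          # the commit the HEAD resolves to
branch_commit=$(git rev-parse master)      # the tip of master

echo "second" >> myfile
git commit -q -a -m "second commit"        # the new commit is put on top of the old HEAD
parent=$(git rev-parse HEAD~1)             # parent of the commit just created

echo "HEAD -> $head_ref"
```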

Checking Out

Checking out refers to the act of updating the HEAD to a point determined by the head of a branch or an arbitrary commit in another branch. As exemplified by the terminal output below, checking out a branch leads to the modification of the HEAD. Naturally, this also leads to the working tree and index being updated to match the commit which is being checked out, and to git log showing the history of that branch.

[~/gitrepo]$ git branch -v
* master 5bc416d seventh
  new    8ffdf6e sixth

Here we see that two branches are present. The head or latest commit of the master branch is pointing to the commit with commit message seventh. The head of the new branch is pointing to a commit with commit message sixth.

We now observe that the HEAD is pointing to seventh.

[~/gitrepo]$ git show HEAD
commit 5bc416dcd1f353375f846fb655ba68c05044fd18
Author: sentheon <contact@sentheon.com>
Date:   Mon May 23 09:43:26 2016 +0200

    seventh

diff --git a/seventh b/seventh
new file mode 100644
index 0000000..e69de29

However, if we check out the branch new, the HEAD will now point to sixth.

[~/gitrepo]$ git checkout new
Switched to branch 'new'

[~/gitrepo]$ git show HEAD
commit 8ffdf6e97a1c9b752062bea51d5db993b74cb56a
Author: sentheon <contact@sentheon.com>
Date:   Mon May 23 09:41:58 2016 +0200

    sixth

diff --git a/sixth b/sixth
new file mode 100644
index 0000000..e69de29

As a general remark, there is a special case when the HEAD points to a commit which is not the head of a local branch. This is called a “detached HEAD” state. It's a somewhat intricate topic which isn't in the scope of this article.

Two things suffice at this point. First, that checking out anything other than the head of a local branch leads to a detached state. Second, that this can generally be easily solved by checking out a head of a local branch.

The detached HEAD state is meant to allow for experimenting with changes based on a commit at an arbitrary point in the branch history [4].
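For the curious, entering and leaving the detached state takes only two commands. The sketch below assumes a throwaway repository with two empty commits; git symbolic-ref -q HEAD fails quietly when the HEAD does not refer to a branch, which is used here to detect the detached state.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "sentheon"
git config user.email "contact@sentheon.com"
git commit -q --allow-empty -m "first"
git commit -q --allow-empty -m "second"

git checkout -q --detach HEAD~1            # check out a commit instead of a branch

if git symbolic-ref -q HEAD >/dev/null; then
    detached=no                            # HEAD refers to a branch
else
    detached=yes                           # HEAD points directly at a commit
fi

git checkout -q master                     # checking out a local branch re-attaches the HEAD
reattached=$(git symbolic-ref --short HEAD)

echo "detached: $detached, now on: $reattached"
```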

Merging

You have most likely asked yourself what happens when several branches exist on a git repository. What if a developer wishes to incorporate changes found in branch X into the master branch of the project?

This is where merging comes into play. Merging refers to the act of taking changes found on one branch and applying them to another. Since git is a branch-based version control system, codebases managed with it are usually composed of several, commonly short-lived, branches, all of which evolve in parallel. For example, if three developers were to work on a software project with a plugin-based architecture, they could have three branches called plugin A, plugin B and plugin C. They could then at a later point in time merge their respective finished features into the master branch. In this way plugin A, plugin B and plugin C would become part of master and each developer could start working on some other features, e.g. feature A, feature B and feature C.

Here is a quick example showing git merging in action. First a git repository is initialized and the contents of the working tree listed.

[~/gitrepo]$ git init .
Initialized empty Git repository in ~/gitrepo/.git/

[~/gitrepo]$ ls -la
total 12
drwxrwxr-x  3 user user 4096 Jun  6 21:21 .
drwxrwxrwt 19 user  user  4096 Jun  6 21:21 ..
drwxrwxr-x  7 user user 4096 Jun  6 21:21 .git

We then create two files, add them to the index and create a commit with message first, second.

[~/gitrepo]$ touch first

[~/gitrepo]$ touch second

[~/gitrepo]$ git add .

The index now contains the files first and second. Now we commit them with a commit message which explains what was done.

[~/gitrepo]$ git commit -m "first, second"
[master (root-commit) 8e1fba3] first, second
 2 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 first
 create mode 100644 second

We now create a new branch called mybranch and verify that the contents of master have been included in this branch. That is, mybranch is based on the last commit done to master.

[~/gitrepo]$ git checkout -b mybranch
Switched to a new branch 'mybranch'

[~/gitrepo]$ ls -la
total 12
drwxrwxr-x  3 user user 4096 Jun  6 21:22 .
drwxrwxrwt 19 user  user  4096 Jun  6 21:21 ..
-rw-rw-r--  1 user user    0 Jun  6 21:22 first
drwxrwxr-x  8 user user 4096 Jun  6 21:24 .git
-rw-rw-r--  1 user user    0 Jun  6 21:22 second

Now we add a file called third to mybranch and commit the changes.

[~/gitrepo]$ touch third
[~/gitrepo]$ git add .

[~/gitrepo]$ git commit -m "third"

[mybranch 2e4db31] third
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 third

We observe that third is now part of the repository.

[~/gitrepo]$ ls -la
total 12
drwxrwxr-x  3 user user 4096 Jun  6 21:25 .
drwxrwxrwt 19 user  user  4096 Jun  6 21:21 ..
-rw-rw-r--  1 user user    0 Jun  6 21:22 first
drwxrwxr-x  8 user user 4096 Jun  6 21:25 .git
-rw-rw-r--  1 user user    0 Jun  6 21:22 second
-rw-rw-r--  1 user user    0 Jun  6 21:25 third

Similarly, we switch to the master branch and create a file called fourth, add it to the index and commit it.

[~/gitrepo]$ git checkout master
Switched to branch 'master'

[~/gitrepo]$ ls -la
total 12
drwxrwxr-x  3 user user 4096 Jun  6 21:25 .
drwxrwxrwt 19 user  user  4096 Jun  6 21:21 ..
-rw-rw-r--  1 user user    0 Jun  6 21:22 first
drwxrwxr-x  8 user user 4096 Jun  6 21:25 .git
-rw-rw-r--  1 user user    0 Jun  6 21:22 second

[~/gitrepo]$ touch fourth

[~/gitrepo]$ git add .

[~/gitrepo]$ git commit -m "fourth"
[master 8f5e7a5] fourth
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 fourth

[~/gitrepo]$ git branch
* master
  mybranch

As expected, first, second and fourth are part of the master branch.

[~/gitrepo]$ ls -la

total 12
drwxrwxr-x  3 user user 4096 Jun  6 21:25 .
drwxrwxrwt 19 user  user  4096 Jun  6 21:21 ..
-rw-rw-r--  1 user user    0 Jun  6 21:22 first
-rw-rw-r--  1 user user    0 Jun  6 21:25 fourth
drwxrwxr-x  8 user user 4096 Jun  6 21:25 .git
-rw-rw-r--  1 user user    0 Jun  6 21:22 second

[~/gitrepo]$ git status 

On branch master
nothing to commit, working directory clean

We list the contents of mybranch while having master checked out and note that the list of files looks different. In the case of mybranch, first, second and third are contained in the branch but fourth isn't.

[~/gitrepo]$ git ls-tree -r mybranch --name-only
first
second
third

Now we merge mybranch into master.

[~/gitrepo]$ git merge mybranch
Merge made by the 'recursive' strategy.
 third | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 third

[~/gitrepo]$ ls -la
total 12
drwxrwxr-x  3 user user 4096 Jun  6 21:26 .
drwxrwxrwt 19 user  user  4096 Jun  6 21:21 ..
-rw-rw-r--  1 user user    0 Jun  6 21:22 first
-rw-rw-r--  1 user user    0 Jun  6 21:25 fourth
drwxrwxr-x  8 user user 4096 Jun  6 21:26 .git
-rw-rw-r--  1 user user    0 Jun  6 21:22 second
-rw-rw-r--  1 user user    0 Jun  6 21:26 third

[~/gitrepo]$ git status
On branch master
nothing to commit, working directory clean

As can be seen above, we have successfully merged mybranch and its contents with master. This leads to third being incorporated in the master branch.

A visualization of the branch topology at this point can be shown by using git log as follows:

[~/gitrepo]$ git log --pretty=format:'%h %ad | %s%d [%an]' --graph --date=short
*   5c058b5 2016-06-06 | Merge branch 'mybranch' (HEAD -> master) [sentheon]
|\  
| * 2e4db31 2016-06-06 | third (mybranch) [sentheon]
* | 8f5e7a5 2016-06-06 | fourth [sentheon]
|/  
* 8e1fba3 2016-06-06 | first, second [sentheon]

Using a visualization tool is also a possibility. At the moment I prefer gitg.

Figure: Visualizing branch structure with gitg (https://wiki.gnome.org/Apps/Gitg/)

The Idea of Tracking

There are two senses in which the concept of tracking can come into play in git. One is the idea of git tracking changes made to a file. This is the case once the user tells git to stage the current state of a given file by using git add. Files in the working tree which have never been staged (put in the index for a later commit) are deemed untracked. As seen below, the README.md file is untracked, while the creation of .gitignore is shown as a change to be committed.

[~/gitrepo]$ git status 
On branch master

Initial commit

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

    new file:   .gitignore

Untracked files:
  (use "git add <file>..." to include in what will be committed)

    README.md

Therefore, if we wanted the current status of README.md to be saved in our upcoming commit, we would have to add it to the index using git add README.md.

The other way in which the word tracking is used is in the context of branches. A branch which exists in a remote git repository can have a local counterpart which tracks the state of the remote one.

[~/gitrepo]$ git branch -a
* master
  new
  remotes/origin/master

In the terminal output shown above, remotes/origin/master is the tracking branch for the master branch in the associated remote repository. We will talk about remote repositories in a moment.
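Without going into remotes yet, the relationship can be made visible with a local clone. In this sketch a second repository in a temporary directory plays the role of the remote; the directory names remote and local are purely hypothetical. After cloning, the local master branch is set up to follow origin/master.

```shell
set -eu
base=$(mktemp -d)                          # throwaway parent directory

# A repository standing in for the remote one.
git init -q "$base/remote"
cd "$base/remote"
git symbolic-ref HEAD refs/heads/master
git config user.name "sentheon"
git config user.email "contact@sentheon.com"
git commit -q --allow-empty -m "initial commit"

# Cloning creates the remote-tracking branch origin/master.
git clone -q "$base/remote" "$base/local"
cd "$base/local"

upstream=$(git rev-parse --abbrev-ref 'master@{upstream}')   # branch that master follows
echo "master tracks $upstream"
git branch -a                              # lists local and remote-tracking branches
```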

Dirty Working Tree

A dirty file is one which git has been told to track and is different with respect to the index or with respect to the HEAD of the branch. In other words, the dirty state of a repository's working tree is defined by changes to files which are being tracked but haven't been committed. Conversely, one can think of a clean working tree as a working tree where there are no changes to commit and there are no differences between the stage and the working directory.
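Whether the working tree is dirty in this sense can be checked from a script. The following is a minimal sketch assuming a throwaway repository; is_dirty is a hypothetical helper, not a git command, which reports uncommitted changes to tracked files and deliberately ignores untracked ones.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "sentheon"
git config user.email "contact@sentheon.com"
echo "hello" > myfile
git add myfile
git commit -q -m "initial commit"

# Hypothetical helper: succeeds if tracked files have uncommitted changes.
is_dirty() {
    [ -n "$(git status --porcelain --untracked-files=no)" ]
}

if is_dirty; then before=dirty; else before=clean; fi

echo "change" >> myfile                    # modify a tracked file without committing

if is_dirty; then after=dirty; else after=clean; fi

echo "before: $before, after: $after"
```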

Clobbering

Clobbering refers to the action of overwriting a file. Knowing which files in your working tree are untracked or modified is important, since switching branches updates the index and the working tree. If an untracked file doesn't exist in the current branch but does exist in the branch to be checked out, switching would overwrite it and its contents would be lost.

As an example, consider the following terminal output:

[~/gitrepo]$ git ls-tree -r new 
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391    file

[~/gitrepo]$ git ls-tree -r master

[~/gitrepo]$ git branch -a
  * master
    new

[~/gitrepo]$ ls -la
total 12
drwxrwxr-x  3 user user 4096 May 23 12:04 .
drwxrwxrwt 18 user user 4096 May 23 12:04 ..
drwxrwxr-x  8 user user 4096 May 23 12:04 .git

[~/gitrepo]$ touch file

[~/gitrepo]$ git checkout new
error: The following untracked working tree files would be overwritten by checkout:
    file
Please move or remove them before you can switch branches.
Aborting

We can see that there are two branches, new and master. master is currently tracking no files but the new branch contains file. Our HEAD is pointing to the head in master, as shown by the asterisk in the output of git branch -a. If we then create a file using touch file and don't tell git to track it and save its state, we are unable to check out the new branch.

What git sees is that there are files in the working tree which are untracked in the current branch, master, and exist in the branch to be checked out, new. Since git doesn't know what the state of the untracked file is, how is it supposed to know what the user wants to keep or to what extent? Checking out the new branch would lead to git clobbering file. Git will therefore not allow a checkout of new while the working tree is in a dirty state, since checking out new would lead to clobbering file in master.
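The refusal can be reproduced with a short script. This sketch assumes a throwaway repository and gives the untracked file different contents than the tracked one, so that something would actually be lost; the variable names are illustrative.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "sentheon"
git config user.email "contact@sentheon.com"
git commit -q --allow-empty -m "empty start"   # master tracks no files

git checkout -q -b new
echo "tracked content" > file
git add file
git commit -q -m "add file"                # the new branch contains file

git checkout -q master                     # file disappears from the working tree
echo "untracked content" > file            # same name, but unknown to master

# Switching to new would overwrite the untracked file, so git aborts.
if git checkout -q new 2>/dev/null; then result=switched; else result=refused; fi

current=$(git rev-parse --abbrev-ref HEAD) # still on master
content=$(cat file)                        # the untracked file is left untouched
echo "checkout $result, still on $current"
```

Moving or removing the untracked file, or committing it on master first, would let the checkout go through.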

The History

The history is the collection of commits which have been made to a repository. It contains the histories of all branches which have ever existed in the repository, composed of course of the respective commits made to those branches. It need not be linear, as several branches can exist at any given time. This situation leads to a forked history, where at some commit a new branch comes into existence and is tracked independently of the initial master branch.

The following is a simple linear example for the history of a branch:

$ git log --pretty=format:"%h - %an, %ar : %s"
518defa - user, 2 days ago : fix README.md
f37e6f5 - user, 2 days ago : add README.md
19d8e3d - user, 2 days ago : initial commit

This other one, however, shows two branches where the new branch is ahead of the master branch by one commit.

[~/gitrepo]$ git log --graph --full-history --all --pretty=format:"%h%x09%d%x20%s"
* 7527d23        (new) add third
* e212506        (HEAD -> master) add second
* 705b193        add first

In some sense, the previous example still presents a linear history, since the new branch is just one commit ahead of the master branch. Although they are two branches, their heads and the commits which brought those heads to their current state are not different enough to think of them as two separate development lines.

The next example presents the history of a more non-linear and powerful workflow.

[~/gitrepo]$ git ls-tree -r master
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391    fifth
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391    first
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391    fourth
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391    second
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391    seventh
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391    third

As can be seen, first the files being tracked by the master branch are listed using git ls-tree -r master. This shows a series of files starting with the first and ending with the seventh, excluding the sixth.

Afterwards, we list the files being tracked by another branch, the new branch. This time we only find the first, second and sixth files.

[~/gitrepo]$ git ls-tree -r new
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391    first
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391    second
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391    sixth

Now, one would like to know how it has come to be this way. Some kind of graphical representation would also be helpful. For this we use git log --graph.

[~/gitrepo]$ git log --graph --full-history --all --pretty=format:"%h%x09%d%x20%s"

* 5bc416d        (HEAD -> master) add seventh
* 04b25c8        add fourth and fifth
| * 8ffdf6e      (new) add sixth
| * 5e4e01b      remove third
|/  
* 7527d23        add third
* e212506        add second
* 705b193        add first

From bottom to top, we see that in the beginning the first, second and third files were added to the master branch. Then a new branch was created, shown by the |/ bifurcation. In this new branch, the third file was removed and a sixth file was added. According to the history, the head of the new branch is still at the commit in which the sixth file was created. Then, a commit was made to the master branch which created the fourth and fifth files. Finally, another file was created, the seventh, and this change committed. It is to this commit that the head of the master branch is pointing.

Since, as the following terminal output shows, we have currently checked out the master branch, the HEAD of the repository is pointing to the head of the master branch, which was shown by git log previously as 5bc416d (HEAD -> master) add seventh.

[~/gitrepo]$ git branch -v
* master 5bc416d seventh
  new    8ffdf6e sixth

Diffing

Diffing refers to the action of comparing two objects, in this case two git objects. By using git diff one can compare a file in the HEAD with its version in a remote branch. One can also compare two commits with each other and similarly two files with each other. There are many possibilities for what can be achieved with git diff. It is sufficient to say that diffing is a central process in the git world and that the result of diffing two objects is an annotated output which shows what has been changed, i.e. added or deleted.

Taking an arbitrary repository as an example, a diff between the HEAD and its fourth ancestor, HEAD~4, may look similar to this:

[~/gitrepo]$ git diff HEAD~4 HEAD
diff --git a/myfile b/myfile
index b793f86..555b214 100644
--- a/myfile
+++ b/myfile
@@ -1,3 +1,2 @@
 Hello world
-Goodbye world
 Hello again
\ No newline at end of file

We see that the only difference is the absence of the line Goodbye world.
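
A comparable diff can be produced from scratch. This is a minimal sketch which assumes a throwaway repository and a placeholder identity; it only looks one commit back (HEAD~1) instead of four, and the commit messages are made up.

```shell
set -e
repo=$(mktemp -d); cd "$repo"              # throwaway directory (assumption)
git init -q .
git config user.name "Example User"; git config user.email "user@example.com"

printf 'Hello world\nGoodbye world\nHello again\n' > myfile
git add myfile; git commit -q -m "three lines"

printf 'Hello world\nHello again\n' > myfile
git commit -q -am "drop the middle line"

# HEAD~1 is the parent of HEAD. Lines starting with "-" exist only in the
# older commit, lines starting with "+" only in the newer one.
changes=$(git diff HEAD~1 HEAD)
echo "$changes"
```

The order of the arguments matters: swapping them to git diff HEAD HEAD~1 would report the same line as an addition.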

Although this output is enough for simple modifications, following changes by means of pluses and minuses gets tedious quite fast. Fortunately, there are several graphical tools which can be used to visualize the result of a diff. Diffuse is one such tool and is, as of today, my favorite one.

Diffuse can be set as the default diff tool by modifying the configuration variable diff.tool:

[~/gitrepo]$ git config --global diff.tool diffuse
[~/gitrepo]$ git difftool HEAD~4 HEAD

Viewing (1/1): 'myfile'
Launch 'diffuse' [Y/n]: 
[~/gitrepo]$ 

The result in diffuse is the following:

[Figure: Performing diffs with diffuse as difftool. http://diffuse.sourceforge.net/]

Here we see the result of git diff in a more appealing and understandable representation.

Dangling commits

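Dangling commits are commits which cannot be reached from any branch or tag. They still exist in the object database and still record their parents, but no named line of development leads to them, so they sit as "islands" outside the visible commit history. The git stash builds on a closely related idea: the commits it creates belong to no branch and are kept reachable only through the special reference refs/stash.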
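
One way to see a dangling commit is to move a branch back past its latest commit. The sketch below assumes a throwaway repository and a placeholder identity. It asks git fsck to ignore the reflog, since the reflog would otherwise still count as a reference to the abandoned commit.

```shell
set -e
repo=$(mktemp -d); cd "$repo"              # throwaway directory (assumption)
git init -q .
git config user.name "Example User"; git config user.email "user@example.com"

echo one > file; git add file; git commit -q -m "first"
echo two > file; git commit -q -am "second"
orphan=$(git rev-parse HEAD)               # the commit we are about to abandon

git reset -q --hard HEAD~1                 # the branch points at "first" again

# No branch or tag leads to $orphan any more, so fsck reports it as dangling.
dangling=$(git fsck --no-reflogs 2>/dev/null | grep '^dangling commit')
echo "$dangling"

# The object itself is intact and still records its parent.
git show -s --format='%h %s (parent: %p)' "$orphan"
```

As long as the hash is known, such a commit can be given a name again, for instance with git branch rescued followed by the hash.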

The Stash

Stashing changes is a quick and easy way to generate a commit containing all changes which haven't been committed yet and store them away. As explained by the git manpage, the command git stash "stash(es) the changes in a dirty working directory away". Furthermore, "use git stash when you want to record the current state of the working directory and the index, but want to go back to a clean working directory."

The stash works as a stack on top of which one can put such commits until one desires to use them or apply them on top of another commit. It is a bit like bundling everything you haven't quite had the time to properly organize and putting it aside for later use.

[~/gitrepo]$ ls
total 12
drwxrwxr-x  3 user user 4096 Jun  5 22:47 .
drwxrwxrwt 14 user  user  4096 Jun  5 22:47 ..
-rw-rw-r--  1 user user    0 Jun  5 22:47 file
drwxrwxr-x  8 user user 4096 Jun  5 22:47 .git
-rw-rw-r--  1 user user    0 Jun  5 22:46 README.md

[~/gitrepo]$ git status --short
?? file

As seen from the previous output, file is untracked and makes our working directory dirty. With git stash -u we can store away all changes to the working directory, including untracked files, and retrieve them later on.

[~/gitrepo]$ git stash -u
Saved working directory and index state WIP on master: 4963f48 Initial commit
HEAD is now at 4963f48 Initial commit

Consequently, file has "disappeared" and our working tree is once again reflecting the head of the master branch.

[~/gitrepo]$ ls
total 12
drwxrwxr-x  3 user user 4096 Jun  5 22:47 .
drwxrwxrwt 14 user  user  4096 Jun  5 22:47 ..
drwxrwxr-x  8 user user 4096 Jun  5 22:47 .git
-rw-rw-r--  1 user user    0 Jun  5 22:46 README.md

[~/gitrepo]$ git status 
On branch master
nothing to commit, working directory clean

Nonetheless, there is one entry in the stash which contains our former working tree and index.

[~/gitrepo]$ git stash list
stash@{0}: WIP on master: 4963f48 Initial commit

By popping the first entry on the stash we apply the saved changes and restore our former index.

[~/gitrepo]$ git stash pop
Already up-to-date!
On branch master
Untracked files:
  (use "git add <file>..." to include in what will be committed)

    file

nothing added to commit but untracked files present (use "git add" to track)
Dropped refs/stash@{0} (47f58adf2bf6e448964314619403649b79bfc8d1)

This leads to file appearing once again in our working tree.

[~/gitrepo]$ ls -a
total 12
drwxrwxr-x  3 user user 4096 Jun  5 22:51 .
drwxrwxrwt 14 user  user  4096 Jun  5 22:47 ..
-rw-rw-r--  1 user user    0 Jun  5 22:51 file
drwxrwxr-x  8 user user 4096 Jun  5 22:51 .git
-rw-rw-r--  1 user user    0 Jun  5 22:46 README.md

file is also once again listed, as it should be, as untracked.

[~/gitrepo]$ git status
On branch master
Untracked files:
  (use "git add <file>..." to include in what will be committed)

    file

nothing added to commit but untracked files present (use "git add" to track)
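
The whole round trip can be condensed into a few lines. This is a sketch under the same assumptions as the walkthrough (an initial commit plus one untracked file), run in a throwaway repository with a placeholder identity; the variable names are made up.

```shell
set -e
repo=$(mktemp -d); cd "$repo"              # throwaway directory (assumption)
git init -q .
git config user.name "Example User"; git config user.email "user@example.com"
touch README.md; git add README.md; git commit -q -m "Initial commit"

touch file                                 # untracked file, as in the walkthrough
git stash -q -u                            # -u also stashes untracked files
after_stash=$(git status --short)          # empty: the working tree is clean
entries=$(git stash list | wc -l)          # one entry on the stack

git stash pop -q                           # re-apply the entry and drop it
after_pop=$(git status --short)            # file is untracked again
echo "status after pop: $after_pop"
```

git stash apply behaves like pop but leaves the entry on the stack, which is handy when the same changes are needed more than once.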

Remotes

A remote can be understood as an identifier given to a repository which exists on some system, is available to you and can be reached via a path such as https://domain.com/repo.git or ssh://user@server.com/opt/git/repo.git.

One such example would be as follows:

$ git remote -v
origin  https://github.com/getpelican/pelican-themes (fetch)
origin  https://github.com/getpelican/pelican-themes (push)

In this example, origin is the identifier given to the repository at https://github.com/getpelican/pelican-themes.

As with any repository, remote repositories have branches, commits and histories of their own. However, they would be of little use if one were not able to track their contents and potentially alter them. Git solves this problem by keeping track of the contents of remote repositories using local remote-tracking branches. These branches work as local caches which reflect the status of a remote up until the point it was last fetched.

As can be seen below, git is tracking the contents of the origin remote at https://github.com/getpelican/pelican-themes by using local copies of its branches, namely master and previews.

$ git branch -a
* master
  remotes/origin/HEAD -> origin/master
  remotes/origin/master
  remotes/origin/previews
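
That these branches are really just caches can be checked locally. In the sketch below an ordinary repository in a temporary directory stands in for the remote; the paths, the placeholder identity and the before and after variables are assumptions made for illustration.

```shell
set -e
tmp=$(mktemp -d)

# A stand-in "remote": a repository with one commit and two branches.
git init -q "$tmp/upstream"
cd "$tmp/upstream"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"; git config user.email "user@example.com"
touch README.md; git add README.md; git commit -q -m "Initial commit"
git branch previews

git clone -q "$tmp/upstream" "$tmp/local"
cd "$tmp/local"
git remote -v                              # origin points at the path we cloned from
git branch -a                              # master plus the remotes/origin/* copies

# A new commit on the remote stays invisible locally until we fetch.
git -C "$tmp/upstream" commit -q --allow-empty -m "upstream work"
before=$(git rev-parse origin/master)
git fetch -q
after=$(git rev-parse origin/master)
echo "origin/master moved from $before to $after"
```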

Origin and Origin Master

Three terms anyone interested in git is bound to come across sooner or later are master, origin and the combination of both, origin master. They are, once again, nothing but default names given to certain git objects. master is the name given to the default development branch of a repository. As mentioned before, once a repository is initialized using git init, a master branch is also created. origin is the default name given to the remote repository to which local changes will eventually be pushed. origin master then, quite coherently, refers to the master branch at the remote repository origin. Note that neither master nor origin is a must. One can indeed give an arbitrary name to the default branch of a repository, and the same applies to naming remotes.

Cloning

Cloning refers to the act of copying a remote git repository to the user's local filesystem. One can clone a given repository by issuing git clone followed by the ssh, http or https URL as follows:

[~/gitrepo/]$ git clone git@server.com:example.git .
Cloning into '.'...
remote: Counting objects: 24, done.
remote: Compressing objects: 100% (17/17), done.
remote: Total 24 (delta 2), reused 0 (delta 0)
Receiving objects: 100% (24/24), done.
Resolving deltas: 100% (2/2), done.
Checking connectivity... done.

[~/gitrepo/]$ ls -a
total 20
drwxrwxr-x  3 user user 4096 Jun  6 18:15 .
drwxrwxrwt 17 user  user  4096 Jun  6 18:15 ..
drwxrwxr-x  8 user user 4096 Jun  6 18:15 .git
-rw-rw-r--  1 user user   11 Jun  6 18:15 myfile

After the cloning process the user has a working copy of the remote git repository in their file system and can proceed to make changes accordingly.

Pushing

Pushing refers to the act of transferring changes which have happened locally in a git repository to a remote one. When a remote repository is involved, a new step has to be added to the three step git workflow. It becomes:

  1. Make changes
  2. Prepare a commit by staging said changes
  3. Record the commit in the commit chain
  4. Push commit to remote

In the following example a remote git repository is cloned. After cloning it, a file named second is created, added and committed.

[~/gitrepo]$ git clone git@server.com:example.git .
Cloning into '.'...
remote: Counting objects: 7, done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 7 (delta 1), reused 0 (delta 0)
Receiving objects: 100% (7/7), done.
Resolving deltas: 100% (1/1), done.
Checking connectivity... done.

[~/gitrepo]$ echo 'Hello world' > second

[~/gitrepo]$ git add .

[~/gitrepo]$ git commit -m "second"
[master 27a6184] second
 1 file changed, 1 insertion(+)
 create mode 100644 second

We now look at the local history and see that the commit of second has been added to the history. Additionally, we observe that several commits were made before our own in the past. First README.md was added, then first, afterwards first was deleted and finally we added second.

[~/gitrepo]$ git log 
commit 27a6184451cd4f383c9b28336a223bc791813384
Author: sentheon <contact@sentheon.com>
Date:   Mon Jun 6 00:20:56 2016 +0200

    second

commit 9d357c22216ae40cc8babb84b72d11fce03879dc
Author: sentheon <contact@sentheon.com>
Date:   Mon Jun 6 00:18:46 2016 +0200

    remove first

commit e14f7b699680fe4e8093e2b9a6764a2c1ef53ca0
Author: sentheon <contact@sentheon.com>
Date:   Mon Jun 6 00:14:40 2016 +0200

    first

commit 3dabef2ecd20252a90cc819b4634ea16e17500eb
Author: sentheon <contact@sentheon.com>
Date:   Sun Jun 5 23:57:46 2016 +0200

    Added README.md

However, when looking at the history of the master branch on the origin remote, we see that the commit of second is missing. This is because we haven't pushed our changes to the remote repository. In other words, we haven't told the remote repository to incorporate our changes, and this is what pushing is all about.

[~/gitrepo]$ git log origin/master 
commit 9d357c22216ae40cc8babb84b72d11fce03879dc
Author: sentheon <contact@sentheon.com>
Date:   Mon Jun 6 00:18:46 2016 +0200

    remove first

commit e14f7b699680fe4e8093e2b9a6764a2c1ef53ca0
Author: sentheon <contact@sentheon.com>
Date:   Mon Jun 6 00:14:40 2016 +0200

    first

commit 3dabef2ecd20252a90cc819b4634ea16e17500eb
Author: sentheon <contact@sentheon.com>
Date:   Sun Jun 5 23:57:46 2016 +0200

    Added README.md

We now push the changes.

[~/gitrepo]$ git push -u origin master
Counting objects: 2, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (2/2), 245 bytes | 0 bytes/s, done.
Total 2 (delta 0), reused 0 (delta 0)
To git@server.com:example.git
   9d357c2..27a6184  master -> master

It can now be observed that the remote repository has incorporated our changes and that they are now shown in the history.

[~/gitrepo]$ git log origin/master
commit 27a6184451cd4f383c9b28336a223bc791813384
Author: sentheon <contact@sentheon.com>
Date:   Mon Jun 6 00:20:56 2016 +0200

    second

commit 9d357c22216ae40cc8babb84b72d11fce03879dc
Author: sentheon <contact@sentheon.com>
Date:   Mon Jun 6 00:18:46 2016 +0200

    remove first

commit e14f7b699680fe4e8093e2b9a6764a2c1ef53ca0
Author: sentheon <contact@sentheon.com>
Date:   Mon Jun 6 00:14:40 2016 +0200

    first

commit 3dabef2ecd20252a90cc819b4634ea16e17500eb
Author: sentheon <contact@sentheon.com>
Date:   Sun Jun 5 23:57:46 2016 +0200

    Added README.md

At this point the changes are said to have been successfully pushed.

It is worth noting that it is not necessary to push every commit. One can, indeed, work on a cloned repository and create several commits. Then, when the time is appropriate, all the changes can be pushed at once.
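
The difference between committing and pushing can also be measured. In this sketch a bare repository in a temporary directory plays the role of the remote; the paths, the placeholder identity and the variable names are assumptions. The range origin/master..master selects the commits which exist locally but not yet on the remote.

```shell
set -e
tmp=$(mktemp -d)
git init -q --bare "$tmp/remote.git"       # bare repository playing the remote
git init -q "$tmp/work"
cd "$tmp/work"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"; git config user.email "user@example.com"
git remote add origin "$tmp/remote.git"

touch README.md; git add README.md; git commit -q -m "Added README.md"
git push -q -u origin master               # remote and branch are separate arguments

echo 'Hello world' > second
git add second; git commit -q -m "second"
unpushed_before=$(git rev-list --count origin/master..master)
git push -q                                # -u above recorded where master is pushed to
unpushed_after=$(git rev-list --count origin/master..master)
echo "unpushed commits: $unpushed_before before, $unpushed_after after the push"
```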

Pulling

Pulling refers to the act of obtaining changes present in a remote repository which have not been incorporated in the local copy of said repository.

In the following example a user clones a git repository, waits a while and performs some changes. Before pushing the changes, however, discrepancies between the local and remote histories of the master branch are found. Therefore, the user pulls the changes which were made on top of their version of the remote repository before committing their own.

[~/gitrepo/]$ git clone git@server.com:example.git .
Cloning into '.'...
remote: Counting objects: 24, done.
remote: Compressing objects: 100% (17/17), done.
remote: Total 24 (delta 2), reused 0 (delta 0)
Receiving objects: 100% (24/24), done.
Resolving deltas: 100% (2/2), done.
Checking connectivity... done.

[~/gitrepo/]$ ls -a
total 20
drwxrwxr-x  3 user user 4096 Jun  6 18:15 .
drwxrwxrwt 17 user  user  4096 Jun  6 18:15 ..
drwxrwxr-x  8 user user 4096 Jun  6 18:15 .git
-rw-rw-r--  1 user user   11 Jun  6 18:15 myfile

[~/gitrepo/]$ cat myfile 
Hello world

[~/gitrepo/]$ echo 'Goodbye world' >> myfile

Up until this point nothing too interesting has happened. The repository was cloned and myfile changed. This last change hasn't been committed or pushed yet. We will first check whether there are new changes in the remote repository to which we would like to push the change.

The last commit on the local history is the following:

[~/gitrepo/]$ git log -n1
commit 6d773d7c6b060424754186b173a4ff79815b88ce
Author: sentheon <contact@sentheon.com>
Date:   Mon Jun 6 16:15:01 2016 +0000

    Greet the world

The last commit in the remote branch, however, seems to be different.

[~/gitrepo/]$ git fetch
remote: Counting objects: 3, done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 3 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
From server.com:example
   6d773d7..d95ecd1  master     -> origin/master

[~/gitrepo/]$ git log -n2 origin/master
commit d95ecd19f7590fd874cfe2b116c20264e4a01477
Author: sentheon <contact@sentheon.com>
Date:   Mon Jun 6 16:17:38 2016 +0000

    Hello again

commit 6d773d7c6b060424754186b173a4ff79815b88ce
Author: sentheon <contact@sentheon.com>
Date:   Mon Jun 6 16:15:01 2016 +0000

    Greet the world

There is one commit after 6d773d7c6b060424754186b173a4ff79815b88ce in the remote history, the one with the message Hello again. We therefore try to pull the changes in that commit but get an error instead.

[~/gitrepo/]$ git pull
Updating 6d773d7..d95ecd1
error: Your local changes to the following files would be overwritten by merge:
    myfile
Please, commit your changes or stash them before you can merge.
Aborting

It turns out that our working tree is in a dirty state (in this case it differs from the HEAD). This means that by pulling from the remote repository git would have to clobber (overwrite) changes which haven't been saved anywhere. This could lead to data loss and hence git aborts the operation.

In order to continue, we put our changes in the stash for later use.

[~/gitrepo/]$ cat myfile 
Hello world
Goodbye world

[~/gitrepo/]$ git stash
Saved working directory and index state WIP on master: 6d773d7 Greet the world
HEAD is now at 6d773d7 Greet the world

[~/gitrepo/]$ cat myfile 
Hello world

We now pull as was initially intended.

[~/gitrepo/]$ git pull
Updating 6d773d7..d95ecd1
Fast-forward
 myfile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

[~/gitrepo/]$ cat myfile 
Hello world
Hello again

[~/gitrepo/]$ git status
On branch master
Your branch is up-to-date with 'origin/master'.
nothing to commit, working directory clean

At this point the local repository is up-to-date.
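
The pattern used above, fetching first, looking at what is incoming and only then pulling, can be sketched as follows. An ordinary repository in a temporary directory stands in for the remote; the paths, the placeholder identity and the incoming variable are assumptions, and --ff-only is added so that the pull refuses to do anything but a fast-forward.

```shell
set -e
tmp=$(mktemp -d)
git init -q "$tmp/upstream"                # stand-in for the remote repository
cd "$tmp/upstream"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"; git config user.email "user@example.com"
echo 'Hello world' > myfile; git add myfile; git commit -q -m "Greet the world"

git clone -q "$tmp/upstream" "$tmp/local"

# Someone adds a commit to the remote after we have cloned it.
echo 'Hello again' >> myfile; git commit -q -am "Hello again"

cd "$tmp/local"
git fetch -q                                         # only updates origin/master
incoming=$(git log --format=%s HEAD..origin/master)  # commits we do not have yet
echo "incoming: $incoming"
git pull -q --ff-only                                # move master up to origin/master
cat myfile
```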

Nonetheless, we are forgetting something. There was a change we wanted to apply before realizing that there were changes to pull from the remote master branch. The objective was to append Goodbye world to myfile but instead of doing this we saved this change for later in the stash.

To re-apply the change we have saved in the stash we can use git stash pop. Unfortunately, we get a conflict error message from git.

[~/gitrepo/]$ git stash pop
Auto-merging myfile
CONFLICT (content): Merge conflict in myfile

This means that git does not know how to merge the two versions of myfile.

The local one, which is also the one in the head of the master branch, looks like this:

Hello world
Hello again

The one saved in the stash, however, looks like this:

Hello world
Goodbye world

We are now in the terrain of conflict resolution.

Conflicts and Conflict Resolution

Let's go once more over what brought us to having conflicts in our local repository. Below you'll find the terminal output shown in the previous section.

In it the following is shown:

  1. clone a repository
  2. check the contents of myfile
  3. modify myfile
  4. check the local commit chain
  5. check the remote commit chain
  6. notice that there are new changes on origin/master and try to pull them
  7. pull fails due to a dirty working tree
  8. clean the working tree by stashing the index and the modifications
  9. successfully pull changes
  10. try to apply the contents of the stash to the current working tree
  11. git stash pop fails, since the stash is based on the parent of the HEAD and changes the same part of myfile as the commit now at the HEAD

[~/gitrepo/]$ git clone git@server.com:example.git .
Cloning into '.'...
remote: Counting objects: 3, done.
remote: Total 3 (delta 0), reused 1 (delta 0)
Receiving objects: 100% (3/3), done.
Checking connectivity... done.

[~/gitrepo/]$ ls -a
total 16
drwxrwxr-x  3 user user 4096 Jun  7 22:49 .
drwxrwxrwt 30 user  user  4096 Jun  7 22:49 ..
drwxrwxr-x  8 user user 4096 Jun  7 22:49 .git
-rw-rw-r--  1 user user   12 Jun  7 22:49 myfile

[~/gitrepo/]$ cat myfile
Hello world

[~/gitrepo/]$ echo -e "\nGoodbye world" >> myfile

[~/gitrepo/]$ git log -n1
commit cd2eb6563b566a2994c4b836e8974e9f71106825
Author: sentheon <contact@sentheon.com>
Date:   Tue Jun 7 22:49:05 2016 +0200

    Greet the world

[~/gitrepo/]$ git fetch
remote: Counting objects: 3, done.
remote: Total 3 (delta 0), reused 1 (delta 0)
Unpacking objects: 100% (3/3), done.
From server.com:example
   cd2eb65..6906bf3  master     -> origin/master

[~/gitrepo/]$ git log -n2 origin/master
commit 6906bf3594fcb82496e2e5b8da9b3279549d5d09
Author: sentheon <contact@sentheon.com>
Date:   Tue Jun 7 20:52:02 2016 +0000

    Hello again

commit cd2eb6563b566a2994c4b836e8974e9f71106825
Author: sentheon <contact@sentheon.com>
Date:   Tue Jun 7 22:49:05 2016 +0200

    Greet the world

[~/gitrepo/]$ cat myfile
Hello world
Goodbye world

[~/gitrepo/]$ git pull
Updating cd2eb65..6906bf3
error: Your local changes to the following files would be overwritten by merge:
    myfile
Please, commit your changes or stash them before you can merge.
Aborting

[~/gitrepo/]$ cat myfile
Hello world
Goodbye world

[~/gitrepo/]$ git stash
Saved working directory and index state WIP on master: cd2eb65 Greet the world
HEAD is now at cd2eb65 Greet the world

[~/gitrepo/]$ git pull
Updating cd2eb65..6906bf3
Fast-forward
 myfile | 1 +
 1 file changed, 1 insertion(+)

[~/gitrepo/]$ git status
On branch master
Your branch is up-to-date with 'origin/master'.
nothing to commit, working directory clean

[~/gitrepo/]$ git stash pop
Auto-merging myfile
CONFLICT (content): Merge conflict in myfile

Conflict resolution is a topic which seems mysterious at first but, once understood, turns out to be quite straightforward. It refers to the act of manually merging changes when git has not been able to merge them automatically. This usually involves manually editing the files involved and adding them to the stage.

git stash pop tries to apply the commit on top of the stash list to the current working directory, so in a sense, it also works as git merge, only that it merges a commit and not a branch.

Calling git merge leads to an error similar to the one we've already seen. There are changes which have not yet been merged, and they have to be resolved in order to proceed.

[~/gitrepo/]$ git merge
error: merge is not possible because you have unmerged files.
hint: Fix them up in the work tree, and then use 'git add/rm <file>'
hint: as appropriate to mark resolution and make a commit.
fatal: Exiting because of an unresolved conflict.

We observe that git has modified the file with conflicting changes. myfile now contains both the change which comes from origin/master, adding "Hello again" on the second line, and our local change which we popped from the stash, adding "Goodbye world" on the second line.

[~/gitrepo/]$ ls -a
total 20
drwxrwxr-x  3 user user 4096 Jun  7 22:57 .
drwxrwxrwt 30 user  user  4096 Jun  7 22:56 ..
drwxrwxr-x  8 user user 4096 Jun  7 22:57 .git
-rw-rw-r--  1 user user   96 Jun  7 22:57 myfile

[~/gitrepo/]$ cat myfile
Hello world
<<<<<<< Updated upstream
Hello again
=======
Goodbye world
>>>>>>> Stashed changes

We open myfile with a text editor and modify it in the way which best suits us.

[~/gitrepo/]$ cat myfile
Hello world
Hello again
Goodbye world

We have kept both changes. Now we add the file to the index again. This is our way of telling git that the conflicts in myfile have been resolved.

[~/gitrepo/]$ git add myfile

[~/gitrepo/]$ git status
On branch master
Your branch is up-to-date with 'origin/master'.
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

    modified:   myfile

myfile is now listed as modified but not committed. So we commit it and push it.

[~/gitrepo/]$ git commit -m "Fixed conflict"
[master e1bc4c8] Fixed conflict
 1 file changed, 1 insertion(+)

[~/gitrepo/]$ git push
Counting objects: 3, done.
Writing objects: 100% (3/3), 266 bytes | 0 bytes/s, done.
Total 3 (delta 0), reused 0 (delta 0)
To git@server.com:example.git
   6906bf3..e1bc4c8  master -> master

origin/master now contains our changes.

[~/gitrepo/]$ git log -n3 origin/master
commit e1bc4c8c51387858f1533521510a27e51e65dd58
Author: sentheon <contact@sentheon.com>
Date:   Tue Jun 7 23:01:08 2016 +0200

    Fixed conflict

commit 6906bf3594fcb82496e2e5b8da9b3279549d5d09
Author: sentheon <contact@sentheon.com>
Date:   Tue Jun 7 20:52:02 2016 +0000

    Hello again

commit cd2eb6563b566a2994c4b836e8974e9f71106825
Author: sentheon <contact@sentheon.com>
Date:   Tue Jun 7 22:49:05 2016 +0200

    Greet the world
[~/gitrepo/]$
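
The complete scenario, from the clone to the pushed resolution, can be replayed without a server. The following is a sketch under several assumptions: a bare repository in a temporary directory plays the remote, and two clones owned by the hypothetical users alice and bob play the two parties. Instead of opening an editor, the resolved file is simply written out.

```shell
set -e
tmp=$(mktemp -d)
git init -q --bare "$tmp/remote.git"
git -C "$tmp/remote.git" symbolic-ref HEAD refs/heads/master

# First repository (hypothetical user alice) publishes the initial commit.
git init -q "$tmp/alice"; cd "$tmp/alice"
git symbolic-ref HEAD refs/heads/master
git config user.name "Alice"; git config user.email "alice@example.com"
git remote add origin "$tmp/remote.git"
echo 'Hello world' > myfile; git add myfile; git commit -q -m "Greet the world"
git push -q -u origin master

# Second clone (hypothetical user bob) is where the conflict will happen.
git clone -q "$tmp/remote.git" "$tmp/bob"
git -C "$tmp/bob" config user.name "Bob"
git -C "$tmp/bob" config user.email "bob@example.com"

# alice moves the remote ahead.
echo 'Hello again' >> myfile; git commit -q -am "Hello again"; git push -q

cd "$tmp/bob"
echo 'Goodbye world' >> myfile             # uncommitted local edit
git stash -q
git pull -q --ff-only
git stash pop -q || true                   # fails: both sides changed the second line
markers=$(grep -c '^<<<<<<<' myfile)       # git left conflict markers in the file

printf 'Hello world\nHello again\nGoodbye world\n' > myfile   # keep both lines
git add myfile                             # marks the conflict as resolved
git commit -q -m "Fixed conflict"
git stash drop -q                          # a conflicting pop keeps the stash entry
git push -q
```

One detail worth knowing: when git stash pop ends in a conflict, the entry is not removed from the stash, which is why the sketch drops it by hand once the conflict has been resolved.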

As you might have noticed, manually resolving merge conflicts is a troublesome task. It involves moving text around, copying, deleting and so on. It is for this reason that GUIs for merging and resolving conflicts might be a better alternative. My favorite at the moment is Meld.

We can set meld as our merge.tool by installing it and issuing git config --global merge.tool meld. Under Ubuntu it can be easily installed with sudo apt-get install meld.

[~/gitrepo/]$ git mergetool
Merging:
myfile

Normal merge conflict for 'myfile':
  {local}: modified file
  {remote}: modified file

At this point git calls our chosen tool.

As you can see, the text in origin/master is shown on the left, while our changes are shown on the right. The final contents of myfile are shown in the middle.

A good thing about git mergetool is that it keeps a myfile.orig file, which contains the file as it was before the conflict resolution, just in case something goes wrong.

[Figure: Initial conflict view in Meld. http://meldmerge.org/]

In this case we choose to only keep the remote changes and discard our own by clicking on the arrow to the left. Finally, we save the changes and close the GUI by using CTRL+S and then CTRL+Q.

[Figure: Incorporating changes with Meld. http://meldmerge.org/]

If we try to run our mergetool again, git lets us know that there are no more conflicts to be resolved.

[~/gitrepo/]$ git mergetool
No files need merging

[~/gitrepo/]$ git status
On branch master
Your branch is up-to-date with 'origin/master'.
Untracked files:
  (use "git add <file>..." to include in what will be committed)

    myfile.orig

nothing added to commit but untracked files present (use "git add" to track)

[~/gitrepo/]$ cat myfile.orig
Hello world
<<<<<<< Updated upstream
Hello again
=======
Goodbye world
>>>>>>> Stashed changes

myfile.orig is still there and can be removed. Another alternative would be to put files with a .orig extension in .gitignore if you wish to keep such files locally.
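
Both ways of dealing with the backup files can be tried out quickly. This is a sketch in a throwaway repository: the first half hides *.orig files through .gitignore, the second half uses git's mergetool.keepBackup setting to stop git mergetool from leaving such backups behind in the first place. The before and after variables are only there for illustration.

```shell
set -e
repo=$(mktemp -d); cd "$repo"              # throwaway directory (assumption)
git init -q .
touch myfile.orig                          # pretend a mergetool left this behind
before=$(git status --short)               # myfile.orig shows up as untracked

echo '*.orig' > .gitignore                 # option 1: keep the files but hide them
after=$(git status --short)                # only .gitignore itself is listed now
echo "before: $before / after: $after"

git config mergetool.keepBackup false      # option 2: keep no backup after merging
```

The setting is applied to this repository only. Adding --global to the last command would apply it to every repository of the current user.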

In any case, myfile looks as follows.

[~/gitrepo/]$ cat myfile
Hello world
Hello again

In this way we have managed to solve merging conflicts caused by edit collisions.

Advanced topics

There are some topics I've purposely left out since I consider them to be out of scope. I do hope, however, that with what I've presented so far you will now be better equipped to delve into those topics on your own. For completeness' sake, the topics I left out, but which you might want to look into after you've gained more confidence with git, are:

  1. resetting
  2. cherry-picking
  3. rebasing
  4. git workflows

Finally, I've prepared a cheat sheet with my favorite commands which I hope you find useful. You can find it here

Final words

As far as basic git terminology and usage goes, that was pretty much it. Below you will find the most common commands used in the git world. Some you'll use on a daily basis, some others once every month. I've decided to include all my favorite ones in order to give beginners a good overview of how one can employ git.

One more thing: practice what you've read so far! Create a local repository or head over to GitHub or GitLab and create a remote one. Practice cloning, adding files, committing changes, creating branches, merging and so on. With enough practice you'll become familiar enough with git to start fine-tuning your usage of the tool and your workflow.

Thank you for reading, feel free to comment or contact me via email. Do let me know if there's anything you'd like to know or there's something that should be corrected, added or clarified further.

Of course, feel free to subscribe by entering your email on the sidebar if you are interested in receiving updates whenever I write something new.


Resources

Footnotes


  1. https://en.wikipedia.org/wiki/Git_(software) 

  2. https://en.wikipedia.org/wiki/Subversion_(software) 

  3. https://www.atlassian.com/git/tutorials/comparing-workflows 

  4. https://git-scm.com/docs/git-checkout#_detached_head 


