Thursday, November 10, 2011

Git Tutorial


1. Git

1.1. What is Git?

Git is a distributed version control system (DVCS) written in C. A version control system allows you to create a history for a collection of files and includes functionality to revert the collection of files to another state. For example, you may change the collection of files to a state from 2 days ago, or you may switch between states for experimental features and production issues.
The collection of files is usually called "source code". In a distributed version control system everyone has a complete copy of the source code (including the complete history of the source code) and can perform version control operations against this local copy. The usage of a DVCS does not require a central code repository.
If you make changes to the source code, you mark them as relevant for version control (add them to the index / staging area) and then add them to the repository (commit). Git maintains all versions; therefore you can revert to any point in your source code history via Git.
Git performs commits to your local repository and you can synchronize your repository with other (remote) repositories. Git allows you to clone repositories, i.e. create an exact copy of a repository including the complete history of the source code. Owners of repositories can synchronize changes via push (transfer changes to a remote repository) or via pull (getting changes from a remote repository).
Git supports branching, i.e. you can have different versions of your source code. If you want to develop a new feature you may open a branch in your source code and make the changes in this branch without affecting the main line of your code.
Git can be used from the command line; this approach is described in this tutorial. There are also graphical tools, for example EGit for the Eclipse IDE, but these tools are not described in this tutorial.

1.2. Important terminology

Table 1. Git Terminology
Term: Definition
Repository: A repository contains the history, the different versions over time and all the different branches and tags. In Git each copy of the repository is a complete repository. The repository allows you to retrieve revisions into your working copy.
Branches: A branch is a separate code line with its own history. You can create a new branch from an existing one and change the code independently from other branches. One of the branches is the default (normally named master). The user selects a branch and works in it; the files of the selected branch form the "working copy". Selecting a branch is called "checking out a branch".
Tags: A tag points to a specific commit, i.e. a certain point in the history. With a tag you have a named point to which you can always return, e.g. the code as of 25.01.2009 in the branch "testing".
Commit: You commit your changes into a repository. This creates a new revision which can be retrieved later, for example if you want to see the source code of an older version. Each commit contains the author and the committer, which makes it possible to identify the source of the change. The author and the committer might be different persons.
URL: A URL in Git determines the location of the repository.
Revision: Represents a version of the source code. Git identifies revisions with SHA-1 ids. SHA-1 ids are 160 bits long and are represented as 40 hexadecimal characters. The latest version can be addressed via "HEAD", the version before that via "HEAD~1" and so on.
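The "HEAD" and "HEAD~1" notation can be tried out in a throwaway repository. The following is a minimal sketch; the temporary directory, the file name and the user settings are assumptions made only for this illustration.

```shell
# Throwaway repository in a temporary directory (all names are illustrative)
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Create two revisions of the same file
echo "first" > file.txt
git add file.txt
git commit -q -m "First revision"
echo "second" > file.txt
git commit -q -a -m "Second revision"

# Resolve the symbolic names to full revision ids
head_id="$(git rev-parse HEAD)"
parent_id="$(git rev-parse HEAD~1)"
echo "HEAD   = $head_id"
echo "HEAD~1 = $parent_id"

# Show the content of the file as of the previous revision
old_content="$(git show HEAD~1:file.txt)"
echo "$old_content"
```

The two ids differ, and the last command prints the content of the first revision without changing the working copy.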

1.3. Staging index

Git requires that changes are marked explicitly for the next commit. For example, if you change a file and want this change to be part of the next commit, you have to add the file to the so-called "staging index" via the command "git add file". The staging index holds a complete snapshot of the content for the next commit.
New files must always be added to the index explicitly. For files which have already been committed once, you can use the "-a" flag during a commit to stage their modifications automatically.
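The difference between staged changes, the "-a" flag and new files can be sketched as follows. The repository location and the file names are assumptions chosen for this example only.

```shell
# Throwaway repository (all names are illustrative)
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "one" > tracked.txt
git add tracked.txt
git commit -q -m "Initial commit"

# Change the tracked file and create a new, untracked file
echo "two" >> tracked.txt
echo "new" > untracked.txt

# Stage only the tracked file and list what is staged for the next commit
git add tracked.txt
staged="$(git diff --cached --name-only)"
echo "staged: $staged"

# "commit -a" picks up modified tracked files, but not new files
git commit -q -a -m "Commit tracked changes"
still_untracked="$(git ls-files --others --exclude-standard)"
echo "still untracked: $still_untracked"
```

After the commit the new file is still reported as untracked, because it was never added to the index.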

2. Installation

On Ubuntu you can install the Git command line tool via the following:
sudo apt-get install git-core
  
For other Linux distributions please check your vendor's documentation.
A Windows version of Git can be found on the msysgit project site. The URL of this webpage is http://code.google.com/p/msysgit/.

3. Setup

Git allows storing global settings in a ".gitconfig" file. This file is located in the user home directory. As mentioned before Git stores the committer and author in each commit. This and additional information can be stored in the global settings.
The following will configure Git so that a certain user and email address is used, enable color coding and tell Git to ignore certain files.

3.1. User Configuration

Configure your user and your email for Git via the following command.
# Config the user which will be used by git
# Of course you should use your name
git config --global user.name "Example Surname"
# Same for the email address
git config --global user.email "your.email@gmail.com"
# Set the default so that "git push" pushes all local branches which have
# a branch with the same name in the remote repository
git config --global push.default "matching"

   
To query your Git settings execute the following command:
git config --list

   

3.2. Color Highlighting

The following will enable some highlighting for the console.
git config --global color.status auto
git config --global color.branch auto

   

3.3. Ignore certain files

Git can be configured to ignore certain files and directories. This is configured via the ".gitignore" file. This file can be in any directory and can contain patterns for files. For example, you can tell Git to ignore the directory "bin" via the following ".gitignore" in the main directory.
bin
   
Git also has the global setting "core.excludesfile" to specify global excludes.
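The effect of a ".gitignore" file can be checked by listing the untracked files which Git would still consider. This is a small sketch; the directory layout and the file names are made up for the illustration.

```shell
# Throwaway repository (all names are illustrative)
repo="$(mktemp -d)"
cd "$repo"
git init -q

# A build output directory and a source directory
mkdir bin src
echo "binary" > bin/app.class
echo "source" > src/App.java

# Ignore everything named "bin"
echo "bin" > .gitignore

# List the untracked files which are not ignored
visible="$(git ls-files --others --exclude-standard)"
echo "$visible"

# A global excludes file could be configured like this (not executed here):
# git config --global core.excludesfile ~/.gitignore_global
```

The listing contains ".gitignore" and "src/App.java" but nothing from the "bin" directory.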

4. Getting started with Git

The following will guide you through a typical Git workflow. You will create a few files, create a local Git repository and commit your files into this repository. Afterwards you clone the repository and push and pull some changes between the repositories. The comments (indicated with #) before the commands explain the actions.
Open a command line / shell for the operations.

4.1. Create content

The following creates some files with some content which will later be placed under version control.
#Switch to home
cd ~/
# create a directory
mkdir ~/repo01.git
# switch into it
cd repo01.git
# create a new directory
mkdir datafiles
# Create a few files
touch test01
touch test02
touch test03
touch datafiles/data.txt
# put a little text into the first file
ls >test01

   

4.2. Create repository, add and commit

Every Git repository is stored in the .git folder of the directory in which the Git repository was created. This directory contains the complete history of the repository. The .git/config file contains the local configuration for the repository.
The following will create a Git repository, add the files to the index of the repository and commit your changes.
# Initialize the local Git repository
git init
# Add all (files and directories) to the Git repository
git add .
# Make a commit of your file to the local repository
git commit -m "Initial commit"
# Show the log file
git log


   

4.3. See differences via diff and commit changes

The command "git diff" allows the user to see the changes made. To test this make some changes to a file and check what the command "git diff" shows to you. Afterwards commit the changes to the repository.
# Make some changes to the file
echo "This is a change" > test01
echo "and this is another change" > test02

# Check the changes via the diff command 
git diff

# commit the changes, -a will commit changes for modified files
# but will not add automatically new files
git commit -a -m "These are new changes"


   

4.4. Status, Diff and Commit Log

The following helps you to see the current status and the list of commits in your repository.
# Make some changes in the file
echo "This is a new change" > test01
echo "and this is another new change" > test02


# See the current status of your repository 
# (which files are changed / new / deleted)
git status
# Show the differences between the uncommitted files 
# and the last commit in the current branch
git diff

# Add the changes to the index and commit
git add . && git commit -m "More chaanges - typo in the commit message"

# Show the history of commits in the current branch
git log
# This starts a nice graphical view of the changes
gitk --all

   

4.5. Correction of commit messages - git commit --amend

The "--amend" parameter of the git commit command allows you to change the last commit message.
In the above example the commit message was incorrect as it contained a typo. The following corrects this via the --amend parameter.
git commit --amend -m "More changes - now correct"
   

4.6. Delete files

If you delete a file which is under version control, "git add ." does not stage the deletion in Git versions before 2.0; since Git 2.0 it does. In older versions you need to use the git commit command with the -a flag or the -A flag of the git add command.
# Create a file and put it under version control
touch nonsense.txt
git add . && git commit -m "a new file has been created"
# Remove the file
rm nonsense.txt
# Try the standard way of committing -> with Git before 2.0 the deletion is not staged
git add . && git commit -m "File nonsense.txt is now removed"
# Now commit with the -a flag
git commit -a -m "File nonsense.txt is now removed"
# Alternatively you could add deleted files to the staging index via
git add -A . 
git commit -m "File nonsense.txt is now removed"

   

5. Working with remote repositories

5.1. Setting up a remote (bare) Git repository

We will now create a remote Git repository. Git allows storing this remote repository either on the network or locally.
A standard Git repository is different from a remote Git repository. A standard Git repository contains the source code and the Git repository. You can work directly in this directory as the repository contains a working copy of all files.
Remote repositories do not contain working copies of the files; they only contain repository files. To create such a repository, set the "--bare" flag.
In order to simplify the following examples the Git repository will be created locally in the filesystem.
# Switch to the first repository
cd ~/repo01.git
# Create a bare clone of the repository in the parent directory
git clone --bare . ../remote-repository.git

# Check the content, it is similar to the .git directory in repo01.git
ls ~/remote-repository.git

   

5.2. Push changes to another repository

Make some changes and push them from your first repository to the remote repository via the following commands.
# Switch to the first repository
cd ~/repo01.git

# Make some changes in the file
echo "Hello, hello. Turn your radio on" > test01
echo "Bye, bye. Turn your radio off" > test02

# commit the changes, -a will commit changes for modified files
# but will not add automatically new files
git commit -a -m "Some changes"

# push the changes
git push ../remote-repository.git

   

5.3. Add remote

You can always push to a git repository via the full URL to it. But you can also add a "shortname" to a repository via the "git remote add" command. "origin" is a special name which is usually automatically used if you clone a git repository. Origin indicates the original repository from which you started. As we started from scratch this name is still available.
# Add ../remote-repository.git with the name origin
git remote add origin ../remote-repository.git 

# Again some changes
echo "I added a remote repo" > test02
# commit
git commit -a -m "This is a test for the new remote origin"
# If you do not label a repository it will push to origin
git push origin


   

5.4. Show the existing remote repositories

To see the existing definitions of the remote repositories use the following command.
# show the existing defined remote repositories
git remote
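To see not only the names but also the locations of the remotes, the "-v" option can be used. The following sketch sets up a bare repository and a working repository in a temporary directory; the directory names are assumptions for this example.

```shell
# Bare "remote" repository and a working repository in a temporary directory
base="$(mktemp -d)"
git init -q --bare "$base/remote-repository.git"
git init -q "$base/work"
cd "$base/work"

# Register the bare repository under the short name "origin"
git remote add origin ../remote-repository.git

# List the remote names together with their URLs
git remote -v

# Read the URL of a single remote from the repository configuration
origin_url="$(git config --get remote.origin.url)"
echo "$origin_url"
```

The URL is stored exactly as it was passed to "git remote add", here as a relative path.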
   

5.5. Clone your repository

Create a new repository in a new directory via the following commands.
# Switch to home
cd ~
# make new directory
mkdir repo02.git 

# Switch to new directory
cd ~/repo02.git
# Clone the remote repository into the current directory
git clone ../remote-repository.git .

   

5.6. Pull changes

Pull allows you to get the latest changes from another repository. In your second repository, make some changes, push them to your remote repository and pull these changes to your first repository.
# Switch to home
cd ~

# Switch to second directory
cd ~/repo02.git
# Make changes
echo "A change" > test01
# Commit
git commit -a -m "A change"
# Push changes to remote repository
# origin is automatically maintained as we cloned from this repository
git push origin
# Switch to the first repository and pull in the changes
cd ~/repo01.git
git pull ../remote-repository.git/
# Check the changes
less test01

   

6. Revert Changes

If you create files in your working copy which you don't want to commit you can discard them.
# Create a new file with content
touch test04
echo "this is trash" > test04

# Make a dry-run to see what would happen
# -n is the same as --dry-run 
git clean -n

# Now delete
git clean -f


  
You can check out older revisions of your source code via the commit id. The commit id is shown if you enter the command "git log". It is displayed after the word "commit".
# Switch to the first repository
cd ~/repo01.git 
# get the log
git log

# Copy one of the older commits and checkout the older revision via 
git checkout commit_name


  
If you have not added the changes to the staging index you can also directly revert the changes.
#Some nonsense change
echo "nonsense change" > test01
# Not added to the staging index therefore we can 
# just checkout the old version
git checkout test01
#check the result
cat test01
# Another nonsense change
echo "another nonsense change" > test01
# We add the file to the staging index
git add test01
# remove the change from the staging index (unstage the file)
git reset HEAD test01
# restore the working copy from the staging index
git checkout test01
  
You can also revert commits via the following:
#Revert a commit
git revert commit_name

  
If you deleted a file but have not yet added the deletion to the index or committed the change, you can check out the file again.
# Delete a file
rm test01
# Revert the deletion
git checkout test01

  
If you added a file to the index but don't want to commit the file, you can remove it from the index via "git reset file".
# create a file
touch incorrect.txt
# accidentally add it to the index
git add .
# remove it from the index
git reset incorrect.txt
# delete the file
rm incorrect.txt

  
If you deleted a directory and you have not yet committed the changes you can restore the directory via the following command.
git checkout HEAD -- your_dir_to_restore
  

7. Tagging in Git

Git has the option to tag certain versions in the history so that you find them later on more easily. Most commonly this is used to tag a certain version which has been released.
You can list the available tags via:
git tag
  
You can create a new tag via the following. The -a parameter creates an annotated tag with the given name, and via the -m parameter you specify the description of this tag.
git tag -a version1.6 -m 'version 1.6'

  
If you want to use the code associated with the tag, use:
git checkout <tag_name>
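The tagging commands can be sketched end to end in a throwaway repository. The repository location, file name and commit messages are assumptions for this illustration; the tag name is the one used above.

```shell
# Throwaway repository (all names are illustrative)
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "release content" > file.txt
git add file.txt
git commit -q -m "Release commit"

# Create an annotated tag for the current commit
git tag -a version1.6 -m "version 1.6"

# Continue working after the release
echo "later change" > file.txt
git commit -q -a -m "Work after the release"

# List the tags; an annotated tag is stored as an object of type "tag"
tags="$(git tag)"
tag_type="$(git cat-file -t version1.6)"
echo "$tags ($tag_type)"

# Check out the tagged version and look at the file
git checkout -q version1.6
checked_out="$(cat file.txt)"
echo "$checked_out"
```

Checking out a tag leaves the repository on no branch (a "detached HEAD"); switch back to a branch before committing new work.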
  

8. Branches and Merging

8.1. Branches

Git allows creating branches, i.e. independent copies of the source code which can be changed independently from each other. The default branch is called "master". Git creates branches quickly and cheaply in terms of resource consumption. Developers are encouraged to use branches frequently.
The following command lists all locally available branches. The current active branch is marked with "*".
git branch 
   
If you want to see all branches (including remote branches) use the following command.
git branch -a
   
You can create a new branch via the following.
# Syntax: git branch <name> <hash>
# <hash> is optional; if it is not specified the last commit will be used
git branch testing
# Switch to your new branch
git checkout testing
# Some changes
echo "Cool new feature in this branch" > test01
git commit -a -m "new feature"
#Switch to the master branch
git checkout master
# Check that the content of test01 is the old one
cat test01


   

8.2. Merging

Merge allows combining the changes of two branches. Merge performs a so-called three-way merge between the latest snapshots of the two branches, based on their most recent common ancestor.
As a result you have a new snapshot. You can merge changes from one branch to the current active one via the following command.
# Syntax: git merge <branch-name>
git merge testing

   
If a merge conflict occurs, Git will mark the conflict in the file and the programmer has to fix the conflict manually. After resolving it, they can add the file to the staging index and commit the change.
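A merge without conflicts can be sketched as follows: both branches change different files after their common ancestor, so Git can combine them automatically. The repository location and file names are assumptions for this example; the name of the main branch is read from Git instead of being hard-coded.

```shell
# Throwaway repository (all names are illustrative)
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Common ancestor of both branches
echo "base" > a.txt
echo "base" > b.txt
git add .
git commit -q -m "Base commit"
main_branch="$(git rev-parse --abbrev-ref HEAD)"

# Change a.txt in the branch "testing"
git checkout -q -b testing
echo "changed in testing" > a.txt
git commit -q -a -m "Change a.txt in testing"

# Change b.txt in the main line
git checkout -q "$main_branch"
echo "changed in main line" > b.txt
git commit -q -a -m "Change b.txt in the main line"

# Three-way merge of "testing" into the main line
git merge -q -m "Merge branch testing" testing

# The merge commit has two parents and contains both changes
parents="$(git log -1 --format=%P)"
cat a.txt b.txt
```

Because the branches diverged, Git creates a merge commit with two parents instead of simply moving the branch pointer forward.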

8.3. Delete a branch

To delete a branch which is not needed anymore you can use the following command.
#Delete branch testing
git branch -d testing
# Check if branch has been deleted
git branch

   

9. Solving merge conflicts

A merge conflict occurs if two people have modified the same content and Git cannot automatically determine how both changes should be applied.
Git requires that merge conflicts are solved manually. In this section we will first create a merge conflict and then resolve it and apply the change to the Git repository.
The following will create a merge conflict.
# Switch to the first directory
cd ~/repo01.git
# Make changes
touch mergeconflict.txt
echo "Change in the first repository" > mergeconflict.txt
# Stage and commit
git add . && git commit -a -m "Will create merge conflict 1"

# Switch to the second directory
cd ~/repo02.git
# Make changes
touch mergeconflict.txt
echo "Change in the second repository" > mergeconflict.txt
# Stage and commit
git add . && git commit -a -m "Will create merge conflict 2"
# Push to the master repository
git push

# Now try to push from the first directory
# Switch to the first directory
cd ~/repo01.git
# Try to push --> you will get an error message
git push
# Get the changes
git pull origin master





  
Git marks the conflict in the affected file. This file looks like the following.
<<<<<<< HEAD
Change in the first repository
=======
Change in the second repository
>>>>>>> b29196692f5ebfd10d8a9ca1911c8b08127c85f8

  
The upper part is the version from your repository and the lower part is the one from the remote repository. You can now edit the file manually and then commit the changes. Alternatively you can use the command "git mergetool", which starts a configurable merge tool that displays the changes in a split screen.
# Either edit the file manually or use 
git mergetool
# You will be prompted to select which merge tool you want to use
# on Ubuntu I use meld
# After manually merging the changes, stage the file and commit
git add mergeconflict.txt
git commit -m "merged changes"
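The same kind of conflict can also be reproduced inside a single repository by using two branches instead of two clones. This is a simplified sketch under that assumption; the branch name "other" and the file contents are invented for the illustration.

```shell
# Throwaway repository (all names are illustrative)
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "original" > mergeconflict.txt
git add mergeconflict.txt
git commit -q -m "Base commit"
main_branch="$(git rev-parse --abbrev-ref HEAD)"

# Change the same line in two branches
git checkout -q -b other
echo "Change in the other branch" > mergeconflict.txt
git commit -q -a -m "Change in the other branch"
git checkout -q "$main_branch"
echo "Change in the main line" > mergeconflict.txt
git commit -q -a -m "Change in the main line"

# The merge stops and leaves conflict markers in the file
git merge other || echo "Merge stopped because of a conflict"

# List the files which still have unresolved conflicts
conflicted="$(git diff --name-only --diff-filter=U)"
echo "conflicted: $conflicted"

# Resolve the conflict by writing the desired content, then stage and commit
echo "Combined change" > mergeconflict.txt
git add mergeconflict.txt
git commit -q -m "Merged changes"
```

Staging the file with "git add" is what tells Git that the conflict is resolved; the final commit then completes the merge.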
  

10. Rebase

10.1. Rebase commits in the same branch

The interactive mode of the Git command "rebase" allows you to combine several commits into one commit. This is useful as it allows you to rewrite some of the commit history (cleaning it up) before pushing your changes to a remote repository.
The following will create several commits which should be later combined.
# Create a new file
touch rebase.txt

# Add it to git
git add . && git commit -m "rebase.txt added to index"

# Do some silly changes and commit
echo "content" >> rebase.txt
git add . && git commit -m "added content"
echo " more content" >> rebase.txt
git add . && git commit -m "added more content"
echo " more content" >> rebase.txt
git add . && git commit -m "added more content"
echo " more content" >> rebase.txt
git add . && git commit -m "added more content"
echo " more content" >> rebase.txt
git add . && git commit -m "added more content"
echo " more content" >> rebase.txt
git add . && git commit -m "added more content"

# check the git log message
git log

   
We will combine the last seven commits. You can do this interactively via the following command.
git rebase -i HEAD~7

   
This will open your editor of choice and let you edit the commit message or squash / fixup a commit into the previous one.
Squash will combine the commit messages while fixup will disregard the commit message.

10.2. Rebase branches

You can also use rebase to move the changes of one branch on top of another branch. As described, the command "merge" combines the changes of two branches. Rebase takes the changes of a branch, creates patches from them and applies them on top of another branch.
The final result for the source code is the same as with merge but the commit history is cleaner; the history appears to be linear.
# Create new branch 
git branch testing
# Checkout the branch
git checkout testing
# Do some changes
echo "This will be rebased to master" > test01
# Commit into testing branch
git commit -a -m "New feature in branch"
# Rebase the testing branch onto master
git rebase master

   

10.3. Best practice for rebase

You should always check your local branch history before pushing changes to a "central" Git repository or review system.
Git allows you to make local commits, and this feature is frequently used to create points to which you can go back if something goes wrong later during the development of a feature. If you do this, you should look at your local branch history before pushing and check whether these commits are relevant for others.
If they all belong to the implementation of the same feature then, most likely, you want to combine them into one single commit before pushing.
The interactive rebase basically rewrites the history. It is safe to do this as long as the commits have not been pushed to another repository; commits should only be rewritten before they are pushed.
If you rewrite and push a commit which is already present in other Git repositories, it will look as if you implemented something that somebody had already implemented in the past.
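One way to check which commits are still safe to rewrite is to list the commits that exist locally but not yet in the remote branch. The following is a sketch with a local bare repository standing in for the central repository; all directory and file names are assumptions.

```shell
# Bare "central" repository and a working repository in a temporary directory
base="$(mktemp -d)"
git init -q --bare "$base/remote.git"
git init -q "$base/work"
cd "$base/work"
git config user.name "Example User"
git config user.email "user@example.com"
git remote add origin ../remote.git

# One commit which is pushed
echo "v1" > file.txt
git add file.txt
git commit -q -m "Pushed commit"
branch="$(git rev-parse --abbrev-ref HEAD)"
git push -q origin "$branch"

# Two commits which exist only locally
echo "v2" >> file.txt
git commit -q -a -m "Local commit 1"
echo "v3" >> file.txt
git commit -q -a -m "Local commit 2"

# Commits reachable from HEAD but not from the remote branch
git log --oneline "origin/$branch..HEAD"
unpushed="$(git rev-list --count "origin/$branch..HEAD")"
echo "unpushed commits: $unpushed"
```

Only the commits shown by this listing should be squashed or reordered with an interactive rebase.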

11. Create and apply patches

A patch is a text file which contains changes to the source code. This file can be sent to someone else, and this person can use it to apply the changes to their local repository.
The following creates a branch, makes some changes in this branch, creates a patch and applies the patch to the master.
# Create a new branch
git branch mybranch
# Use this new branch
git checkout mybranch
# Do some changes
touch test05
# Change some content in an existing file
echo "New content for test01" >test01
# Commit this to the branch
git add .
git commit -a -m "First commit in the branch"

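# Create a patch for each commit which is not contained in origin/master
# (use "git format-patch master" to compare against the local master instead)
git format-patch origin/master
# This creates the patch 0001-First-commit-in-the-branch.patch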

# Switch to the master
git checkout master

# Apply the patch
git apply 0001-First-commit-in-the-branch.patch
# Do your normal commit in the master 
git add .
git commit -a -m "Applied patch"

# delete the patch 
rm 0001-First-commit-in-the-branch.patch
  

12. Define alias

An alias in Git allows you to set up your own Git command. For example you can define an alias which is a short form of one of your favorite commands, or you can combine several commands with an alias.
For example the following defines the "git add-commit" command which combines "git add . -A" and "git commit". After defining this command you can use it via: git add-commit -m "message"
git config --global alias.add-commit '!git add . -A && git commit'

  
Unfortunately, defining an alias does not work on Windows with msysGit at the time of this writing.
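An alias can also be defined for a single repository by leaving out "--global", which is convenient for trying it out. The following sketch does this in a throwaway repository; the alias "st" and the file name are additional assumptions made for the example.

```shell
# Throwaway repository (all names are illustrative)
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Aliases stored in the repository configuration only (no --global)
git config alias.st "status --short"
git config alias.add-commit '!git add -A . && git commit'

# Use the combined alias: stage everything and commit in one step
echo "hello" > note.txt
git add-commit -q -m "Added note.txt via alias"

# The short status alias prints nothing when the working copy is clean
git st
subject="$(git log -1 --format=%s)"
echo "$subject"
```

Arguments given after the alias name are appended to the last command of the alias, which is why "-m" ends up at "git commit".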

13. Untrack a file / directory

Sometimes you do not want certain files or directories to be included in your Git repository. Adding them to your ".gitignore" file only prevents untracked files from being added; files which are already tracked stay tracked, and their last version will still be in Git. To untrack a file or directory in Git you can use the following.
# remove directory .metadata from git repo
git rm -r --cached .metadata
# remove file test.txt from repo
git rm --cached test.txt
  
This will not remove the file from the commit history. If the file should also be removed from the history have a look at "git filter-branch" which allows rewriting the commit history.
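The combination of "git rm --cached" and ".gitignore" can be sketched as follows. The repository location and the second file name are assumptions for this illustration.

```shell
# Throwaway repository (all names are illustrative)
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "local settings" > test.txt
echo "code" > main.txt
git add .
git commit -q -m "Initial commit"

# Stop tracking test.txt but keep the file in the working copy
git rm -q --cached test.txt

# Ignore the file so that it is not added again by accident
echo "test.txt" > .gitignore
git add .gitignore
git commit -q -m "Stop tracking test.txt"

# The file still exists on disk but is no longer tracked
tracked="$(git ls-files)"
echo "$tracked"
ls test.txt
```

After the commit the working copy is clean, because the file is now ignored instead of being reported as untracked.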

14. Other useful commands

The following lists a few Git commands which are useful in the daily work with Git.
Table 2. Useful Git Commands
Command: Description
git blame filename: Who created / modified the file
git checkout -b mybranch master~1: Creates a new branch based on the master branch without the last commit

15. Installing a Git server

As described, you don't need a server; you can just use a file system or a public Git provider such as GitHub. But sometimes it is convenient to have your own server, and installing it under Ubuntu is relatively easy.
First make sure you have ssh installed.
sudo apt-get install ssh
  
If you have not yet installed Git on your server then you need to do this too.
sudo apt-get install git-core
   
Create a new user for git.
sudo adduser git

  
Now logon with your Git user and create a bare repository.
# login to server
# to test use localhost
ssh git@IP_ADDRESS_OF_SERVER

# Create repository
mkdir example.git
cd example.git
git --bare init

  
Now you can create a local repository and push its commits to the remote repository.
mkdir gitexample
cd gitexample
git init
touch README
git add README
git commit -m 'first commit'
git remote add origin git@IP_ADDRESS_OF_SERVER:example.git
git push origin master

  

16. Online remote repositories

16.1. Cloning remote repositories

Git also allows remote operations. Git supports several transport types; the native protocol for Git is also called "git".
The following will clone an existing repository via SSH.
git clone git@github.com:vogella/gitbook.git
   
Alternatively you could clone the same repository via the https protocol.
# The following will ask for your Github password, 
# if you don't have a password just ignore this example
git clone https://vogella@github.com/vogella/gitbook.git
   

16.2. Add more remote repositories

If you clone a remote repository the original repository will automatically be stored as "origin". You can push changes to this origin repository via "git push origin". Of course pushing to a remote repository requires write access to this repository.
You can add more remote repositories to your repository via "git remote add name gitrepo". For example, if you cloned the repository from above via SSH you could add the https URL via:
# Add the https protocol 
git remote add githttp https://vogella@github.com/vogella/gitbook.git

   

16.3. Remote operations via http and a proxy

It is possible to use the HTTP protocol to clone Git repositories. This is especially helpful if your firewall blocks everything except http.
Git also provides support for http access via a proxy server. The following commands clone, for example, a repository via http and a proxy. You can either set the proxy variable in general for all applications or set it only for Git.
This example uses environment variables.
# Linux
export http_proxy=http://proxy:8080
# On Windows
# set http_proxy=http://proxy:8080 
git clone http://dev.eclipse.org/git/org.eclipse.jface/org.eclipse.jface.snippets.git
# push back to the origin using http
git push origin
   
This example uses the Git config settings.
# Set proxy for git globally
git config --global http.proxy http://proxy:8080
# To check the proxy settings
git config --get http.proxy
# Just in case you need to you can also revoke the proxy settings
git config --global --unset http.proxy


   

17. Git Hosting Provider

Instead of setting up your own server you can also use a hosting service. The most popular Git hosting sites are GitHub and Bitbucket. Both offer free hosting with certain limitations.

17.1. GitHub

GitHub can be found under the URL https://github.com/. GitHub is free for all public repositories, i.e. if you want to have private repositories which are only visible to people you select, you have to pay GitHub a monthly fee.
GitHub requires you to create an SSH key. A description for creating an SSH key in Ubuntu can be found on the "ssh key creation in Ubuntu" webpage. For Windows please see "msysgit ssh key generation".
Create an account at GitHub and create a repository. After creating a repository at GitHub you will get a description of all the commands you need to upload your project to GitHub. Follow these instructions.
These instructions will be similar to the following:
Global setup:
 Set up git
  git config --global user.name "Your Name"
  git config --global user.email your.email@gmail.com
      
Next steps:
  mkdir gitbook
  cd gitbook
  git init
  touch README
  git add README
  git commit -m 'first commit'
  git remote add origin git@github.com:vogella/gitbook.git
  git push -u origin master
      
Existing Git Repo?
  cd existing_git_repo
  git remote add origin git@github.com:vogella/gitbook.git
  git push -u origin master
  
   

17.2. Bitbucket

Bitbucket can be found under the URL https://bitbucket.org/. Bitbucket allows unlimited public and private repositories, while the number of participants for one private repository is currently limited to 5 collaborators. I.e. if you have more than 5 developers who need access to a private repository, you have to pay Bitbucket.

18. Graphical UIs for Git

This tutorial focused on the usage of the command line for Git. After finishing this tutorial you may want to look at graphical tools for working with Git.
Git provides two graphical tools. "gitk" shows the history and "git gui" shows an editor which allows you to perform Git operations.
The Eclipse EGit project provides Git integration into Eclipse which is included in the latest Eclipse release.

