6. The tree
tree [content size]\0
100644 blob a906cb README
100755 blob 6f4e32 run
040000 tree 1f7a4e src
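As a rough illustration of the layout above, the following Python sketch serializes a tree object and computes its SHA1. The file contents, the `main.c` entry and the helper names `object_id` and `tree_body` are hypothetical, since the slide only shows abbreviated ids. It assumes a SHA-1 repository and entries already sorted by name.

```python
import hashlib

def object_id(kind: str, body: bytes) -> str:
    """SHA-1 of the header '<kind> <size>' + NUL byte + body."""
    header = f"{kind} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()

def tree_body(entries):
    """Serialize (mode, name, hex_id) entries into the binary body of a tree."""
    out = b""
    for mode, name, hex_id in entries:
        # Each entry is '<mode> <name>', a NUL byte, then the 20 raw id bytes.
        out += f"{mode} {name}".encode() + b"\0" + bytes.fromhex(hex_id)
    return out

# Hypothetical contents: the real files behind a906cb, 6f4e32 and 1f7a4e are unknown.
readme_id = object_id("blob", b"hello world\n")
run_id = object_id("blob", b"#!/bin/sh\necho run\n")
main_id = object_id("blob", b"int main(void) { return 0; }\n")
src_id = object_id("tree", tree_body([("100644", "main.c", main_id)]))

# In the raw bytes a directory mode is written '40000';
# tools such as 'git ls-tree' display it as '040000'.
body = tree_body([
    ("100644", "README", readme_id),
    ("100755", "run", run_id),
    ("40000", "src", src_id),
])
tree_id = object_id("tree", body)
print(tree_id)
```

Changing a single byte in any file changes its blob id, which changes the tree body, which changes the tree id.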
7. The commit
$ git commit file
[master dbaf944] This is a commit message.
$ cat .git/objects/db/af944a4a9eb72af64042b1e3a128936000dfc2 | zlib_inflate -d
commit 318
tree 47ec7a250164a21cb14eb64618c3a903db0b7420
parent 402b26df0644f09fc62842c0a4a44a0a3345c530
author Manu <m.cupcic@criteo.com> 1380977766 +0200
committer Manu <m.cupcic@criteo.com> 1380977766 +0200
This is a commit message.
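The same inspection can be sketched in Python with the standard `zlib` module. The function name `read_loose_object` is hypothetical, and the sketch only handles loose objects stored under `.git/objects`; objects that Git has moved into pack files are not covered.

```python
import zlib
from pathlib import Path

def read_loose_object(git_dir, hex_id):
    """Return (kind, size, body) for a loose object stored under git_dir/objects."""
    # The first two hex digits name the directory, the remaining 38 the file.
    path = Path(git_dir) / "objects" / hex_id[:2] / hex_id[2:]
    raw = zlib.decompress(path.read_bytes())
    # The header '<kind> <size>' is separated from the body by a NUL byte.
    header, _, body = raw.partition(b"\0")
    kind, size = header.decode().split(" ")
    if int(size) != len(body):
        raise ValueError("size in header does not match body length")
    return kind, int(size), body
```

Called as `read_loose_object(".git", "<full 40-digit id>")` on a commit, the returned body should contain the `tree`, `parent`, `author` and `committer` lines followed by the message.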
8. The commit
• Is identified by:
    • a snapshot of the repo state (the tree)
    • parent commit(s)
    • a commit message
• Is immutable
• Has a deterministic hash (SHA1)
• Commits form a linked list (a graph, once merges add several parents): the history
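A minimal Python sketch of why the hash is deterministic: the id is the SHA1 of the commit's own text. The function `commit_id` is a hypothetical helper; the tree, parent and author values are copied from the previous slide. It ignores optional headers such as signatures and encoding, so it is not claimed to reproduce the `dbaf944` id shown there.

```python
import hashlib

def commit_id(tree, parents, author, committer, message):
    """SHA-1 of a commit object built from the given fields."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    # A blank line separates the headers from the message.
    lines += [f"author {author}", f"committer {committer}", "", message]
    body = ("\n".join(lines) + "\n").encode()
    header = f"commit {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()

tree = "47ec7a250164a21cb14eb64618c3a903db0b7420"
parent = "402b26df0644f09fc62842c0a4a44a0a3345c530"
who = "Manu <m.cupcic@criteo.com> 1380977766 +0200"

a = commit_id(tree, [parent], who, who, "This is a commit message.")
b = commit_id(tree, [parent], who, who, "This is a commit message.")
c = commit_id(tree, [parent], who, who, "This is another message.")
d = commit_id(tree, [], who, who, "This is a commit message.")

print(a == b)  # same fields, same id
print(a != c)  # different message, different id
print(a != d)  # different parents, different id
```

Because the parent id is part of the hashed text, rewriting one commit changes the id of every commit that descends from it.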
11. Take home message
• Git stores a snapshot of the whole repo at each commit.
• The SHA1 of a commit depends only on its content (the tree), message, author, committer, timestamps and parent(s).
• A Git branch/tag is a 40-digit hex number stored in a file.
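To make the last point concrete, here is a small Python sketch that resolves a ref by reading plain files. The function name `resolve_ref` is hypothetical. It assumes refs stored as individual files; refs that Git has moved into the `packed-refs` file are not handled, and an annotated tag resolves to the id of a tag object rather than a commit.

```python
from pathlib import Path

def resolve_ref(git_dir, name="HEAD"):
    """Follow 'ref: ...' indirections until a 40-digit hex id is found."""
    content = (Path(git_dir) / name).read_text().strip()
    # HEAD usually holds 'ref: refs/heads/<branch>' rather than an id.
    while content.startswith("ref: "):
        target = content[len("ref: "):]
        content = (Path(git_dir) / target).read_text().strip()
    return content
```

For example, `resolve_ref(".git")` gives the commit that `HEAD` points to, and `resolve_ref(".git", "refs/heads/master")` gives the tip of `master`.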
12. Things we can play with
git reflog
git fsck
git pack
git config
git rebase -i
git reset
git refspecs
git stash
git add -p
git log (advanced stuff)
git pull --rebase
Editor's notes
Internal model of the commit object. DEMO: git cat-file -p <commit sha1>
The most important Git object is the COMMIT. The most important thing about the commit is that it is IMMUTABLE. So why is it important?

A commit is primarily defined by 3 things: a snapshot of the working directory (the "disk" state), a commit message and, most importantly, a parent commit. Every commit has a pointer to its parent. This is what defines the history of commits: a linked list of commits.

So, if you change:
- a single file -> different commit
- the commit message -> different commit
- the commit parent or parents -> different commit

A commit is uniquely identified by its SHA1. A SHA1 is deterministic: a snapshot with the exact same content will have the same SHA1. A commit referring to the same snapshot, the same parent commit and the same commit message (with the same author, committer and timestamps) will be identified by the same SHA1.

So why is this important? It matters because of the initial Git design choices. Git is primarily a content-addressable filesystem that stores every version of every object as a distinct object, accessible forever. Any file, any snapshot, any commit that has been archived in Git can be retrieved forever by its SHA1.

These files are stored in full, every time. This is a major difference from other versioning systems such as Subversion and Perforce, which store diffs of files. Now, I'm making simplifications, but this is true to a first approximation.

Why do we care? This means that every snapshot that has been committed once can always be retrieved. Keep this in mind as it will be important later.
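The idea of a content-addressable store can be sketched in a few lines of Python. This is a toy in-memory model, not how Git lays objects out on disk: the `store` dictionary and the `put` helper are hypothetical, and only the id computation (SHA1 over a `blob <size>` header, a NUL byte and the content) follows Git's scheme for SHA-1 repositories.

```python
import hashlib

def blob_id(content: bytes) -> str:
    """Id Git assigns to a file's content: SHA-1 of 'blob <size>' + NUL + content."""
    header = f"blob {len(content)}\0".encode()
    return hashlib.sha1(header + content).hexdigest()

store = {}  # toy object database: id -> full content

def put(content: bytes) -> str:
    """Store the content under its own hash and return that hash."""
    oid = blob_id(content)
    store[oid] = content
    return oid

first = put(b"hello world\n")
second = put(b"hello world\n")   # same content, same key: stored once
third = put(b"hello world!\n")   # any change yields a new key; the old entry stays

print(first == second)  # True
print(first != third)   # True
print(len(store))       # 2
```

Nothing is ever overwritten: a new version of a file gets a new key, and the previous version remains retrievable under its old one.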
A commit has a pointer to a tree, which describes the entire content of the Git repo.