Inside Git: How It Works and the Role of the .git Folder

  • How Git Works Internally

  • Understanding the .git Folder

  • Git Objects: Blob, Tree, Commit

  • How Git Tracks Changes


What Is the .git Folder?

When you run the git init command, Git creates a hidden .git folder inside your working folder. It acts as a database that stores everything Git knows about your project: every version of every file, your branches, and your configuration. This step is called initializing a repository. Without it, Git cannot track changes to your files or folders.

The .git folder contains files and directories such as HEAD, refs, and objects, which together record the history and current state of your project. As users, we rarely need to touch them, because Git commands manage them for us. Avoid editing them by hand unless you know exactly what you're doing.


How Does Git Work Internally?

Git gives you complete control over what gets recorded and when. A newly created file starts out untracked. To start tracking it, run git add <fileName>, which records a snapshot of the file in the staging area (also called the index). When you are ready to save the staged changes permanently, commit them with git commit -m "user message". Each commit gets a unique commit ID (a hash), which you can use later to look up that exact point in the history. HEAD acts like a marker: it normally points to the branch you are on, and through it to your most recent commit. Branches are like timelines, and several of them can coexist without affecting each other's work.

We can see all of this by exploring the anatomy of the .git folder.


Anatomy of .git Folder

First, initialize a repository inside your working folder using the command git init.

git init

Second, enter the .git folder using the command cd .git in your terminal.

cd .git

Third, run the command ls to list its contents.

ls

You will now see a group of files and folders, including HEAD, config, objects, and refs.

Let's read the HEAD file first. Enter the command cat HEAD.

cat HEAD

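You will see ref: refs/heads/main, which means HEAD points to the main branch, so you are on main. (Depending on your Git version and configuration, the default branch may be named master instead.)

Let's look inside the refs folder now.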
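To make the format of the HEAD file concrete, here is a minimal Python sketch that classifies its contents. The function name parse_head is made up for this illustration; it only handles the two forms HEAD normally takes (a symbolic ref to a branch, or a bare commit ID when HEAD is detached).

```python
def parse_head(head_text: str) -> tuple:
    """Classify the contents of a .git/HEAD file.

    Returns ("branch", name) for a symbolic ref such as
    "ref: refs/heads/main", or ("detached", commit_id) otherwise.
    """
    text = head_text.strip()
    prefix = "ref: "
    if text.startswith(prefix):
        ref = text[len(prefix):]
        heads = "refs/heads/"
        # Strip the refs/heads/ prefix to get the short branch name.
        name = ref[len(heads):] if ref.startswith(heads) else ref
        return ("branch", name)
    # No "ref: " prefix: HEAD holds a commit ID directly (detached HEAD).
    return ("detached", text)


print(parse_head("ref: refs/heads/main\n"))  # ('branch', 'main')
```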

cd refs

Then, use the ls command inside it to see what files are present.

ls

Now you see directories like heads and tags. Let's go inside the heads directory.

cd heads

Then run the ls command again.

ls

You will now see main listed, provided you have made at least one commit. In a freshly initialized repository the heads folder is empty, because the branch file is only created by the first commit.


So, what have we learned so far? We can find out which branch we are on by running cat HEAD, and we can see where that branch is stored: refs/heads/main.


Now let's look at another aspect. Read the main file we found earlier by using the command cat main.

cat main

You will see a long hash ID, like 6ce14944xxxxxxxxxxxxxxxxx. This is the ID of the latest commit on the main branch. In other words, HEAD points to main, and main points to this commit. You can verify this by leaving the .git folder and running git log: the latest commit ID shown there matches the one stored in main.
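The two lookups above (HEAD to branch file, branch file to commit ID) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how Git itself is implemented: resolve_head is a hypothetical helper, it only reads loose ref files (refs packed into .git/packed-refs are ignored), and the demo runs against a throwaway directory with a made-up commit ID rather than a real repository.

```python
import tempfile
from pathlib import Path


def resolve_head(git_dir: Path) -> str:
    """Return the commit ID that HEAD currently resolves to.

    Assumption: the branch ref is a loose file under refs/;
    packed refs are not handled in this sketch.
    """
    head = (git_dir / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        # HEAD names a branch; the branch file holds the commit ID.
        return (git_dir / head[len("ref: "):]).read_text().strip()
    # Detached HEAD: the file holds the commit ID directly.
    return head


# Build a throwaway directory that mimics the layout explored above.
# The commit ID is a made-up placeholder, not a real object.
with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp)
    (git_dir / "refs" / "heads").mkdir(parents=True)
    (git_dir / "HEAD").write_text("ref: refs/heads/main\n")
    fake_commit = "ab" * 20
    (git_dir / "refs" / "heads" / "main").write_text(fake_commit + "\n")
    resolved = resolve_head(git_dir)

print(resolved == fake_commit)  # True
```

To try it on a real repository, pass Path(".git") from the working folder instead of the temporary directory.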


Anatomy of objects

objects is a directory we saw earlier inside the .git folder, and it's also very interesting to explore. From your working folder, go into the .git folder and then into objects.

cd .git
cd objects

Now run the command ls here.

ls

You will see a set of folders with two-character hexadecimal names, plus info and pack.

08      43      6c      b7      c0      c7      e2      e7      info
42      51      77      bf      c2      ca      e4      f6      pack

What's this? Hmm! Git has its own way of storing data. Every object (not only commits, but also trees and blobs) is identified by a hash. The first two characters of the hash become the folder name, and the rest of the hash becomes the name of the file that holds the object's content. For example, if the commit ID is 6ce14944xxxxxxxxx, the folder is named 6c, and inside it there is a file named e14944xxxxxxxxx.

Let's open the folder 6c and run the ls command.
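The naming rule is simple enough to express directly. The following Python sketch assumes a 40-character SHA-1 object ID (the default object format); the helper name loose_object_path and the sample ID are invented for illustration.

```python
from pathlib import PurePosixPath


def loose_object_path(object_id: str) -> PurePosixPath:
    """Map an object ID to its loose-object path under .git/objects.

    Assumption: a 40-character hexadecimal SHA-1 ID.
    """
    if len(object_id) != 40:
        raise ValueError("expected a 40-character object ID")
    int(object_id, 16)  # raises ValueError if the ID is not hexadecimal
    # First two characters: folder name. Remaining 38: file name.
    return PurePosixPath(".git/objects") / object_id[:2] / object_id[2:]


sample_id = "ab" + "cd" * 19  # made-up ID, for illustration only
print(loose_object_path(sample_id))
```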

cd 6c
ls

You will see a file named after the rest of the hash, e14944xxxxxxxxx. Can you read the content of this file using the cat command? Let's check.

cat e14944xxxxxxxxx

You will get some output, but it won't be readable. It looks like ��J��k�j��Ү=K%. That is because Git stores objects in zlib-compressed form. To read the object, leave the .git folder and run the command git cat-file -p <commitID>.

git cat-file -p 6ce14944xxxxxxxxx
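Why does cat print garbage while git cat-file prints text? A loose object is a short header (the object type, a space, the content length, and a zero byte) followed by the content, all compressed with zlib. The Python sketch below round-trips that format in memory. The function names are hypothetical, and it works on bytes it builds itself rather than on a file from a real repository.

```python
import zlib


def encode_loose_object(obj_type: str, body: bytes) -> bytes:
    """Build the on-disk bytes of a loose object: zlib(header + body)."""
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return zlib.compress(header + body)


def decode_loose_object(raw: bytes) -> tuple:
    """Decompress a loose object and split it into (type, body)."""
    data = zlib.decompress(raw)
    header, _, body = data.partition(b"\x00")
    obj_type, size = header.decode().split(" ")
    if int(size) != len(body):
        raise ValueError("size in header does not match body length")
    return obj_type, body


raw = encode_loose_object("blob", b"Hello World!\n")
print(raw[:8])                   # compressed bytes, unreadable as text
print(decode_loose_object(raw))  # ('blob', b'Hello World!\n')
```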

Now we get something readable: a tree hash, a parent hash, an author, a committer, and the commit message. (The very first commit in a repository has no parent line.) If this is the commit HEAD points to, can we follow it to its parent? Let's check. Pass the parent hash from the output above to the same command.

git cat-file -p <parentID>

Again we get a tree hash, a parent hash, an author, and a committer, but the values differ from before. Let's run the git log command to double-check. Great! This is the commit immediately before my current commit.

This is similar to a linked list: each commit points back to its parent (a merge commit points to more than one). Can we now access the data from each commit I've made so far? Yes, through the tree hash.
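Following parent hashes by hand, as we just did, can be sketched as a loop. The Python below parses commit text in the shape that git cat-file -p prints and walks first parents until it reaches a commit without one. Everything in the sample data is made up: the short IDs (c1, c2, c3, t1, ...) stand in for real 40-character hashes, and the author lines are placeholders.

```python
def parse_commit(text: str) -> dict:
    """Extract the tree, parents, and message from commit object text."""
    headers, _, message = text.partition("\n\n")
    info = {"tree": None, "parents": [], "message": message.strip()}
    for line in headers.splitlines():
        key, _, value = line.partition(" ")
        if key == "tree":
            info["tree"] = value
        elif key == "parent":
            info["parents"].append(value)
    return info


def first_parent_history(commits: dict, start: str) -> list:
    """Walk from `start` back to the root commit via first parents."""
    history = []
    current = start
    while current is not None:
        history.append(current)
        parents = parse_commit(commits[current])["parents"]
        current = parents[0] if parents else None
    return history


# Hypothetical object store: short fake IDs mapped to commit text.
person = "Example <e@example.com> 0 +0000"
commits = {
    "c1": f"tree t1\nauthor {person}\ncommitter {person}\n\nfirst\n",
    "c2": f"tree t2\nparent c1\nauthor {person}\ncommitter {person}\n\nsecond\n",
    "c3": f"tree t3\nparent c2\nauthor {person}\ncommitter {person}\n\nthird\n",
}
print(first_parent_history(commits, "c3"))  # ['c3', 'c2', 'c1']
```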


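git cat-file -p <treeHash>

Amazing! We can now see our file names, each listed with its mode, its object type (blob), and its hash. Let's check whether the stored content matches what I wrote. Read a file's blob with git cat-file -p <blobHash>, using the hash shown next to the file name.

git cat-file -p <blobHash>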

See the result below😲.

#include <cstdio>
using namespace std;

int main(){
    puts("Hello World!");
    puts("Click here to start the Game.");
    return 0;
}

This is the exact code that I wrote in my code files.
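File contents like these are stored as a blob object, and the blob's ID is computed from the content itself, which is why identical files always get the same hash. Here is a minimal Python sketch of that calculation, assuming the default SHA-1 object format (repositories created with the SHA-256 format use a different hash function). The function name git_blob_id is invented for this example.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the ID Git assigns to a blob with this content.

    Assumption: SHA-1 object format. The hash covers a header
    ("blob", a space, the byte length, a zero byte) plus the content.
    """
    header = b"blob " + str(len(content)).encode() + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


# The empty blob has a well-known ID.
print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(git_blob_id(b'puts("Hello World!");\n'))
```

In a real repository you can compare the result with git hash-object <fileName>, which prints the ID Git would use for that file.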


Flow Diagram


How Does Git Track Changes?

Every time you commit, Git stores a complete snapshot of your tracked files rather than a list of edits. Files that did not change are not stored again; the new tree simply points to the same blob as before. Differences are computed on demand by comparing two snapshots. You can check them yourself with the git diff command and two commit IDs, like this: git diff <commit1> <commit2>. This shows what changed between the two commits.
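To illustrate the idea of computing a diff from two stored versions, here is a small Python sketch using the standard difflib module. It is not Git's diff implementation, and its output may differ in detail from what git diff prints. The file name main.cpp and the two versions are made up for the example.

```python
import difflib


def snapshot_diff(old: str, new: str, name: str) -> list:
    """Return unified-diff lines between two versions of one file."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile=f"a/{name}",
            tofile=f"b/{name}",
            lineterm="",
        )
    )


# Two hypothetical versions of the same file, as stored in two commits.
old_version = 'puts("Hello World!");\n'
new_version = 'puts("Hello World!");\nputs("Click here to start the Game.");\n'

for line in snapshot_diff(old_version, new_version, "main.cpp"):
    print(line)
```

Neither version stores the difference; it is derived only when asked for, by comparing the two complete texts.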


Conclusion

Everything seems difficult and frustrating until you understand what's happening behind the scenes.