DEV Community

Pratham

Inside Git: How It Works and the Role of the .git Folder

You've used git commit hundreds of times. But do you know what actually happens when you run it?

Most developers treat Git like a black box—type the magic spell, hope for the best. When something goes wrong (detached HEAD, lost commits, merge nightmares), they panic because they don't understand the mechanics.

This article changes that.

After reading this, you'll see Git not as a mysterious version control system, but as what it really is: a simple key-value database with some clever pointers on top.

What you'll learn:

  • What the .git folder actually contains (and why deleting it erases everything)
  • How Git stores your files (Blobs, Trees, Commits)
  • What really happens during git add and git commit
  • Why branches are "free" and commits are permanent
  • How to explore Git's internals yourself

Part 1: The .git Folder — Where Everything Lives

When you run git init or git clone, Git creates a hidden folder called .git in your project root. This folder IS your repository.

my-project/
├── src/
├── package.json
├── README.md
└── .git/           ← This IS the repository. Everything else is just "the current version"

[!IMPORTANT]
Delete .git = Delete all history. The project files remain, but every commit, every branch, every bit of version control is gone. The .git folder is not a cache or backup—it IS Git.

What's Inside .git?

# Run this in any Git repository
ls -la .git/

# Typical contents (drawn as an annotated tree):
.git/
├── HEAD                 ← "You Are Here" marker
├── config               ← Repository settings (remotes, user info)
├── description          ← Used by GitWeb (you can ignore this)
├── hooks/               ← Scripts that run on events (commit, push, etc.)
├── index                ← The Staging Area (binary file)
├── logs/                ← History of where HEAD and refs have been (reflog)
├── objects/             ← THE DATABASE: All your files, folders, and commits
├── refs/                ← Branch and tag pointers
│   ├── heads/           ← Local branches (each is a text file!)
│   └── tags/            ← Tags
└── packed-refs          ← Optimization: many refs stored in one file instead of one file each

The Big Three you need to understand:

  1. objects/ — The database storing all content
  2. refs/ — The labels (branches, tags) pointing to commits
  3. HEAD — The pointer saying "you are currently here"

Everything else is configuration or optimization. Master these three, and you master Git.
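
To see the Big Three for yourself, here is a minimal sketch that creates a throwaway repository and peeks at each one. The temp directory and variable names are only for illustration, and it assumes Git's default on-disk layout for refs.

```shell
# Throwaway repository in a temp directory (illustrative; safe to delete afterwards)
repo="$(mktemp -d)"
cd "$repo"
git init -q

# HEAD: a one-line text file naming the branch you are on
head_content="$(cat .git/HEAD)"
echo "HEAD: $head_content"

# refs/heads/: empty until the first commit creates the branch file
branch_files="$(find .git/refs/heads -type f | wc -l | tr -d ' ')"
echo "branch files: $branch_files"

# objects/: the database holds nothing yet in a fresh repository
object_files="$(find .git/objects -type f | wc -l | tr -d ' ')"
echo "object files: $object_files"
```

Run it again after your first commit and both counts go up: that is the database and the label system filling in.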


Part 2: Git Objects — The Building Blocks

Git stores everything as objects in the .git/objects/ directory. There are four object types in total (the fourth is the annotated tag), but only three you need to know:

| Object Type | What It Stores | Analogy |
|---|---|---|
| Blob | File content (just the bytes) | A page of text |
| Tree | Directory structure (list of blobs and other trees) | A table of contents |
| Commit | Snapshot metadata (points to a tree + parent + message) | A chapter in a book |

Why Only Three?

This is Git's genius: by breaking everything into these three primitives, Git can:

  • Deduplicate identical content automatically
  • Verify integrity with cryptographic hashes
  • Build any structure from simple building blocks

2.1 Blobs: The Content Store

A blob (Binary Large Object) stores the raw content of a file—nothing else. No filename. No permissions. Just bytes.

Example: You create a file called hello.txt with content Hello, World!

Git doesn't store:

filename: hello.txt
content: Hello, World!

Git stores ONLY the content (internally prefixed with a small type-and-size header, then zlib-compressed):

blob: Hello, World!

Why no filename? Because the filename is metadata, stored separately in the Tree. This is also how Git can recognize a rename: the blob stays the same and only the Tree entry changes, so Git infers the rename when it compares trees rather than recording it explicitly.

The SHA-1 Hash (Content Address)

Every object gets a 40-character hexadecimal ID: the SHA-1 hash of a short header (type and size) followed by its content:

echo "Hello, World!" | git hash-object --stdin
# Output: 8ab686eafeb1f44702738c8b0f24f2567c36da6d

This hash IS the address. Git stores the blob at:

.git/objects/8a/b686eafeb1f44702738c8b0f24f2567c36da6d
              ↑↑
              First 2 characters = directory name

[!NOTE]
Why hashing matters: If anyone changes even one byte of your file, the hash changes completely. Git uses this to guarantee data integrity—if the hash matches, the content is exactly what was saved.
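
If you want to convince yourself that the hash is nothing magical, you can rebuild it by hand. The sketch below assumes sha1sum from GNU coreutils is available (on macOS, shasum is the usual substitute) and that you are using Git's default SHA-1 object format; the variable names are made up for the example.

```shell
content='Hello, World!'

# Byte length of the content, including the trailing newline that echo would add
size="$(printf '%s\n' "$content" | wc -c | tr -d ' ')"

# Git hashes "blob <size>\0<content>", not the raw content on its own
manual_hash="$(printf 'blob %s\0%s\n' "$size" "$content" | sha1sum | cut -d' ' -f1)"

# Ask Git for the hash of the same bytes
git_hash="$(printf '%s\n' "$content" | git hash-object --stdin)"

echo "manual: $manual_hash"
echo "git:    $git_hash"
```

The two lines should print the same ID, which is why hashing the bare file with sha1sum alone never matches what Git reports.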

Same Content = Same Blob

Create two files with identical content:

echo "Hello, World!" > file1.txt
echo "Hello, World!" > file2.txt

Git creates only ONE blob. Both files point to the same 8ab686e... object. This is how Git saves space—duplicates are free.
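
Here is a small sketch of that deduplication in a throwaway repository (the temp directory and variable names are just for illustration):

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q

echo "Hello, World!" > file1.txt
echo "Hello, World!" > file2.txt
git add file1.txt file2.txt

# Two entries in the staging area...
git ls-files --stage

# ...but they share a single blob hash, and only one object file exists on disk
unique_blobs="$(git ls-files --stage | awk '{print $2}' | sort -u | wc -l | tr -d ' ')"
object_files="$(find .git/objects -type f | wc -l | tr -d ' ')"
echo "unique blobs: $unique_blobs, object files: $object_files"
```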


2.2 Trees: The Directory Structure

A tree is like a directory listing. It contains:

  • Pointers to blobs (files)
  • Pointers to other trees (subdirectories)
  • Filenames and permissions

Example Tree:

100644 blob 8ab686ea... hello.txt
100644 blob 2b7e1f5c... style.css
040000 tree 5d3c8f2a... src/

| Entry | Meaning |
|---|---|
| 100644 | File mode (regular, non-executable file) |
| blob | Object type |
| 8ab686ea... | SHA-1 hash of the blob |
| hello.txt | Filename |

Why this structure?

  • Filenames live HERE, not in blobs
  • Renaming a file = new tree, same blob (efficient!)
  • Subdirectories are just nested trees
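
The "new tree, same blob" point is easy to check. This is a sketch in a throwaway repository; the file names, identity settings, and variable names are all placeholders.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

mkdir src
echo "Hello, World!" > hello.txt
echo "console.log('hi');" > src/app.js
git add .
git commit -q -m "initial"

# Print the root tree: mode, type, hash, and name for each entry
git cat-file -p 'HEAD^{tree}'

blob_before="$(git rev-parse HEAD:hello.txt)"
tree_before="$(git rev-parse 'HEAD^{tree}')"

# Rename the file and commit again
git mv hello.txt greeting.txt
git commit -q -m "rename"

blob_after="$(git rev-parse HEAD:greeting.txt)"
tree_after="$(git rev-parse 'HEAD^{tree}')"

echo "blob: $blob_before -> $blob_after"
echo "tree: $tree_before -> $tree_after"
```

The blob line shows the same hash twice, while the tree hash changes because the tree is where the filename lives.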

Diagram: Tree → Blob Relationship

                         Tree (root directory)
                         ┌──────────────────────────────────────┐
                         │ 100644 blob abc123  index.html       │
                         │ 100644 blob def456  style.css        │
                         │ 040000 tree 789abc  src/             │
                         └──────────────────────────────────────┘
                                        │
                              ┌─────────┴─────────┐
                              ▼                   ▼
                    Blob (abc123)           Tree (789abc)
                    ┌────────────┐          ┌─────────────────────┐
                    │ <html>     │          │ 100644 blob aaa111  │
                    │ <head>     │          │   app.js            │
                    │ ...        │          │ 100644 blob bbb222  │
                    └────────────┘          │   utils.js          │
                                            └─────────────────────┘

2.3 Commits: Snapshots in Time

A commit is the glue that holds everything together. It contains:

| Field | What It Is |
|---|---|
| tree | Pointer to the root tree (the project snapshot) |
| parent | Pointer to the previous commit (none for the first commit, two or more for a merge commit) |
| author | Who wrote the code + timestamp |
| committer | Who made the commit + timestamp |
| message | Your commit message |

Example Commit Object:

tree 5d3c8f2a4b1e0f3d2c1a0b9e8d7c6f5a4b3c2d1e
parent 7a8b9c0d1e2f3a4b5c6d7e8f9a0b1c2d3e4f5a6b
author Piyush <piyush@example.com> 1706712000 +0530
committer Piyush <piyush@example.com> 1706712000 +0530

feat: add user login functionality

Why parent matters:

  • Creates a linked list of history
  • Each commit knows what came before it
  • Commits only look BACKWARD, never forward
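
You can walk that linked list yourself by reading the parent line out of each commit object. The loop below is a sketch for a simple, merge-free history; the commit messages and variable names are invented for the example.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

for msg in first second third; do
  echo "$msg" > notes.txt
  git add notes.txt
  git commit -q -m "$msg"
done

# Follow parent pointers from HEAD back to the root commit
commit="$(git rev-parse HEAD)"
chain=""
while [ -n "$commit" ]; do
  subject="$(git log -1 --format=%s "$commit")"
  chain="$chain $subject"
  # The root commit has no "parent" line, which ends the loop
  commit="$(git cat-file -p "$commit" | sed -n 's/^parent //p' | head -n 1)"
done
echo "walked:$chain"
```

Notice there is no way to run this loop in the other direction: nothing inside "first" mentions "second".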

The Time Travel Analogy

Imagine Git history as a timeline:

  1. You are in 2024 (the latest commit)
  2. You time travel back to 1990 (an older commit)
  3. You decide to stay in 1990 and create a new commit

This new commit branches off from 1990. It doesn't connect to 2024 because, in this new timeline, 2024 hasn't happened yet. If the new commit automatically referenced the "future" commit, it would create a loop rather than a history.

This is why detached commits are orphaned: They're on an alternate timeline that doesn't connect to the main branch.

Diagram: The Complete Object Relationship

┌─────────────────────────────────────────────────────────────────────────────┐
│                           COMPLETE OBJECT RELATIONSHIP                      │
└─────────────────────────────────────────────────────────────────────────────┘

    Commit (abc123)                    Commit (def456)
    ┌──────────────────────┐           ┌──────────────────────┐
    │ tree: 111aaa         │           │ tree: 222bbb         │
    │ parent: def456   ────┼──────────►│ parent: (none)       │
    │ author: Piyush       │           │ author: Piyush       │
    │ message: "Add login" │           │ message: "Initial"   │
    └──────────────────────┘           └──────────────────────┘
              │                                  │
              ▼                                  ▼
        Tree (111aaa)                      Tree (222bbb)
        ┌─────────────┐                    ┌─────────────┐
        │ login.html  │                    │ index.html  │
        │ style.css   │                    │ README.md   │
        │ src/        │                    └─────────────┘
        └─────────────┘                           │
              │                                   ▼
              ▼                            Blobs (files)
        Blobs (files)

    Each commit is a COMPLETE SNAPSHOT, not a diff!
    Git calculates diffs on-the-fly by comparing trees.

[!NOTE]
Commits are snapshots, not diffs. Git doesn't store "what changed." It stores the entire tree at that point. Diffs are calculated when you ask for them by comparing two commits.
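
A quick way to see the snapshot model is to change one file and check what happens to the other. This is a sketch with placeholder file names; it uses git diff-tree to show that the diff is derived from two trees.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "stable" > README.md
echo "v1" > app.txt
git add .
git commit -q -m "v1"

echo "v2" > app.txt
git add app.txt
git commit -q -m "v2"

# README.md did not change, yet both snapshots list it, pointing at the same blob
readme_v1="$(git rev-parse HEAD~1:README.md)"
readme_v2="$(git rev-parse HEAD:README.md)"
echo "README.md blob: $readme_v1 (v1) / $readme_v2 (v2)"

# The diff is computed on demand by comparing the two trees
changed="$(git diff-tree -r --name-only HEAD~1 HEAD)"
echo "changed between snapshots: $changed"
```

A full snapshot per commit stays cheap because unchanged files are just repeated pointers to blobs that already exist.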


Part 3: What Happens During git add

Now that you understand the building blocks, let's trace what happens when you run git add.

The Three Areas

┌─────────────────┐      ┌─────────────────┐      ┌─────────────────┐
│  WORKING DIR    │      │  STAGING AREA   │      │   REPOSITORY    │
│                 │ git  │    (Index)      │ git  │   (.git/objects)│
│  Your files     │ add  │ .git/index      │commit│                 │
│  on disk        │─────►│                 │─────►│  Permanent      │
│                 │      │  "Ready to      │      │  history        │
│                 │      │   commit"       │      │                 │
└─────────────────┘      └─────────────────┘      └─────────────────┘

Step-by-Step: What git add src/login.js Does

1. Hash the file content

# Git calculates the SHA-1 hash of a short header plus the file content
sha1("blob <size>\0" + "content of login.js") = 8ab686eafeb1f...

2. Create a blob object

# If this hash doesn't exist yet, Git creates the blob:
.git/objects/8a/b686eafeb1f44702738c8b0f24f2567c36da6d

3. Update the index (staging area)

# Git updates .git/index with:
# "When you commit, include: src/login.js → blob 8ab686e..."

Verify It Yourself

# Create a file
echo "console.log('Hello');" > test.js

# Stage it
git add test.js

# See what's in the staging area
git ls-files --stage
# Output: 100644 a1b2c3d4e5f6... 0    test.js
#         ↑       ↑              ↑    ↑
#       perms    blob hash    stage  filename

# The blob now exists in .git/objects/
find .git/objects -type f
# You'll see: .git/objects/a1/b2c3d4e5f6...

Key insight: git add doesn't just "stage" a file—it creates the blob object immediately. The staging area is a list of "blobs I want to commit."
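
The same three steps can be reproduced with Git's low-level "plumbing" commands. This is a sketch of roughly what git add does for a single regular file, not a claim about its exact internals; the path and variable names are illustrative.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q

mkdir src
echo "console.log('login');" > src/login.js

# Steps 1 and 2: hash the content and write the blob into .git/objects
blob="$(git hash-object -w src/login.js)"

# Step 3: record "src/login.js -> blob" in the index
git update-index --add --cacheinfo 100644,"$blob",src/login.js

staged="$(git ls-files --stage src/login.js)"
echo "$staged"
```

After this, git status reports src/login.js as a new staged file, exactly as if you had run git add.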


Part 4: What Happens During git commit

When you run git commit, Git does three things:

Step 1: Create Tree Object(s)

Git reads the staging area (.git/index) and creates tree objects representing the directory structure.

Index says:
  - src/login.js → blob 8ab686e
  - src/auth.js  → blob 4d5e6f0
  - style.css    → blob 2b7e1f5

Git creates:
  - Tree for src/ (pointing to login.js and auth.js blobs)
  - Tree for root (pointing to src/ tree and style.css blob)

Step 2: Create Commit Object

Git creates a commit containing:

  • Pointer to the root tree
  • Pointer to the current HEAD commit (parent)
  • Your author info and message

# The new commit object:
tree 5d3c8f2a4b...
parent 7a8b9c0d1e...  ← Current HEAD becomes parent
author Piyush <...>
committer Piyush <...>

feat: add login

Step 3: Update the Branch Pointer

# Before commit:
.git/refs/heads/main → 7a8b9c0d1e...

# After commit:
.git/refs/heads/main → abc123def4...  ← Updated to new commit!

That's it. The branch file is just updated with the new commit's hash.
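
To make the three steps concrete, here is a sketch that performs a commit by hand with plumbing commands instead of git commit. It skips things the real command also does (hooks, reflog messages, and so on), and the file name, message, and variable names are placeholders.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "body { margin: 0; }" > style.css
git add style.css
git commit -q -m "initial"
parent="$(git rev-parse HEAD)"
branch_ref="$(git symbolic-ref HEAD)"   # full ref name of the current branch

echo "body { margin: 1em; }" > style.css
git add style.css

# Step 1: turn the index into tree objects
tree="$(git write-tree)"

# Step 2: create a commit object pointing at that tree and at the old HEAD
new_commit="$(git commit-tree "$tree" -p "$parent" -m "feat: tweak margin")"

# Step 3: move the branch pointer to the new commit
git update-ref "$branch_ref" "$new_commit"

git log --oneline
```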

Diagram: The Complete Flow

┌─────────────────────────────────────────────────────────────────────────────┐
│                        git add → git commit FLOW                            │
└─────────────────────────────────────────────────────────────────────────────┘

  WORKING DIRECTORY              STAGING AREA                REPOSITORY
  ┌─────────────────┐           ┌─────────────┐           ┌─────────────────┐
  │                 │           │             │           │                 │
  │  login.js  ─────┼── add ───►│ login.js    │           │  objects/       │
  │  (modified)     │           │ (blob hash) │           │   ├── blobs     │
  │                 │           │             │── commit─►│   ├── trees     │
  │  style.css ─────┼── add ───►│ style.css   │           │   └── commits   │
  │  (modified)     │           │ (blob hash) │           │                 │
  │                 │           │             │           │  refs/heads/    │
  └─────────────────┘           └─────────────┘           │   └── main ─────┼─┐
                                                          │     (updated)   │ │
                                                          └─────────────────┘ │
                                                                    ▲         │
                                                                    │         │
                                                                    └─────────┘
                                                                  Points to new
                                                                  commit hash

Part 5: Refs and HEAD — The Label System

Branches Are Just Text Files

This is the most liberating Git insight: a branch is literally a text file containing a 40-character hash. (After Git packs refs, the same hash may live as a line in .git/packed-refs instead.)

# See what 'main' branch points to:
cat .git/refs/heads/main
# Output: abc123def456789...

# That's it. That's the entire branch.

Why this matters:

  • Creating a branch is instant (just create a tiny file)
  • Deleting a branch doesn't delete commits
  • A fast-forward merge only moves a pointer; a true merge adds one commit with two parents
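
Here is a sketch that creates and deletes a branch while watching the file system. It assumes the branch is stored as a loose file (the default for a freshly created branch) and a SHA-1 repository; the branch and variable names are only examples.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "v1" > notes.txt
git add notes.txt
git commit -q -m "initial"
tip="$(git rev-parse HEAD)"

# Creating a branch writes one small file under refs/heads/
git branch feature-x
branch_content="$(cat .git/refs/heads/feature-x)"
branch_size="$(wc -c < .git/refs/heads/feature-x | tr -d ' ')"
echo "feature-x contains $branch_content ($branch_size bytes)"

# Deleting the branch removes the file; the commit object is untouched
git branch -q -D feature-x
object_type="$(git cat-file -t "$tip")"
echo "$tip is still a $object_type"
```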

HEAD: The "You Are Here" Marker

HEAD tells Git where you currently are. It usually points to a branch:

cat .git/HEAD
# Output: ref: refs/heads/main

This means: "I'm on the main branch."

Normal vs Detached HEAD

| State | HEAD Contains | What Happens on Commit |
|---|---|---|
| Normal | ref: refs/heads/main | Branch moves forward with you |
| Detached | abc123def456... (raw hash) | No branch moves; commit is orphaned |

Normal State:

HEAD → refs/heads/main → Commit C
                              ↑
                         You commit D
                              ↓
HEAD → refs/heads/main → Commit D

Detached State:

HEAD → Commit B (directly)
            ↑
       You commit D
            ↓
HEAD → Commit D

But 'main' still points to Commit C!
D has no branch. It's an orphan.

[!CAUTION]
Detached HEAD warning: If you commit in detached HEAD state, your work is at risk. Always create a branch (git checkout -b new-branch) before committing if you're detached.
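
A minimal sketch of that advice, watching .git/HEAD change as you go (the branch name new-branch comes from the warning above; everything else is a placeholder):

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

for msg in first second; do
  echo "$msg" > notes.txt
  git add notes.txt
  git commit -q -m "$msg"
done
first="$(git rev-parse HEAD~1)"

# Detach: HEAD now holds a raw commit hash instead of "ref: ..."
git checkout -q --detach "$first"
head_detached="$(cat .git/HEAD)"
echo "detached HEAD: $head_detached"

# Create a branch first, then commit, so the new work has a name
git checkout -q -b new-branch
head_attached="$(cat .git/HEAD)"
echo "attached HEAD: $head_attached"

echo "experiment" > experiment.txt
git add experiment.txt
git commit -q -m "experiment"
```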

What Counts as a "Reference"?

| Reference Type | Example | Stable? |
|---|---|---|
| Branch name | main, feature-x | ✅ Yes |
| Tag | v1.0.0 | ✅ Yes |
| HEAD (on a branch) | HEAD → main → commit | ✅ Yes |
| Detached HEAD | HEAD → commit directly | ❌ Only while you're there! |

Once you leave a detached commit, only the reflog still points to it. When that reflog entry expires, the commit becomes eligible for garbage collection.

Reachability: Why Some Commits Survive

Git's garbage collector deletes objects that cannot be reached from any reference (branch, tag, HEAD, or reflog entry). Reachability is transitive, following parent pointers:

main → C → B → A (all reachable from main)
           ↑
           └─ D (orphan - no branch points here)

Garbage collector:
✓ A is reachable from main (through C → B → A)
✓ B is reachable from main (through C → B)
✓ C is reachable from main (directly)
✗ D is NOT reachable - will be pruned once its reflog entry expires

The Chain of Custody: As long as a branch points to the tip, all ancestor commits are protected because Git follows parent pointers.
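
You can reproduce the A, B, C, D picture and ask Git what is reachable. This is a sketch; it passes --no-reflogs to git fsck so that the reflog is not counted as a reference, which shows what only the reflog is keeping alive. The commit names and variables are illustrative.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Build the main-line history A <- B <- C
for name in A B C; do
  echo "$name" > file.txt
  git add file.txt
  git commit -q -m "$name"
done
branch="$(git symbolic-ref --short HEAD)"

# Commit D on top of B while detached, then walk away from it
git checkout -q --detach HEAD~1
echo "D" > file.txt
git add file.txt
git commit -q -m "D"
d="$(git rev-parse HEAD)"
git checkout -q "$branch"

# Everything reachable from the branch tip via parent pointers
reachable="$(git rev-list "$branch" | wc -l | tr -d ' ')"
echo "commits reachable from $branch: $reachable"

# D is not an ancestor of the branch tip
if git merge-base --is-ancestor "$d" "$branch"; then
  d_status="reachable"
else
  d_status="unreachable"
fi
echo "D is $d_status from $branch"

# Ignoring the reflog, fsck reports D as unreachable
unreachable_d="$(git fsck --unreachable --no-reflogs 2>/dev/null | grep -c "unreachable commit $d")"
echo "fsck lines naming D: $unreachable_d"
```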

The Mountain Climbers Analogy

Imagine a team of mountain climbers roped together:

  • The Helicopter (Branch) is holding the top climber (C)
  • Climber C is holding the rope for Climber B
  • Climber B is holding the rope for Climber A

Even though the helicopter only holds C, climbers A and B don't fall because they're chained to C.

Your detached commit D is a climber who tied their rope to B, but B isn't holding onto D. If the helicopter (branch) doesn't come down to pick up D specifically, D falls.

Key insight: Parent pointers only go backward. B doesn't know D exists.


Part 6: Hands-On Exploration

Commands to Inspect Git Objects

# See what type an object is
git cat-file -t abc123
# Output: commit, tree, blob, or tag

# See the content of an object
git cat-file -p abc123
# Output: The actual content

Example: Trace a Commit to Its Files

# 1. Get the latest commit hash
git rev-parse HEAD
# Output: abc123def456...

# 2. See the commit object
git cat-file -p abc123
# Output:
# tree 111aaa222bbb...
# parent 333ccc444ddd...
# author Piyush <...>
# ... message ...

# 3. See the tree (directory snapshot)
git cat-file -p 111aaa
# Output:
# 100644 blob 555eee666fff    login.html
# 100644 blob 777ggg888hhh    style.css
# 040000 tree 999iii000jjj    src

# 4. See a blob (file content)
git cat-file -p 555eee
# Output: The actual HTML content!

The Reflog: Your Safety Net

Even if you lose a commit, Git remembers where HEAD has been:

git reflog
# Output:
# abc123 HEAD@{0}: commit: feat: add login
# def456 HEAD@{1}: checkout: moving from main to feature
# 789abc HEAD@{2}: commit: fix: typo

Recover a lost commit:

# Find it in reflog
git reflog

# Create a branch to save it
git branch rescue-branch abc123

[!TIP]
By default, reflog entries expire after 30 days for commits that are no longer reachable from a branch and after 90 days for reachable ones. Act quickly!
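
Here is the whole rescue as a sketch you can run in a throwaway repository: lose a commit with a hard reset, then get it back from the reflog. The name rescue-branch matches the example above; the other names are placeholders.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "keep" > notes.txt
git add notes.txt
git commit -q -m "keep me"

echo "precious" > notes.txt
git add notes.txt
git commit -q -m "about to be lost"
lost="$(git rev-parse HEAD)"

# Oops: the branch moves back and the second commit vanishes from git log
git reset -q --hard HEAD~1

# The reflog still remembers where HEAD was before the reset
git reflog
recovered="$(git rev-parse 'HEAD@{1}')"

# Give the commit a branch so it is reachable again
git branch rescue-branch "$recovered"
echo "rescued $recovered"
```

This only works for work that was committed; changes that were never committed are not in the reflog.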


Part 7: The Mental Model Summary

After all this, here's the simple truth:

Git Is a Content-Addressable Filesystem

| Concept | Reality |
|---|---|
| Repository | A folder called .git with a key-value database |
| Blob | File content, addressed by SHA-1 hash |
| Tree | Directory listing, addressed by SHA-1 hash |
| Commit | Metadata pointing to a tree + parent |
| Branch | A text file containing a commit hash |
| HEAD | A text file saying which branch you're on |
| git add | Create blob, update index |
| git commit | Create tree + commit, update branch pointer |

Visual Summary

┌─────────────────────────────────────────────────────────────────────────────┐
│                           GIT'S ARCHITECTURE                                │
└─────────────────────────────────────────────────────────────────────────────┘

                    You (HEAD)
                        │
                        ▼
                   ┌─────────┐
                   │  main   │  ← Branch (text file)
                   └────┬────┘
                        │ (contains hash)
                        ▼
                   ┌─────────┐
                   │ Commit  │  ← Commit object
                   │ abc123  │
                   └────┬────┘
                        │ (tree pointer)
                        ▼
                   ┌─────────┐
                   │  Tree   │  ← Root directory
                   │ 111aaa  │
                   └────┬────┘
                        │ (blob and tree pointers)
            ┌───────────┼───────────┐
            ▼           ▼           ▼
       ┌────────┐  ┌────────┐  ┌────────┐
       │  Blob  │  │  Blob  │  │  Tree  │
       │ file1  │  │ file2  │  │  src/  │
       └────────┘  └────────┘  └────────┘

Conclusion: Why This Knowledge Matters

Understanding Git's internals transforms you from a command-memorizer to a confident user:

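| Before | After |
|---|---|
| "I ran reset and lost my work!" | "My commits are still in the reflog for at least 30 days" |
| "Detached HEAD is scary" | "Just means HEAD points to hash, not branch" |
| "Branches are expensive to create" | "They're just 41-byte text files" |
| "Git is mysterious" | "Git is a key-value store with pointers" |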
"Git is mysterious" "Git is a key-value store with pointers"

Your Next Steps

  1. Explore your own .git folder — Run the commands from Part 6
  2. Create a throwaway repo and experiment — Break things on purpose
  3. Read the hash — When you see error messages with hashes, you now know what they mean

The key insight: Every complex Git operation (rebase, cherry-pick, reset) is just manipulating objects and pointers. Once you see the database, the commands become obvious.


You now understand Git's internals better than most developers who use it every day. Use this power wisely. 🚀


Have questions? Found this helpful? Let me know in the comments below!
