If "diffs" and "snapshots" are leaky abstractions that often enough lead you badly astray, then why insist on these abstractions in the first place?

Why not just teach people the mental model behind Git up front? Objects form an immutable directed acyclic graph, human-readable names point at objects, and there are some rules by which the graph is extended and pruned, and by which names (references) are updated to point at different objects.
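To make that concrete, here is a toy sketch of the model in Python. It is not how Git is implemented: the names `objects`, `refs`, `put`, `reachable` and `prune` are all made up for illustration, and JSON stands in for Git's real object format. It only shows the shape of the idea: objects are stored under a hash of their content, names are plain pointers to hashes, and anything no name can reach may be pruned.

```python
import hashlib
import json

objects = {}  # hash -> immutable object (here: a JSON string)
refs = {}     # human-readable name -> hash


def put(obj):
    """Store an object under the hash of its content and return that hash."""
    data = json.dumps(obj, sort_keys=True)
    key = hashlib.sha1(data.encode()).hexdigest()
    objects[key] = data
    return key


def parents_of(key):
    """Return the hashes this object points at."""
    return json.loads(objects[key])["parents"]


def reachable(start_keys):
    """Walk the graph from the given hashes and collect everything reachable."""
    seen = set()
    stack = list(start_keys)
    while stack:
        key = stack.pop()
        if key not in seen:
            seen.add(key)
            stack.extend(parents_of(key))
    return seen


def prune():
    """Drop objects that no name can reach (a toy garbage collection)."""
    keep = reachable(refs.values())
    for key in list(objects):
        if key not in keep:
            del objects[key]


# Extend the graph: each new node points at its parents.
a = put({"msg": "first", "parents": []})
b = put({"msg": "second", "parents": [a]})
refs["main"] = b

# A side node that only a temporary name points at.
c = put({"msg": "experiment", "parents": [a]})
refs["topic"] = c

# Deleting the name makes the node unreachable, so pruning removes it.
del refs["topic"]
prune()
```

After the last line, `objects` holds only `a` and `b`. Nothing was ever modified in place; the only things that changed were which names exist and where they point.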

This isn't a hard mental model, not for programmers (for whom the tool is intended in the first place). If you know how the most basic pointer-based data structures - a linked list, a tree, a directed graph - work, then learning the actual model isn't hard, and it immediately clarifies why Git does what it does. It should be taught to people up front.

A commit isn't a diff, and it isn't a snapshot. It's a bunch of objects Git creates for you, where the "commit" object points at its parent commits and at a tree, which is built of "tree" and "blob" objects. When Git wants to know how to recreate your file structure, it starts at the "commit" object and walks the graph to discover what files and folders should exist. When you make a change and perform the "commit" action, Git creates a new "commit" object and a new "tree" object for it, and adds more objects to the graph to encode what changed, while reusing previously existing objects for things that did not change. The end state is that if you start at your new "commit" object and walk the graph, the resulting description of your file structure equals what was on your hard drive (strictly speaking, what you had staged) when you made the commit.
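The paragraph above can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not Git's actual code: `store`, `put`, `write_tree`, `commit`, `read_tree` and `checkout` are hypothetical names, files are plain strings in nested dicts, and real Git trees also carry file modes and use a different on-disk encoding.

```python
import hashlib
import json

store = {}  # hash -> (kind, payload)


def put(kind, payload):
    """Store an object under the hash of its kind and content."""
    data = json.dumps([kind, payload], sort_keys=True)
    key = hashlib.sha1(data.encode()).hexdigest()
    store[key] = (kind, payload)
    return key


def write_tree(files):
    """files is a nested dict: str values are file contents, dict values are folders."""
    entries = {}
    for name, value in files.items():
        if isinstance(value, dict):
            entries[name] = write_tree(value)
        else:
            entries[name] = put("blob", value)
    return put("tree", entries)


def commit(files, parents, message):
    """Create a commit object pointing at a tree and at its parent commits."""
    tree = write_tree(files)
    return put("commit", {"tree": tree, "parents": parents, "message": message})


def read_tree(key):
    """Walk the graph from a tree or blob and rebuild the nested dict."""
    kind, payload = store[key]
    if kind == "blob":
        return payload
    return {name: read_tree(child) for name, child in payload.items()}


def checkout(commit_key):
    """Start at a commit object and recover the file structure it describes."""
    _, payload = store[commit_key]
    return read_tree(payload["tree"])


v1 = {"README": "hello\n", "src": {"main.py": "print(1)\n", "util.py": "x = 1\n"}}
c1 = commit(v1, [], "initial")
count_after_first = len(store)

# Change a single file and commit again.
v2 = {"README": "hello\n", "src": {"main.py": "print(2)\n", "util.py": "x = 1\n"}}
c2 = commit(v2, [c1], "change main.py")
new_objects = len(store) - count_after_first
```

The second commit adds four objects: one blob for the new `main.py`, new trees for `src` and for the root, and the commit itself. The blobs for `README` and `util.py` are reused because identical content hashes to the same key, and `checkout(c1)` still returns the old file structure untouched.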

Trying to paper over that with "friendly abstractions" is what makes Git difficult to understand.