==============================================================================================
RAHUL'S ML BLOG -- notes on machine learning, worked out by hand est. 2026
==============================================================================================
home | about | archive | glossary | contact
----------------------------------------------------------------------------------------------
SPECIAL . FOR HACKER NEWS
Cheapest Walk: UCS and A*, Built From Nothing, Every Pop Shown
Posted: 2026-06-17 . Author: Rahul Rai . Tags: search, ucs, a-star, dijkstra, by-hand
============================================================================================
PATH . SPECIAL -- Cheapest Walk: UCS and A* (standalone; no other page on this blog needed)
Every pathfinding post draws a maze with coloured arrows, names f = g + h, and waves at
"admissibility." This page never waves. One town, five places. Every single pop and offer
shown with its arithmetic. The proof that the cheapest pop is already final, worked out with
numbers, not hand-waved. And A* as the few-line change to UCS that it really is -- run on
the same map, every pop shown, the exact moment it wins demonstrated.
THE RULE OF THIS PAGE. Assume you remember nothing. Every word is defined where it first
appears. Anything named more than five lines ago is named again. Every number is worked in
front of you, never "recall that...". One idea per line. A pencil and this page are the
whole kit. Decimals shown to two places; a final digit may sit one off from a longer division.
TOY MAP, REAL RECIPE. One line of honesty up front, so nothing misleads. The RECIPE --
every move, in order -- is exactly Uniform-Cost Search and A* as Dijkstra, Russell, and
Norvig wrote them. The MAP is a toy: five letter-nodes, a handful of roads with small costs,
chosen so every number fits on paper. Where a toy choice could mislead about the method,
the text says so. Trust the recipe; treat the map as a sketchpad.
The whole method, named once so you see the shape, then built piece by piece below:
map + costs -> pile (heap) -> pop cheapest -> stale? toss -> goal? stop
-> seal -> offer each unsealed neighbour -> add if new, replace if cheaper
-> write tab + breadcrumb -> repeat
Three data structures sit on your desk throughout. Named once here so you are not surprised:
the pile (a min-heap of slips), the tab (a dict of cheapest costs found so far), and the
breadcrumb (a dict of predecessors, used to reconstruct the path at the end). Each is fully
spelled out where it first appears below.
** KEY: THE BLIND MAZE TRAP -- why a cost-only search wastes time in the wrong direction
Uniform cost search spreads like a puddle: it explores every road with equal cost
before committing to any direction. It WILL find the cheapest path -- but it
explores the entire map outward in a ring, including all the roads leading away
from the goal, before it reaches the goal.
THE FIX: A*. Add a compass (h) that estimates the remaining cost to the goal.
Each slip on the pile gets scored as cost-so-far + compass-estimate.
The puddle now pushes harder in the direction the compass says is promising.
It still finds the optimal path (as long as the compass never overestimates).
## A Machine That Adds Costs -- and "S→B" Is Not a Number
A town has five places: S (start), A, B, C, G (goal). Every road has a cost painted on it
in horses -- how dear it is to walk that road. Roads run both ways at the same cost.
leaves S : A(1) B(4)
leaves A : S(1) B(1) C(5)
leaves B : S(4) A(1) C(1) G(8)
leaves C : A(5) B(1) G(2)
leaves G : C(2) B(8)
start = S goal = G
Drawn on paper (each number is a road cost):
S
/ \
1 4
/ \
A --1-- B --8-- G
\ / /
5 1 2
\ / /
C ------+
The cheapest walk from S to G is NOT the most direct one (S->B->G = 4+8 = 12).
It is S->A->B->C->G = 1+1+1+2 = 5. Four roads, less than three.
In code this is a dict of dicts: `graph['B'] = {'S': 4, 'A': 1, 'C': 1, 'G': 8}`. Nothing
else. The map never changes. Every confusion that comes later traces back to forgetting this.
The goal: find the walk from S to G whose road-costs add to the smallest total. Not the
walk with the fewest roads -- the walk with the cheapest SUM.
## Fewest Roads Is Not Cheapest -- Therefore You Cannot Commit to a Road
Two ways to reach B. Direct: S→B costs 4. Via A: S→A costs 1, then A→B costs 1, total 2.
Two roads beats one road, 2 < 4. Fewer roads is more expensive here.
That kills the obvious strategy: take the cheapest first road and follow it. A dear opening
road might reveal a cheaper continuation later. You cannot commit to any path early, because
you don't yet know what roads open up ahead.
Which means you must keep ALL partial routes alive until you can PROVE a place is final.
That proof only arrives at one specific moment -- spelled out below -- and it relies on the
costs being ≥ 0. That condition matters; it's the floor the whole argument stands on.
## Three Papers On Your Desk Force Themselves Into Being
If you can't commit, you need a running record. Three pieces of paper appear on your desk
not by choice but by necessity:
**The pile** (call it `q`). A heap -- a list kept so the cheapest item always sits on top.
Each item on the pile is a slip: (cost-so-far, place). The pile holds every frontier
candidate you haven't decided about yet. Heap means: heapq in Python; every push and pop
keeps the smallest on top automatically.
**The tab** (call it `states`). A dict `{place: cheapest cost-so-far found to reach it}`.
When you find a cheaper route to a place, you overwrite the old cost. One number per place,
always the smallest seen so far. Separate from the pile: the pile holds (cost, place) PAIRS
you pop; the tab holds ONE cost PER PLACE you look up. Never blur them.
**The breadcrumb** (call it `previous`). A dict `{place: predecessor}`. When place P gets
a new cheapest route, you write `previous[P] = wherever that cheaper route came from`.
At the end, walk it backward: G→C→B→A→S→(None) and you've recovered the path.
At birth, you're already standing on S for free. No road was walked to reach it. So:
q = [ (0, S) ] pile holds one slip: cost 0, at S
states = { S: 0 } tab says S reached for 0
previous = { S: None } S has no predecessor (the chain stops here)
sealed = { } nothing finalized yet
`states` is NOT empty at birth. Empty would mean "I've reached nothing." But you're
standing on S right now, so S is reached for 0.
## The Wheel: Pull, Discard or Stop or Open
Pull the cheapest slip (cost, place) off the pile. Three cases, in order:
Already sealed? → Discard. Pull again. (A stale copy -- explained right after the walk.)
Is it the goal? → Stop. Walk breadcrumbs back from G to S.
Else: → Seal this place. Then, for every road leaving it:
new = cost + road-cost
For each neighbour on that road:
skip if neighbour is already sealed
neighbour not in tab yet → ADD: push (new, nbr), tab[nbr]=new, prev[nbr]=place
nbr in tab and new < tab[nbr] → REPLACE: push (new, nbr), tab[nbr]=new, prev[nbr]=place
nbr in tab and new ≥ tab[nbr] → offer loses, do nothing
Both ADD and REPLACE push a NEW slip onto the pile. You never edit a slip already on the
pile -- heaps have no eraser. The old pricier slip stays, rides until it surfaces, sees its
place is already sealed, and gets tossed. Popped many times; sealed exactly once.
## The Full UCS Walk (Six Pops, Every Offer Shown)
```
start: q=[(0,S)] states={S:0} previous={S:None} sealed={}
POP (0, S) S not in sealed. S ≠ goal. seal S. offer roads from S:
A: new = 0+1 = 1. A not in states → ADD: states[A]=1, prev[A]=S, push (1,A)
B: new = 0+4 = 4. B not in states → ADD: states[B]=4, prev[B]=S, push (4,B)
pile after: (1,A) (4,B)
POP (1, A) A not sealed. A ≠ goal. seal A. offer roads from A:
S: new = 1+1 = 2. states[S]=0. 2 ≥ 0 → offer loses, skip
B: new = 1+1 = 2. states[B]=4. 2 < 4 → REPLACE: states[B]=2, prev[B]=A, push (2,B)
C: new = 1+5 = 6. C not in states → ADD: states[C]=6, prev[C]=A, push (6,C)
pile after: (2,B) (4,B)stale (6,C)
POP (2, B) B not sealed. B ≠ goal. seal B. offer roads from B:
S: new = 2+4 = 6. states[S]=0. skip. A: new = 2+1 = 3. states[A]=1. skip.
C: new = 2+1 = 3. states[C]=6. 3 < 6 → REPLACE: states[C]=3, prev[C]=B, push (3,C)
G: new = 2+8 = 10. G not in states → ADD: states[G]=10, prev[G]=B, push (10,G)
pile after: (3,C) (4,B)stale (6,C)stale (10,G)
POP (3, C) C not sealed. C ≠ goal. seal C. offer roads from C:
A: new = 3+5 = 8. states[A]=1. skip. B: new = 3+1 = 4. states[B]=2. skip.
G: new = 3+2 = 5. states[G]=10. 5 < 10 → REPLACE: states[G]=5, prev[G]=C, push (5,G)
pile after: (4,B)stale (5,G) (6,C)stale (10,G)stale
POP (4, B) B IS in sealed → STALE. Discard. Pull again.
POP (5, G) G not sealed. G IS the goal → STOP.
```
Walk breadcrumbs back from G:
G ← C (prev[G] = C)
C ← B (prev[C] = B)
B ← A (prev[B] = A)
A ← S (prev[A] = S)
S ← • (prev[S] = None → stop)
Path = [S, A, B, C, G]. Cost = states[G] = 5.
The cheapest walk is not S→B→G (cost 4+8=12, two roads). Not S→B→C→G (4+1+2=7). It is
S→A→B→C→G (1+1+1+2=5). The method found it by keeping all options alive and closing the
cheapest box in each round.
## Two Things the Walk Just Did, Worth Naming
**The tab killed the circle.** When A offered S back, 1+1=2 lost to states[S]=0. No
"visited" set needed. The tab already holds the cheapest cost to every sealed place; any
offer to a sealed place will always lose, because the sealed place was finalized when it
was the global cheapest and -- since every road is ≥ 0 -- nothing can be cheaper than that.
The tab plus the non-negative roads are the circle-breaker, built in, for free. (Drop the
non-negative condition and this stops working: a place's tab could keep improving forever
around a negative loop. The tab breaks circles only because no road ever lowers a total.)
**Stale slips ride harmlessly.** When A found B for cost 2, the old (4,B) slip was already
sitting on the pile. It couldn't be removed -- heaps have no per-item delete. So it rode
along, three more pops later surfaced as round 5, saw B in the sealed set, and died.
One extra discard was the price HERE; in general it's one stale pop per superseded slip, so
a place reached cheaper k times leaves k−1 ghosts on the pile. That is the price of not
having a decrease-key operation. In practice this is called "lazy deletion" and it's how
most real UCS/Dijkstra/A* implementations work (the alternative, an indexed heap with
decrease-key, trades the ghosts for bookkeeping).
## Why the Cheapest Pop Is Already Final (The Proof, With Numbers)
Pop X off the pile. X is the cheapest unsealed place; its current cost is c. Why is c final?
Why can no future route to X be cheaper?
Any route to X that doesn't go through X directly must pass through some other unsealed
place first -- call it Y. Y is unsealed, so states[Y] ≥ c (that's what "cheapest unsealed"
means -- everything else costs at least as much). To then walk from Y to X costs a road ≥ 0.
So the total via Y is ≥ c + 0 = c. Exactly c, at best. No improvement.
With numbers from the walk: after sealing S and A, G first appeared at cost 10 (via B).
Was it safe to leave G alone and seal C first? C was the cheapest unsealed place at cost 3.
Any other route to C passes through an unsealed place costing ≥ 3, plus a road ≥ 0: total
≥ 3. C is final at 3. And from C, G dropped to 5 -- cheaper than the original 10.
**The floor the whole argument stands on:** roads must cost ≥ 0. If a road has negative
cost, walking it subtracts from the total. Then a route through a more expensive place could
end up cheaper after the negative road, and the step "≥ c + 0 ≥ c" breaks. UCS is no longer
guaranteed correct -- it CAN seal a place at a cost a later negative road would have beaten.
(It doesn't fail on every graph with a negative road -- some happen to come out right -- but
the guarantee is gone, which is all you need to stop trusting it.) Worse, on THIS map every
road is two-way, so a single negative road is a negative two-cycle you can loop forever for
unboundedly negative cost -- there is no shortest path to find at all. For directed graphs
with negative edges but no negative cycle, Bellman-Ford handles it, at higher cost.
This correctness argument is due to Dijkstra (1959) and is also why "this IS Dijkstra's
algorithm" -- see the honest note at the end.
## A*: The Same Wheel, One Change
UCS orders the pile by g alone (g = cost-so-far). A* orders it by f = g + h, where h is a
straight-line guess of how far the goal still is. The guess tilts the pile toward goal-
directed places. The wheel, the tab, the breadcrumbs, the sealed set: all identical. Only
the number you push onto the pile changes.
Each slip on the pile carries THREE numbers: (f, g, place). f = g + h sorts the pile; g
is the true cost spent; place is where. The tab, the offer test, and whatever you return at
the end all use g alone. The guess ONLY affects which slip pops first. It never touches the
money.
The straight-line guess -- ignoring the roads, measuring crow-flies -- is:
h(place) = sqrt( (goal_col - place_col)^2 + (goal_row - place_row)^2 )
A place is written (row, col). Column gap first, row gap second. Both are squared so the
order doesn't matter for the value -- but get the mapping right deliberately: a place is
(row, col), not (x, y).
Here is the one thing you cannot fake: the straight-line guess is only a safe under-count
if the map is HONEST -- if every road is at least as long as the straight line between its
two ends. A crow-flies guess can only be a lower bound on road cost when roads don't take
shortcuts the crow can't. So assign positions where every road's length ≤ its painted cost:
S = (0, 0) A = (1, 0) B = (1, 1) C = (2, 1) G = (2, 3)
Check the honesty -- each road's straight-line length vs its cost:
S-A: length 1.00 ≤ cost 1 A-B: length 1.00 ≤ cost 1 B-C: length 1.00 ≤ cost 1
C-G: length 2.00 ≤ cost 2 S-B: length 1.41 ≤ cost 4 A-C: length 1.41 ≤ cost 5
B-G: length 2.24 ≤ cost 8
Every road is at least as long as its crow-flies line. (The first attempt at this section
used flashier coordinates -- and they LIED: a road of cost 1 spanned a straight-line gap of
2, so the guess overshot and the under-count proof below would have been false. The numbers
here are honest by construction.)
h values to G = (2, 3):
h(S) = sqrt( (3-0)^2 + (2-0)^2 ) = sqrt(13) ≈ 3.61
h(A) = sqrt( (3-0)^2 + (2-1)^2 ) = sqrt(10) ≈ 3.16
h(B) = sqrt( (3-1)^2 + (2-1)^2 ) = sqrt(5) ≈ 2.24
h(C) = sqrt( (3-1)^2 + (2-2)^2 ) = sqrt(4) = 2.00
h(G) = 0
Cross-check against the true cheapest road costs to G (S=5, A=4, B=3, C=2): every h is ≤
its true cost (3.61≤5, 3.16≤4, 2.24≤3, 2.00≤2). Each h is a genuine under-count -- which,
proved below, is exactly what makes A* safe.
## The Full A* Walk (Five Pops -- One Fewer Than UCS)
```
start: pile = [(3.61, 0, S)] states = {S:0} prev = {S:None} sealed = {}
POP (3.61, 0, S) S not sealed. S ≠ goal. seal S. offer from S:
A: g=0+1=1, f=1+3.16=4.16 A not in states → states[A]=1, prev[A]=S, push (4.16,1,A)
B: g=0+4=4, f=4+2.24=6.24 B not in states → states[B]=4, prev[B]=S, push (6.24,4,B)
pile: (4.16,1,A) (6.24,4,B)
POP (4.16, 1, A) A not sealed. A ≠ goal. seal A. offer from A:
S: g=1+1=2, f=2+3.61=5.61. states[S]=0. skip.
B: g=1+1=2, f=2+2.24=4.24. states[B]=4. 2<4 → REPLACE: states[B]=2, prev[B]=A,
push (4.24,2,B)
C: g=1+5=6, f=6+2.00=8.00. C not in states → states[C]=6, prev[C]=A, push (8.00,6,C)
pile: (4.24,2,B) (6.24,4,B)stale (8.00,6,C)
POP (4.24, 2, B) B not sealed. B ≠ goal. seal B. offer from B:
S: skip. A: skip.
C: g=2+1=3, f=3+2.00=5.00. states[C]=6. 3<6 → REPLACE: states[C]=3, prev[C]=B,
push (5.00,3,C)
G: g=2+8=10, f=10+0=10.00. G not in states → states[G]=10, prev[G]=B, push (10.00,10,G)
pile: (5.00,3,C) (6.24,4,B)stale (8.00,6,C)stale (10.00,10,G)
POP (5.00, 3, C) C not sealed. C ≠ goal. seal C. offer from C:
A: skip. B: skip.
G: g=3+2=5, f=5+0=5.00. states[G]=10. 5<10 → REPLACE: states[G]=5, prev[G]=C,
push (5.00,5,G)
pile: (5.00,5,G) (6.24,4,B)stale (8.00,6,C)stale (10.00,10,G)stale
POP (5.00, 5, G) G not sealed. G IS the goal → STOP.
```
Same path: [S, A, B, C, G]. Same cost: 5.
Five pops total. UCS needed six (the stale (4,B) was pop 5 in UCS, wasting a round).
Here the stale (6.24,4,B) is still on the pile when G pops -- it was never reached.
Why? After sealing C, the pile held (5.00,5,G) and (6.24,4,B)stale. UCS would sort these
by g: 4 < 5, so stale B would pop first. A* sorted by f: 5.00 < 6.24, so G popped first.
The heuristic pushed G ahead of the stale copy. That is the whole A* win on this map --
one saved discard -- but on a grid with thousands of nodes the savings compound across
hundreds of pops.
## The Guess Is an Under-Count -- Which Is Why A* Returns the Same Answer as UCS
The heuristic h(place) is admissible if it never overcounts the true remaining cost:
h(place) ≤ true cheapest road cost from place to G, always. Here is the careful version
of the claim, because it is easy to get wrong: Euclidean distance is admissible ONLY when
every road costs at least its straight-line length. That is the honesty condition checked
when the coordinates were assigned. Under it, the cheapest road from any place to G is a
sequence of road costs, each ≥ its own straight-line leg, so the whole road sum ≥ the
straight line from place to G -- which is exactly h. So h ≤ true remaining cost. If road
costs do NOT dominate straight-line lengths (a cheap road across a wide gap), Euclidean
distance can overshoot and is NOT admissible -- it is not magic, it is a consequence of the
map being honest.
Why admissibility matters: if h ever overcounted (h(X) > true cost X→G), it could make a
slip for X look more expensive than it really is, pushing X down the pile past slips on a
worse path. A* could then pop the goal by a dearer route before X's cheaper route ever
surfaces, seal it, and return that dearer answer -- the optimality guarantee is gone. An
admissible h can't do this -- it undersells, so it can only pull slips EARLIER, never bury
the cheapest one. A* seals the same correct answer as UCS, in fewer pops.
The stronger h property worth naming: **consistent** (also called monotone). h is
consistent if for every road P→Q of cost c: h(P) ≤ c + h(Q). On an honest map Euclidean
distance satisfies this by the triangle inequality (h(P) ≤ straight-line P→Q ≤ c, plus
h(Q)). Consistency implies admissibility, and it is the property that lets graph-search A*
trust the sealed set: with a consistent h, once a place is sealed its g is final and never
needs revisiting. (With a merely-admissible-but-inconsistent h, textbook A* may need to
reopen sealed nodes.) In practice: an honest-map Euclidean h gives you both. If you write a
custom h, check admissibility first, consistency second.
## Honest Extras (Three Short)
**UCS IS Dijkstra's algorithm.** Edsger Dijkstra published this exact method in 1959 for
finding shortest paths in graphs with non-negative edge weights. UCS is the AI textbook
name (Russell & Norvig). Dijkstra is the CS algorithms name. Same algorithm, same data
structures, same correctness proof. If you knew Dijkstra's but had not seen UCS: you already
knew this. If you knew UCS but not Dijkstra's: now you do. The algorithm has one name on
every graph theory exam and a different name on every AI midterm.
**Why Bellman-Ford instead of UCS for negative edges.** The proof above requires the
cheapest unsealed place to be final. Negative edges break this: a place penned at cost 5
might later be reached for cost 5 + (-3) = 2 via a negative road, which would have beaten
the sealed cost. Bellman-Ford handles this by relaxing every edge |V|−1 times; it's slower
(O(VE) vs O(E log V) for Dijkstra) but correct on directed graphs with negative edges --
provided there is no negative cycle, which makes "shortest path" meaningless for anyone.
(On the binary-heap with lazy deletion shown here the heap can hold up to E entries, so the
literal bound is O(E log E); but log E ≤ 2 log V for any graph, so it collapses to the
usual O(E log V).)
**Why the tab and the pile hold the same fact at birth but are different tools.** Both
start holding one cost-0 entry for S. After one pop they diverge: the pile grows by adding
slips for A and B; the tab grows by adding entries for A and B. But the pile holds PAIRS
(cost, place) you retrieve by popping cheapest. The tab holds ONE number per place you
look up by name. Confusing them -- e.g. using the pile to check if a place has been reached,
or using the tab to get the cheapest pair -- is the single most common first-implementation
bug. The pile answers "what should I explore next?"; the tab answers "what is the cheapest
cost to place P?"
## The Numbers, in Python
Round-by-round, the same numbers from the walk above, as named variables:
```python
# --- town road costs ---
s_to_a, s_to_b = 1, 4
a_to_s, a_to_b, a_to_c = 1, 1, 5
b_to_s, b_to_a, b_to_c, b_to_g = 4, 1, 1, 8
c_to_a, c_to_b, c_to_g = 5, 1, 2
g_to_c, g_to_b = 2, 8
# UCS round 1: pop S (cost 0)
cost_s = 0
new_a_r1 = cost_s + s_to_a # 1 (A not in tab → ADD)
new_b_r1 = cost_s + s_to_b # 4 (B not in tab → ADD)
tab_a, tab_b = new_a_r1, new_b_r1 # tab: A=1, B=4
# UCS round 2: pop A (cost 1)
cost_a = tab_a # 1
new_s_r2 = cost_a + a_to_s # 2 (tab S=0, 2 ≥ 0 → skip)
new_b_r2 = cost_a + a_to_b # 2 (2 < tab B=4 → REPLACE)
new_c_r2 = cost_a + a_to_c # 6 (C not in tab → ADD)
tab_b = new_b_r2 # 2 (was 4)
tab_c = new_c_r2 # 6
# UCS round 3: pop B (cost 2)
cost_b = tab_b # 2
new_c_r3 = cost_b + b_to_c # 3 (3 < tab C=6 → REPLACE)
new_g_r3 = cost_b + b_to_g # 10 (G not in tab → ADD)
tab_c = new_c_r3 # 3 (was 6)
tab_g = new_g_r3 # 10
# UCS round 4: pop C (cost 3)
cost_c = tab_c # 3
new_g_r4 = cost_c + c_to_g # 5 (5 < tab G=10 → REPLACE)
tab_g = new_g_r4 # 5 (was 10)
# UCS round 5: pop (4, B) → stale: B sealed at 2, slip says 4
stale_b_slip = 4
b_sealed_at = 2 # 4 ≠ 2 → toss
# UCS round 6: pop G (cost 5) → GOAL
final_cost_ucs = tab_g # 5
# A* straight-line h values to G=(2,3); coords picked so every road >= its straight line
import math
s_pos, a_pos, b_pos, c_pos, g_pos = (0,0), (1,0), (1,1), (2,1), (2,3)
def h(place_pos, goal_pos):
return math.sqrt((goal_pos[1]-place_pos[1])**2 + (goal_pos[0]-place_pos[0])**2)
h_s = h(s_pos, g_pos) # 3.61 (true cost to G = 5, so 3.61 <= 5: admissible)
h_a = h(a_pos, g_pos) # 3.16 (true 4)
h_b = h(b_pos, g_pos) # 2.24 (true 3)
h_c = h(c_pos, g_pos) # 2.00 (true 2)
h_g = h(g_pos, g_pos) # 0.00
# A* f-values (what goes into the pile to sort it)
# round 1: pop S (f=3.61, g=0) → offer A and B
f_a_r1 = 1 + h_a # 4.16 (pile entry for A)
f_b_r1 = 4 + h_b # 6.24 (pile entry for B)
# round 2: pop A (f=4.16, g=1) → B cheaper
f_b_r2 = 2 + h_b # 4.24 (replaces 6.24)
f_c_r2 = 6 + h_c # 8.00
# round 3: pop B (f=4.24, g=2) → C and G offered
f_c_r3 = 3 + h_c # 5.00 (replaces 8.00)
f_g_r3 = 10 + h_g # 10.00
# round 4: pop C (f=5.00, g=3) → G offered cheaper
f_g_r4 = 5 + h_g # 5.00 (replaces 10.00)
# round 5: cheapest f in pile = (5.00,5,G) < (6.24,4,B)stale → G pops first!
# UCS would pop stale B here (g=4 < g=5). A* skips it.
f_g_final = f_g_r4 # 5.00 ← wins the sort over stale B's 6.24
f_b_stale = f_b_r1 # 6.24 ← never popped; G won first
final_cost_astar = 5 # same answer as UCS, one fewer pop
```
Working heapq implementation -- the recipe above, directly:
```python
import heapq, math
graph = {
'S': {'A': 1, 'B': 4},
'A': {'S': 1, 'B': 1, 'C': 5},
'B': {'S': 4, 'A': 1, 'C': 1, 'G': 8},
'C': {'A': 5, 'B': 1, 'G': 2},
'G': {'C': 2, 'B': 8},
}
def ucs(graph, start, goal):
pile = [(0, start)] # the pile: (g, place) slips
states = {start: 0} # the tab: cheapest cost per place
previous = {start: None} # the breadcrumb: predecessor per place
sealed = set()
while pile:
cost, place = heapq.heappop(pile)
if place in sealed: continue
if place == goal: break
sealed.add(place)
for nbr, road in graph[place].items():
if nbr in sealed: continue
new = cost + road
if nbr not in states or new < states[nbr]:
states[nbr] = new
previous[nbr] = place
heapq.heappush(pile, (new, nbr))
if goal not in states: # goal never reached
return None, math.inf
path, node = [], goal
while node is not None: path.append(node); node = previous[node]
return list(reversed(path)), states[goal]
pos = {'S':(0,0), 'A':(1,0), 'B':(1,1), 'C':(2,1), 'G':(2,3)} # honest map: road >= straight line
def astar(graph, start, goal, pos):
def h(p): return math.sqrt((pos[goal][1]-pos[p][1])**2+(pos[goal][0]-pos[p][0])**2)
pile = [(h(start), 0, start)] # the pile: (f, g, place) slips, sorted by f
states = {start: 0} # the tab: cheapest g per place (NOT f)
previous = {start: None} # the breadcrumb: predecessor per place
sealed = set()
while pile:
_, cost, place = heapq.heappop(pile)
if place in sealed: continue
if place == goal: break
sealed.add(place)
for nbr, road in graph[place].items():
if nbr in sealed: continue
new = cost + road
if nbr not in states or new < states[nbr]:
states[nbr] = new
previous[nbr] = place
heapq.heappush(pile, (new + h(nbr), new, nbr))
if goal not in states:
return None, math.inf
path, node = [], goal
while node is not None: path.append(node); node = previous[node]
return list(reversed(path)), states[goal]
ucs_path, ucs_cost = ucs(graph, 'S', 'G') # ['S','A','B','C','G'], 5
astar_path, astar_cost = astar(graph, 'S', 'G', pos) # ['S','A','B','C','G'], 5
```
## One Breath (Carry This Away)
Keep a pile (min-heap of slips), a tab (cheapest cost per place), and breadcrumbs; pop the
cheapest, discard if already sealed, stop if it's the goal, else seal it and offer every
unsealed neighbour cost+road: add if new, replace if cheaper, ignore if not; safe because
the cheapest unsealed place's cost can't be undercut -- any alternative passes through an
unsealed place costing ≥ that, plus a road ≥ 0. The tab kills backward walks by itself,
no visited-set needed; stale slips die on pop via lazy deletion. A* is the same wheel with
the pile sorted by g + straight-line h instead of g alone, while the tab and returned cost
stay g; admissible h (h ≤ true remaining cost, always) guarantees the same answer, in fewer
pops. UCS and Dijkstra's algorithm are the same thing with different name tags.
On this map: UCS found [S,A,B,C,G] cost 5 in 6 pops (5 seals + 1 stale discard).
A* found the same path and cost in 5 pops (5 seals, stale B never reached the top).
----------------------------------------------------------------------------------------------
SPECIAL . CHEAPEST WALK: UCS AND A* (written for Hacker News; standalone)
Go deeper, same blog, same method:
LSTM From Pencil -- RNN and LSTM From Scratch, Nothing But a Pencil
Transformer From Pencil -- Attention From Scratch, One Number at a Time
Transformers With Pencil -- A Whole Block Worked by Hand
<- Back to all posts
----------------------------------------------------------------------------------------------
(c) 2026 Rahul Rai . pure HTML+CSS, no JavaScript, no trackers .
home . source on GitHub
==============================================================================================