About :: Rahul's ML Blog

==============================================================================================
  RAHUL'S ML BLOG -- notes on machine learning, worked out by hand                    est. 2026
==============================================================================================
  home | about | archive | glossary | contact
----------------------------------------------------------------------------------------------

  ABOUT THIS BLOG
  Last updated: 2026-06-06


  Who Am I
  --------
  Hi, I'm Rahul Rai. I write about machine learning from first principles -- not the
  hype version, the actual nuts-and-bolts. What *is* a rule, really? Why hide part of
  the pile? What is the machine actually doing when it guesses?

  One rule on this blog: no jargon until it's been earned. If I can't explain something
  in words a farmer in 1953 would understand, it hasn't been explained yet. So I ban the
  buzzwords, draw the idea, solve it by pencil, then write the steps -- and only at the
  very end do I stick the textbook labels on.


  What This Blog Is
  -----------------
    - ML fundamentals, derived by hand before any toolbox is opened
    - ASCII diagrams and pencil math, then the real Python
    - Code that actually ran, producing results that actually work
    - Honest about what I don't know


  The 1950 Contract
  -----------------
  Pretend it is 1950.  Print these pages.  On your desk: pencils, erasers, infinite
  graph paper, a blackboard, chalk -- and a room of infinite, tireless CLERKS who will
  add, subtract, multiply, divide, and never complain.  There is no computer and no
  calculator anywhere in the teaching.

  Four promises follow from that:

    1. Nothing in the teaching path ever needs a machine.  The Python at the end of
       each post is a museum exhibit -- for the day you meet a computer.
    2. Every number is recomputed where it is needed.  No page assumes you remember
       the previous one.
    3. Every worked example is followed by a YOUR TURN drill -- do it on your slate
       before reading on; the checked answer is printed right after.
    4. Cost is counted in clerk-steps, not seconds: how many subtractions, how many
       squarings, done by lunch or done by Christmas.

  Your attention is the scarce thing; arithmetic is free.  The blog is written
  accordingly: short sections, heavy pencil work.


  What's Here
  -----------
  A short book in six chapters (twenty-one posts), plus an appendix and a glossary. Read it
  top to bottom -- each chapter draws its ideas by hand, then gathers the Python at the end.

    - CHAPTER 1 -- Predicting House Prices: two estimators from scratch (ask-closest and
      straight-stick) on the California housing pile, by hand.
        part 1 .
        part 2 .
        part 3

    - CHAPTER 2 -- Grading a Guesser: two ways to score a fitted line (MSE and R^2) and how
      to read the dials it sets, on a sheet of cars.
        part 1 .
        part 2

    - CHAPTER 3 -- Sorting Into Bins: sorting breast lumps into sick/well: the S-curve
      guesser, the four-box table, ROC/AUC, the L2 leash, the two-cloud wall (LDA), grid
      search, skewed piles, and macro/micro averaging.
        part 1 .
        part 2 .
        part 3 .
        part 4

    - CHAPTER 4 -- Humble Dials and Wobble Bands: regularisation and uncertainty on the
      diabetes pile: Ridge and Lasso leashes, picking the knob by cross-validation,
      bootstrap wobble bands, the out-of-bag free exam, and the matrix solve by hand.
        part 1 .
        part 2 .
        part 3

    - CHAPTER 5 -- Question Charts and Committees: a machine that asks yes/no questions
      instead of turning dials -- built by hand from blank sheet, then humbled by pruning.
      Then 200 charts vote: bagging, random forest (hide the columns), and gradient
      boosting (a line of stumps each fixing the last one's leftovers). How to make the
      black box confess which columns it leaned on.
        part 1 .
        part 2 .
        part 3

    - CHAPTER 6 -- Finding Patterns Without Answers: unsupervised learning -- sheets with no
      answer column. Distance and the ruler problem, PCA (the strongest direction), K-means
      and hierarchical clustering, both tools on real gene data, and a recommender that fills
      a sheet full of holes.
        part 1 .
        part 2 .
        part 3 .
        part 4 .
        part 5 .
        part 6

    - APPENDIX A -- Classification Reference: all of Chapter 3's concepts in one flip-to page:
      cross-entropy vs MSE, C parameter, LDA with priors, grid search fold safety,
      min-max scaling, precision vs recall in business, PR curves, and averaging.
        classification reference

    - APPENDIX B -- Distance and Clustering Reference: Chapter 6's loose ends -- Hamming and
      Mahalanobis distance, missing-data traps, why crush a room, and the ethics of sorting
      people into piles (bias, privacy, transparency) plus customer segmentation.
        clustering reference

    - Glossary / Decoder Ring -- every plain term on this blog mapped to its textbook
      buzzword, and back again. The whole "ban the jargon" idea, in one table.
        glossary


  What This Blog Isn't
  --------------------
    - AI hype coverage
    - Tutorial-farm content with 30 headers and no depth
    - Sponsored anything


  The Site
  --------
  Plain-text source files wrapped in one <pre> tag each, styled by a single hand-written
  stylesheet. No JavaScript, no tracking, no cookies, no frameworks, no web fonts.
  Authored to 100 columns so it reads the same in a browser or piped through `less` in a
  terminal. Hosted on GitHub Pages.


  Contact
  -------
    Email:  learn.rahul.rai@gmail.com
    GitHub: learnrahulrai-ui

----------------------------------------------------------------------------------------------
  (c) 2026 Rahul Rai . pure HTML+CSS, no JavaScript, no trackers .
  home . source on GitHub
==============================================================================================