==============================================================================================
  RAHUL'S ML BLOG -- notes on machine learning, worked out by hand                    est. 2026
==============================================================================================
----------------------------------------------------------------------------------------------

  CHAPTER 2 . GRADING A GUESSER . PART 2 OF 2
  Reading the Dials: What the Coefficients Say
  Posted: 2026-06-04 . Author: Rahul Rai . Tags: coefficients, interpretation, scaling
  ============================================================================================

  So far the dials have been the machine's private business -- numbers it set for itself
  to make good guesses. But those dials are not just machinery; they are a message. Each
  one is the rule quietly telling you how its column pushes the answer around. Read them
  and the black box starts talking.

  This post asks one innocent-looking question of a sheet of cars -- which column drags
  miles per gallon down the hardest? -- and uses it to spring two traps that catch almost
  everyone. One is a trick of arithmetic. The other is a trick of units, and it is the
  kind of mistake that ends up in published papers.


  ## One Dial Per Column

    guess = d1*disp + d2*horse + d3*weight + d4*accel + nudge
             +--+--+
           turn this dial up by 1 unit of its column,
           the guess moves by d1 -- everything else held still

  A POSITIVE dial pushes the answer up as its column grows; a NEGATIVE dial drags it down.
  The size says how hard. So "drags mpg down hardest" means the MOST NEGATIVE dial. Easy
  -- except the rule hands them to you with the names torn off.
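
  For anyone who wants to poke at the "one unit, one dial" reading, here is a minimal
  Python sketch. The dial values are the made-up ones used throughout this post; the nudge
  of 46.0 and the example car are assumptions invented purely for illustration.

```python
# Made-up dials from this post; NUDGE and the car below are illustrative assumptions.
DIALS = {"disp": -0.01, "horse": -0.04, "weight": -0.007, "accel": 0.12}
NUDGE = 46.0  # hypothetical intercept


def guess(car):
    """Sum of dial * column value over every column, plus the nudge."""
    return sum(DIALS[name] * car[name] for name in DIALS) + NUDGE


car = {"disp": 200, "horse": 100, "weight": 3000, "accel": 15}
bumped = dict(car, horse=car["horse"] + 1)  # one more horsepower, the rest held still

# The guess moves by the horse dial (up to float rounding).
change = guess(bumped) - guess(car)
print(round(change, 6))
```

  Bump any other column by one instead and the change equals that column's dial; that is
  all "everything else held still" means in the arithmetic.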


  ## The Bare Row Problem

    model.coef_  ->  [ -0.01, -0.04, -0.007, +0.12 ]   numbers only, no names
    X.columns    ->  [ disp,  horse,  weight,  accel ]  names, same order

  Two rows, same order, one missing its labels. Zip the names onto the dials and you have
  a shelf you can read:

    zip ->  disp  : -0.01
            horse : -0.04
            weight: -0.007
            accel : +0.12
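
  The zip step is small enough to try straight away. A minimal sketch, assuming the bare
  row and the names are plain Python lists (with a fitted model they would come from
  model.coef_ and X.columns instead):

```python
# The bare row and the names, in the same order (values from this post).
coef_row = [-0.01, -0.04, -0.007, 0.12]
column_names = ["disp", "horse", "weight", "accel"]

# zip() stops at the shorter row without complaint, so check the lengths first.
assert len(coef_row) == len(column_names)

shelf = dict(zip(column_names, coef_row))
for name, dial in shelf.items():
    print(f"{name:<7}: {dial:+}")
```

  The length check is a design choice, not a requirement: without it, a row that is one
  dial short would simply leave the last column off the shelf with no warning.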


  ## Most Negative Means MIN, Not MAX

    number line -- furthest BELOW zero is the strongest drag:

      -0.04      -0.01      -0.007             +0.12
      horse       disp       weight             accel
    <-- most negative                      most positive -->

  !! WARN: -0.04 IS SMALLER THAN -0.007
     The trap: 0.04 is the bigger SIZE, so the eye files -0.04 as the "biggest" number and
     reaches for the maximum. But further below zero is the SMALLER value. "Most negative"
     is the MINIMUM, not the maximum; the maximum here is accel at +0.12, which pushes mpg
     UP. The strongest drag is horse at -0.04.

  So zip the names on, then grab the name sitting on the smallest dial -- the same
  winner-hunt used to pick the best neighbour-count earlier in the blog. The two lines that
  do it are at the end of the post.
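
  A quick way to convince yourself of the min-versus-max point, sketched in Python on the
  same made-up shelf:

```python
shelf = {"disp": -0.01, "horse": -0.04, "weight": -0.007, "accel": 0.12}

lowest = min(shelf.values())    # furthest below zero: the strongest drag
highest = max(shelf.values())   # furthest above zero: the strongest push

# Sorting the names by their dial lays them out like the number line above.
number_line = sorted(shelf, key=shelf.get)

print(lowest, highest, number_line)
```

  Reaching for max() here would crown accel, the one column that pushes mpg up.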


  IN HAND: a bare row of dials [-0.01, -0.04, -0.007, +0.12] with the names zipped
  back on (disp, horse, weight, accel), and the most-negative hunt that crowns horse
  (-0.04) the hardest drag.  This section shows that crown is counterfeit until the
  columns share a ruler.

  ## The Trap Underneath: Dials Wear Their Column's Units

  Here is the deeper trap, the one the easy version walks straight into. A dial is
  "answer-units per ONE unit of its column" -- and every column is measured on its own
  ruler, with its own idea of how big "one unit" is:

    weight dial  = mpg per   1 POUND        (pounds run 1500 ... 5000)
    horse  dial  = mpg per   1 HORSEPOWER   (horsepower runs 45 ... 230)

    a step of "1 pound" is tiny; a step of "1 horsepower" is large.
    so a small weight-dial and a large horse-dial may carry
    the SAME real punch -- the numbers just wear different rulers.

  !! WARN: RAW DIALS ARE NOT COMPARABLE ACROSS COLUMNS
     Comparing -0.007 per pound against -0.04 per horsepower is comparing prices in two
     different currencies without the exchange rate. The bare dial that LOOKS smallest can
     be the weakest or the strongest real drag depending only on how wide its column runs.
     "Which drags hardest" is not honestly answerable from the raw dials alone.

  The fix is the same-ruler trick: replace each value x with (x - column average) / column
  spread, so every column ends up with an average of zero and a spread of one BEFORE the
  dials are set. Now every dial means "mpg per one-spread step of its column" -- one shared
  currency -- and the most-negative one is the honest strongest drag. (The code that does
  this is at the end of the post.)
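
  The same-ruler formula itself fits in a few lines. A minimal sketch using only the
  standard library; the five weights are invented for illustration, and "spread" is taken
  to be the population standard deviation, which is the convention StandardScaler uses:

```python
from statistics import mean, pstdev

# Hypothetical weights in pounds, invented for illustration.
weights = [2000, 2500, 3000, 3500, 4000]

avg = mean(weights)
spread = pstdev(weights)  # population standard deviation

# (x - column average) / column spread, applied to every value in the column
same_ruler = [(x - avg) / spread for x in weights]

print([round(v, 3) for v in same_ruler])
print(round(mean(same_ruler), 6), round(pstdev(same_ruler), 6))
```

  After the transform the column has an average of zero and a spread of one, whatever
  units it started in.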

  >> NOTE: THIS DOESN'T CONTRADICT "OLS NEEDS NO SCALING"
     The straight-stick rule's GUESSES and its MSE and R^2 do not care about scaling --
     rescale a column and its dial rescales inversely, leaving the answer untouched (shown
     in the straight-stick post). Scaling changes nothing about how well it fits. It only
     changes whether the dials are COMPARABLE TO EACH OTHER as a measure of pull.
     Same-ruler is for the reading, not the fitting.
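
  The note above can be checked by hand-rolled arithmetic. This is a simplified sketch with
  a single column (the real rule has four), and the five cars are invented for
  illustration; fit_line is a hypothetical helper, not a library function:

```python
from statistics import mean


def fit_line(xs, ys):
    """One-column least squares: returns (dial, nudge)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    dial = sxy / sxx
    return dial, y_bar - dial * x_bar


# Hypothetical cars, invented for illustration.
pounds = [2000, 2500, 3000, 3500, 4000]
mpg = [33, 28, 25, 20, 18]

kilo_pounds = [p / 1000 for p in pounds]  # same column, different ruler

dial_lb, nudge_lb = fit_line(pounds, mpg)
dial_klb, nudge_klb = fit_line(kilo_pounds, mpg)

guesses_lb = [dial_lb * x + nudge_lb for x in pounds]
guesses_klb = [dial_klb * x + nudge_klb for x in kilo_pounds]

print(dial_lb, dial_klb)
```

  Shrinking the column's numbers by 1000 grows the dial by 1000, and the two lists of
  guesses agree: the fit is untouched, only the dial's units moved.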


  ## And Don't Over-Trust a Single Dial

  ** KEY: "HOLDING THE OTHERS FIXED" CAN BE A FICTION
     A dial means "move this column by one, hold the rest still." But displacement,
     horsepower and weight rise together -- heavy cars tend to have big engines. When
     columns move as a pack (collinearity), the rule cannot cleanly tell their pulls
     apart: it can shuffle weight between their dials, so a single dial can swing wildly,
     even flip sign, with a tiny change in the pile. Read individual dials as a story to
     check, not a verdict to trust. Your physics hunch -- heavier car, worse mileage -- is
     a hypothesis; the dials are evidence, and tangled columns make that evidence shaky.
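
  The dial-shuffling is easiest to see in the extreme case. A minimal sketch, assuming two
  hypothetical columns that move in perfect lockstep (real columns such as weight and
  displacement are only nearly locked, so the effect is milder but of the same kind):

```python
# Hypothetical pack: the second column is always exactly twice the first.
small = [1, 2, 3, 4]
big = [2 * s for s in small]


def guesses(d_small, d_big):
    """Guesses from a two-dial rule with no nudge."""
    return [d_small * s + d_big * b for s, b in zip(small, big)]


# Three very different dial pairs, the last with a flipped sign.
candidates = [(3, 0), (1, 1), (-1, 2)]
for d_small, d_big in candidates:
    print((d_small, d_big), guesses(d_small, d_big))
```

  All three pairs produce the same guesses, so the data cannot say which pair is "right".
  With nearly locked columns the data does pick one pair, but only weakly, which is why a
  small change in the pile can move a single dial a long way.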

  A concrete dial comparison, by pencil, before and after same-ruler.  Suppose two
  columns: weight (pounds, runs 1500-5000) and horsepower (runs 45-230):

    raw dials (each in its own unit):
      weight dial = -0.007  mpg per POUND
      horse  dial = -0.04   mpg per HORSEPOWER

    At a glance, -0.04 looks nearly 6x stronger than -0.007.  But is it?

    The dial says "per ONE unit of its column."  A step of "1 pound" is tiny;
    a step of "1 horsepower" is large.  To compare, rescale both columns to
    spread=1, then refit:

    after same-ruler (StandardScaler):
      weight dial = -3.2   mpg per ONE-SPREAD of weight
      horse  dial = -2.1   mpg per ONE-SPREAD of horsepower

    Now the comparison is fair: weight (-3.2) pulls harder than horse (-2.1)
    per equal-sized step of each column.  The raw dials reversed the ranking.

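    Where the new numbers come from: refit dial = raw dial x column spread.  With a
    (made-up) weight spread of about 460 pounds, -0.007 x 460 ~= -3.2; with a horsepower
    spread of about 52, -0.04 x 52 ~= -2.1.  A one-spread step of weight changes the
    guess by -3.2 mpg; a one-spread step of horsepower changes it by -2.1 mpg.  Same
    currency, honest comparison.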

  >> YOUR TURN
     A weight dial reads -0.005 mpg per pound (made-up), and weight's spread is 800
     pounds.  Convert it to the shared currency: mpg per one-spread step.

     check your slate:  per-spread dial = raw dial x column spread = -0.005 x 800 =
     -4.0.  One typical-sized step in weight (800 pounds) drops the guess by 4 mpg --
     the honest pull, now comparable to any other column's per-spread dial.
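
  The slate check is one multiplication, sketched here in Python with the made-up numbers
  from the exercise. per_spread is a hypothetical helper name; for the plain straight-stick
  rule the result matches what refitting on same-ruler columns would give:

```python
def per_spread(raw_dial, column_spread):
    """Convert 'mpg per one unit' into 'mpg per one-spread step'."""
    return raw_dial * column_spread


answer = per_spread(-0.005, 800)
print(round(answer, 6))
```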


  ## Pencil It

    shelf (already on a shared ruler):
      disp : -0.01    horse : -0.04    weight: -0.007    accel : +0.12

    most negative?  scan for the lowest:
      -0.04  <- horse   (further below zero than -0.01 or -0.007)

    strongest_drag = "horse"
    accel (+0.12) is the only one PUSHING mpg up.

  >> YOUR TURN
     A different shelf (made-up), already on a shared ruler:  cyl +0.05, disp -0.02,
     horse -0.06, accel +0.09.  Name the strongest drag.

     check your slate:  scan for the lowest -- +0.05, -0.02, -0.06, +0.09.  The
     smallest (furthest below zero) is -0.06 = horse.  Most-negative is a MIN: 0.06 is
     the biggest size among the negatives, and below zero the biggest size is the
     lowest value, so horse drags hardest.

  Finding that strongest drag is the cheapest step in the whole blog: d dials need
  d - 1 comparisons -- just 3 for these four.  No room of clerks; one clerk crowns
  the winner before the kettle boils.
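
  The one-clerk scan, written out so the comparisons can be counted. A minimal sketch;
  strongest_drag is a hypothetical helper, and in practice the built-in min does the same
  walk for you:

```python
def strongest_drag(shelf):
    """Return (name of the lowest dial, number of comparisons made)."""
    names = iter(shelf)
    best = next(names)          # the first dial is champion by default
    comparisons = 0
    for name in names:
        comparisons += 1
        if shelf[name] < shelf[best]:
            best = name
    return best, comparisons


shelf = {"disp": -0.01, "horse": -0.04, "weight": -0.007, "accel": 0.12}
print(strongest_drag(shelf))
```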

    1. Each column keeps one dial: + pushes the answer up, - drags it down; size is
       strength.
    2. The rule hands back a bare row of dials -- zip the column names on to read them.
    3. "Most negative / drags hardest" is the MINIMUM dial, not the maximum;
       min(d, key=d.get) returns its name.
    4. Raw dials wear each column's own units -- not comparable until you put the columns
       on one shared ruler first.
    5. When columns move together, a single dial is shaky; treat it as evidence, not proof.


  ## Common Tripwires I Caught

    TRIPWIRE 1:  The bare row problem -- names torn off
       WRONG: try to read model.coef_ directly without knowing which
              number belongs to which column.
       RIGHT: zip the names onto the numbers:
              dials = dict(zip(X.columns, model.coef_)).
              Now each dial has its column's name attached.

    TRIPWIRE 2:  Most negative = MIN, not MAX
       WRONG: 0.04 has the bigger size, so -0.04 is the "biggest" dial;
              reach for max(...).
       RIGHT: -0.04 is further below zero = SMALLER = most negative.
              A negative dial drags the answer DOWN.  The most negative
              is the STRONGEST drag downward.  Use min(...), not max(...).

    TRIPWIRE 3:  min(d, key=d.get) vs plain min(d)
       WRONG: plain min(d) on a shelf of names -> values returns the
              smallest NAME alphabetically ("accel"), not the smallest value.
       RIGHT: min(d, key=d.get) returns the KEY (name) whose VALUE is
              smallest.  It returns a column name, not the dial number.

    TRIPWIRE 4:  Raw dials are NOT comparable across columns
       WRONG: compare -0.007 per pound with -0.04 per horsepower.
       RIGHT: Each dial is "answer-units per ONE unit of its column."
              Different columns have different-sized "one units."
              Put every column on the same ruler (StandardScaler) FIRST,
              then read the dials for an honest comparison.

    TRIPWIRE 5:  Scaling does NOT change the fit -- only the dial
                 readability
       WRONG: "I scaled the columns, so the guesses change."
       RIGHT: Rescale a column and its dial rescales inversely.
              The guesses, MSE, and R^2 stay exactly the same.
              Scaling only changes whether dials are COMPARABLE.

    TRIPWIRE 6:  "Holding the others fixed" can be a fiction
       WRONG: trust each individual dial as "the real effect of this
              column."
       RIGHT: When columns move together (displacement, horsepower,
              weight all rise with car size), the rule cannot cleanly
              separate their pulls.  A single dial can swing wildly or
              even flip sign.  Read dials as evidence, not proof.


  ## The Code, If You Want It

  Nothing above needed a computer -- only pencils, clerks, and patience.  This last
  section is for the day you meet one: the same steps, spoken in Python.

  Two short snippets. The first zips the column names onto the bare dials and grabs the
  most-negative one. The second does the same AFTER putting every column on a shared ruler,
  so the comparison is honest rather than an accident of units.

  >> NEW TO PYTHON? Each named once:
       dict(zip(a, b))    -- pair two rows into a labelled shelf: name -> value
       min(d, key=d.get)  -- the name (key) whose value is smallest, not the smallest
                             name; plain min(d) would sort the names alphabetically
       m[-1]              -- the last step of a pipeline (here, the fitted stick)

    # bare dials, each in its own column's units -- NOT comparable across columns
    dials = dict(zip(X.columns, model.coef_))
    strongest_drag = min(dials, key=dials.get)   # KEY of the smallest VALUE -> a name

    # same-ruler first, THEN read the dials -- now an honest comparison
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.linear_model import LinearRegression

    m = make_pipeline(StandardScaler(), LinearRegression())
    m.fit(X_train, y_train)
    dials = dict(zip(X.columns, m[-1].coef_))     # now on one shared ruler
    strongest_drag = min(dials, key=dials.get)    # an honest comparison

  >> NOTE: IT HANDS BACK A NAME, NOT A NUMBER
     min(dials, key=dials.get) walks the shelf, scores each name by its dial, and returns
     the NAME with the lowest score -- not the dial itself. Plain min(dials) would instead
     compare the names alphabetically and hand back "accel" -- the wrong question answered
     confidently.
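
  The two calls side by side, on the made-up shelf from this post, plus two ways to get
  the number back when you need it as well as the name:

```python
dials = {"disp": -0.01, "horse": -0.04, "weight": -0.007, "accel": 0.12}

by_name = min(dials)                  # compares the KEYS as strings
by_value = min(dials, key=dials.get)  # compares the VALUES, returns the key

# Need the dial too? Look it up, or take the minimum over (name, value) pairs.
dial = dials[by_value]
name, value = min(dials.items(), key=lambda pair: pair[1])

print(by_name, by_value, dial)
```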


  ## The Labels, Last

    Plain term used above                 Standard label
    -----------------------------------   ------------------------------------------
    dial on a column                      coefficient / weight
    the fixed nudge                       intercept
    the bare row of dials                 model.coef_
    put columns on one shared ruler       standardisation
    dials after same-ruler                standardised (beta) coefficients
    columns moving as a pack              collinearity / multicollinearity
    hold the others fixed                 ceteris paribus / partial effect

----------------------------------------------------------------------------------------------
  (c) 2026 Rahul Rai . pure HTML+CSS, no JavaScript, no trackers .
==============================================================================================