<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Econometrics | Patrick Nasser</title><link>https://www.patricknasser.com.br/tags/econometrics/</link><atom:link href="https://www.patricknasser.com.br/tags/econometrics/index.xml" rel="self" type="application/rss+xml"/><description>Econometrics</description><generator>HugoBlox Kit (https://hugoblox.com)</generator><language>en-us</language><lastBuildDate>Sun, 24 May 2026 00:00:00 +0000</lastBuildDate><image><url>https://www.patricknasser.com.br/media/icon_hu_da05098ef60dc2e7.png</url><title>Econometrics</title><link>https://www.patricknasser.com.br/tags/econometrics/</link></image><item><title>The bias your LLM forgets when you log-transform</title><link>https://www.patricknasser.com.br/blog/bias-your-llm-forgets-when-you-log-transform/</link><pubDate>Sun, 24 May 2026 00:00:00 +0000</pubDate><guid>https://www.patricknasser.com.br/blog/bias-your-llm-forgets-when-you-log-transform/</guid><description>
&lt;blockquote class="border-l-4 border-neutral-300 dark:border-neutral-600 pl-4 italic text-neutral-600 dark:text-neutral-400 my-6"&gt;
&lt;p&gt;&lt;strong&gt;TL;DR.&lt;/strong&gt; When you fit a regression on $\ln Y$ and exponentiate the prediction back, you get a biased estimate of $Y$ — Jensen&amp;rsquo;s inequality says so. It&amp;rsquo;s a small detail nobody enforces, every codebase I&amp;rsquo;ve seen ignores it, and LLMs cheerfully reproduce that habit by default. The problem isn&amp;rsquo;t math, it&amp;rsquo;s the knowledge–practice gap, and the way out is shared context (SKILL files, open-source patterns) that bridges domains the model doesn&amp;rsquo;t know to consult.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id="the-knowledgepractice-gap-amplified"&gt;The knowledge–practice gap, amplified&lt;/h2&gt;
&lt;p&gt;This is a consistent mistake I keep running into while vibe coding common data science problems, and it made me realize how much current LLMs amplify the &lt;strong&gt;knowledge–practice gap&lt;/strong&gt; — the distance between what is &lt;em&gt;known&lt;/em&gt; in a field and what is actually &lt;em&gt;done&lt;/em&gt; day to day.&lt;/p&gt;
&lt;p&gt;When training, labs like Anthropic and OpenAI weight their data sources by credibility so the model produces better predictions. Weighting is fragile, though, and that fragility shows up the moment you ask a token-prediction machine to do something where a field&amp;rsquo;s consensus and a field&amp;rsquo;s practice disagree. I see it most clearly with &lt;strong&gt;big numbers&lt;/strong&gt;.&lt;/p&gt;
&lt;h2 id="why-we-log-transform"&gt;Why we log-transform&lt;/h2&gt;
&lt;p&gt;Big numbers are ugly to read (unless they&amp;rsquo;re in your bank account). They&amp;rsquo;re also a pain to interpret, and worse to compare — stock prices against inflation, house prices against rentals, that kind of thing. So data scientists, economists, and any data nerd reach for the &lt;strong&gt;log scale&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Variance shrinks, so heteroscedasticity stops being a fire.&lt;/li&gt;
&lt;li&gt;Coefficients read as elasticities in log–log models and as approximate % changes in log–level models.&lt;/li&gt;
&lt;li&gt;Optimizers tend to behave better on heavy-tailed targets.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If we do it, the LLM does it, which is what brought me here. So far, so good. Then comes &lt;strong&gt;the catch&lt;/strong&gt;.&lt;/p&gt;
&lt;h2 id="the-catch-you-cant-just-exponentiate-back"&gt;The catch: you can&amp;rsquo;t just exponentiate back&lt;/h2&gt;
&lt;p&gt;Once you fit a linear model on the log scale, you have a sample estimate of the conditional mean of $\ln Y$:&lt;/p&gt;
$$\widehat{\ln Y} = \hat{\beta}_0 + \hat{\beta}_1 X$$&lt;p&gt;which estimates the population quantity&lt;/p&gt;
$$E[\ln Y \mid X] = \beta_0 + \beta_1 X$$&lt;p&gt;The naive move — what almost everyone does — is to exponentiate it and call that your prediction of $Y$:&lt;/p&gt;
$$\hat{Y}_{\text{naive}} = e^{\widehat{\ln Y}} = e^{\hat{\beta}_0 + \hat{\beta}_1 X}$$&lt;p&gt;In code, that&amp;rsquo;s the two lines you&amp;rsquo;ve probably written a hundred times:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;y_pred&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_new&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="c1"&gt;# ← the bias lives on this line&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;But this introduces bias, courtesy of &lt;strong&gt;Jensen&amp;rsquo;s inequality&lt;/strong&gt;. The exponential is strictly convex, so for any random variable&lt;/p&gt;
$$E[e^{g(X)}] \geq e^{E[g(X)]}$$&lt;p&gt;with strict inequality whenever there&amp;rsquo;s any real variation in $g(X)$. Applied here:&lt;/p&gt;
$$E[Y \mid X] = E[e^{\ln Y} \mid X] \;\geq\; e^{E[\ln Y \mid X]}$$&lt;p&gt;
&lt;figure &gt;
&lt;div class="flex justify-center "&gt;
&lt;div class="w-full" &gt;
&lt;img alt="Jensen&amp;rsquo;s inequality: the gap between $E[e^X]$ (green) and $e^{E[X]}$ (red) is the bias you inherit by exponentiating naively."
srcset="https://www.patricknasser.com.br/blog/bias-your-llm-forgets-when-you-log-transform/jensen_inequality_hu_66a1ade5860d879a.webp 320w, https://www.patricknasser.com.br/blog/bias-your-llm-forgets-when-you-log-transform/jensen_inequality_hu_27a9784ea975fa2e.webp 480w, https://www.patricknasser.com.br/blog/bias-your-llm-forgets-when-you-log-transform/jensen_inequality_hu_111c21d70feb9f4.webp 760w"
sizes="(max-width: 480px) 100vw, (max-width: 768px) 90vw, (max-width: 1024px) 80vw, 760px"
src="https://www.patricknasser.com.br/blog/bias-your-llm-forgets-when-you-log-transform/jensen_inequality_hu_66a1ade5860d879a.webp"
width="760"
height="451"
loading="lazy" data-zoomable /&gt;&lt;/div&gt;
&lt;/div&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;In plain English: exponentiating the estimated mean of $\ln Y$ does &lt;strong&gt;not&lt;/strong&gt; give you the estimated mean of $Y$. It gives you an estimate that &lt;strong&gt;systematically underestimates&lt;/strong&gt; the true arithmetic mean (with symmetric errors it targets the conditional median instead), and it distorts the predicted variance along with it. The size of the gap depends on the variance of the error term, not on the sample size, so it doesn&amp;rsquo;t go away with more data.&lt;/p&gt;
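&lt;p&gt;A quick simulation makes the &amp;ldquo;more data doesn&amp;rsquo;t help&amp;rdquo; point concrete. This is a minimal sketch under assumptions that are mine, not taken from any real dataset: a made-up process $\ln Y = 1 + 0.5X + \varepsilon$ with normal errors, an error standard deviation of 0.5, and a plain least-squares fit via &lt;code&gt;np.polyfit&lt;/code&gt;. It compares the average naive prediction with the average of $Y$ at three sample sizes.&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(0)
SIGMA = 0.5  # assumed standard deviation of the log-scale error


def naive_ratio(n):
    """Mean naive prediction divided by mean of y, on n simulated points."""
    x = rng.uniform(0.0, 2.0, size=n)
    # Assumed data-generating process: ln y = 1 + 0.5 x + eps, eps ~ N(0, SIGMA^2)
    log_y = 1.0 + 0.5 * x + rng.normal(0.0, SIGMA, size=n)
    y = np.exp(log_y)
    # OLS on the log scale; np.polyfit returns the slope first, then the intercept
    slope, intercept = np.polyfit(x, log_y, deg=1)
    y_naive = np.exp(intercept + slope * x)
    return y_naive.mean() / y.mean()


ratios = {n: naive_ratio(n) for n in (10_000, 100_000, 1_000_000)}
for n, ratio in ratios.items():
    print(f"n={n:,}: naive / actual = {ratio:.3f}")
print(f"exp(-SIGMA^2 / 2) = {np.exp(-SIGMA**2 / 2):.3f}")
```

&lt;p&gt;Under these assumptions the ratio should settle near $e^{-\sigma^2/2}$ rather than drift toward 1 as $n$ grows: a larger sample sharpens the estimate of the wrong target.&lt;/p&gt;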
&lt;p&gt;I won&amp;rsquo;t go through the corrections here. There are several — &lt;strong&gt;Duan&amp;rsquo;s smearing&lt;/strong&gt; for the nonparametric route, the &lt;strong&gt;$e^{\hat\sigma^2/2}$ adjustment&lt;/strong&gt; if you&amp;rsquo;re willing to assume normal errors — none is perfect, and your favorite LLM can find them faster than I can summarize them.&lt;/p&gt;
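&lt;p&gt;For reference, here is a rough sketch of the two corrections named above, not a recommendation of either. If $\ln Y = \beta_0 + \beta_1 X + \varepsilon$ with $\varepsilon \sim N(0, \sigma^2)$, then $E[Y \mid X] = e^{\beta_0 + \beta_1 X} \, e^{\sigma^2/2}$, which is where the normal-theory factor comes from. Duan&amp;rsquo;s smearing replaces that factor with the average of the exponentiated residuals. The simulated data, the error standard deviation of 0.7, and all variable names are assumptions made for illustration.&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical data: ln y = 1 + 0.5 x + eps, with normal errors (sd 0.7)
n = 50_000
x = rng.uniform(0.0, 2.0, size=n)
log_y = 1.0 + 0.5 * x + rng.normal(0.0, 0.7, size=n)
y = np.exp(log_y)

# OLS on the log scale; np.polyfit returns the slope first, then the intercept
slope, intercept = np.polyfit(x, log_y, deg=1)
fitted_log = intercept + slope * x
resid = log_y - fitted_log

# Naive back-transform
naive = np.exp(fitted_log)

# Normal-theory correction: multiply by exp(sigma_hat^2 / 2)
sigma2_hat = resid.var(ddof=2)  # ddof=2 because two coefficients were estimated
normal_corrected = naive * np.exp(sigma2_hat / 2.0)

# Duan's smearing: multiply by the mean of the exponentiated residuals
smear = np.exp(resid).mean()
duan_corrected = naive * smear

print(f"mean of y:            {y.mean():.3f}")
print(f"naive mean:           {naive.mean():.3f}")
print(f"normal-theory mean:   {normal_corrected.mean():.3f}")
print(f"Duan smearing mean:   {duan_corrected.mean():.3f}")
```

&lt;p&gt;For new data, the same idea applies: estimate the factor once on the training residuals, then multiply &lt;code&gt;np.exp(model.predict(X_new))&lt;/code&gt; by it. Both factors assume the error spread does not depend on $X$; if it does, a single multiplier is only an approximation.&lt;/p&gt;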
&lt;h2 id="what-this-looks-like-with-an-llm"&gt;What this looks like with an LLM&lt;/h2&gt;
&lt;p&gt;The problem is: most people don&amp;rsquo;t apply any correction, so the LLM, dutifully following the corpus, doesn&amp;rsquo;t either — unless you tell it to.&lt;/p&gt;
&lt;p&gt;In every single place I&amp;rsquo;ve worked, I&amp;rsquo;ve found people doing the naive conversion without realizing there&amp;rsquo;s a bias. And every single time I asked an LLM to build a model where a log transform was obviously the right call, it did the transform — and then naively transformed the predictions back, with no mention of Jensen.&lt;/p&gt;
&lt;figure style="max-width: 480px; margin: 1.5rem auto; text-align: center;"&gt;
&lt;img src="printclaude1.png"
alt="The prompt: asking Claude to build a model for house prices from a few explanatory variables, with the transforms it would recommend."
style="width: 100%; height: auto;" /&gt;
&lt;figcaption style="font-size: 0.85em; color: var(--hb-color-foreground, #6b7280); margin-top: 0.5rem;"&gt;
&lt;strong&gt;The prompt&lt;/strong&gt; — asking Claude to build a model for house prices from a few explanatory variables, and to choose the appropriate transforms.
&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;figure style="max-width: 480px; margin: 1.5rem auto; text-align: center;"&gt;
&lt;img src="printclaude2.png"
alt="The answer Claude produced in the same session — np.expm1 quietly bringing the prediction back to price space, with no correction for the Jensen bias."
style="width: 100%; height: auto;" /&gt;
&lt;figcaption style="font-size: 0.85em; color: var(--hb-color-foreground, #6b7280); margin-top: 0.5rem;"&gt;
&lt;strong&gt;The answer&lt;/strong&gt;, same session — &lt;code&gt;np.expm1&lt;/code&gt; quietly bringing the prediction back to price space, with no correction for the Jensen bias.
&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;h2 id="so-what-do-we-do"&gt;So what do we do?&lt;/h2&gt;
&lt;p&gt;I don&amp;rsquo;t think there&amp;rsquo;s a clean fix. This is an &lt;strong&gt;epistemic&lt;/strong&gt; problem about how we transmit knowledge and how AI reproduces it. The best lever we have is to write &lt;strong&gt;SKILL files&lt;/strong&gt; — context the model can read at task time — that bridge the domains the base model doesn&amp;rsquo;t know to consult on its own.&lt;/p&gt;
&lt;p&gt;As an open-source advocate I think this gets solved through community, not by labs alone. SKILL files are popping up in different repos every day. Use them. Enhance them. Make sure the AI you&amp;rsquo;re working with has the best context you can give it.&lt;/p&gt;
&lt;p&gt;Speeding up the easy stuff is revolutionary. But there&amp;rsquo;s still an opportunity to &lt;em&gt;close&lt;/em&gt; the knowledge–practice gap, not widen it — and that&amp;rsquo;s where the real social win is right now.&lt;/p&gt;</description></item></channel></rss>