<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="/assets/styled-feed/style.xslt"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Will Smidlein's Blog</title>
  <subtitle>Personal blog and linkblog</subtitle>
  <link href="https://blog.willsmidlein.com/feeds/links.xml" rel="self"/>
  <link href="https://blog.willsmidlein.com"/>
  <updated>2025-12-21T17:52:49.289Z</updated>
  <id>https://blog.willsmidlein.com/</id>
  <author>
    <name>Will Smidlein</name>
  </author>
  
     <entry>
       <title>Mise</title>
       <link href="https://blog.willsmidlein.com/posts/2025/dec/21/mise/"/>
       <id>https://blog.willsmidlein.com/posts/2025/dec/21/mise/</id>
       <published>2025-12-21T16:34:09Z</published>
       <updated>2025-12-21T16:34:09Z</updated>
       <author>
         <name>Will Smidlein</name>
         <email>will@willsmidlein.com</email>
       </author>
       
       <summary>Link post to https://mise.jdx.dev/dev-tools/</summary>
       <content type="html"><![CDATA[<p>Continuing on my theme of “Will provisions a new laptop” I also (somewhat arbitrarily) decided to use <code>mise</code> to manage all my dev tool versions rather than the usual orchestra of pyenv, nvm (wait no fnm), rvm, etc. So far so good, will report back if I have some awful version conflict with system Ruby or something. I have been impressed by the breadth of the ecosystem.</p>]]></content>
     </entry>
  
     <entry>
       <title>Early Career Advice From John Siracusa</title>
       <link href="https://blog.willsmidlein.com/posts/2025/jul/10/career-advice-siracusa/"/>
       <id>https://blog.willsmidlein.com/posts/2025/jul/10/career-advice-siracusa/</id>
       <published>2025-07-10T19:46:17Z</published>
       <updated>2025-07-10T19:46:17Z</updated>
       <author>
         <name>Will Smidlein</name>
         <email>will@willsmidlein.com</email>
       </author>
       <category term="john-siracusa"/>
        <category term="career"/>
        <category term="advice"/>
        <category term="atp"/>
       <summary>Link post to https://atp.fm/647</summary>
       <content type="html"><![CDATA[<blockquote>
<p>If you just graduated with a computer science degree and you have any interest whatsoever in being involved in any kind of startup, or being one of a few people in a small company with a lot of responsibility, <strong>now</strong> is the time to do it - when you’re young.</p>
<p>If you <em>don’t</em> have any interest in that, don’t do it - don’t do it just because you think it’s what people do. But if you’re like, <em>“Oh, I always wanted to be one of five programmers working on a project”</em> or something, <em>“I always wanted to be involved in a startup”</em> or whatever, <strong>now is the time to do it.</strong></p>
<p>It will only get harder to do that later. Having a job like that early in your career, where you were one of a small number of people, will force you to learn how to do a whole bunch of stuff, and that will make you a much more valuable employee when you get tired of the startup world or when you want to go to a company that’s not going to go under. Later, you will have so much more real-world experience and knowledge than people who went to work for IBM right out of school - to throw IBM under the bus - or went to work for some big company.</p>
<p>When you go to a big company, you do learn things on the job, but it’s a much more stable environment in terms of what’s expected of you. You don’t go in there and suddenly have seventeen jobs and have to learn them all now. It’s going to be more sustainably paced, let’s say, but you will learn less; it will take you longer, and your skills will be more narrow because there are 75,000 other people who do their own specialized jobs.</p>
<p>That’s not to say, <em>“Do startups when you’re young because that’s the time to burn you out”</em>. You shouldn’t have burnout even when you’re young. What I am saying is that if you are in a company with a small number of people, it can still be a healthy work environment, and you will still be required to learn how to do way more jobs simply because there just aren’t enough people. Someone’s got to figure out how to administer this Linux machine - congratulations, you’re the sysadmin now. Someone’s got to learn this new API or this new language - congratulations, it’s you and one other person.</p>
<p>You will learn so much and you will be battle-tested. And when your company inevitably goes under - because that’s what happens to most startups - when you apply for that job at Apple or Google or whatever, you should look a lot better than the other candidates because you will literally know how to do more stuff.</p>
<p><cite>- <a href="https://atp.fm/647">John Siracusa on ATP episode 647</a> ▸ 1:34:07</cite></p>
</blockquote>
<p>Great discussion from the ATP folks, highly recommend listening to the entire segment.</p>]]></content>
     </entry>
  
     <entry>
       <title>What Makes A Good Manager</title>
       <link href="https://blog.willsmidlein.com/posts/2025/jun/26/good-manager/"/>
       <id>https://blog.willsmidlein.com/posts/2025/jun/26/good-manager/</id>
       <published>2025-06-26T15:38:05Z</published>
       <updated>2025-06-26T15:38:05Z</updated>
       <author>
         <name>Will Smidlein</name>
         <email>will@willsmidlein.com</email>
       </author>
       <category term="george-mandis"/>
        <category term="engineering-management"/>
        <category term="leadership"/>
       <summary>Link post to https://george.mand.is/2025/06/what-it-takes-to-be-a-good-engineering-manager</summary>
       <content type="html"><![CDATA[<blockquote>
<p>A good manager balances high empathy with high expectations and knows when to pull which lever.</p>
<p><cite>- <a href="https://george.mand.is/2025/06/what-it-takes-to-be-a-good-engineering-manager">George Mandis</a></cite></p>
</blockquote>]]></content>
     </entry>
  
     <entry>
       <title>Unit Tests In Markdown</title>
       <link href="https://blog.willsmidlein.com/posts/2025/may/27/markdown-tests/"/>
       <id>https://blog.willsmidlein.com/posts/2025/may/27/markdown-tests/</id>
       <published>2025-05-27T23:00:48Z</published>
       <updated>2025-05-27T23:00:48Z</updated>
       <author>
         <name>Will Smidlein</name>
         <email>will@willsmidlein.com</email>
       </author>
       <category term="python"/>
        <category term="rust"/>
        <category term="ty"/>
        <category term="astral"/>
        <category term="ruff"/>
       <summary>Link post to https://github.com/astral-sh/ruff/tree/main/crates/ty_python_semantic/resources/mdtest</summary>
       <content type="html"><![CDATA[<p>I was reading this <a href="https://blog.edward-li.com/tech/comparing-pyrefly-vs-ty">great post comparing Pyrefly and Ty</a> and the appendix stood out:</p>
<blockquote>
<p>I just wanted to call out that ty’s tests are written in… MARKDOWN! How cool is that?</p>
<p><cite>- <a href="https://blog.edward-li.com/tech/comparing-pyrefly-vs-ty">Edward Li</a></cite></p>
</blockquote>
<p>Very cool indeed!</p>
<p>I did a bit of digging and found <a href="https://github.com/astral-sh/ruff/blob/main/crates/ty_test/README.md">the Readme</a> as well.</p>]]></content>
     </entry>
  
     <entry>
       <title>Engineering Advice From SuperfastMatt</title>
       <link href="https://blog.willsmidlein.com/posts/2025/apr/6/engineering-advice/"/>
       <id>https://blog.willsmidlein.com/posts/2025/apr/6/engineering-advice/</id>
       <published>2025-04-06T15:31:23Z</published>
       <updated>2025-04-06T15:31:23Z</updated>
       <author>
         <name>Will Smidlein</name>
         <email>will@willsmidlein.com</email>
       </author>
       <category term="superfastmatt"/>
        <category term="advice"/>
       <summary>Link post to https://www.youtube.com/watch?v=LaXMfTrYY20</summary>
       <content type="html"><![CDATA[<blockquote>
<p>As you work your way through an engineering career, you will meet people who are <em>so good</em> at their job, they will be able to tell you if a design will work or not just by looking at it.</p>
<p>And then, as you progress further, you’ll realize that these people are <em>terrible</em> engineers.</p>
<p><cite>- <a href="https://www.youtube.com/watch?v=LaXMfTrYY20">SuperfastMatt</a></cite></p>
</blockquote>]]></content>
     </entry>
  
     <entry>
       <title>A closer look at the details behind the Go port of the TypeScript compiler</title>
       <link href="https://blog.willsmidlein.com/posts/2025/mar/11/typescript-go/"/>
       <id>https://blog.willsmidlein.com/posts/2025/mar/11/typescript-go/</id>
       <published>2025-03-12T00:53:33Z</published>
       <updated>2025-03-12T00:53:33Z</updated>
       <author>
         <name>Will Smidlein</name>
         <email>will@willsmidlein.com</email>
       </author>
       <category term="2ality"/>
        <category term="typescript"/>
        <category term="go"/>
       <summary>Link post to https://2ality.com/2025/03/typescript-in-go.html</summary>
       
     </entry>
  
     <entry>
       <title>Set VS Code as Default Program for All Code Files (Mac)</title>
       <link href="https://blog.willsmidlein.com/posts/2025/mar/11/vscode-default-mac/"/>
       <id>https://blog.willsmidlein.com/posts/2025/mar/11/vscode-default-mac/</id>
       <published>2025-03-12T00:48:22Z</published>
       <updated>2025-03-12T00:48:22Z</updated>
       <author>
         <name>Will Smidlein</name>
         <email>will@willsmidlein.com</email>
       </author>
       <category term="scripts"/>
       <summary>Link post to https://gist.github.com/ws/d3529ff975d7b6994a8e5137e49819c6#file-open-code-in-vscode-sh</summary>
       <content type="html"><![CDATA[<p>Every goddamn time I set up a new Mac, I have to dig around to find that one Terminal CLI command that sets default programs (it’s called <a href="http://duti.org">duti</a>). Then I loop through every file type I can think of and manually execute the command to change it. And every time, I forget an extension and end up cursed—accidentally opening Xcode for that file type for the rest of time. And every time, I swear next time will be the time I automate this.</p>
<p>That time has finally come.</p>]]></content>
     </entry>
  
     <entry>
       <title>H3: Uber’s Hexagonal Hierarchical Spatial Index</title>
       <link href="https://blog.willsmidlein.com/posts/2025/mar/9/uber-h3/"/>
       <id>https://blog.willsmidlein.com/posts/2025/mar/9/uber-h3/</id>
       <published>2025-03-09T21:11:58Z</published>
       <updated>2025-03-09T21:11:58Z</updated>
       <author>
         <name>Will Smidlein</name>
         <email>will@willsmidlein.com</email>
       </author>
       <category term="uber"/>
        <category term="h3"/>
        <category term="geospatial"/>
       <summary>Link post to https://www.uber.com/blog/h3/</summary>
       <content type="html"><![CDATA[<blockquote>
<p>H3 indexes points and shapes into a hexagonal grid. Coordinates can be indexed to cell IDs that each represent a unique cell.</p>
<p>Indexed data can be quickly joined across disparate datasets and aggregated at different levels of precision.</p>
<p>H3 enables a range of algorithms and optimizations based on the grid, including nearest neighbors, shortest path, gradient smoothing, and more.</p>
<p><cite>- <a href="https://h3geo.org/">h3geo.org</a></cite></p>
</blockquote>
<p>Only 8 years late to the draw on this one.</p>
<p>Very neat, and looks like it has <a href="https://h3geo.org/docs/library/migrating-3.x">continued to evolve over time</a> as well.</p>
<p>Hat tip to <a href="https://simonwillison.net/2025/Mar/9/h3-viewer/">Simon Willison</a>.</p>]]></content>
     </entry>
  
     <entry>
       <title>Comparing LLMs For &lt;img&gt; alt text</title>
       <link href="https://blog.willsmidlein.com/posts/2025/mar/1/comparing-llms-for-alt-text/"/>
       <id>https://blog.willsmidlein.com/posts/2025/mar/1/comparing-llms-for-alt-text/</id>
       <published>2025-03-02T05:30:07Z</published>
       <updated>2025-03-02T05:30:07Z</updated>
       <author>
         <name>Will Smidlein</name>
         <email>will@willsmidlein.com</email>
       </author>
       <category term="dries-buytaert"/>
       <summary>Link post to https://dri.es/comparing-local-llms-for-alt-text-generation</summary>
       <content type="html"><![CDATA[<p>In an earlier post I mentioned that I might want to make LLM alt text generation part of my build process. I went down that rabbit hole this evening (more to come on that in a future post), but I came across <a href="https://dri.es/comparing-local-llms-for-alt-text-generation">this delightful post</a>, as well as <a href="https://dri.es/trusting-ai-with-my-images-was-not-easy">his followup with findings after running his 9k images through</a>, and figured I should share.</p>]]></content>
     </entry>
  
     <entry>
       <title>Scotty Peeler Label &amp; Sticker Remover</title>
       <link href="https://blog.willsmidlein.com/posts/2025/mar/1/scotty-peeler/"/>
       <id>https://blog.willsmidlein.com/posts/2025/mar/1/scotty-peeler/</id>
       <published>2025-03-01T19:01:40Z</published>
       <updated>2025-03-01T19:01:40Z</updated>
       <author>
         <name>Will Smidlein</name>
         <email>will@willsmidlein.com</email>
       </author>
       <category term="buy-this"/>
       <summary>Link post to https://www.amazon.com/Scotty-Peelers-Label-Sticker-Remover/dp/B0068QIQVA</summary>
       <content type="html"><![CDATA[<p>Bought this many years ago thinking I would use it once or twice. I use it once or twice <em>a week</em> if not more. You don’t realize how much you rely on this thing until you go to a friend’s house and they don’t have one. Extremely high ROI purchase. Not an ad, not an affiliate link, I have no relationship with the company beyond spending $8 on one of their products once.</p>]]></content>
     </entry>
  
     <entry>
       <title>uv: It&apos;s Really Good</title>
       <link href="https://blog.willsmidlein.com/posts/2025/feb/18/uv-its-realy-good/"/>
       <id>https://blog.willsmidlein.com/posts/2025/feb/18/uv-its-realy-good/</id>
       <published>2025-02-19T00:33:13Z</published>
       <updated>2025-02-19T00:33:13Z</updated>
       <author>
         <name>Will Smidlein</name>
         <email>will@willsmidlein.com</email>
       </author>
       <category term="uv"/>
        <category term="python"/>
       <summary>Link post to https://www.bitecode.dev/p/a-year-of-uv-pros-cons-and-should</summary>
       <content type="html"><![CDATA[<blockquote>
<p>Basically, they took what was working in pip, rye and poetry, and discarded all the stuff that didn’t work. Then they spent months killing tickets to bring it to an insane level of quality.</p>
<p>This cannot be understated, as such a level of quality and dedication is so extremely rare in software that I usually associate it with things like VLC or sqlite. This is the league I consider uv in.</p>
</blockquote>
<blockquote>
<p>Always try uv first. If it doesn’t work (which is very rare), go back to what you did before or find a workaround.</p>
</blockquote>
<p>If you haven’t played with <a href="https://docs.astral.sh/uv/">uv</a> yet, I can’t recommend it enough. Python ecosystem tools come and go but this one feels like it’s got some staying power. Lots of tremendously well thought-out decisions, many of which are laid out in the linked article. No magic, just good abstractions. Such a joy.</p>
<p>I will say I am somewhat confused about who is paying for all of this. Like obviously <a href="https://astral.sh/blog/announcing-astral-the-company-behind-ruff">VCs</a> but uh… why? Definitely not out of the goodness of their hearts. Similar uneasiness around <a href="https://bun.sh/">Bun</a>, which I also love.</p>]]></content>
     </entry>
  
     <entry>
       <title>Harper Reed&apos;s LLM codegen workflow</title>
       <link href="https://blog.willsmidlein.com/posts/2025/feb/18/harper-reed-llm/"/>
       <id>https://blog.willsmidlein.com/posts/2025/feb/18/harper-reed-llm/</id>
       <published>2025-02-18T21:52:46Z</published>
       <updated>2025-02-18T21:52:46Z</updated>
       <author>
         <name>Will Smidlein</name>
         <email>will@willsmidlein.com</email>
       </author>
       <category term="llm"/>
        <category term="harper-reed"/>
       <summary>Link post to https://harper.blog/2025/02/16/my-llm-codegen-workflow-atm/</summary>
       <content type="html"><![CDATA[<blockquote>
<p>For some reason I say “over my skies” a lot when talking about LLMs. I don’t know why. It resonates with me. Maybe it’s because it is beautiful smooth powder skiing, and then all of a sudden you are like “WHAT THE FUCK IS GOING ON!” and are completely lost and suddenly fall off a cliff.</p>
<p>I find that using a planning step […] can help keep things under control. At least you will have a doc you can double-check against. I also do believe that testing is helpful - especially if you are doing wild style aider coding. Helps keep things good, and tight.</p>
</blockquote>
<p>Broad strokes this is very similar to <a href="https://blog.willsmidlein.com/posts/2025/feb/16/llm-side-projects/">my workflow</a> but there are lots of nuggets of wisdom in this post. He also goes into a lot of interesting detail about using LLMs in non-greenfield projects. Just a great read all around.</p>
<p>First time hearing about <a href="https://aider.chat">Aider</a> and <a href="https://github.com/yamadashy/repomix">repomix</a>, excited to try them out.</p>]]></content>
     </entry>
  
     <entry>
       <title>Samuel Colvin On AI Abstractions</title>
       <link href="https://blog.willsmidlein.com/posts/2025/feb/9/pydantic-podcast/"/>
       <id>https://blog.willsmidlein.com/posts/2025/feb/9/pydantic-podcast/</id>
       <published>2025-02-10T01:38:44Z</published>
       <updated>2025-02-10T01:38:44Z</updated>
       <author>
         <name>Will Smidlein</name>
         <email>will@willsmidlein.com</email>
       </author>
       <category term="llm"/>
       <summary>Link post to https://www.latent.space/p/pydantic</summary>
       <content type="html"><![CDATA[<blockquote>
<p>If you’re running a customer service business and you have loads of people sitting answering telephones, the less well trained they are, the less that you trust them, the more that you need to give them a script to go through. […] If you’re doing high net worth banking, you just employ people who you think are going to be charming to other rich people and set them off to go and have coffee with people. […] And the same is true of models. The more intelligent they are, the less we need to tell them, like structure what they go and do and constrain the routes in which they take.</p>
<p>If models are getting faster as quickly as you say they are, then we don’t need agents and we don’t really need any of these abstraction layers. We can just give our model […] access to the internet, cross our fingers and hope for the best. Agents, agent frameworks, graphs, all of this stuff is basically making up for the fact that right now the models are not that clever.</p>
<p><cite><a href="https://www.latent.space/p/pydantic">Samuel Colvin</a> [~00:26:32]</cite></p>
</blockquote>
<p>One of many great tidbits from Samuel in this podcast.</p>
<p>I am generally not a fan of Python (in favor of the clearly far superior JavaScript) but I am a superfan of <a href="https://pydantic.dev/">Pydantic</a>. I was thrilled when <a href="https://ai.pydantic.dev/">Pydantic AI</a> was announced and have continued to follow its developments and iterations closely. I have a strong feeling it will continue to define mental models in the AI SDK space for many years to come.</p>]]></content>
     </entry>
  
     <entry>
       <title>[Video] Sidewalk Chalk Robot</title>
       <link href="https://blog.willsmidlein.com/posts/2025/jan/31/chalkbot/"/>
       <id>https://blog.willsmidlein.com/posts/2025/jan/31/chalkbot/</id>
       <published>2025-01-31T23:48:03Z</published>
       <updated>2025-01-31T23:48:03Z</updated>
       <author>
         <name>Will Smidlein</name>
         <email>will@willsmidlein.com</email>
       </author>
       <category term="diy"/>
        <category term="robotics"/>
        <category term="youtube"/>
        <category term="video"/>
       <summary>Link post to https://www.youtube.com/watch?v=FDYqlQKaD1w</summary>
       
     </entry>
  
     <entry>
       <title>Taylorator: Flood the FM Broadcast Band with Taylor Swift</title>
       <link href="https://blog.willsmidlein.com/posts/2025/jan/27/taylorator/"/>
       <id>https://blog.willsmidlein.com/posts/2025/jan/27/taylorator/</id>
       <published>2025-01-27T18:11:38Z</published>
       <updated>2025-01-27T18:11:38Z</updated>
       <author>
         <name>Will Smidlein</name>
         <email>will@willsmidlein.com</email>
       </author>
       <category term="sdr"/>
       <summary>Link post to https://www.scd31.com/posts/taylorator</summary>
       
     </entry>
  
     <entry>
       <title>Anomalous Tokens in DeepSeek-V3 and r1</title>
       <link href="https://blog.willsmidlein.com/posts/2025/jan/25/glitch-tokens/"/>
       <id>https://blog.willsmidlein.com/posts/2025/jan/25/glitch-tokens/</id>
       <published>2025-01-26T00:34:24Z</published>
       <updated>2025-01-26T00:34:24Z</updated>
       <author>
         <name>Will Smidlein</name>
         <email>will@willsmidlein.com</email>
       </author>
       <category term="llm"/>
       <summary>Link post to https://outsidetext.substack.com/p/anomalous-tokens-in-deepseek-v3-and</summary>
       <content type="html"><![CDATA[<p>Fascinating. Only a matter of time before somebody writes an LLM fuzzer.</p>]]></content>
     </entry>
  
     <entry>
       <title>Presidio</title>
       <link href="https://blog.willsmidlein.com/posts/2025/jan/23/presidio/"/>
       <id>https://blog.willsmidlein.com/posts/2025/jan/23/presidio/</id>
       <published>2025-01-24T02:30:42Z</published>
       <updated>2025-01-24T02:30:42Z</updated>
       <author>
         <name>Will Smidlein</name>
         <email>will@willsmidlein.com</email>
       </author>
       <category term="presidio"/>
       <summary>Link post to https://microsoft.github.io/presidio/</summary>
       <content type="html"><![CDATA[<blockquote>
<p>Presidio helps to ensure sensitive data is properly managed and governed. It provides fast identification and anonymization modules for private entities in text and images such as credit card numbers, names, locations, social security numbers, bitcoin wallets, US phone numbers, financial data and more.</p>
</blockquote>
<a href="https://microsoft.github.io/presidio/#how-it-works" target="_blank"><figure><figcaption><p>How it works</p></figcaption></figure></a>
<p>I was just taking a look at <a href="https://github.com/Chainlit/chainlit">Chainlit</a>, and more specifically <a href="https://docs.chainlit.io/examples/security">this example</a> and saw <a href="https://microsoft.github.io/presidio/">Presidio</a> mentioned.</p>
<p>I have seen basic attempts at doing this with hand-spun regexes in the past and I’ve seen <a href="https://docs.aws.amazon.com/comprehend/latest/dg/how-pii.html">commercial</a> <a href="https://www.assemblyai.com/docs/audio-intelligence/pii-redaction">products</a>, but this feels like it strikes a nice middle ground. Despite the very Microsoft-y website that made me immediately assume it was for C# or .NET, it’s a Python library, and it’s MIT licensed. From their FAQs:</p>
<blockquote>
<p>Microsoft Presidio is not an official Microsoft product. […] The authors and maintainers of Presidio come from [our] <a href="https://microsoft.github.io/code-with-engineering-playbook/">Industry Solutions Engineering</a> team.</p>
</blockquote>]]></content>
     </entry>
  
     <entry>
       <title>Understanding Home Assistant’s Database and Statistics Model</title>
       <link href="https://blog.willsmidlein.com/posts/2025/jan/23/homeassistant-statistics/"/>
       <id>https://blog.willsmidlein.com/posts/2025/jan/23/homeassistant-statistics/</id>
       <published>2025-01-23T22:11:12Z</published>
       <updated>2025-01-23T22:11:12Z</updated>
       <author>
         <name>Will Smidlein</name>
         <email>will@willsmidlein.com</email>
       </author>
       <category term="home-assistant"/>
       <summary>Link post to https://smarthomescene.com/blog/understanding-home-assistants-database-and-statistics-model/</summary>
       
     </entry>
  
     <entry>
       <title>Zero</title>
       <link href="https://blog.willsmidlein.com/posts/2025/jan/23/zero/"/>
       <id>https://blog.willsmidlein.com/posts/2025/jan/23/zero/</id>
       <published>2025-01-23T21:38:22Z</published>
       <updated>2025-01-23T21:38:22Z</updated>
       <author>
         <name>Will Smidlein</name>
         <email>will@willsmidlein.com</email>
       </author>
       <category term="js"/>
        <category term="frontend"/>
       <summary>Link post to https://zero.rocicorp.dev</summary>
       <content type="html"><![CDATA[<p>Terrible name, interesting idea.</p>
<aside>
Hat tip <a href="https://macwright.com/2025/01/11/predictions.html">Tom MacWright</a>
</aside>
]]></content>
     </entry>
  
     <entry>
       <title>Hacking Subaru</title>
       <link href="https://blog.willsmidlein.com/posts/2025/jan/23/hacking-subaru/"/>
       <id>https://blog.willsmidlein.com/posts/2025/jan/23/hacking-subaru/</id>
       <published>2025-01-23T20:22:01Z</published>
       <updated>2025-01-23T20:22:01Z</updated>
       <author>
         <name>Will Smidlein</name>
         <email>will@willsmidlein.com</email>
       </author>
       <category term="sam-curry"/>
        <category term="car-hacking"/>
        <category term="subaru"/>
       <summary>Link post to https://samcurry.net/hacking-subaru</summary>
       <content type="html"><![CDATA[<blockquote>
<p>I bought my mom a 2023 Subaru Impreza with the promise that she would let me borrow it to try and hack it</p>
</blockquote>
<p>Another Sam Curry banger. Don’t miss the <em>Bypassing 2FA</em> portion, it’s a doozy.</p>
<p>First time hearing of <a href="https://github.com/ffuf/ffuf">ffuf</a>, looks neat.</p>]]></content>
     </entry>
  
     <entry>
       <title>Pagefind</title>
       <link href="https://blog.willsmidlein.com/posts/2025/jan/21/pagefind/"/>
       <id>https://blog.willsmidlein.com/posts/2025/jan/21/pagefind/</id>
       <published>2025-01-22T03:38:58Z</published>
       <updated>2025-01-22T03:38:58Z</updated>
       <author>
         <name>Will Smidlein</name>
         <email>will@willsmidlein.com</email>
       </author>
       <category term="rust"/>
        <category term="cbor"/>
        <category term="search"/>
        <category term="meta"/>
       <summary>Link post to https://pagefind.app</summary>
       <content type="html"><![CDATA[<blockquote>
<p>Pagefind is a fully static search library that aims to perform well on large sites, while using as little of your users’ bandwidth as possible, and without hosting any infrastructure.</p>
</blockquote>
<p>Delightful project I accidentally stumbled upon while building this very blog. It pre-computes all the search indexes at build time and then packages them into a gloriously simple frontend. Who knows if I’ll ever post enough that it’s worth using. For the time being, it lives at <a href="/search">/search</a>.</p>
<p>It spits out a directory structure like this:</p>
<pre><code><span><span>dist/pagefind</span></span>
<span><span>├── fragment</span></span>
<span><span>│   ├── en_4733ec7.pf_fragment</span></span>
<span><span>│   ├── en_5c9a98e.pf_fragment</span></span>
<span><span>│   ├── en_7ca223a.pf_fragment</span></span>
<span><span>│   └── en_953c689.pf_fragment</span></span>
<span><span>├── index</span></span>
<span><span>│   └── en_4d96258.pf_index</span></span>
<span><span>├── pagefind-entry.json</span></span>
<span><span>├── pagefind-highlight.js</span></span>
<span><span>├── pagefind-modular-ui.css</span></span>
<span><span>├── pagefind-modular-ui.js</span></span>
<span><span>├── pagefind-ui.css</span></span>
<span><span>├── pagefind-ui.js</span></span>
<span><span>├── pagefind.en_f57a1155c8.pf_meta</span></span>
<span><span>├── pagefind.js</span></span>
<span><span>├── wasm.en.pagefind</span></span>
<span><span>└── wasm.unknown.pagefind</span></span></code></pre>
<p>The .js, .css, and even the wasm stuff all made sense, but I was curious about the binary blobs in the .pf_fragment, .pf_index, and .pf_meta files.</p>
<p>Weirdly (and somewhat ironically), I could not find any documentation on the actual binary format the indexes were being stored as. I poked around a bit before deciding to <a href="https://github.com/CloudCannon/pagefind/tree/main/pagefind">dig into the source code</a>.</p>
<p>With the help of Claude, I’ve figured out that they’re using <a href="https://cbor.io">Concise Binary Object Representation</a> via the <a href="https://crates.io/crates/minicbor">minicbor Rust lib</a> and sort of pieced together the root data structures. I have linked to them below.</p>
<h3>.pf_fragment</h3>
<pre><code><span><span>#[derive(</span><span>Serialize</span><span>, </span><span>Debug</span><span>, </span><span>Clone</span><span>)]</span></span>
<span><span>pub</span><span> struct</span><span> PageFragmentData</span><span> {</span></span>
<span><span>    pub</span><span> url</span><span>:</span><span> String</span><span>,</span></span>
<span><span>    pub</span><span> content</span><span>:</span><span> String</span><span>,</span></span>
<span><span>    pub</span><span> word_count</span><span>:</span><span> usize</span><span>,</span></span>
<span><span>    pub</span><span> filters</span><span>:</span><span> BTreeMap</span><span>&lt;</span><span>String</span><span>, </span><span>Vec</span><span>&lt;</span><span>String</span><span>&gt;&gt;,</span></span>
<span><span>    pub</span><span> meta</span><span>:</span><span> BTreeMap</span><span>&lt;</span><span>String</span><span>, </span><span>String</span><span>&gt;,</span></span>
<span><span>    pub</span><span> anchors</span><span>:</span><span> Vec</span><span>&lt;</span><span>PageAnchorData</span><span>&gt;,</span></span>
<span><span>}</span></span>
<span></span>
<span><span>#[derive(</span><span>Serialize</span><span>, </span><span>Debug</span><span>, </span><span>Clone</span><span>)]</span></span>
<span><span>pub</span><span> struct</span><span> PageAnchorData</span><span> {</span></span>
<span><span>    pub</span><span> element</span><span>:</span><span> String</span><span>,</span></span>
<span><span>    pub</span><span> id</span><span>:</span><span> String</span><span>,</span></span>
<span><span>    pub</span><span> text</span><span>:</span><span> String</span><span>,</span></span>
<span><span>    pub</span><span> location</span><span>:</span><span> u32</span><span>,</span></span>
<span><span>}</span></span></code></pre>
<p><a href="https://github.com/CloudCannon/pagefind/blob/13f3bda84206ef629e5d9bec5f7359d76a526676/pagefind/src/fragments/mod.rs#L5-L21">Code</a></p>
<h3>.pf_index</h3>
<pre><code><span><span>/// A single word index chunk: `pagefind/index/*.pf_index`</span></span>
<span><span>#[derive(</span><span>Encode</span><span>)]</span></span>
<span><span>pub</span><span> struct</span><span> WordIndex</span><span> {</span></span>
<span><span>    #[n(0)]</span></span>
<span><span>    pub</span><span> words</span><span>:</span><span> Vec</span><span>&lt;</span><span>PackedWord</span><span>&gt;,</span></span>
<span><span>}</span></span>
<span></span>
<span><span>/// A single word as an inverse index of all locations on the site</span></span>
<span><span>#[derive(</span><span>Encode</span><span>, </span><span>Clone</span><span>, </span><span>Debug</span><span>)]</span></span>
<span><span>pub</span><span> struct</span><span> PackedWord</span><span> {</span></span>
<span><span>    #[n(0)]</span></span>
<span><span>    pub</span><span> word</span><span>:</span><span> String</span><span>,</span></span>
<span><span>    #[n(1)]</span></span>
<span><span>    pub</span><span> pages</span><span>:</span><span> Vec</span><span>&lt;</span><span>PackedPage</span><span>&gt;,</span></span>
<span><span>}</span></span>
<span></span>
<span><span>/// A set of locations on a given page</span></span>
<span><span>#[derive(</span><span>Encode</span><span>, </span><span>Clone</span><span>, </span><span>Debug</span><span>)]</span></span>
<span><span>pub</span><span> struct</span><span> PackedPage</span><span> {</span></span>
<span><span>    #[n(0)]</span></span>
<span><span>    pub</span><span> page_number</span><span>:</span><span> usize</span><span>, </span><span>// Won't exceed u32 but saves us some into()s</span></span>
<span><span>    #[n(1)]</span></span>
<span><span>    pub</span><span> locs</span><span>:</span><span> Vec</span><span>&lt;</span><span>i32</span><span>&gt;,</span></span>
<span><span>}</span></span></code></pre>
<p><a href="https://github.com/CloudCannon/pagefind/blob/13f3bda84206ef629e5d9bec5f7359d76a526676/pagefind/src/index/index_words.rs#L5-L28">Code</a></p>
<h3>.pf_meta</h3>
<pre><code><span><span>/// All metadata we need to glue together search queries &amp; results</span></span>
<span><span>#[derive(</span><span>Encode</span><span>, </span><span>Debug</span><span>)]</span></span>
<span><span>pub</span><span> struct</span><span> MetaIndex</span><span> {</span></span>
<span><span>    #[n(0)]</span></span>
<span><span>    pub</span><span> version</span><span>:</span><span> String</span><span>,</span></span>
<span><span>    #[n(1)]</span></span>
<span><span>    pub</span><span> pages</span><span>:</span><span> Vec</span><span>&lt;</span><span>MetaPage</span><span>&gt;,</span></span>
<span><span>    #[n(2)]</span></span>
<span><span>    pub</span><span> index_chunks</span><span>:</span><span> Vec</span><span>&lt;</span><span>MetaChunk</span><span>&gt;,</span></span>
<span><span>    #[n(3)]</span></span>
<span><span>    pub</span><span> filters</span><span>:</span><span> Vec</span><span>&lt;</span><span>MetaFilter</span><span>&gt;,</span></span>
<span><span>    #[n(4)]</span></span>
<span><span>    pub</span><span> sorts</span><span>:</span><span> Vec</span><span>&lt;</span><span>MetaSort</span><span>&gt;,</span></span>
<span><span>}</span></span>
<span></span>
<span><span>/// Communicates the pagefind/index/*.pf_index file we need to load</span></span>
<span><span>/// when searching for a word that sorts between `from` and `to`</span></span>
<span><span>#[derive(</span><span>Encode</span><span>, </span><span>PartialEq</span><span>, </span><span>Debug</span><span>)]</span></span>
<span><span>pub</span><span> struct</span><span> MetaChunk</span><span> {</span></span>
<span><span>    #[n(0)]</span></span>
<span><span>    pub</span><span> from</span><span>:</span><span> String</span><span>,</span></span>
<span><span>    #[n(1)]</span></span>
<span><span>    pub</span><span> to</span><span>:</span><span> String</span><span>,</span></span>
<span><span>    #[n(2)]</span></span>
<span><span>    pub</span><span> hash</span><span>:</span><span> String</span><span>,</span></span>
<span><span>}</span></span>
<span></span>
<span><span>#[derive(</span><span>Encode</span><span>, </span><span>Debug</span><span>)]</span></span>
<span><span>pub</span><span> struct</span><span> MetaPage</span><span> {</span></span>
<span><span>    #[n(0)]</span></span>
<span><span>    pub</span><span> hash</span><span>:</span><span> String</span><span>,</span></span>
<span><span>    #[n(1)]</span></span>
<span><span>    pub</span><span> word_count</span><span>:</span><span> u32</span><span>,</span></span>
<span><span>}</span></span></code></pre>
<p><a href="https://github.com/CloudCannon/pagefind/blob/13f3bda84206ef629e5d9bec5f7359d76a526676/pagefind/src/index/index_metadata.rs#L5-L38">Code</a></p>]]></content>
     </entry>
  
     <entry>
       <title>A Marriage Proposal Spoken Entirely in Office Jargon</title>
       <link href="https://blog.willsmidlein.com/posts/2025/jan/15/office-jargon/"/>
       <id>https://blog.willsmidlein.com/posts/2025/jan/15/office-jargon/</id>
       <published>2025-01-15T19:11:31Z</published>
       <updated>2025-01-15T19:11:31Z</updated>
       <author>
         <name>Will Smidlein</name>
         <email>will@willsmidlein.com</email>
       </author>
       <category term="mcsweeneys"/>
       <summary>Link post to https://www.mcsweeneys.net/articles/a-marriage-proposal-spoken-entirely-in-office-jargon</summary>
       
     </entry>
  
     <entry>
       <title>Programming With LLMs</title>
       <link href="https://blog.willsmidlein.com/posts/2025/jan/6/programming-with-llms/"/>
       <id>https://blog.willsmidlein.com/posts/2025/jan/6/programming-with-llms/</id>
       <published>2025-01-07T02:20:59Z</published>
       <updated>2025-01-07T02:20:59Z</updated>
       <author>
         <name>Will Smidlein</name>
         <email>will@willsmidlein.com</email>
       </author>
       <category term="llm"/>
        <category term="crawshaw.io"/>
       <summary>Link post to https://crawshaw.io/blog/programming-with-llms</summary>
       <content type="html"><![CDATA[<blockquote>
<p>The ideal task for an LLM is one where it needs to use a lot of common libraries (more than a human can remember, so it is doing a lot of small-scale research for you), working to an interface you designed or produces a small interface you can verify as sensible quickly, and it can write readable tests.</p>
<p><cite><a href="https://crawshaw.io/blog/programming-with-llms">David Crawshaw</a></cite></p>
</blockquote>]]></content>
     </entry>
  
     <entry>
       <title>Self-driving a 1993 Volvo 940 with Openpilot</title>
       <link href="https://blog.willsmidlein.com/posts/2025/jan/4/self-driving-volvo/"/>
       <id>https://blog.willsmidlein.com/posts/2025/jan/4/self-driving-volvo/</id>
       <published>2025-01-04T23:34:08Z</published>
       <updated>2025-01-04T23:34:08Z</updated>
       <author>
         <name>Will Smidlein</name>
         <email>will@willsmidlein.com</email>
       </author>
       <category term="self-driving"/>
        <category term="comma.ai"/>
       <summary>Link post to https://practicapp.com/carbagepilot-part1/</summary>
       <content type="html"><![CDATA[<p><a href="https://www.imdb.com/title/tt1310479/">Prototype This</a> was <a href="https://www.youtube.com/watch?v=9JG9dA41ZFs">ahead of its time</a></p>]]></content>
     </entry>
</feed>