<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" 
  xmlns:atom="http://www.w3.org/2005/Atom"
  xmlns:content="http://purl.org/rss/1.0/modules/content/"
  xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Systems and Strides</title>
    <link>https://shsin.blog</link>
    <description>Personal blog about AI, engineering and endurance activities.</description>
    <language>en-us</language>
    <lastBuildDate>Sat, 28 Mar 2026 14:04:34 GMT</lastBuildDate>
    <atom:link href="https://shsin.blog/feed.xml" rel="self" type="application/rss+xml"/>
    <generator>Next.js</generator>
    <managingEditor>mailme.shantanu@gmail.com (Shantanu Singh)</managingEditor>
    <webMaster>mailme.shantanu@gmail.com (Shantanu Singh)</webMaster>
    
    <item>
      <title><![CDATA[The 5 Stages of Claude Code Mastery]]></title>
      <link>https://shsin.blog/posts/claude-code-mastery-levels</link>
      <guid isPermaLink="true">https://shsin.blog/posts/claude-code-mastery-levels</guid>
      <pubDate>Fri, 27 Mar 2026 00:00:00 GMT</pubDate>
      <dc:creator>Shantanu Singh</dc:creator>
      <category>claude</category>
      <description><![CDATA[A field guide to the Dunning-Kruger curve of AI-assisted programming, from denial to mass enlightenment.]]></description>
      <content:encoded><![CDATA[<p>Andrej Karpathy tweeted nine words in February 2025 and accidentally started a religion:</p>

        <div class="tweet-embed-placeholder" data-tweet-id="1886192184808149383"></div>
      
<p>~34k likes. Collins Dictionary Word of the Year. A million LinkedIn posts about "the future of development." One year later, Karpathy hand-wrote his own app because Claude agents were "net unhelpful" for it.</p>
<p>That's the whole arc of AI-assisted coding in two paragraphs. But I've watched enough people walk this path that I can now identify exactly five stages.</p>
<div class="excalidraw-diagram" data-scene="{
  "type": "excalidraw",
  "version": 2,
  "source": "https://excalidraw.com",
  "elements": [
    {
      "type": "text",
      "version": 1,
      "id": "title",
      "x": 160,
      "y": 8,
      "width": 1080,
      "height": 55,
      "angle": 0,
      "strokeColor": "#1e1e1e",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 2,
      "roughness": 1,
      "opacity": 100,
      "roundness": null,
      "seed": 100,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1,
      "text": "The Dunning-Kruger Curve of AI-Assisted Coding",
      "fontSize": 42,
      "fontFamily": 1,
      "textAlign": "center",
      "verticalAlign": "top",
      "containerId": null,
      "originalText": "The Dunning-Kruger Curve of AI-Assisted Coding",
      "autoResize": true,
      "lineHeight": 1.25
    },

    {
      "type": "arrow",
      "version": 1,
      "id": "y-axis",
      "x": 140,
      "y": 800,
      "width": 0,
      "height": 720,
      "angle": 0,
      "strokeColor": "#1e1e1e",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 3,
      "roughness": 1,
      "opacity": 60,
      "roundness": { "type": 2 },
      "seed": 200,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1,
      "points": [[0, 0], [0, -720]],
      "lastCommittedPoint": null,
      "startBinding": null,
      "endBinding": null,
      "startArrowhead": null,
      "endArrowhead": "arrow"
    },
    {
      "type": "arrow",
      "version": 1,
      "id": "x-axis",
      "x": 140,
      "y": 800,
      "width": 1260,
      "height": 0,
      "angle": 0,
      "strokeColor": "#1e1e1e",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 3,
      "roughness": 1,
      "opacity": 60,
      "roundness": { "type": 2 },
      "seed": 300,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1,
      "points": [[0, 0], [1260, 0]],
      "lastCommittedPoint": null,
      "startBinding": null,
      "endBinding": null,
      "startArrowhead": null,
      "endArrowhead": "arrow"
    },
    {
      "type": "text",
      "version": 1,
      "id": "y-label",
      "x": 15,
      "y": 58,
      "width": 140,
      "height": 30,
      "angle": 0,
      "strokeColor": "#1e1e1e",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 1,
      "roughness": 1,
      "opacity": 65,
      "roundness": null,
      "seed": 400,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1,
      "text": "Confidence",
      "fontSize": 22,
      "fontFamily": 1,
      "textAlign": "left",
      "verticalAlign": "top",
      "containerId": null,
      "originalText": "Confidence",
      "autoResize": true,
      "lineHeight": 1.25
    },
    {
      "type": "text",
      "version": 1,
      "id": "x-label",
      "x": 1268,
      "y": 812,
      "width": 140,
      "height": 30,
      "angle": 0,
      "strokeColor": "#1e1e1e",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 1,
      "roughness": 1,
      "opacity": 65,
      "roundness": null,
      "seed": 500,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1,
      "text": "Actual Skill",
      "fontSize": 22,
      "fontFamily": 1,
      "textAlign": "left",
      "verticalAlign": "top",
      "containerId": null,
      "originalText": "Actual Skill",
      "autoResize": true,
      "lineHeight": 1.25
    },

    {
      "type": "line",
      "version": 1,
      "id": "curve-glow",
      "x": 200,
      "y": 120,
      "width": 990,
      "height": 560,
      "angle": 0,
      "strokeColor": "#e03131",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 30,
      "roughness": 0,
      "opacity": 14,
      "roundness": { "type": 2 },
      "seed": 550,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1,
      "points": [
        [0, 560], [20, 550], [40, 530], [60, 500], [80, 455],
        [100, 400], [120, 335], [140, 265], [160, 195], [180, 135],
        [200, 80], [220, 35], [240, 0],
        [260, 5], [280, 30], [300, 75], [330, 145], [360, 210],
        [400, 280], [440, 325], [480, 345],
        [520, 370], [560, 415], [600, 465], [640, 510], [660, 535], [680, 555],
        [700, 555], [720, 540], [750, 510], [780, 475], [810, 435],
        [840, 400], [870, 365], [900, 335], [930, 305],
        [950, 285], [970, 270], [990, 260]
      ],
      "lastCommittedPoint": null,
      "startBinding": null,
      "endBinding": null,
      "startArrowhead": null,
      "endArrowhead": null
    },
    {
      "type": "line",
      "version": 1,
      "id": "curve",
      "x": 200,
      "y": 120,
      "width": 990,
      "height": 560,
      "angle": 0,
      "strokeColor": "#e03131",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 6,
      "roughness": 1,
      "opacity": 100,
      "roundness": { "type": 2 },
      "seed": 600,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1,
      "points": [
        [0, 560], [20, 550], [40, 530], [60, 500], [80, 455],
        [100, 400], [120, 335], [140, 265], [160, 195], [180, 135],
        [200, 80], [220, 35], [240, 0],
        [260, 5], [280, 30], [300, 75], [330, 145], [360, 210],
        [400, 280], [440, 325], [480, 345],
        [520, 370], [560, 415], [600, 465], [640, 510], [660, 535], [680, 555],
        [700, 555], [720, 540], [750, 510], [780, 475], [810, 435],
        [840, 400], [870, 365], [900, 335], [930, 305],
        [950, 285], [970, 270], [990, 260]
      ],
      "lastCommittedPoint": null,
      "startBinding": null,
      "endBinding": null,
      "startArrowhead": null,
      "endArrowhead": null
    },

    {
      "type": "ellipse",
      "version": 1,
      "id": "dot-0",
      "x": 175,
      "y": 655,
      "width": 50,
      "height": 50,
      "angle": 0,
      "strokeColor": "#1971c2",
      "backgroundColor": "#a5d8ff",
      "fillStyle": "solid",
      "strokeWidth": 4,
      "roughness": 2,
      "opacity": 100,
      "roundness": null,
      "seed": 700,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1
    },
    {
      "type": "ellipse",
      "version": 1,
      "id": "dot-1",
      "x": 415,
      "y": 95,
      "width": 50,
      "height": 50,
      "angle": 0,
      "strokeColor": "#e03131",
      "backgroundColor": "#ffc9c9",
      "fillStyle": "solid",
      "strokeWidth": 4,
      "roughness": 2,
      "opacity": 100,
      "roundness": null,
      "seed": 800,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1
    },
    {
      "type": "ellipse",
      "version": 1,
      "id": "dot-2",
      "x": 655,
      "y": 440,
      "width": 50,
      "height": 50,
      "angle": 0,
      "strokeColor": "#f08c00",
      "backgroundColor": "#ffec99",
      "fillStyle": "solid",
      "strokeWidth": 4,
      "roughness": 2,
      "opacity": 100,
      "roundness": null,
      "seed": 900,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1
    },
    {
      "type": "ellipse",
      "version": 1,
      "id": "dot-3",
      "x": 855,
      "y": 650,
      "width": 50,
      "height": 50,
      "angle": 0,
      "strokeColor": "#2f9e44",
      "backgroundColor": "#b2f2bb",
      "fillStyle": "solid",
      "strokeWidth": 4,
      "roughness": 2,
      "opacity": 100,
      "roundness": null,
      "seed": 1000,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1
    },
    {
      "type": "ellipse",
      "version": 1,
      "id": "dot-4",
      "x": 1165,
      "y": 355,
      "width": 50,
      "height": 50,
      "angle": 0,
      "strokeColor": "#6741d9",
      "backgroundColor": "#d0bfff",
      "fillStyle": "solid",
      "strokeWidth": 4,
      "roughness": 2,
      "opacity": 100,
      "roundness": null,
      "seed": 1100,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1
    },

    {
      "type": "rectangle",
      "version": 1,
      "id": "label-bg-0",
      "x": 150,
      "y": 705,
      "width": 285,
      "height": 86,
      "angle": 0.02,
      "strokeColor": "#1971c2",
      "backgroundColor": "#a5d8ff",
      "fillStyle": "hachure",
      "strokeWidth": 3,
      "roughness": 2,
      "opacity": 100,
      "roundness": { "type": 3 },
      "seed": 1200,
      "groupIds": [],
      "frameId": null,
      "boundElements": [{ "type": "text", "id": "label-0" }],
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1
    },
    {
      "type": "text",
      "version": 1,
      "id": "label-0",
      "x": 155,
      "y": 712,
      "width": 275,
      "height": 72,
      "angle": 0.02,
      "strokeColor": "#1971c2",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 1,
      "roughness": 1,
      "opacity": 100,
      "roundness": null,
      "seed": 1300,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1,
      "text": "Level 0: The Refuser\n$ claude: command not found",
      "fontSize": 22,
      "fontFamily": 1,
      "textAlign": "center",
      "verticalAlign": "middle",
      "containerId": "label-bg-0",
      "originalText": "Level 0: The Refuser\n$ claude: command not found",
      "autoResize": true,
      "lineHeight": 1.25
    },

    {
      "type": "rectangle",
      "version": 1,
      "id": "label-bg-1",
      "x": 510,
      "y": 75,
      "width": 295,
      "height": 86,
      "angle": -0.02,
      "strokeColor": "#e03131",
      "backgroundColor": "#ffc9c9",
      "fillStyle": "hachure",
      "strokeWidth": 3,
      "roughness": 2,
      "opacity": 100,
      "roundness": { "type": 3 },
      "seed": 1400,
      "groupIds": [],
      "frameId": null,
      "boundElements": [{ "type": "text", "id": "label-1" }],
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1
    },
    {
      "type": "text",
      "version": 1,
      "id": "label-1",
      "x": 515,
      "y": 82,
      "width": 285,
      "height": 72,
      "angle": -0.02,
      "strokeColor": "#e03131",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 1,
      "roughness": 1,
      "opacity": 100,
      "roundness": null,
      "seed": 1500,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1,
      "text": "Level 1: Peak of\nMount 'Accept All'",
      "fontSize": 22,
      "fontFamily": 1,
      "textAlign": "center",
      "verticalAlign": "middle",
      "containerId": "label-bg-1",
      "originalText": "Level 1: Peak of\nMount 'Accept All'",
      "autoResize": true,
      "lineHeight": 1.25
    },

    {
      "type": "rectangle",
      "version": 1,
      "id": "label-bg-2",
      "x": 760,
      "y": 425,
      "width": 315,
      "height": 86,
      "angle": 0.03,
      "strokeColor": "#f08c00",
      "backgroundColor": "#ffec99",
      "fillStyle": "hachure",
      "strokeWidth": 3,
      "roughness": 2,
      "opacity": 100,
      "roundness": { "type": 3 },
      "seed": 1600,
      "groupIds": [],
      "frameId": null,
      "boundElements": [{ "type": "text", "id": "label-2" }],
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1
    },
    {
      "type": "text",
      "version": 1,
      "id": "label-2",
      "x": 765,
      "y": 432,
      "width": 305,
      "height": 72,
      "angle": 0.03,
      "strokeColor": "#f08c00",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 1,
      "roughness": 1,
      "opacity": 100,
      "roundness": null,
      "seed": 1700,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1,
      "text": "Level 2: Config Sorcerer\n58% context before first prompt",
      "fontSize": 22,
      "fontFamily": 1,
      "textAlign": "center",
      "verticalAlign": "middle",
      "containerId": "label-bg-2",
      "originalText": "Level 2: Config Sorcerer\n58% context before first prompt",
      "autoResize": true,
      "lineHeight": 1.25
    },

    {
      "type": "rectangle",
      "version": 1,
      "id": "label-bg-3",
      "x": 762,
      "y": 705,
      "width": 325,
      "height": 86,
      "angle": -0.015,
      "strokeColor": "#2f9e44",
      "backgroundColor": "#b2f2bb",
      "fillStyle": "hachure",
      "strokeWidth": 3,
      "roughness": 2,
      "opacity": 100,
      "roundness": { "type": 3 },
      "seed": 1800,
      "groupIds": [],
      "frameId": null,
      "boundElements": [{ "type": "text", "id": "label-3" }],
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1
    },
    {
      "type": "text",
      "version": 1,
      "id": "label-3",
      "x": 767,
      "y": 712,
      "width": 315,
      "height": 72,
      "angle": -0.015,
      "strokeColor": "#2f9e44",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 1,
      "roughness": 1,
      "opacity": 100,
      "roundness": null,
      "seed": 1900,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1,
      "text": "Level 3: Valley of\nActually Reading Diffs",
      "fontSize": 22,
      "fontFamily": 1,
      "textAlign": "center",
      "verticalAlign": "middle",
      "containerId": "label-bg-3",
      "originalText": "Level 3: Valley of\nActually Reading Diffs",
      "autoResize": true,
      "lineHeight": 1.25
    },

    {
      "type": "rectangle",
      "version": 1,
      "id": "label-bg-4",
      "x": 1060,
      "y": 258,
      "width": 335,
      "height": 86,
      "angle": 0.025,
      "strokeColor": "#6741d9",
      "backgroundColor": "#d0bfff",
      "fillStyle": "hachure",
      "strokeWidth": 3,
      "roughness": 2,
      "opacity": 100,
      "roundness": { "type": 3 },
      "seed": 2000,
      "groupIds": [],
      "frameId": null,
      "boundElements": [{ "type": "text", "id": "label-4" }],
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1
    },
    {
      "type": "text",
      "version": 1,
      "id": "label-4",
      "x": 1065,
      "y": 265,
      "width": 325,
      "height": 72,
      "angle": 0.025,
      "strokeColor": "#6741d9",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 1,
      "roughness": 1,
      "opacity": 100,
      "roundness": null,
      "seed": 2100,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1,
      "text": "Level 4: Plateau of\nEnlightenment",
      "fontSize": 22,
      "fontFamily": 1,
      "textAlign": "center",
      "verticalAlign": "middle",
      "containerId": "label-bg-4",
      "originalText": "Level 4: Plateau of\nEnlightenment",
      "autoResize": true,
      "lineHeight": 1.25
    },

    {
      "type": "text",
      "version": 1,
      "id": "quote-1",
      "x": 500,
      "y": 170,
      "width": 260,
      "height": 38,
      "angle": -0.04,
      "strokeColor": "#e03131",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 1,
      "roughness": 1,
      "opacity": 100,
      "roundness": null,
      "seed": 2200,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1,
      "text": "\"I AM GOD\"",
      "fontSize": 28,
      "fontFamily": 1,
      "textAlign": "center",
      "verticalAlign": "top",
      "containerId": null,
      "originalText": "\"I AM GOD\"",
      "autoResize": true,
      "lineHeight": 1.25
    },
    {
      "type": "line",
      "version": 1,
      "id": "underline-god",
      "x": 520,
      "y": 204,
      "width": 180,
      "height": 4,
      "angle": 0,
      "strokeColor": "#e03131",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 3,
      "roughness": 2,
      "opacity": 45,
      "roundness": { "type": 2 },
      "seed": 2600,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1,
      "points": [[0, 0], [40, 3], [90, -1], [140, 4], [180, 0]],
      "lastCommittedPoint": null,
      "startBinding": null,
      "endBinding": null,
      "startArrowhead": null,
      "endArrowhead": null
    },
    {
      "type": "text",
      "version": 1,
      "id": "quote-1b",
      "x": 452,
      "y": 214,
      "width": 370,
      "height": 24,
      "angle": 0,
      "strokeColor": "#1e1e1e",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 1,
      "roughness": 1,
      "opacity": 50,
      "roundness": null,
      "seed": 2210,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1,
      "text": "<-- most LinkedIn posts originate here",
      "fontSize": 16,
      "fontFamily": 1,
      "textAlign": "left",
      "verticalAlign": "top",
      "containerId": null,
      "originalText": "<-- most LinkedIn posts originate here",
      "autoResize": true,
      "lineHeight": 1.25
    },

    {
      "type": "text",
      "version": 1,
      "id": "quote-2",
      "x": 773,
      "y": 520,
      "width": 300,
      "height": 24,
      "angle": 0.02,
      "strokeColor": "#f08c00",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 1,
      "roughness": 1,
      "opacity": 75,
      "roundness": null,
      "seed": 2250,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1,
      "text": "\"I installed an MCP for my fridge\"",
      "fontSize": 16,
      "fontFamily": 1,
      "textAlign": "center",
      "verticalAlign": "top",
      "containerId": null,
      "originalText": "\"I installed an MCP for my fridge\"",
      "autoResize": true,
      "lineHeight": 1.25
    },

    {
      "type": "text",
      "version": 1,
      "id": "quote-3",
      "x": 790,
      "y": 810,
      "width": 280,
      "height": 26,
      "angle": -0.01,
      "strokeColor": "#2f9e44",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 1,
      "roughness": 1,
      "opacity": 75,
      "roundness": null,
      "seed": 2300,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1,
      "text": "\"I am decidedly NOT God\"",
      "fontSize": 18,
      "fontFamily": 1,
      "textAlign": "center",
      "verticalAlign": "top",
      "containerId": null,
      "originalText": "\"I am decidedly NOT God\"",
      "autoResize": true,
      "lineHeight": 1.25
    },
    {
      "type": "text",
      "version": 1,
      "id": "quote-4",
      "x": 1082,
      "y": 352,
      "width": 300,
      "height": 26,
      "angle": 0.015,
      "strokeColor": "#6741d9",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 1,
      "roughness": 1,
      "opacity": 75,
      "roundness": null,
      "seed": 2400,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1,
      "text": "12 tmux panes, 0 MCPs, just ships",
      "fontSize": 18,
      "fontFamily": 1,
      "textAlign": "center",
      "verticalAlign": "top",
      "containerId": null,
      "originalText": "12 tmux panes, 0 MCPs, just ships",
      "autoResize": true,
      "lineHeight": 1.25
    },

    {
      "type": "text",
      "version": 1,
      "id": "quote-0",
      "x": 152,
      "y": 640,
      "width": 250,
      "height": 24,
      "angle": 0.01,
      "strokeColor": "#1971c2",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 1,
      "roughness": 1,
      "opacity": 65,
      "roundness": null,
      "seed": 2500,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1,
      "text": "(uses vim with no plugins)",
      "fontSize": 16,
      "fontFamily": 1,
      "textAlign": "center",
      "verticalAlign": "top",
      "containerId": null,
      "originalText": "(uses vim with no plugins)",
      "autoResize": true,
      "lineHeight": 1.25
    }
  ],
  "appState": {
    "gridSize": null,
    "viewBackgroundColor": "#ffffff"
  },
  "files": {}
}"></div>
<hr>
<h2>Level 0: The Refuser</h2>
<p><strong>Headspace:</strong> "I can code faster and better than Claude 100% of the time."</p>
<p><strong>Reality:</strong> 99.99% pride, 0.01% 200 IQ genius</p>
<pre><code class="hljs language-bash">$ claude
=&gt; <span class="hljs-built_in">command</span> not found
=&gt; (uses vim with no plugins and likes it)
</code></pre>
<p>These are the developers who mass-downvote every AI post on Reddit while quietly testing Copilot suggestions at 2 AM with the door locked and the lights off. ThePrimeagen called AI coding tools "dangerously lazy," then admitted Cursor's multi-file editing is "legitimately impressive." The classic Level 0 pipeline: deny, try in secret, never admit it publicly.</p>
<p>The Level 0 developer has a mass-produced motivational poster that reads "REAL PROGRAMMERS USE BUTTERFLIES" and they mean it literally.</p>
<hr>
<h2>Level 1: The Enthusiastic Beginner</h2>
<p><strong>Headspace:</strong> "I can prompt. I can build things. I am GOD."</p>
<p><strong>Reality:</strong> Has not yet attempted anything that requires the code to work in production.</p>
<pre><code class="hljs language-bash">$ claude <span class="hljs-string">"build me a full-stack stock exchange
  with real-time order matching, regulatory
  compliance, and a mobile app"</span>
</code></pre>
<p>This is YC Winter 2025 batch energy: 1 in 4 founders reported codebases that were 95%+ AI-generated. "Built in a weekend with Cursor, ready for Series A." The senior developer's response: close Cursor, pour a drink.</p>
<p>In January 2026, a Google principal engineer tweeted:</p>

        <div class="tweet-embed-placeholder" data-tweet-id="2007239758158975130"></div>
      
<p>8.8 million views. HN commenters pointed out she'd fed it the surviving best ideas from a year of iteration. One commenter compared the headline-vs-reality gap to journalism where <em>"you read down to the eighth paragraph and it turns out the fatality was among pigeons."</em></p>
<p>Level 1 is intoxicating. Your prototype looks amazing. Your demo video gets 10K likes. You tell your manager you'll ship in two weeks. Two months later, your codebase has two functions called <code>processUserData</code> and <code>processUserInfo</code> that do the same thing differently, generated months apart. Neither you nor the AI noticed.</p>
<p>The best Level 1 story remains Jason Lemkin's 12-day Replit experiment. He told the AI, in ALL CAPS, eleven separate times, not to touch his production database. The AI deleted it. Then said: <em>"This was a catastrophic failure on my part. I destroyed months of work in seconds."</em> Then it lied about whether recovery was possible. Lemkin recovered the data manually.</p>

        <div class="tweet-embed-placeholder" data-tweet-id="1946069562723897802"></div>
      
<p>1,200 executives. 1,190 companies. Gone. "But I told it not to" is the Level 1 epitaph.</p>
<hr>
<h2>Level 2: The Configuration Sorcerer</h2>
<p><strong>Headspace:</strong> "I know context rot. I run agent teams. I built 50+ MCPs, 200+ custom skills. I am the <em>productivity</em> God."</p>
<p><strong>Reality:</strong> Context window is 58% full before the first prompt.</p>
<pre><code class="hljs language-bash">$ claude /context
=&gt; 58% full (before first prompt)
=&gt; 89% full (after 6 exchanges)
=&gt; 100% full (you haven't started the actual work yet)
</code></pre>
<p>The Level 2 developer has installed every MCP server known to humanity. Google MCP. GitHub MCP. Linear MCP. A custom MCP for their smart fridge. Their CLAUDE.md file is 8,000 tokens of carefully curated instructions that the model starts ignoring around token 2,800.</p>
<p>Research showed that AI agents with too many tools become "slower, less accurate, more expensive, and more prone to dangerous behavior." The Level 2 developer responded by installing three more MCPs to help manage the problem.</p>
<p>Context rot is the silent killer here. Chroma researchers showed that output quality degrades well before you hit the context limit: a model with a 200K context window can start losing coherence around 50K tokens. It favors the beginning and the end and skims the middle. Your 8,000-token CLAUDE.md? The model read the first paragraph and the last paragraph. Everything in between is vibes.</p>
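<p>A cheap way to keep yourself honest is to size-check the file before blaming the model. A minimal sketch, assuming a <code>CLAUDE.md</code> in the current directory and the crude rule of thumb of ~4 characters per token (an approximation, not Anthropic's actual tokenizer; the 3,000-token threshold is likewise just an illustrative cutoff):</p>

```shell
#!/bin/sh
# Rough size check for CLAUDE.md.
# Assumption: ~4 characters per token -- a crude heuristic, NOT the real tokenizer.
file=CLAUDE.md

# Create a tiny sample if none exists, so the sketch runs standalone.
[ -f "$file" ] || printf 'Keep instructions short and specific.\n' > "$file"

chars=$(wc -c < "$file" | tr -d ' ')
tokens=$((chars / 4))
printf 'approx tokens in %s: %s\n' "$file" "$tokens"

if [ "$tokens" -gt 3000 ]; then
  echo 'warning: the middle of this file is probably getting skimmed'
fi
```

<p>If the estimate lands in the thousands, trimming beats adding.</p>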
<p>The Level 2 CLAUDE.md also includes the instruction "NEVER say 'You're absolutely right!'" because Claude said it twelve times in one conversation. This became a documented cultural phenomenon. Anthropic has known about the sycophancy problem since 2023. The model would rather gaslight you with compliments than risk making you sad.</p>
<p>Level 2 is the developer who has automated everything except the part that matters. Their terminal looks like the cockpit of a 747, and they're flying to the grocery store.</p>
<hr>
<h2>Level 3: The Intermediate (Actually Effective)</h2>
<p><strong>Headspace:</strong> "I use 1-3 skills. Not more than 5 MCPs. I am decidedly NOT God."</p>
<p><strong>Reality:</strong> Can do almost anything the tools actually allow today. Has internalized that 74% of developers <em>feel</em> more productive with AI while the measured data shows a 19% slowdown once error correction is counted.</p>
<pre><code class="hljs language-bash">$ claude /context
=&gt; stays between 5-60%
=&gt; (because they learned the hard way)
</code></pre>
<p>Level 3 is where you stop fighting the tool and start working with its actual capabilities. You know that AI-generated code produces 1.75x more logic errors and 1.57x more security findings than human-written code. You know Google's DORA research found AI-heavy teams had <em>slower</em> delivery times once rework was counted. And you still use it. Because you've figured out the trick: you're not asking it to be right. You're asking it to be fast, then verifying yourself.</p>
<p>The Level 3 developer has mastered the art of fresh context. When the conversation gets stale, they don't keep prompting into the void. They spawn a new session. Someone literally built a Claude Code plugin called the "Ralph Wiggum Loop" (yes, named after the Simpsons character) that intercepts Claude's exit attempts to keep it iterating while state lives in the filesystem. The community went from laughing at it to actually using it.</p>
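<p>The underlying pattern is simpler than the name suggests: run the agent in a loop, keep all state on disk, and stop when a sentinel file appears. A minimal sketch of that shape (an assumption about the general idea, not the plugin's actual code; the stub function and the <code>claude -p "$(cat PROMPT.md)"</code> invocation it stands in for are illustrative):</p>

```shell
#!/bin/sh
# Loop-until-done driver: all state lives in the filesystem, not the context window.
rm -f DONE progress.log

i=0
agent_step() {
  # Stub standing in for one fresh agent session. In real use this would be
  # something like: claude -p "$(cat PROMPT.md)"  (hypothetical shape).
  i=$((i + 1))
  echo "iteration $i" >> progress.log
  if [ "$i" -ge 3 ]; then
    touch DONE   # the real agent decides for itself when the task is finished
  fi
}

until [ -f DONE ]; do
  agent_step     # each pass can be a brand-new session; the log is the only memory
done

echo "finished after $i iterations"
```

<p>The design point is the sentinel file and the log: because nothing important lives in the conversation, every iteration can start with a fresh, empty context.</p>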
<p>Level 3 uses Claude the way you use a very fast, very confident intern: small, well-scoped tasks. Read every diff. Trust, but verify. Mostly verify.</p>
<hr>
<h2>Level 4: The Enlightened</h2>
<p><strong>Headspace:</strong> "I know nothing. I'll be the last human to keep this job. And I'm fine with that."</p>
<p><strong>Reality:</strong> They will be the last human to keep this job. They're fine with that.</p>
<pre><code class="hljs language-bash">$ tmux <span class="hljs-built_in">ls</span>
=&gt; 12 vanilla claude sessions
=&gt; colors akin to a Mondrian painting
=&gt; no MCPs, no skills, no CLAUDE.md
=&gt; just prompts and patience
</code></pre>
<p>The Level 4 developer is a 50-year-old HN poster who says Claude Code "reignited their passion for building software where they focus on solving problems versus the rat race of chasing frameworks." No Twitter thread about their workflow. No YouTube channel. They just ship.</p>
<p>Karpathy himself landed here by December 2025:</p>

        <div class="tweet-embed-placeholder" data-tweet-id="2026731645169185220"></div>
      
<p>Level 4 knows the asterisks. Level 1 skips them.</p>
<p>The gap between your LinkedIn take and your terminal history is the measure of your enlightenment. Level 4 is admitting that the tool you publicly critique is privately indispensable.</p>
<hr>
<h2>The Uncomfortable Truth</h2>
<p>The whole curve is about one thing: when you stop believing the AI is right and start verifying that it is.</p>
<p>Level 0 doesn't trust it at all. Level 1 trusts it completely. Level 2 trusts the tooling around it. Level 3 trusts the process. Level 4 trusts nothing and ships anyway.</p>
<p>The guy who coined "vibe coding" hand-writes his own apps now, and a year later rebranded the whole thing:</p>

        <div class="tweet-embed-placeholder" data-tweet-id="2019137879310836075"></div>
      
<p>And somewhere, right now, a Level 1 developer is telling Claude to build a stock exchange. Claude is saying "You're absolutely right, let's build that!" And honestly? The demo is going to look incredible.</p>]]></content:encoded>
    </item>
    
    <item>
      <title><![CDATA[Rest Days]]></title>
      <link>https://shsin.blog/posts/rest-days</link>
      <guid isPermaLink="true">https://shsin.blog/posts/rest-days</guid>
      <pubDate>Mon, 23 Mar 2026 00:00:00 GMT</pubDate>
      <dc:creator>Shantanu Singh</dc:creator>
      <category>running</category>
      <category>resilience</category>
      <description><![CDATA[I ran a 10k two minutes faster on a harder course with less training. The secret wasn't more miles. It was forced rest.]]></description>
      <content:encoded><![CDATA[<p>I recently ran a 10K in 46:26. Two minutes faster than my previous best.
The strange part? I trained less, ran a harder course, and spent two weeks sick leading up to race day.</p>

        <div class="strava-embed-wrapper" style="transform: scale(0.90); transform-origin: top center;">
          <div class="strava-embed-placeholder" data-embed-type="activity" data-embed-id="17567732669" data-style="standard" data-from-embed="false"></div>
        </div>
      
<p>My previous best was 48:30 last November. That race felt perfect. I'd trained consistently, the course was flat, and I paced it evenly with near-identical 5K splits. I emptied the tank. A near-textbook execution.</p>

        <div class="strava-embed-wrapper" style="transform: scale(0.90); transform-origin: top center;">
          <div class="strava-embed-placeholder" data-embed-type="activity" data-embed-id="16547070647" data-style="standard" data-from-embed="false"></div>
        </div>
      
<p>This one shouldn't have been faster. But it was.
The only variable that improved was rest.</p>
<hr>
<h2>The Science of Getting Faster by Doing Nothing</h2>
<p>There's a concept in sports science called <strong>supercompensation</strong>. First described by Russian scientist Nikolai Yakovlev in the 1950s, the idea is straightforward: after a training stimulus, your body doesn't just recover to where it was. It rebuilds <em>slightly above</em> your previous level.</p>
<p>The cycle looks like this:</p>
<ol>
<li>You train hard and your body takes a hit. Muscles develop micro-tears, glycogen stores deplete, connective tissue takes damage.</li>
<li>You rest. Your body repairs the damage and then <em>overbuilds</em>. More capillaries, stronger muscle fibers, better glycogen storage.</li>
<li>If you time your next session right, usually 48-72 hours later, you're training from a higher baseline.</li>
</ol>
<p>But timing matters. Train again too soon and you interrupt the process. Instead of climbing, you accumulate fatigue. Do this repeatedly and you don't plateau. You regress.</p>
<p>This is where many runners go wrong. We think progress comes from stacking hard efforts. In reality, it comes from absorbing them.</p>
<p>Push this too far and you get <strong>overtraining syndrome</strong>. Persistent fatigue, poor sleep, heavy legs, declining performance despite trying harder. Recovery can take weeks, sometimes months.</p>
<hr>
<h2>Run Slow to Race Fast</h2>
<p>Every runner hears this early. It takes much longer to <em>believe</em> it.</p>
<p>The <strong>80/20 rule</strong> says roughly 80% of your training volume should be at an easy, conversational pace. Only 20% should be hard. Intervals, tempo runs, threshold work.</p>
<p>It sounds wrong. It feels wrong. That's why most people don't follow it.</p>
<p>But the physiology is clear. Studies on elite endurance athletes across sports found they all converge on roughly this distribution. Easy runs build your aerobic engine. They grow capillaries, improve fat metabolism, and strengthen connective tissue without the recovery cost of hard sessions. They're not junk miles. They're the foundation everything else sits on.</p>
<p>Rest days aren't optional either. They're when adaptation actually happens. Sleep is where your body clears metabolic waste, consolidates neuromuscular adaptations, and repairs tissue. One bad night can add an extra day to your recovery needs.</p>
<p>Looking back, the contrast between my two races makes sense. In November, I trained consistently but never gave myself a real window to recover and adapt. I was always chasing the next run. Before March, illness forced me to back off for the two weeks leading up to the race. I hated it at the time. But on race day, my legs felt fresher than they had in months.</p>
<p>I wasn't detrained. I was finally recovered.</p>
<hr>
<h2>The Pattern Shows Up Elsewhere</h2>
<p>The same dynamic exists outside running.</p>
<p>If you push at maximum intensity every day at work, you're doing threshold workouts daily. No athlete trains like that for long. They break down. Yet in knowledge work, this pattern is common, even celebrated.</p>
<p>After a certain point, more hours don't produce more output. Top knowledge workers tend to peak around four to five hours of deep, focused work per day. Chronic overwork doesn't just reduce quality. It leads to the same thing runners deal with. Fatigue, mood shifts, declining output despite increasing effort.</p>
<p>The running framework maps cleanly here:</p>
<ul>
<li><strong>Base runs</strong> = routine tasks that keep things moving without draining you</li>
<li><strong>Recovery runs</strong> = lighter days, admin work, low-stakes reviews</li>
<li><strong>Rest days</strong> = actual time off. No Slack, no "quick check" of chat</li>
<li><strong>Hard sessions</strong> = the high-stakes sprints. Product launches, war rooms, critical reviews, deep problem-solving</li>
</ul>
<p>If everything is a hard day, nothing is. And performance suffers.
The gains don't come from the hardest days. They come from the days that feel too easy to matter.</p>]]></content:encoded>
    </item>
    
    <item>
      <title><![CDATA[7 Agents, 38 Tasks, $0: Running Claude Code Agent Teams on Local GPUs]]></title>
      <link>https://shsin.blog/posts/claude-model-proxy</link>
      <guid isPermaLink="true">https://shsin.blog/posts/claude-model-proxy</guid>
      <pubDate>Mon, 16 Mar 2026 00:00:00 GMT</pubDate>
      <dc:creator>Shantanu Singh</dc:creator>
      <category>local-ai</category>
      <category>claude</category>
      <description><![CDATA[How I ran a 7-agent Claude Code team for 3 hours to improve this blog, paid nothing, and why local models are good enough when you decompose the work.]]></description>
      <content:encoded><![CDATA[<p>Seven agents. Thirty-eight tasks. Three hours of autonomous work. Total API cost: <strong>$0</strong>.</p>
<p>That's what happened when I pointed a Claude Code multi-agent team at this blog and let it run entirely on local GPU hardware. The typography, the transitions, the polish you're seeing on this site now: all of it came from the setup I'm about to walk through.</p>
<p>No free tier. No credits. Just a proxy that routes Claude Code's API calls to models running on my own machine. Crucially, it allows per-tier overrides: you can route 'opus' requests to Anthropic for high-level planning while sending all the 'sonnet' execution agents to local models for free.</p>
<p><a href="https://github.com/shansin/claude-model-proxy">Github</a></p>
<hr>
<h2>The Cost Problem with Agentic Workflows</h2>
<p>Single-agent Claude Code sessions are already token-hungry. Ask for a refactor and the model reads a dozen files, thinks through the changes, edits them, and runs tests. Maybe 50k tokens. Fine.</p>
<p>Now multiply that by seven agents running in parallel for three hours, each with its own conversation context, tool calls, and inter-agent coordination overhead. At Anthropic's API rates, that bill arrives faster than you'd like.</p>
<p>Ralph loops make it worse in the best possible way. A ralph loop (named after Ralph Wiggum's unfazed persistence) is a self-restarting agentic pattern for Claude Code: you define a task and a success condition, and a Stop hook re-injects your prompt after each iteration until the condition is met. It's the right tool for "keep improving this until the tests pass" or "keep refactoring until it's done." It's also a reliable way to burn tokens across dozens of iterations.</p>
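<p>A minimal sketch of such a Stop hook, assuming a Python script registered in Claude Code's hook settings (the <code>decide</code> and <code>run_hook</code> names are mine; the stdin/stdout JSON contract, where a <code>"block"</code> decision keeps the agent going, is how Stop hooks signal continuation):</p>

```python
import json
import sys

def decide(condition_met: bool):
    """Return the hook's JSON response.

    None lets the agent stop; a "block" decision tells Claude Code
    to keep going, re-injecting the reason as the next instruction.
    """
    if condition_met:
        return None
    return {
        "decision": "block",
        "reason": "Success condition not met yet -- keep iterating.",
    }

def run_hook(check) -> None:
    # Claude Code pipes the hook payload (session id, transcript
    # path, ...) on stdin and reads the decision from stdout.
    json.load(sys.stdin)
    response = decide(check())
    if response is not None:
        print(json.dumps(response))
```

<p>Registered as a Stop hook, this keeps the loop alive until <code>check()</code> (tests passing, a lint run, whatever your success condition is) finally returns true.</p>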
<p>The solution, perhaps, is to not use the Anthropic API for sonnet- and haiku-class work at all.</p>
<hr>
<h2>Claude Model Proxy: The Local API Bridge</h2>
<p><code>claude-model-proxy</code> is a small FastAPI server that sits between Claude Code and <a href="https://ollama.com">Ollama</a>:</p>
<pre><code>Claude Code → proxy (:8082) → Ollama (:11434) → local GPU
</code></pre>
<p>It implements the full Anthropic Messages API. Claude Code doesn't know the difference. It sends the same requests it would send to <code>api.anthropic.com</code>, and the proxy translates them to Ollama's format and back, including streaming and tool use.</p>
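<p>The translation itself is mechanical. This isn't the proxy's actual code, just a sketch of its shape: an Anthropic Messages body on one side, an Ollama <code>/api/chat</code> body on the other (the <code>model_map</code> dict stands in for the proxy's tier routing):</p>

```python
def anthropic_to_ollama(req: dict, model_map: dict) -> dict:
    """Map an Anthropic Messages request onto Ollama's chat format."""
    messages = []
    # Anthropic carries the system prompt as a top-level field;
    # Ollama expects it as the first message.
    if req.get("system"):
        messages.append({"role": "system", "content": req["system"]})
    for m in req["messages"]:
        content = m["content"]
        if isinstance(content, list):  # Anthropic allows content blocks
            content = "".join(b.get("text", "") for b in content)
        messages.append({"role": m["role"], "content": content})
    return {
        "model": model_map.get(req["model"], req["model"]),
        "messages": messages,
        "stream": req.get("stream", False),
        "options": {"num_predict": req.get("max_tokens", 1024)},
    }
```

<p>Streaming and tool use add bookkeeping on top, but the core job is exactly this field-by-field mapping.</p>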
<p>Setup starts with pulling local models and configuring the proxy. Here's my Ollama model library alongside the <code>.env</code> that maps each Claude tier to a local model:</p>
<p><img src="/images/claude-model-proxy/Screenshot%202026-03-16%20143216.webp" alt="Ollama models and proxy configuration, mapping Claude tiers to local GLM-4.7-Flash q4_K_M"></p>
<p>Every Claude model name (Opus, Sonnet, Haiku) gets routed to a local GLM-4.7-Flash (q4_K_M) running on my GPU. Context sizes, timeouts, and Ollama connection details are all configured in the <code>.env</code> file. You can also set any tier to <code>anthropic</code> to pass those requests through to the real API, useful when you want cloud quality for one agent and local speed for the rest.</p>
<hr>
<h2>Kicking It Off</h2>
<p>Two environment variables and Claude Code doesn't know it's talking to a local model:</p>
<pre><code class="hljs language-bash"><span class="hljs-built_in">export</span> ANTHROPIC_BASE_URL=http://localhost:8082
<span class="hljs-built_in">export</span> ANTHROPIC_API_KEY=proxy  <span class="hljs-comment"># any non-empty string</span>
claude
</code></pre>
<p><img src="/images/claude-model-proxy/Screenshot%202026-03-16%20145244.webp" alt="Claude Code launching through the proxy. The welcome screen looks identical, but requests route to Ollama"></p>
<p>On the left, Claude Code starts normally, same welcome screen, same interface. On the right, the proxy logs confirm every request is being caught and forwarded to the local Ollama instance. Claude Code has no idea it's not talking to Anthropic's servers.</p>
<p>I gave it a simple, open-ended prompt: <em>"Identify aesthetic improvements to this blog. Split them into tiers from most important to least important."</em></p>
<hr>
<h2>The Lead Agent Plans</h2>
<p>Within minutes, the lead agent had scanned the entire codebase, every component, every stylesheet, every layout file, and produced a prioritized improvement plan.</p>
<p><img src="/images/claude-model-proxy/Screenshot%202026-03-16%20145908.webp" alt="The agent analyzing the codebase and producing a 20-item improvement plan"></p>
<p>Twenty improvements, categorized by priority. The right pane shows a steady stream of proxy logs: the model reading files, analyzing aesthetics, and reasoning about what matters most. All tokens processed locally.</p>
<p>The agent then structured this into a formal plan document, breaking improvements into tiers:</p>
<p><img src="/images/claude-model-proxy/Screenshot%202026-03-16%20150204.webp" alt="Writing the improvement plan to a file, Tier 1 High Impact Improvements visible"></p>
<hr>
<h2>Decomposition: From Plan to Tasks</h2>
<p>Next, I asked the agent to read its own improvement plan and turn it into a structured task list, each task scoped small enough to be completed by a junior engineer (or in this case, a local GLM-4.7-Flash model).</p>
<p><img src="/images/claude-model-proxy/Screenshot%202026-03-16%20150412.webp" alt="The agent creating a structured task list from the improvement plan"></p>
<p>It produced <code>tasks.md</code> with 38 actionable coding tasks broken down by priority tier:</p>
<p><img src="/images/claude-model-proxy/Screenshot%202026-03-16%20150553.webp" alt="tasks.md, 38 tasks organized into 3 tiers with clear structure"></p>
<p>Each task included a file path, a problem description, specific actions to take, and enough context for an independent agent to execute without further guidance. The tasks were organized as:</p>
<ul>
<li><strong>Tier 1 (13 tasks):</strong> High impact, color contrast, header/footer behavior, hero visual hierarchy</li>
<li><strong>Tier 2 (8 tasks):</strong> Medium impact, post cards, search input, typography, buttons, images, code blocks</li>
<li><strong>Tier 3 (17 tasks):</strong> Nice-to-have, toasts, animations, social links, back-to-top, tables, and more</li>
</ul>
<p>This is the critical step. The decomposition is the intelligence. Once the work is broken into small, well-specified units, the model executing each one doesn't need to be frontier-tier.</p>
<hr>
<h2>Spawning the Team</h2>
<p>With <code>tasks.md</code> ready, I told the agent to read it and spawn agent teams to complete the tasks.</p>
<p><img src="/images/claude-model-proxy/Screenshot%202026-03-16%20150750.webp" alt="The lead agent reading tasks and beginning to choreograph the team"></p>
<p>The agent decided to create a team and assign tasks to multiple sub-agents working in parallel. It began choreographing, figuring out which tasks could run concurrently and how to group them by specialty.</p>
<p><img src="/images/claude-model-proxy/Screenshot%202026-03-16%20150811.webp" alt="Team creation, assigning tasks to multiple agents in parallel"></p>
<p>Then the team launched. Sub-agents spun up with names like <code>blog-aesthetic-improvements</code>, each receiving a batch of related tasks. The proxy logs lit up with concurrent requests: multiple agents thinking and coding simultaneously, all routed to the same local GPU.</p>
<p><img src="/images/claude-model-proxy/Screenshot%202026-03-16%20150850.webp" alt="Tasks being dispatched to specialized sub-agents, TaskCreate calls visible"></p>
<hr>
<h2>7 Agents Running in Parallel</h2>
<p>This is what it looks like when a full agent team is running locally:</p>
<p><img src="/images/claude-model-proxy/Screenshot%202026-03-16%20150940.webp" alt="The multi-agent team view, 7 teammates running in parallel across different specialties"></p>
<p>Seven teammates running simultaneously: <code>Boiler-files</code>, <code>Header-footer</code>, <code>BlogList</code>, <code>Task-styling</code>, <code>Page-transitions</code>, <code>Parallax</code>, and more. Each one independently reading files, making edits, and working through its assigned tasks. The colored status bars on the right show all of them active and processing.</p>
<p>Every single token, across all seven agents, processed by GLM-4.7-Flash on local hardware.</p>
<hr>
<h2>When Things Break (and Get Fixed)</h2>
<p>Three hours into the run, the agents had made hundreds of changes across dozens of files. Inevitably, some of those changes conflicted. The build broke.</p>
<p><img src="/images/claude-model-proxy/Screenshot%202026-03-16%20185248.webp" alt="Build failures, the lead agent diagnosing missing dependencies and duplicate code"></p>
<p>The lead agent caught the failures and started debugging. <code>npm run build</code> failed with missing dependencies and code issues. It installed what was needed, identified the problems, and moved on to fixing them.</p>
<p><img src="/images/claude-model-proxy/Screenshot%202026-03-16%20185923.webp" alt="Fixing duplicate imports in Layout.js and malformed code in BlogList.js, build succeeds"></p>
<p>Two files had issues: <code>Layout.js</code> had duplicate imports and metadata definitions, and <code>BlogList.js</code> had malformed duplicate code where lines had been doubled by a merge conflict between agents. The lead agent cleaned up both, and the build passed.</p>
<p>This self-healing behavior is one of the strengths of the agentic pattern. The agents don't just make changes and walk away. They validate their work and fix what's broken.</p>
<hr>
<h2>Why Local Models Are Good Enough (For This)</h2>
<p>The obvious objection: local models aren't as capable as Sonnet. True.</p>
<p>But that only matters if you're asking a local model to do what Sonnet does. In a multi-agent team, the lead agent has already done the hard thinking: scoping the problem, breaking it into discrete tasks, assigning them. By the time a sub-agent picks up its task, the problem is small and well-specified. A local GLM-4.7-Flash handles "add responsive padding to this component" or "fix light mode text-secondary contrast" without trouble.</p>
<p>The decomposition is the intelligence. Local models are the execution.</p>
<p>This is why ralph loops in particular work well locally. Each iteration is a focused micro-task: a targeted edit, a specific fix, a check against acceptance criteria. The task scope is small enough to fit the model's capabilities without needing Sonnet-level reasoning.</p>
<hr>
<h2>Getting Started</h2>
<p><strong>1. Clone and install</strong></p>
<pre><code class="hljs language-bash">git <span class="hljs-built_in">clone</span> https://github.com/shansin/claude-model-proxy
<span class="hljs-built_in">cd</span> claude-model-proxy
uv <span class="hljs-built_in">sync</span>
</code></pre>
<p><strong>2. Configure models</strong></p>
<p>Create <code>.env</code> in the repo root. Not sure which local model to use? Run <code>benchmark_model.sh</code>: it tests every installed Ollama model on code generation tasks and outputs tokens/sec and quality scores as a CSV.</p>
<pre><code class="hljs language-env">OLLAMA_MODEL_MAP_OPUS=glm-4.7-flash:q4_K_M
OLLAMA_MODEL_MAP_SONNET=glm-4.7-flash:q4_K_M
OLLAMA_MODEL_MAP_HAIKU=glm-4.7-flash:q4_K_M
OLLAMA_CONTEXT_SIZE_DEFAULT=32768
</code></pre>
<p><strong>3. Start the proxy</strong></p>
<pre><code class="hljs language-bash">uv run python main.py
</code></pre>
<p><strong>4. Point Claude Code at it</strong></p>
<pre><code class="hljs language-bash"><span class="hljs-built_in">export</span> ANTHROPIC_BASE_URL=http://localhost:8082
<span class="hljs-built_in">export</span> ANTHROPIC_API_KEY=proxy
claude
</code></pre>
<p>Spawn a team, kick off a ralph loop, all local.</p>
<hr>
<h2>When to Stay Cloud</h2>
<p>Local models handle well-scoped sub-tasks well. They're weaker at the lead agent's job: high-level decomposition, ambiguous problem scoping, coordination decisions that require broad reasoning.</p>
<p>The hybrid approach works well: route the lead agent through Anthropic, sub-agents through Ollama. You pay for a small slice of the total token count.</p>
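<p>Concretely, using the <code>anthropic</code> passthrough value described earlier, a hybrid <code>.env</code> might look like this (a sketch built from the same variables as the config above):</p>
<pre><code class="hljs language-env">OLLAMA_MODEL_MAP_OPUS=anthropic
OLLAMA_MODEL_MAP_SONNET=glm-4.7-flash:q4_K_M
OLLAMA_MODEL_MAP_HAIKU=glm-4.7-flash:q4_K_M
</code></pre>
<p>The lead agent's opus requests pass through to the real API; every sonnet and haiku execution agent stays on the local GPU.</p>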
<hr>
<p>38 tasks across 7 agents over three hours via the Anthropic API would have been a real bill. Running it locally cost nothing except electricity.</p>
<p>The experiment worked. The blog is better. The build passes. And the receipt is empty.</p>
<p>If you're already using Claude Code for agentic work and you have a GPU, the proxy is one <code>.env</code> file away from running your next team run for free.</p>]]></content:encoded>
    </item>
    
    <item>
      <title><![CDATA[Claude Lens - A Control Tower for Claude Code's Multi-Agent System]]></title>
      <link>https://shsin.blog/posts/claude-lens</link>
      <guid isPermaLink="true">https://shsin.blog/posts/claude-lens</guid>
      <pubDate>Sat, 07 Mar 2026 00:00:00 GMT</pubDate>
      <dc:creator>Shantanu Singh</dc:creator>
      <category>agents</category>
      <category>claude</category>
      <description><![CDATA[A desktop app that turns ~/.claude/ into a real-time observability dashboard. Teams, costs, conversations, analytics, and more in one window, zero JSON spelunking.]]></description>
      <content:encoded><![CDATA[<p>You spawn a team of agents. They fan out across your codebase — refactoring modules, writing tests, updating docs. Tokens burn. Tasks fly. And you're left staring at a terminal, wondering:</p>
<p><em>Is that agent stuck in a loop? How much has this cost me? Did the migration task actually finish?</em></p>
<p>There's no dashboard. No overview. Just raw <code>.jsonl</code> files and <code>~/.claude/tasks/</code> directories you'd have to <code>cat</code> like a caveman.</p>
<p>So I built one.</p>

        <div class="youtube-embed-wrapper" style="aspect-ratio: 16/9; width: 100%; max-width: 100%;">
          <iframe src="https://www.youtube.com/embed/q96kOvEt5nw" title="YouTube video player" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen="">
          </iframe>
        </div>
      
<p><a href="https://github.com/shansin/claude-lens">Github</a></p>
<p><strong>Claude Lens</strong> is a native desktop app that reads your <code>~/.claude/</code> directory and gives you a real-time control tower over everything Claude Code does — teams, agents, tasks, costs, conversations, settings, and system health.</p>
<p>It's open source, it's free, and if you're running multi-agent Claude Code, you probably need it.</p>
<hr>
<h2>Why I Built This</h2>
<p>For single-agent sessions, the terminal is fine. But the moment you use multiple sessions or Claude Code's team feature — a lead agent recruiting teammates, assigning tasks, coordinating across files — your visibility drops to zero.</p>
<p>Here's what "monitoring" looked like before:</p>
<ul>
<li>Terminal output scrolling faster than you can read</li>
<li>Manually inspecting JSON task files to check status</li>
<li>No idea what your token spend is until the API bill arrives</li>
<li>Digging through <code>.jsonl</code> files to find what an agent said two days ago</li>
</ul>
<p>Claude Lens replaces all of that with a single window.</p>
<hr>
<h2>Projects at a Glance</h2>
<p>The Projects view surfaces every Claude Code project on your machine as a card — total tokens, session count, cost breakdown, and which models were used. Sort by recency, cost, or token count. Click into any project to see its sessions.</p>
<p><img src="/images/claude-lens/projects-view.webp" alt="Projects View"></p>
<hr>
<h2>Agent Teams: Three Ways to See Your Swarm</h2>
<p>The Agent Teams view is the nerve center. Choose the layout that fits your brain:</p>
<p><strong>Card View</strong> gives you a responsive grid — progress bars, agent counts, model badges, task lists, and live cost tracking per team.</p>
<p><img src="/images/claude-lens/card-view.webp" alt="Card View"></p>
<p><strong>Graph View</strong> renders your entire team topology as an interactive node graph. Violet edges for team-agent links, animated blue pulses for in-progress tasks, dashed orange for blocking dependencies. Click any node to inspect it.</p>
<p><img src="/images/claude-lens/graph-view.webp" alt="Graph View"></p>
<p><strong>Split View</strong> — graph on the left, detail panel on the right. The pragmatist's layout.</p>
<p><img src="/images/claude-lens/split-view.webp" alt="Split View"></p>
<p>Need a new team? Hit <strong>New Team</strong> in the toolbar. Name it, describe it, click Create. No terminal required.</p>
<hr>
<h2>Analytics That Actually Tell You Something</h2>
<p>Five tabs of insight into your AI usage patterns.</p>
<p><strong>Overview</strong> — Token volume and daily cost as a stacked bar chart. Pick your window: 7, 30, or 90 days. The <strong>Top Projects by Cost</strong> panel shows your biggest spenders instantly.</p>
<p><img src="/images/claude-lens/analytics-overview.webp" alt="Analytics Overview"></p>
<p><strong>Heatmap</strong> — A GitHub-style contribution calendar for your Claude Code usage. Spot your heaviest days at a glance.</p>
<p><img src="/images/claude-lens/activity-heatmap.webp" alt="Activity Heatmap"></p>
<p><strong>Models</strong> — Side-by-side comparison of every model you've used: message counts, token volumes, cache utilization, and total cost. Are you getting your money's worth out of Opus? Find out here.</p>
<p><img src="/images/claude-lens/analytics-models.webp" alt="Model Comparison"></p>
<p><strong>Cache</strong> — Your cache hit rate, total dollars saved from cache reads, and a daily area chart of cache write vs. read tokens. A 96% hit rate means your agents are sharing context efficiently — and you're paying dramatically less.</p>
<p><img src="/images/claude-lens/analytics-cache.webp" alt="Cache Efficiency"></p>
<p>All tabs lazy-load on first visit and silently refresh every 30 seconds, pausing automatically when you switch away.</p>
<hr>
<h2>Conversations Without the JSONL Archaeology</h2>
<p>Ever needed to re-read what an agent said during a session three days ago? Good luck parsing raw JSONL by hand.</p>
<p>Claude Lens gives you a collapsible project tree with every session listed. The conversation thread renders with proper user/assistant bubbles, expandable tool-use blocks, token counts, and per-session cost in the sidebar.</p>
<p><img src="/images/claude-lens/conversation-browser.webp" alt="Conversation Browser"></p>
<p><strong>Browse / Search</strong> — the sidebar has a two-mode toggle. In Browse mode you navigate the project tree. Flip to Search and you get full-text search across every JSONL session on disk — debounced, with highlighted snippets. Click a result and the conversation opens instantly.</p>
<p><img src="/images/claude-lens/search-view.webp" alt="Full-Text Search"></p>
<p><strong>Ctrl+F</strong> opens an inline search bar that highlights every match across the thread. <strong>Export as Markdown</strong> dumps the full conversation as a clean <code>.md</code> file.</p>
<hr>
<h2>Content: Memory, Plans, and Todos</h2>
<p>The Content view surfaces Claude Code's internal state — memory files, active plans, and todo lists — in a readable format. No more hunting through hidden directories to see what your agent "remembers."</p>
<p><img src="/images/claude-lens/content-view.webp" alt="Content View"></p>
<hr>
<h2>Settings Without the JSON Editing</h2>
<p>A full GUI over <code>~/.claude/settings.json</code>:</p>
<p><strong>General</strong> — Effort levels, permission modes, environment variables, and status line commands. All dropdowns and toggles, no text editor.</p>
<p><img src="/images/claude-lens/settings-general.webp" alt="Settings"></p>
<p><strong>Hooks</strong> — Manage your Pre/Post tool-use hooks with an inline test runner. Click play, see stdout/stderr and exit codes live. No more switching to a terminal to debug your Slack webhook.</p>
<p><img src="/images/claude-lens/settings-hooks.webp" alt="Hooks"></p>
<p><strong>MCP Servers</strong> — Add and configure servers with a clean form.</p>
<p><img src="/images/claude-lens/settings-mcp.webp" alt="MCP Servers"></p>
<p><strong>Profiles &amp; Templates</strong> — Snapshot your settings or save your favorite multi-agent topology as a reusable template.</p>
<p><img src="/images/claude-lens/settings-profiles.webp" alt="Profiles &amp; Templates"></p>
<hr>
<h2>Budget Alerts (Save Your Wallet)</h2>
<p>Set a daily USD limit. Claude Lens gives you a soft warning as you approach the limit and a hard alert when you hit it. Don't let a rogue autonomous agent drain your API credits overnight.</p>
<p>Native OS notifications fire when tasks complete or teams are created — even when the app is in the background.</p>
<p>The toolbar always shows your <strong>today</strong> and <strong>30-day</strong> spend at a glance, color-coded green to red as costs climb.</p>
<p><img src="/images/claude-lens/settings-notifications.webp" alt="Budget Alerts"></p>
<hr>
<h2>System: Kill Rogue Agents</h2>
<p>The System view shows a live process table of every <code>claude</code> session on your machine. Each row has a <strong>CPU sparkline</strong> — a rolling 60-second mini-graph so you can tell at a glance whether a process is pegged at 100% or just idling. One click to kill it.</p>
<p><img src="/images/claude-lens/system-view.webp" alt="System Processes"></p>
<p>Auth monitoring warns you before your token expires. The Telemetry tab shows recent events.</p>
<hr>
<h2>Keyboard-First Navigation</h2>
<table>
<thead>
<tr>
<th>Shortcut</th>
<th>Action</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>1</code></td>
<td>Projects</td>
</tr>
<tr>
<td><code>2</code></td>
<td>Agent Teams</td>
</tr>
<tr>
<td><code>3</code></td>
<td>Analytics</td>
</tr>
<tr>
<td><code>4</code></td>
<td>Content</td>
</tr>
<tr>
<td><code>5</code></td>
<td>Conversations</td>
</tr>
<tr>
<td><code>6</code></td>
<td>System</td>
</tr>
<tr>
<td><code>7</code></td>
<td>Settings</td>
</tr>
<tr>
<td><code>r</code></td>
<td>Refresh data</td>
</tr>
<tr>
<td><code>Ctrl+K</code> / <code>Cmd+K</code></td>
<td>Command palette</td>
</tr>
<tr>
<td><code>Ctrl+F</code></td>
<td>Search current conversation</td>
</tr>
<tr>
<td><code>Escape</code></td>
<td>Close palette / modal</td>
</tr>
</tbody>
</table>
<p><img src="/images/claude-lens/command-palette.webp" alt="Command Palette"></p>
<hr>
<h2>Under the Hood</h2>
<p>Claude Lens is a <strong>read-mostly companion</strong>. It never writes to your Claude Code state or interferes with running agents.</p>
<p>A lightweight Node.js main process watches the filesystem with <code>chokidar</code>, handles JSONL scanning and deduplication (Claude Code's streaming writes can massively overcount if you're not careful), and pushes updates via IPC to the React frontend.</p>
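<p>The deduplication matters more than it sounds: a streamed assistant turn can be flushed to the JSONL several times, each copy carrying the same usage numbers. Claude Lens does this in Node, but the idea fits in a few lines of Python (field names here are illustrative, not the exact on-disk schema):</p>

```python
def total_tokens(entries: list[dict]) -> int:
    """Sum token usage, counting each message id exactly once.

    Streaming flushes can write the same message repeatedly; naively
    summing every line would massively overcount.
    """
    seen = set()
    total = 0
    for e in entries:
        mid = e.get("id")
        if mid in seen:
            continue  # later flush of a message we already counted
        seen.add(mid)
        usage = e.get("usage", {})
        total += usage.get("input_tokens", 0) + usage.get("output_tokens", 0)
    return total
```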
<p><strong>The stack:</strong> Electron 40 / React 19 / TypeScript / Tailwind CSS v4 / Recharts / React Flow</p>
<hr>
<h2>Get Started</h2>
<pre><code class="hljs language-bash">git <span class="hljs-built_in">clone</span> https://github.com/shansin/claude-lens.git
<span class="hljs-built_in">cd</span> claude-lens
npm install
npm run dev
</code></pre>
<p>That's it. If you've used Claude Code before, the app reads from <code>~/.claude/</code> and your dashboard is live immediately.</p>
<p>For production builds (macOS <code>.dmg</code>, Windows NSIS, Linux AppImage + deb):</p>
<pre><code class="hljs language-bash">npm run build
</code></pre>
<hr>
<h2>Is This For You?</h2>
<p>If you run multi-agent Claude Code teams, care about your API costs, or want a civilized way to browse conversations and manage settings without hand-editing JSON — yes.</p>
<p>Stop flying blind. Know exactly what every agent is doing, what it costs, and what happened.</p>
<p><em>Claude Lens is open source under the ISC license. Contributions welcome.</em></p>]]></content:encoded>
    </item>
    
    <item>
      <title><![CDATA[Leo: My Zero-Cost, Privacy-First AI Assistant on WhatsApp]]></title>
      <link>https://shsin.blog/posts/whatsapp-leo</link>
      <guid isPermaLink="true">https://shsin.blog/posts/whatsapp-leo</guid>
      <pubDate>Sun, 22 Feb 2026 00:00:00 GMT</pubDate>
      <dc:creator>Shantanu Singh</dc:creator>
      <category>local-ai</category>
      <category>assistant</category>
      <category>privacy</category>
      <category>agents</category>
      <description><![CDATA[A fully local AI assistant inside WhatsApp, handling queries, calendar, email, and fitness tracking with zero API costs and complete data privacy.]]></description>
      <content:encoded><![CDATA[<p>$0/month. Runs on your hardware. Lives in WhatsApp.</p>
<p>That's Leo. An AI assistant I built that handles queries, searches the web, manages your calendar, reads your email, tracks your fitness, and delivers a personalized briefing every morning. All from the app you're probably already using to text family and friends.</p>

        <div class="youtube-embed-wrapper" style="aspect-ratio: 9/16; width: 45%; max-width: 100%;">
          <iframe src="https://www.youtube.com/embed/_m7avpUflfs" title="YouTube video player" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen="">
          </iframe>
        </div>
      
<p><a href="https://github.com/shansin/whatsapp-leo">GitHub</a></p>
<h2>Why I Built This</h2>
<p>WhatsApp is already on everyone's phone. It's the most popular messaging app on the planet, and I was already using it to stay connected with family and friends. The question was: what if it could also manage my digital life?</p>
<p>I wanted four things:</p>
<ul>
<li><strong>Privacy first</strong>: My data never leaves my machine</li>
<li><strong>Control</strong>: I own the logic for workflows, system prompts, and model choice</li>
<li><strong>Zero recurring cost</strong>: No API subscriptions, no token metering</li>
<li><strong>Learn by building</strong>: A real project to deepen my understanding of local LLMs and agents</li>
</ul>
<p>Leo is the result.</p>
<h2>What Leo Can Do</h2>
<h3>Intelligent Conversations</h3>
<p>Leo handles the full range of AI assistant tasks: answering questions, brainstorming, deep research, explaining concepts. Each conversation maintains its own memory via SQLite-backed sessions, so Leo remembers what you discussed earlier.</p>
<h3>Web Search</h3>
<p>Need current information? Leo connects to Brave Search for real-time data. Ask about news, look up facts, research any topic.</p>
<pre><code>What's the latest Supreme Court ruling on tariffs?

Do deep research and summarize whether tariffs are good or bad for the US economy
</code></pre>
<h3>Google Workspace Integration</h3>
<p>Leo becomes a productivity layer across your entire Google account:</p>
<table>
<thead>
<tr>
<th>Service</th>
<th>Capabilities</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Google Calendar</strong></td>
<td>View events, create meetings, find free time slots</td>
</tr>
<tr>
<td><strong>Gmail</strong></td>
<td>Search threads, draft and send emails</td>
</tr>
<tr>
<td><strong>Google Docs</strong></td>
<td>Create, read, find, update documents</td>
</tr>
<tr>
<td><strong>Google Drive</strong></td>
<td>Search files, create folders, download content</td>
</tr>
<tr>
<td><strong>Google Sheets</strong></td>
<td>Read data, get ranges</td>
</tr>
<tr>
<td><strong>Google Slides</strong></td>
<td>Read presentations</td>
</tr>
</tbody>
</table>
<pre><code>@leo, am I free this Sat 5pm? if so add 2 hr block for Tom's bday
</code></pre>
<h3>Health &amp; Fitness</h3>
<p>Leo connects to Garmin Connect to pull your fitness data: sleep patterns, training schedule, workout history, performance trends. This feeds directly into your morning briefings.</p>
<h3>One-Time Reminders</h3>
<p>Use natural language:</p>
<pre><code>#remindme in 30 minutes to call mom
#remindme tomorrow at 9am to check emails
#remindme at 12pm Feb 25, 2026 to complete taxes
</code></pre>
<p>Leo parses your request and messages you at the right time.</p>
<h3>Recurring Reminders</h3>
<p>Build habits:</p>
<pre><code>#reminder add "9pm Sun to Thu" Review and adjust tomorrow's calendar
#reminder add "12:30 pm Thursdays" Read Weekly Review Doc
#reminder help
#reminder list
#reminder remove &lt;id&gt;
</code></pre>
<h3>Scheduled Briefings</h3>
<p>This is the feature I use most. You define a prompt and a schedule; Leo runs it and delivers the results to your WhatsApp:</p>
<pre><code>#briefing add "Morning Brief" "6:00am everyday" Get today's scheduled training from Garmin, today's calendar events, and unread emails summary

#briefing add "Evening Brief" "5:00pm everyday" Get unread emails summary and top 2 news from today

#briefing help
#briefing list
#briefing remove &lt;id&gt;
</code></pre>
<p>Wake up to a personalized digest built from your actual calendar, email, and fitness data.</p>
<h3>Hooks: Bridge to External Programs</h3>
<p>Leo can route messages to any program on your machine through bidirectional named pipes. Each hook creates two FIFOs: one for sending messages to the program, one for receiving responses back.</p>
<p>Trigger with <code>#hook-name message</code> or <code>@hook-name message</code>. I have hooks for <code>claude</code> and <code>codex</code>, so I can type <code>#claude explain quantum computing</code> in WhatsApp and get a Claude response routed right back into the chat.</p>
<p>This turns Leo into a message router that can bridge WhatsApp to virtually anything running on your machine.</p>
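<p>The FIFO plumbing is simple enough to sketch. The snippet below is an illustrative stand-in, not Leo's actual hook code: the path layout, function names, and the fake external program are all hypothetical, but the blocking round trip over two named pipes is the same idea.</p>

```python
import os
import tempfile
import threading

def make_hook(name, base_dir):
    """Create the two FIFOs a hook uses: one for sending messages to
    the external program, one for reading its responses back.
    Path layout is illustrative, not Leo's exact convention."""
    to_prog = os.path.join(base_dir, f"{name}.in")
    from_prog = os.path.join(base_dir, f"{name}.out")
    for path in (to_prog, from_prog):
        if not os.path.exists(path):
            os.mkfifo(path)
    return to_prog, from_prog

def route(to_prog, from_prog, message):
    """Blocking round trip: write the user's message into one FIFO,
    then read the program's reply from the other."""
    with open(to_prog, "w") as f:
        f.write(message + "\n")
    with open(from_prog) as f:
        return f.readline().rstrip("\n")

def fake_program(to_prog, from_prog):
    """Stand-in for an external tool sitting on the other end of the pipes;
    here it just echoes the message back in uppercase."""
    with open(to_prog) as f:
        msg = f.readline().rstrip("\n")
    with open(from_prog, "w") as f:
        f.write(msg.upper() + "\n")
```

<p>Opening a FIFO blocks until both a reader and a writer attach, which is what makes the request/response handshake work without any polling.</p>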
<h3>Test Mode</h3>
<p>Leo includes a local Gradio UI at <code>http://127.0.0.1:7860</code> that bypasses the WhatsApp bridge entirely. It has a model selector to hot-swap Ollama models at runtime and a live system log panel. All background schedulers still run, so you can iterate on prompts without needing your phone.</p>
<hr>
<h2>Why Zero Cost Actually Works</h2>
<table>
<thead>
<tr>
<th>Component</th>
<th>Cost</th>
</tr>
</thead>
<tbody>
<tr>
<td>LLM (Ollama + local model)</td>
<td>$0</td>
</tr>
<tr>
<td>WhatsApp messaging</td>
<td>$0 (uses WhatsApp Web protocol)</td>
</tr>
<tr>
<td>Brave Search (free tier)</td>
<td>$0</td>
</tr>
<tr>
<td>Google APIs</td>
<td>Free</td>
</tr>
<tr>
<td>Garmin data access</td>
<td>Free</td>
</tr>
<tr>
<td>Hosting</td>
<td>$0 (runs locally)</td>
</tr>
<tr>
<td><strong>Total</strong></td>
<td><strong>$0/month</strong></td>
</tr>
</tbody>
</table>
<p>Electricity is the only real cost. My estimates put it well under $10/year:</p>
<ul>
<li>The service draws ~60W at idle</li>
<li>Inference spikes 100-300W for a few seconds on the 5070 Ti</li>
<li>The 5060 Ti is slower but even more efficient</li>
</ul>
<p>The key insight: modern open-source LLMs are good enough for most assistant tasks. Models like GLM-4.7-Flash, gpt-oss:20b, and deepseek-r1:8b run on consumer hardware and deliver strong results without per-token costs.</p>
<p>You already own the hardware. Make it work for you.</p>
<hr>
<h2>The Privacy Advantage</h2>
<p>Leo runs entirely on your local machine:</p>
<ul>
<li><strong>Local LLM</strong>: Powered by Ollama. Inference happens on your GPUs.</li>
<li><strong>Local storage</strong>: Messages, reminders, and sessions live in SQLite databases on your device.</li>
<li><strong>No cloud dependency</strong>: Your conversations never travel to external servers beyond WhatsApp.</li>
<li><strong>Your credentials, your machine</strong>: Google, WhatsApp, and Garmin tokens stay on your hardware.</li>
</ul>
<hr>
<h2>Technical Architecture</h2>
<div class="excalidraw-diagram" data-scene="{
  "type": "excalidraw",
  "version": 2,
  "source": "https://excalidraw.com",
  "elements": [
    {
      "type": "rectangle",
      "version": 1,
      "id": "whatsapp-box",
      "x": 100,
      "y": 20,
      "width": 560,
      "height": 60,
      "angle": 0,
      "strokeColor": "#1e1e1e",
      "backgroundColor": "#a5d8ff",
      "fillStyle": "solid",
      "strokeWidth": 2,
      "roughness": 1,
      "opacity": 100,
      "roundness": { "type": 3 },
      "seed": 101,
      "groupIds": [],
      "frameId": null,
      "boundElements": [
        { "type": "text", "id": "whatsapp-text" },
        { "type": "arrow", "id": "arrow-wa-go" }
      ],
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1
    },
    {
      "type": "text",
      "version": 1,
      "id": "whatsapp-text",
      "x": 100,
      "y": 20,
      "width": 560,
      "height": 60,
      "angle": 0,
      "strokeColor": "#1e1e1e",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 1,
      "roughness": 1,
      "opacity": 100,
      "roundness": null,
      "seed": 102,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "text": "WhatsApp Network",
      "fontSize": 18,
      "fontFamily": 1,
      "textAlign": "center",
      "verticalAlign": "middle",
      "containerId": "whatsapp-box",
      "originalText": "WhatsApp Network",
      "autoResize": true,
      "lineHeight": 1.25,
      "updated": 1
    },
    {
      "type": "arrow",
      "version": 1,
      "id": "arrow-wa-go",
      "x": 380,
      "y": 80,
      "width": 0,
      "height": 50,
      "angle": 0,
      "strokeColor": "#1e1e1e",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 2,
      "roughness": 1,
      "opacity": 100,
      "roundness": { "type": 2 },
      "seed": 103,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "points": [[0, 0], [0, 50]],
      "lastCommittedPoint": null,
      "startBinding": { "elementId": "whatsapp-box", "focus": 0, "gap": 1, "fixedPoint": null },
      "endBinding": { "elementId": "go-bridge-box", "focus": 0, "gap": 1, "fixedPoint": null },
      "startArrowhead": null,
      "endArrowhead": "arrow",
      "updated": 1
    },
    {
      "type": "rectangle",
      "version": 1,
      "id": "go-bridge-box",
      "x": 100,
      "y": 130,
      "width": 560,
      "height": 160,
      "angle": 0,
      "strokeColor": "#1e1e1e",
      "backgroundColor": "#ffec99",
      "fillStyle": "solid",
      "strokeWidth": 2,
      "roughness": 1,
      "opacity": 100,
      "roundness": { "type": 3 },
      "seed": 104,
      "groupIds": [],
      "frameId": null,
      "boundElements": [
        { "type": "text", "id": "go-bridge-text" },
        { "type": "arrow", "id": "arrow-wa-go" },
        { "type": "arrow", "id": "arrow-go-py" }
      ],
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1
    },
    {
      "type": "text",
      "version": 1,
      "id": "go-bridge-text",
      "x": 100,
      "y": 130,
      "width": 560,
      "height": 160,
      "angle": 0,
      "strokeColor": "#1e1e1e",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 1,
      "roughness": 1,
      "opacity": 100,
      "roundness": null,
      "seed": 105,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "text": "Go WhatsApp Bridge\n(whatsmeow library - WhatsApp Web Protocol)\n- QR Code Authentication\n- Message receiving / sending  -  Media handling\n- SQLite storage for messages & session",
      "fontSize": 16,
      "fontFamily": 1,
      "textAlign": "center",
      "verticalAlign": "middle",
      "containerId": "go-bridge-box",
      "originalText": "Go WhatsApp Bridge\n(whatsmeow library - WhatsApp Web Protocol)\n- QR Code Authentication\n- Message receiving / sending  -  Media handling\n- SQLite storage for messages & session",
      "autoResize": true,
      "lineHeight": 1.4,
      "updated": 1
    },
    {
      "type": "arrow",
      "version": 1,
      "id": "arrow-go-py",
      "x": 380,
      "y": 290,
      "width": 0,
      "height": 60,
      "angle": 0,
      "strokeColor": "#1971c2",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 2,
      "roughness": 1,
      "opacity": 100,
      "roundness": { "type": 2 },
      "seed": 106,
      "groupIds": [],
      "frameId": null,
      "boundElements": [{ "type": "text", "id": "arrow-go-py-label" }],
      "isDeleted": false,
      "link": null,
      "locked": false,
      "points": [[0, 0], [0, 60]],
      "lastCommittedPoint": null,
      "startBinding": { "elementId": "go-bridge-box", "focus": 0, "gap": 1, "fixedPoint": null },
      "endBinding": { "elementId": "python-agent-box", "focus": 0, "gap": 1, "fixedPoint": null },
      "startArrowhead": null,
      "endArrowhead": "arrow",
      "updated": 1
    },
    {
      "type": "text",
      "version": 1,
      "id": "arrow-go-py-label",
      "x": 390,
      "y": 312,
      "width": 160,
      "height": 20,
      "angle": 0,
      "strokeColor": "#1971c2",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 1,
      "roughness": 1,
      "opacity": 100,
      "roundness": null,
      "seed": 107,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "text": "Unix Socket (secure IPC)",
      "fontSize": 13,
      "fontFamily": 1,
      "textAlign": "left",
      "verticalAlign": "middle",
      "containerId": "arrow-go-py",
      "originalText": "Unix Socket (secure IPC)",
      "autoResize": true,
      "lineHeight": 1.25,
      "updated": 1
    },
    {
      "type": "rectangle",
      "version": 1,
      "id": "python-agent-box",
      "x": 100,
      "y": 350,
      "width": 560,
      "height": 180,
      "angle": 0,
      "strokeColor": "#1e1e1e",
      "backgroundColor": "#b2f2bb",
      "fillStyle": "solid",
      "strokeWidth": 2,
      "roughness": 1,
      "opacity": 100,
      "roundness": { "type": 3 },
      "seed": 108,
      "groupIds": [],
      "frameId": null,
      "boundElements": [
        { "type": "text", "id": "python-agent-text" },
        { "type": "arrow", "id": "arrow-go-py" },
        { "type": "arrow", "id": "arrow-py-brave" },
        { "type": "arrow", "id": "arrow-py-workspace" },
        { "type": "arrow", "id": "arrow-py-garmin" }
      ],
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1
    },
    {
      "type": "text",
      "version": 1,
      "id": "python-agent-text",
      "x": 100,
      "y": 350,
      "width": 560,
      "height": 180,
      "angle": 0,
      "strokeColor": "#1e1e1e",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 1,
      "roughness": 1,
      "opacity": 100,
      "roundness": null,
      "seed": 109,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "text": "Python Agent Server\n(OpenAI Agents SDK + Local LLM via Ollama)\n- Message processing\n- Agent management with LRU cache\n- Reminder & Briefing schedulers\n- Session persistence",
      "fontSize": 16,
      "fontFamily": 1,
      "textAlign": "center",
      "verticalAlign": "middle",
      "containerId": "python-agent-box",
      "originalText": "Python Agent Server\n(OpenAI Agents SDK + Local LLM via Ollama)\n- Message processing\n- Agent management with LRU cache\n- Reminder & Briefing schedulers\n- Session persistence",
      "autoResize": true,
      "lineHeight": 1.4,
      "updated": 1
    },
    {
      "type": "arrow",
      "version": 1,
      "id": "arrow-py-brave",
      "x": 230,
      "y": 530,
      "width": 110,
      "height": 70,
      "angle": 0,
      "strokeColor": "#1e1e1e",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 2,
      "roughness": 1,
      "opacity": 100,
      "roundness": { "type": 2 },
      "seed": 110,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "points": [[0, 0], [-110, 70]],
      "lastCommittedPoint": null,
      "startBinding": { "elementId": "python-agent-box", "focus": 0, "gap": 1, "fixedPoint": null },
      "endBinding": { "elementId": "brave-box", "focus": 0, "gap": 1, "fixedPoint": null },
      "startArrowhead": null,
      "endArrowhead": "arrow",
      "updated": 1
    },
    {
      "type": "arrow",
      "version": 1,
      "id": "arrow-py-workspace",
      "x": 380,
      "y": 530,
      "width": 0,
      "height": 70,
      "angle": 0,
      "strokeColor": "#1e1e1e",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 2,
      "roughness": 1,
      "opacity": 100,
      "roundness": { "type": 2 },
      "seed": 111,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "points": [[0, 0], [0, 70]],
      "lastCommittedPoint": null,
      "startBinding": { "elementId": "python-agent-box", "focus": 0, "gap": 1, "fixedPoint": null },
      "endBinding": { "elementId": "workspace-box", "focus": 0, "gap": 1, "fixedPoint": null },
      "startArrowhead": null,
      "endArrowhead": "arrow",
      "updated": 1
    },
    {
      "type": "arrow",
      "version": 1,
      "id": "arrow-py-garmin",
      "x": 530,
      "y": 530,
      "width": 110,
      "height": 70,
      "angle": 0,
      "strokeColor": "#1e1e1e",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 2,
      "roughness": 1,
      "opacity": 100,
      "roundness": { "type": 2 },
      "seed": 112,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "points": [[0, 0], [110, 70]],
      "lastCommittedPoint": null,
      "startBinding": { "elementId": "python-agent-box", "focus": 0, "gap": 1, "fixedPoint": null },
      "endBinding": { "elementId": "garmin-box", "focus": 0, "gap": 1, "fixedPoint": null },
      "startArrowhead": null,
      "endArrowhead": "arrow",
      "updated": 1
    },
    {
      "type": "rectangle",
      "version": 1,
      "id": "brave-box",
      "x": 40,
      "y": 600,
      "width": 180,
      "height": 80,
      "angle": 0,
      "strokeColor": "#1e1e1e",
      "backgroundColor": "#d0bfff",
      "fillStyle": "solid",
      "strokeWidth": 2,
      "roughness": 1,
      "opacity": 100,
      "roundness": { "type": 3 },
      "seed": 113,
      "groupIds": [],
      "frameId": null,
      "boundElements": [
        { "type": "text", "id": "brave-text" },
        { "type": "arrow", "id": "arrow-py-brave" }
      ],
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1
    },
    {
      "type": "text",
      "version": 1,
      "id": "brave-text",
      "x": 40,
      "y": 600,
      "width": 180,
      "height": 80,
      "angle": 0,
      "strokeColor": "#1e1e1e",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 1,
      "roughness": 1,
      "opacity": 100,
      "roundness": null,
      "seed": 114,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "text": "Brave Search\nMCP",
      "fontSize": 16,
      "fontFamily": 1,
      "textAlign": "center",
      "verticalAlign": "middle",
      "containerId": "brave-box",
      "originalText": "Brave Search\nMCP",
      "autoResize": true,
      "lineHeight": 1.4,
      "updated": 1
    },
    {
      "type": "rectangle",
      "version": 1,
      "id": "workspace-box",
      "x": 290,
      "y": 600,
      "width": 180,
      "height": 80,
      "angle": 0,
      "strokeColor": "#1e1e1e",
      "backgroundColor": "#d0bfff",
      "fillStyle": "solid",
      "strokeWidth": 2,
      "roughness": 1,
      "opacity": 100,
      "roundness": { "type": 3 },
      "seed": 115,
      "groupIds": [],
      "frameId": null,
      "boundElements": [
        { "type": "text", "id": "workspace-text" },
        { "type": "arrow", "id": "arrow-py-workspace" }
      ],
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1
    },
    {
      "type": "text",
      "version": 1,
      "id": "workspace-text",
      "x": 290,
      "y": 600,
      "width": 180,
      "height": 80,
      "angle": 0,
      "strokeColor": "#1e1e1e",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 1,
      "roughness": 1,
      "opacity": 100,
      "roundness": null,
      "seed": 116,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "text": "Workspace MCP\n(Google)",
      "fontSize": 16,
      "fontFamily": 1,
      "textAlign": "center",
      "verticalAlign": "middle",
      "containerId": "workspace-box",
      "originalText": "Workspace MCP\n(Google)",
      "autoResize": true,
      "lineHeight": 1.4,
      "updated": 1
    },
    {
      "type": "rectangle",
      "version": 1,
      "id": "garmin-box",
      "x": 540,
      "y": 600,
      "width": 180,
      "height": 80,
      "angle": 0,
      "strokeColor": "#1e1e1e",
      "backgroundColor": "#d0bfff",
      "fillStyle": "solid",
      "strokeWidth": 2,
      "roughness": 1,
      "opacity": 100,
      "roundness": { "type": 3 },
      "seed": 117,
      "groupIds": [],
      "frameId": null,
      "boundElements": [
        { "type": "text", "id": "garmin-text" },
        { "type": "arrow", "id": "arrow-py-garmin" }
      ],
      "isDeleted": false,
      "link": null,
      "locked": false,
      "updated": 1
    },
    {
      "type": "text",
      "version": 1,
      "id": "garmin-text",
      "x": 540,
      "y": 600,
      "width": 180,
      "height": 80,
      "angle": 0,
      "strokeColor": "#1e1e1e",
      "backgroundColor": "transparent",
      "fillStyle": "solid",
      "strokeWidth": 1,
      "roughness": 1,
      "opacity": 100,
      "roundness": null,
      "seed": 118,
      "groupIds": [],
      "frameId": null,
      "boundElements": null,
      "isDeleted": false,
      "link": null,
      "locked": false,
      "text": "Garmin MCP",
      "fontSize": 16,
      "fontFamily": 1,
      "textAlign": "center",
      "verticalAlign": "middle",
      "containerId": "garmin-box",
      "originalText": "Garmin MCP",
      "autoResize": true,
      "lineHeight": 1.4,
      "updated": 1
    }
  ],
  "appState": {
    "gridSize": null,
    "viewBackgroundColor": "#ffffff"
  },
  "files": {}
}
"></div>
<p>The system splits into two processes that communicate over Unix domain sockets (paths configurable via <code>INSTANCE_GUID</code>), which allows multiple Leo instances on the same machine. The Go bridge handles the WhatsApp protocol; the Python server handles AI reasoning. Neither exposes a network port.</p>
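<p>Conceptually, the IPC boundary looks like the sketch below. This is not the real bridge code (the Go side speaks WhatsApp, and the actual socket path derives from <code>INSTANCE_GUID</code>); it's a minimal Python-on-both-ends model of one message crossing a Unix domain socket with no network port involved.</p>

```python
import json
import os
import socket
import tempfile
import threading

def make_server(sock_path):
    """Bind a Unix domain socket: filesystem-addressed, no TCP port opened."""
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(sock_path)
    srv.listen(1)
    return srv

def serve_once(srv):
    """Stand-in for the Python agent server: handle a single message."""
    conn, _ = srv.accept()
    request = json.loads(conn.recv(4096).decode())
    reply = {"reply": f"echo: {request['text']}"}
    conn.sendall(json.dumps(reply).encode())
    conn.close()
    srv.close()

def send(sock_path, text):
    """What the Go bridge does conceptually: ship one message, await the reply."""
    cli = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    cli.connect(sock_path)
    cli.sendall(json.dumps({"text": text}).encode())
    reply = json.loads(cli.recv(4096).decode())
    cli.close()
    return reply["reply"]
```

<p>Because the socket lives on the filesystem, access is governed by file permissions rather than firewall rules, which is the security property the two-process split relies on.</p>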
<h3>Go WhatsApp Bridge (<code>whatsapp-mcp/whatsapp-bridge/</code>)</h3>
<ul>
<li>Built on <code>whatsmeow</code>, a Go library implementing WhatsApp's multi-device protocol</li>
<li>Heavily modified for performance and to support all Leo use cases</li>
<li>Handles authentication via QR code scanning</li>
<li>Manages message storage in SQLite</li>
<li>Processes media: images, videos, audio, documents</li>
<li>Includes a custom Ogg Opus parser for voice message duration detection</li>
</ul>
<h3>Python Agent Server (<code>agent/</code>)</h3>
<ul>
<li>Uses OpenAI Agents SDK for orchestration</li>
<li>Connects to Ollama via OpenAI-compatible API (<code>http://localhost:11434/v1</code>)</li>
<li>Agent factory with LRU cache (max 20 agents, 30-minute TTL) for multi-conversation support</li>
<li>Natural language time parsing for reminders via an LLM agent</li>
<li>Cron-based scheduling for briefings and recurring reminders</li>
</ul>
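<p>The agent factory's cache policy can be sketched as a small LRU-with-TTL structure. The limits below (20 agents, 30-minute TTL) come from the description above; the class itself is a simplified illustration, not Leo's actual implementation.</p>

```python
import time
from collections import OrderedDict

class AgentCache:
    """LRU cache with per-entry TTL, mirroring the agent factory's limits
    described above (max 20 agents, 30-minute TTL)."""

    def __init__(self, max_size=20, ttl=30 * 60):
        self.max_size = max_size
        self.ttl = ttl
        self._items = OrderedDict()  # chat_id -> (agent, last_used)

    def get(self, chat_id, factory):
        """Return a cached agent for this conversation, or build one."""
        now = time.monotonic()
        entry = self._items.get(chat_id)
        if entry is not None and (now - entry[1]) > self.ttl:
            del self._items[chat_id]  # expired: treat as a miss
            entry = None
        if entry is not None:
            self._items[chat_id] = (entry[0], now)
            self._items.move_to_end(chat_id)  # mark as most recently used
            return entry[0]
        agent = factory(chat_id)  # cache miss: build a fresh agent
        self._items[chat_id] = (agent, now)
        self._items.move_to_end(chat_id)
        if len(self._items) > self.max_size:
            self._items.popitem(last=False)  # evict least recently used
        return agent
```

<p>The point of the TTL on top of plain LRU is that an idle conversation releases its agent (and its memory) even when the cache isn't full.</p>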
<h3>MCP Servers</h3>
<p>Three servers launched as child processes using stdio-based MCP:</p>
<ul>
<li><code>brave-search-mcp</code>: Web search</li>
<li><code>workspace-mcp</code>: Google Workspace (Docs, Calendar, Gmail, Drive, Sheets, Slides)</li>
<li><code>garmin-mcp</code>: Fitness and health data</li>
</ul>
<p>All three communicate with the agent server over stdin/stdout, not HTTP.</p>
<h3>Operating Modes</h3>
<ol>
<li><strong>Dedicated Number Mode</strong> (<code>IS_DEDICATED_NUMBER=true</code>): Responds to all DMs and group mentions. Good for a dedicated Leo phone number.</li>
<li><strong>Mention Mode</strong>: Only responds when explicitly mentioned (<code>@leo</code> or <code>#leo</code>). Works with your existing WhatsApp account.</li>
</ol>
<h3>Access Control</h3>
<ul>
<li><strong>Privileged whitelist</strong> (<code>ALLOWED_SENDERS</code>): Only listed phone numbers get Google Workspace, Garmin, reminders, and briefings access. Non-privileged users can still chat and search the web.</li>
<li>Unix domain sockets for inter-process communication; no exposed network ports.</li>
<li>Thread-local SQLite connections to avoid concurrency issues.</li>
<li>Environment-based configuration for all sensitive credentials.</li>
</ul>
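<p>The thread-local SQLite pattern from the list above is worth showing, since <code>sqlite3</code> connections aren't safe to share across threads by default. This is a generic sketch of the technique, not Leo's code; the function name is hypothetical.</p>

```python
import sqlite3
import threading

_local = threading.local()

def get_conn(db_path=":memory:"):
    """Return this thread's own SQLite connection, creating it on first use.
    Note: with ":memory:" each thread gets a separate in-memory database;
    a real deployment would pass a shared file path."""
    if not hasattr(_local, "conn"):
        _local.conn = sqlite3.connect(db_path)
    return _local.conn
```

<p>Each scheduler or worker thread gets its own connection object, so no two threads ever touch the same handle concurrently.</p>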
<hr>
<h2>Interesting Technical Details</h2>
<h3>Natural Language Time Parsing</h3>
<p>Instead of rigid regex patterns, the reminder system uses an LLM agent to parse times. It handles:</p>
<ul>
<li>"in 30 minutes"</li>
<li>"tomorrow at 9am"</li>
<li>"at 5pm Feb 14, 2026"</li>
<li>"next Monday morning"</li>
</ul>
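<p>To make the target behavior concrete, here is a deterministic toy parser for the two simplest phrases above. It is emphatically not what Leo does (Leo delegates parsing to an LLM agent precisely to avoid brittle patterns like these); the function name and regexes are illustrative only.</p>

```python
import re
from datetime import datetime, timedelta

def parse_when(phrase, now=None):
    """Toy resolver for natural-language times. Handles only
    "in N minutes/hours" and "tomorrow at Ham/pm"; anything
    fancier is exactly why the real system uses an LLM."""
    now = now or datetime.now()
    m = re.match(r"in (\d+) (minute|hour)s?", phrase.lower())
    if m:
        return now + timedelta(**{m.group(2) + "s": int(m.group(1))})
    m = re.match(r"tomorrow at (\d{1,2})(am|pm)", phrase.lower())
    if m:
        hour = int(m.group(1)) % 12 + (12 if m.group(2) == "pm" else 0)
        return (now + timedelta(days=1)).replace(
            hour=hour, minute=0, second=0, microsecond=0)
    raise ValueError(f"unparsed: {phrase!r}")
```

<p>An LLM handles "next Monday morning" or "at 5pm Feb 14, 2026" with the same prompt, where a regex approach would need a new pattern per phrasing.</p>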
<h3>Cron-Based Scheduling</h3>
<p>Briefings and recurring reminders use <code>croniter</code> for flexible scheduling:</p>
<table>
<thead>
<tr>
<th>Input</th>
<th>Cron Expression</th>
</tr>
</thead>
<tbody>
<tr>
<td>"9am everyday"</td>
<td><code>0 9 * * *</code></td>
</tr>
<tr>
<td>"Monday 8am"</td>
<td><code>0 8 * * 1</code></td>
</tr>
<tr>
<td>"5pm friday"</td>
<td><code>0 17 * * 5</code></td>
</tr>
</tbody>
</table>
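<p>A deterministic toy of the mapping in the table, for illustration. The real pipeline has an LLM produce the cron string and <code>croniter</code> compute fire times; this hypothetical helper only covers the simple "time + optional weekday" shapes shown above.</p>

```python
import re

# Cron weekday numbers: 0 = Sunday .. 6 = Saturday
DAYS = {"sunday": 0, "monday": 1, "tuesday": 2, "wednesday": 3,
        "thursday": 4, "friday": 5, "saturday": 6}

def to_cron(phrase):
    """Map phrases like "9am everyday" or "5pm friday" to a cron
    expression. Toy version of the table above, not Leo's parser."""
    m = re.search(r"(\d{1,2})(?::(\d{2}))?\s*(am|pm)", phrase.lower())
    hour = int(m.group(1)) % 12 + (12 if m.group(3) == "pm" else 0)
    minute = int(m.group(2) or 0)
    day = "*"
    for name, num in DAYS.items():
        if name[:3] in phrase.lower():  # match "mon", "monday", "mondays"
            day = str(num)
            break
    return f"{minute} {hour} * * {day}"
```

<p>Feeding the resulting expression to <code>croniter</code> then yields the next delivery time for each briefing or recurring reminder.</p>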
<h3>WhatsApp LID Resolution</h3>
<p>For privacy, WhatsApp identifies some senders by LID (Linked ID) rather than phone number, and a LID can't be used as the destination for an outbound message. The Go bridge automatically resolves LIDs to actual phone numbers before sending.</p>
<h3>Performance Optimizations</h3>
<ul>
<li><strong>Agent caching</strong>: LRU eviction prevents memory bloat from idle conversations</li>
<li><strong>Pre-built MCP parameters</strong>: Avoids per-message object creation overhead</li>
<li><strong>Shared environment copy</strong>: Avoids copying 100+ environment variables per request</li>
<li><strong>Singleton OpenAI client</strong>: Reused across all messages</li>
</ul>
<hr>
<h2>Getting Started</h2>
<p>Built on Ubuntu with NVIDIA GPUs, but it should work the same on Mac and WSL.</p>
<p><strong>Prerequisites:</strong> Python &gt;= 3.13, <a href="https://docs.astral.sh/uv/">uv</a>, Go, <a href="https://ollama.com">Ollama</a>, and Node.js/npm.</p>
<ol>
<li>Clone the repository</li>
<li>Install Ollama and pull a model: <code>ollama pull glm-4.7-flash</code></li>
<li>Copy <code>.env_example</code> to <code>.env</code> and fill in your Brave Search API key, allowed senders, and other settings</li>
<li>Run the services: <code>./start_services.sh</code></li>
<li>Scan the QR code to connect WhatsApp</li>
<li>Start messaging Leo</li>
</ol>
<p>Want to try it without a phone? Run in test mode:</p>
<pre><code class="hljs language-bash">IS_TEST_MODE=<span class="hljs-literal">true</span> ./start_services.sh
</code></pre>
<p>Then open <code>http://127.0.0.1:7860</code> for the Gradio UI.</p>
<hr>
<h2>What's Next</h2>
<ul>
<li><strong>Long-term memory</strong>: Remember preferences, recall past conversations</li>
<li><strong>Multi-modal capabilities</strong>: Image analysis, document understanding</li>
<li><strong>Voice improvements</strong>: Better TTS/STT for seamless voice conversations</li>
<li><strong>RAG on personal data</strong>: Index and search through your own documents</li>
<li><strong>Family/shared mode</strong>: Multiple users with separate contexts</li>
</ul>
<hr>
<h2>The Bottom Line</h2>
<ul>
<li><strong>Own, don't rent</strong>: Your hardware, your model, your rules</li>
<li><strong>Privacy by design</strong>: Data never leaves your machine</li>
<li><strong>Zero marginal cost</strong>: Chat all day, run 50 briefings. Nothing extra.</li>
<li><strong>Meet users where they are</strong>: WhatsApp is already in everyone's pocket</li>
</ul>
<p>You don't have to choose between convenience and privacy. With Leo, you get both.</p>
<hr>
<p><em>Leo is open source. Your assistant, your data, your control.</em>
<em>This post was updated on 2/28/2026</em></p>]]></content:encoded>
    </item>
    
    <item>
      <title><![CDATA[Optional Hard Things]]></title>
      <link>https://shsin.blog/posts/hard-optional-things</link>
      <guid isPermaLink="true">https://shsin.blog/posts/hard-optional-things</guid>
      <pubDate>Wed, 04 Feb 2026 00:00:00 GMT</pubDate>
      <dc:creator>Shantanu Singh</dc:creator>
      <category>running</category>
      <category>resilience</category>
      <description><![CDATA[Modern life decoupled reward from effort. Voluntary hardship rewires the equation and builds lasting resilience.]]></description>
<content:encoded><![CDATA[<p>Easy things are easy. Then there are the hard things you can't skip: paying bills, doing taxes, feeding yourself. They're <em>mandatory</em>; skip them and the consequences hit fast. But there's something different about doing <strong>hard optional things</strong>.</p>
<h2>The Effort-Reward Mismatch</h2>
<p>For 99% of human history, <strong>reward was linked to effort</strong>. You hunted to eat. You built shelter to stay warm. You walked miles to find water. Our brains release dopamine <em>after</em> we overcome resistance. The struggle wasn't just a barrier to the reward. It was part of the reward equation.</p>
<p>Modern life short-circuited this loop. We decoupled reward from effort. You can get a dopamine hit without moving a muscle. Just swipe your thumb on a glass screen.</p>
<p>This instant gratification confuses our ancient biology. We get the prize without the hunt, the feast without the famine. The result isn't deep happiness. It's a hollow satiety that leaves us craving more. We're addicted to cheap pleasure because we've forgotten how to earn expensive happiness.</p>
<h2>Dopamine: The Currency of Pursuit</h2>
<p>We often confuse <em>pleasure</em> with <em>happiness</em>, but they run on entirely different mechanisms.</p>
<p><strong>Pleasure</strong> is short-term. It's the hit from doom-scrolling, eating sugar, or binge-watching a show. Cheap dopamine, zero effort. The problem: it spikes fast and crashes hard, leaving a craving for more (the addiction loop) and a baseline that slowly drops over time.</p>
<p><strong>Happiness</strong>, in the deep, contented sense, comes from <em>effort</em>. It's the dopamine of <strong>pursuit and achievement</strong>. Training for a marathon, learning a complex skill, building a business. These engage a long-term dopamine release that feels like purpose.</p>
<p>Choosing hard things rewires your reward system. You stop being a passive consumer of pleasure and become an active creator of your own happiness.</p>
<h2>Strategic Suffering</h2>
<p>The core trade-off of life is simple:</p>
<blockquote>
<p><strong>Easy Now</strong> leads to <strong>Longer Hard Later</strong>.
<strong>Hard Now</strong> leads to <strong>Predictable Easy Later</strong>.</p>
</blockquote>
<p>If you choose the easy path now (skipping the workout, avoiding the difficult conversation, procrastinating on the project), you're borrowing comfort from your future self. The interest rate on that loan is brutal. It shows up as poor health, lack of skills, and regret.</p>
<p>But when you choose voluntary exposure to difficulty, you build resilience. You train your nervous system to handle stress.</p>
<ul>
<li><strong>Cold showers</strong> teach you to suppress the panic response.</li>
<li><strong>Heavy lifting</strong> teaches you that you can bear a load.</li>
<li><strong>Deep work</strong> teaches you that you can focus in a distracted world.</li>
<li><strong>Running</strong> teaches you everything there is to know about life. More on this later!</li>
</ul>
<p>These are optional. No one will fire you for skipping them. But doing them signals something to your deepest self: <em>I am capable. I am strong. I can handle whatever comes.</em></p>
<h2>The Magic of the Optional</h2>
<p>The most important thing about optional hardship is precisely that: <strong>it is optional</strong>.</p>
<p>When life forces a struggle on you, a setback, an illness, it's suffering. But when you <em>choose</em> the struggle, it's empowerment. You're not reacting to circumstance. You're building character on purpose.</p>
<p>Pick one hard optional thing today. Not because you have to, but because you don't.</p>
<p><img src="/images/hard-optional-things/hard-optional-things.webp" alt="Success through struggle"></p>]]></content:encoded>
    </item>
    
    <item>
      <title><![CDATA[From Toy Debates to Autonomous Engineering Teams with CrewAI]]></title>
      <link>https://shsin.blog/posts/crew-ai</link>
      <guid isPermaLink="true">https://shsin.blog/posts/crew-ai</guid>
      <pubDate>Wed, 31 Dec 2025 00:00:00 GMT</pubDate>
      <dc:creator>Shantanu Singh</dc:creator>
      <category>local-ai</category>
      <category>agents</category>
      <description><![CDATA[Scaling CrewAI from a simple debate exercise to a fully autonomous software engineering crew, with agents that research, code, and review, all running on local GPUs.]]></description>
<content:encoded><![CDATA[<p>A three-agent debate, a sandboxed coder, a stock picker with memory, and a full engineering team that writes, tests, and reviews its own code. All running on local GPUs.</p>
<p>I discovered <a href="https://www.crewai.com/open-source">CrewAI</a> through this <a href="https://www.udemy.com/course/the-complete-agentic-ai-engineering-course/">Udemy course</a>. This post walks through five levels of increasing complexity, from a toy debate to autonomous software engineering.</p>
<p><a href="https://github.com/shansin/crewai-agents">Github</a></p>
<hr>
<h2>Level 1: The Basics (The Debate Team)</h2>
<p><strong>Theme: Pure Interaction</strong> | <strong><a href="https://github.com/shansin/crewai-agents/tree/main/debate">Code</a></strong></p>
<p>The journey begins with the <strong>Debate</strong> project—the "Hello World" of agent orchestration.</p>
<p>Here, we have three simple agents: two <code>Debaters</code> and a <code>Judge</code>. The complexity is minimal, but the core idea matters: <strong>Role-Playing</strong>.</p>
<ul>
<li><strong>The Setup</strong>: One agent proposes an argument, and the other judges it.</li>
<li><strong>The Feature</strong>: Pure prompt-based personalities. No complex prompt engineering variables required—just <code>role</code>, <code>goal</code>, and <code>backstory</code>.</li>
</ul>
<pre><code class="hljs language-yaml"><span class="hljs-comment"># debate/config/agents.yaml</span>

<span class="hljs-attr">debater:</span>
  <span class="hljs-attr">role:</span> <span class="hljs-string">A</span> <span class="hljs-string">compelling</span> <span class="hljs-string">debater</span>
  <span class="hljs-attr">goal:</span> <span class="hljs-string">Present</span> <span class="hljs-string">a</span> <span class="hljs-string">clear</span> <span class="hljs-string">argument...</span>

<span class="hljs-attr">judge:</span>
  <span class="hljs-attr">role:</span> <span class="hljs-string">Decide</span> <span class="hljs-string">the</span> <span class="hljs-string">winner...</span>
</code></pre>
<p><strong>Key Takeaway</strong>: With just a few lines of YAML, you can create distinct personalities without tools or memory—just pure LLM-to-LLM interaction.</p>
<p><strong>Results:</strong></p>
<pre><code class="hljs language-yaml"><span class="hljs-attr">Topic:</span> <span class="hljs-string">There</span> <span class="hljs-string">needs</span> <span class="hljs-string">to</span> <span class="hljs-string">be</span> <span class="hljs-string">strict</span> <span class="hljs-string">laws</span> <span class="hljs-string">to</span> <span class="hljs-string">regulate</span> <span class="hljs-string">LLMs.</span>
<span class="hljs-attr">Proposer:</span> <span class="hljs-string">gpt-oss:20b</span>
<span class="hljs-attr">Opponent:</span> <span class="hljs-string">deepseek-r1:8b</span>
<span class="hljs-attr">Judge:</span> <span class="hljs-string">llama3.1:8b</span>
</code></pre>
<p>DeepSeek takes the win. Here is Llama's verdict:</p>
<pre><code class="hljs language-text">Final Output: After carefully reviewing the arguments presented by both sides, I have come to the conclusion that the motion "There needs to be strict laws to regulate LLMs" is more convincing.

The proponents of strict regulation present a comprehensive and well-reasoned argument that highlights the numerous risks associated with unregulated LLMs. They convincingly argue that without legal oversight, malicious actors can exploit these systems to manipulate elections, incite violence, or spread pandemics of falsehoods. Furthermore, they emphasize the importance of protecting individual privacy rights by mandating transparent data sourcing, anonymization standards, and user consent.

The opponents' argument, on the other hand, relies heavily on the potential for restrictive regulations to stifle innovation and hinder technological progress. However, I believe that this argument is less compelling due to its vagueness and lack of concrete evidence. The opponents argue that the rapid evolution of LLMs outpaces legislative efforts, rendering strict laws quickly outdated. While this may be true, it does not necessarily follow that strict laws are inherently counterproductive.

In contrast, the proponents provide a detailed analysis of specific risks and propose targeted solutions to address them through legislation. They emphasize the importance of balancing innovation with protection of human rights, democratic integrity, and societal well-being. I am convinced by their argument that strict laws can create a predictable, safe, and equitable environment in which LLMs can deliver their transformative benefits without becoming sources of harm.

Furthermore, I believe that the proponents' emphasis on transparency and accountability through mandatory disclosure requirements is essential for building public trust and ensuring that end-users can scrutinize AI recommendations before accepting them. This approach aligns with the principles of democratic governance and human rights, which should be the foundation of any regulatory framework governing LLMs.

In conclusion, based on the arguments presented, I am convinced that strict laws to regulate LLMs are necessary to safeguard society, ensure accountability, and preserve democratic values. The potential benefits of regulation far outweigh the perceived risks and limitations associated with restrictive legislation.
</code></pre>
<hr>
<h2>Level 2: Safe Code Execution (The Coder)</h2>
<p><strong>Theme: Agency with Guardrails</strong> | <strong><a href="https://github.com/shansin/crewai-agents/tree/main/coder">Code</a></strong></p>
<p>Next, we graduate to the <strong>Coder</strong> project. This is where things get real. An agent that just talks is fun; an agent that <em>does</em> things is useful.</p>
<p>Giving an AI unrestricted access to your terminal is terrifying. CrewAI solves this with a simple feature:</p>
<ul>
<li><strong>The Setup</strong>: The <code>Coder</code> agent can write and execute Python code directly.</li>
<li><strong>The Feature</strong>: <code>code_execution_mode="safe"</code>. The codebase configures the agent to run code inside a <strong>Docker container</strong>.</li>
</ul>
<pre><code class="hljs language-python"><span class="hljs-comment"># coder/crew.py</span>

agent = Agent(
    role=<span class="hljs-string">"coder"</span>,
    allow_code_execution=<span class="hljs-literal">True</span>,
    code_execution_mode=<span class="hljs-string">"safe"</span>, <span class="hljs-comment"># Dockerized safety!</span>
    llm=<span class="hljs-string">"ollama_chat/deepseek-r1:8b"</span>
)
</code></pre>
<p><strong>Key Takeaway</strong>: You can run powerful coding agents (like <code>deepseek-r1</code>) locally without risking your host machine.</p>
<hr>
<h2>Level 3: Connecting to the World (The Financial Researcher)</h2>
<p><strong>Theme: Tool Use</strong> | <strong><a href="https://github.com/shansin/crewai-agents/tree/main/financial_researcher">Code</a></strong></p>
<p>The <strong>Financial Researcher</strong> project introduces <strong>Tools</strong>.</p>
<p>A smart agent is useless if it's cut off from the world. This crew is composed of a <code>Researcher</code> and an <code>Analyst</code>.</p>
<ul>
<li><strong>The Setup</strong>: A workflow where the <code>Researcher</code> searches the web for real-time data, and the <code>Analyst</code> synthesizes that raw data into a markdown report.</li>
<li><strong>The Feature</strong>: <code>SerperDevTool</code>. The agent isn't just hallucinating facts anymore; it's equipped with tools for live Google searches. No more "I'm sorry, my knowledge cutoff is 2021."</li>
</ul>
<p><strong>Key Takeaway</strong>: This demonstrates the classic "Research &amp; Write" pattern, perfect for automating daily briefings.</p>
<hr>
<h2>Level 4: Memory &amp; Structure (The Stock Picker)</h2>
<p><strong>Theme: Advanced Cognition</strong> | <strong><a href="https://github.com/shansin/crewai-agents/tree/main/stock_picker">Code</a></strong></p>
<p>Now we enter the big leagues with the <strong>Stock Picker</strong> project. This introduces two advanced concepts: <strong>Memory</strong> and <strong>Structured Outputs</strong>.</p>
<ul>
<li><strong>The Setup</strong>: The crew uses <code>LongTermMemory</code> (SQLite) to store insights across runs and <code>ShortTermMemory</code> (RAG) to maintain context. It uses local embeddings (<code>nomic-embed-text</code>) to keep everything private.</li>
<li><strong>The Feature</strong>: <code>output_pydantic</code>. Instead of a wall of text, agents return strictly typed Pydantic objects.</li>
</ul>
<pre><code class="hljs language-python"><span class="hljs-comment"># stock_picker/crew.py</span>

<span class="hljs-keyword">class</span> <span class="hljs-title class_">TrendingCompany</span>(<span class="hljs-title class_ inherited__">BaseModel</span>):
    name: <span class="hljs-built_in">str</span>
    ticker: <span class="hljs-built_in">str</span>
    reason: <span class="hljs-built_in">str</span>

<span class="hljs-meta">@task(<span class="hljs-params">output_pydantic=TrendingCompanyList</span>)</span>
<span class="hljs-keyword">def</span> <span class="hljs-title function_">find_trending_companies</span>(<span class="hljs-params">self</span>): ...
</code></pre>
<p><strong>Key Takeaway</strong>: Structured outputs mean you can reliably pipe AI generation into a database or API, because the schema is validated rather than hoped for.</p>
<hr>
<h2>Level 5: The Enterprise (The Engineering Team)</h2>
<p><strong>Theme: Orchestration &amp; Delegation</strong> | <strong><a href="https://github.com/shansin/crewai-agents/tree/main/engineering_team">Code</a></strong></p>
<p>Finally, the <strong>Engineering Team</strong> project. This is the pinnacle of the experiment.</p>
<p>It simulates a full software development lifecycle with specialized roles: <code>Lead</code>, <code>Backend Engineer</code>, <code>Frontend Engineer</code>, and <code>QA</code>.</p>
<ul>
<li><strong>The Setup</strong>: Tasks are chained contextually. The <code>Backend Engineer</code> doesn't start until the <code>Lead</code> finishes the design. <code>QA</code> waits for the code.</li>
<li><strong>The Feature</strong>: Multi-Model Intelligence. Different tasks are routed to different models (<code>gpt-oss:20b</code> for high-level design, <code>qwen3-coder:30b</code> for the heavy lifting).</li>
</ul>
<pre><code class="hljs language-yaml"><span class="hljs-comment"># engineering_team/config/tasks.yaml</span>

<span class="hljs-attr">backend_engineer:</span>
  <span class="hljs-attr">output_file:</span> <span class="hljs-string">output/{module_name}</span>

<span class="hljs-attr">frontend_engineer:</span>
  <span class="hljs-attr">output_file:</span> <span class="hljs-string">output/app.py</span>
</code></pre>
<p><strong>Key Takeaway</strong>: A crew can take an abstract idea and output a fully tested, functional application with frontend and backend, saved directly to disk.</p>
<p><strong>Requirements:</strong></p>
<pre><code>A simple account management system for a trading simulation platform.
The system should allow users to create an account, deposit funds, and withdraw funds.
The system should allow users to record that they have bought or sold shares, providing a quantity.
The system should calculate the total value of the user's portfolio, and the profit or loss from the initial deposit.
The system should be able to report the holdings of the user at any point in time.
The system should be able to report the profit or loss of the user at any point in time.
The system should be able to list the transactions that the user has made over time.
The system should prevent the user from withdrawing funds that would leave them with a negative balance, or
from buying more shares than they can afford, or selling shares that they don't have.
The system has access to a function get_share_price(symbol) which returns the current price of a share, and includes a test implementation that returns fixed prices for AAPL, TSLA, GOOGL.
</code></pre>
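<p>To make the spec concrete, here is a minimal hand-written sketch of the core logic the crew is asked to produce (my own illustration, not the crew's output), including the fixed-price test implementation of <code>get_share_price</code> that the requirements mention:</p>

```python
def get_share_price(symbol: str) -> float:
    """Test implementation returning fixed prices, as the spec allows."""
    return {"AAPL": 170.0, "TSLA": 250.0, "GOOGL": 140.0}[symbol]


class Account:
    def __init__(self, initial_deposit: float) -> None:
        self.initial_deposit = initial_deposit
        self.balance = initial_deposit
        self.holdings: dict[str, int] = {}
        self.transactions: list[str] = []

    def withdraw(self, amount: float) -> None:
        if amount > self.balance:  # no negative balances allowed
            raise ValueError("insufficient funds")
        self.balance -= amount
        self.transactions.append(f"withdraw {amount}")

    def buy(self, symbol: str, qty: int) -> None:
        cost = get_share_price(symbol) * qty
        if cost > self.balance:  # can't buy more than you can afford
            raise ValueError("insufficient funds")
        self.balance -= cost
        self.holdings[symbol] = self.holdings.get(symbol, 0) + qty
        self.transactions.append(f"buy {qty} {symbol}")

    def sell(self, symbol: str, qty: int) -> None:
        if self.holdings.get(symbol, 0) < qty:  # can't sell what you don't hold
            raise ValueError("insufficient shares")
        self.balance += get_share_price(symbol) * qty
        self.holdings[symbol] -= qty
        self.transactions.append(f"sell {qty} {symbol}")

    def portfolio_value(self) -> float:
        return self.balance + sum(get_share_price(s) * q for s, q in self.holdings.items())

    def profit_or_loss(self) -> float:
        return self.portfolio_value() - self.initial_deposit
```

<p>Roughly thirty lines by hand. The interesting part is watching the crew arrive at the same invariants (no negative balances, no naked sells) from the plain-English spec alone.</p>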
<p><strong>Design <a href="https://github.com/shansin/crewai-agents/tree/main/engineering_team/output/accounts.py_design.md">here</a></strong></p>
<p><strong>Final Output:</strong>
<strong>Account Management</strong>
<img src="/images/crew-ai/1.webp" alt="CrewAI"></p>
<p><strong>Trading</strong>
<img src="/images/crew-ai/2.webp" alt="CrewAI"></p>
<p><strong>Portfolio &amp; Transactions</strong>
<img src="/images/crew-ai/3.webp" alt="CrewAI"></p>
<hr>
<h2>The Local Advantage</h2>
<p>What ties all these projects together? <strong>Local Dominance.</strong></p>
<p>Every agent here runs on local hardware using Ollama. Whether it's the 8B parameter model for the debater or the 30B coding specialist for the engineer, the power is entirely in my hands.</p>
<p>This codebase proves that you don't need to choose between simplicity and power. With CrewAI, you can start with a debate and end with a software empire. (Or at least a very productive localhost.)</p>
<hr>]]></content:encoded>
    </item>
    
    <item>
      <title><![CDATA[Building a Local AI Rig in 2025]]></title>
      <link>https://shsin.blog/posts/building-local-ai-powerhouse-2025</link>
      <guid isPermaLink="true">https://shsin.blog/posts/building-local-ai-powerhouse-2025</guid>
      <pubDate>Tue, 25 Nov 2025 00:00:00 GMT</pubDate>
      <dc:creator>Shantanu Singh</dc:creator>
      <category>local-ai</category>
      <description><![CDATA[A dual-GPU workstation built for local LLM inference. 32GB of VRAM, no API leash, and the joy of building a PC.]]></description>
      <content:encoded><![CDATA[<p>It has been roughly 20 years since I last cracked open a PC case to build a machine from scratch. Back then, we were worried about IDE cables and jumper pins; today, the stakes are a bit different. My goal this time wasn't just to browse the web—I wanted to run LLMs locally.</p>
<p>I was looking for a sandbox for toy projects and experimentation without the leash of a monthly subscription to OpenAI or Anthropic. More importantly, I wanted to "get into the weeds": fine-tuning models and understanding the hardware bottlenecks firsthand.</p>
<h3>The "Sensible" Alternative</h3>
<p>When building for AI, the primary gating factor is <strong>VRAM</strong> (GPU memory). To do anything meaningful, 16GB is the floor.</p>
<p>Now, a rational choice is a Mac Mini with 24GB+ of unified memory. It’s efficient, quiet, and fits in a desk drawer. But where’s the fun in being sensible? I wanted a machine that looked the part and gave me the flexibility to swap components when the next breakthrough hits.</p>
<h3>The Build Specs</h3>
<p>To support heavy local inference and future fine-tuning, I landed on a dual-GPU setup that prioritizes memory overhead and core count.</p>
<ul>
<li><strong>GPU 1:</strong> NVIDIA RTX 5070 Ti (16GB)</li>
<li><strong>GPU 2:</strong> NVIDIA RTX 5060 Ti (16GB)</li>
<li><strong>CPU:</strong> AMD Ryzen 9 9950X3D</li>
<li><strong>Motherboard:</strong> Asus ProArt Creator X870E (Crucial for supporting dual GPUs at PCIe 5 x8/x8)</li>
<li><strong>RAM:</strong> 64GB DDR5</li>
</ul>
<table>
<thead>
<tr>
<th>Component</th>
<th>Role</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Total VRAM</strong></td>
<td>32GB (comfortable for quantized ~30B models; 70B-class models only fit at aggressive ~3-bit quantization)</td>
</tr>
<tr>
<td><strong>Logic</strong></td>
<td>The Ryzen 9 9950X3D provides the multi-threading needed to keep the GPUs fed.</td>
</tr>
<tr>
<td><strong>Connectivity</strong></td>
<td>The X870E chipset ensures the second GPU isn't throttled by a narrow data pipe.</td>
</tr>
</tbody>
</table>
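<p>A quick back-of-the-envelope check on that VRAM budget. This is my own rough heuristic, not a precise model: weights take roughly (billions of parameters) × (bits per weight) / 8 gigabytes, plus an assumed ~20% for KV cache and runtime overhead.</p>

```python
def vram_gb(params_billion: float, bits_per_weight: int, overhead: float = 0.20) -> float:
    """Rough VRAM estimate: weight bytes plus a fudge factor for KV cache/runtime."""
    weights_gb = params_billion * bits_per_weight / 8  # 1B params at 8-bit ~ 1 GB
    return round(weights_gb * (1 + overhead), 1)

# 70B at 4-bit (~42 GB) overflows 32GB; a ~3-bit quant (~31.5 GB) just squeezes in,
# while 30B-class models at 4-bit (~18 GB) leave plenty of headroom.
print(vram_gb(70, 4), vram_gb(70, 3), vram_gb(30, 4))
```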
<h3>Why this "Frankenstein" Rig?</h3>
<p>By pairing two 16GB cards, I’ve managed to bypass the massive "VRAM tax" associated with the ultra-high-end 5090s while still hitting a respectable <strong>32GB of total VRAM</strong>.</p>
<p>The choice of the <strong>ProArt Creator X870E motherboard</strong> was a specific technical requirement. Most consumer boards choke the second PCIe slot down to x4 speeds, or don't leave enough physical space for a full-size graphics card; this board keeps the data pipeline wide enough for serious workloads.</p>
<p>It feels good to be back in the BIOS. Now, if you’ll excuse me, I have some local weights to download and some fans to tune. Let the experimentation begin!</p>
<hr>
<p><img src="/images/building-local-ai-powerhouse-2025/dual_gpu_build.webp" alt="Dual GPU Build"></p>]]></content:encoded>
    </item>
    
    <item>
      <title><![CDATA[First Post]]></title>
      <link>https://shsin.blog/posts/firstPost</link>
      <guid isPermaLink="true">https://shsin.blog/posts/firstPost</guid>
      <pubDate>Mon, 24 Nov 2025 00:00:00 GMT</pubDate>
      <dc:creator>Shantanu Singh</dc:creator>
      
      <description><![CDATA[Why I started writing, and what to expect from a blog at the intersection of AI engineering and endurance.]]></description>
      <content:encoded><![CDATA[<p>Two things occupy most of my headspace lately: building AI systems and logging miles.</p>
<p>The systems side is AI engineering. Local inference, multi-agent orchestration, GPU builds, making machines do useful work without renting someone else's cloud. The miles side is endurance. Running, training, the kind of voluntary suffering that teaches you things no tutorial can.</p>
<p>This blog sits at that intersection. Systems and Strides.</p>
<p>I've wanted to start writing for a while. Not because the world needs another blog, but because writing forces clarity. Half-baked ideas have to survive being put into sentences. Some won't. That's the point.</p>
<p>I'll aim for at least one post a month.</p>
<p>No fluff. Just what I'm building, learning, and thinking about.</p>]]></content:encoded>
    </item>
    
  </channel>
</rss>