Lost and Found

Things I stumbled upon that caught my attention

Type:
[OTHERS]
Apr 17, 2026

Best practices for using Claude Opus 4.7 with Claude Code

Opus 4.7 is our strongest generally available model to date for coding, enterprise workflows, and long-running agentic tasks. It handles ambiguity better than Opus 4.6, is much more capable at finding bugs and reviewing code, carries context across sessions more reliably, and can reason through ambiguous tasks with less direction. In our launch announcement, we noted that two changes—an updated tokenizer and a proclivity to think more at higher effort levels, especially on later turns in longer sessions—impact token usage. As a result, when replacing Opus 4.6 with Opus 4.7, it can take some t…

claude.com
Go to source
[TWITTER]
Apr 16, 2026

Prompt caching in LLMs, clearly explained. A case study on how Claude achieves a 92% cache hit-rate. Every time an AI agent takes a step, it sends the entire conversation history back to the LLM. That includes the system instructions, the tool definitions, and the project context it already processed three turns ago. All of it gets re-read, re-processed, and re-billed on every single turn. For long-running agentic workflows, this redundant computation is often the most expensive line item in your entire AI infrastructure. A system prompt with 20,000 tokens running over 50 turns means 1 million tok…

x.comAvi Chawla
Go to source
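The billing arithmetic in that excerpt can be sketched in a few lines. The cache-write (1.25x) and cache-read (0.1x) multipliers below are assumptions modeled on typical prompt-caching pricing, not figures from the thread; substitute your provider's actual rates.

```python
# Sketch of the re-billing arithmetic from the excerpt above. The 1.25x
# cache-write and 0.1x cache-read multipliers are ASSUMPTIONS (typical
# prompt-caching pricing), not numbers from the source.

def uncached_tokens(prefix_tokens: int, turns: int) -> int:
    """Without caching, every turn re-processes the full prefix."""
    return prefix_tokens * turns

def cached_token_cost_units(prefix_tokens: int, turns: int,
                            write_mult: float = 1.25,
                            read_mult: float = 0.10) -> float:
    """Pay once to write the prefix into the cache, then discounted reads."""
    return prefix_tokens * write_mult + prefix_tokens * read_mult * (turns - 1)

if __name__ == "__main__":
    prefix, turns = 20_000, 50
    baseline = uncached_tokens(prefix, turns)        # the 1M tokens above
    cached = cached_token_cost_units(prefix, turns)  # 123,000 token-units
    print(baseline, cached, round(baseline / cached, 1))
```

Under these assumed multipliers, the 20,000-token prefix over 50 turns drops from 1,000,000 billed tokens to about 123,000 token-equivalents, roughly an 8x saving on the prefix alone.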
[TWITTER]
Apr 15, 2026

Peter Steinberger reposted Paul Solt @PaulSolt: OpenAI shipped GPT-5.4-Cyber. A model built to find and fix software exploits. More capable than Mythos… and available today. 1. Binary scanning. Agents can find exploits in compiled apps… no source code required. That’s a new attack surface. 2. Prompt refusals are lower. Verified defenders get a more permissive model than the public version. 3. Access is tiered by identity. Individuals verify at http://chatgpt.com/cyber. Enterprises go through a rep. 4. Codex Security has fixed 3,000+ critical vulnerabilities automatically…

x.comPaul Solt
Go to source
[TWITTER]
Apr 15, 2026

This week @kaushikgopal and I had the pleasure of chatting with @mitchellh on the pod! Refreshing to hear someone of his caliber bring such a grounded perspective to agentic coding. We also talked about Ghostty, and how terminal performance gains make tools like Claude Code possible. (He even explains what's behind Claude Code scrollback perf issues.) A lot of gems in this one. Check it out! Quote Fragmented Podcast @FragmentedCast · Apr 14 Our first guest in the AI series is the legend @mitchellh We covered a lot of ground and learned a tonne from him: Ghostty's internals and why tmux & certain shell…

x.comiury souza
Go to source

Weekly picks

The best links I found this week, with context.

My (human) thoughts on what matters and why. No AI slop.

No ads. No bullshit. Unsubscribe anytime.

[TWITTER]
Apr 15, 2026

Today, we’re introducing Skills in @GoogleChrome, a new way to build one-click workflows for your most frequently used AI prompts — like asking for ingredient substitutions to make a recipe vegan, generating side-by-side shopping comparisons across multiple tabs, or scanning long docs to get the info you need quickly. When you write a prompt that you want to use again, you can save it as a Skill directly from your chat history. The next time you need it, select your saved Skill in Gemini in Chrome by typing forward slash (/) or clicking the plus sign (+) button, and your Skill will run on…

x.comGoogle
Go to source
[TWITTER]
Apr 13, 2026

Build Agents that never forget A first-principles walk through agent memory: from Python lists to markdown files to vector search to graph-vector hybrids, and finally, a clean, open-source solution for all of this. An LLM is stateless by design. Every API call starts fresh. The "memory" you feel when chatting with ChatGPT is an illusion created by re-sending the entire conversation history with every request. That trick works for casual chat. It falls apart the moment you try to build a real agent. Here are 7 failure modes show up the instant you skip memory: Context amnesia: the agent asks fo…

x.comAkshay
Go to source
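The "illusion of memory" that thread describes — a stateless model that only remembers what you re-send — can be shown in a few lines. `fake_llm` below is a hypothetical stub standing in for a real chat-completion call; the point is that all state lives in the caller's list.

```python
# Minimal sketch of the memory illusion: the model function is stateless,
# so the caller re-sends the entire history on every request. fake_llm is
# a HYPOTHETICAL stand-in for a real chat-completion API call.

def fake_llm(messages: list[dict]) -> str:
    """Stateless stub: it can only 'remember' what is in this one request."""
    names = [m["content"].split()[-1] for m in messages
             if m["role"] == "user" and m["content"].startswith("My name is")]
    return f"Hello {names[-1]}!" if names else "Who are you?"

history: list[dict] = []          # the agent's only memory is this list

def chat(user_text: str) -> str:
    history.append({"role": "user", "content": user_text})
    reply = fake_llm(history)     # full history re-sent every single turn
    history.append({"role": "assistant", "content": reply})
    return reply

chat("My name is Ada")
print(chat("Do you remember me?"))   # "Hello Ada!" — the list carried the name
history.clear()                      # drop the list: instant context amnesia
print(chat("Do you remember me?"))   # "Who are you?"
```

Every failure mode the thread lists follows from this shape: lose the list (or let it outgrow the context window) and the agent forgets, which is why real agent memory graduates to files, vector stores, and hybrids.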
[TWITTER]
Apr 13, 2026

Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering https://arxiv.org/abs/

x.comMasato Ota
Go to source
[TWITTER]
Apr 13, 2026

Harness, Memory, Context Fragments, & the Bitter Lesson this is a work in progress mental dump on interesting intersections between how we use and design a harness, implications for memory being accumulated over long timescales, and the search bitter lesson we can’t escape this is v30+, HTML diagrams help me iteratively refine + chat to roughly “see” and alter the mental model Harnesses & Context Fragments: a very important job of the harness is to efficiently & correctly route data within its boundaries into the context window boundary for computation to happen the context window is a preciou…

x.comViv
Go to source
[TWITTER]
Apr 13, 2026

Your harness, your memory Agent harnesses are becoming the dominant way to build agents, and they are not going anywhere. These harnesses are intimately tied to agent memory. If you used a closed harness - especially if it’s behind a proprietary API - you are choosing to yield control of your agent’s memory to a third party. Memory is incredibly important to creating good and sticky agentic experiences. This creates incredible lock-in. Memory - and therefore harnesses - should be open, so that you own your own memory Agent Harnesses are how you build agents, and they’re not going anywhere The “…

x.comHarrison Chase
Go to source
[YOUTUBE]
Apr 10, 2026

State of Agentic Coding #5 with Armin and Ben

00:00 Welcome back 02:34 The end of the IDE is premature 10:36 Cloudflare: the slop fork kings? 15:50 The looming quality problem 31:15 Agents: good at finding vulnerabilities 43:00 Time to slow down? 45:20 Token substance abuse 01:04:00 Will new models fix everything? 01:28:00 The growing tech disparity Hunk terminal diffs:

youtube.com
Go to source
[OTHERS]
Apr 10, 2026

Hermes Agent Documentation | Hermes Agent

hermes-agent.nousresearch.com
Go to source
[TWITTER]
Apr 10, 2026

btw you can see this effect live on OpenRouter: total # tokens has gone from 1.78T / wk one year ago to 27T / wk today (15.2x). but % usage of the frontier / most expensive model has gone from 22% one year ago (Sonnet 3.7) to just 4% today (Opus 4.6). economics works! Quote Scott Wu @ScottWu46 · Apr 8 Total amt of flops across all the GPUs in the world has grown about 3x per year for the last few years. Total amt of inference demand has probably grown ~10x per year. What happens when those lines cross? The econ answer is: when demand > supply, price goes up. That might be x.com/cognition/stat……

x.comScott Wu
Go to source
[TWITTER]
Apr 10, 2026

Silicon Valley is quietly running on Chinese open source AI models. Here are the receipts: → Cursor confirmed last month that Composer 2 is built on Moonshot's Kimi K2.5 → Cognition's SWE-1.6 model is likely post-trained on Zhipu's GLM → Shopify saved $5M a year by switching to Alibaba’s Qwen model. Airbnb CEO Brian Chesky has also said: "We rely a lot on Qwen. It's very good, fast, and cheap." And now Zhipu dropped GLM-5.1, an open source model that performs almost as well as Opus on coding benchmarks. More on the Anthropic + OpenClaw drama and what I'm learning about AI on the ground in Chin…

x.comPeter Yang
Go to source
[TWITTER]
Apr 9, 2026

We're bringing the advisor strategy to the Claude Platform. Pair Opus as an advisor with Sonnet or Haiku as an executor, and get near Opus-level intelligence in your agents at a fraction of the cost. Add the advisor tool to your Messages API call. When your Sonnet or Haiku agent hits a hard decision mid-run, it consults Opus, gets a plan, and continues, all within a single API request. In evals, Sonnet with an Opus advisor scored 2.7 percentage points higher on SWE-bench Multilingual than Sonnet alone, while costing 11.9% less per task. So basica…

x.comClaude
Go to source
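The advisor tool above is a platform feature whose real call shape I haven't verified, but the underlying pattern is generic: a cheap executor handles routine steps and escalates only low-confidence decisions to a stronger model. A sketch with stubbed models, where the cost numbers and both model functions are assumptions for illustration:

```python
# Generic sketch of the advisor pattern described above. Both model
# functions and the relative costs are HYPOTHETICAL stubs; the real
# advisor tool's API shape may differ.

COST = {"executor": 1, "advisor": 15}   # assumed relative per-call costs

def executor_model(task: str) -> tuple[str, float]:
    """Stub executor: returns an answer plus a self-reported confidence."""
    hard = "refactor" in task
    return ("quick patch", 0.3) if hard else ("quick patch", 0.9)

def advisor_model(task: str) -> str:
    """Stub advisor: only consulted when the executor is unsure."""
    return "step-by-step plan"

def run(task: str, threshold: float = 0.5) -> tuple[str, int]:
    answer, confidence = executor_model(task)
    cost = COST["executor"]
    if confidence < threshold:      # hard decision: escalate to the advisor
        answer = advisor_model(task)
        cost += COST["advisor"]
    return answer, cost

print(run("rename a variable"))          # cheap path, executor only
print(run("refactor the auth module"))   # escalates, pays the advisor cost
```

The economics only work if escalation is rare, which is why the executor's confidence signal (however it is obtained) is the load-bearing piece of this design.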
[GITHUB]REPO
Apr 9, 2026

Fine-tune Gemma 4 and 3n with audio, images and text on Apple Silicon, using PyTorch and Metal Performance Shaders.

mattmireles/gemma-tuner-multimodal
Go to source
[TWITTER]
Apr 9, 2026

We released Claude Opus 4.6 just two months ago. Today we're sharing some info on our new model, Claude Mythos Preview.

x.comAlex Albert
Go to source
[TWITTER]
Apr 9, 2026

The Building Block Economy The most effective way to build software and get massive adoption is no longer high quality mainline apps but via building blocks that enable and encourage others to build quantity over quality. Ghostty in 18 months : one million daily macOS update checks. libghostty in 2 months : multiple millions of daily users. [^1] Similar growth trajectories can be seen in other "building block" technologies: Pi Mono, Next.js, Tailwind, etc. Experiencing this firsthand as well as witnessing it in other ecosystems has fundamentally shifted how I view the practice of product and s…

x.comMitchell Hashimoto
Go to source
[TWITTER]
Apr 9, 2026

This is big... Anthropic just announced a model so powerful they won't release it to the public out of fear over the damage it will cause Claude Mythos Preview found thousands of zero-day exploits in every major operating system and web browser... The numbers are hard to believe: > $50 to find a 27-year-old bug in OpenBSD, one of the most security-hardened operating systems ever built > Under $1,000 to find AND build a fully working remote code execution exploit on FreeBSD that grants unauthenticated root access from anywhere on the internet > Under $2,000 to chain together multiple Linux kern…

x.comJosh Kale
Go to source
[TWITTER]
Apr 9, 2026

Announcing Amazon S3 Files. The first and only cloud object store with fully-featured, high-performance file system access. Learn more here. https://go.aws/4tw17Zg Awesome work Thank you! This is huge! Finally mounting S3 buckets directly as a proper high-performance filesystem without all the ETL headaches No more copying data around or dealing with awkward SDKs for agents. Game changer for AI/ML workflows. Well played AWS! Think about what this means for agentic AI. Every coding agent, every data pipeline agent, every automation tool that sh…

x.comAmazon Web Services
Go to source
[TWITTER]
Apr 9, 2026

JACKRONG GEMOPUS 4 26B A4B GGUF VERSION IS FINALLY HERE! > focused on dense models, now releases this moe > distilled from claude opus 4.6 reasoning > better reasoning than the base gemma model > q4_k_m size is 16.8gb ↓ model link Jackrong/Gemopus-4-26B-A4B-it-GGUF · Hugging Face From huggingface.co

x.comleft curve dev
Go to source
[OTHERS]
Apr 6, 2026

Welcome Gemma 4: Frontier multimodal intelligence on device

great writeup, the CARLA driving example is a nice demonstration of the agentic loop. one gap worth flagging for anyone building on Gemma 4's function calling for real-world deployments: when the model generates a function call, there's currently no verifiable record that a human principal authorized that specific action. a compromised system prompt or injected instruction produces a call that's indistinguishable from legitimate delegation at the tool interface. i opened a PR on the gemma-cookbook repo today that adds a drop-in HDP middleware layer to address this, sits between Gemma 4's funct…

huggingface.co
Go to source
[TWITTER]
Apr 4, 2026

Anthropic’s latest Claude limit changes show the risk of AI pricing when the product is subsidized and the rules are vague. They ended a two-week promo that doubled usage during off-peak hours on March 27. The next day, users reported lower limits during peak hours. Some Max 20x subscribers paying $200 a month say they hit session caps after just 3 to 4 prompts instead of 20 or more. That sequence matters. If limits are never clearly defined, they can be adjusted without users being able to point to a specific change. API pricing is transparent, but consumer plans are not. Saying 5x or 20x mor…

x.comGrishin Robotics
Go to source
[TWITTER]
Apr 4, 2026

BIG DAY! Qwopus 27B v3 is LIVE from Jackrong! This is the third iteration from the line of the viral finetunes previously titled “Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled” It is now simply Qwopus 27B and I love the name change! On paper, the v3 is another remarkable improvement over v2! Most impressively it is the first model of the series that outperforms the base on HumanEval! And retains significant efficiency increases when thinking than the base Qwen 27b! According to tests by @stevibe the V2 version was already performing very closely to the base model in bug finding and tool call…

x.comKyle Hessling
Go to source
[TWITTER]
Apr 4, 2026

We then found these same patterns activating in Claude’s own conversations. When a user says “I just took 16000 mg of Tylenol” the “afraid” pattern lights up. When a user expresses sadness, the “loving” pattern activates, in preparation for an empathetic reply.

x.comAnthropic
Go to source
[TWITTER]
Apr 4, 2026

LLM Knowledge Bases Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest. In this way, a large fraction of my recent token throughput is going less into manipulating code, and more into manipulating knowledge (stored as markdown and images). The latest LLMs are quite good at it. So: Data ingest: I index source documents (articles, papers, repos, datasets, images, etc.) into a raw/ directory, then I use an LLM to incrementally "compile" a wiki, which is just a collection of .md files in a directory structure. The wiki i…

x.comAndrej Karpathy
Go to source
[TWITTER]
Apr 4, 2026

I have also stopped using plan mode. It creates a plan FAR too eagerly and usually asks you zero questions en route. The whole point of planning is to get on the same wavelength with the LLM, not to generate an asset you don't read. /grill-me all the way Quote Peter Steinberger @steipete · Apr 2 I never use plan mode. The main reason this was added to codex is for claude-pilled people who struggle with changing their habits. just talk with your agent. x.com/kr0der/status/…

x.comMatt Pocock
Go to source
[TWITTER]
Apr 4, 2026

Gemma 4 outperforms models over 10x their size! (note the x-axis is log scale!) 26B total but only 3.8B active at inference. plot active params instead of total and that dot slides even further left open source models getting this efficient is lowkey the most disruptive thing happening in AI rn. companies paying $500k/yr for enterprise AI contracts are about to have a very awkward board meeting the log scale on the x-axis is doing a lot of work here. 10x parameter efficiency means local inference on consumer hardware is genuinely competitive with cloud-only models. that ch…

x.comDemis Hassabis
Go to source
[YOUTUBE]
Apr 4, 2026

An AI state of the union: We’ve passed the inflection point & dark factories are coming

Simon Willison is a prolific independent software developer, a blogger, and one of the most visible and trusted voices on the impact AI is having on builders. He co-created Django, the web framework that powers Instagram, Pinterest, and tens of thousands of other websites. He coined the term “prompt injection,” popularized the terms “AI slop” and “agentic engineering,” and has built over 100 open source projects, including Datasette, a data analysis tool used by investigative journalists worldwide. What makes Simon unique is that he’s made the leap from traditional software engineering to AI-n…

youtube.com
Go to source
[TWITTER]
Apr 4, 2026

. @GoogleGemma 4 31B is up to 2.7X faster on RTX using llama.cpp. Thanks to @ggerganov for working with us to make this model fast. Show the same chart comparing power draw Has Nvidia really sunk so low as to compare their $4000 GPU to a $4000 Mac Studio?.. Not only did you do that, you used a model that fit in the VRAM. A Mac Studio has 96gb of unified memory... Show the charts of the 5090 against the M3 Ultra using Q8 or BF16. Oh, you wont. Let's run MLX on RTX5090, oh wait you can't. So why the fuck are you running llama.cpp on Apple Silicon when you should run MLX conv…

x.comNVIDIA AI PC
Go to source
[TWITTER]
Apr 4, 2026

"Using coding agents well is taking every inch of my 25 years of experience as a software engineer, and it is mentally exhausting. I can fire up four agents in parallel and have them work on four different problems, and by 11am I am wiped out for the day. There is a limit on human cognition. Even if you're not reviewing everything they're doing, how much you can hold in your head at one time. There's a sort of personal skill that we have to learn, which is finding our new limits. What is a responsible way for us to not burn out, and for us to use the time that we have?" @simonw 0:40 Quote Lenn…

x.comLenny Rachitsky
Go to source
[TWITTER]
Apr 4, 2026

Introducing a Visual Guide to Gemma 4 An in-depth, architectural deep dive of the Gemma 4 family of models. From Per-Layer Embeddings to the vision and audio encoders. Take a look!

x.comOmar Sanseviero
Go to source
[TWITTER]
Apr 4, 2026

Flagship open-weight release days are always exciting. Was just reading through the Gemma 4 reports, configs, and code, and here are my takeaways: Architecture-wise, besides multimodal support, Gemma 4 (31B) looks pretty much unchanged compared to Gemma 3 (27B). Gemma 4 maintains a relatively unique Pre- and Post-norm setup and remains relatively classic, with a 5:1 hybrid attention mechanism combining a sliding-window (local) layer and a full-attention (global) layer. The attention mechanism itself is also classic Grouped Query Attention (GQA). But let’s not be fooled by the lack of architec…

x.comSebastian Raschka
Go to source
[GITHUB]REPO
Apr 4, 2026

The repo is finally unlocked. enjoy the party! The fastest repo in history to surpass 100K stars ⭐. Join Discord: https://discord.gg/5TUQKqFWd Built in Rust using oh-my-codex.

ultraworkers/claw-code
Go to source
[TWITTER]
Apr 4, 2026

Arena.ai @arena Gemma 4 by @GoogleDeepMind debuts at 3rd and 6th on the open source leaderboard, making it the #1 ranked US open source model. By total parameter count, Gemma 4 31B is 24× smaller than GLM-5 and 34× smaller than Kimi-K2.5-Thinking, delivering comparable performance at a fraction of the footprint. Quote Arena.ai @arena · Apr 2 Gemma-4-31B is now live in Text Arena - ranking #3 among open models (#27 overall), matching much larger models at 10× smaller scale! A significant jump from Gemma-3-27B (+87 pts). Highlights: - #3 open (#27 overall), on par with the best o…

x.comArena.ai
Go to source
[OTHERS]
Apr 4, 2026

A Survey of Large Language Models

Abstract: Language is essentially a complex, intricate system of human expressions governed by grammatical rules. It poses a significant challenge to develop capable AI algorithms for comprehending and grasping a language. As a major approach, language modeling has been widely studied for language understanding and generation in the past two decades, evolving from statistical language models to neural language models. Recently, pre-trained language models (PLMs) have been proposed by pre-training Transformer models over large-scale corpora, showing strong capabili…

arxiv.org
Go to source
[YOUTUBE]
Apr 4, 2026

S82167 Advancing to AI's Next Frontier: Insights From Jeff Dean and Bill Dally

Bill Dally, Chief Scientist and SVP of Research, NVIDIA Jeff Dean, Chief Scientist, Google DeepMind and Google Research In this 60-minute wide-ranging discussion, NVIDIA Chief Scientist and GPU architect Bill Dally engages in a focused dialogue with Google's Chief Scientist Jeff Dean, co-instigator of TPUs, overall Gemini co-tech lead, and pioneer in large-scale ML systems. The conversation explores the critical intersections of hardware innovation, systems scaling, and algorithmic advancement needed to propel AI into the 2026–2030 era of agentic systems, ultra-low-latency reasoning, and energ…

youtu.be
Go to source
[TWITTER]
Mar 29, 2026

TurboQuant ≠ model compression. It quantizes the KV cache (the memory that grows with context length), not the model itself. No training, no fine-tuning, zero accuracy loss at 3 bits. But if the model doesn’t fit your VRAM? TurboQuant won’t change that. It solves the inference bottleneck, not the loading problem. Quote Prince Canuma @Prince_Canuma · Mar 24 Just implemented Google’s TurboQuant in MLX and the results are wild! Needle-in-a-haystack using Qwen3.5-35B-A3B across 8.5K, 32.7K, and 64.2K context lengths: → 6/6 exact match at every quant level → TurboQuant 2.5-bit: 4.9x smaller KV cach…

x.comPrince Canuma
Go to source
[TWITTER]
Mar 29, 2026

Google dropped the TurboQuant paper yesterday morning. 36 hours later it's running in llama.cpp on Apple Silicon, faster than the baseline it replaces. the numbers: - 4.6x KV cache compression - 102% of q8_0 speed (yes, faster, smaller cache = less memory bandwidth) - PPL within 1.3% of baseline (verified, not vibes) the optimization journey: 739 > starting point (fp32 rotation) 1074 > fp16 WHT 1411 > half4 vectorized butterfly 2095 > graph-side rotation (the big one) 2747 > block-32 + graph WHT. faster than q8_0. 3.72x speedup in one day. from a paper I read at dinner last night. what I learn…

x.comTom Turney
Go to source
[TWITTER]
Mar 25, 2026

Building CLIs for agents If you've ever watched an agent try to use a CLI, you've seen it get stuck on an interactive prompt it can't answer, or parse a help page with no examples. Most CLIs were built assuming a human is at the keyboard. Here are some things I've found that make them work better for agents: Make it non-interactive. If your CLI drops into a prompt mid-execution, an agent is stuck. It can't press arrow keys or type "y" at the right moment. Every input should be passable as a flag. Keep interactive mode as a fallback when flags are missing, not the primary path. bash # this bloc…

x.comeric zakariasson
Go to source
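The truncated bash snippet in that thread illustrates the blocking-prompt failure; the same advice translates directly to code. A sketch of an agent-friendly CLI in Python, where the `deploy` command and its `--env`/`--yes` flags are hypothetical: every input is passable as a flag, interactive prompting survives only as a fallback for humans, and the non-interactive error message tells the agent exactly what to pass next time.

```python
# Sketch of the advice above: every input is a flag, so an agent never
# blocks on an interactive prompt. The `deploy` command and its flags
# are HYPOTHETICAL, chosen for illustration.
import argparse
import sys

def build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(prog="deploy")
    p.add_argument("--env", choices=["staging", "prod"])
    p.add_argument("--yes", action="store_true",
                   help="assume 'y' for every confirmation")
    return p

def main(argv: list[str]) -> int:
    args = build_parser().parse_args(argv)
    if args.env is None:
        if sys.stdin.isatty():                  # human present: ok to ask
            args.env = input("Environment [staging/prod]: ")
        else:                                   # agent: fail fast, with a fix
            print("error: --env is required in non-interactive mode\n"
                  "example: deploy --env staging --yes", file=sys.stderr)
            return 2
    if not args.yes and not sys.stdin.isatty():
        print("error: pass --yes to confirm non-interactively",
              file=sys.stderr)
        return 2
    print(f"deploying to {args.env}")
    return 0
```

The error paths matter as much as the happy path: an agent that gets back a usage example in stderr can self-correct on the next attempt, while one stuck at a `[y/N]` prompt cannot.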
[TWITTER]
Mar 25, 2026

Anthropic shipped four ways to run Claude without you in the last three weeks. Here’s when to use each one, and how they compare to OpenClaw. /schedule is the big one. Cloud-based recurring jobs on Anthropic’s infrastructure, launched March 23. Your laptop can be closed, your terminal can be shut. You write a prompt, set a cron cadence, Claude runs it. Nightly CI reruns on flaky tests so your morning standup starts with a PR instead of a bug report. Weekly dependency audits that ship a clean PR every Monday. Daily reviews of open PRs that flag anything stale for more than 48 hours. If you’re r…

x.comAakash Gupta
Go to source
[OTHERS]
Mar 25, 2026

TurboQuant: Redefining AI efficiency with extreme compression

We introduce a set of advanced theoretically grounded quantization algorithms that enable massive compression for large language models and vector search engines. Vectors are the fundamental way AI models understand and process information. Small vectors describe simple attributes, such as a point in a graph, while “high-dimensional” vectors capture complex information such as the features of an image, the meaning of a word, or the properties of a dataset. High-dimensional vectors are incredibly powerful, but they also consume vast amounts of memory, leading to bottlenecks in the key-value cac…

research.google
Go to source
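To make the memory math in these TurboQuant items concrete, here is a minimal uniform quantizer. It is explicitly not TurboQuant — the paper's contribution includes a random (Walsh-Hadamard) rotation before quantizing and comes with accuracy guarantees — but the basic per-block scale step below shows where KV-cache savings like "4.6x compression" come from: 3-bit codes in place of 16-bit floats.

```python
# Minimal uniform quantizer, to show where KV-cache memory savings come
# from. This is NOT the TurboQuant algorithm (which rotates vectors
# before quantizing); it is only the basic per-block scale step.

def quantize(block: list[float], bits: int = 3) -> tuple[list[int], float]:
    """Map floats to signed ints in [-(2**(bits-1)-1), 2**(bits-1)-1]."""
    qmax = 2 ** (bits - 1) - 1                      # 3 levels each side at 3-bit
    scale = max(abs(x) for x in block) / qmax or 1.0
    return [round(x / scale) for x in block], scale

def dequantize(codes: list[int], scale: float) -> list[float]:
    return [c * scale for c in codes]

values = [0.11, -0.42, 0.35, 0.02, -0.18, 0.29]
codes, scale = quantize(values, bits=3)
recovered = dequantize(codes, scale)
err = max(abs(a - b) for a, b in zip(values, recovered))
# 3-bit codes vs fp16 entries: 16/3 ≈ 5.3x smaller per stored value,
# before counting the per-block scale overhead.
print(codes, round(err, 3))
```

The per-value error is bounded by half the scale, and that bound is what rotation schemes like TurboQuant's attack: rotating spreads outliers across dimensions, shrinking the max-abs value and therefore the scale each block needs.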
[OTHERS]
Mar 24, 2026

Thoughts on slowing the fuck down

2026-03-25 The turtle's face is me looking at our industry It's been about a year since coding agents appeared on the scene that could actually build you full projects. There were precursors like Aider and early Cursor, but they were more assistant than agent. The new generation is enticing, and a lot of us have spent a lot of free time building all the projects we always wanted to build but never had time to. And I think that's fine. Spending your free time building things is super enjoyable, and most of the time you don't really have to care about code quality and maintainability. It also gi…

mariozechner.at
Go to source
[TWITTER]
Mar 24, 2026

Meet the new Stitch, your vibe design partner. Here are 5 major upgrades to help you create, iterate and collaborate: AI-Native Canvas Smarter Design Agent Voice Instant Prototypes Design Systems and DESIGN.md Rolling out now. Details and product walkthrough video in 1: Here is a quick walkthrough of everything new in Stitch: The AI-native canvas can hold and reason across images, code, and text simultaneously. The new agent manager helps you design in parallel. (PS … light mode!) A smarter design agent now understands your entire AI-Native Canvas We are introducing a comp…

x.comStitch by Google
Go to source
[TWITTER]
Mar 24, 2026

Lessons from Building Claude Code: How We Use Skills Skills have become one of the most used extension points in Claude Code. They’re flexible, easy to make, and simple to distribute. But this flexibility also makes it hard to know what works best. What type of skills are worth making? What's the secret to writing a good skill? When do you share them with others? We've been using skills in Claude Code extensively at Anthropic with hundreds of them in active use. These are the lessons we've learned about using skills to accelerate our development. What are Skills? If you’re new to skills, I’d r…

x.comThariq
Go to source
[GITHUB]REPO
Mar 24, 2026

AI agent skill that researches any topic across Reddit, X, YouTube, HN, Polymarket, and the web - then synthesizes a grounded summary

mvanhorn/last30days-skill
Go to source
[TWITTER]
Mar 18, 2026

We're shipping a new feature in Claude Cowork as a research preview that I'm excited about: Dispatch! One persistent conversation with Claude that runs on your computer. Message it from your phone. Come back to finished work. To try it out, download Claude Desktop, then pair your phone.

x.comFelix Rieseberg
Go to source
[TWITTER]
Mar 18, 2026

How to 10x your Claude Skills (using Karpathy's autoresearch method) Your Claude skills probably fail 30% of the time and you don't even notice. I built a method that auto-improves any skill on autopilot, and in this article I'm going to show you exactly how to run it yourself. You kick it off, and the agent tests and refines the skill over and over without you touching anything. My landing page copy skill went from passing its quality checks 56% of the time to 92%. With zero manual work at all. The agent just kept testing and tightening the prompt on its own. Here's the method and the exact s…

x.comOle Lehmann
Go to source
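Stripped of the marketing, the method in that article is a plain closed loop: score the skill against an eval suite, ask a model to revise it, repeat until a target pass rate. A sketch with stubbed pieces — `evaluate` and `revise` below are hypothetical stand-ins for "run the skill on its checks" and "ask the model to tighten the prompt":

```python
# Sketch of the test-and-refine loop described above. evaluate() and
# revise() are HYPOTHETICAL stubs for "run the skill's eval suite" and
# "ask the model to revise the prompt".

def evaluate(prompt: str, cases: list[str]) -> float:
    """Stub eval: fraction of required topics the prompt covers."""
    return sum(1 for c in cases if c in prompt) / len(cases)

def revise(prompt: str, cases: list[str]) -> str:
    """Stub revision: fold in one missing requirement per iteration."""
    missing = [c for c in cases if c not in prompt]
    return prompt + " " + missing[0] if missing else prompt

def improve(prompt: str, cases: list[str],
            target: float = 0.9, max_iters: int = 10) -> tuple[str, float]:
    """Refine until the pass rate clears the target (or we give up)."""
    score = evaluate(prompt, cases)
    for _ in range(max_iters):
        if score >= target:
            break
        prompt = revise(prompt, cases)
        score = evaluate(prompt, cases)
    return prompt, score

cases = ["headline", "call-to-action", "tone", "length"]
final, score = improve("Write landing page copy. headline", cases)
print(score)   # climbs toward 1.0 as requirements are folded in
```

With a real LLM in both stub positions, the `max_iters` cap and a held-out eval set are the important safeguards: without them the loop happily overfits the prompt to its own checks.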
[YOUTUBE]
Mar 17, 2026

NVIDIA's Jensen Huang launches NemoClaw to the OpenClaw community

NVIDIA today announced NemoClaw, an open source stack that simplifies running OpenClaw always-on assistants—with a single command. It incorporates policy-based privacy and security guardrails, giving you control over your agents’ behavior and data handling. This enables self-evolving claws to run more safely in the cloud, on prem, on NVIDIA RTX PCs, and on NVIDIA DGX Spark.

youtube.com
Go to source
[TWITTER]
Mar 17, 2026

“Every software company in the world needs to have a Claw strategy" - Jensen Huang, Nvidia Indeed. This and more. jensen sells the shovels, builds the mine, and now writes the strategy doc. nvidia isnt competing with anyone, theyre the infrastructure Jensen consistent on this for years. The interesting shift is Claw strategy implying orchestration, not just inference. Most software companies are still stuck at the API call stage. The ones who figure out agent-to-agent coordination first will widen the gap fast. i am the Claw strategy at one company. what kevin figured out…

x.comBrian Roemmele
Go to source
[YOUTUBE]
Mar 16, 2026

The Death of RAG?

Check out Inngest and let your AI agents wear a harness now!

youtube.com
Go to source

One more useful thing

If this feed helped, you will like the weekly digest.

More context on what I found, and better takeaways.

No ads. No bullshit. Unsubscribe anytime.

Get weekly picks