DEV Community


soy

Patent lawyer turned AI engineer. Processed 4M patents with a local LLM on an RTX 5090. Building PatentLLM, an AI-powered patent search. Also ranked #1 on Floodgate (shogi AI). Writing about local LLMs and related topics.

How Google Finds Every Restaurant in Japan — And Why Your Full-Text Search Can't

5 min read

OpenAI Acquires Astral (uv / Ruff) — What It Really Means

5 min read
The Technical Debt Local AI Must Fix Before It's Too Late — What NemoClaw Says About NVIDIA's Philosophy

17 min read
Punching Through NVIDIA NemoClaw's Sandbox to Hit Local vLLM on RTX 5090

4 min read
Why Google Wasn't Indexing My FastAPI Site — The HEAD Request Trap

2 min read
vLLM vs TensorRT-LLM vs Ollama vs llama.cpp — Choosing the Right Inference Engine on RTX 5090

7 min read
Using Python to Load Google Docs into AI — Drive API Minimal Permission Setup

5 min read
Hardware Selection for Local LLMs: Overcoming the VRAM Wall with Practical GPU, CPU, and Memory Configurations

6 min read
What I Gained from Interacting with Shogi AI: The Path to 1st Place in Floodgate and My Approach to Distilled Models

3 min read
Turn Conversation Data into Assets with Gemini API: History Export, RAG, and Streamlit

8 min read
Automating Video Generation with Remotion and VOICEVOX: From Environment Setup to Performance Optimization

8 min read
Cloudflare Tunnel Practical Guide: Securely Exposing a Home AI Server Without Port Forwarding

6 min read
Automated Google Drive Backup with Rclone: Headless OAuth Authentication and systemd Configuration

7 min read
Claude Code Practical Guide: Debugging, Test Automation, and CUDA Environment Setup with Opus 4.6

4 min read
I Posted My Patent Search AI to Reddit r/LocalLLaMA and Got 65 Upvotes and Over 20 Questions

5 min read
Coders at Work — Index of All 15 Programmer Interviews

7 min read
RTX 5090 + Nemotron 9B on vLLM — Benchmarks & TRT-LLM Comparison

2 min read
Talent Blooms When You Stop Relying on "Motivation": 7 Insights on the "Spring Mind" Left by Genius Mathematician Kiyoshi Oka

6 min read
Three Months of Code: What a Patent Lawyer Built from Zero

5 min read
I Built a Free Patent Search Engine with 3.5M US Patents — No Login, Powered by SQLite FTS5

3 min read
Operational Techniques for Automatically Starting vLLM, Flask, and cron with systemd Services in WSL2

3 min read
Achieving Bidirectional Integration of a Streamlit Backend and Flutter Frontend in a WSL2 Environment

2 min read
A Regulatory Analysis Dashboard for Fast Searching NITE CHRIP Data using FTS5

2 min read
Searching Case Law PDFs with RAG — A Legal AI Search System using Gemini + SQLite FTS5

3 min read
google-generativeai → google-genai Migration Guide

2 min read
Gemini 2.5 Flash x Nemotron 9B — Optimal Division of Roles for Cloud LLM and Local LLM

3 min read
Reduce API Costs for Large-Scale Document Analysis with Gemini Context Caching

2 min read
Skit: The Man Obsessed with Claude Code

3 min read
Building a Free Research Agent with DuckDuckGo Search + Local LLM

2 min read
A Daily Report System to Automatically Aggregate Claude Code + Gemini CLI Usage History Every Morning with Cron

2 min read
Reducing Token Consumption in Claude Code — FTS5 Knowledge DB + Tiered Index Design

2 min read
Implementing Stripe Checkout Billing in PatentLLM

2 min read
Building a 5-in-1 App with Local LLM and Flutter

2 min read
Leveraging Claude Code's MCP Server

2 min read
LoRA and FT Are Unnecessary: How to Approach Distilled Models

2 min read
Lineage of OSS Supporting the AI Development Stack: Its Origins and Creators

6 min read
Running NVIDIA Nemotron-Nano-9B-v2-Japanese Locally: Mamba SSM + Thinking Mode Support

2 min read
Strategic Data Organization Techniques Using SQLite, JSONL, XML, and TSV: Lessons

3 min read
Shogi AI with RTX 5090 — Record of TensorRT FP8 Quantization and Floodgate Practical Games

2 min read
Practical Guide to Running Nemotron-Nano-9B-v2-Japanese with vLLM and Integrating it into Your Custom Application via an Open...

6 min read
Python Environment Management with uv: Introduction and Practical Use of a High-Speed Package Manager Replacing pip/venv

3 min read
Automatically Prevent Port Conflicts and Dangerous Commands Proactively with Claude Code's Hooks Feature

2 min read
Giving a 'Brain' to Minecraft NPCs with a Local LLM — Nemotron + Mineflayer Implementation Notes

3 min read
Exposing Multiple Web Applications from a Home Server with Cloudflare Tunnel + Caddy

2 min read
Personal AI Development Environment Built with RTX 5090 + WSL2 — A Practical Setup Fully Utilizing 32GB GPU

2 min read
Individual Developer's Portfolio Strategy: Running 13 Projects on a Single RTX 5090

2 min read
Using Local LLMs as a "Batch Processing Engine" — A Design for Automatically Generating Artifacts from Your Own Data

10 min read
Fast Searching 4 Million Patent Records with FTS5

2 min read