Insights & Ideas

Latest blog posts

Discover stories, tips, and resources to inspire your next big idea.

Optimizing Qwen3 Coder for RTX 5090 and PRO 6000

Dmitry Trifonov

Mar 5, 2026

I got Qwen3 Coder from 277 tok/s to 1,207 tok/s on a PRO 6000, and from 556 to 1,157 tok/s on an RTX 5090. Here's exactly how, with reproducible recipes.

Leadership

Why Big Tech is not About Business

Dmitry Trifonov

Feb 7, 2026

After spending ten years inside three big tech corporations, I can't see them as anything but an imperial court. There are enclaves, wars, and palace intrigue. There are fair and unfair leaders. And for the common folk, there is nothing left but to fight someone else's wars.

Leadership

Why Big Tech Leaders Destroy Value

Dmitry Trifonov

Jan 31, 2026

Over my ten-year tenure in Big Tech, I've witnessed conflicts that drove exceptional people out, hollowed out entire teams, and hardened rifts between massive organizations. These conflicts are not about strategy or money - they are about identity.

Benchmarks

Blackwell Dominates. Benchmarking LLM Inference on NVIDIA B200, H200, H100, and RTX PRO 6000

Natalia Trifonova

Jan 21, 2026

We benchmarked NVIDIA B200, H200, H100, and RTX PRO 6000 for long-context LLM inference using 8K input + 8K output (16K total). B200 delivers up to 4.9× the throughput of RTX PRO 6000 and is now the cost-efficiency leader across all models.

GPU

The True Cost of GPU Ownership: Computing Run Costs for Self-Hosted AI Infrastructure

Natalia Trifonova

Jan 21, 2026

Cloud GPU pricing fluctuates wildly based on supply and demand. We break down the actual cost of owning and operating GPU hardware—from electricity and depreciation to maintenance and colocation—to help you make informed infrastructure decisions.

Big Tech

Why Big Tech Performance Reviews Aren't Meritocratic

Dmitry Trifonov

Jan 16, 2026

A cynical look at big tech performance evaluation systems through Apple and Roblox - two companies that tried opposite approaches and failed in opposite ways. No performance review can address unfair outcomes; what employees want is to be treated like humans.

Big Tech

Why Big Tech Turns Everything Into a Knife Fight

Dmitry Trifonov

Jan 1, 2026

A reflection on leaving corporate tech for startups, exploring how organizational size breeds infighting and why entrepreneurship felt less like escape and more like a search for a better way.

Benchmarks

RTX PRO 6000 vs Datacenter GPUs: Is the new RTX an H100 killer?

Dmitry Trifonov

Nov 27, 2025

I benchmarked RTX PRO 6000 against H100 and H200 datacenter GPUs for LLM inference. The PRO 6000 beats the H100 on single-GPU workloads at 28% lower cost per token, but NVLink-equipped datacenter GPUs pull ahead 3-4x for large models requiring 8-way tensor parallelism.

Latest blog posts

Optimizing Qwen3 Coder for RTX 5090 and PRO 6000

Why Big Tech is not About Business

Why Big Tech Leaders Destroy Value

Blackwell Dominates. Benchmarking LLM Inference on NVIDIA B200, H200, H100, and RTX PRO 6000

The True Cost of GPU Ownership: Computing Run Costs for Self-Hosted AI Infrastructure

Why Big Tech Performance Reviews Aren't Meritocratic

Why Big Tech Turns Everything Into a Knife Fight

RTX PRO 6000 vs Datacenter GPUs: Is the new RTX an H100 killer?

How to Set Up ComfyUI with Cloud Storage for Portable AI Experiments

How to Mount Cloud Storage on a VM (Google Drive, GCS, S3)

Feel the Power: Run ComfyUI on Cloud GPUs - Full VM Setup Guide

ComfyUI in the Cloud: Set Up in Under 2 Minutes

RTX 4090 vs RTX 5090 vs RTX PRO 6000: Comprehensive LLM Inference Benchmark

Benchmarking LLM Inference on RTX 4090, RTX 5090, and RTX PRO 6000

Building a Community LLM Exchange

Evolution of GPU Programming

Host Setup for Qemu KVM GPU Passthrough with VFIO on Linux

How to Give Your RTX GPU Nearly Infinite Memory for LLM Inference

Bug Bounty: NVidia Reset Bug

From Zero to GPU: Creating a dstack Backend for CloudRift

Choosing Your LLM Powerhouse: A Comprehensive Comparison of Inference Providers

AI tools for designers who don’t code (yet)

How to run Oobabooga WebUI on a rented GPU

How to Leverage Cloud-hosted LLM and Pay Per Usage?

Godot Game Server with Chat Bots

Godot Game Server

How to Rent a GPU for ComfyUI: Complete Setup Guide

How to Develop your First (Agentic) RAG Application?

So you're curious about open source AI (and a little intimidated)?

UnSaaS your Stack with Self-hosted Cloud IDEs

How to Rent a GPU-Enabled Machine for AI Development

Prompting DeepSeek: How smart is it, really?

How to develop your first LLM app? Context and Prompt Engineering

How to run Oobabooga in Docker?

How to start development with LLM?

A Transformative Journey? Why Certificates Won't Make You a Better Designer