📝Daily
Day 8
Tony Duong
Mar 18, 2026 · 1 min
#ai #coding #benchmarks #mit #anthropic #ddia #databases #partitioning #collections
Watched: MIT, Anthropic, and New Benchmarks Just Revealed AI's Biggest Coding Limits (from ~3:26).
Research from MIT and Anthropic highlights where AI coding still falls short: benchmarks like SWE-Bench can be gamed, and models that score well on them often fail on other programming languages or on real-world tasks. Automatic evaluation tends to overestimate performance: agents produce code that passes the tests but has formatting, linting, or test-coverage problems that a human reviewer would catch. So despite bold industry claims, the actual limits of AI coding and its productivity gains remain unclear.
Today, I:
- read DDIA Chapter 6: Partitioning — partition strategies (key range vs hash), secondary indexes (local vs global), rebalancing, and request routing
- added collections to this website to better organize some parts
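
To make the key range vs hash distinction from the chapter concrete, here is a minimal Python sketch. The function names, the boundary list, and the choice of MD5 are illustrative assumptions, not something taken from the book, and the hash version uses plain `hash mod N`, which is the simplest form and the one that makes rebalancing expensive when the partition count changes.

```python
import bisect
import hashlib


def key_range_partition(key: str, boundaries: list[str]) -> int:
    """Assign a key to a partition by sorted range boundaries.

    `boundaries` is a sorted list of split points (hypothetical values).
    Partition i holds keys k with boundaries[i-1] <= k < boundaries[i].
    """
    return bisect.bisect_right(boundaries, key)


def hash_partition(key: str, num_partitions: int) -> int:
    """Assign a key to a partition by hashing it.

    Uses MD5 only as a stable, well-distributed hash (an assumption for
    this sketch); Python's built-in hash() is randomized per process.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


# Hypothetical split points: partition 0 is < "h", 1 is "h".."p", 2 is >= "p".
BOUNDARIES = ["h", "p"]

if __name__ == "__main__":
    for k in ["apple", "apricot", "house", "zebra"]:
        print(k, key_range_partition(k, BOUNDARIES), hash_partition(k, 3))
```

With key ranges, adjacent keys such as `apple` and `apricot` land in the same partition, which keeps range scans cheap but risks hot spots. Hashing spreads keys more evenly at the cost of efficient range queries.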