Complete methodology for local LLM performance optimization. Core principle: maximize context while fully covering GPU memory — find the sweet spot where GPU...
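To make the core principle concrete, here is a minimal back-of-the-envelope sketch of how much context fits once the model weights are resident in VRAM. It assumes a standard transformer KV cache (one key and one value vector per layer, per KV head, per token) and a fixed overhead reserve. The function names, the example model shape, and the 1 GiB overhead figure are hypothetical illustrations, not values taken from this methodology; real runtimes add compute buffers and fragmentation, so treat the result as an upper bound to verify by measurement.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def kv_bytes_per_token(n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Approximate KV-cache bytes needed for one token.

    Each layer stores one key and one value vector of size
    n_kv_heads * head_dim, hence the leading factor of 2.
    bytes_per_elem=2 assumes an fp16 cache (an assumption).
    """
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem


def max_context_tokens(vram_bytes, weight_bytes, overhead_bytes, per_token_bytes):
    """Largest token count whose KV cache fits in the VRAM left after weights.

    Returns 0 when the weights plus overhead already exceed VRAM,
    i.e. the model cannot be fully resident on the GPU.
    """
    free = vram_bytes - weight_bytes - overhead_bytes
    if free <= 0:
        return 0
    return free // per_token_bytes


# Hypothetical example: 32 layers, 8 KV heads, head_dim 128, fp16 cache,
# on a 24 GiB card with 16 GiB of weights and 1 GiB reserved as overhead.
per_token = kv_bytes_per_token(n_layers=32, n_kv_heads=8, head_dim=128)
budget = max_context_tokens(24 * GIB, 16 * GIB, 1 * GIB, per_token)
```

With these assumed numbers, `per_token` is 131072 bytes (128 KiB) and `budget` is 57344 tokens. In practice the estimate is only a starting point: raise or lower the context setting from there while watching actual GPU memory use.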