Researchers from ByteDance introduce Virtual Width Networks (VWN), a method for scaling the representational capacity of large models by decoupling embedding width from backbone compute. By expanding the embedding space while keeping the core hidden size constant, VWN avoids the roughly quadratic cost increase that comes with widening the backbone itself. Large-scale experiments demonstrate that an 8x virtual width expansion accelerates optimization by up to 3x. The results also reveal a log-linear scaling relationship between virtual width and loss reduction, suggesting a new dimension along which large-model efficiency can be improved.
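
The sketch below illustrates the general idea under stated assumptions: the wide "virtual" embedding is connected to the fixed-width backbone through learned down- and up-projections, so attention and FFN compute stay at the original hidden size. The class name `VirtualWidthEmbedding`, the parameter names, and the specific projection scheme are illustrative choices, not the paper's exact implementation.

```python
import torch
import torch.nn as nn


class VirtualWidthEmbedding(nn.Module):
    """Embeddings live in a wide 'virtual' space; the backbone width is unchanged."""

    def __init__(self, vocab_size: int, hidden_size: int, expansion: int = 8):
        super().__init__()
        # Hypothetical realization: an 8x-wider embedding table plus linear
        # projections into and out of the backbone's hidden size.
        self.virtual_width = hidden_size * expansion
        self.embed = nn.Embedding(vocab_size, self.virtual_width)
        self.down = nn.Linear(self.virtual_width, hidden_size, bias=False)
        self.up = nn.Linear(hidden_size, self.virtual_width, bias=False)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # Project the wide embedding down to the backbone width, so the
        # attention/FFN cost is the same as in the unexpanded model.
        return self.down(self.embed(token_ids))

    def to_virtual(self, backbone_out: torch.Tensor) -> torch.Tensor:
        # Map backbone outputs back to the virtual width, e.g. before an
        # output head operating in the wide embedding space.
        return self.up(backbone_out)


if __name__ == "__main__":
    emb = VirtualWidthEmbedding(vocab_size=32_000, hidden_size=256, expansion=8)
    tokens = torch.randint(0, 32_000, (2, 16))   # (batch, seq_len)
    h = emb(tokens)                              # backbone sees shape (2, 16, 256)
    wide = emb.to_virtual(h)                     # (2, 16, 2048) in the virtual space
    print(h.shape, wide.shape)
```

Because only the embedding table and the two projections grow with the expansion factor, the per-layer cost of the transformer backbone is unaffected, which is what allows the virtual width to scale well beyond what a conventional width increase would permit.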