arXiv
arXiv is the preeminent open-access repository for scientific preprints, hosting over 2.5 million scholarly articles across physics, mathematics, computer science, and related fields. Operated by Cornell University, arXiv has become the primary distribution channel for AI and machine learning research.
Nearly every significant advance in modern AI — from the original Transformer paper to landmark work on scaling laws, RLHF, and diffusion models — was first published on arXiv. The platform enables immediate, free access to cutting-edge research, accelerating the pace of AI development far beyond what traditional peer-reviewed journals could support.
As a knowledge asset, arXiv represents one of the most concentrated repositories of AI research in existence, making it a critical data source for training the next generation of AI models and a key reference for understanding the rapidly evolving field.