The application of engineering materials impacts every facet of society, through manufacturing, energy, transportation, medicine, and more. One of the most important trends in materials science is the emergence of atomic-scale characterization of materials. It is now possible to capture an image with atomic-scale resolution for a piece of material nearly large enough to be seen by the naked eye. In describing the vast datastreams that can be generated, the numbers are literally astronomical: a grain of sand contains more atoms than there are stars in the universe! But what we can learn from this approach has been limited until now by our ability to work with the datastreams. At the same time, data science has emerged as a new discipline, including powerful machine learning and AI, advanced statistical methods, and cyberinfrastructure such as cloud-based data storage and management. These tools, and the fundamental “big data” problems of volume, variety, velocity, and veracity, apply in any “data-rich, information-poor” domain, but nowhere more so than in the science of materials.
Industry and national labs employ the majority of STEM graduates with advanced degrees. Companies that make, use, and apply materials, and national labs in the energy and defense sectors, are now beginning to demand graduates with data science expertise, along with computational and experimental training in materials science. But there are no formal programs in US universities that prepare students in this converging area. DIGI-MAT is a graduate traineeship built on the recognition that characterization of materials is no longer a data-limited endeavor. The DIGI-MAT vision is that soon, scientists who study materials – physicists, chemists, mechanical engineers, materials scientists – will approach materials problems as data problems, and the understanding of materials as a challenge in generating, curating, managing, and parsing massive datastreams.