The Evaluated Nuclear Structure Data File (ENSDF) serves as the cornerstone database for nuclear structure and decay data, underpinning research in nuclear physics, energy, medicine, and astrophysics. However, its legacy 80-column ASCII format, established decades ago, poses significant challenges for data accessibility, interoperability, and scalability in today's data-intensive research environment. This work addresses the need to modernize ENSDF by analyzing its historical limitations, reviewing international modernization initiatives, and presenting a comprehensive framework that integrates advanced data formats, database technologies, machine learning (ML), and interactive visualization.
The primary objectives of this study are twofold: first, to articulate a pathway for transforming ENSDF into a resource that complies with the FAIR (Findable, Accessible, Interoperable, Reusable) principles through structural and technological upgrades; and second, to demonstrate the scientific potential of modernized nuclear data by applying ML methods to predict decay properties of superheavy nuclei.
Methodologically, we propose and implement a multi-faceted modernization framework. This includes the adoption of JSON Schema to replace rigid column-based records with hierarchical, self-describing, and machine-readable data structures. We introduce CouchDB, a document-oriented database, to natively support diverse data types and enable efficient querying via precomputed views. Additionally, we discuss the integration of machine learning both as a tool for enhancing database maintenance (for example, automated PDF table extraction and anomaly detection) and as a predictive engine for nuclear properties. Inspired by international efforts like the modernized NuDat interface, we also develop a localized visualization platform using Dash and Plotly, enabling interactive exploration of nuclide charts and level schemes based on locally parsed ENSDF data.
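To illustrate the move from column-based records to hierarchical, self-describing structures, the following minimal sketch converts one 80-column level record into a nested JSON document. It is not the parser used in this work. The column ranges in `LEVEL_FIELDS` reflect our reading of the ENSDF level-record layout and should be checked against the format manual; the JSON key names, the helper `make_record`, and the sample values are hypothetical and chosen only for illustration.

```python
import json

# Assumed 1-based, inclusive column ranges for an ENSDF level ("L") record.
# Verify against the ENSDF format manual before any real use.
LEVEL_FIELDS = {
    "nucid": (1, 5),
    "energy": (10, 19),
    "energy_unc": (20, 21),
    "spin_parity": (22, 39),
    "half_life": (40, 49),
    "half_life_unc": (50, 55),
}


def parse_level_record(line: str) -> dict:
    """Convert one fixed-width level record into a nested dict."""
    line = line.ljust(80)
    if line[7] != "L":  # column 8 holds the record type
        raise ValueError("not a level record")
    raw = {k: line[a - 1:b].strip() for k, (a, b) in LEVEL_FIELDS.items()}
    return {
        "nuclide": raw["nucid"],
        "level": {
            # float() only handles purely numeric energies; real records
            # may contain forms that need dedicated handling.
            "energy": {
                "value": float(raw["energy"]),
                "unit": "keV",
                "uncertainty": raw["energy_unc"] or None,
            },
            "spin_parity": raw["spin_parity"] or None,
            "half_life": {
                "text": raw["half_life"] or None,
                "uncertainty": raw["half_life_unc"] or None,
            },
        },
    }


def make_record(nucid, energy, de, jpi, t, dt):
    """Hypothetical helper: build an 80-column line for the demo."""
    body = (nucid.rjust(5) + "  L " + energy.ljust(10) + de.ljust(2)
            + jpi.ljust(18) + t.ljust(10) + dt.ljust(6))
    return body.ljust(80)


# Illustrative values only, not taken from an evaluated dataset.
sample = make_record("60NI", "1332.5", "1", "2+", "0.7 PS", "1")
document = parse_level_record(sample)
print(json.dumps(document, indent=2))
```

Because each value carries its own key and unit, a document of this shape can be validated against a schema or stored as-is in a document database, which is the property the fixed-width layout lacks.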
As a concrete application of modernized nuclear data, we focused on the decay properties of heavy nuclei. Using JSON-formatted structure and decay data derived from ENSDF, we trained a Random Forest (RF) model to predict half-lives and dominant decay modes across α decay, β± decay, electron capture, and spontaneous fission. The model was trained to correct residuals from established semi-empirical formulas (e.g., the Universal Decay Law for α decay and a new three-parameter formula for spontaneous fission). Input features included proton number Z, neutron number N, mass number A, parity, decay energies, and fission barriers.
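The residual-correction idea can be sketched as follows: a baseline formula gives a first estimate of the log half-life, and a Random Forest is trained on the difference between the reference values and that baseline. This is a toy demonstration under stated assumptions, not the model or data of this work. All numbers are synthetic, `baseline_log_half_life` is a hypothetical Geiger-Nuttall-like stand-in with arbitrary coefficients (not the Universal Decay Law), and encoding "parity" as the odd-even character of Z and N is our interpretation of the feature list. It assumes NumPy and scikit-learn are installed.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in data: NOT real nuclear data.
n = 600
Z = rng.integers(90, 119, size=n)
N = rng.integers(130, 180, size=n)
Q = rng.uniform(5.0, 12.0, size=n)  # toy decay energy in MeV


def baseline_log_half_life(Z, Q):
    """Toy baseline with arbitrary coefficients (hypothetical stand-in)."""
    return 1.6 * Z / np.sqrt(Q) - 52.0


# Synthetic "reference" values: baseline plus structure it misses, plus noise.
structured = 0.4 * (Z % 2) + 0.3 * (N % 2) + 0.5 * np.sin(N / 6.0)
y_ref = baseline_log_half_life(Z, Q) + structured + rng.normal(0.0, 0.05, n)

# Features: Z, N, A, odd-even flags (assumed meaning of "parity"), energy.
X = np.column_stack([Z, N, Z + N, Z % 2, N % 2, Q])

# The regression target is the residual, not the half-life itself.
residual = y_ref - baseline_log_half_life(Z, Q)

train = np.arange(n) < 450
test = ~train

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X[train], residual[train])

# Corrected prediction = baseline + learned residual.
baseline_test = baseline_log_half_life(Z[test], Q[test])
corrected_test = baseline_test + model.predict(X[test])


def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))


rmse_baseline = rmse(baseline_test, y_ref[test])
rmse_corrected = rmse(corrected_test, y_ref[test])
print(f"baseline RMSE:  {rmse_baseline:.3f}")
print(f"corrected RMSE: {rmse_corrected:.3f}")
```

Learning the residual rather than the half-life directly lets the physics-motivated formula carry the dominant trend, so the forest only has to capture the smaller systematic deviations.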
Our results demonstrate a significant improvement in predictive accuracy. The RF model achieved 92.2% agreement with experimental data in identifying the dominant decay mode and reduced the root-mean-square error (RMSE) by more than 50% on average across all decay channels. Notably, the model reproduced known systematic trends, such as the elongated α decay valley and the competition between α decay and spontaneous fission in the northeast region of the nuclide chart. It also predicted an island of enhanced stability southwest of Z = 114, N = 184, correlated with higher fission barriers.
In conclusion, this work underscores that modernizing ENSDF is not merely a technical upgrade but a paradigm shift toward data-driven nuclear science. By implementing JSON-based structuring, document-oriented databases, ML-aided evaluation, and advanced visualization, we establish a foundation for scalable, interoperable, and intelligent nuclear data infrastructure. The successful application of Random Forest to superheavy nuclear decay validates the synergy between modern data formats and ML, offering a powerful tool for exploring unknown nuclides and constraining theoretical models. These efforts, exemplified by the independent development at Sun Yat-sen University, reflect China's growing capacity for innovation in nuclear data infrastructure and its commitment to supporting future discoveries in nuclear physics, astrophysics, and applied nuclear technologies.