In recent years, soft lattices have been regarded as a primary physical origin of defect tolerance in lead-halide perovskite materials, with the bulk modulus serving as a key indicator of lattice “softness”. This work focuses on cubic perovskites and constructs a dataset of bulk moduli for 213 compounds from density functional theory (DFT) calculations. A total of 138 features are compiled, including 132 statistical features extracted with the Matminer toolkit and 6 manually selected elemental descriptors. Four conventional machine learning regression models (RF, SVR, KRR, and EXR) are employed for prediction. Among them, the SVR model performs best, achieving a test-set root mean square error (RMSE) of 7.35 GPa and a coefficient of determination (R²) of 97.86%. Feature importance analysis reveals that thermodynamic-structural features such as melting point, covalent radius, and atomic volume play dominant roles in determining the bulk modulus. Based on the 12 most important features, a thermodynamic-structural coupling descriptor is constructed using the SISSO method, yielding a test-set RMSE of 7.41 GPa and
R² of 97.80%. The resulting descriptor indicates that the bulk modulus is proportional to the melting point and inversely proportional to the atomic volume. Furthermore, the VS-SISSO method, combined with a strategy of random subset selection and iterative variable screening, is then applied, allowing electronic-level features such as electronegativity, valence state, and number of unpaired electrons to be selected. The resulting electronic-thermodynamic-structural coupling descriptor further improves the prediction accuracy, reaching an RMSE of 5.34 GPa and
R² of 98.35% on the test set. Notably, owing to the difference in valence states, this model effectively distinguishes between the bulk moduli of chalcogen-based (divalent) and halogen-based (monovalent) perovskites. Based on this model, high-throughput screening is performed on more than 10,000 cubic chalcogenide and halide perovskites, identifying approximately 170 lead-free candidates with bulk moduli in the range of 10–20 GPa, comparable to those of Pb-I perovskites. These results provide preliminary evidence supporting the applicability of the soft-lattice mechanism to lead-free systems and offer theoretical guidance and data support for the high-throughput discovery of stable, defect-tolerant, lead-free perovskite materials. All the data presented in this paper are openly available at
https://doi.org/10.57760/sciencedb.j00213.00161.
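
For readers unfamiliar with the two accuracy metrics quoted above (RMSE in GPa and R²) and with the 10–20 GPa screening window, the following is a minimal sketch of how such quantities could be computed. It is not the authors' code: the function names, the toy numbers, the placeholder compound labels, and the `contains_pb` flag are all hypothetical and are not taken from the dataset.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error, in the same units as the inputs (GPa here)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def screen_soft(predictions, low=10.0, high=20.0):
    """Return labels of lead-free entries with predicted bulk modulus in [low, high] GPa.

    `predictions` maps a label to (predicted_bulk_modulus_GPa, contains_pb).
    """
    return sorted(
        label
        for label, (b, contains_pb) in predictions.items()
        if low <= b <= high and not contains_pb
    )


# Toy reference and predicted bulk moduli in GPa (made-up values).
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 39.0]
test_rmse = rmse(y_true, y_pred)
test_r2 = r2_score(y_true, y_pred)

# Toy screening pool with placeholder labels (made-up values).
pool = {
    "X1": (15.0, False),
    "X2": (25.0, False),
    "X3": (12.0, True),
    "X4": (10.0, False),
}
soft_candidates = screen_soft(pool)
```

An R² reported as a percentage, as in the text above, corresponds to this value multiplied by 100. The screening window is treated as inclusive at both ends here, which is an assumption of the sketch.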