Citation preview
200219MI2I_miyake
SGL ~2018.5 DAM, Hieu-ChiJAIST
KEK JAIST PL
2
STAGE III
NdFe12N Sm(Fe,Co)12
1-12
Nd2Fe14B
T. Miyake et al., J. Phys. Soc. Jpn. 83, 043702 (2014)
6
Material {xi } (i =1, 2, 3, … , N )
Property {yi } (i =1, 2, 3, … , N )
First-principles calc. Regression
Prediction
7 (R1-a R′a )(Fe(1-b)(1-g)Cob (1-g) Tig )12
R = Y, Nd, Sm R’ = Zr, Dy a = 0.0, 0.1, …, 1.0 b = 0.0, 0.1, …, 1.0
g = 0.0, 0.5, …, 2.0
3,630
T. Fukazawa et al., Phys. Rev. Mater. 3, 053807 (2019)
0%
25%
50%
75%
100%
0
20
40
60
80
100
RS BO #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 #11 , (without nor ) (with and
) (, only)
Su cc es s ra te [% ]
µ0M TC E
Similarity-based information measure
R = Y, Nd, Sm R’ = Zr, Dy a = 0.0, 0.1, …, 1.0 b = 0.0, 0.1, …, 1.0
g = 0.0, 0.5, …, 2.0
3,630
Nd Fe(8i)
10
Material {xi } (i =1, 2, 3, … , N )
Property {yi } (i =1, 2, 3, … , N )
First-principles calc. Regression
Prediction
ü
11Orbital Field Matrix (OFM,
T.L. Pham et al., STAM 18, 756 (2017); JCP 148, 204106 (2018)
n n n matminer XenonPy
STAGE III
20 nm
(a) t’=0 (b) t’=0.5 (c) t’=2.5 (d) t’=5 (e) t’=10
0 10 at%Dy
16
STAGE III
Category Descriptors Atomic properties of transition metals
(T)
ZT , rT , rcv T , IPT , !T , S 3d , L3d , J3d
Atomic properties of rare-earth metals (R)
ZR, rR, rcv R , IPR, !R, S 4 f , L4 f , J4 f , gJ , J4 f gJ ,
J4 f (1 ! gJ) Structural information (S)
CT , CR, dT!T , dT!R, dR!R, NT!R, NR!R, NR!T
Table I. Transition metal, rare-earth and structural descriptors.
See also the supporting information.
binations in another way, such as indicator diagram to select the
best combinations depending on the purpose of the analy-
sis.6–8)
Yet, it isn’t easy to understand relationship and structures among
descriptors from a huge list of scores and descriptors. Informatics
treatment usually thinks little of the importance of meaning of
descriptors, though they are the physical param- eters that
physicists regards as important. But, we hope that we can extract
more information from the huge data. In the present work, we
introduce well-defined subgroup concept to clarify the relationship
among descriptors. Our method can also elucidate the way how to
choose the combination of the descriptors systematically as well as
the way to understand the meaning of descriptors.
Our target variable is the experimental TC of the rare-earth
transition-metal binary stoichiometry alloys in this study.9)
We take descriptors from the element dependent categories (R for
rare-earth elements and T for transition metal ele- ments). We
utilize the knowledge of the conventional theory- driven method.
The key parameters of the e!ective theory- driven models are
related to the properties of constituent el- ements and/or
structural parameters. For example, the orbital energy level
increases (becomes deeper) as the atomic num- ber Z increases. The
electron interaction becomes stronger as the atomic orbital is more
localized. The magnetic exchange- couplings are associated with the
strength of electron interac- tion and transfer integrals. Coupling
strength between TM-3d and RE-4f (through RE-5d) is crucial for
discussing RE de- pendence of the magnetism. The strength is
proportional to the 3d-4f e!ective exchange coupling and the 4f
total spin projected onto the 4f total angular moment J4 f . The
latter quantity is given by J4 f (1 ! gJ) with gJ the Lande
g-factor. We also add the descriptors from the structure-related
cate- gory (S) to describe the ratio of the elements as well as
real volume or spatial dependent simple variables to distinguish,
e.g., Th2Zn17 and Th2Ni17 polytypes. We list the descriptors in
Table I and give their detailed explanations in the support- ing
information.
As a regression model, we employ the kernel ridge regres- sion with
the radial basis function kernel. The kernel ridge regression can
include the non-linear e!ects of the descrip- tors and has much
stronger power to fit target functions with descriptors though
there exist a demerit of taking much more time to fit/predict the
regression models than the linear re- gression does. We used python
scripts with mpi4py, scipy and scikit-learn.10, 11) Our scores in
the regression models are the R2 values, which we evaluate in the
leave-one-out cross vali- dation.
First, we analyze the descriptors. We take Pearson’s corre-
all
leave-CR-out
Fig. 1. Top panel: The blue line shows the best score for each
number of descriptors. The orange dotted line shows the score when
CR is removed. Bottom panel: CR (Å!3) vs TC (" C).
lation coe"cient between descriptors. The absolute values of
Pearson’s correlation coe"cient among three descriptors, ZT , rT
and S 3d, among the T categories are the same, 1, which means that
their contributions are the same in the regression model after the
normalization procedure. Therefore, the num- ber of the independent
descriptors is reduced from 27 to 25. Then, we execute exhaustive
search for 225 ! 1 = 3.3 # 107
regression models where the combinations of descriptors are
di!erent and evaluate their accuracy values (scores).
We usually evaluate the score of the regression model. However, we
want to evaluate the importance of the descrip- tors. We change a
viewpoint from the regression model to the descriptor to discuss
the importance of the latter. We use the relevance analysis,12, 13)
which roughly corresponds to the linear response theory as to the
descriptors. (See also the supporting information.) It originally
utilizes the change of values when we remove/add a descriptor. The
former corre- sponds to the leave-one-out experiment, while the
latter the add-one-in experiment. The descriptor is strongly or
weakly relevant when their accuracy score changes meaningfully in
the leave-one-out or in the add-one-in experiment, respec-
tively.
Our first relevance analysis is that on the strong relevance. We
found that it is only the descriptor, CR, which is strongly
relevant. We can verify the importance of CR when we plot CR vs TC
. Almost all the points are placed in the bottom-left side of the
right panel of Fig. 1. It is no doubt that CR has dominant
dependence to TC . We note that we can’t find such relationship if
we simply execute regressions.
The second relevance analysis is that on the weak rele-
2
27
Category Descriptors Atomic properties of transition metals
(T)
ZT , rT , rcv T , IPT , !T , S 3d , L3d , J3d
Atomic properties of rare-earth metals (R)
ZR, rR, rcv R , IPR, !R, S 4 f , L4 f , J4 f , gJ , J4 f gJ ,
J4 f (1 ! gJ) Structural information (S)
CT , CR, dT!T , dT!R, dR!R, NT!R, NR!R, NR!T
Table I. Transition metal, rare-earth and structural descriptors.
See also the supporting information.
binations in another way, such as indicator diagram to select the
best combinations depending on the purpose of the analy-
sis.6–8)
Yet, it isn’t easy to understand relationship and structures among
descriptors from a huge list of scores and descriptors. Informatics
treatment usually thinks little of the importance of meaning of
descriptors, though they are the physical param- eters that
physicists regards as important. But, we hope that we can extract
more information from the huge data. In the present work, we
introduce well-defined subgroup concept to clarify the relationship
among descriptors. Our method can also elucidate the way how to
choose the combination of the descriptors systematically as well as
the way to understand the meaning of descriptors.
Our target variable is the experimental TC of the rare-earth
transition-metal binary stoichiometry alloys in this study.9)
We take descriptors from the element dependent categories (R for
rare-earth elements and T for transition metal ele- ments). We
utilize the knowledge of the conventional theory- driven method.
The key parameters of the e!ective theory- driven models are
related to the properties of constituent el- ements and/or
structural parameters. For example, the orbital energy level
increases (becomes deeper) as the atomic num- ber Z increases. The
electron interaction becomes stronger as the atomic orbital is more
localized. The magnetic exchange- couplings are associated with the
strength of electron interac- tion and transfer integrals. Coupling
strength between TM-3d and RE-4f (through RE-5d) is crucial for
discussing RE de- pendence of the magnetism. The strength is
proportional to the 3d-4f e!ective exchange coupling and the 4f
total spin projected onto the 4f total angular moment J4 f . The
latter quantity is given by J4 f (1 ! gJ) with gJ the Lande
g-factor. We also add the descriptors from the structure-related
cate- gory (S) to describe the ratio of the elements as well as
real volume or spatial dependent simple variables to distinguish,
e.g., Th2Zn17 and Th2Ni17 polytypes. We list the descriptors in
Table I and give their detailed explanations in the support- ing
information.
As a regression model, we employ the kernel ridge regres- sion with
the radial basis function kernel. The kernel ridge regression can
include the non-linear e!ects of the descrip- tors and has much
stronger power to fit target functions with descriptors though
there exist a demerit of taking much more time to fit/predict the
regression models than the linear re- gression does. We used python
scripts with mpi4py, scipy and scikit-learn.10, 11) Our scores in
the regression models are the R2 values, which we evaluate in the
leave-one-out cross vali- dation.
First, we analyze the descriptors. We take Pearson’s corre-
all
leave-CR-out
Fig. 1. Top panel: The blue line shows the best score for each
number of descriptors. The orange dotted line shows the score when
CR is removed. Bottom panel: CR (Å!3) vs TC (" C).
lation coe"cient between descriptors. The absolute values of
Pearson’s correlation coe"cient among three descriptors, ZT , rT
and S 3d, among the T categories are the same, 1, which means that
their contributions are the same in the regression model after the
normalization procedure. Therefore, the num- ber of the independent
descriptors is reduced from 27 to 25. Then, we execute exhaustive
search for 225 ! 1 = 3.3 # 107
regression models where the combinations of descriptors are
di!erent and evaluate their accuracy values (scores).
We usually evaluate the score of the regression model. However, we
want to evaluate the importance of the descrip- tors. We change a
viewpoint from the regression model to the descriptor to discuss
the importance of the latter. We use the relevance analysis,12, 13)
which roughly corresponds to the linear response theory as to the
descriptors. (See also the supporting information.) It originally
utilizes the change of values when we remove/add a descriptor. The
former corre- sponds to the leave-one-out experiment, while the
latter the add-one-in experiment. The descriptor is strongly or
weakly relevant when their accuracy score changes meaningfully in
the leave-one-out or in the add-one-in experiment, respec-
tively.
Our first relevance analysis is that on the strong relevance. We
found that it is only the descriptor, CR, which is strongly
relevant. We can verify the importance of CR when we plot CR vs TC
. Almost all the points are placed in the bottom-left side of the
right panel of Fig. 1. It is no doubt that CR has dominant
dependence to TC . We note that we can’t find such relationship if
we simply execute regressions.
The second relevance analysis is that on the weak rele-
2
J. Phys. Soc. Jpn. LETTERS
vance, where, in the original prescription, we add another de-
scriptor to the set of descriptors, which we must define. We define
groups and subgroups here and make use of them in the relevance
analysis. We utilize hierarchal clustering analysis, where the
distance between descriptors is one minus absolute values of
Pearson’s correlation coe!cient. We can define the groups or
subgroups that are made of tree nodes containing below the
distance, d, of the clustering. For example, we can define four
groups at d = 0.5. Two of them have the same de- scriptors as those
of the T and R categories, while we have two groups for the
original S category. (We call the original cluster category and the
cluster by the hierarchical analysis group.) The dTR constitutes a
group, while the other S category de- scriptors do the other. It
isn’t surprising that the grouping at d = 0.5 is almost the same as
the categories defined a priori as T, R and S when we remember the
definition of the descrip- tors of materials. Here we successfully
defined the groups and subgroups, where the groups are almost the
same as the orig- inal category but are clustered from the data
themselves. (We redefine the group S as the result of this
clustering. The group S that doesn’t include dTR is di"erent from
the category S.)
We can further make advances in this grouping. We notice that the
definition of the value of d is unnecessary, but we only have to
define the vertical line of the decomposition tree to define
subgroups because the child nodes below the vertical line is the
same. Thus, we are able to define many subgroups of descriptors as
sets of the child nodes of the dendrogram.
We also notice that the relevance analysis can be done not only for
a descriptor, but also for a subgroup of descriptors. We apply the
relevance analysis not to a descriptor but to a subgroup/group. We
call this method subgroup relevance analysis. We plotted the result
in Fig. 2. The horizontal score is evaluated in the leave-one-out
experiment and related to the strong relevance and the vertical
scores in the add-one-in experiment and to the weak relevance. Note
that we evaluate score of a subgroup belonging to the group under
the condi- tion that we must use at least one descriptor in the
subgroup and we can add any descriptors belonging to the other
group in the weak relevance analysis.
We explain how to read Fig. 2. The weak relevance val- ues, or
leave-one-out values are written as vertical values. The subgroup
containing only rR has the score, 0.894670, which is the highest
score in the condition that we must take the sub- group rR in the
group R and we can take any descriptors in the other groups. (A
subgroup which has a descriptor is also a subgroup.) The subgroup
containing rR, ZR and rcv
R has the score, 0.954451, which is the highest score in the
condition that we must take at least one descriptor in the subgroup
rR, ZR and rcv
R of the group R and we can take any descriptors in the other
groups as explained in the previous paragraph.
The sole descriptor ZR in the group R has the highest score
(0.954451). It means that ZR can solely represent the group R. It
is also the case for the CR subgroup in the group S. But, the
structure of the group T is di"erent from those of the group R and
of the group S. The subgroup made of J3d, !T , rcv
T , ZT (and rT and S 3d) has the highest score (0.948763), but its
child subgroup descriptors has smaller scores (0.924265 and
0.946501). It means that there exists no single descriptor that can
represent the whole nature of the group T. When we ex- amine all
the combinations made of J3d, !T , rcv
T , ZT , we find that ZT takes the best score (0.954501) if we
choose descrip-
Table II. The best R2 score and descriptors as a function of the
number of descriptors n.
n score descriptor(s) 2 0.870153 CR,ZT 3 0.942222 CR,ZR,ZT 4
0.953386 J3d ,CR,ZR,ZT 5 0.954294 L3d ,J3d ,CR,ZR,ZT 6 0.954391 L3d
,J3d ,!T ,CR,ZR,ZT 7 0.954452 L3d ,J3d ,!T ,CR,ZR,ZT ,rcv
T 8 0.954448 L3d ,J3d ,!T ,IPT ,CR,ZR,ZT ,rcv
T
tors among them, a set of ZT and J3d is the best (0.953386) for two
descriptors and a set of ZT , J3d and L3d is the best (0.954451)
for three descriptors. We note that the descriptor ZT has the same
e"ect as S 3d. We discuss the the interpreta- tion of the result
later.
We can also know the importance of the groups by the hor- izontal
values above the yellow solid line in Fig. 2. They are the strong
relevance, or leave-one-out values as to the groups, T, R and S.
For example, the group R has the value, 0.875866, which is the best
score when we remove all the descriptors of the group R. The better
the score is, the less important the group is. The value, 0.506824,
is the smallest among them, which means that the group S is the
most important among the groups. On the other hand, the least
important group is R, the value of which is 0.875866. It means that
the score still holds a high value even if we exclude all the
descriptors in the group R. Therefore, the importance of group R is
the lowest among T, S and R.
We add additional explanation in Fig. 2. The descriptor J4 f (1!gJ)
can represent the subgroup containing gJ ,...,J4 f gJ , but the
score is 0.932963, which is lower than the score 0.954451 of ZR. We
also add comment on the group of dTR. The strong relevance value is
0.954451 and the weak rele- vance value is 0.953824. The facts that
their di"erence is small and that the weak relevance value is
smaller than the strong relevance value mean that the existence of
the group dTR makes the regression model worse.
Here, we compare the result of the subgroup relevance analysis in
Fig. 2 with the best score having n descriptors without the
subgroup relevance analysis in table II. The set of CR, ZR and ZT
is the best score (0.94222) for n = 3. The set of CR, ZR, ZT and JR
is the best score (0.953386) for n = 4. The set of CR, ZR, ZT , JR
and L3d is the best score (0.954294) for n = 5. The descriptor sets
are made of the most important descriptor in the group R (ZR) and
that in the group S (CR) and those in the group T (ZT when we
choose a descriptor. J3d and ZT when we choose two descriptors and
J3d, L3d and ZT when we choose three.) These combinations are the
same as the analysis in the previous paragraph. Thus, the subgroup
relevance analysis successfully illustrates the structure among
descriptors and their importance.
We can get the conclusion that the descriptor CR is strongly
relevant when we define subgroups at d " 0 and execute the
leave-one-out experiment. The original relevance analysis is the
special case of the subgroup relevance analysis. Therefore, the
subgroup relevance analysis is the natural extension from the
original relevance analysis.
We explain the advantage of the expression with the den- drogram.
For example, we can easily choose rcv
R if we don’t
3
20Dissimilarity
Descriptor
$2 = 1
21Bagging-based dissimilarity voting machine
KRR + Ensemble learning Nguyen et al., J. Phys. Mater. 2, 034009
(2019); (Corrigendum) ibid, 3, 019501(2019)
22Materials discovery based on EA
22
T. Ishikawa et al., Phys. Rev. B 100, 174506 (2019)
23
datasets
R = 0.81
GP training
f = ÷ × + μ* - - ÷ 9 - - - 8 + + μ* - - ÷ 9 - + ÷ ÷ 10 10 2 ÷ - - -
÷ ÷ + × 10 + 6 9 S - + + 6 h h M - μ* M + μ* 10 10 10 × ÷ 8 2 4 + 6
h × 4 4 10 ÷ - - + M 7 + + - 9 ÷ - - - + - ÷ 9 - ÷ P 6 + 6 h × × +
μ* + μ* × μ* h 9 9 + 3 h × 10 + 6 h 10 10 P - 9 + + S 7 M ÷ ÷ × 8 7
- - ÷ P - + h 10 - - M 7 1 - 8 + + + 8 1 × μ* h ÷ - - + M 7 + ÷ ÷ ÷
+ M 8 h - - + M + + - μ* × 9 h ÷ - ÷ × - + - 3 × M M 10 4 10 4 - +
6 9 × 1 M P + μ* × - × - 6 M μ* + 1 μ* ÷ 9 9 1 - + M 7 7 6 + + 8 ÷
7 5 - 9 + ÷ 7 h S × 1 M 6 10 6 × 1 M 6 - ÷ 4 - - ÷ P + - 5 4 1 - 8
+ + 3 h - - - ÷ 8 - + ÷ ÷ 6 10 2 ÷ - - × × 4 h 1 × 10 + 6 h M + M 7
× ÷ 8 2 4 × 10 × × 9 h M 10 10 ÷ × × + μ*
- ÷ × + h 10 10 4 × M × + 6 9 ÷ 2 h 9 × ÷ 10 10 5 × 4 M 10 + 6 h ×
9 h 10 9 - - - 9 ÷ - - - ÷ × × + μ* 10 9 × ÷ - - ÷ 9 + × ÷ M P 6 1
× × + μ* + μ* × μ* h 9 9 10 10 5 × 4 M × 10 × ÷ ÷ 9 10 2 M 10 10 P
9 - × h 9 9
Polish notation: F = + × 2 X 1 ÷ 3 cos Y
⇒ 2 × (X - 1) + 3 ÷ cos(Y)
Details of function:
Tc (K) from predictor
CaH6 Im-3m 300 1.66 1388 174 155 KScH12 C2/m 300 1.54 1139 133
81
GA structure search ⇒ C2/m (No. 12)
Modulated H-cage structure
KScH12