ChemEnv : a fast and robust tool to automatically identify coordination environments

The tutorial was written by Janine George (Contact: janine.george(at)uclouvain.be, Université catholique de Louvain, https://jageo.github.io/) in collaboration with Geoffroy Hautier (Université catholique de Louvain) and David Waroquiers (Université catholique de Louvain). The tutorial is based on a sample file by David Waroquiers.

Written using:

  • pymatgen==2018.3.13

This tutorial will show you how to use ChemEnv to analyze chemical coordination environments and to identify model coordination environments that are closest to the coordination environment at hand. The latter is decided by the calculation of continous symmetry measures (CSM).

The analysis of the coordination environments proceeds as follows:

  1. Search for neighbors by a modified Voronoi analysis on a grid of distance and angular parameters
  2. Calculation of corresponding continous symmetry measures (CSMs) with model environments for all distance and angular parameters
  3. Refinement of the results with different strategies that will be explained below (e.g., by using certain distance and angular parameters)

More information can be found in: D. Waroquiers et al., Chem Mater., 2017, 29, 8346

First steps

  1. Download and install pymatgen

  2. Sign up for the materials project on https://materialsproject.org/.

  3. Setup the connection to the materials project with: pmg config --add PMG_MAPI_KEY <USER_API_KEY>

If you only want to analyse very few structures, you can also use the get_environment script. Just type get_environment and follow the instructions. It will use the SimplestChemenvStrategy with default settings (i.e. distance_cutoff=1.4, angle_cutoff=0.3, Only anion-cation bonds and continuous_symmetry_measure_cutoff=10.0 ). With this script, you can also visualize the coordination environments. To do so, please install vtk (e.g. via pip3 install vtk). Moreover, you can analyze the dependency of the coordination environments on the distance and angular parameters. Just use the 'grid' setting after the question "See list of environments determined for each (unequivalent) site ?" and choose a site. Now, a tutorial to write your own scripts to analyze the coordination environments will follow.

Import the relevant modules

In [1]:
from pymatgen.analysis.chemenv.coordination_environments.coordination_geometry_finder import LocalGeometryFinder
import logging
from pymatgen.ext.matproj import MPRester
from pymatgen.analysis.chemenv.coordination_environments.chemenv_strategies import SimplestChemenvStrategy, MultiWeightsChemenvStrategy
from pymatgen.analysis.chemenv.coordination_environments.structure_environments import LightStructureEnvironments

Load a structure from the materials project

We take a simple example here: alpha-quartz.

In [3]:
# Get a structure from the materials project (alpha-quartz)
a = MPRester()
struct = a.get_structure_by_material_id('mp-7000')

Here, the structure of alpha-SiO2 (mp-7000) is depicted with possible coordination environments for Si (blue).

The graphic is created with VESTA: K. Momma and F. Izumi, "VESTA 3 for three-dimensional visualization of crystal, volumetric and morphology data," J. Appl. Cryst., 2011, 44, 1272.

Alternatively: load a structure from a cif

In [4]:
#just remove the comment
#from pymatgen.io.cif import CifParser
#parser = CifParser("mystructure.cif")
#struct = parser.get_structures()[0]

Setup of the local geometry finder (Voronoi analysis and csm on a grid follow)

There are several centering_types that can be used in the Voronoi analysis. Here, the structures are centered at the centroid (centering_type='centroid') and the central site, for which the Voronoi analysis is performed, is included in the calculation of the centroid (include_central_site_in_centroid=True). See the code for more information. Moreover, a logging is introduced. This is especially important if you run very large calculations (large distances and small angle parameters).

In [5]:
# Setup the local geometry finder
lgf = LocalGeometryFinder()
lgf.setup_parameters(centering_type='centroid', include_central_site_in_centroid=True)

#you can also save the logging to a file, just remove the comment

logging.basicConfig(#filename='chemenv_structure_environments.log',
                    format='%(levelname)s:%(module)s:%(funcName)s:%(message)s',
                    level=logging.DEBUG)
lgf.setup_structure(structure=struct)
If you use the ChemEnv tool for your research, please consider citing the following reference(s) :
==================================================================================================
David Waroquiers, Xavier Gonze, Gian-Marco Rignanese, Cathrin Welker-Nieuwoudt, Frank Rosowski,
Michael Goebel, Stephan Schenk, Peter Degelmann, Rute Andre, Robert Glaum, and Geoffroy Hautier,
"Statistical analysis of coordination environments in oxides",
Chem. Mater., 2017, 29 (19), pp 8346–8360,
DOI: 10.1021/acs.chemmater.7b02766

Get the StructureEnvironments (=Voronoi analysis plus CSM calculation on a grid of distance and angular parameters)

In the first step, sets of neighbors for different distance and solid angle parameters via Voronoi analysis and corresponding continous symmetry measures (CSMs) are computed.

Have a close look at the code for more information on parameters that you can define. For example, we define the maximum_distance_factor to save some computing time. Multiplied with the distance to the closest neighbour it results in the maximum distance that is considered in the Voronoi approach. Similarily, one could define a minimum_angle_factor. This factor is multiplied with the maximum solid angle to the neighbours to arrive at the minimum solid angle that is considered in the Voronoi analysis. Please, be careful. One can also refine these parameters later. For more information on these parameters, see D. Waroquiers et al., Chem Mater., 2017, 29, 8346. Moreover, one can also exclude atoms from the computation of the coordination environments (e.g., se = lgf.compute_structure_environments(excluded_atoms=['O'])). Additional parameters such as only_atoms, only_indices and only_cations in combination with valences might be helpful as well.

In [6]:
# Get the StructureEnvironments
se = lgf.compute_structure_environments(maximum_distance_factor=1.41,only_cations=False)
INFO:coordination_geometry_finder:compute_structure_environments:Getting DetailedVoronoiContainer
INFO:voronoi:__init__:Setting Voronoi list
INFO:voronoi:setup_voronoi_list:Getting all neighbors in structure
INFO:voronoi:setup_voronoi_list:Setting up Voronoi list :
INFO:voronoi:setup_voronoi_list:  - Voronoi analysis for site #0 (1/9)
INFO:voronoi:setup_voronoi_list:  - Voronoi analysis for site #1 (2/9)
INFO:voronoi:setup_voronoi_list:  - Voronoi analysis for site #2 (3/9)
INFO:voronoi:setup_voronoi_list:  - Voronoi analysis for site #3 (4/9)
INFO:voronoi:setup_voronoi_list:  - Voronoi analysis for site #4 (5/9)
INFO:voronoi:setup_voronoi_list:  - Voronoi analysis for site #5 (6/9)
INFO:voronoi:setup_voronoi_list:  - Voronoi analysis for site #6 (7/9)
INFO:voronoi:setup_voronoi_list:  - Voronoi analysis for site #7 (8/9)
INFO:voronoi:setup_voronoi_list:  - Voronoi analysis for site #8 (9/9)
INFO:voronoi:setup_voronoi_list:Voronoi list set up in 0.89 seconds
INFO:voronoi:__init__:Setting neighbors distances and angles
INFO:voronoi:__init__:Neighbors distances and angles set up in 0.00 seconds
INFO:coordination_geometry_finder:compute_structure_environments:DetailedVoronoiContainer has been set up
INFO:coordination_geometry_finder:compute_structure_environments: ... in site #0/9 (Si)
DEBUG:coordination_geometry_finder:compute_structure_environments:    ... getting environments for nb_set (4, 0)
DEBUG:coordination_geometry_finder:compute_structure_environments:    ... getting environments for nb_sets added from hints
INFO:coordination_geometry_finder:compute_structure_environments:    ... computed in 0.05 seconds
INFO:coordination_geometry_finder:compute_structure_environments: ... in site #1/9 (Si)
DEBUG:coordination_geometry_finder:compute_structure_environments:    ... getting environments for nb_set (4, 0)
DEBUG:coordination_geometry_finder:compute_structure_environments:    ... getting environments for nb_sets added from hints
INFO:coordination_geometry_finder:compute_structure_environments:    ... computed in 0.03 seconds
INFO:coordination_geometry_finder:compute_structure_environments: ... in site #2/9 (Si)
DEBUG:coordination_geometry_finder:compute_structure_environments:    ... getting environments for nb_set (4, 0)
DEBUG:coordination_geometry_finder:compute_structure_environments:    ... getting environments for nb_sets added from hints
INFO:coordination_geometry_finder:compute_structure_environments:    ... computed in 0.05 seconds
INFO:coordination_geometry_finder:compute_structure_environments: ... in site #3/9 (O)
DEBUG:coordination_geometry_finder:compute_structure_environments:    ... getting environments for nb_set (2, 0)
DEBUG:coordination_geometry_finder:compute_structure_environments:    ... getting environments for nb_sets added from hints
INFO:coordination_geometry_finder:compute_structure_environments:    ... computed in 0.02 seconds
INFO:coordination_geometry_finder:compute_structure_environments: ... in site #4/9 (O)
DEBUG:coordination_geometry_finder:compute_structure_environments:    ... getting environments for nb_set (2, 0)
DEBUG:coordination_geometry_finder:compute_structure_environments:    ... getting environments for nb_sets added from hints
INFO:coordination_geometry_finder:compute_structure_environments:    ... computed in 0.03 seconds
INFO:coordination_geometry_finder:compute_structure_environments: ... in site #5/9 (O)
DEBUG:coordination_geometry_finder:compute_structure_environments:    ... getting environments for nb_set (2, 0)
DEBUG:coordination_geometry_finder:compute_structure_environments:    ... getting environments for nb_sets added from hints
INFO:coordination_geometry_finder:compute_structure_environments:    ... computed in 0.02 seconds
INFO:coordination_geometry_finder:compute_structure_environments: ... in site #6/9 (O)
DEBUG:coordination_geometry_finder:compute_structure_environments:    ... getting environments for nb_set (2, 0)
DEBUG:coordination_geometry_finder:compute_structure_environments:    ... getting environments for nb_sets added from hints
INFO:coordination_geometry_finder:compute_structure_environments:    ... computed in 0.00 seconds
INFO:coordination_geometry_finder:compute_structure_environments: ... in site #7/9 (O)
DEBUG:coordination_geometry_finder:compute_structure_environments:    ... getting environments for nb_set (2, 0)
DEBUG:coordination_geometry_finder:compute_structure_environments:    ... getting environments for nb_sets added from hints
INFO:coordination_geometry_finder:compute_structure_environments:    ... computed in 0.03 seconds
INFO:coordination_geometry_finder:compute_structure_environments: ... in site #8/9 (O)
DEBUG:coordination_geometry_finder:compute_structure_environments:    ... getting environments for nb_set (2, 0)
DEBUG:coordination_geometry_finder:compute_structure_environments:    ... getting environments for nb_sets added from hints
INFO:coordination_geometry_finder:compute_structure_environments:    ... computed in 0.02 seconds
INFO:coordination_geometry_finder:compute_structure_environments:    ... compute_structure_environments ended in 1.38 seconds

Different strategies to analyze the StructureEnvironments

Now, the strategy to interpret the data from before is chosen to arrive at information about the coordination environments. One can choose between two different types of strategies: we start with the SimplestChemenvStrategy. This strategy type uses fixed angle and distance parameters for the definition of neighbors in the Voronoi approach. The resulting coordination environment is uniquely defined and is then given as the one with the lowest continuous symmetry measure. One of the disadvantages is that it fails for intermediate coordination environments and depends very much on the cutoff parameters chosen. However, it is right in about 85% of the cases.

The important parameters for this strategy are distance_cutoff and angle_cutoff. The neighbouring atoms that are considered in this approach have a maximal distance of distance_cutoff * the distance to the closest neighbour and a minimal solid angle of angle_cutoff * the biggest solid angle. For more information, see: D. Waroquiers et al., Chem Mater., 2017, 29, 8346 and the code.

In [7]:
strategy = SimplestChemenvStrategy(distance_cutoff=1.4, angle_cutoff=0.3)
lse = LightStructureEnvironments.from_structure_environments(strategy=strategy, structure_environments=se)

Next, one can print the information on the coordination environments for each site in the structure. Here, the information for a site occupied by one oxygen is printed.

In [8]:
#print coordination environments for a special site
isite = 5
print(lse.coordination_environments[isite])
[{'ce_symbol': 'A:2', 'ce_fraction': 1.0, 'csm': 2.2602837757388414, 'permutation': [0, 1]}]

ce_symbol symbolizes the coordination environment. A:2 is the angular coordination environment. The full list of coordination environments can be seen in the supporting information of D. Waroquiers et al., Chem Mater., 2017, 29, 8346: Supporting Information. With this strategy, the ce_fraction is always equal to 1.0. The csm is the continous symmetry measure. This value lies between 0.0 and 100.0. At 0.0 the chemical environment in the structure is identical to the model environment and can be interpreted as a distance to a shape. In this example, a csm of 2.26 shows that the environment still shows some similarity to the model environment. Coordination environments with an csm greater than 2.5 are already considered as rather distorted. For more information on the CSM, have a look at the original publiciation: M. Pinksky, D. Avnir, Inorg. Chem., 1998, 37, 5575.

A more evolved strategy type, especially for intermediate coordination enviroments, and the one used in the publication D. Waroquiers et al., Chem Mater., 2017, 29, 8346 is the MultiWeightsChemenvStrategy. In the following, the parameters (weigths) from the publication are used. Of course, experts could also modify this. Have a close look at the code for this here.

In [9]:
#Get the strategy from D. Waroquiers et al., Chem Mater., 2017, 29, 8346.
from pymatgen.analysis.chemenv.coordination_environments.chemenv_strategies import MultiWeightsChemenvStrategy
strategy = MultiWeightsChemenvStrategy.stats_article_weights_parameters()

lse = LightStructureEnvironments.from_structure_environments(strategy=strategy, structure_environments=se)

We start with the same oxygen site.

In [10]:
#print coordination environments for a special site
isite = 5
print(lse.coordination_environments[isite])
[{'ce_symbol': 'A:2', 'ce_fraction': 0.6735414322036577, 'csm': 2.2602837757388414, 'permutation': [0, 1]}, {'ce_symbol': 'L:2', 'ce_fraction': 0.3264585677963424, 'csm': 2.8902708777516026, 'permutation': [0, 1]}]

This time, two coordination environments for one site exist as indicated by two appearances of ce_symbol and there is also a ce_fraction for each of the environments. The latter indicates the coordination environment is an intermediate between A:2 (angular) and L:2 (linear) with 67% angular environment and 33% linear environment. Also, there is a csm value for both environments. As already indicated by the ce_fraction, the csm for A:2 is lower (=in better agreement with the model environment) than the csm for L:2.

Now, an example follows where only one coordination environment exists. A Si occupies the corresponding site.

In [11]:
#another site where you have only one coordination environment (tetrahedron, T:4)
isite = 1
print(lse.coordination_environments[isite])
[{'ce_symbol': 'T:4', 'ce_fraction': 1.0, 'csm': 0.009887786111312944, 'permutation': [0, 1, 2, 3]}]

The resulting coodination environment is T:4 (tetrahedron).