Automated construction of symmetrized Wannier-like tight-binding models from ab initio calculations

Wannier tight-binding models are effective models constructed from first-principles calculations. As such, they bridge a gap between the accuracy of first-principles calculations and the computational simplicity of effective models. In this work, we extend the existing methodology of creating Wannier tight-binding models from first-principles calculations by introducing the symmetrization post-processing step, which enables the production of Wannier-like models that respect the symmetries of the considered crystal. Furthermore, we implement automatic workflows, which allow for producing a large number of tight-binding models for large classes of chemically and structurally similar compounds, or materials subject to external influence such as strain. As a particular illustration, these workflows are applied to strained III-V semiconductor materials. These results can be used for further study of topological phase transitions in III-V quantum wells.

Wannier tight-binding models are effective models constructed from first-principles calculations. As such, they bridge a gap between the accuracy of first-principles calculations and the computational simplicity of effective models. In this work, we extend the existing methodology of creating Wannier tight-binding models from first-principles calculations by introducing the symmetrization post-processing step, which enables the production of models that respect the symmetries of the considered crystal. Furthermore, we implement high-throughput workflows, which allow for producing a large number of tight-binding models for a specified class of materials, or materials subject to external influence (such as strain). As an illustration, these workflows are applied to strained III-V semiconductor materials, representing an example of strained III-V semiconductor compounds. These results can be used for further study of topological phase transitions in III-V quantum wells.

I. INTRODUCTION
A significant part of materials science is devoted to the problem of finding the electronic structure of a given material. As a result, numerous computational techniques have been developed to study this problem. These techniques can roughly be classified into two kinds: Firstprinciples methods solve the problem using the fundamental physical principles and properties of atoms comprising the material. For weakly-interacting systems, density functional theory (DFT) [1] is the dominant (mean field) technique for solving the electronic structure problem from first principles.
In contrast, empirical methods aim to capture the relevant physical properties using a simplified model. Such models are usually matched to known properties of the material, which can be obtained from either experiments or first-principles calculations. An example of such an empirical method is given by the tight-binding approximation, which describes a material as a set of localized orbitals and predefined electron hopping terms between them. While the first-principles methods typically have superior accuracy, empirical methods are often used due to their lower computational cost. In particular, calculations of complex device geometries are often inaccessible to a direct first-principles study. As such, the construction of reliable empirical models is of significant importance. And the technique of creating Wannier tight-binding models from first-principles calculations is arguably one of the most popular tools in nowardays computational materials science.
In recent years, high-throughput techniques made a profound impact in various fields of materials science [2][3][4][5]. While the domain eludes a strict definition, a common feature of such techniques is that computational tools are applied to a wide range of candidate materials, or variations of a given material, in search of some beneficial property. Existing codes and techniques are combined and applied on a scale that was not previously possible. A range of automated frameworks [6,7] support this by facilitating the combination of separate calculations into logical workflows. The challenge in designing such a high-throughput workflow is to make it resilient to varying input parameters. Since the number of calculations performed is too large to be human-controlled, many decisions -for example which calculation to perform based on the output of a previous calculation -need to be encoded into the automated workflow.
In this paper, we introduce two steps that allow us to construct a high-throughput workflow for creating firstprinciples based Wannier tight-binding models. These two steps address two standardly known problems of using Wannier90 [8,9] in combination with any ab initio software: the absence of symmetries present in the original compound in the obtained tight-binding model, and the neccessity to search for optimal internal and external energy windows for projection of the first-principles energy bands.
In Sec. II, we review the general process of calculating the Wannier tight-binding models by means of Wan-nier90 and explains the proposed and implemented sym-metrization and automatic energy window choice procedures. Sec. III describes how these procedures are used for the development of a high-throughput workflow using the AiiDA [6] framework. While this workflow automates the tight-binding calculation itself, there are still some tunable parameters which might be eliminated by a more sophisticated system. By using a modular design approach, we provide an extensible framework for implementing such improvements. In the final section, we illustrate the application of this high-throughput workflow to calculate tight-binding models for strained III-V semiconductor materials. These are useful in the pursuit of Majorana nanowire devices [10][11][12], enabling the study of transport properties for different topological devices.

II. CONSTRUCTION OF WANNIER TIGHT-BINDING MODELS
In this section, we describe the process of generating symmetric Wannier tight-binding (WTB) models. First, we give a short description of the method as introduced in the works of Refs. [13,14] and implemented in Wannier90 [8,9] software package. Next, we describe a method for symmetrizing these WTBs in a postprocessing step. Finally, we describe a scheme to enhance the band-structure accuracy by optimizing the energy windows used by Wannier90.

A. Wannier tight-binding construction
Tight-binding models represent a common way to describe crystalline systems in a computationally cheap way. The material is described as a system of localized orbitals with positions t i in the unit cell, and hopping terms H ij [R] between the j-th orbital in the unit cell at location R and the i-th orbital in the home unit cell R = 0. From these parameters, the matrix Hamiltonian can be written as [15] The Wannier tight-binding (WTB) method utilizes localized Wannier functions as basis orbitals to capture the compound's physics. These basis Wannier functions are obtained from first-principles simulations. This procedure is based on the work of Refs. [13,14] and implemented in the Wannier90 [8,9] code. After obtaining the necessary input files from a first-principles calculation, two steps are performed to construct these Wannier functions: In a first step, the Bloch wave-functions ψ n,k calculated by the first-principles code are disentangled to obtain M wave-functions, where M is the target number of basis Wannier functions in WTB. For selecting the Bloch wave-functions which are involved in this procedure, one needs to choose an outer energy window. Optionally, an inner energy window can be chosen. States inside this inner window will be preserved by the disentanglement. An optimization routine is performed to select the M states such that the "change of character" Ω I (defined in Ref. [14]) is minimized. As an initial guess for this optimization procedure, M localized trial orbitals |g m are used. Because the disentanglement procedure needs to discard some states, it usually changes both the symmetry and the energy bands of the model in comparison with first-principles results. Consequently, choosing good values for both the energy windows and the trial orbitals has a strong effect on the quality of the resulting model.
As a second (optional) step, another optimization is performed to find a unitary transformation such that the resulting Wannier functions are maximally localized [13]. Again, the trial orbitals |g m are used to create an initial guess for this optimization. Typically, these orbitals are chosen to be those chemical atomic orbitals that contribute most to the bands of interest. A method for constructing Wannier orbitals without the need for such a guess is described in Ref. [16].

B. Symmetrization
An important feature of tight-binding models, especially for studying topological effects, is that they preserve certain crystal symmetries. For a given symmetry group G, the symmetry constraint on the Hamiltonian matrix is given by [17] ∀g ∈ G : where D(g) is the representation of the symmetry g from the group G. For a Hamiltonian which does not fulfill these symmetry constraints, we define the symmetrized Hamiltonian as the group averagẽ As shown in App. A, this modified Hamiltonian respects Eq. 2. Furthermore, if the original Hamiltonian is already symmetric, the original and symmetrized Hamiltonian are identical. It is important to note that the eigenstates and eigenvalues of the symmetrized Hamiltonian may differ significantly from those of the non-symmetrized Hamiltonian. In fact, for an anti-symmetric initial Hamiltonian, meaning that for some symmetry g, the symmetrized result vanishes completely. However, given a Hamiltonian which almost respects the symmetry, this technique can effectively eliminate small symmetry-breaking terms.
In the context of tight-binding models, it is convenient to rewrite Eq. 3 directly in terms of the hopping matrices H[R] (see App. B for derivation): and the indices m, l only go over values for which T ml ij is a lattice vector. Fig. 1 shows the results of this symmetrization procedure on a tight-biding model for bulk InSb. The initial model already approximately fulfills the symmetry condition, which is reflected in the fact that the bandstructure does not change in the electronvolt scale. However, at the sub-millielectronvolt scale the degeneracy at Γ is lifted in the original model, but restored after the symmetrization procedure.
To determine the matrix representations D(g), we use the fact that Wannier90 allows one to manually choose the trial orbitals |g m . As a result, the basis after the disentanglement procedure corresponds to the chosen orbitals, up to some numerical error. Since the behavior of the basis orbitals under symmetries is known, D(g) can be determined in this way. Importantly, we used Wan-nier90 without performing the maximal localization step. It is the case in the illustrated application of Sec. IV, where this allows us to preserve the orbital basis. Alternatively, one could use the basis transformation matrices U (k) provided by Wannier90 [8] to transform D(g) into the maximally-localized basis. While this approach produces computationally cheaper localized models, the drawback is that the basis is different for each produced tight-binding model. As a result, comparing models is more difficult. Also, linear interpolation between models, as described in Section IV C, would require a change of basis.
Another approach to obtaining symmetric tightbinding models is to use the site-symmetry mode implemented in Wannier90. This approach relies on obtaining the symmetry information from the first-principles code, which is currently implemented only for Quantum Espresso [18,19]. The workflow described in Sec. III could be adapted to allow using this approach with only minimal changes.
C. Optimization for bandstructure fit As described above, an important parameter in running Wannier90 is the choice of the so-called energy windows [8]. There are two such windows: The outer window determines which states are taken into account for the disentanglement procedure. At every k-point, it must contain at least M bands, where M is the desired number of bands in the tight-binding model. The inner (or frozen) window on the other hand determines which states should not be modified during disentanglement. It can contain at most M bands at any given k.
Since the quality of the resulting tight-binding model depends sensitively on the choice of energy windows, a strategy for reliably choosing good windows is required. A straightforward way of achieving this is by iteratively optimizing the window values. Having constructed and symmetrized a tight-binding model, its quality can be determined by comparing its bandstructure to a reference computed directly from first-principles [20]. As a measure of their mismatch, we choose the average difference between the energy eigenvalues Some values of the energy windows cannot produce a tight-binding model, for example if the outer window contains less than M bands. As a result, finding appropriate energy windows is a constrained, four-dimensional optimization problem. The Nelder-Mead (downhill simplex) algorithm [21] can be used to solve this problem [22]. Fig. 2 shows the result of such an optimization procedure, for unstrained InSb as described in Section IV. A clear improvement is visible between the tight-binding model obtained with the initial windows chosen by hand (a), and the optimized window values (b). In particular, the conduction bands at the X and Z points are represented more accurately in the optimized model.

III. IMPLEMENTATION IN AiiDA WORKFLOWS
The AiiDA [6] platform is a Python framework for performing high-throughput calculations, focused on the field of materials physics. It enables reproducible research by keeping track of inputs, outputs and settings for each calculation. On top of this provenance layer, it provides a toolset for automatically chaining calculations into user-defined workflows.
In this section, we describe the implementation of the Wannier tight-binding extraction scheme as an AiiDA workflow. This automation enables the high-throughput application described in Section IV. Special care has been taken to design the workflow in a modular way, which enables re-using parts of the workflow for purposes other than tight-binding extraction. We first discuss these design principles, before showing how they are applied in the tight-binding workflows.
The code for the AiiDA workflows is available in the open-source aiida-tbextraction package, and provided as supplementary material.

A. Modular workflow design
The basic principle of modular workflow design is to split up a single monolithic workflow into minimal subworkflows or calculations that perform exactly one task.  Table II. For example, the tight-binding model created by Wan-nier90 is post-processed by parsing it to an HDF5 format, followed by optionally changing the order of the basis and symmetrizing the model. While this could easily be implemented in a single script, splitting these three steps up into separate calculations allows separately re-using each of the steps.
More complex workflows are created by combining multiple sub-workflows into a logical unit at a higher abstraction level. Inputs to the sub-workflow are either forwarded directly from the input to the parent workflow or created within the parent workflow. Similarly, outputs from the sub-workflow can either be forwarded to be an output of the parent workflow or consumed directly to guide the further execution of the parent workflow.
Since a complex workflow can consist of multiple lay-ers of wrapped sub-workflows, this modular approach is maintainable only if the overhead of forwarding input and output is minimal. Following the single responsibility principle, a parent workflow should not have to change if an input or output parameter of a sub-workflow changes, unless it directly interacts with this parameter. To achieve this, a syntax is needed to specify that a parent workflow will inherit inputs or outputs of a subworkflow, without explicitly listing each parameter. In AiiDA, such a feature is available in the newly-introduced expose functionality, as described in Appendix C. The modular architecture improves not only the reusability, but also the flexibility of workflows. Often, a given part of a workflow could be performed in different ways. For example, many different codes can perform the first-principles calculations in the tight-binding extraction workflows. Additionally, one might want to add steps such as relaxation or cut-off energy convergence.
To allow for this, the parent workflow can allow for dynamically selecting a workflow for performing a given task by passing it as an input [23]. An abstract workflow class defines the interface that a workflow must fulfill so that it can be used to perform the task. If needed, the parent workflow can allow for dynamic inputs, which are just forwarded to the specific workflow implementing the interface. In this way, the parent workflow can act as a template that defines an abstract series of steps, without knowledge of the detailed input flags available on each step.

B. Tight-binding extraction workflow
Having discussed the design principles for modular workflows, we now show how these are applied to create a workflow for the construction of tight-binding models. This workflow is implemented in the OptimizeFirst-PrinciplesTightBinding class as sketched in Fig. 3. At the uppermost level, the workflow has two parts: First-PrinciplesRunBase, which executes the first-principles calculations, and WindowSearch which calculates the tight-binding model with energy window optimization.
Since different first-principles codes can produce the input files required by Wannier90, FirstPrinciples-RunBase defines only the minimum interface needed to perform this task. As described in the previous section, a workflow that implements this interface for a specific first-principles code can then be chosen dynamically. As a result, the subsequent parts of the workflow are independent of which first-principles code is used.
The WindowSearch workflow performs the Nelder-Mead algorithm for finding the optimal energy window. Because optimization schemes are useful outside of this specific application, we implemented the Nelder-Mead method in a general way. The OptimizationWorkChain, defined in the aiida-optimize module, can be used to solve generic optimization problems in the context of  Figure 3. Schematic of the AiiDA workflow for creating tightbinding models with energy window optimization. Workflows are shown in blue, and calculations in purple. Orange arrows show calls from parent-to child-workflows (or calculations). Dashed green arrows show the implicit data dependency between workflows of the same level. In calculation names, the suffix Calculation is omitted for brevity.
AiiDA workflows. It requires two inputs: A workflow which defines the function to be optimized, and an engine that implements the optimization method. Consequently, changing the whole workflow to use a different optimization method would be a simple matter of using a different engine. Because AiiDA workflows need to be able to stop and re-start after any given step, the engine is written in an object-oriented instead of a procedural way. While this complicates implementing the Nelder-Mead method, it allows for serializing and storing the state of the engine.
The function which is optimized by the OptimizationWorkChain is implemented in the Run-Window workflow.
It again consists of two parts: TightBindingCalculation creates the tight-binding model itself, and ModelEvaluationBase evaluates the quality of the model. The first step in the Tight-BindingCalculation workflow is to run Wannier90 on the given input parameters. In a second step, the Wannier90 output is parsed and converted into the TBmodels [24] HDF5 format. A third, optional, "slicing" step is used to either permute the basis orbitals or discard some orbitals. Finally, the (also optional) symmetrization procedure is performed. Both the Slice and the Symmetrize calculation have a TBmodels HDF5 file as both input and output, meaning that they could be chained arbitrarily with other such post-processing steps.
For the evaluation of the tight-binding model, we again use an abstract interface class, ModelEvaluationBase. While for the purposes of this paper we used the average difference of band energies (Eq. (6)) as a measure of model quality, other quantities might be more appropri-ate for different applications.

IV. STRAIN-DEPENDENT TIGHT-BINDING MODELS FOR MAJORANA NANOWIRES
The quest for Majorana zero modes (MZMs) in condensed matter systems has recently attracted a lot of interest [10][11][12]. The non-abelian exchange statistics of Majorana frmions makes these zero modes promising candidates for future realization of topological quantum computation devices [10]. Experimental investigations of possible MZMs focus on the proposal by Lutchyn et. al. and Oreg et. al. [11,12] in which MZMs appear on the boundaries of proximitized spin-orbit coupled quantum wires. Current experimental realization setups include semi-super-conductor InAs/Al nanowires [25] and InAs/GaAs heterostructures, in which the quantum spin Hall effect [26,27] can be realized providing the possibility to proximity couple the helical edge state. While there is a good deal of evidence suggesting that MZMs exist in such setups [28,29], a conclusive proof requires directly showing the braiding statistics of MZMs. Such an experiment requires highly optimized device parameters and thus, a better theoretical understanding of the electronic structure in such a device is strongly required.
Highly accurate first-principles methods, using hybrid functionals [30], or the GW approximation [31], are computationally too demanding for the simulation of realistic device geometries and heterostructures. State of the art simulations of such structures use the k.p-method [32], or empirical tight-binding (ETB) methods [33]. In both of these methods the Hamiltonian is parametrized by a small number of parameters which are obtained empirically, e.g. via fitting to the first-principles band structure. For both of these methods the choice of parameters is ambiguos and one can obtain a good fit of the bandstructure while at the same time the electronic wavefunction might be wrongly represented. This might lead to unphysical solutions in confined geometries [34,35], and low transferability of the bulk models to the heterostructure in general. Recently, it was shown that better matching the ETB with the first-principles calculations can improve their transerability [35,36].
Realistic simulations of heterostructures require a correct treatment of strains at interfaces. In the k.p and the ETB method this is usually done by strain-dependent parameter sets, however, often the symmetries are not broken correctly. In this context, the Wannier tight-binding models (WTB) can offer a significant improvement by accurately representing the first-principles wavefunction and correctly capturing the effect of strain. As a demonstration of the high-throughput AiiDA workflow, we construct WTB models for the III-V semiconductors InSb, InAs and GaSb.
Including spin-orbit coupling (SOC), we require only 14 basis functions, whereas the popular sp 3 d 5 s * ETB models require 40 [37]. The reason for this is that WTB models generally include longer-range neighbor interactions, whereas ETB is typically limited to nearestneighbor (or next-nearest-neighbor in some cases [38]) interactions to keep the number of parameters manageable.
To account for strain, we construct tight-binding models with biaxial (001), (110) and (111) strains, and the uniaxial [110] strain, as described in Appendix D. For each material and strain direction, we calculated 16 models in the range of ±4% strain. Including the unstrained models, we constructed a total of 195 tight-binding models, showing the applicability of the AiiDA workflow in a high-throughput setting.

A. Strained tight-binding workflow
To automatically extract tight-binding models for different strain directions and strengths, we define an additional workflow, OptimizeStrainedFirstPrinciples-TightBinding, as shown in Fig. 4. The first step in this workflow, ApplyStrainsWithSymmetry, creates the strained structures from the initial structure and strain parameters. Since strain can break crystal symmetries, the symmetries of the unstrained system are tested against the strained structure. With the strained structures and the remaining symmetries, we then use the OptimizeFirstPrinciplesTightBinding workflow to create a tight-binding model for each strain value.

B. First-principles calculations
In the first step of generating the WTB we need to carry out a first-principles calculation of the bulk semiconductor structure. We performed all first-principles calculations using the Vienna Ab-initio Simulation Package (VASP) utilizing projector augmented-wave (PAW) basis sets [39]. To obtain an accurate prediction of the band gap we employed hybrid functionals [40]. The HSE03/HSE06 hybrid functionals proved to be successful in computing band structures of III-V semiconductors [41]. These hybrid functionals are constructed by replacing a quarter of the density functional short-range exchange (which is the Perdew-Burke-Enzerhof functional in our case [42]) with its Hartree-Fock counterpart. The screening parameter µ defines the separation into longand short-range parts. In the popular HSE06 scheme, it is set to µ = 0.2Å −1 . We treated µ as an empirical parameter such that the calculated band gap is fitted to the experimental value. In this work, we used µ InAs = 0.20Å −1 , µ GaSb = 0.15Å −1 and µ InSb = 0.23Å −1 , following the prescriptions of Ref. [43]. Since the SOC of III-V semiconductors is significant, we accounted for it by using scalar-relativistic PAW potentials. InAs, GaSb and InSb crystallize in the zincblende structure with space group T d (#216). For the unstrained structures we perform the first-principels calculation with the experimental lattice constant a at 300K, that is a InAs = 6.058Å, a GaSb = 6.096Å, a InSb = 6.479Å, from ref. [44]. For the plane-wave energy cutoff 380 eV was used throughout the paper. The Brillouinzone integrations were sampled by a 6 × 6 × 6 Γ-centered k-points mesh.
To get optimal results from the Wannier90 code in conjuction with VASP [39] we found that it is necessary to turn symmetries off in VASP, i.e. setting the ISYM-tag to 0. Since the states are obtained by a numerical diagonalization routine, they obtain a random phase at each k-point. When symmetries are enabled however, the phases are the same for all vectors forming the star of k. Since the convergence of Wannier90 is better if the numerical phases are random, turning symmetries off generally results in more localized Wannier functions after the projection step.
The interface for running first-principles calculations in the tight-binding extraction workflow is defined in the FirstPrinciplesRunBase class (see Section III B). Here, we describe the specific sub-class used to implement these calculations with VASP [39], VaspFirst-PrinciplesRun (see Fig. 5). In a first step, this workflow performs a self-consistent calculation. The resulting wave-function is then passed to calculations for the reference band-structure and the input files for Wannier90. Two workflows VaspReferenceBands and VaspWannier-Input are used to perform these calculations. The workflows are thin wrappers around the corresponding calculations from the aiida-vasp plugin [45], providing additional input and output validation. For the bandstructure calculation, the workflow also adds the k-point grid needed for hybrid functional calculations.

C. Strain interpolation
Using the AiiDA workflow, we obtained tight-binding models for strains in the range of ±4%, in steps of 0.5%. However, it is sometimes useful to have a finer control over the strain value without having to run additional first-principles calculations. A common way of obtaining this is by linear interpolation of the hopping parameters. Given two strain values s 1 and s 2 , for which the hopping parameter H si [R] are known, the hopping parameters for an unknown s * can be calculated as where Since this method assumes that the hopping parameters are a linear function of strain value, it becomes unreliable when s * is too far away from s 1 and s 2 . For this reason, we compared a tight-binding model for InSb with 2% biaxial (001) strain obtained from linear interpolation of 1% and 3% strain models with one calculated directly from first-principles. Figure 6 shows a comparison of the two band-structures, which we find to be almost identical.
Important to note is that while linear interpolation works well for strains of the same kind, this is not necessarily the case when combining two models with different strain directions. The reason for this is that the symmetries of a particular structure depend on the direction of the applied strain, but (unless it is zero) not on its strength. As a result, a tight-binding model resulting from linear interpolation between two models of a different strain direction would not have the correct symmetries.

D. Results
To validate the tight-binding models obtained using the aiida-tbextraction workflows, several material parameters were calculated. Table I shows effective masses and g-factors for the unstrained models, in comparison to first-principles [43] and experimental [43,46] values. Effective masses for the tight-binding models were calculated using second-order polynomial fit with range 0.001Å −1 . The g-factor calculations were performed using both perturbation theory and a Landau level calculation [47], with good agreement (< 0.5% difference) between the two methods. The effect of the energy window optimization is shown in Table II, which lists the initial and optimized windows, as well as the corresponding band-structure mismatch.  As shown in Fig. 2, it can be seen that the mismatch is substantially reduced after optimization. Finally, the effect of strain on the energy levels at highsymmetry points is shown in Fig. 7. The numerical data is listed in Appendix E.
In the supplementary materials of this paper, an export of the AiiDA database is given. This database contains the full provenance of each calculation performed to create the tight-binding models. For ease of accessibility, a separate dataset containing only the 195 strained tightbinding models is also given.  Table II. Initial and optimized energy windows used for calculating unstrained tight-binding models, and the corresponding band-structure mismatch as defined in Eq. (6).

V. CONCLUSION AND OUTLOOK
We have implemented an automated workflow for an automatic construction of Wannier tight-binding models from first-principles calculations. Building on the known procedure for calculating these models, we introduced a post-processing step to symmetrize the models, and an optimization of the energy windows used for disentanglement. These workflows are implemented in the aiida-tbextraction package, which is a free and opensource plugin for the AiiDA framework. As a test case, tight-binding models for strained III-V semiconductor materials were calculated. These results should enable device simulations for Majorana nanowire designs and other quantum devices.
The workflows have been implemented in a modular and extensible way. As a result, they can be used as building blocks for further improvements in automating the process of generating Wannier tight-binding models. Possible directions include extending the number of firstprinciples codes which are compatible with the plugin, adding different fitness criteria for the energy window optimization, and further minimizing the number of tunable parameters. For example, the need for choosing initial trial orbitals could be eliminated either by using another optimization step, or by utilizing the method of Ref. [16].
of ETH Zurich.
The symmetrized Hamiltonian is defined as (Eq. (3)) We first show that this Hamiltonian respects the symmetries in G. Let g ∈ G: Also, it is easily shown that symmetrizing a Hamiltonian H symm. First, we notice that D il (g) = 0 only if gt l − t i ∈ Z d , meaning that orbitals centered at t l are mapped onto t i , up to a possible lattice translation. Using Eqs. (1) and (3), we can write the symmetrized Hamiltonian as where the indices l, m only go over non-zero D(g) il and D(g −1 ) mj , meaning that Using the fact that g is (anti-)unitary, we can re-write Eq. (B1) as where the positive (negative) sign corresponds to the unitary (anti-unitary) case. This sign is promptly canceled out when the exponential and D(g −1 ) are exchanged, since the representation contains a complex conjugation in the anti-unitary case.
Next, we substitute t m − t l using T ml ij defined above, and define R = gR + T ml ij . Since R is again a lattice vector, we can change the summation from R to R : Finally, we again use Eq. (1) to obtain the symmetrized real-space hopping matrices Listing 3. A workflow that wraps SubWF without using the expose functionality.
In the distorted system, the position of the atoms also changes with the strain tensor. In the unstrained InAs, GaSb and InSb system, the cation is located in the (0, 0, 0) site and the anion is located atτ =(1/4, 1/4, 1/4) in primitive lattice vectors. In Cartesian coordinates, τ changes by the following [50-52]: where a 0 is the lattice constant without strain.

(110) biaxial strain and [110] uniaxial strain
The internal displacement ζ and the stiffness constants C 11 , C 12 , C 44 of InAs, GaSb and InSb we used in the paper are listed in a From Ref. [53] b From Ref. [54] c From Ref. [55] Table III. Strain parameters used in this work

Appendix E: Strain-dependent band energy values
Here, we list the energy values (in eV) for the two highest valence bands and lowest conduction bands (each being doubly degenerate), as shown in Fig. 7.