I have another repository on GitHub. I decided to have one where I’ll put Python code for computational physics topics that are simpler or less complete than the code of the C++ projects. It will hold both Jupyter notebooks and Python scripts. At this moment there are only two of them; hopefully I’ll add more.
So, without further ado, here it is: PythonCompphys^{1}.
At this moment only two subjects are in there, Hartree-Fock and the Car-Parrinello one^{2}.
The Hartree-Fock one is a preliminary for the Car-Parrinello one and a gentler introduction to Hartree-Fock than the Hartree-Fock C++ project. It does not take symmetries into account, it has no optimizations for computing the integrals, and it only considers spherically symmetric Gaussian orbitals.
This time there is no video. You have the option of running the notebooks in Binder (there is a button at the bottom of the README if you visit the GitHub repository) or, better, of downloading the notebooks or the Python scripts and running them locally.
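To give an idea of why spherically symmetric Gaussians are convenient, here is a minimal sketch of the overlap integral between two unnormalized s-type primitives, which has a closed form thanks to the Gaussian product theorem. The function name and the example exponents are my own choices for illustration; they are not taken from the repository.

```python
import numpy as np

def s_overlap(alpha, beta, ra, rb):
    """Overlap of exp(-alpha |r - ra|^2) with exp(-beta |r - rb|^2) (unnormalized)."""
    ra = np.asarray(ra, dtype=float)
    rb = np.asarray(rb, dtype=float)
    p = alpha + beta
    dist2 = np.dot(ra - rb, ra - rb)
    # the product of two Gaussians is a Gaussian centered between them,
    # scaled by exp(-alpha * beta / p * |ra - rb|^2)
    return (np.pi / p) ** 1.5 * np.exp(-alpha * beta / p * dist2)

# hypothetical exponents; 1.4 is just an example separation
same_center = s_overlap(1.0, 1.0, [0, 0, 0], [0, 0, 0])
separated = s_overlap(1.0, 1.0, [0, 0, 0], [0, 0, 1.4])
print(same_center, separated)
```

The kinetic, nuclear attraction and electron-electron integrals are more involved, but for s-type functions they also have closed forms, which is what keeps such a code short.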
Here is a screenshot of the script run in the Qt Console:
In the first chart, the charted value is the energy of the H2 molecule with the distance between the nuclei kept constant (it’s not the equilibrium one), while the electrons relax towards the ground state. The result is very close to the Hartree-Fock solution.
In the second chart, both the electrons and the nuclei relax, the chart being for the distance between the nuclei. The result is again quite good for the equilibrium distance, given the basis used. Also, the frequency of oscillation is close to the real one.
The theory is very nicely presented in the book Computational Physics by Jos Thijssen^{3}, Chapter 9, Quantum Molecular Dynamics.
It’s also described in Electronic Structure, Basic Theory and Practical Methods by Richard M Martin^{4}, Chapter 18, having the same title, Quantum Molecular Dynamics.
Some Wikipedia links to start with: Car–Parrinello molecular dynamics, Hellmann–Feynman theorem, Verlet integration, Euler–Lagrange equation.
And here is a lecture by Roberto Car (Princeton University) and Michele Parrinello (ETHZ) at CECAM:
Here are some slides for a lecture describing Car-Parrinello: Ab initio molecular dynamics from University of Southampton^{5}. I’m sure there are many others available, so I won’t provide many more links here.
One more is of course worth mentioning, the original article of Car and Parrinello: Unified Approach for Molecular Dynamics and Density-Functional Theory Phys. Rev. Lett. 55, 2471 – Published 25 November 1985^{6}.
The main ideas of the method are:
The code is relatively simple and it uses only math, numpy, scipy and matplotlib for charts.
This is probably the most important part:
    for cycle in range(numPoints):
        # Fock matrix computation
        #for i in range(2*basisSize):
        #    for j in range(2*basisSize):
        #        F[i, j] = H[i, j]
        #        for k in range(2*basisSize):
        #            for l in range(2*basisSize):
        #                F[i, j] += Qt[i, k, j, l] * C[k] * C[l]

        F = H + np.einsum('ikjl,k,l', Qt, C, C)

        # compute energy
        Eg = C.dot(H + F).dot(C) + 1. / X
        #print(Eg)
        energies[cycle] = Eg

        if abs(oldE-Eg) < 1E-12:
            break

        # Verlet
        # compute Ct - 9.31, but with friction force added
        Ct = (2. * mass * C - massMinusGamma * Cprev - 4. * F.dot(C) * dt2) / massPlusGamma

        # determine lambda - see 9.32 - but be careful, it's wrong! Get it from 9.28 by replacing C[r] = Ct[r] - lambda * h^2 * sum(S[r, s]*C[s]), h^4 and h^2 are missing (here h is dt)
        OC = O.dot(C)
        OCt = O.dot(Ct)
        OOC = O.dot(OC)
        a = OOC.dot(OC) * dt4
        b = -2. * OC.dot(OCt) * dt2
        c = OCt.dot(Ct) - 1.

        delta = b*b - 4.*a*c
        if delta < 0:
            print("Delta negative!")
            break
        sdelta = m.sqrt(delta)
        lam1 = (-b-sdelta) / (2. * a)
        lam2 = (-b+sdelta) / (2. * a)
        if lam1 < 0:
            lam = lam2
        else:
            lam = lam1

        # now adjust the newly computed Cs
        Ct -= lam * dt2 * OC

        # switch to the next step
        Cprev = C
        C = Ct
        oldE = Eg
It’s the code that relaxes the electrons towards the ground state. It’s used almost unchanged later in the code that also adds the dynamics of the nuclei, relaxing towards the equilibrium distance.
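The same idea can be tried on a toy problem, without any Hartree-Fock machinery. The sketch below is not the code from the repository: it minimizes C·H·C under the constraint C·S·C = 1 for a small made-up matrix pair H, S, using a damped Verlet step followed by the Lagrange multiplier correction. The names, the matrices, the parameter values and the choice of always taking the root with the smallest magnitude are my assumptions for illustration.

```python
import numpy as np

def relax_to_ground_state(H, S, dt=0.1, mass=1.0, gamma=0.5, steps=5000):
    """Minimize C.H.C subject to C.S.C = 1 with damped Verlet and a Lagrange multiplier."""
    dt2 = dt * dt
    C = np.ones(H.shape[0])
    C /= np.sqrt(C @ S @ C)        # start normalized in the S metric
    C_prev = C.copy()              # zero initial 'velocity'
    for _ in range(steps):
        # unconstrained damped Verlet step; the force is -dE/dC = -2 H C
        Ct = (2.0 * mass * C - (mass - gamma) * C_prev
              - 2.0 * dt2 * (H @ C)) / (mass + gamma)
        # find lam so that Ct - lam * dt^2 * S C has unit norm in the S metric
        SC = S @ C
        a = dt2 * dt2 * (SC @ S @ SC)
        b = -2.0 * dt2 * (SC @ S @ Ct)
        c = Ct @ S @ Ct - 1.0
        delta = b * b - 4.0 * a * c
        if delta < 0.0:
            raise RuntimeError("no real Lagrange multiplier, try a smaller dt")
        roots = ((-b - np.sqrt(delta)) / (2.0 * a), (-b + np.sqrt(delta)) / (2.0 * a))
        lam = min(roots, key=abs)  # assumption: take the smallest correction
        C_prev, C = C, Ct - lam * dt2 * SC
    return C @ H @ C, C

# made-up 2x2 'Hamiltonian' and overlap matrix
H = np.array([[-1.0, -0.5], [-0.5, -0.3]])
S = np.array([[1.0, 0.4], [0.4, 1.0]])
energy, C = relax_to_ground_state(H, S)
exact = np.linalg.eigvals(np.linalg.solve(S, H)).real.min()
print(energy, exact)
```

At the fixed point H·C is proportional to S·C, so the relaxed vector solves the generalized eigenvalue problem and the energy can be checked against a direct diagonalization.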
Please point out any issues you find or suggestions you have. I’m also open to suggestions on what to add to the Python repository. I intend to add some more simple things there at some point.
I’ll try to describe here briefly another project I have on GitHub^{1}. There will not be much except links, but hopefully those will be useful.
The project is far from being perfect, the simple averaging used there for example could be improved, as there are autocorrelations that one should be aware of, but I’ll do that maybe later. Also the way the Gaussian orbitals are combined could be improved, too.
Nevertheless, the program appears to work, although one should be careful of what parameters are used. A big number of steps is necessary to obtain a reliable result.
As usual, here is a record of a run:
It doesn’t show much, but with some patience the program could be extended to supply more info.
As for the last posts, I’ll provide here only links, as currently I lack motivation to describe the theory in more detail. It’s very well described elsewhere, anyway.
Wikipedia provides some starting points: Quantum Monte Carlo, Variational Quantum Monte Carlo. There are some links there worth visiting.
One book that provides a good start is already mentioned several times on this blog: Computational Physics book by Jos Thijssen^{2}.
A very good lecture by Morten Hjorth-Jensen is found on GitHub: Computational Physics^{3}. In doc/Literature you’ll find lectures2015.pdf. The formulae in there are mentioned in the code, using the numbering from there, so this is the main reference you should consult. Another link to look over from that repository is here: Computational Physics 2: Variational Monte Carlo methods.
A GitHub repository where a lot of Master’s and PhD theses are posted is here: CompPhysics\ThesisProjects^{4}. From there I’ll point out this one: Quantum Monte Carlo Simulation of Atoms and Molecules by Roger Kjøde, because it presents a project similar to what I describe here and also has some results you could use for comparison.
The code reuses part of the code from the Hartree-Fock project. Most of the classes are in the same namespaces as in the Hartree-Fock project, but there are changes and additions. Since Gaussian orbitals are used, the code for loading them and representing them as classes is borrowed as is from that project. There are some additions, though; for example, the Orbitals::VQMCOrbital class is new. It implements a linear combination of two contracted Gaussian orbitals, to be used as a ‘molecular orbital’ by combining atomic orbitals from the last shell of the atoms in the case of a diatomic molecule. Another change is that I needed the gradient and the Laplacian in this project, so I added them to the orbitals implementation. This addition also went into the Hartree-Fock project, although it’s not used there.
The place to start looking over the project is VQMCMolecule::Compute. The most important classes are Wavefunction and VQMC.
The used libraries are, as in other projects described on this blog: wxWidgets^{5} and Eigen^{6}.
Here are some results for the ground state energy in Hartrees, computed with STO3G and STO6G, compared with results obtained with the DFTAtom project (for atoms) and the Hartree-Fock one. The DFTAtom results are very similar to the results from NIST, and the Hartree-Fock results are very close to those of other projects that use the same basis set. For Hartree-Fock I used the restricted method for He and the unrestricted one for the other atoms. If you check the master’s thesis mentioned above, you’ll notice that the results here are typically quite close to the ones in the thesis (where STO3G was used).
For now I’ll add here some atoms and molecules with a small number of electrons, I might add later some more complex ones. Of course, the lower values are better, as the variational principle is used.
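To illustrate the variational principle on the smallest possible example, here is a sketch of a variational Monte Carlo run for the 1D harmonic oscillator (with ħ = m = ω = 1) and the trial function exp(-αx²). It has nothing to do with the Gaussian orbitals used in the project; the function name, step size and number of steps are assumptions made for the sketch.

```python
import numpy as np

def vmc_energy(alpha, n_steps=20000, step=1.0, seed=0):
    """Metropolis estimate of the energy for psi = exp(-alpha x^2), H = -1/2 d2/dx2 + x^2/2."""
    rng = np.random.default_rng(seed)
    x = 0.0
    total = 0.0
    for _ in range(n_steps):
        x_new = x + step * (2.0 * rng.random() - 1.0)
        # accept with probability |psi(x_new)|^2 / |psi(x)|^2
        if rng.random() < np.exp(-2.0 * alpha * (x_new * x_new - x * x)):
            x = x_new
        # local energy (H psi) / psi for this trial function
        total += alpha + x * x * (0.5 - 2.0 * alpha * alpha)
    return total / n_steps

for alpha in (0.4, 0.5, 0.6):
    print(alpha, vmc_energy(alpha))
```

For this trial function the exact expectation value is α/2 + 1/(8α), with the minimum 0.5 at α = 0.5, where the local energy is constant and the statistical noise vanishes. Any other α gives a higher energy, which is why lower values in the tables are better.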
Atom | VQMC STO3G | VQMC STO6G | HF STO3G | HF STO6G | DFTAtom |
---|---|---|---|---|---|
He | -2.841 | -2.876 | -2.807 | -2.846 | -2.834 |
Li | -7.360 | -7.430 | -7.315 | -7.399 | -7.335 |
Be | -14.428 | -14.590 | -14.351 | -14.503 | -14.447 |
B | -24.228 | -24.533 | -24.148 | -24.394 | -24.344 |
For molecules, I used the restricted Hartree-Fock method, as the electrons are paired for the chosen molecules.
Molecule | Distance | VQMC STO3G | VQMC STO6G | HF STO3G | HF STO6G |
---|---|---|---|---|---|
H2 | 1.4 | -1.148 | -1.156 | -1.116 | -1.125 |
Li2 | 5.051 | -14.808 | -14.887 | -14.638 | -14.808 |
LiH | 4.516 | -7.823 | -7.909 | -7.784 | -7.873 |
An improvement that should be made, and that I’ll probably make in the future, is to implement better statistics. The current simple averaging suffers from autocorrelations.
Also, the way the Gaussian orbitals are picked and combined could probably be improved as well. Currently the ones from the valence shell are paired in order. Hartree-Fock, for example, uses all of them in the computation, filling them up in order of their energy. In this project, if the shell is not filled, not all orbitals from the basis set are used.
To improve the results, probably all of them could be used by using a linear combination of Slater determinants, each of them having a different combination of those orbitals from the last shell. This method would also work in general, not only for diatomic molecules. It would require a lot of variational parameters, instead of the single one currently in the program, and gradient descent algorithms should be used to find the minimum (in a similar way as for the DFT Quantum Dot project). Of course, not all the speed-up ‘tricks’ used in this project would work anymore and, besides, having so many Slater determinants and variational parameters would slow the computation down a lot, so I’m not going to consider such an implementation for this project.
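One common way of dealing with autocorrelations is blocking: average the samples in blocks that are longer than the correlation time and compute the error from the block averages. The sketch below shows the idea on synthetic correlated data (an AR(1) series); it is not code from the project, and the block size and the series parameters are arbitrary choices.

```python
import numpy as np

def naive_error(samples):
    """Standard error of the mean, assuming independent samples."""
    return np.std(samples, ddof=1) / np.sqrt(len(samples))

def blocked_error(samples, block_size):
    """Standard error of the mean computed from averages over consecutive blocks."""
    n_blocks = len(samples) // block_size
    blocks = np.reshape(samples[:n_blocks * block_size], (n_blocks, block_size)).mean(axis=1)
    return np.std(blocks, ddof=1) / np.sqrt(n_blocks)

# synthetic correlated series: each value remembers 90% of the previous one
rng = np.random.default_rng(1)
noise = rng.standard_normal(50000)
series = np.empty_like(noise)
series[0] = noise[0]
for i in range(1, len(noise)):
    series[i] = 0.9 * series[i - 1] + noise[i]

print(naive_error(series), blocked_error(series, 500))
```

The naive estimate treats consecutive samples as independent and comes out several times too small for such a series; the blocked estimate stops growing once the blocks are longer than the correlation time.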
The Variational Quantum Monte Carlo can be a start for Diffusion Quantum Monte Carlo, which is in principle exact, so the program could be extended to improve results using that method.
If you find any bugs and/or have suggestions for improvements, please point them out here or on GitHub.
This is a very brief post, with a simulation written in JavaScript. It’s a very simple model of how a disease can spread in a population. You may get the code, adjust the parameters and watch different outcomes.
The idea of this project came from this: Why outbreaks like coronavirus spread exponentially, and how to ‘flatten the curve’. I highly recommend reading it.
I recommend watching this video, for some theory:
It’s a good start, there is a lot more information about this on the internet, so I won’t bother with formulae here.
The model resembles a lot the one from the link pointed above, with an addition: this model also simulates death, not all ‘people’ recover.
Briefly, it’s just molecular dynamics in 2D, in a ‘box’, with elastic collisions. All ‘particles’ have the same mass and they can be not infected, infected, cured or dead. The dead ones disappear from the simulation.
Once an infected ‘particle’ hits one that wasn’t infected yet, it will infect it. Once a ‘particle’ has the illness, it has a chance of dying (but only in the last two thirds of the infected period). After being sick for a while, if it doesn’t die in the meantime, it will recover and have immunity, so it cannot be infected again.
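As a small worked example of how the chance of dying is spread over time steps: the script tests, at each step in the last two thirds of the infection, against deathProb * deltat / (cureTime * 2 / 3). The Python sketch below uses the parameter values from the script (the snake_case variable names are mine) and accumulates that per-step chance over the whole window.

```python
delta_t = 0.01     # time step (deltat in the script)
cure_time = 5.0    # duration of the infection (cureTime)
death_prob = 0.05  # intended overall chance of dying (deathProb)

window = cure_time * 2.0 / 3.0           # the last two thirds of the infection
p_step = death_prob * delta_t / window   # chance of dying in a single step
n_steps = round(window / delta_t)        # number of steps in that window

# chance of dying at some point during the window
p_total = 1.0 - (1.0 - p_step) ** n_steps
print(p_step, n_steps, p_total)
```

The accumulated chance comes out a little under deathProb, because a ‘particle’ can die only once; for small probabilities the difference is negligible.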
There is also a chart showing the number of infections over time. You’ll notice that it will level out under 100%, the difference being the ones that did not get an infection, being lucky, and the dead ones.
The code has a feature that is usually a bug: because the time step is too big, some ‘particles’ can stick together. Basically, when they collide they overlap a little (more, if the time step is big) and if they cannot move apart in one time step, remaining overlapped, they will be considered as colliding again, and so on. There are various ways of solving this, but for this model I consider it a feature: people sometimes do something analogous.
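If one wanted to remove the sticking, one of the simplest ways is to exchange momentum only when the two ‘particles’ are approaching each other. The Python sketch below uses the same equal-mass elastic collision formula as the Collide function in the script, with that extra check added; the check is my addition and is not in the script.

```python
def collide(p1, v1, p2, v2):
    """Equal-mass elastic collision of two discs; returns the new velocities."""
    dx, dy = p1[0] - p2[0], p1[1] - p2[1]
    dvx, dvy = v1[0] - v2[0], v1[1] - v2[1]
    dot = dvx * dx + dvy * dy
    if dot >= 0.0:
        # the distance is not decreasing: the discs are already separating,
        # so colliding them again would only glue them together
        return v1, v2
    k = dot / (dx * dx + dy * dy)  # assumes the centers do not coincide
    return (v1[0] - k * dx, v1[1] - k * dy), (v2[0] + k * dx, v2[1] + k * dy)

# head-on collision: the velocities are exchanged
first = collide((0.0, 0.0), (1.0, 0.0), (1.0, 0.0), (-1.0, 0.0))
# a second call while the discs still overlap changes nothing
second = collide((0.0, 0.0), first[0], (1.0, 0.0), first[1])
print(first, second)
```

The collision only changes the velocity components along the line joining the centers, so total momentum and kinetic energy are conserved.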
The blue ‘particles’ are not infected, the red ones are infected and the green ones are cured. The ones that die disappear from the canvas.
The computation will be a little slow on a phone, so you had better use a desktop for this.
Related to this, for a better implementation of this kind of molecular dynamics, where there is only short range interaction, visit Event Driven Molecular Dynamics. It’s also 3D and with nicer graphics, using OpenGL.
There are some comments in the code, hopefully it’s clear enough. I wrote it very fast, so it’s far from perfect but it seems to work as intended.
    (function () {
        var canvas = document.getElementById("Canvas");
        var chart = document.getElementById("Chart");

        canvas.width = Math.min(window.innerWidth - 50, window.innerHeight - 50);
        if (canvas.width < 300) canvas.width = 300;
        else if (canvas.width > 800) canvas.width = 800;
        canvas.height = canvas.width;

        chart.width = canvas.width;
        chart.height = chart.width / 3;

        var ctx = canvas.getContext("2d");
        var canvasText = document.getElementById("canvasText");
        var ctxChart = chart.getContext("2d");
        ctxChart.strokeStyle = "#FF0000";
        ctxChart.lineWidth = 4;

        var people = [];

        // some parameters
        var nrPeople = 500;
        var speed = canvas.width / 50.;
        var radius = canvas.width / 100.;
        var deltat = 0.01;
        var cureTime = 5.;
        var deathProb = 0.05;
        var infectProb = 1.;

        // chart
        var chartPosX = 0;
        var chartValY = 0;
        var chartXScale = 0.1;
        var chartYScale = chart.height / nrPeople;

        // statistics
        var deaths = 0;
        var infections = 1;
        var cures = 0;

        function Init() {
            deaths = 0;
            infections = 1;
            cures = 0;
            people = [];
            chartPosX = 0;
            chartValY = 0;

            for (i = 0; i < nrPeople; ++i) {
                var person = {
                    posX: 0,
                    posY: 0,
                    velX: 0,
                    velY: 0,
                    dead: false,
                    infected: false,
                    cured: false,
                    infectedTime: 0.0,

                    NormalizeVelocity: function () {
                        var len = Math.sqrt(this.velX * this.velX + this.velY * this.velY);
                        this.velX /= len;
                        this.velY /= len;
                    },

                    Distance: function (other) {
                        var distX = this.posX - other.posX;
                        var distY = this.posY - other.posY;
                        return Math.sqrt(distX * distX + distY * distY);
                    },

                    Collision: function (other) {
                        var dist = this.Distance(other);
                        return dist <= 2. * radius;
                    },

                    Collide: function (other) {
                        var velXdif = this.velX - other.velX;
                        var velYdif = this.velY - other.velY;
                        var posXdif = this.posX - other.posX;
                        var posYdif = this.posY - other.posY;
                        var dist2 = posXdif * posXdif + posYdif * posYdif;
                        var dotProd = velXdif * posXdif + velYdif * posYdif;
                        dotProd /= dist2;
                        this.velX -= dotProd * posXdif;
                        this.velY -= dotProd * posYdif;
                        other.velX += dotProd * posXdif;
                        other.velY += dotProd * posYdif;
                    }
                };

                for (; ;) {
                    var X = Math.floor(Math.random() * (canvas.width - 2. * radius)) + radius;
                    var Y = Math.floor(Math.random() * (canvas.height - 2 * radius)) + radius;
                    person.posX = X;
                    person.posY = Y;
                    overlap = false;
                    for (j = 0; j < i; ++j) {
                        var person2 = people[j];
                        if (person2.Collision(person)) {
                            overlap = true;
                            break;
                        }
                    }
                    if (!overlap) break;
                }

                person.velX = Math.random() - 0.5;
                person.velY = Math.random() - 0.5;
                person.NormalizeVelocity();
                person.velX *= speed;
                person.velY *= speed;

                if (i == 0) person.infected = true;

                people.push(person);
            }
        }

        Init();

        function Advance() {
            // for each from the population
            for (i = 0; i < nrPeople; ++i) {
                var person = people[i];
                if (person.dead) continue;

                // move
                person.posX += person.velX * deltat;
                person.posY += person.velY * deltat;

                // collide / infect / cure

                // first, walls collisions
                if (person.posX <= radius) {
                    person.velX *= -1;
                    person.posX = radius;
                }
                else if (person.posX >= canvas.width - radius) {
                    person.velX *= -1;
                    person.posX = canvas.width - radius;
                }

                if (person.posY <= radius) {
                    person.velY *= -1;
                    person.posY = radius;
                }
                else if (person.posY >= canvas.height - radius) {
                    person.velY *= -1;
                    person.posY = canvas.height - radius;
                }

                // keep track of how long the infection lasts
                if (person.infected) person.infectedTime += deltat;

                // collisions and infections between them
                for (j = 0; j < i; ++j) {
                    var person2 = people[j];
                    if (person2.dead) continue;

                    if (person.Collision(person2)) {
                        person.Collide(person2);

                        if (person.infected && !person2.infected && !person2.cured) {
                            if (Math.random() < infectProb) {
                                person2.infected = true;
                                ++infections;
                            }
                        }
                        else if (person2.infected && !person.infected && !person.cured) {
                            if (Math.random() < infectProb) {
                                person.infected = true;
                                ++infections;
                            }
                        }
                    }
                }

                // cure
                if (person.infected && person.infectedTime > cureTime) {
                    person.infected = false;
                    person.cured = true;
                    ++cures;
                }

                // kill
                if (person.infected && person.infectedTime > cureTime / 3.) {
                    if (Math.random() < deathProb * deltat / (cureTime * 2. / 3.)) {
                        person.dead = true;
                        ++deaths;
                    }
                }
            }

            // display
            ctx.clearRect(0, 0, canvas.width, canvas.height);

            for (i = 0; i < nrPeople; ++i) {
                var person = people[i];
                if (person.dead) continue;

                ctx.beginPath();
                ctx.arc(person.posX, person.posY, radius, 0, 2 * Math.PI, false);
                ctx.fillStyle = person.infected ? 'red' : (person.cured ? 'green' : 'blue');
                ctx.fill();
                ctx.stroke();
            }

            canvasText.innerHTML = "Total: " + nrPeople + " Infected: " + infections + " Deaths: " + deaths + " Cured: " + cures + " Sick: " + (infections - cures - deaths);

            ctxChart.beginPath();
            ctxChart.moveTo(chartPosX * chartXScale, chart.height - chartValY * chartYScale);
            ++chartPosX;
            chartValY = infections;
            ctxChart.lineTo(chartPosX * chartXScale, chart.height - chartValY * chartYScale);
            ctxChart.stroke();

            if (infections - deaths == cures) // there is no more left to cure
            {
                ctxChart.clearRect(0, 0, chart.width, chart.height);
                Init();
            }
        }

        setInterval(Advance, 10);
    })();
As usual, if you find mistakes or have suggestions, please point them out.
I’ll end with a quote from Will the pandemic kill you? by Nobel Laureate Peter Doherty:
“What influenza also tells us is that, while viruses that spread via the respiratory route are the most likely cause of some future pandemic, only the most draconian and immediate restrictions on human travel are likely to limit the spread of infection, and then only briefly.”
“Such limitations are likely to be applied quickly if we are faced with a situation in which, for example, more than 30% of those affected develop severe or even fatal consequences. The more dangerous situation may be when mortality rates are in the 1-3% range, causing (ultimately) 70 million to 210 million deaths globally. Such an infection could “get away” before we realised what was happening.”
The post Epidemic first appeared on Computational Physics.
As I mentioned in the previous post, I already have a project on KKR on GitHub^{1}. I won’t add a lot of description here; I’ll point you to some references instead. Usually I also try to add links to some other projects that are small enough to be comprehensible in a reasonable time, but this time I couldn’t find one, although I searched on GitHub and with Google. Please let me know if you find one.
Here is a movie with the program running:
Here is one with a short description of the Korringa–Kohn–Rostoker method. Another one worth looking into is Multiple scattering theory.
As for the previous post, a book that treats the subject is Electronic Structure, Basic Theory and Practical Methods by Richard M Martin^{2}. KKR and MTO and other related subjects are in chapter XVI, LMTO and related subjects are in chapter XVII. I don’t have another one that deals with the subjects, so it’s the only book reference I give here.
Of the original articles on the method, the first one is ON THE CALCULATION OF THE ENERGY OF A BLOCH WAVE IN A METAL by J Korringa^{3}. The second one is Solution of the Schrodinger Equation in Periodic Lattices with an Application to Metallic Lithium by W. Kohn and N. Rostoker^{4}.
Here is one that also covers the Ewald summation: Energy Bands in Periodic Lattices—Green’s Function Method by F. S. Ham and B. Segall^{5}.
Here is another one: Algorithms for Korringa-Kohn-Rostoker electronic structure calculations in any Bravais lattice by E. Bruno and B. Ginatempo^{6}.
If you want to look further into MTO and LMTO, here are two worth looking into:
One about MTO: Muffin-tin orbitals and molecular calculations: General formalism by O.K. Andersen and R. G. Woolley^{7}.
One about LMTO (and LAPW): Linear methods in band theory by O. Krogh Andersen^{8}.
Some lecture notes translated from German from TU Graz Institute of Theoretical and Computational Physics: Chap 16^{9}.
Another pdf: The Korringa-Kohn-Rostoker (KKR) Green Function Method by Phivos Mavropoulos and Nikos Papanikolaou^{10}.
Again, the code is here, on GitHub^{1}.
It’s worth looking into other projects in order to understand this one, for example the other three that deal with electronic band structure, especially the previous one, which deals with APW/LAPW. The Muffin Tin Approximation that is described and used there is relevant for this project, too. I won’t give links here, they are easily accessible from the side bar.
Another project worth looking into is DFTAtom, especially if you want to implement an LMTO full potential project. Another one is the Scattering one; I modified Scattering::PhaseShift there just to have a match with this project in the computation of the phase shift.
Most of the interesting classes for the project are in the KKR namespace; the essential places to look into are the BandStructure::Compute function and the Lambda class implementation. Most of the others are similar to those from the APW project, except the Coefficients class, which computes the Gaunt coefficients (using Wigner3j/Clebsch Gordan) and caches them (they are computed only once, before doing the calculations).
As an observation, just in case somebody notices, there are small differences between the results of the KKR program and those of the APW program, especially at high energy (where large L values matter more). The reason is that I used a small maximum L in the KKR program, 2 (the maximum that would be needed is 4). With 3 the results are already very good (you can change that in the program to verify; it’s lMax in BandStructure::Compute). The reason I didn’t set it to 3 is that it’s a little faster with 2 and, because I implemented and tested it with 2, the avoidance of singularities is better for that value. It kind of works for 3, too, but not that well.
For the user interface, I used wxWidgets^{11}. For the chart, I used VTK^{12} and for the matrix stuff I used, as usual for the projects on this blog, Eigen^{13}.
I would like to add MTO and LMTO to the project, and perhaps use the latter to implement a ‘full potential’ computation, but I’ll take a break from electronic band structure computations for a while, since I already have four open source projects on GitHub on the subject. Maybe I’ll implement those some other time, when I have more free time and motivation.
As usual, if you find any bugs or have suggestions, please contact me, either here or on GitHub.
Again, I have a new project on GitHub. It’s not so new, it was working already last year (the APW part) but I didn’t have the patience to write a description for it until now. Actually, there are two new projects on GitHub, related, but this post is about the Augmented Plane Waves^{1} one. Some things I’ll point out in this post are relevant to the KKR project, too. The description for that project will follow in another post.
Here is the program running:
At that time, I didn’t have LAPW working, so it shows only the APW results.
The periodicity of the lattice suggests using Bloch waves as basis functions, and an approach that uses plane waves could be very useful, as could be seen in the DFT for an atom project. Unfortunately, although that approach has clear advantages, there are still issues. For example, near the nucleus the field is very strong and the large kinetic energy of the electrons translates into a lot of plane waves being needed for an accurate description of the wavefunction. This can be solved by using a pseudopotential. A pseudopotential is also used in this project and the next one (KKR) but, as I’ll point out later, you can also do computations with a ‘full potential’ in the end, using the muffin tin approach; this is just one more simplification for the beginning. The muffin tin approximation takes into account the strong field close to the nuclei and the fact that ‘far away’ from them the potential varies much more slowly: it approximates the potential with one that is spherically symmetric inside spheres centered on the atomic nuclei and constant outside (a constant that can be taken to be zero, since you can choose the energy reference point; it’s the differences in energy that matter, not the absolute energy).
As a note, as you dive deeper in the theory for APW and KKR, you find that many simplifying assumptions can be improved, for example the spherical symmetry assumption can be relaxed, or the shape of the muffin potential can take some other form, not necessarily spheres, or the spheres can overlap and so on. By default, this project and the next one assume touching spheres, although for example you may try some smaller spheres with some pseudopotential for Al that could work ok, because for aluminum the nearly free electrons approximation is already relatively good.
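For a feeling of what ‘touching spheres’ means in numbers, here is a small sketch that computes the muffin tin radius as half of the nearest neighbour distance for the three cubic Bravais lattices, together with the fraction of the volume that ends up inside the spheres. The function names and the table are mine, and the lattice constant in the example is just a placeholder value; I’m not claiming these are the values used in the project.

```python
import math

# nearest neighbour distance in units of the cubic lattice constant a,
# and the number of atoms in the conventional cubic cell
LATTICES = {
    "sc": (1.0, 1),
    "bcc": (math.sqrt(3.0) / 2.0, 2),
    "fcc": (1.0 / math.sqrt(2.0), 4),
}

def touching_radius(lattice, a):
    """Radius of touching spheres: half of the nearest neighbour distance."""
    return 0.5 * LATTICES[lattice][0] * a

def filling_fraction(lattice):
    """Fraction of the cell volume covered by the touching spheres."""
    nearest, atoms = LATTICES[lattice]
    r = 0.5 * nearest
    return atoms * 4.0 / 3.0 * math.pi * r ** 3

for name in LATTICES:
    print(name, touching_radius(name, 1.0), filling_fraction(name))
```

Even in the most favourable case (fcc) about a quarter of the volume is left in the interstitial region, where the potential is taken to be constant.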
Using the above simplifying assumptions, we can try to build some better basis functions for the solution. The idea is to use the solutions of the Schrodinger equation inside the spheres and plane waves outside them. Since we assumed spherical symmetry, inside the sphere we only have to solve the radial Schrodinger equation (1D, so it’s relatively easy to solve), as in the DFTAtom project, and together with the spherical harmonics we have the full solutions. To obtain a basis function we just join a linear combination of solutions of the Schrodinger equation with a plane wave. For that, the plane wave is expanded in spherical harmonics and, by requiring equality on the boundary of the sphere, we fix the expansion coefficients and so obtain the augmented plane wave. In the process we limit the maximum azimuthal quantum number to some maximum value (currently 8 in the project). Then we just have to solve the generalized eigenvalue equation to get the coefficients for expressing the actual wavefunction in terms of the augmented plane waves. This sounds easy enough, but there is a problem: the solutions of the Schrodinger equation are energy dependent. This makes the generalized eigenvalue equation quite difficult to solve, because it’s non-linear. The program just scans the entire energy interval using a small step, locating changes in the sign of the determinant or values that are close to zero (a change of sign might not exist if there is a degeneracy, for example, or if two energies are very close and the code steps over both of them). As a detail, a lot of things are not apparent in this description; for example, the derivative discontinuity at the sphere boundary in the augmented plane wave is not exactly nice, and to see how that is handled you’ll have to check the referenced books or papers.
The energy dependence makes obtaining the solutions very difficult; you can notice that already by comparing the running speed of APW versus LAPW in the project. The non-linearity would make implementing a ‘full potential’ computation (as in a self-consistent program using DFT) very difficult. An idea to fix this would be to just pick an energy in the middle of the energy interval of interest and do the computations with that. Unfortunately, the solutions of the Schrodinger equation vary strongly with energy, so this would not give good results. A better approach is to take the energy dependence into account by also considering the energy derivative of the solution of the Schrodinger equation. Using that, it is assumed that the solution has a linear dependence on the energy around the fixed energy we chose. Again we expand the plane wave in spherical harmonics, but on the sphere boundary we match not only the value but also the radial derivative (the slope), which the additional energy-derivative term makes possible. I won’t give more details here; I’ll end by saying that we again have to solve a generalized eigenvalue equation, but this time it’s linear, so it can be solved relatively easily.
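The expansion of the plane wave can be checked numerically. For a wave travelling along the z axis the expansion reduces to a sum over l of (2l+1) i^l j_l(kr) P_l(cos θ), with spherical Bessel functions and Legendre polynomials; for a general direction the Legendre polynomial is replaced by a sum over spherical harmonics. The sketch below (using SciPy, with arbitrary example values for kr and cos θ) compares the truncated sum with the exact exponential; it is only an illustration, not code from the project.

```python
import numpy as np
from scipy.special import spherical_jn, eval_legendre

def plane_wave_partial_sum(kr, cos_theta, l_max):
    """Sum over l <= l_max of (2l+1) i^l j_l(kr) P_l(cos_theta)."""
    total = 0.0 + 0.0j
    for l in range(l_max + 1):
        total += (2 * l + 1) * (1j ** l) * spherical_jn(l, kr) * eval_legendre(l, cos_theta)
    return total

kr, cos_theta = 3.0, 0.3                  # arbitrary example values
exact = np.exp(1j * kr * cos_theta)       # exp(i k r cos(theta))
error_lmax2 = abs(plane_wave_partial_sum(kr, cos_theta, 2) - exact)
error_lmax8 = abs(plane_wave_partial_sum(kr, cos_theta, 8) - exact)
print(error_lmax2, error_lmax8)
```

The spherical Bessel functions fall off quickly once l exceeds kr, which is why a modest cutoff for l can be enough on the sphere boundary as long as k times the sphere radius is not too large.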
I might do that in the future, but for now, it’s only a suggestion for extension: Instead of using a pseudopotential, one could use a ‘full potential’ approach in a self-consistent way, that is, use DFT to find the solutions starting from a reasonable electron density, use those solutions to compute a new electron density and so on until convergence is obtained. Probably a lot of the code from DFTAtom could be used. I’ll provide some links that will detail the method more (including code).
Again the Computational Physics book by Jos Thijssen^{2} must be mentioned. Both APW and LAPW are described in chapter 6. And again, another book is Electronic Structure, Basic Theory and Practical Methods by Richard M Martin^{3}. APW is in chapter XVI, LAPW is in chapter XVII.
By the way, never assume that the formulae in books are correct, for example 6.44a from Computational Physics has a when it should be .
I won’t point many articles for APW, instead of that I’ll point out a couple on LAPW, because it’s more important:
This one basically describes the LAPW as implemented in the project: Use of energy derivative of the radial solution in an augmented plane wave method: application to copper by D D Koelling and G O Arbman^{4}.
This one is worth looking into for both LAPW and also LMTO which is related with the next topic: Linear methods in band theory by O. Krogh Andersen^{5}.
These are lecture notes from Rutgers University: Application of DFT to Crystals by Kristjan Haule^{6}. Although the ideas are correct, I must warn you that they are full of tiny mistakes. I checked the LAPW part a little more carefully and noticed several: for example, the versor of the sum is not the same as the sum of versors, they have a factor in the denominator instead of the numerator, they say that they always use Hartree units but at least for LAPW they switch to Rydberg, and so on. They have an example APW code (the pseudopotential is the same as the one I used; the actual source of it, at least in my case, is a Fortran program that is given as a resource for the Computational Physics book): APW^{7}. As far as I can tell, that code won’t detect the degenerate (or nearly degenerate) situations. There is also a LAPW program which is a ‘full potential’ self-consistent one: LAPW^{8}. I only briefly looked over it, but I think it uses the formalism from Linear methods in band theory by O. Krogh Andersen^{5}.
Again, the code is here, on GitHub^{1}.
The classes related to APW are in the APW namespace. The BandStructureBasis class is very similar to the BandStructure class you can find in the Tight Binding project or the Empirical Pseudopotential one. I simply took it from the Empirical Pseudopotential project, along with the SymmetryPoint and SymmetryPoints classes, and changed it a bit. The actual computation happens in the derived class, BandStructure, with the help of the Numerov classes (for solving the Schrodinger equation; check out the DFTAtom project for another example that does that) and Secular, which contains the computation of the matrix elements and of the determinant of the secular equation.
The LAPW namespace contains the classes related to LAPW, but the computation also uses classes from the APW namespace. You’ll find in there the Integral class (which I took from the DFTAtom project) and the Hamiltonian matrix, which deals with the computation of the matrix elements and with solving the generalized eigenvalue problem. As for APW, the computation happens in the Compute function of the BandStructure class, but from the LAPW namespace this time.
Both APW and LAPW implementations use a non linear grid for solving the Schrodinger equation.
As a funny detail about how I implemented it, I added the LAPW code in a hurry, trying to have it working as fast as possible. Of course I made a mistake (actually several, but this one was the one that delayed me), passing the index k (an integer) instead of the actual . Of course, I thought that I might have some errors in the formulae (and there were some, at that point) and there is a chance of having such errors even in scientific articles, so to be sure on the formulae I did the whole derivation, until I was certain that what I have is correct. Then, of course, I found the real problem. Because I hurried too much when writing the code I actually lost much more time with checking/searching for the error.
For the user interface, I used wxWidgets^{9}. For the chart, I used VTK^{10} and for the matrix stuff I used, as usual for the projects on this blog, Eigen^{11}.
If you have suggestions or you find bugs, please let me know. I might sometime implement a full potential LAPW, but for now – except the next post, for which the project is already done – I’ll switch to other topics, I have already four projects dealing with band structure computation in the repository, so for a while those should be enough.
I have a new project on GitHub^{1}. The project uses Density Functional Theory to do calculations for an atom. It is actually not so new, I put it on GitHub more than three months ago, but it had some issues I had to solve and I also did not have the patience until now to write a new blog entry. I mentioned in the post on Quantum Scattering that I would have another project that takes advantage of spherical symmetry, so here it is. A very simple one, since I don’t have time and motivation right now for complex projects.
As usual, I have a video showing the program in action. This is with an older version, the newer one is much better for heavy atoms:
This time I decided not to chart anything, it just shows some computed values. At first I had it writing to the console, then I changed the code to display the output in a rich text control. For that I simply redirected cout.
I won’t write much here, the references should be enough to figure the code out. For solving an atom, one might want to go relativistic (relativity does matter, especially for heavy atoms, where the electrons are fast enough to need it). In such a case you either go with the Dirac equation or with relativistic corrections. In order to keep the project simple, I chose to ignore this. If you want to see what is needed, check out the Martin book or this link. Also this link should be very helpful: How to build an atom^{2}.
To benefit from the spherical symmetry, I also went with the Local Density Approximation in 1D. For an atom, despite the nucleus potential being spherically symmetric, there is no spherical symmetry in general; in the case of LDA you have it only for fully occupied shells. In general the symmetry is cylindrical, but with the Local Spin Density Approximation you also have spherical symmetry for half-filled shells. It would be reasonably simple to implement LSDA, you just go with two densities instead of one, but I chose to go with LDA; for the purpose of this blog it should work well enough even for atoms that are not in the noble gases column.
For solving the Kohn-Sham equation I chose the shooting method with the Numerov method. Since I already used that for Quantum Scattering, I thought that it would be too simple to also use it to solve the Poisson equation, so I went there with the full multigrid method. One additional complication is that the code uses a non uniform grid; the benefit is big enough to be worth the added complexity. One simply goes with something like equation 23 from here, writes the functions as functions of n instead of position, applies the chain rule for derivatives, maybe does some other change of variable to simplify things, and then the computations go over a regular grid that’s simpler than the original one (delta is 1, the difference between n+1 and n).
The method is presented as a problem in Thijssen’s book. I think it’s worth mentioning that the code starts the shooting from different points, since the core wave-functions are more localized than the outer shell ones. The starting point is decided based on the value of the approximation used for the far starting values.
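As a rough illustration of the shooting idea with Numerov, here is a minimal Python sketch. It is not the code of the project: it assumes a plain hydrogen atom (no self-consistent potential), l = 0, a uniform grid and a single fixed starting point far from the nucleus, and the function names and parameters are made up for the example. The radial equation is written as u'' = -g(r) u with g(r) = 2 (E + 1/r) in atomic units, it is integrated inward, and the energy is bisected until the extrapolated u(0) vanishes.

```python
import math

def u_at_origin(energy, r_max=20.0, n=10000):
    """Integrate u'' = -g(r) u inward with Numerov and extrapolate u to r = 0.

    Hydrogen, l = 0, atomic units: g(r) = 2 (E + 1/r).
    """
    h = r_max / n
    c = h * h / 12.0
    kappa = math.sqrt(-2.0 * energy)

    def g(r):
        return 2.0 * (energy + 1.0 / r)

    # Far from the nucleus a bound solution decays like exp(-kappa r).
    u_next = math.exp(-kappa * r_max)        # u at grid index n
    u_cur = math.exp(-kappa * (r_max - h))   # u at grid index n - 1
    for i in range(n - 1, 1, -1):
        r_cur = i * h
        # Numerov step, solved for the point one step closer to the nucleus.
        u_prev = (2.0 * u_cur * (1.0 - 5.0 * c * g(r_cur))
                  - u_next * (1.0 + c * g(r_cur + h))) / (1.0 + c * g(r_cur - h))
        u_next, u_cur = u_cur, u_prev
    # Now u_cur = u(h) and u_next = u(2h); extrapolate linearly to r = 0.
    return 2.0 * u_cur - u_next

def find_level(e_low, e_high, tol=1e-7):
    """Bisect on the energy until the boundary condition u(0) = 0 is met."""
    f_low = u_at_origin(e_low)
    while e_high - e_low > tol:
        e_mid = 0.5 * (e_low + e_high)
        f_mid = u_at_origin(e_mid)
        if f_low * f_mid <= 0.0:
            e_high = e_mid
        else:
            e_low, f_low = e_mid, f_mid
    return 0.5 * (e_low + e_high)

e_1s = find_level(-0.7, -0.3)
print(e_1s)  # should land close to the exact hydrogen 1s energy, -0.5 Hartree
```

On a uniform grid the singular Coulomb potential limits the accuracy near the nucleus, which is one way to see why a non uniform grid pays off.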
Here is a succinct description with the main points:
Related to the multigrid method, I have this blog post that presents it at an entry level: Relaxation Method. In this case though it is full multigrid, not as simple as there. A project that uses self-consistency is the Hartree-Fock one. A blog post about DFT theory is here and a project that uses DFT, but with a plane-waves basis, is the DFT for a Quantum Dot one. I already mentioned the Quantum Scattering project for the Numerov method, which is also used there. If you would like to go with Runge-Kutta instead of Numerov, you might find the Electric Field Lines and Chaos posts relevant.
I already mentioned the Computational Physics book by Jos Thijssen^{3}. It’s worth mentioning again, chapter 5 is on the Density Functional Theory and the problem 5.1 is on switching from the uniform grid to the non-uniform one. Another great book is Electronic Structure, Basic Theory and Practical Methods by Richard M Martin^{4}. Chapter 10 treats the electronic structure of atoms, but a lot of the book deals with relevant information here and some other projects described on this blog.
I won’t put many links here, for more please visit the relevant posts on this blog. I’ll add only some relevant to this project.
One nice article for the theory is the already mentioned How to build an atom by M.S.S Brooks^{2}. You’ll also find Fortran code there, if you like Fortran more than C++. I also recommend looking over this lecture from Rutgers University: DFT lecture^{5} and the associated C++ code project for the lecture^{6}. NIST has some theory on its site, with some references, so it’s worth checking not only for verifying the results: NIST^{7}. This arXiv paper is also worth checking, it will lead you to a more serious Fortran project: dftatom: A robust and general Schrödinger and Dirac solver for atomic structure calculations by Ondrej Certik, John E. Pask, Jiri Vackar^{8}. You may find a lot of info on the internet about the full multigrid method, here is just one link: A multigrid tutorial by William L Briggs and others^{9}.
Since I mentioned Fortran above, twice, I think it’s worth mentioning this: Fortran is still a thing.
The code is relatively simple once you get the theory; it’s well under 2000 lines of code, probably well under 1000 for the code that matters for the atom computations. The relevant classes are in the DFT namespace, the most important being the Numerov class, for the algorithm used by the shooting method that solves the Kohn-Sham equation, the PoissonSolver class, for solving the Poisson equation, and the DFTAtom class, which does the computations using the other ones. There are other helper classes; they are either similar to the ones used in other projects on this blog (for example VWNExchCor) or they are straightforward (for example Integral). The program uses wxWidgets^{10}, as many other projects described on this blog do. To ease understanding of the code, I also added code that uses a uniform grid, for comparison. The uniform grid computation is in DFTAtom::CalculateUniform, while the non uniform one is in DFTAtom::CalculateNonUniform.
First, here is what the non uniform grid looks like, for the hydrogen wavefunction:
Delta was 0.01, radius 10 and multigrid levels 10, that is, 1025 grid points. You can easily see that the grid points get much denser closer to the nucleus.
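To get a feeling for such a grid, here is a small Python sketch. I am assuming an exponential grid of the form r_n = r_p (exp(n delta) - 1), which is a common choice for this kind of problem; the exact formula used in the project may differ, and the function name is made up. The sketch also shows the 2^levels + 1 relation between the multigrid levels and the number of grid points, and the dr/dn factor that the chain rule brings in when derivatives are taken with respect to n.

```python
import math

def exponential_grid(levels, delta, r_max):
    """Build r_n = r_p (exp(n delta) - 1) for n = 0 .. 2**levels (assumed form)."""
    n_intervals = 2 ** levels  # multigrid friendly: the interval count halves per level
    r_p = r_max / (math.exp(n_intervals * delta) - 1.0)
    r = [r_p * (math.exp(n * delta) - 1.0) for n in range(n_intervals + 1)]
    # Chain rule factor: d/dr = (1 / (dr/dn)) d/dn, with dr/dn = delta (r + r_p).
    dr_dn = [delta * (ri + r_p) for ri in r]
    return r, dr_dn

r, dr_dn = exponential_grid(levels=10, delta=0.01, r_max=10.0)
print(len(r))                        # number of grid points, 2**10 + 1
print(r[1] - r[0], r[-1] - r[-2])    # spacing near the nucleus vs. at the far end
```

With these parameters the spacing next to the nucleus is several orders of magnitude smaller than the one at the outer end, which is the whole point of the change of variable.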
Let’s see how it goes for a bad scenario, Radon:
Step: 30
Energy 1s: -3204.75628814 Num nodes: 0
Energy 2s: -546.577960661 Num nodes: 1
Energy 2p: -527.533025107 Num nodes: 0
Energy 3s: -133.369144873 Num nodes: 2
Energy 3p: -124.172862647 Num nodes: 1
Energy 3d: -106.945006737 Num nodes: 0
Energy 4s: -31.2308038208 Num nodes: 3
Energy 4p: -27.1089854743 Num nodes: 2
Energy 4d: -19.4499946904 Num nodes: 1
Energy 4f: -8.95331847594 Num nodes: 0
Energy 5s: -5.88968292298 Num nodes: 4
Energy 5p: -4.40870280587 Num nodes: 3
Energy 5d: -1.91132966098 Num nodes: 2
Energy 6s: -0.626570734867 Num nodes: 5
Energy 6p: -0.293180043718 Num nodes: 4
Etotal = -21861.3469029 Ekin = 21854.6726982 Ecoul = 8632.01604609 Eenuc = -51966.1203929 Exc = -381.915254274
Finished!
1s2 2s2 2p6 3s2 3p6 3d10 4s2 4p6 4d10 4f14 5s2 5p6 5d10 6s2 6p6
I used 17 for ‘multigrid levels’ (that means 131073 nodes), 0.0001 for delta, mixing 0.5 and the max radius 50.
The results are not perfect, I guess with some other parameters they might be improved somewhat. The energy levels usually get all decimals given by NIST right, but occasionally the last one is wrong.
The problem is for total energies, the total energy gets three decimals right, the partial ones get three or four decimals right.
Here are the NIST values for comparison: NIST Radon.
For a lighter noble gas I get better results, for Argon for example:
Step: 29
Energy 1s: -113.800134222 Num nodes: 0
Energy 2s: -10.7941723904 Num nodes: 1
Energy 2p: -8.44343924178 Num nodes: 0
Energy 3s: -0.88338408662 Num nodes: 2
Energy 3p: -0.382330129715 Num nodes: 1
Etotal = -525.946199815 Ekin = 524.969812556 Ecoul = 231.45812437 Eenuc = -1253.13198253 Exc = -29.2421542116
Finished!
1s2 2s2 2p6 3s2 3p6
I used 14 for ‘multigrid levels’ (that means 16385 nodes), 0.0005 for delta, mixing 0.5 and the max radius 25.
The energy level results match all the decimals given by NIST. The total energy gets five decimals right, the kinetic, Coulomb and nuclear energies even six; the exchange-correlation one seems to be the worst, with only four decimals (maybe there is room for improvement there?).
Here is the NIST data for comparison: NIST Argon.
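As a quick sanity check on the printed numbers, the total energy should be the sum of the kinetic, Coulomb, electron-nucleus and exchange-correlation parts. The short Python snippet below only re-adds the values copied from the two outputs above; the dictionary layout and names are mine, not the program’s.

```python
# Values copied from the program output above (Hartree).
runs = {
    "Radon": {"total": -21861.3469029, "kin": 21854.6726982,
              "coul": 8632.01604609, "enuc": -51966.1203929, "xc": -381.915254274},
    "Argon": {"total": -525.946199815, "kin": 524.969812556,
              "coul": 231.45812437, "enuc": -1253.13198253, "xc": -29.2421542116},
}

mismatch = {}
for name, e in runs.items():
    parts = e["kin"] + e["coul"] + e["enuc"] + e["xc"]
    mismatch[name] = parts - e["total"]
    print(name, "sum of parts:", parts, "difference from Etotal:", mismatch[name])
```

The differences are at the level of the printed precision, so the decomposition is at least internally consistent.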
Let’s see how good it can be for 1025 points only:
Step: 29
Energy 1s: -113.800105669 Num nodes: 0
Energy 2s: -10.7941413182 Num nodes: 1
Energy 2p: -8.44340935717 Num nodes: 0
Energy 3s: -0.883372166507 Num nodes: 2
Energy 3p: -0.382319656694 Num nodes: 1
Etotal = -525.945824275 Ekin = 524.965679469 Ecoul = 231.458119091 Eenuc = -1253.12751443 Exc = -29.2421084002
Finished!
1s2 2s2 2p6 3s2 3p6
Multigrid levels was set to 10, radius to 15, grid delta to 0.006.
Four decimals for energy levels and two for the total energy and the partial ones, not bad at all for such a small number of points. The non uniform grid helps a lot, you would need a lot more points if using a uniform one.
As always, please point out any issues/bugs or ideas for improvements/enhancements. The program is far from perfect, I’m told you can get much better results, especially for heavy atoms, but for now the results seem ok for the purpose of this blog. I might use it in the future in other project(s).
The idea of this project^{1} came at the time when Nvidia released the RTX graphics cards. I’ve seen lots of comments on various sites about them and realized that many people do not understand what Ray Tracing is about. Since it involves Monte Carlo, including importance sampling, and it is obviously related to optics, I thought it would be a nice addition to the projects. The project is not about how it’s implemented by Nvidia; they had to use some ‘tricks’ to have it working in real-time, for example the number of reflections is cut down drastically, the number of rays/pixel is limited, the scene for ray tracing might be smaller than the normal one, the noise is eliminated by using ‘AI’, the model is a hybrid, using both rasterization and ray tracing, and so on… The project is about regular ray tracing.
Usually I record a movie showing the program during execution, but now I changed the code to record an animation, frame by frame, so I show that instead:
This is how I generated the movie from the frames, using gstreamer:
gst-launch-1.0 multifilesrc location="c:\\temp\\frame%06d.jpg" caps="image/jpeg,framerate=30/1" ! jpegdec ! x264enc pass=quant ! avimux ! filesink location="out.avi"
For the program execution, you’ll have to compile the code and look at it yourself. Hopefully the UI is not a big deal: it allows you to generate either the ‘one weekend’ scene, a Cornell box or some other scene that allows you to specify a sky box and some obj file to load. The generated image can be saved in several image file formats.
This time I started by looking for articles, blog posts and so on before jumping into implementing the project, I knew that there is a lot to find about it on the internet. Indeed there is a lot, so I won’t repeat it here, instead I will point out links to useful info.
The most important is that I found some free books^{2} by Peter Shirley which were very helpful. You may find his GitHub repositories^{3} very useful. I did not look into those when implementing the project, but the resemblances you may find are not a coincidence, you’ll have to parse the books to find out why.
Here are some other links you may want to check if you look into the project:
Of course, those are only a start, you should find plenty of information starting from those links, though.
I followed the books quite carefully, although I skipped some parts that were more boring for me, such as texture generation and motion blur; they are straightforward to implement. I chose a somewhat different class hierarchy and naming, and there are also changes in the implementation. Sometimes I chose something faster, as is the case for the ray intersection with the bounding box; sometimes, out of convenience, I chose a slower implementation, like the rotation for the rectangles. I chose to have a different class for colors, I used the vector class implementation that is already used in some projects for this blog, and I used double instead of float for values. It does not make such a big difference in speed, and the advantages of better precision are more important. I also used smart pointers a lot and a C++ random number generator. The only library used is for the UI: wxWidgets^{7}. I went beyond the ‘rest of your life’ book: first I added triangles to objects, then I added an obj file loader. I thought about using the tinyobjloader library, but I preferred to implement my own. With some changes I might use it in the future in some other projects that use OpenGL. Maybe I’ll also enhance it in time, because it has various issues; one of the biggest is that it cannot properly load concave polygons.
In order to understand the code^{1}, you’ll have to look over it and also over Peter Shirley’s books^{2} and the referenced articles. I will describe here the namespaces and classes, very briefly.
The Camera class is similar to the one I used in the projects that use OpenGL, but simplified, with the addition of the getRay function. Vector3D is the same as in the other projects, Color is a very simple class that contains r, g, b components. The ObjLoader class is obviously for loading obj files. It’s not portable and far from perfect, I wrote it very quickly, but it works in many cases. It can load not only the objects but also materials, with colors and textures. OrthoNormalBasis is for the local orthonormal basis. To see how it is used, check out for example the AnisotropicPhongPDF::Generate function. PointInfo and ScatterInfo are for some objects that are passed along, containing useful information. Random is for what the name suggests, it can generate various random number distributions. It uses std::mt19937_64. Ray represents, as you guessed, a ray and contains the origin and direction, together with the inverse of the direction, which is used to speed up computations. Scene implements the scene; it’s derived from Objects::VisibleObjectComposite and its main function is the one for ray casting, RayCast.
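To show why storing the inverse of the direction helps, here is a minimal Python sketch of the usual ‘slab’ test for a ray against an axis aligned bounding box. It is only an illustration of the general technique, not a transcription of the project’s C++ code; the function names are made up. With the inverse precomputed once per ray, each box test needs only multiplications and comparisons.

```python
import math

def safe_inverse(d):
    """1/d, with +-infinity standing in for a zero direction component."""
    return 1.0 / d if d != 0.0 else math.copysign(math.inf, d)

def hits_box(origin, inv_dir, box_min, box_max, t_min=0.0, t_max=math.inf):
    """Slab test: intersect the ray parameter intervals of the three axes."""
    for o, inv, lo, hi in zip(origin, inv_dir, box_min, box_max):
        t0 = (lo - o) * inv
        t1 = (hi - o) * inv
        if t0 > t1:
            t0, t1 = t1, t0
        t_min = max(t_min, t0)
        t_max = min(t_max, t1)
        if t_max < t_min:
            return False
    # Edge case not handled here: an origin lying exactly on a slab plane
    # combined with a zero direction component gives 0 * inf = NaN.
    return True

box_min, box_max = (-1.0, -1.0, -1.0), (1.0, 1.0, 1.0)
forward = tuple(safe_inverse(d) for d in (0.0, 0.0, 1.0))
backward = tuple(safe_inverse(d) for d in (0.0, 0.0, -1.0))

print(hits_box((0.0, 0.0, -5.0), forward, box_min, box_max))   # ray aimed at the box
print(hits_box((3.0, 0.0, -5.0), forward, box_min, box_max))   # ray passing beside it
print(hits_box((0.0, 0.0, -5.0), backward, box_min, box_max))  # ray pointing away
```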
There are several namespaces in the project, separating classes depending on their purpose. The BVH namespace is for bounding volume hierarchy classes; it contains the AxisAlignedBoundingBox class and the BVHNode class. Materials contains material classes, such as Dielectric, Metal, Isotropic, Lambertian and AnisotropicPhong. Objects contains the objects such as Triangle, Sphere, Box and so on. PDFs contains probability density function classes like CosinePDF and AnisotropicPhongPDF. Textures contains texture classes like ColorTexture and ImageTexture. Actions contains some classes for transformations that could be applied to some objects, like TranslateAction or FlipNormal.
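For readers who have not met importance sampling, here is a small self-contained Python sketch of what a cosine weighted PDF buys you. It is not the project’s CosinePDF class, just the underlying idea with made-up function names: the integral of cos²θ over the hemisphere (exact value 2π/3) is estimated once with uniformly distributed directions and once with directions distributed proportionally to cos θ. Both estimators are unbiased, the second one has a smaller variance.

```python
import math
import random

EXACT = 2.0 * math.pi / 3.0  # integral of cos^2(theta) over the hemisphere

def estimate_uniform(n, rng):
    """Uniform directions: cos(theta) is uniform in [0, 1), pdf = 1 / (2 pi)."""
    total = 0.0
    for _ in range(n):
        cos_t = rng.random()
        total += cos_t * cos_t / (1.0 / (2.0 * math.pi))
    return total / n

def estimate_cosine(n, rng):
    """Cosine weighted directions: cos(theta) = sqrt(1 - r), pdf = cos(theta) / pi."""
    total = 0.0
    for _ in range(n):
        cos_t = math.sqrt(1.0 - rng.random())  # in (0, 1], so the pdf is never zero
        total += cos_t * cos_t / (cos_t / math.pi)
    return total / n

rng = random.Random(1)
uniform_result = estimate_uniform(20000, rng)
cosine_result = estimate_cosine(20000, rng)
print(EXACT, uniform_result, cosine_result)
```

Only the polar angle matters for this integrand, so the azimuthal angle is not sampled here; a real PDF class would generate the full direction in the local orthonormal basis.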
There are some other classes that are for UI implementation, options and so on, I’ll let you discover those in the project^{1}.
I’ve got some obj files for tests and displaying from here: free3d^{8}. I downloaded the sky boxes from here^{9}. The ‘Earth’ texture I used for some generated images is the same I used in the Newtonian Gravity project, so you can visit that page for a download link.
Such a program could be extended indefinitely. I would first try to improve its performance. It could probably benefit from a better bounding volume hierarchy construction. If I had the patience and time, I would also go for some other performance enhancements. For example, I used the rotation already built into the vector implementation, computing it again and again for rectangles, instead of caching the cosines and sines as in the book. That’s not exactly optimal. I bet the program could benefit from many such optimizations. Since random number generation is used a lot, a faster generator would probably help, but I guess it’s quite hard to find a good one that is faster than the one I used. If you find one, please let me know.
I would also switch to spectral rendering, this way you could have truly realistic refractions, for example. Here is a nice way to start: Lazy spectral rendering on Peter Shirley’s blog. Spectral rendering was one reason why I have a Color class in the project instead of using the vector class as in the books. During development I also found a problem in the last book and the associated project, so I reported it: Importance Sampling issue. You can see the issue in action in the book, too: it manifests itself as those black dots in the images and it required the deNaN cleaning that is also described in the book. That’s why I love open source, people can contribute, they get something for free and give something in return. Thanks to Peter Shirley for sharing his experience!
As usual, if you have suggestions or you find bugs, please let me know, either here or on GitHub^{1}.
I have a project on GitHub about Quantum Scattering^{1} on a Lennard-Jones potential. The idea is from chapter 2 of a book^{2} I already mentioned on this blog. It’s a very good book, it’s worth having. I don’t know a better computational physics book that also treats so many topics; the ones that are better are on narrow subjects, not so general. I strongly recommend it. I won’t present the theory here except for a few words, I’ll let you look into the book and into a linked article. For the rest, there is the code…
Here is a video on youtube:
The peaks you see there are due to resonant scattering states (orbiting resonances).
What you see in the chart in the above video or in the featured picture is the total cross-section for the scattering of a hydrogen atom on some noble gas atom. The scattering potential is approximated with a parametrized Lennard-Jones potential and, taking advantage of the spherical symmetry, the radial Schrödinger equation is solved using the Numerov method. Then the cross-section is computed from the phase shifts. For the details you will either have to get the book, or look into one of these links: Central Potential Scattering and Phase Shifts^{3} or Scattering Tutorial^{4}.
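The last step, going from phase shifts to the total cross-section, is the standard partial wave sum σ = (4π/k²) Σ (2l+1) sin²δ_l. Here is a minimal Python sketch of it, together with a Lennard-Jones potential. The parametrization V(r) = ε[(ρ/r)¹² − 2(ρ/r)⁶] is an assumption on my part (it puts the minimum −ε at r = ρ); the project may use a different but equivalent form, and the function names are made up.

```python
import math

def lennard_jones(r, epsilon, rho):
    """Assumed form: V(r) = epsilon * ((rho/r)^12 - 2 (rho/r)^6), minimum -epsilon at r = rho."""
    x = (rho / r) ** 6
    return epsilon * (x * x - 2.0 * x)

def total_cross_section(k, phase_shifts):
    """Partial wave sum; phase_shifts[l] is the phase shift for angular momentum l."""
    partial_sum = sum((2 * l + 1) * math.sin(delta) ** 2
                      for l, delta in enumerate(phase_shifts))
    return 4.0 * math.pi / (k * k) * partial_sum

# Hypothetical numbers, only to exercise the functions.
print(lennard_jones(1.0, epsilon=2.0, rho=1.0))      # value at the minimum
print(total_cross_section(1.0, [math.pi / 2.0]))     # a single resonant s-wave
```

A phase shift passing through π/2 for some l makes that partial wave contribute its maximum, which is how the resonance peaks show up in the chart.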
Now, last but not least, the paper that has some results reproduced by the program: Molecular beam scattering studies of orbiting resonances and the determination of van der Waals potentials for H–Ne, Ar, Kr, and Xe and for H2–Ar, Kr, and Xe^{5}.
The code is the best documentation. If you are interested only in the computational physics part, you have to look only in one namespace: Scattering, and perhaps also in the SpecialFunctions namespace, although the latest version uses the std implementations. The class names are very suggestive: LennardJonesPotential, Numerov and Scattering, to name some. I hope the code expresses intent well enough to not need much description.
As usual for the last projects, here are the libraries used: wxWidgets^{6} and VTK^{7} (no, not Eigen this time).
Hopefully this is not the last project for this blog that takes advantage of the spherical symmetry, but I won’t reveal more of my plans about it.
As usual, if you find bugs or have suggestions, please comment.
People never cease to amaze me. By that I mean both laymen and ‘scientists’. I’ve seen many opinions of laymen about some sort of a ‘balance’ existing in all sorts of systems that no knowledgeable individual could claim to be in equilibrium. I’ve seen claims that the evil and sinning humans are ‘disrupting’ that balance. Salvation comes of course either through asceticism and suffering or by paying indulgences in the form of taxes. It’s very much like in the old religions, things do not really change much in the cargo cult ones. No matter if it’s about individuals, in psychology, or about groups of people, in sociology, or about climate, or something from biology, you see this a lot. You see it combined, too. Psychology mixed with climatology, climatology paired with biology, sociology with climatology, you see all sorts of ‘predictions’ or post hoc ‘explanations’ of some change, usually the more catastrophist ones being selected by the media and pushed onto gullible individuals. As if the publication bias and other issues from sciences were not enough, the publication bias from mass media on top certainly helps… Those ‘sciences’ have many problems, for example they are vague enough to allow rationalizations, being usually protected by weasel words that allow them to predict things that contradict each other without instantly being proved false.
Anyway, this post is not about those issues, although the topic is related strongly with those. I wanted to have at least a post on this blog about chaos theory, I’ll present here a model that is related with many things I mentioned above. It’s about a model of population dynamics, more specifically, the Competitive Lotka–Volterra equations. You should see the connection by now, in biology they study the animals that are modeled with those equations as interacting, humans are also animals and the climate is defined by averaging all sorts of very complex, nonlinear things which include… biology. Now, this model presented here is very simple, chaos theory still fights with very simple models, while cargo cult sciences act like they can predict the very complex systems (without actually predicting them in the scientific sense). I’ll just let you watch some videos and give links to papers instead of writing a lot here. The associated program^{1} is simple enough, too. It’s just an application of the Runge–Kutta methods together with a four-dimensional visualization with VTK (a 3D chart together with the color for the fourth dimension).
First, here is the program in action:
You can see that the system exhibits a nice strange attractor.
Second, here is the main article that you should consult: Chaos in low-dimensional Lotka–Volterra models of competition by JA Vano, JC Wildenberg, MB Anderson, JK Noel and JC Sprott^{2}. Here is a lecture that might help: Lotka-Volterra Dynamics – An introduction by Steve Baigent^{3}.
An easy to follow presentation on youtube:
If that whetted your appetite, here are the publications of Edward Lorenz^{4}.
Since I mentioned climate, too, here is a nice lecture about a toy climate:
If you failed to visit the Lorenz papers link^{4} and you believe that the climate is not weather, I would suggest you visit this link from the Church itself (later edit: they removed the original link from the site, I’m not surprised at all, it was too easy to point out. For the current link, check out ‘14.2.2 Predictability in a Chaotic System’). Funny how they talk about ‘balances’ there, isn’t it? Anyway, it’s from an older report, they swept that under the rug in the newer ones. It’s not that the climate suddenly became non chaotic, though.
I tested the Runge-Kutta code from the Electric Field Lines project, changing the code slightly (the change is also in the original project, to keep the code in sync), and I displayed the results with VTK^{5}. The application is also using wxWidgets^{6}.
Besides the Runge-Kutta implementation, you might want to look into CompLKFunc.h to see how the equations are implemented. To see how they are used together with the Runge-Kutta solvers, check out CompLKFrame::Compute (both of them). If you are interested in the visualization, the whole CompLKFrame class might be of interest.
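If you just want to play with the equations without compiling anything, here is a minimal Python sketch that integrates the four-species competitive Lotka–Volterra system, dx_i/dt = r_i x_i (1 − Σ_j a_ij x_j), with a classical fourth order Runge-Kutta step. It is not the project’s code (which offers several Runge-Kutta solvers), and the growth rates and the interaction matrix are the values I recall for the chaotic four-species example of the Vano et al. paper; treat them as an assumption and check them against the paper before relying on them.

```python
# Parameters quoted from memory (assumption): verify against the paper.
R = [1.0, 0.72, 1.53, 1.27]
A = [[1.0, 1.09, 1.52, 0.0],
     [0.0, 1.0, 0.44, 1.36],
     [2.33, 0.0, 1.0, 0.47],
     [1.21, 0.51, 0.35, 1.0]]

def derivative(x):
    """Right hand side of the competitive Lotka-Volterra equations."""
    return [R[i] * x[i] * (1.0 - sum(A[i][j] * x[j] for j in range(4)))
            for i in range(4)]

def rk4_step(x, dt):
    """One classical fourth order Runge-Kutta step."""
    k1 = derivative(x)
    k2 = derivative([xi + 0.5 * dt * k for xi, k in zip(x, k1)])
    k3 = derivative([xi + 0.5 * dt * k for xi, k in zip(x, k2)])
    k4 = derivative([xi + dt * k for xi, k in zip(x, k3)])
    return [xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]

state = [0.1, 0.2, 0.3, 0.4]  # arbitrary starting populations
trajectory = [state]
for _ in range(2000):
    state = rk4_step(state, 0.01)
    trajectory.append(state)

print(trajectory[-1])
```

Three of the four populations can be used as coordinates for a 3D plot of the trajectory and the fourth one for the color, which is the kind of four-dimensional chart described above.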
I could have something more interesting perhaps, some system like a double-pendulum or even more complex, starting from two very close points, animated, but at least for the double pendulum you can find tons of those, even on youtube.
Here is an example:
I’ll leave that for some other time. As usual, if you find mistakes in the code or you have something to add, please do so.
Later edit: Since I mentioned climatology and media cherry picking, I think I should also add some information I directly checked on the subject. Not so long ago, a lot of propaganda was circulated about Earth going Venus or something like that. I’m subscribed to all sorts of groups that are supposed to present scientific news, but since it’s media, they cannot help but cherry pick the more ‘spectacular’ and catastrophist ones and add their bias on top. So, exaggerated titles as usual, you should get the picture. Despite the fact that the authors clearly stated that Earth is not going Venus, the titles ‘suggested’ otherwise. The comments were precious, as always. I was annoyed enough by the pseudo scientific propaganda to look into the actual computer model. It was a simple model (too simple to be able to pretend to simulate the real world), easy to check out because they put the code on GitHub. I looked into it for five minutes, until I found this: Climate computer model bug. It’s a serious bug: computing the optical thickness with H2O instead of CO2, after CO2 was so badly presented as the devil, is not something nice. Errare humanum est, but in this case it is diabolical. Not intended, that is almost certain, but nasty nevertheless. Why can that happen? Well, because it’s a pseudo science, not a science. A real science would have means to test the model against the real world and find that the model does not fit it. A pseudo science that does not have that ‘luxury’ would let such errors pass as the absolute truth. There is an advantage of not having to put your model to the test. Or maybe not.
There are many methods to calculate band structures of crystals. I implemented the Empirical Pseudopotential project, some of the code can be reused for other methods. One of the methods is simple and fast, the tight binding method, so I simply took the code from the last project, cut a part out and modified another part and here it is: Semi-Empirical Tight-Binding^{1}.
Here is the program in action, on YouTube:
You should probably go to the last post for some of the theory; there are links there that are relevant to this project, too. I won’t write much here about the theory, but I’ll give some links, as always.
The main idea is that the method is the opposite of the one from the last post, in the sense that now the electrons are considered ‘tightly bound’ to the nucleus. The description is based on the Linear Combination of Atomic Orbitals, which you may have already met on this blog, for example in relation to the Hartree-Fock project. Because orbitals of different atoms overlap, there is a probability for the electrons to ‘jump’ between different atoms. Only close neighbors must be taken into account, the overlap for atoms that are far apart being negligible.
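To see the idea at work in the simplest possible setting, here is a toy Python sketch. It is a one-dimensional chain with two atoms per cell, one orbital per atom and nearest neighbor hopping only, nothing like the full Hamiltonian of the project; the on-site energies, the hopping value and the function name are hypothetical. The 2x2 Bloch Hamiltonian has the on-site energies on the diagonal and t(1 + exp(−ika)) off the diagonal, so its eigenvalues can be written in closed form.

```python
import math

def bands(k, e_a, e_c, t, a=1.0):
    """Two bands of a diatomic chain with nearest neighbor hopping t.

    H(k) = [[e_a, t (1 + exp(-i k a))], [conj, e_c]].
    """
    mean = 0.5 * (e_a + e_c)
    half_diff = 0.5 * (e_a - e_c)
    hop_sq = 2.0 * t * t * (1.0 + math.cos(k * a))  # |t (1 + exp(-i k a))|^2
    root = math.sqrt(half_diff * half_diff + hop_sq)
    return mean - root, mean + root

# Hypothetical parameters: on-site energies -1 and 1, hopping 0.5.
for k in (0.0, 0.5 * math.pi, math.pi):
    print(k, bands(k, e_a=-1.0, e_c=1.0, t=0.5))
```

At the zone boundary the hopping term cancels and the bands fall back to the bare on-site energies; away from it, the hopping pushes the two bands apart.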
Now, some links to papers. First, the most important paper, the one which started it: Simplified LCAO Method for the Periodic Potential Problem by Slater and Koster^{2}. Here is a review worth looking into: The Slater–Koster tight-binding method: a computationally efficient and accurate approach by Papaconstantopoulos and Mehl^{3}. Here is another paper you might find interesting: Tight-Binding Calculations of the Valence Bands of Diamond and Zincblende Crystals by Chadi and Cohen^{4}. The paper that describes the method and contains the parameters for this project is: A SEMI-EMPIRICAL TIGHT-BINDING THEORY OF THE ELECTRONIC STRUCTURE OF SEMICONDUCTORS by P. Vogl, Harold P. Hjalmarson, John D. Dow^{5}. Here is some introductory text: An Introduction to the Tight Binding Approximation – Implementation by Diagonalisation by Paxton^{6}. Here are two PhD theses on the subject: Semi-Empirical Tight-Binding Ways and Means for the Atomistic Simulation of Materials by Oliver Hein^{7} and Spin-Orbit Coupling Effects From Graphene To Graphite by Sergej Konschuh^{8}. Those links should get one started, of course there are plenty more docs on the internet worth looking into.
You can also find some tight binding related projects on GitHub, here is one that is worth looking into (python code): Tight Binding program to compute the band structure of simple semiconductors by Rick Muller^{9}. There are at least three implementations there, one based on the Chadi and Cohen paper mentioned above, one based on a book and one based on the Vogl paper I also used for the implementation. I did not look over that code before implementing my project, I looked over it briefly only after having it working. I implemented the Hamiltonian using the same order for the basis as in the article, while the python project orders the orbitals differently. That ordering even makes more sense, because if you separate the orbitals for each atom like that, you separate the Hamiltonian into blocks, having the diagonal blocks diagonal (sic) because the orbitals for one atom are orthogonal, with the off diagonal blocks being the ‘overlap’ ones. There are also some signs that are different, but probably that does not change the spectrum. I’ll let you look into that code in more detail than I did. I preferred to use the same layout as in the paper, because it’s easier to understand the program if you look over the paper.
The code is based on the code of the Empirical Pseudopotential project, so it’s worth looking into that one first. I simply dropped the Pseudopotential class, having now the parameters described in Material directly. The ‘big’ change is in the Hamiltonian implementation. As in the last project, the important classes are in a separate namespace, this time called TightBinding. The project is simpler than the last one, for example I got rid of GenerateBasisVectors from the BandStructure class. Because a lot of code is inherited from the last project, there are even features that are not really necessary for this one, such as multithreaded computation. The Hamiltonian matrix is small so computations are really fast. For all the details I’ll let you look over the project^{1}, there isn’t much to describe besides what I already did in the last post.
Of course the libraries are the same as for the last project, but here they are again: wxWidgets^{10}, VTK^{11} and Eigen^{12}.
This ends the description of a project I implemented very fast by reusing code from the last project. If you find any issues or have suggestions, please let me know, either here or on GitHub^{1}.
I did not write anything on the blog for a long time. I was very busy, I had less time for it. Nevertheless, I did some work on some projects related with it. One of them, a DFT program that calculates a molecule with the supercell method, was supposed to be the theme for a blog entry. I implemented it, it works (sort of) but it needs some checks and cleanup. I implemented it with local pseudopotentials, specified in real space, one which I generated from the Ge one given in the references I mentioned in the last posts and some more which I found on the internet. Unfortunately it doesn’t give results as good as I expected. I might want to change it to use nonlocal pseudopotentials.
I also modified the Hartree-Fock project to be able to calculate and display the dissociation energy and the ionization energy, using Koopmans’ theorem, so in fact it’s the last energy level. I did this mainly because I wanted to compare some results with the DFT molecule program, the dissociation energy computed with Hartree-Fock cannot be very good. Another project I modified a little was the Lattice Boltzmann one, I tried to improve its speed a little.
What I wanted to have in the end with all those DFT projects is to use it in computations for crystals. So, since I postponed that project indefinitely, I thought I should at least have another project with pseudopotentials. Something much simpler than DFT. But before going to discuss that project, here is a result that is also presented in the lectures I pointed out in the last blog entries:
This is obtained with the Octave code, although I can reproduce the results with the C++ code from the molecule computation project.
Now, back to the subject of this blog post: a project that computes the band structure of a crystal having the diamond/zincblende structure, for various elements. The project is already on GitHub: Empirical Pseudopotential^{1}, it was there for a while before writing this. Actually, there is already another project on GitHub which also lacks a description here, but about that one, later.
As usual, here is the program in action, in a YouTube video:
Since this is such an easy subject I will limit myself to providing some Wikipedia links for visitors that are not familiar with the concepts and then some links to some pdfs dealing with the subject, then I will only very briefly write some words about the theory.
You should be familiar with the Bravais lattice. The fact that we’re dealing with a periodic infinite structure allows us to study it by applying Fourier theory, so you should also understand how the Reciprocal lattice is used. It’s not merely a mathematical trick, the reciprocal vectors have a direct correspondence with scattering vectors from diffraction on a crystal lattice. Bloch waves play a very important role in studying crystals, so you should be familiar with those, too. Once you understand how a particular Bloch wave can have different decompositions, you should understand why we can restrict to studying only the first Brillouin zone.
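To make the reciprocal lattice a bit more concrete, here is a minimal sketch that computes the reciprocal vectors from the primitive vectors of the direct lattice, using the standard relation b1 = 2π (a2 × a3) / (a1 · (a2 × a3)) and its cyclic permutations. The names Vec3, Dot, Cross and ReciprocalVectors are only for illustration, they are not taken from the project.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;

double Dot(const Vec3& a, const Vec3& b)
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

Vec3 Cross(const Vec3& a, const Vec3& b)
{
    return { a[1] * b[2] - a[2] * b[1],
             a[2] * b[0] - a[0] * b[2],
             a[0] * b[1] - a[1] * b[0] };
}

Vec3 Scale(const Vec3& v, double s)
{
    return { v[0] * s, v[1] * s, v[2] * s };
}

// Returns b1, b2, b3 such that a_i . b_j = 2 pi delta_ij.
std::array<Vec3, 3> ReciprocalVectors(const Vec3& a1, const Vec3& a2, const Vec3& a3)
{
    const double pi = std::acos(-1.0);
    const double volume = Dot(a1, Cross(a2, a3)); // signed volume of the primitive cell
    assert(std::abs(volume) > 0.0);               // the three vectors must not be coplanar

    const double f = 2.0 * pi / volume;
    std::array<Vec3, 3> b;
    b[0] = Scale(Cross(a2, a3), f);
    b[1] = Scale(Cross(a3, a1), f);
    b[2] = Scale(Cross(a1, a2), f);
    return b;
}
```

For the face-centered cubic primitive vectors a/2 (0,1,1), a/2 (1,0,1), a/2 (1,1,0), the first reciprocal vector comes out as 2π/a (-1,1,1), that is, a body-centered cubic reciprocal lattice.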
The program is based on parameters given in Band Structures and Pseudopotential Form Factors for Fourteen Semiconductors of the Diamond and Zinc-blende Structures by Marvin L. Cohen and T. K. Bergstresser^{2} so I thought I should reference it first. One pdf that describes the theory very nicely is on nanohub: Empirical Pseudopotential Method by Dragica Vasileska^{3}. A master’s thesis containing both theory and C code is: Empirical pseudopotential method for modern electron simulation by Adam Lee Strickland^{4}. If you grasp the theory you may want to apply the method to some other materials, with some other structure, like graphene. For such a case, here is a master’s thesis dealing with it: Band Structure of Graphene Using Empirical Pseudopotentials by Srinivasa Varadan Ramanujam^{5}. Those links should be enough; if not, google should be able to reveal more on the subject.
We’ll start with the uni-electron Schrödinger equation:
With it, we are describing non-interacting electrons moving in the potential of the atomic nuclei in the crystal. Of course, the problem is a many-body problem and often you cannot get away with such a simplification, but sometimes even the free electron model can be good enough, or at least the nearly free electron model, so it’s worth a try.
We can benefit from having a periodic arrangement of atoms. Because the Hamiltonian commutes with all discrete translation operators, we can pick common eigenvectors to describe the solutions, namely Bloch vectors. A Bloch function has the form:
We can avoid the ambiguity of decomposing into the plane wave and the periodic function u by limiting ourselves to the first Brillouin zone. Since electrons close to nuclei are deep in the potential well of the nucleus, they are perturbed very little by the potentials of the other nuclei and they cannot easily tunnel out (that is, they are not as ‘free’ as the electrons we are trying to describe), so we can describe them using a linear combination of atomic orbitals. A problem is that the Bloch functions are not orthogonal to the atomic orbitals. That would force us to use the overlap matrix and deal with the generalized eigenvalue problem. Alternatively, we can orthogonalize the ‘plane waves’ by the usual procedure, subtracting from each vector its projections onto all the others:
where the vectors projected onto are the atomic orbitals. It’s easy to get from this the original vector and substitute it into the Schrödinger equation:
Applying the Hamiltonian operator to each term and moving the second term from the right-hand side to the left, we get:
We can force it back into the form of a Schrödinger equation:
Call the extra term the repulsive term and notice that we still have a Schrödinger equation with the same eigenvalues but different eigenvectors, having not only the attractive potential given by the nucleus, but also a repulsive potential given by the core electrons. Call the sum of the two the pseudopotential and we have to solve:
The only problem that remains apart from solving it is to determine the pseudopotential. You have various means for that, from computing it to determining it from experimental results. I’ll let you find the details from the given links…
If you are interested only in the computations, all the relevant classes (except Vector3D, which was already used in many projects for this blog) are in the EmpiricalPseudopotential namespace. SymmetryPoint is a class for a very simple object: it has only a ‘name’ and a position in the reciprocal space. SymmetryPoints contains a map of such objects. Its constructor just initializes it with the critical points of the face-centered cubic lattice. SymmetryPoints::GeneratePoints generates the points that are charted. You pass it a ‘path’ formed by symmetry points and the number of desired points to have in the chart and it computes the k-points coordinates. It also returns the indices of the symmetry points in symmetryPointsPositions.
The Material class is also simple, it’s for objects that store a material name, a distance in Bohrs between the basis cell atoms and the pseudopotential. There is a Materials class that just holds a map of materials. The map is filled in the constructor; a configuration file would probably be a better alternative, but for the purpose of this project, it’s good enough. By the way, you’ll see some conversions going on there, it’s because the parameters were given in Angstroms and Rydbergs, but computations are done in Hartree atomic units.
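The unit conversions mentioned above could look like the following minimal sketch. The function and constant names are made up for this example (they are not the ones from the Materials class) and the numerical factors are rounded CODATA values.

```cpp
#include <cassert>
#include <cmath>

// Conversion factors (rounded CODATA values).
constexpr double BohrInAngstrom = 0.529177210903; // 1 Bohr expressed in Angstroms
constexpr double HartreeInRydberg = 2.0;          // 1 Hartree = 2 Rydberg
constexpr double HartreeInEv = 27.211386245988;   // 1 Hartree expressed in eV

// Lengths from Angstroms to Bohrs.
inline double AngstromToBohr(double angstroms)
{
    assert(angstroms >= 0.0); // lengths only
    return angstroms / BohrInAngstrom;
}

// Form factors from Rydbergs to Hartrees (they may be negative).
inline double RydbergToHartree(double rydbergs)
{
    return rydbergs / HartreeInRydberg;
}

// Energies from Hartrees to eV, useful when charting the bands.
inline double HartreeToEv(double hartrees)
{
    return hartrees * HartreeInEv;
}
```

For example, a lattice constant of 5.43 Angstroms becomes about 10.26 Bohrs and a form factor of -0.21 Rydbergs becomes -0.105 Hartrees.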
The Pseudopotential class just holds the parameters for a pseudopotential and has a single function:
std::complex<double> Pseudopotential::GetValue(const Vector3D<int>& G, const Vector3D<double>& tau) const
{
    const int G2 = G * G;
    const double Gtau = 2. * M_PI * tau * G;
    double VS = 0;
    double VA = 0;
    if (3 == G2)
    {
        VS = m_V3S;
        VA = m_V3A;
    }
    else if (4 == G2)
    {
        VA = m_V4A;
    }
    else if (8 == G2)
    {
        VS = m_V8S;
    }
    else if (11 == G2)
    {
        VS = m_V11S;
        VA = m_V11A;
    }
    return std::complex<double>(cos(Gtau) * VS, sin(Gtau) * VA);
}
It’s quite simple because we need values only in some points: for G² over 11 the value can be very well approximated with 0, and for G² = 0 the value can be set to zero, being only a shift in energy.
All the above mentioned classes should be very easy to understand. I have only two of them left to describe, both being important. First, one for the Hamiltonian:
class Hamiltonian
{
public:
    Hamiltonian(const Material& material, const std::vector<Vector3D<int>>& basisVectors);
    void SetMatrix(const Vector3D<double>& k);
    void Diagonalize();
    const Eigen::VectorXd& eigenvalues() const { return solver.eigenvalues(); }
protected:
    const Material& m_material;
    const std::vector<Vector3D<int>>& m_basisVectors;
    Eigen::MatrixXcd matrix;
    Eigen::SelfAdjointEigenSolver<Eigen::MatrixXcd> solver;
};
You may recognize there the material for which the Hamiltonian is constructed, the basis vectors that are used for matrix representation of the Hamiltonian, the actual Hamiltonian matrix and the solver used to diagonalize it.
You may look in the project sources^{1} for the full implementation, here is the one for SetMatrix:
void Hamiltonian::SetMatrix(const Vector3D<double>& k)
{
    const unsigned int basisSize = static_cast<unsigned int>(m_basisVectors.size());
    for (unsigned int i = 0; i < basisSize; ++i)
        for (unsigned int j = 0; j < i; ++j) // only the lower triangular of matrix is set because the diagonalization method only needs that
            // off diagonal elements
            matrix(i, j) = m_material.pseudopotential.GetValue(m_basisVectors[i] - m_basisVectors[j]);
    for (unsigned int i = 0; i < basisSize; ++i)
    {
        // diagonal elements
        // this is actually with 2 * M_PI, but I optimized it with the /2. from the kinetic energy term
        const Vector3D<double> KG = M_PI / m_material.m_a * (k + m_basisVectors[i]);
        matrix(i, i) = std::complex<double>(2. * KG * KG); // 2* comes from the above optimization, instead of a /2
    }
}
If you looked over the theory, you’ll recognize the formulae immediately.
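The comment about the factor of 2 on the diagonal can be checked with a small standalone sketch. It assumes Hartree atomic units, with k and G expressed in units of 2π/a, and it compares the textbook kinetic term |(2π/a)(k + G)|² / 2 with the form used in SetMatrix. The function names are only for this example.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;

// Textbook form: half of the squared length of (2 pi / a) (k + G).
double KineticTextbook(const Vec3& k, const Vec3& G, double a)
{
    assert(a > 0.0);
    const double pi = std::acos(-1.0);
    double sum = 0.0;
    for (int i = 0; i < 3; ++i)
    {
        const double q = 2.0 * pi / a * (k[i] + G[i]);
        sum += q * q;
    }
    return 0.5 * sum;
}

// Form used in SetMatrix: twice the squared length of (pi / a) (k + G).
double KineticOptimized(const Vec3& k, const Vec3& G, double a)
{
    assert(a > 0.0);
    const double pi = std::acos(-1.0);
    double sum = 0.0;
    for (int i = 0; i < 3; ++i)
    {
        const double q = pi / a * (k[i] + G[i]);
        sum += q * q;
    }
    return 2.0 * sum;
}
```

Both functions return the same value, since (2π/a)² / 2 = 2 (π/a)². Without the pseudopotential terms, these diagonal values alone give the free electron (empty lattice) bands.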
The last one is BandStructure, a little more complex but not a big deal. AdjustValues just converts Hartrees to eV and shifts the energy value to a zero which is found with FindBandgap. Initialize is straightforward, GenerateBasisVectors just fills basisVectors with vectors that have the proper length (that length squared is in G2).
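As an illustration of what such a basis looks like, here is a minimal sketch that enumerates reciprocal lattice vectors of the face-centered cubic lattice up to a maximum squared length. It relies on the fact that, in units of 2π/a, these vectors are the integer triples whose components are either all even or all odd. This is not the project’s GenerateBasisVectors, just one possible way to obtain such vectors; the names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

using IntVec = std::array<int, 3>;

// All FCC reciprocal lattice vectors (units of 2 pi / a) with squared length <= maxG2.
std::vector<IntVec> GenerateFccReciprocalVectors(int maxG2)
{
    assert(maxG2 >= 0);
    // no component can exceed sqrt(maxG2) in absolute value
    const int c = static_cast<int>(std::floor(std::sqrt(static_cast<double>(maxG2))));

    std::vector<IntVec> result;
    for (int x = -c; x <= c; ++x)
        for (int y = -c; y <= c; ++y)
            for (int z = -c; z <= c; ++z)
            {
                // body-centered cubic points: components all even or all odd
                const bool sameParity = ((x - y) % 2 == 0) && ((y - z) % 2 == 0);
                if (sameParity && x * x + y * y + z * z <= maxG2)
                    result.push_back({ x, y, z });
            }
    return result;
}

// Counts the vectors that have exactly the given squared length.
int CountWithNorm2(const std::vector<IntVec>& vectors, int g2)
{
    int count = 0;
    for (const IntVec& v : vectors)
        if (v[0] * v[0] + v[1] * v[1] + v[2] * v[2] == g2)
            ++count;
    return count;
}
```

With maxG2 = 11 this gives 51 vectors: one with G² = 0, 8 with G² = 3, 6 with G² = 4, 12 with G² = 8 and 24 with G² = 11, which are exactly the squared lengths that appear in Pseudopotential::GetValue.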
Here is the Compute function:
std::vector<std::vector<double>> BandStructure::Compute(const Material& material, unsigned int startPoint, unsigned int endPoint, unsigned int nrLevels, std::atomic_bool& terminate)
{
    std::vector<std::vector<double>> res;
    Hamiltonian hamiltonian(material, basisVectors);
    for (unsigned int i = startPoint; i < endPoint && !terminate; ++i)
    {
        hamiltonian.SetMatrix(kpoints[i]);
        hamiltonian.Diagonalize();
        const Eigen::VectorXd& eigenvals = hamiltonian.eigenvalues();
        res.emplace_back();
        res.back().reserve(nrLevels);
        for (unsigned int level = 0; level < nrLevels && level < eigenvals.rows(); ++level)
            res.back().push_back(eigenvals(level));
    }
    return res; // returning the local directly lets the compiler elide the copy, std::move is not needed
}
kpoints is a vector of positions in the reciprocal space, previously generated in Initialize using SymmetryPoints::GeneratePoints. You can see above how the Hamiltonian class is used. The code just iterates between startPoint and endPoint: for each k-point the Hamiltonian matrix is set, then it’s diagonalized, then the eigenvalues are retrieved and saved in the results. startPoint, endPoint and terminate are used because the computation can be done with several threads.
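One simple way to obtain the startPoint and endPoint pairs for several threads is to split the k-points into contiguous chunks. The sketch below only illustrates that idea; how the project actually divides the work is not shown in this post, so the function name and the splitting strategy are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Splits [0, nrPoints) into at most nrThreads contiguous [start, end) ranges.
std::vector<std::pair<unsigned int, unsigned int>> SplitKPoints(unsigned int nrPoints, unsigned int nrThreads)
{
    assert(nrThreads > 0);
    std::vector<std::pair<unsigned int, unsigned int>> ranges;

    // ceiling division, so the last range may be shorter than the others
    const unsigned int chunk = (nrPoints + nrThreads - 1) / nrThreads;

    for (unsigned int start = 0; start < nrPoints; start += chunk)
        ranges.emplace_back(start, std::min(start + chunk, nrPoints));

    return ranges;
}
```

Each pair could then be passed as startPoint and endPoint to a call of Compute running in its own thread, with the per-thread results concatenated in order at the end.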
For how that is done and how the data is put into a chart, you’ll have to look into the project sources^{1}, mostly in the EPThread and EPseudopotentialFrame classes.
Options is for… options and OptionsFrame is the class that implements the options dialog box for editing them. EPseudopotentialApp is a very simple wxWidgets^{6} application class. wxVTKRenderWindowInteractor is a class that allows one to easily use VTK^{7} from wxWidgets; I got it from here^{8} and modified it slightly to be able to compile it with a more recent version of wxWidgets, and that’s about it.
It should be obvious by now that I switched from mfc to wxWidgets^{6}. I got bored with mfc, although it’s easier to use with Visual Studio, and since they released a wxWidgets version that compiled with no issues on Windows, I decided to use it from now on for the projects for this blog. The big benefit is that with minimal effort the projects could be compiled to run not only on Windows, but on Linux or Mac as well. Since I switched to wxWidgets, to keep things portable I also had to avoid using my 2D charting class which I used in other projects. Instead of writing some new, portable code for a chart, I preferred to use VTK^{7} for the 2D chart.
As in many other projects, I also used Eigen^{9}.
This concludes the presentation of the Empirical Pseudopotential project. If you find any issues, please let me know, either here or on GitHub.
Since I’m quite busy I might not post here again for a while, but there is already another project on GitHub, also for computing band structure, but using the tight binding method. It is based on this project, I simply had to remove some code and change it in some places (obviously, for example the Hamiltonian had to be changed). The code is even simpler than the one of this project. In fact, it’s deceptively simple, the theory is not so simple if you go into the details.
This post is describing another project that is leading (hopefully) towards something a little more serious: DFT Quantum Dot^{1}. It will be a very short description because it’s Sunday.
As usual, here is the program in action:
It displays the quantum dot ‘orbitals’ with VTK volume rendering. Do not assign much physical reality to individual orbitals (the total density is another matter) or energy levels (but for the latter see Koopmans’ theorem).
This project is based on this lecture, already mentioned on this blog:
You might also want to look over this: New Algebraic Formulation of Density Functional Calculation^{2} and the lecture support files: Tomas Arias lectures support files^{3}. Some DFT theory was exposed in the last post, so there is enough theory to allow one to understand the program.
I usually list some of the more important code in the posts, but I won’t do that here. I will only point out where to look in the project^{1} instead. Besides, if you want to understand the code, the best way is to first watch the lectures, read the paper^{2} and solve the assignments in Matlab, Octave or SciLab (the last one might lack some functions you’ll need, so you’ll have to implement them yourself). Of course you could also use another language like python, but the Tomas Arias lectures support files^{3} already contain some skeleton m files and some helper code that will shorten the development time.
This project^{1} corresponds to the third assignment (without the last part, which is easy). If you know Matlab a little, it shouldn’t take you long to solve the first three assignments, then understanding the C++ code should be much easier.
I re-implemented the Poisson solver (already described here for another project) in the DFT solver class, this time using the notation from the lectures. The reason for the more complex code is to allow changing it to use another basis set very easily. You might want to look into the Poisson project to see it implemented in a simpler manner. I also reused the FFT classes used in the NMRI project.
Everything related to the DFT computations is in the DFT namespace. You’ll see in there the RealSpaceCell and ReciprocalSpaceCell classes, which have few changes (if any, I don’t recall if I changed them or not) from the last project. The QuantumDot class hopefully has a suggestive name; it is very simple and it just initializes the potential in the constructor. The potential can be harmonic (as in the assignment) or linear. The most important DFT class is DFTSolver, which allows using different exchange-correlations by using a template parameter. For the assignments, the Vosko-Wilk-Nusair is already provided. I implemented it in C++ and tested the project with it, but I did not provide that class on GitHub. Instead I put in the project the ChachiyoExchCor class, which, as the name says, implements the Chachiyo correlation^{4}. It’s simpler to implement and should be good enough for the purpose of this blog.
The classes for minimum finding are in the DFT::SolutionFinders namespace. I abused templates a little in there, but that allowed me to have a single implementation for the Descend method (except for SteepestDescent) and just plug in the different code for each minimum finding method. It’s probably far from being the best way. Anyway, with the help of the lectures the code should not be hard to understand. Here are some additional Wikipedia pages: Gradient descent, Preconditioning, Conjugate-gradient methods, Nonlinear Conjugate-gradient method. In addition, this paper (referenced on the Wikipedia page) should be better than what I would have the patience to write here: An Introduction to the Conjugate Gradient Method Without the Agonizing Pain^{5}.
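To show the simplest of these methods in isolation, here is a minimal steepest descent sketch for a two-dimensional quadratic function f(x) = x·A·x/2 - b·x, with A symmetric and positive definite. It is a toy stand-in, not the code from DFT::SolutionFinders: the energy functional of the project is replaced by a quadratic so that the exact line search has a closed form. All names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec2 = std::array<double, 2>;
using Mat2 = std::array<Vec2, 2>;

Vec2 Apply(const Mat2& A, const Vec2& x)
{
    return { A[0][0] * x[0] + A[0][1] * x[1],
             A[1][0] * x[0] + A[1][1] * x[1] };
}

double Dot(const Vec2& a, const Vec2& b)
{
    return a[0] * b[0] + a[1] * b[1];
}

// Minimizes f(x) = x.A.x / 2 - b.x, which is the same as solving A x = b.
Vec2 SteepestDescent(const Mat2& A, const Vec2& b, Vec2 x, double tol = 1e-12, int maxIter = 1000)
{
    assert(tol > 0.0);
    for (int iter = 0; iter < maxIter; ++iter)
    {
        const Vec2 Ax = Apply(A, x);
        const Vec2 r = { b[0] - Ax[0], b[1] - Ax[1] }; // residual, equal to minus the gradient
        const double rr = Dot(r, r);
        if (std::sqrt(rr) < tol)
            break; // converged, also avoids a division by zero below

        const double alpha = rr / Dot(r, Apply(A, r)); // exact line search for a quadratic
        x[0] += alpha * r[0];
        x[1] += alpha * r[1];
    }
    return x;
}
```

For A = ((4, 1), (1, 3)) and b = (1, 2) the minimum is at (1/11, 7/11). The conjugate gradient methods improve on this by choosing search directions that do not undo the progress made along the previous ones.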
You might also want to look into the document class and view class, the latter especially if you want to figure out the VTK code. The view code is a little more complex than it should be because on my video card I had an issue which I tried to alleviate by hiding the warning/error VTK window (and logging to a file instead) and I also added some recovery code (which usually works, but not always). Sometimes when changing the orbital the VTK library fails. I did not look much into it, I simply provided a workaround which stays with the old orbital, forcing you to retry switching it.
Besides the usual C++ libraries and mfc, the project uses VTK^{6} for visualization, FFTW^{7} for Fourier Transform, and Eigen^{8} for matrix computations.
That’s about it. I might implement another project for atoms/molecules next, but if you are impatient (or in case I decide not to implement it), you could implement it yourself, it’s quite easy using the code from this project. In fact, I already implemented it and tested some simple atoms/molecules but I removed the code from the project (later edit: now that code is also in the project, as an example of the solver usage).
As I already revealed in the last post, I intend to have several projects with Density Functional Theory on this blog. I already have a simple project on GitHub, about a ‘quantum dot’^{1} with volumetric visualization of orbitals with VTK.
I thought that exposing some theory in a separate post would be nice for further references, so without further ado, here it is.
The How to solve a quantum many body problem post is relevant, it contains topic related things like Born–Oppenheimer approximation and variational principle. The post on Hartree-Fock is also relevant, up to the Hartree term discussion. Actually, all of it is relevant but I’ll detail that later. The section about basis sets is also important. I used plane-waves for the DFT code, but it could be changed to use something else, like Gaussians or wavelets. Also the last post, Solving Poisson Equation contains related information.
I’m not going to give a lot of links here, one can find a lot of them on the net, but a couple shouldn’t hurt. Here is the Nobel Lecture of W. Kohn^{2}. A very nice read, but if you want something more detailed, here is another one: A bird’s-eye view of density-functional theory by Klaus Capelle^{3}. Another one would be The ABC of DFT by Kieron Burke & friends^{4}. Since I mentioned Hartree-Fock, here is a paper with a comparison: Density Functional Theory versus the Hartree Fock Method: Comparative Assessment^{5}. Here is a link from where you can download some lecture notes: Electron Density functional theory by Roi Baer^{6}. Last but not least, I should mention a paper I already mentioned last post: New Algebraic Formulation of Density Functional Calculation by Sohrab Ismail-Beigi and T. A. Arias^{7}.
As with the pdf documents, one can find plenty of video lectures on the net, this is a very big subject. Nevertheless, besides the video about the practical aspects I already pointed to in the last post, I’ll put here this introductory one:
Besides this, I’ll point to a recent set of video lectures from CECAM: Teaching the Theory in Density Functional Theory^{8}. They are more detailed, so if the above video is easy for you, you might want to see some of those.
As in the Hartree-Fock case, I’ll briefly present here some sort of justification (but it’s hand waving, really, you should check out the real derivations). The theory requires some variational calculus (as in the Hartree-Fock case) and it would be too much to type here in LaTeX, so for the details one could follow the above links.
In the Hartree-Fock post I presented an equation containing the Hartree term:
Here it is written a little differently, but it’s the same thing. The second term contains the nuclear potential and perhaps some other external potential, the third one has the sum moved after the integral sign, then the sum is done to obtain the density. This is relatively easy to solve but it has big issues, being a mean field theory equation. This equation neglects electron correlations and the Pauli exclusion principle is not obeyed. For molecules it gets energies too high, bindings too weak and distances between atoms too big. Electrons lower the total energy by avoiding each other and make the bindings stronger.
The Hartree-Fock equation tries to solve the issue (along with eliminating the self-interaction that appears due to the Hartree term) by adding the exchange term. But that still neglects the electron correlations.
It turns out that one can try to take into account correlations, too, by using an additional exchange-correlation potential term. By adding such exchange-correlation term, one gets the Kohn-Sham equations:
The resemblance with the Hartree-Fock equations should be obvious. For Hartree-Fock, only exchange is taken into account and the potential is a non-local one.
For DFT, the potential is obtained from the exchange-correlation energy functional as its functional derivative with respect to the density, V_xc = δE_xc[n]/δn. The energy is a functional of the density, whence the naming.
The density is the usual:
This makes the Kohn-Sham equations, similarly with the Hartree-Fock ones, self-consistent equations. One could solve them by guessing a density for the electrons, use it to solve the Kohn-Sham equations, get the new density out of the solutions and so on, repeating until convergence. For details, please see the Hartree-Fock posts.
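The self-consistency loop described above can be sketched as follows. The part that would solve the Kohn-Sham equations and rebuild the density is passed in as a function, and the new density is linearly mixed with the old one, which is a common way to stabilize the iteration. The names and the toy ‘solver’ (a component-wise cosine, chosen only because its fixed point is well known) are assumptions made for this example, they have nothing to do with the actual physics.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

using Density = std::vector<double>;

// Iterates density -> solve(density) with linear mixing until the change is below tol.
Density SelfConsistentLoop(Density density,
                           const std::function<Density(const Density&)>& solve,
                           double mixing = 0.5, double tol = 1e-12, int maxIter = 1000)
{
    assert(mixing > 0.0 && mixing <= 1.0);
    for (int iter = 0; iter < maxIter; ++iter)
    {
        const Density newDensity = solve(density); // stand-in for "solve Kohn-Sham, rebuild density"
        assert(newDensity.size() == density.size());

        double maxChange = 0.0;
        for (std::size_t i = 0; i < density.size(); ++i)
        {
            const double mixed = (1.0 - mixing) * density[i] + mixing * newDensity[i];
            maxChange = std::max(maxChange, std::abs(mixed - density[i]));
            density[i] = mixed;
        }
        if (maxChange < tol)
            break; // self-consistency reached
    }
    return density;
}

// Toy update used only to exercise the loop: the fixed point of cos is about 0.739085.
Density ToySolve(const Density& n)
{
    Density result(n.size());
    for (std::size_t i = 0; i < n.size(); ++i)
        result[i] = std::cos(n[i]);
    return result;
}
```

Calling SelfConsistentLoop({0.0, 1.0}, ToySolve) drives both components to the fixed point of the cosine. In a real program the convergence test would more likely be on the total energy or on an integrated density difference.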
Unfortunately the exchange-correlation functional is not known exactly, so approximations are used. For the purpose of this blog, the local-density approximation is good enough, so I’ll use only that one:
It’s worth mentioning that there are other approximations, like the generalized gradient approximation, where besides the local density also the gradient of density is used, or the meta-GGA, where not only the first derivative but also the second derivative is used. There are also hybrid approaches using the Hartree-Fock exchange.
The exchange-correlation energy is further divided into the exchange and correlation parts and the approximation is given by the jellium model, the exchange being known exactly, the correlation being obtained by quantum Monte Carlo.
One more thing I want to mention here is that one could use LSDA instead of LDA (S comes from ‘spin’). The main idea is to work with two densities, n↑ and n↓ (one for each spin orientation), instead of one. It’s not much more difficult.
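Since the exchange part is known exactly for the uniform electron gas, it is easy to write down. The sketch below gives the spin-unpolarized LDA exchange energy per electron, -(3/4) (3n/π)^(1/3), and the corresponding exchange potential, -(3n/π)^(1/3), both in Hartree atomic units. The function names are hypothetical and the correlation part is deliberately left out.

```cpp
#include <cassert>
#include <cmath>

// Exchange energy per electron of the uniform electron gas with density n.
double LdaExchangeEnergyPerElectron(double n)
{
    assert(n >= 0.0);
    const double pi = std::acos(-1.0);
    return -0.75 * std::cbrt(3.0 * n / pi);
}

// Exchange potential: the derivative of n * eps_x(n) with respect to n,
// which works out to 4/3 of the energy per electron.
double LdaExchangePotential(double n)
{
    assert(n >= 0.0);
    const double pi = std::acos(-1.0);
    return -std::cbrt(3.0 * n / pi);
}
```

The potential is obtained by differentiating n times the energy per electron with respect to n; because the energy per electron scales as n^(1/3), the result is simply 4/3 of it.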
There are various ways one could try solving the Kohn-Sham equations. One way would be to attack the problem in real space. It’s easy to implement the code like that (but not so easy to do it efficiently) and it’s easy to understand. If you want to go that way, you should check this paper: Real-space mesh techniques in density functional theory by Thomas Beck^{9}.
Another way would be to use Gaussian orbitals. For molecules calculations it would be a very good choice. One could take the already implemented Hartree-Fock program and change it to be also able to perform DFT. That should not be hard at all, in comparison with the difficulty of implementing the Hartree-Fock one. I considered doing that, but because I want to use DFT in more projects than for molecules and because I don’t want to make that project more complex than it is, I gave up the idea.
Another approach would be to start a program from scratch, writing the code in the way described by Tomas Arias^{7} in the lectures I pointed to in the last post and use the Gaussian orbitals instead of the plane waves basis. The relevant code for the Gaussian orbitals could be taken from the Hartree-Fock program.
Anyway, I decided to use plane waves basis. For a crystal it is a better choice because of the periodic boundary conditions. Working with the reciprocal lattice is a natural choice.
Here I presented briefly some DFT theory for further reference. I might edit the post to add some things in the future, but it’s unlikely that I’ll do substantial additions. The links expose enough theory for understanding the programs.
When I started this blog I already expected to have projects that use the Fast Fourier Transform. I actually wrote down several topic ideas for the blog, both solving the Poisson equation and the subject this post will lead to were there, too. I already mentioned in the Relaxation Method post that one can use the Discrete Fourier Transform to solve the problem faster and here it is, as an intermediate step leading to at least one project on Density Functional Theory.
I won’t discuss here the Fast Fourier Transform, I already have the previous post for that, so the classes that deal with FFT will not be presented here. I just copied them from the last project and used them. I will reuse them in the DFT project(s) as well, along with the Poisson solver (probably with changes).
This post also introduces VTK, The Visualization Toolkit^{1}, a library which I will also use in future projects. Even for this project, where I could implement those surfaces easily with OpenGL (but not so easily the axes behaving like those from VTK, an implementation which could take quite long and be quite boring), the library saved quite a bit of time. VTK allows very impressive visualizations of scientific data, saving a lot of time (I’m told that the library has way more than 1000 years of developer time in it, obviously I cannot beat it in a reasonable amount of time). Using it will allow me to focus more on the parts related to the blog topic and will allow me to make some projects that I would avoid otherwise due to the visualization complexities, as I avoided for example the 3D flow for the Lattice Boltzmann post. I know it’s nicer to just download the code and be able to compile it, with as few dependencies as possible, but having the VTK dependency brings too many advantages to avoid it. I might use just OpenGL in the future for certain projects, but for complex 3D visualizations or even charting, except very simple charts as I already used in projects for this blog, I’ll use VTK.
For this post I implemented a Poisson equation solver using a spectral method. Here is the program in action:
What you see in there is just a section halfway through the 3D volume, with periodic boundary conditions. In the left view I represented the charge density, generated with two Gaussians; in the right view is the solution to the Poisson equation. The source code for the project is on GitHub^{2}.
I didn’t want to reveal the purpose of bringing up the FFT themes on the blog yet, because I thought that I might just give up implementing the DFT project for various reasons, but I thought it would be nice to give some links I looked into while having a DFT prototype implemented. Incidentally the featured image at the top of the post is an Octave chart I generated while implementing the DFT prototype.
At first I wanted to simply write an easy DFT project using a real space discretization. I already implemented a simple DFT program (proof of concept style) that way, some time ago. Discretization of the real space and using a simple method like the finite difference method is not a bad way to start, the method is simple to understand and implement. But if you want some performance it might not be the best way, so this time I thought to use Gaussian orbitals, similarly to the Hartree-Fock project. Then I thought I might want to implement sometime a project for a crystal, in which case the periodic boundary conditions come into play. In such a case, due to periodicity, the plane wave basis is very good and besides, it brings something new for this blog. With the help of the Hartree-Fock posts and the DFT ones, there should be enough information to be able to implement a DFT program using Gaussian orbitals, too.
So, I decided to go the plane waves basis way, then I started looking for some info on the net about it. I knew the theory, knew some things about implementation but I wanted to look into the details a little more, the devil is in the details. I started working on the Hartree-Fock project before having all the details clear and it took me more time than I anticipated. Of course there are plenty of lectures on DFT, I’ll point to those in the posts about DFT, but they touch only the theory. With a little search on YouTube I found something more appropriate, from Cornell University:
The first two lectures and a part of the third are relevant to this project, if I recall correctly.
Unfortunately it misses some parts, but with the help from the links the gaps can be filled. Here is a very useful link (there is a link mentioned in the lecture, that’s how I reached this link): Arias Classes^{3}. You want to look into the Phys 7654 Practical DFT folders, preferably the latest. You’ll find there readings and very importantly, assignments. One of the theory papers is also found on arxiv: New Algebraic Formulation of Density Functional Calculation^{4}. I just implemented the assignments in Octave to be sure it’s something relatively easy to do, now I only have to ‘translate’ them to C++ and of course, have a nicer visualization if possible. The project for this blog post corresponds to the first assignment, but computations are simplified to make the project easier to understand. There is no ‘overlap’ operator in the C++ project (the basis is orthogonal) and I also got rid of the various cI, cIdag, cJ, cJdag operators; the forward and inverse FFT are sufficient for this project.
Poisson equation appears in various contexts, but for now we are interested in the electromagnetic field. The equation is:
Where ∇² is the Laplace operator. By the way, from now on I will consider a Coulomb gauge. You may attack this equation directly, numerically, by discretizing space and in 2D you can even beat FFT with multigrid methods, for details please check this page. Its solution is:
For more details, please check this page: Mathematical description of the electromagnetic field.
Not surprisingly, because of linearity, the field is given by the ‘sum’ (the integral) of the fields generated by each ‘point’ charge in the volume. You could try to solve the integral numerically and for a few point charges it wouldn’t be so difficult. Also for a limited number of localized charge distributions, with spherical symmetry, you could use some tricks to simplify things, but in general, this integral is quite hard to solve in real space. For 3D, if you cut the volume in N pieces, you have to sum over all of them for each point, that is, the complexity is O(N²). FFT is O(N log N), which is much faster. So let’s use the Fourier transform on the Poisson equation:
You can interchange the Laplace operator with the integral and, since the operator does not act on k, it skips over the Fourier coefficient and acts only on the exponential. It just brings down ik twice. So in the reciprocal space the Laplace operator becomes quite simple, a multiplication with -k². That is, you won’t have to deal with a costly diagonalization: the operator is diagonal and you only have to perform a simple multiplication to apply it. To solve the Poisson equation you have to compute the charge density in the reciprocal space using the discrete Fourier transform, solve the equation by simply dividing each value by k² (together with the constant factor from the right-hand side of the equation), which gives the potential in the reciprocal space, then simply do the inverse discrete Fourier transform back to the real space.
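The recipe can be tried out in one dimension with a minimal, self-contained sketch. To avoid any library dependency it uses a naive O(N²) discrete Fourier transform instead of an FFT, so it shows only the ‘divide by k²’ step, not the speed. It assumes the equation is written in Gaussian units, d²φ/dx² = -4πρ, on a periodic cell of length L, and it sets the k = 0 component to zero (a neutralizing background). All names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Naive discrete Fourier transform; sign = -1 is forward, +1 is inverse (not normalized).
std::vector<Complex> Dft(const std::vector<Complex>& in, int sign)
{
    const double pi = std::acos(-1.0);
    const std::size_t N = in.size();
    std::vector<Complex> out(N);
    for (std::size_t m = 0; m < N; ++m)
        for (std::size_t j = 0; j < N; ++j)
            out[m] += in[j] * std::polar(1.0, sign * 2.0 * pi * static_cast<double>(m * j) / static_cast<double>(N));
    return out;
}

// Solves d^2 phi / dx^2 = -4 pi rho with periodic boundary conditions.
std::vector<double> SolvePoisson1D(const std::vector<double>& rho, double L)
{
    assert(!rho.empty() && L > 0.0);
    const double pi = std::acos(-1.0);
    const std::size_t N = rho.size();

    std::vector<Complex> rhoK = Dft(std::vector<Complex>(rho.begin(), rho.end()), -1);
    for (std::size_t m = 0; m < N; ++m)
    {
        // indices above N / 2 stand for negative frequencies
        const double freq = (m <= N / 2) ? static_cast<double>(m)
                                         : static_cast<double>(m) - static_cast<double>(N);
        const double k = 2.0 * pi * freq / L;
        // -k^2 phi(k) = -4 pi rho(k), the k = 0 component is dropped
        rhoK[m] = (m == 0) ? Complex(0.0, 0.0) : 4.0 * pi * rhoK[m] / (k * k);
    }

    const std::vector<Complex> phi = Dft(rhoK, +1);
    std::vector<double> result(N);
    for (std::size_t j = 0; j < N; ++j)
        result[j] = phi[j].real() / static_cast<double>(N); // normalization of the inverse transform
    return result;
}
```

For ρ(x) = cos(x) on a cell of length 2π only the k = ±1 components are present, so the result is 4π cos(x). Replacing Dft with a real FFT changes only the cost, not the logic.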
The FFT code is in the Fourier namespace and I already mentioned it in the previous post. I also used in this project the Vector3D<T> class I used in other projects. I think it should be pretty clear what it does, I won’t describe it, either. There are other things present in the code (like Ewald summation) I won’t detail, either, for those details please see the video lecture and the lecture files.
All the code relevant to solving the Poisson equation is in the Poisson namespace. The RealSpaceCell and ReciprocalSpaceCell classes should be obvious even after a superficial look over the code, I think even the naming is revealing enough. If they are not clear, these two Wikipedia pages should have relevant information: Bravais lattice, Reciprocal lattice. Also following the lectures I pointed out and looking over the assignments should be very helpful. By the way, you should try to solve the assignments with Octave yourself if you are interested in the subject, the code here might not correspond 1:1 with the one required there. For example, you’ll see in there a matrix S which I got rid of (for example the cell volume is computed with det(S)). In general, you’ll have the cell basis vectors on columns in that matrix and its determinant gives the cell volume, but since up to computing a molecule you will need only orthogonal vectors, I got rid of it for now. This is not true in general for a crystal, though. It’s useful to see things in their more general form, although that adds complexity. I also got rid of the overlap matrix, also present in the lectures and assignments, because due to the orthogonality of the plane waves basis it’s diagonal, but the matrix O would be very relevant if you would use a Gaussian basis set, for example. One thing that might seem odd is the indexing of points in the reciprocal cell, which is to alleviate aliasing^{5}. If you look at how the indexing looks and consider the fact that the cell is periodic, what happens there is not that big a deal.
For now, the code uses only Gaussian charge distributions and the class GaussianChargeDistribution represents such a charge. It has a position vector inside and the charge Z. A number of such charges are packed inside the class Charges in a vector, along with some goodies which are useful for calculations. Now, if you look inside ComputeStructureFactorAndChargeDensity you might notice that the charge density is brought back to real space from Fourier space, which should suggest an optimization, since solving the Poisson equation with this method requires transforming the charge density to Fourier space anyway. The reason I didn’t use it is to keep solving the Poisson equation general: you typically start with the charge density specified in real space. But I added a comment that should be clear enough: you have to comment two lines of code and uncomment another to switch to the faster but less general solution. You can take advantage of the way the charge distribution is generated: a single Gaussian centered in the cell is generated, then it is transformed into reciprocal space and used to generate the whole charge distribution by simply ‘translating’ it (that’s why the StructureFactor exists).
Now, here are the relevant portions of the code for solving the Poisson equation, from the class PoissonSolver. First, the method that brings the solution back to real space:

static inline Eigen::VectorXcd SolveToRealSpace(Fourier::FFT& fftSolver, Poisson::RealSpaceCell& realSpaceCell, Poisson::ReciprocalSpaceCell& reciprocalCell, Charges &charges)
{
    Eigen::VectorXcd fieldReciprocal = SolveToReciprocalSpace(fftSolver, realSpaceCell, reciprocalCell, charges);

    Eigen::VectorXcd field(realSpaceCell.Samples());

    fftSolver.inv(fieldReciprocal.data(), field.data(), realSpaceCell.GetSamples().X, realSpaceCell.GetSamples().Y, realSpaceCell.GetSamples().Z);

    return field;
}
It just calls SolveToReciprocalSpace, which returns the solution in reciprocal space, and performs an inverse Fourier transform on it to have the solution in real space. Here is the code for having the solution in Fourier space, together with the mentioned comments:

static inline Eigen::VectorXcd SolveToReciprocalSpace(Fourier::FFT& fftSolver, Poisson::RealSpaceCell& realSpaceCell, Poisson::ReciprocalSpaceCell& reciprocalCell, Charges &charges)
{
    Eigen::VectorXcd fieldReciprocal(realSpaceCell.Samples());
    fftSolver.fwd(charges.ChargeDensity.data(), fieldReciprocal.data(), realSpaceCell.GetSamples().X, realSpaceCell.GetSamples().Y, realSpaceCell.GetSamples().Z);

    // uncomment this line and comment the two above if you want a faster solution
    //Eigen::VectorXcd fieldReciprocal = realSpaceCell.Samples() * charges.rg;

    fieldReciprocal(0) = 0;
    for (int i = 1; i < realSpaceCell.Samples(); ++i)
    {
        // inverse Laplace operator
        fieldReciprocal(i) *= 4. * M_PI / realSpaceCell.Samples() / reciprocalCell.LatticeVectorsSquaredMagnitude(i);
    }

    return fieldReciprocal;
}
There are two things here worth mentioning. First, the 4. * M_PI, which is there because of using atomic units, more specifically, because $4\pi\epsilon_0 = 1$.
The other one is setting fieldReciprocal(0) = 0;. If you calculated it instead by using the Laplace operator in Fourier space, you would get a division by zero. The reason for setting it this way is that the zero frequency corresponds to a constant in real space, which we conveniently set to zero. I already mentioned the Coulomb gauge…
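To make the recipe concrete, here is a minimal 1D sketch of the same steps: forward transform, division by $k^2$ with the $4\pi$ prefactor, zeroing the zero-frequency term, inverse transform. It is not the project’s code: it assumes a periodic 1D cell, uses a naive O(N²) DFT from the standard library instead of FFTW, and the names naiveDft and solvePoisson1D are made up for the illustration.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Naive O(N^2) discrete Fourier transform (illustration only, not FFTW).
// sign = -1 gives the forward transform, sign = +1 the unnormalized inverse.
std::vector<cplx> naiveDft(const std::vector<cplx>& in, int sign)
{
    const std::size_t n = in.size();
    const double pi = std::acos(-1.0);
    std::vector<cplx> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        cplx sum(0.0, 0.0);
        for (std::size_t j = 0; j < n; ++j) {
            const double angle = sign * 2.0 * pi * static_cast<double>((k * j) % n) / static_cast<double>(n);
            sum += in[j] * cplx(std::cos(angle), std::sin(angle));
        }
        out[k] = sum;
    }
    return out;
}

// Hypothetical helper: solves phi'' = -4 pi rho on a periodic cell of length cellLength.
std::vector<double> solvePoisson1D(const std::vector<double>& rho, double cellLength)
{
    const std::size_t n = rho.size();
    const double pi = std::acos(-1.0);

    std::vector<cplx> work(rho.begin(), rho.end());
    work = naiveDft(work, -1);

    work[0] = 0.0; // zero frequency: a constant in real space, set to zero
    for (std::size_t i = 1; i < n; ++i) {
        // indices above n/2 stand for negative frequencies
        const double m = (i <= n / 2) ? static_cast<double>(i)
                                      : static_cast<double>(i) - static_cast<double>(n);
        const double k = 2.0 * pi * m / cellLength;
        // inverse Laplace operator, with the 1/n normalization of the DFT pair
        work[i] *= 4.0 * pi / (k * k) / static_cast<double>(n);
    }

    work = naiveDft(work, +1);

    std::vector<double> phi(n);
    for (std::size_t i = 0; i < n; ++i) phi[i] = work[i].real();
    return phi;
}
```

For a density that is a single cosine with wave number k, the returned potential should be the same cosine scaled by 4π/k², which is an easy sanity check.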
Anyway, that’s about everything I have patience to comment about the Poisson solving code right now, there is plenty more to it but the lectures and the documentation supplied with them should be enough if you want to know more.
The second goal of this project was to introduce VTK, The Visualization Toolkit^{1} which I will use from now on for projects for this blog. They have a nice textbook and a nice user guide, please check them out.
The library can be used on various platforms and has support for parallel processing. It can be used not only from C++, but also from other languages, such as Python or Java, which is of less importance for the projects for this blog. It uses a pipeline architecture, having on one end the data sources (such as files, geometric objects or simply data structures like the ones I used for this project) and on the other end renderers that render in the render window. You can use multiple renderers to have multiple views in the same window, a feature that I took advantage of in this project. Along the pipeline you can have various ‘filters’ that process your data and pass it along from one to another; a lot of algorithms are already implemented (for example, Delaunay triangulation). Those objects can have multiple inputs and outputs. The library can do a lot of things, it’s worth looking over it!
Although the library has powerful charting in it (see for example the vtkChartXY class), I’ll probably use more than charting in future projects for this blog, although I might also use relatively simple charts, so for this project I generated and displayed the 3D surfaces instead of using the charting directly. The code that deals with VTK is in CPoissonDoc for the data sources, and the rest of it in CPoissonView. Although VTK provides smart pointers, I did not use them for simple cases where the objects were created once (typically in the constructor) and deleted at the end in the destructor. To expose data to VTK I put it into ‘image’ objects, which represent data that is evenly spaced. In this case it was 2D but it can also be 3D (or 1D for that matter). For example, in the document object you have vtkImageData* fieldImage; which is initialized in the constructor like this (the analogous charge density code is removed for clarity):
fieldImage = vtkImageData::New();
fieldImage->SetSpacing(realSpaceCell.GetSize().Y/realSpaceCell.GetSamples().Y, realSpaceCell.GetSize().Z/realSpaceCell.GetSamples().Z, 0);
fieldImage->SetDimensions(realSpaceCell.GetSamples().Y, realSpaceCell.GetSamples().Z, 1); //number of points in each direction
fieldImage->AllocateScalars(VTK_FLOAT, 1);
The VTK object is deleted in the document destructor with fieldImage->Delete();. The ::New() and ->Delete() are needed because on various platforms the actual objects created might differ (this is obviously the case for the rendering windows, for example); New is a factory method. Besides, the objects are reference counted, so the Delete method does not simply delete the object. Various objects might retain the pointer to the object and increase its reference count if it is passed to them, for example as an input connection.
The data is set in Calculate after solving the Poisson equation:

// slice the result - put the values in the 'image' data for VTK
for (unsigned int i = 0; i < realSpaceCell.GetSamples().Y; ++i)
    for (unsigned int j = 0; j < realSpaceCell.GetSamples().Z; ++j)
    {
        unsigned int pos = start + realSpaceCell.GetSamples().Z * i + j;
        fieldImage->SetScalarComponentFromDouble(i, j, 0, 0, field(pos).real());
    }
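The indexing in that loop suggests a flat, row-major layout with Z as the fastest index, so that start is the offset of the chosen X plane. Here is a small sketch of that idea, under exactly that assumption; flatIndex and extractSliceX are hypothetical helpers, not functions from the project.

```cpp
#include <cstddef>
#include <vector>

// Assumed layout (a guess based on the loop above): index = (x * ny + y) * nz + z.
std::size_t flatIndex(std::size_t x, std::size_t y, std::size_t z,
                      std::size_t ny, std::size_t nz)
{
    return (x * ny + y) * nz + z;
}

// Copies the plane with the given x out of a flattened 3D field.
std::vector<double> extractSliceX(const std::vector<double>& field,
                                  std::size_t x, std::size_t ny, std::size_t nz)
{
    const std::size_t start = flatIndex(x, 0, 0, ny, nz); // offset of the plane
    std::vector<double> slice(ny * nz);
    for (std::size_t i = 0; i < ny; ++i)
        for (std::size_t j = 0; j < nz; ++j)
            slice[nz * i + j] = field[start + nz * i + j];
    return slice;
}
```

With this layout a plane of constant X is contiguous in memory, which is why a single offset plus a 2D index is enough.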
It’s simply a slice through the ‘cell’.
That’s about all there is to it in the document class; most of the code is in the CPoissonView view class. I should mention that VTK actually has a vtkMFCWindow in ‘GUI support’ but I avoided using it; I feel that I have more control with the way I implemented the view.
As in the case of the document class, objects are created with ::New(), some of them in the view constructor, some of them in CPoissonView::OnInitialUpdate, and destroyed in the view destructor. The objects have some properties set in OnInitialUpdate, too. Drawing is done in CPoissonView::OnDraw and for drawing in the window it’s quite simple: the render window is asked to render (the method is more complex only because of printing and print preview). By the way, the render window is created by VTK embedded in the mfc window. You can use the mfc window directly (that is, draw into it instead of a child window) but I couldn’t make the interactor work that way.
Having two views in the same window is easy and here is how it works in the Pipeline implementation:

void CPoissonView::Pipeline()
{
    PipelineView(ren1, geometryFilter1, warp1, mapper1, chartActor1, axes1);
    PipelineView(ren2, geometryFilter2, warp2, mapper2, chartActor2, axes2);
}
The first call to PipelineView is for the left side view, the second one is for the right side view. Here is how PipelineView is implemented:

void CPoissonView::PipelineView(vtkRenderer *ren, vtkImageDataGeometryFilter* geometryFilter, vtkWarpScalar* warp, vtkDataSetMapper* mapper, vtkActor* chartActor, vtkCubeAxesActor2D* axes)
{
    warp->SetInputConnection(geometryFilter->GetOutputPort());

    //mapper->SetInputConnection(warp->GetOutputPort());

    //***************************************************************************************
    // Gouraud shading needs normals. Just uncomment the above and comment what follows up to //****
    // to remove shading
    vtkSmartPointer<vtkPolyDataNormals> normals = vtkSmartPointer<vtkPolyDataNormals>::New();
    normals->SetInputConnection(warp->GetOutputPort());
    normals->SplittingOff();
    normals->ComputePointNormalsOn();
    normals->ComputeCellNormalsOff();
    normals->ConsistencyOn();
    mapper->SetInputConnection(normals->GetOutputPort());
    //****************************************************************************************

    chartActor->SetMapper(mapper);
    chartActor->GetProperty()->SetInterpolationToGouraud();
    ren->AddActor(chartActor);

    // add & render CubeAxes
    axes->SetInputData(warp->GetOutput());
    axes->SetCamera(ren->GetActiveCamera());
    ren->AddViewProp(axes);
}
You should also look into CPoissonView::OnInitialUpdate(). In there, for example, the source of the data is set, but I’ll let you look on GitHub^{2} for the rest of the code.
Besides VTK^{1}, the project also uses FFTW^{6} and Eigen^{7} and obviously mfc.
This is the second step towards having a Density Functional Theory program working. The next step is going to be quite complex compared with the first ones, so you should have a good understanding of this project before looking into the next one. I still have no idea when I’ll have the next one ready, but I’ll have it done, eventually, for now it’s only as an Octave prototype.
I needed to use Fast Fourier Transform for a project that I’ll implement (hopefully) for this blog. Something quite different from the theme of this post, but I won’t reveal it. Depending on my free time and mood, I might have it working in a week or I might even drop it. I do have a working prototype implemented in Octave so there is a good chance I’ll have it implemented in C++.
The Fourier Transform is very important in physics. I already hinted at that on this blog: in the Relaxation Method post I mentioned that using an FFT would be better. Spectral methods are very important in solving differential equations and, from a physical point of view, the Fourier Transform is of paramount importance. That cannot be stressed enough, but I’ll give here just a hint: Reciprocal Lattice. You meet Fourier Transforms in many areas of physics; obviously I cannot enumerate them here…
So, I needed an FFT and, since it’s something I don’t like to implement myself, I thought I might use some classes from Eigen. They do have some unsupported classes in there for this purpose, but after looking over them I decided to use FFTW^{1} and implement the wrapper classes for the library myself. The ideas are based on the Eigen^{2} implementation, but that implementation lacks 2D and 3D transforms and multithreading support. I didn’t like the way they index plans, either, so I decided to implement the wrappers myself; this way I can easily extend them further as needed in the future. It’s quite easy to implement 2D and 3D transforms from the 1D transform, but why bother when the library already provides them?
I needed FFT for some other projects for this site, but as a quick test for the classes I thought I should have first a very easy project that uses the classes and it’s also related with physics. So here it is, a project that uses the 2D Fourier Transform to obtain the image from Nuclear Magnetic Resonance Imaging raw data. The project is on GitHub^{3}.
You can also see the program in action:
In the left image you can see the raw data, the right one obviously displays the image one would actually want to see after the measurement.
I thought that illustrating what happens if you cut out high or low-frequency information could be interesting, so I added that, too, not only simply visualization of raw and Fourier transformed data.
Here are some lectures from Stanford^{4}:
There are 30 lectures and if you have the patience to watch them, you’ll learn about many things, among them the Fast Fourier Transform and the 2D Fourier Transform in Medical Imaging.
Here is an online book related with the project subject: The Basics of MRI^{5}.
If you want to look more into Nuclear Magnetic Resonance, here is another one: The Basics of NMR^{6}.
The program^{3} is very simple: it’s the typical mfc program with a doc/view architecture, so you have the document class, the view class, the application class and the main frame class. Nothing fancy is going on in there: the view has a timer that redraws it, and drawing is done using the memory bitmap class I took from the Ising program and changed a little to work with the new data. Instead of options and property pages I simply used a properties window with two bool properties that allow filtering out the low and high frequency information. There is a NMRIFile class which implements loading the data from a file and, of course, there are also the FFT classes: FFTWPlan, for an FFTW plan, and the FFT class, both in the Fourier namespace.
By the way, the data file is from the GPU Gems 2 CD, available here^{7}. The entire text is also available^{8} and here is the Chapter 48. Medical Image Reconstruction with the FFT^{9}. You’ll find in there also a mouse heart file, but the structure is different, so I chose to load only Head2D.dat at startup (you’ll have to provide that file, from the GPU Gems CD, I did not put it on GitHub). If you want you could change the code to load the mouse file, too, it’s not a big deal. You’ll have to look over the GPU Gems code to figure it out.
Besides the FFT classes, you might want to look into NMRIFile::Load, which loads the file, and NMRIFile::FFT, which does what the name says, plus filtering out low and/or high frequency information if the filterLowFreqs and/or filterHighFreqs flags from the same object are set.
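To give an idea of what such filtering amounts to, here is a sketch of a frequency mask on a 1D spectrum. It only illustrates the principle: NMRIFile::FFT may well do it differently, the function filterSpectrum and its cutoff parameter are invented for this example, and the spectrum is assumed to be stored in the usual DFT order (bin 0 is the zero frequency, the upper half holds the negative frequencies).

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Zeroes the low and/or high frequency bins of a spectrum in DFT order.
// 'cutoff' is a hypothetical threshold in bins: |frequency| < cutoff counts as low.
void filterSpectrum(std::vector<std::complex<double>>& spectrum,
                    bool filterLowFreqs, bool filterHighFreqs, std::size_t cutoff)
{
    const std::size_t n = spectrum.size();
    for (std::size_t i = 0; i < n; ++i) {
        // magnitude of the signed frequency index for bin i
        const std::size_t freq = (i <= n / 2) ? i : n - i;
        const bool isLow = freq < cutoff;
        if ((isLow && filterLowFreqs) || (!isLow && filterHighFreqs))
            spectrum[i] = 0.0;
    }
}
```

Removing the low frequencies keeps mostly the edges of an image, while removing the high ones blurs it, which is what makes the two flags fun to play with.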
For compiling and execution, you’ll have to provide the FFTW library dll. One way could be to download its sources and compile them, but there is an easier way: you can download the dlls and headers from here^{10}. Just unpack the lib files and place them in C:\LIBs\fftw-3.3.5-dll64 (or the similar 32 bits directory if you compile for 32 bits). You’ll still have to generate the import libraries, but it’s easier than compiling the library.
Obviously they are the ones from the Fourier namespace, FFTWPlan and FFT.
The main goal of this project was to have the FFT classes implemented and tested, to ensure they are working correctly and I just picked a project that’s related with some physics, but very easy to implement.
I will reuse those classes in future projects so I tried to pick a library that’s very efficient, such computations can be quite slow. Apparently FFTW is used by Matlab, so it should be fast enough.
I also decided to stick to double values; float is usually too inaccurate. This simplifies the implementation; Eigen has it for floats, too (and also for long double, which in VC++ is the same as double).
FFT is a façade that aims to hide the FFTW complexity behind a simple interface. Internally, it holds ‘plan’ objects which are used in computation. FFTWPlan is an adapter that simply wraps an FFTW plan ‘handle’ and uses it in method calls. Actually, the FFTW plan is created only at the first call of a method of the plan object. Here is how the forward 1D transform is implemented for complex to complex; the others are very similar:

inline void fwd(fftw_complex* src, fftw_complex* dst, unsigned int n)
{
    if (!plan) plan = fftw_plan_dft_1d(n, src, dst, FFTW_FORWARD, FFTW_ESTIMATE | FFTW_PRESERVE_INPUT);

    fftw_execute_dft(plan, src, dst);
}
FFTW plans can be obtained by actually benchmarking various FFT methods and picking the most efficient one on the particular hardware used, they can even be saved and loaded later, but those things can be an overkill for the projects for this blog. I allowed setting a number of threads and I presumed an estimate would be good enough. If not, I’ll change the implementation later.
FFT has some maps that hold the 1D, 2D and 3D FFT plans, indexed by tuples containing information about ‘in place’ computation, alignment, whether it’s a direct or inverse transform, whether it’s between different types, and the size of the data transformed. Here is how the forward 3D transform is implemented:

inline void fwd(std::complex<double>* src, std::complex<double> *dst, int n0, int n1, int n2)
{
    GetPlan(false, false, src, dst, n0, n1, n2).fwd(reinterpret_cast<fftw_complex*>(src), reinterpret_cast<fftw_complex*>(dst), n0, n1, n2);
}
The GetPlan for 3D is simply:

inline FFTWPlan& GetPlan(bool inverse, bool differentTypes, void *src, void* dst, unsigned int n0, unsigned int n1, unsigned int n2)
{
    return Plans3D[std::tuple<bool, bool, bool, bool, unsigned int, unsigned int, unsigned int>(InPlace(src, dst), Aligned(src, dst), inverse, differentTypes, n0, n1, n2)];
}
Plans3D is a std::map:

std::map< std::tuple<bool, bool, bool, bool, unsigned int, unsigned int, unsigned int>, FFTWPlan> Plans3D;
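The trick used here is that std::map::operator[] default-constructs the value the first time a key is seen, so a plan object appears on demand and is reused afterwards. Here is a stripped-down sketch of the idea that does not need FFTW: DummyPlan is only a stand-in for FFTWPlan, and PlanCache with its shorter key is a hypothetical simplification, not the actual FFT class.

```cpp
#include <cstddef>
#include <map>
#include <tuple>

// Stand-in for FFTWPlan (hypothetical): "creates" its plan lazily on first use.
struct DummyPlan
{
    bool created = false;
    int executions = 0;

    void execute()
    {
        if (!created) created = true; // the real class would call an FFTW planner here
        ++executions;
    }
};

// Simplified cache: one plan per (in place, inverse, n0, n1, n2) combination.
class PlanCache
{
public:
    using Key = std::tuple<bool, bool, unsigned int, unsigned int, unsigned int>;

    DummyPlan& get(bool inPlace, bool inverse, unsigned int n0, unsigned int n1, unsigned int n2)
    {
        // operator[] inserts a default-constructed plan if the key is new
        return plans[Key(inPlace, inverse, n0, n1, n2)];
    }

    std::size_t size() const { return plans.size(); }

private:
    std::map<Key, DummyPlan> plans;
};
```

Calling get twice with the same arguments returns the same object, so the expensive planning step happens once per distinct kind of transform.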
Entire books exist about the theory involved here, also long lectures exist. I already pointed out several sources^{4}^{5}^{6}, you could check them out for some details. I thought I should write some words about it, though, but only a few.
Since this blog is about programs using numerical methods for computational physics, the focus will be on the Discrete Fourier Transform. The Fast Fourier Transform is nothing else than Discrete Fourier Transform, with optimized computations. Basically it avoids repeating computations.
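To see what ‘avoids repeating computations’ means, here is a small sketch comparing the direct O(N²) sum with a textbook recursive radix-2 Cooley-Tukey FFT. This is only an illustration and has nothing to do with how FFTW works internally; the recursive version assumes the length is a power of two, and the function names are chosen just for this example.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Direct evaluation of the DFT sum: O(N^2) operations.
std::vector<Complex> dft(const std::vector<Complex>& x)
{
    const std::size_t n = x.size();
    const double pi = std::acos(-1.0);
    std::vector<Complex> X(n);
    for (std::size_t k = 0; k < n; ++k)
        for (std::size_t j = 0; j < n; ++j)
            X[k] += x[j] * std::polar(1.0, -2.0 * pi * static_cast<double>((k * j) % n) / static_cast<double>(n));
    return X;
}

// Recursive radix-2 FFT: O(N log N). Assumes x.size() is a power of two.
std::vector<Complex> fft(const std::vector<Complex>& x)
{
    const std::size_t n = x.size();
    if (n <= 1) return x;

    // split into even and odd indexed samples and transform each half
    std::vector<Complex> even(n / 2), odd(n / 2);
    for (std::size_t i = 0; i < n / 2; ++i) {
        even[i] = x[2 * i];
        odd[i] = x[2 * i + 1];
    }
    const std::vector<Complex> E = fft(even);
    const std::vector<Complex> O = fft(odd);

    // each half-size result is reused for two output values
    const double pi = std::acos(-1.0);
    std::vector<Complex> X(n);
    for (std::size_t k = 0; k < n / 2; ++k) {
        const Complex t = std::polar(1.0, -2.0 * pi * static_cast<double>(k) / static_cast<double>(n)) * O[k];
        X[k] = E[k] + t;
        X[k + n / 2] = E[k] - t;
    }
    return X;
}
```

Both functions return the same values up to rounding; the second one simply reuses the two half-size transforms instead of recomputing their terms for every output frequency.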
Imagine you have some function $f(x)$, where x can be time or space. It could be 1D, 2D (as is the case with this project) or 3D, as will be the case with future projects, I hope. Typically one has one or more differential equations (ordinary, partial or integro-differential) that involve that function and need to be solved. A typical way of attacking such a problem with a computer would be to first discretize space (which, again, could also be time), then use a method like the Finite Difference Method. Doing that will turn the problem into simple (relatively; it can be very computationally intensive) algebra. Of course, this is not the only way of attacking the problem. You could also start by discretizing the space, considering only evenly spaced points (again, in space and/or time), but instead of working with the function values directly you express the function as a Fourier series, truncated to a partial sum (written here in one common normalization, for $N$ sample points $x_j = jD/N$ and $N$ values of $n$):
$$f(x_j) = \sum_{n} c_n e^{i k_n x_j}$$
where D is the size of the space (in general it would be the volume, but here think of the 1D case only) and $k_n = 2\pi n / D$. You can get rid of D by using a unit where the size of the cell is 1. Of course, the cell could have different sizes for x, y, z and, more, the three axes of the cell need not be orthogonal, but about such cases, maybe later.
Its inverse transform (this is the actual transform to the ‘frequency’ space, the above being actually the ‘inverse’ transform back to the real space) is:
$$c_n = \frac{1}{N} \sum_{j=0}^{N-1} f(x_j) e^{-i k_n x_j}$$
You’ll see later how this is useful in the case of the current topic, but in general, a good suggestion would be to look into using this when you notice a periodicity in the problem. For example, you’ll find periodicity in a crystal, where you have a basis cell repeated in the lattice. With Fourier transform you can switch to the reciprocal lattice which will ease computations. Sometimes even with no periodicity you can benefit from the plane wave basis decomposition by using a cell big enough with periodic boundary conditions.
The exponentials in the series terms have some nice properties; you can figure that out immediately by hitting them with a differentiation or an integral. You can see that way how a differential equation can turn into simple algebra. I hope I’ll have much more to say about this in future posts. By the way, since this post also handles a 2D problem and in the future 3D problems will be approached, solving a multi-dimensional Fourier transform is easy once you have the 1D transform. It’s easy to figure out that all you have to do is to repeat the 1D Fourier transform for each dimension in turn; the order has no importance. FFTW already has 2D and 3D transforms implemented, but for this project, for example, if only the 1D Fourier transform were available, all I would have to do is to Fourier transform each row of the raw matrix, then each column after that (or first the columns, then the rows).
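The row and column recipe can be sketched in a few lines. The code below is an illustration, not the project’s code: it uses a naive 1D DFT in place of FFTW, and the names dft1d, transformRows, transformColumns and dft2d are invented for the example.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;
using Matrix = std::vector<std::vector<Complex>>;

// Naive O(N^2) forward DFT of a 1D sequence (stand-in for a real FFT).
std::vector<Complex> dft1d(const std::vector<Complex>& x)
{
    const std::size_t n = x.size();
    const double pi = std::acos(-1.0);
    std::vector<Complex> out(n);
    for (std::size_t k = 0; k < n; ++k)
        for (std::size_t j = 0; j < n; ++j)
            out[k] += x[j] * std::polar(1.0, -2.0 * pi * static_cast<double>((k * j) % n) / static_cast<double>(n));
    return out;
}

// Replaces every row with its 1D transform.
void transformRows(Matrix& m)
{
    for (auto& row : m)
        row = dft1d(row);
}

// Replaces every column with its 1D transform.
void transformColumns(Matrix& m)
{
    if (m.empty()) return;
    const std::size_t rows = m.size();
    const std::size_t cols = m[0].size();
    std::vector<Complex> column(rows);
    for (std::size_t c = 0; c < cols; ++c) {
        for (std::size_t r = 0; r < rows; ++r) column[r] = m[r][c];
        const std::vector<Complex> transformed = dft1d(column);
        for (std::size_t r = 0; r < rows; ++r) m[r][c] = transformed[r];
    }
}

// 2D transform built from 1D transforms; the order of the two passes does not matter.
Matrix dft2d(Matrix m, bool rowsFirst)
{
    if (rowsFirst) {
        transformRows(m);
        transformColumns(m);
    } else {
        transformColumns(m);
        transformRows(m);
    }
    return m;
}
```

Calling dft2d with rowsFirst set to true or false gives the same matrix up to rounding, and the element at (0, 0) is the sum of all the input values.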
Before finishing this, I want to mention that it’s not the first time you could see the expansion in a functions basis, it was already used in the Hartree-Fock project, but there a local basis was used, having our basis composed from Gaussians. Since the orbitals used were not orthogonal, the program had to deal with the overlap matrix and the generalized Eigen-value problem. Since plane waves are orthogonal, the overlap matrix is diagonal, so in an analogous problem one gets rid of the overlap matrix and deals only with the regular Eigen-value problem. More, the matrix elements there were quite hard to calculate. For the plane wave basis, one could find out that not only the matrix elements are easy to calculate but also operators that appear in the equation will be diagonal. This will ease up computations a lot, but there will be more about the details in future posts.
Nuclear magnetic resonance is used for the magnetic resonance imaging so I thought I should at least point to some links here, I won’t enter much into details.
As the naming implies, the nuclear magnetic moment is the main ingredient in the method. Some nuclei do have spin. More, because the particles that compose a nucleus have charge, besides each having a 1/2 spin, they have a magnetic moment, too. The ‘have charge’ is more subtle than the net charge, because the neutron has no net charge but, still having 1/2 spin, it has a magnetic moment. This is because the quarks inside do have charge, despite adding up to a zero net charge. The hydrogen nucleus, being a single proton, has a 1/2 spin. The nuclei that have both an even number of protons and an even number of neutrons have no spin, so they are not interesting for the subject. The reason is that nucleons with opposite spins, identical in any other way, will pair up in the same energy level, ending up with a zero spin for the nucleus. A nucleus with an odd number of protons and an odd number of neutrons will have an integer spin, while for even-odd or odd-even numbers of protons and neutrons, respectively, the spin will be half-integer (as in 1/2, 3/2 or 5/2). Of course the nuclear spin is quantized, too. If there is no magnetic field, there will be 2S+1 degenerate states, but if a magnetic field is added, the degeneracy is lifted.
If the magnetic field is along the z-axis, the energy difference between adjacent levels will be:
$$\Delta E = g \mu_N B$$
where g is the g-factor and $\mu_N$ is the nuclear magneton. A transition between two levels is done by either a photon absorption (to a higher energy level) or an emission (to a lower energy level), with the energy $\hbar\omega$, so from both one gets the Larmor frequency, $\omega = g \mu_N B / \hbar$.
Obviously, the atoms are not typically at zero temperature, so only some nuclei will be in the lower energy state. For the two levels of a 1/2 spin, the populations are related by the Boltzmann factor:
$$\frac{N_{\text{upper}}}{N_{\text{lower}}} = e^{-\Delta E / k_B T}$$
The magnetization is proportional to the population difference, $N_{\text{lower}} - N_{\text{upper}}$. Now, imagine you apply a transverse field oscillating with the Larmor frequency. It’s the frequency that allows jumping between energy levels, so depending on which population is higher, you get either absorption or emission due to the transitions. There are more nuclei in the lower energy state, so you get an absorption of photons which will later turn into an emission when relaxing. There are two main ways of relaxing: one by transitioning back to the equilibrium distribution, the spin-lattice relaxation, and one due to decoherence/dephasing, the spin-spin relaxation. They have characteristic times denoted by $T_1$ and $T_2$, respectively. Please check out the links to wikipedia for the details or the online books^{5}^{6}.
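To get a feeling for the numbers, here is a small sketch that evaluates the level splitting $g \mu_N B$, the corresponding frequency $g \mu_N B / h$ and the Boltzmann population ratio for a proton. The constants are CODATA values typed in by hand (the proton g-factor is about 5.5857), and the namespace and function names are made up for this example.

```cpp
#include <cmath>

namespace nmr
{
    // Physical constants in SI units (CODATA values, entered by hand).
    constexpr double protonGFactor = 5.5856946893;
    constexpr double nuclearMagneton = 5.0507837461e-27; // J/T
    constexpr double planck = 6.62607015e-34;            // J s
    constexpr double boltzmann = 1.380649e-23;           // J/K

    // Energy difference between adjacent levels: g * mu_N * B.
    inline double levelSplitting(double gFactor, double fieldTesla)
    {
        return gFactor * nuclearMagneton * fieldTesla;
    }

    // Larmor frequency in Hz (not the angular frequency): splitting / h.
    inline double larmorFrequencyHz(double gFactor, double fieldTesla)
    {
        return levelSplitting(gFactor, fieldTesla) / planck;
    }

    // Upper to lower population ratio for a two-level (spin 1/2) system.
    inline double populationRatio(double gFactor, double fieldTesla, double temperatureKelvin)
    {
        return std::exp(-levelSplitting(gFactor, fieldTesla) / (boltzmann * temperatureKelvin));
    }
}
```

For a proton in a 1 T field this gives roughly 42.58 MHz, and at room temperature the two populations differ only by a few parts per million, which is why the NMR signal is so weak.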
I’ll try to describe a relatively simple method of doing 2D (3D is also possible) magnetic resonance imaging. In practice things can be more complex, but you can find out more about that from the links.
Above I described how by applying a longitudinal homogeneous magnetic field, one could afterwards apply a transverse pulse that would turn into an absorption and later into emission in a very narrow frequency range corresponding to the resonant frequency. The emission intensity is determined by the nuclei concentration, but how do we find out the specific places where the signal originated?
You can select a particular plane by applying, along the homogeneous magnetic field, a longitudinal magnetic field gradient. This way a frequency value (a narrow range, actually) will correspond to a particular plane in the sample. If you apply the transverse 90 degrees pulse while having the gradient active, you’ll have only nuclei from the selected plane excited and precessing with the Larmor frequency. By the way, the pulse is specially crafted to have a narrow rectangular shape in the frequency domain, so only a thin slice through the sample is excited.
After you apply the 90 degrees pulse, the longitudinal gradient is removed. If only this were done, the nuclei from the slice would all precess with the same Larmor frequency. But if a transverse gradient is applied, the precession frequency will depend on the local magnetic field. After the gradient is eliminated, they will again precess with the common Larmor frequency, but they will have different phases, depending on the position along the transverse gradient. This results in a phase encoding. Measuring the phase will tell you where along the direction of the transverse gradient the signal originated. There is a problem, though: one needs two coordinates to locate a point in a plane! To solve this, another transverse gradient, in a direction perpendicular to the phase encoding one, is applied while measuring the signal. This results in a frequency encoding, since different positions along the second transverse direction correspond to different Larmor frequencies.
I will refer you to the image on top of this post, more specifically, the left image with the raw data. Imagine xOy coordinate axis with the (0, 0) point in the center of the image. The Ox axis corresponds to frequency encoding direction and the Oy one to the phase encoding. The frequency encoding is easier to understand, so I’ll explain first the data along the Ox axis (with y=0). In that case, there is no phase encoding gradient applied. Only the longitudinal plane selection gradient is applied along with the 90 degrees pulse to excite nuclei in the selected plane, then they are left to decay in a transversal frequency encoding gradient. While they are decaying the emitted signal is recorded. What you see on the Ox axis is the signal recorded, Ox is the time axis. Obviously the signal originates from points with different Larmor frequency, being the sum of all such signals. To separate out the frequencies and in consequence to obtain the position along the frequency encoding gradient, one has to do a Fourier transform. This separation still results in having a signal originating from all over the line perpendicular on the frequency encoding gradient direction, that is, a sum of all signals from the points along that line. To separate out the points one needs the phase encoding.
Phase encoding is a bit harder to grasp, so I will refer you to this page^{11}. You don’t simply measure an absolute phase. What one can do is to measure relative phases. Typically one uses a reference signal to compare it with the measured signal. Forget about the fact that the signal is a sum of different frequencies; this is solved anyway by Fourier transforming the rows of the raw signal matrix. Imagine you have a reference signal of a certain frequency and you have your measured signal of the same frequency but with a different phase. To keep things simple, imagine they have the same amplitude (if not, you can adjust one of them to have the same amplitude as the other). Using the reference signal you can find out the phase difference. The problem in the NMRI case is that you don’t have a reference signal but you have many signals (of the same frequency, after you separated out frequencies with the horizontal Fourier transform) with different phases, added up. You cannot extract the phase using a single such signal. The solution is to apply a different phase encoding gradient several times and record the signal as for the case with no phase encoding gradient. The difference can be either in the strength of the gradient, or you can apply the same gradient but for a longer and longer period of time. For the latter case, what you have on the vertical axis is the duration of the phase encoding gradient. For the negative Oy half-plane the phase encoding gradient is simply reversed.
So, after you used the Fourier transform on each line, you have the frequencies separated (which in turn separates out the distances in the Ox direction), but you still have a sum of different phases for the signal. To separate out the individual phases, one simply has to do the same Fourier transform in the phase encoding direction, by this also obtaining the distance along the gradient for phase encoding.
The goal of this post was to introduce the Fourier Transform and hint about some of its usages. I’ll refer to it in some other posts, where I won’t insist on Fourier transforms; I’ll consider that part as known. I’ll reuse the classes presented here in some other projects, perhaps improved; hopefully the changes won’t need any more explanations.
It’s time for an easier topic than the last time. I noticed – but I also expected it – that I have more success with projects like the Solar System than with a project like the Numerical Renormalization Group. There are mainly two reasons for it: it looks more spectacular to see a 3D animation than a boring chart and the level required to understand it is lower. Those are some reasons why I’m going to have such easier and preferably nicely looking projects for this blog in the future, too. It also takes me much less time to implement such a project than a project as the Hartree-Fock one, for example.
So, the current topic is Lattice Boltzmann methods. The associated project is here^{1}. I already have it working for quite some time, but I had to implement the user interface, options and so on and I added some more boundary conditions for inlet and outlet, a little in a hurry and incomplete, but they should be enough to form an idea. In the meantime I took a short vacation and went to the annual Romanian hang gliding meeting, “Deltazaurii”, where I had a great time: part of a flight filmed from the crossbar.
The temptation to write a 3D one is big, what stopped me was first the 3D visualization which I don’t want to deal with currently and the amount of computation, which would require using the video card – with OpenCL or CUDA, but if I’ll decide to carry out something like this I would probably pick OpenCL. Even this project could use an OpenCL implementation, but the project would be a little harder to compile and also the code would be a bit less clear, so I gave up the idea.
It took very little time to have the algorithm working, just a bit more to have it multithreaded, most of the time was taken by refining displaying, adding the options, options property sheet and the additional boundary conditions for the inlet and outlet, where I became bored by the project so I rushed it out. One has to stop somewhere, or else developing a project on such a subject could take a lot of time. There are people that spent years developing projects related with subjects on this blog, obviously I cannot go into such depth if I want to have various topics on the blog and besides, such a project is very difficult to understand, which would be against the purpose of the blog.
Before entering into details, here is the program in action:
This is a topic where I have no intention to present a lot of theory, so you might have to look into some other places for information on it. As usual, I’ll give some links here, but there is plenty more information to be found on the internet.
Here are a couple of papers I’ve looked into and I found to be enough to get the general idea: The Lattice Boltzmann Method for Fluid Dynamics: Theory and Applications^{2}, Implementation techniques for the lattice Boltzmann method^{3}. I’ve heard good things about this one: The Lattice Boltzmann method with applications in acoustics^{4} although I very briefly looked over it. You should not stop there if you want more than this project covers, there is a lot to find out, there is a lot of work on thermal Lattice Boltzmann methods, a lot of work on boundary conditions, multi-phase flow and so on.
It’s not always advisable to write your own code (re-inventing the wheel) if you want to do some simulations, although my opinion is that for understanding the subject one should implement at least easy projects such as this one before using some sophisticated library written by somebody else. If you want to use libraries, here are a couple of projects: OpenLB^{5}, Palabos^{6}. I’m sure you can find more…
Very shortly, the idea for fluid dynamics simulation is to have some equations that model the fluid flow and solve them. They cannot be analytically solved except for very simple situations, so a numerical approach is used.
Those equations are obtained using some laws obeyed by the fluid, like conservation laws (mass conservation, momentum conservation, energy conservation). Historically, the fluid was considered a continuous medium and, without going into more detail, one had to solve the Navier-Stokes equations, together with mass conservation, boundary conditions and perhaps energy conservation thrown in, perhaps with simplifying assumptions, like considering the fluid incompressible. One could attack the problem using the finite difference method, but other methods are used as well, such as the finite element method, the finite volume method and so on. I won’t go into detail on this approach. I might decide to have something on this blog about either the finite element or the finite volume method in the future, but not necessarily for fluid dynamics.
It should be obvious that the continuum hypothesis is actually false: we know that real fluids are composed of interacting particles (atoms or molecules or more ‘exotic’ ones, as in a quark-gluon plasma). Of course it’s hopeless, currently and for the foreseeable future, to try to simulate such fluids ab initio for a typical fluid volume we need to simulate. We can observe that in some conditions some bigger particles (like sand) still behave like a fluid, so we could hope that even if we make the particles big, with some idealized interactions between them, we could get lucky and, because of universality, simulate the fluid flow using far fewer particles than the fluid has. Historically, that was the method used, with Lattice Gas Automata. It had some issues so it was quickly replaced by the Lattice Boltzmann methods.
The main idea is that instead of treating individual particles, a statistical physics approach is used, with distribution functions for the particles. More specifically, it starts from the Boltzmann equation, which describes the behavior of the particle distribution out of equilibrium, involving collisions:
$$\frac{\partial f}{\partial t} + \vec{v} \cdot \nabla_{\vec{r}} f + \frac{\vec{F}}{m} \cdot \nabla_{\vec{v}} f = \left( \frac{\partial f}{\partial t} \right)_{coll}$$
You can find the details about it either in the Wikipedia links or in the papers for which I provided the links already. The collision term can be quite complicated even with the “Stosszahlansatz”, making it a partial integro-differential equation that is quite hard to solve. The collision term is typically simplified further, using a relaxation time $\tau$, the collision term becoming $-\frac{1}{\tau} \left( f - f^{eq} \right)$, where $f^{eq}$ is the equilibrium distribution. In general, if you have a mix of different fluid phases, different phases have different relaxation times.
You’ll find the mathematical details of how to get from this to the discretized Navier-Stokes equations in the links, along with the advantages and disadvantages compared with the ‘classical’ methods.
The project^{1} I implemented to illustrate the method is a typical MFC doc/view program, similar to other projects described on this blog. For details about the classes unrelated to the Lattice Boltzmann method, please check out the other posts, especially the ones from the beginning of the blog, where I detailed the classes a little more. For displaying I used the MemoryBitmap class, which I took from the Ising model project and changed a little to fit the current one. If you want to find out more about the project than the actual Lattice Boltzmann code, you might want to look first into CMainFrame::OnFileOpen(), where the image file that contains the obstacles is loaded, then into the document methods starting with CLatticeBoltzmannDoc::SetImageAndStartComputing and the others at the end of the cpp file implementing the document. The drawing is done by the view, which contains a timer to refresh the image. The most important method is CLatticeBoltzmannView::OnDraw. I’ll leave you to figure out the options and their UI.
LatticeBoltzmann namespace
The code related to the post subject is in the LatticeBoltzmann namespace. There are only two classes, Cell and Lattice, and by the names I guess you can already figure out their purpose. The Cell class is small enough to be listed here entirely. First, the header, leaving out the namespace to have fewer lines:
class Cell
{
public:
    Cell();
    ~Cell();

    static std::array<int, 9> ex;
    static std::array<int, 9> ey;
    static std::array<double, 9> coeff;

    std::array<double, 9> density;

    enum Direction
    {
        none = 0,
        N,
        NE,
        E,
        SE,
        S,
        SW,
        W,
        NW
    };

    void Init();

    inline static std::pair<int, int> GetNextPosition(Direction direction, int x, int y)
    {
        return std::make_pair<int, int>(x + ex[direction], y + ey[direction]);
    }

    inline static Direction Reverse(Direction dir)
    {
        switch (dir)
        {
        case Direction::N: return Direction::S;
        case Direction::S: return Direction::N;
        case Direction::W: return Direction::E;
        case Direction::E: return Direction::W;
        case Direction::NE: return Direction::SW;
        case Direction::SE: return Direction::NW;
        case Direction::NW: return Direction::SE;
        case Direction::SW: return Direction::NE;
        }

        return Direction::none;
    }

    inline static Direction ReflectVert(Direction dir)
    {
        switch (dir)
        {
        case Direction::N: return Direction::S;
        case Direction::S: return Direction::N;
        case Direction::W: return Direction::W;
        case Direction::E: return Direction::E;
        case Direction::NE: return Direction::SE;
        case Direction::SE: return Direction::NE;
        case Direction::NW: return Direction::SW;
        case Direction::SW: return Direction::NW;
        }

        return Direction::none;
    }

    inline double Density() const
    {
        double tDensity = 0;
        for (int i = 0; i < 9; ++i)
            tDensity += density[i];

        return tDensity;
    }

    inline std::pair<double, double> Velocity() const
    {
        double tDensity = 0;
        double vx = 0;
        double vy = 0;

        for (int i = 0; i < 9; ++i)
        {
            tDensity += density[i];
            vx += ex[i] * density[i];
            vy += ey[i] * density[i];
        }

        if (tDensity < 1E-14) return std::make_pair<double, double>(0, 0);

        return std::make_pair<double, double>(vx / tDensity, vy / tDensity);
    }

    // this can be optimized, I won't do that to have the code easy to understand
    // accelX, accelY are here to let you add a 'force' (as for example gravity, or some force to move the fluid at an inlet)
    inline std::array<double, 9> Equilibrium(double accelXtau, double accelYtau) const
    {
        std::array<double, 9> result;

        double totalDensity = density[0];
        double vx = ex[0] * density[0];
        double vy = ey[0] * density[0];

        for (int i = 1; i < 9; ++i)
        {
            totalDensity += density[i];
            vx += ex[i] * density[i];
            vy += ey[i] * density[i];
        }

        vx /= totalDensity;
        vy /= totalDensity;

        vx += accelXtau;
        vy += accelYtau;

        const double v2 = vx * vx + vy * vy;

        static const double coeff1 = 3.;
        static const double coeff2 = 9. / 2.;
        static const double coeff3 = -3. / 2.;

        for (int i = 0; i < 9; ++i)
        {
            const double term = ex[i] * vx + ey[i] * vy;
            result[i] = coeff[i] * totalDensity * (1. + coeff1 * term + coeff2 * term * term + coeff3 * v2);
        }

        return result;
    }

    inline void Collision(double accelXtau, double accelYtau, double tau)
    {
        const std::array<double, 9> equilibriumDistribution = Equilibrium(accelXtau, accelYtau);

        for (int i = 0; i < 9; ++i)
            density[i] -= (density[i] - equilibriumDistribution[i]) / tau;
    }
};
Then, the cpp file, which is very simple:
#include "Cell.h"

namespace LatticeBoltzmann {

    const double c0 = 4. / 9.;
    const double c1 = 1. / 9;
    const double c2 = 1. / 36.;

    // 0, N, NE, E, SE, S, SW, W, NW
    std::array<int, 9> Cell::ex = std::array<int, 9>{ {0, 0, 1, 1, 1, 0, -1, -1, -1} };
    std::array<int, 9> Cell::ey = std::array<int, 9>{ {0, 1, 1, 0, -1, -1, -1, 0, 1} };
    std::array<double, 9> Cell::coeff = std::array<double, 9>{ { c0, c1, c2, c1, c2, c1, c2, c1, c2 } };

    Cell::Cell()
    {
        for (int i = 0; i < 9; ++i)
            density[i] = 0;
    }

    Cell::~Cell()
    {
    }

    void Cell::Init()
    {
        for (int i = 0; i < 9; ++i)
            density[i] = coeff[i];
    }

}
The code should be self-explanatory. The most important methods are Collision and Equilibrium. I hope you already spotted the collision term mentioned above. For the equilibrium distribution implementation details you might want to look into the linked papers. Density and Velocity are used for getting results. The same values are already calculated in Equilibrium, but I think the code is cleaner as it is, and the results are not computed each simulation step anyway. The code could be optimized, but again, in order to keep it clear I prefer not to. More about optimizations later. Reverse and ReflectVert are used for the ‘bounce back’ (that is, zero flow speed at the boundary) and ‘slippery’ (that is, no friction at the boundary, just reflection) implementations, respectively.
The Lattice class is a little more complex and I will leave several methods out of the presentation; you should check out the GitHub repository^{1} for the full implementation.
Here is the Simulate method, which runs in a different thread than the UI one, to avoid locking the UI:
void Lattice::Simulate()
{
    Init();

    CellLattice latticeWork = CellLattice(lattice.rows(), lattice.cols());

    std::vector<std::thread> theThreads(numThreads);

    processed = 0;
    wakeup.resize(numThreads);
    for (unsigned int i = 0; i < numThreads; ++i)
        wakeup[i] = false;

    int workStride = (int)lattice.cols() / numThreads;
    for (int t = 0, strideStart = 0; t < (int)numThreads; ++t)
    {
        int endStride = strideStart + workStride;
        theThreads[t] = std::thread(&Lattice::CollideAndStream, this, t, &latticeWork, strideStart, t == numThreads - 1 ? (int)lattice.cols() : endStride);
        strideStart = endStride;
    }

    for (unsigned int step = 0; ; ++step)
    {
        WakeUp();
        WaitForData();
        if (!simulate) break;

        lattice.swap(latticeWork);

        // compute values to display, here I also use an arbitrary 'warmup' interval where results are not calculated
        if (step > 2000 && step % refreshSteps == 0)
            GetResults();
    }

    WakeUp();

    for (unsigned int t = 0; t < numThreads; ++t)
        if (theThreads[t].joinable()) theThreads[t].join();
}
Since the Lattice Boltzmann methods can be very easily parallelized – more about that, later – I tried to have a little benefit from that, so the simulation domain is split into ‘strides’ to be passed to different threads that do the collision and streaming.
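The splitting itself can be looked at in isolation. Here is a small sketch of the same arithmetic that Simulate uses, wrapped in a hypothetical helper named SplitColumns (the project computes the ranges inline): every thread gets cols / numThreads columns and the last one also takes the remainder.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper, not part of the project: returns the half-open column
// range [start, end) handled by each worker thread.
std::vector<std::pair<int, int>> SplitColumns(int cols, unsigned int numThreads)
{
    std::vector<std::pair<int, int>> ranges;
    if (numThreads == 0) return ranges;

    const int workStride = cols / static_cast<int>(numThreads);
    int strideStart = 0;
    for (unsigned int t = 0; t < numThreads; ++t)
    {
        // the last thread takes whatever is left, including the remainder
        const int strideEnd = (t == numThreads - 1) ? cols : strideStart + workStride;
        ranges.emplace_back(strideStart, strideEnd);
        strideStart = strideEnd;
    }
    return ranges;
}
```

For 10 columns and 3 threads this gives the ranges [0, 3), [3, 6) and [6, 10), so the last thread does slightly more work when the width is not divisible by the thread count.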
Here is the method that does those computations:
void Lattice::CollideAndStream(int tid, CellLattice* latticeW, int startCol, int endCol)
{
    CellLattice& latticeWork = *latticeW;

    // stream (including bounce back) and collision combined
    int LatticeRows = (int)lattice.rows();
    int LatticeRowsMinusOne = LatticeRows - 1;
    int LatticeCols = (int)lattice.cols();
    int LatticeColsMinusOne = LatticeCols - 1;

    double accelXtau = accelX * tau;
    double accelYtau = accelY * tau;

    for (;;)
    {
        WaitForWork(tid);
        if (!simulate)
        {
            SignalMoreData();
            break;
        }

        for (int y = 0; y < LatticeRows; ++y)
        {
            int LatticeRowsMinuOneMinusRow = LatticeRowsMinusOne - y;
            bool ShouldCollide = (Periodic == boundaryConditions || (0 != y && y != LatticeRowsMinusOne));

            for (int x = startCol; x < endCol; ++x)
            {
                // collision
                if (!latticeObstacles(y, x) && ShouldCollide && (useAccelX || (x > 0 && x < LatticeColsMinusOne)))
                    lattice(y, x).Collision(x == 0 && useAccelX ? accelXtau : 0, accelYtau, tau);

                // stream
                // as a note, this is highly inefficient
                // for example
                // checking nine times for each cell for a boundary condition that is fixed before running the simulation
                // is overkill
                // this could be solved by moving the ifs outside the for loops
                // it could be for example solved with templates with the proper class instantiation depending on the settings
                // I did not want to complicate the code so much so for now I'll have it this way even if it's not efficient
                // hopefully the compiler is able to do some optimizations :)
                for (int dir = 0; dir < 9; ++dir)
                {
                    Cell::Direction direction = Cell::Direction(dir);
                    auto pos = Cell::GetNextPosition(direction, x, LatticeRowsMinuOneMinusRow);
                    pos.second = LatticeRowsMinusOne - pos.second;

                    // ***********************************************************
                    // left & right
                    if (useAccelX) //periodic boundary with usage of an accelerating force
                    {
                        if (pos.first < 0) pos.first = LatticeColsMinusOne;
                        else if (pos.first >= LatticeCols) pos.first = 0;
                    }
                    else
                    {
                        // bounce them back
                        if ((pos.first == 0 || pos.first == LatticeColsMinusOne) && !(pos.second == 0 || pos.second == LatticeRowsMinusOne))
                            direction = Cell::Reverse(direction);
                    }

                    // ***********************************************************
                    // top & bottom, depends on boundaryConditions
                    if (Periodic == boundaryConditions)
                    {
                        if (pos.second < 0) pos.second = LatticeRowsMinusOne;
                        else if (pos.second >= LatticeRows) pos.second = 0;
                    }
                    else if (pos.second == 0 || pos.second == LatticeRowsMinusOne)
                    {
                        if (BounceBack == boundaryConditions) direction = Cell::Reverse(direction);
                        else direction = Cell::ReflectVert(direction);
                    }

                    // ***********************************************************
                    // bounce back for regular obstacles
                    if (latticeObstacles(pos.second, pos.first))
                        direction = Cell::Reverse(direction);

                    // x, y = old position, pos = new position, dir - original direction, direction - new direction
                    if (pos.first >= 0 && pos.first < LatticeCols && pos.second >= 0 && pos.second < LatticeRows)
                        latticeWork(pos.second, pos.first).density[direction] = lattice(y, x).density[dir];
                }
            }
        }

        DealWithInletOutlet(latticeWork, startCol, endCol, LatticeRows, LatticeCols, LatticeRowsMinusOne, LatticeColsMinusOne);

        SignalMoreData();
    }
}
Very shortly, each thread deals with its patch: for each cell it collides the ‘particles’, evolving them towards equilibrium, then they are streamed out into neighboring cells. The program uses another matrix to stream into (this could be also optimized). At the end the matrices are swapped.
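Stripped of obstacles, inlet/outlet handling and threading, the streaming step with a second buffer looks roughly like the sketch below. Everything in it is illustrative and not taken from the project: Grid, StreamPeriodic and Step are hypothetical names, the boundaries are periodic in both directions, and the row index is assumed to grow in the direction of positive ey (the project flips rows when it computes positions).

```cpp
#include <array>
#include <cstddef>
#include <utility>
#include <vector>

using Populations = std::array<double, 9>;

// Same direction ordering as in Cell: 0, N, NE, E, SE, S, SW, W, NW
const std::array<int, 9> kDx{ {0, 0, 1, 1, 1, 0, -1, -1, -1} };
const std::array<int, 9> kDy{ {0, 1, 1, 0, -1, -1, -1, 0, 1} };

// Hypothetical minimal lattice with row-major storage, all populations zero initially.
struct Grid
{
    int rows;
    int cols;
    std::vector<Populations> cells;

    Grid(int r, int c) : rows(r), cols(c), cells(static_cast<std::size_t>(r) * c, Populations{}) {}

    Populations& at(int y, int x) { return cells[static_cast<std::size_t>(y) * cols + x]; }
    const Populations& at(int y, int x) const { return cells[static_cast<std::size_t>(y) * cols + x]; }
};

// Moves every population one cell along its direction, wrapping around the edges.
void StreamPeriodic(const Grid& src, Grid& dst)
{
    for (int y = 0; y < src.rows; ++y)
        for (int x = 0; x < src.cols; ++x)
            for (int dir = 0; dir < 9; ++dir)
            {
                const int nx = (x + kDx[dir] + src.cols) % src.cols;
                const int ny = (y + kDy[dir] + src.rows) % src.rows;
                dst.at(ny, nx)[dir] = src.at(y, x)[dir];
            }
}

// One streaming step: write into the work buffer, then swap the two buffers.
void Step(Grid& lattice, Grid& work)
{
    StreamPeriodic(lattice, work);
    std::swap(lattice, work);
}
```

Writing into a separate buffer means no population is overwritten before it has been read, at the cost of twice the memory.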
I’ll let you look into the synchronizing methods yourself. The code is more complex than it could be because of the different boundary conditions I implemented.
DealWithInletOutlet is inspired by this article: On pressure and velocity flow boundary conditions and bounce back for the lattice Boltzmann BGK model^{7}. Maybe it could benefit from a better treatment at the corners, but I added it quite fast at the end and at that moment I was quite bored with it, so I left it as it is.
For the case when periodic boundary conditions are used for the inlet and outlet sides together with an inlet acceleration, the acceleration is applied in the Equilibrium method of the Cell. This could also be optimized.
As an implementation detail, I used Eigen^{8} for the lattice. It could be easily implemented in some other way but since it was already available, I used it. Many projects for this blog also use it.
While I developed the project I made some videos and here they are, first Density and Speed:
Then, since I consider it quite important, I added vorticity:
Both are done with periodic boundary conditions for inlet/outlet and inlet acceleration, so the turbulent flow can exit from one side to enter through the other. Note that one could specify values in the settings that take the Lattice Boltzmann method outside its range of validity, so numerical errors can kick in quite hard. I did not add any checks so you’ll have to be careful with the settings.
This project is far from being perfect, I had to stop somewhere and besides, since one of the purposes is to be easy to understand, I avoided some complexities that would arise from optimization or more fancy things like multi-phase flow. Here are some things you could do using this project as a starting idea:
latticeWork – one could get away with using less ‘work’ memory: in 2D a vector, in 3D only a plane instead of the whole volume.
In many cases the needed simulation contains a lot of ‘full’ zones, that is, obstacles, for example for flow simulation in porous media. In such a case it would be wasteful to have the overhead of a Cell in so many places where the fluid does not actually flow. One could use a full lattice that only stores pointers to Cell objects, with nullptr for the obstacles. If you have a multi-phase flow, you save even more memory, especially in 3D.
Move the ifs out of loops. There are quite a few places where this could be done. I’ll let you look into it.
That’s about it. As usual, please point out any bugs you find. Suggestions are also welcome.
I already mentioned Matrix Product States in the Density Matrix Renormalization Group post. I thought it would be a nice topic to have on this blog, but since I already implemented DMRG, I’ll have something a little different related with the subject. With the information presented here (mostly in the links) you could go back to the DMRG post and either re-implement DMRG from scratch using matrix product states or at least add tDMRG – which I just mentioned, with some links – to it.
So, the subject is Time-Evolving Block Decimation. I already implemented an iTEBD program (i comes from ‘infinite’) for the Heisenberg model, spin 1/2. I guess it should not be hard to extend it for spin-1. The GitHub repository is here^{1}. I might implement the finite chain in the future, it shouldn’t be much more difficult, but for now the infinite chain should do.
Here, as usual, you can see the program in action:
This is another huge topic, especially if you want to look into more than TEBD, so obviously I cannot describe it in detail here, but I’ll provide enough links to expose the theory.
It was realized that both the Numerical Renormalization Group and the Density Matrix Renormalization Group construct matrix product states; you might want to visit the topics on this blog about them. After understanding how those methods work, together with the information provided here, I’m sure you’ll figure out how those states arise.
Here is an article about DMRG that might also help: A class of ansatz wave functions for 1D spin systems and their relation to DMRG^{2}.
On the Time Evolving Block Decimation topic, there are three papers you should check out, all of them by the same author, Guifre Vidal.
The first one is Efficient classical simulation of slightly entangled quantum computations^{3}. It’s a treatment from the Quantum Computation point of view. It was followed by Efficient simulation of one-dimensional quantum many-body systems^{4}. Then he published another one showing how to apply the method on infinite chains by taking advantage of the translation symmetry: Classical simulation of infinite-size quantum lattice systems in one spatial dimension^{5}.
First, two reviews: Matrix Product States, Projected Entangled Pair States, and variational renormalization group methods for quantum spin systems by F. Verstraete, J.I. Cirac and V. Murg^{6} and the already mentioned one in the DMRG entry The density-matrix renormalization group in the age of matrix product states by Ulrich Schollwock^{7}.
The reviews are quite long and you might want to check some shorter and easier papers, so here are some: DMRG: Ground States, Time Evolution, and Spectral Functions by Ulrich Schollwock^{8}, A Practical Introduction to Tensor Networks: Matrix Product States and Projected Entangled Pair States by Roman Orus^{9}, Matrix Product State Representations by D. Perez-Garcia, F. Verstraete, M.M. Wolf, J.I. Cirac^{10}, Efficient Numerical Simulations Using Matrix-Product States by Frank Pollmann^{11}, Finite automata for caching in matrix product algorithms by Gregory M. Crosswhite and Dave Bacon^{12}.
Here is a diploma paper: Numerical Time Evolution of 1D Quantum Systems by Elias Rabel^{13} and a PhD paper: Tensor network states for the description of quantum many-body systems by Thorsten Bernd Wahl^{14}. You can find plenty more with a search.
You might find a lot of projects related with the subject, I’ll point to one which also has nice documentation: ITensor^{15}. I think I already mentioned it in the DMRG post, pointing to this: Fermions and Jordan-Wigner String.
Here is a nice introductory lecture, more general, about Matrix Product States, DMRG and the more general Tensor Networks:
First, I’ll limit it to 1D systems. Although you will find information for usage in higher dimensions, about tensor product states, PEPS, MERA and so on, I won’t go into those. In higher dimensions you’ll have more indices and computations will complicate and the area law won’t help as much as in the 1D systems, but you’ll find info on it in the links provided, I won’t detail it much.
Second, I’ll limit the discussion to open boundary conditions. For periodic boundary conditions you’ll see an additional trace in the formulae (unlike for open boundary conditions, the joined sites at the ends have associated full matrices, as opposed to vectors in the case of OBC) and in the diagram representations there is a ‘loop over’ joining the sites at the ends of the chain. It’s not much more difficult, but there is more to describe, so I’ll leave that out.
Third, one could have a chain where sites are not identical, or a chain where sites have structure (as in one spin 1, one 1/2 in a single site, for example, or an entire ‘molecule’). I’ll consider only simple sites, currently only spin 1/2, maybe I’ll add spin 1 in the future.
I’ll also avoid using symmetries for speeding up computation, because that would complicate the program. The goal is to have something easy to understand, so for now symmetries are out, too.
For time evolution I will use only the first order approximation, for higher orders please check out the documentation.
At the moment I write this, I have only the infinite case implemented. I might implement the finite chain in the future, but this post will treat only the infinite chain case.
Since I limited the discussion to the simple 1D chain, imagine a chain formed by N sites with spin. A state of such a chain is in the space obtained as a tensor product of local spaces (one for each individual spin). The state can be decomposed as:
$$|\psi\rangle = \sum_{i_1, i_2, \ldots, i_N} C_{i_1 i_2 \ldots i_N} |i_1\rangle \otimes |i_2\rangle \otimes \cdots \otimes |i_N\rangle$$
where each index runs over the local basis ($i_1$ over the basis of site 1 and so on). C is a nasty object (let’s call it a tensor from now on) which has $d^N$ elements, d being the dimension of the local space. The exponential increase of the state space makes this very hard to deal with.
The idea would be to somehow find a simplification. The easiest one is to consider:
$$C_{i_1 i_2 \ldots i_N} = C^{1}_{i_1} C^{2}_{i_2} \cdots C^{N}_{i_N}$$
That is, a simple product state. This is the approximation used in mean field theory: the sites do not interact (directly) with the other sites, only with the mean field. The complexity drops from $d^N$ to $Nd$, or even d if there is translational symmetry.
Quite nice, but this approximation does not work in general. So let’s find something better.
The idea is to separate out the chain in two and apply Singular Value Decomposition on it. Please visit the DMRG post for details, for DMRG, SVD is also very important.
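I’ll start with the leftmost site in the chain. You could start with the rightmost site, or you could split the chain in two (you can call one part ‘system’ and one ‘environment’, to see the link with DMRG). Keeping the singular values separately will lead you to the Vidal decomposition; I’ll simply multiply S into $V^\dagger$ to keep things simple. So, by grouping all indices but the first one into a single index we get a matrix with elements $C_{i_1, (i_2 \ldots i_N)}$ on which one can apply singular value decomposition:
$$C_{i_1, (i_2 \ldots i_N)} = \sum_{v_1} U_{i_1 v_1} S_{v_1 v_1} V^\dagger_{v_1, (i_2 \ldots i_N)}$$
The S matrix can be multiplied with the $V^\dagger$ one and the decomposition for a tensor element becomes:
$$C_{i_1 i_2 \ldots i_N} = \sum_{v_1} U_{i_1 v_1} \tilde{C}_{v_1 i_2 \ldots i_N}$$
Let’s add a dummy index $v_0$ (taking a single value) for U, too; its role will be obvious soon:
$$C_{i_1 i_2 \ldots i_N} = \sum_{v_1} U^{i_1}_{v_0 v_1} \tilde{C}_{v_1 i_2 \ldots i_N}$$
Now let’s perform SVD again for $\tilde{C}$, grouping the first two indices together into one and the remaining ones into the other to make a matrix (the procedure is called reshaping, by the way):
$$\tilde{C}_{(v_1 i_2), (i_3 \ldots i_N)} = \sum_{v_2} U_{(v_1 i_2) v_2} S_{v_2 v_2} V^\dagger_{v_2, (i_3 \ldots i_N)} = \sum_{v_2} U_{(v_1 i_2) v_2} \tilde{\tilde{C}}_{v_2 i_3 \ldots i_N}$$
By arranging indices, inserting this in the previous one and removing one tilde sign (so $\tilde{\tilde{C}}$ is now renamed $\tilde{C}$):
$$C_{i_1 i_2 \ldots i_N} = \sum_{v_1, v_2} U^{i_1}_{v_0 v_1} U^{i_2}_{v_1 v_2} \tilde{C}_{v_2 i_3 \ldots i_N}$$
By repeating the procedure one finally gets (the last factor being the remaining $\tilde{C}$, written as a matrix with a dummy index $v_N$):
$$C_{i_1 i_2 \ldots i_N} = \sum_{v_1, \ldots, v_{N-1}} U^{i_1}_{v_0 v_1} U^{i_2}_{v_1 v_2} \cdots U^{i_N}_{v_{N-1} v_N}$$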
What we have now is just a product of matrices. By considering the dimensionality, which I did not discuss and I advise you to look into the references for details, it’s not immediately obvious how this could help.
But we performed SVD as in DMRG and with the help of the area law we can truncate those matrices. Please check out the links for details and maybe also the DMRG post.
This way the computations will simplify tremendously.
The matrix decomposition is not unique: one can insert $X X^{-1}$ between two neighboring matrices to change the decomposition. Some decompositions are more useful than others, so the left, right and mixed canonical forms are used.
To summarize, we ended up with a set of d matrices for each site, d being the dimensionality of the local space, that is, for a 1/2 spin site d=2 and we have two matrices, one for spin up and one for spin down for each site.
The indices I noted with ‘v’ are called virtual indices (or valence indexes), the ones noted with ‘i’, physical indices, they select the matrix, in our 1/2 spin case, the spin down or spin up matrix.
This is only a brief presentation of the matrix product decomposition. If you are not already familiar with it, you should also check the valence bond picture and its usage in the context of DMRG.
There is also a very useful graphical representation which is used a lot, one should also check it out. In that representation, for a site (which is not at an end), there is a circle with three legs sticking out: two horizontal ones corresponding to virtual indices and a vertical one corresponding to the physical one. Connected legs means summing over.
To get a feeling of what an MPS looks like, I thought it might be useful to mention some trivial cases. First, for a case when there is a simple product state, let’s say all spins up: in this case the matrices have the dimension 1×1 and all matrices for spin down are 0, all for spin up are one, so the two matrices for a site are 0 and 1. For a case where the state cannot be written as a simple product state, let’s consider the superposition between all spins up and all spins down:
$$|\psi\rangle = \frac{1}{\sqrt{2}} \left( |\uparrow \uparrow \ldots \uparrow\rangle + |\downarrow \downarrow \ldots \downarrow\rangle \right)$$
In this case for each site there is a matrix for spin up, one for spin down, which have the form (up to a normalization factor):
$$\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$$
and
$$\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$$
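To see how such site matrices encode the state, here is a small sketch that evaluates one amplitude as a product of matrices. It assumes the usual diagonal choice for this state (spin up has a single 1 in the top-left corner, spin down a single 1 in the bottom-right corner) and contracts the ends of the chain with boundary vectors (1, 1); those boundary vectors, the missing normalization and the names SiteMatrix and Amplitude are all assumptions made for the illustration.

```cpp
#include <array>
#include <vector>

using Matrix2 = std::array<std::array<double, 2>, 2>;

// Site matrix for a given physical index: 0 = spin up, 1 = spin down.
// Assumed form: a single 1 on the diagonal, zeros elsewhere.
Matrix2 SiteMatrix(int spin)
{
    Matrix2 m{}; // all zeros
    m[spin][spin] = 1.;
    return m;
}

// Unnormalized amplitude of one spin configuration: the left boundary row
// vector (1, 1) is multiplied by one matrix per site, then contracted with
// the right boundary column vector (1, 1).
double Amplitude(const std::vector<int>& spins)
{
    std::array<double, 2> v{ {1., 1.} };
    for (int spin : spins)
    {
        const Matrix2 a = SiteMatrix(spin);
        std::array<double, 2> next{ {0., 0.} };
        for (int col = 0; col < 2; ++col)
            for (int row = 0; row < 2; ++row)
                next[col] += v[row] * a[row][col];
        v = next;
    }
    return v[0] + v[1];
}
```

Only the all-up and all-down configurations give a non-zero amplitude: as soon as the configuration contains both an up and a down spin, the product of the two different diagonal matrices vanishes.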
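An operator is more complex than a state:
$$\hat{O} = \sum_{i_1 \ldots i_N} \sum_{j_1 \ldots j_N} O_{i_1 \ldots i_N j_1 \ldots j_N} |i_1 \ldots i_N\rangle \langle j_1 \ldots j_N|$$
We have an uglier tensor but indices can be arranged by grouping them together:
$$O_{i_1 \ldots i_N j_1 \ldots j_N} \rightarrow O_{(i_1 j_1) (i_2 j_2) \ldots (i_N j_N)}$$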
By considering each pair (ij) as a single index one can follow the same procedure as before to decompose it into a matrix product. The difference is that one ends up with two physical indices instead of one, so the graphical representation has two vertical and two horizontal legs/site. For the simplest case when the operator acts on a single site, for all the other sites the operator matrices are trivial.
It’s beyond the scope of this post to present in detail everything, for the details please check at least DMRG: Ground States, Time Evolution, and Spectral Functions^{8}.
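Let’s assume the Hamiltonian has only nearest neighbor interactions and on-site terms, in the form:
$$H = \sum_j h^{(1)}_j + \sum_j h^{(2)}_{j, j+1}$$
The time evolution operator is (with $\hbar = 1$):
$$U(t) = e^{-iHt}$$
The time can be divided into n small intervals $\delta = t/n$, to have:
$$U(t) = \left( e^{-iH\delta} \right)^n$$
The Hamiltonian can be written as a sum of even and odd terms, with the on-site terms absorbed into the two-site terms $h_{j, j+1}$:
$$H = H_{even} + H_{odd}, \quad H_{even} = \sum_{even\ j} h_{j, j+1}, \quad H_{odd} = \sum_{odd\ j} h_{j, j+1}$$
The reason is that the even terms commute among themselves, and the odd terms also commute among themselves, although an even term and an odd term do not necessarily commute. So separately for the even and odd parts:
$$e^{-iH_{even}\delta} = \prod_{even\ j} e^{-i h_{j, j+1} \delta}$$
And similarly for the odd part of the Hamiltonian.
The first order approximation is:
$$e^{-iH\delta} = e^{-i(H_{even} + H_{odd})\delta} \approx e^{-iH_{odd}\delta} e^{-iH_{even}\delta}$$
The approximation is exact when n goes to infinity. It’s only an approximation because terms from the even part do not commute with terms from the odd part, by the way.
We have above all the ingredients to perform a time evolution. Putting it all together we have:
$$|\psi(t)\rangle \approx \left( e^{-iH_{odd}\delta} e^{-iH_{even}\delta} \right)^n |\psi(0)\rangle$$
This can be translated into words approximately as: apply all the even evolution operator terms on the state, then all the odd terms. Repeat the procedure n times to evolve the state to time t.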
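The effect of the first order splitting can be checked on the smallest possible toy problem, which has nothing to do with a spin chain except that it also has two non-commuting parts. The sketch below takes H = sigma_x + sigma_z for a single spin (an assumption chosen only because every exponential is known in closed form) and compares n split steps against the exact evolution; all names are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <complex>

using Complex = std::complex<double>;
using Mat2 = std::array<std::array<Complex, 2>, 2>;

Mat2 Multiply(const Mat2& a, const Mat2& b)
{
    Mat2 c{};
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            for (int k = 0; k < 2; ++k)
                c[i][j] += a[i][k] * b[k][j];
    return c;
}

// exp(-i * angle * sigma_x) = cos(angle) * 1 - i * sin(angle) * sigma_x
Mat2 ExpSigmaX(double angle)
{
    const Complex c(std::cos(angle), 0.);
    const Complex s(0., -std::sin(angle));
    Mat2 m{};
    m[0][0] = c; m[0][1] = s;
    m[1][0] = s; m[1][1] = c;
    return m;
}

// exp(-i * angle * sigma_z) = diag(exp(-i * angle), exp(+i * angle))
Mat2 ExpSigmaZ(double angle)
{
    Mat2 m{};
    m[0][0] = std::polar(1., -angle);
    m[1][1] = std::polar(1., angle);
    return m;
}

// Exact exp(-i * t * (sigma_x + sigma_z)); since (sigma_x + sigma_z)^2 = 2 * 1,
// it equals cos(r t) * 1 - i * sin(r t) / r * (sigma_x + sigma_z), with r = sqrt(2).
Mat2 ExactEvolution(double t)
{
    const double r = std::sqrt(2.);
    const Complex c(std::cos(r * t), 0.);
    const Complex s(0., -std::sin(r * t) / r);
    Mat2 m{};
    m[0][0] = c + s; m[0][1] = s;
    m[1][0] = s;     m[1][1] = c - s;
    return m;
}

// Largest element-wise deviation between n first order split steps and the exact result.
double TrotterError(double t, int n)
{
    const double delta = t / n;
    const Mat2 step = Multiply(ExpSigmaX(delta), ExpSigmaZ(delta));

    Mat2 u{};
    u[0][0] = 1.;
    u[1][1] = 1.;
    for (int k = 0; k < n; ++k)
        u = Multiply(step, u);

    const Mat2 exact = ExactEvolution(t);
    double err = 0;
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            err = std::max(err, std::abs(u[i][j] - exact[i][j]));
    return err;
}
```

For a fixed total time the error returned by TrotterError should shrink roughly in proportion to 1/n, which is the behavior expected from a first order splitting.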
One can perform an imaginary time evolution by absorbing i into time, transforming it into imaginary time $\tau = it$ (see: Wick rotation). The evolution operator becomes:
$$e^{-H\tau}$$
The assumption is that we have a ground state separated by an energy gap.
Let’s start with a random state, the only requirement being to have a non-zero component along the ground state vector. We can write the state using the Hamiltonian eigenstates:
$$|\psi\rangle = \sum_k c_k |k\rangle$$
with $|0\rangle$ being the ground state vector. The states $|k\rangle$ have the corresponding eigenvalues $E_k$, $E_0$ being the ground state energy. By applying the imaginary time evolution operator on the initial state for a sufficiently long time:
$$e^{-H\tau} |\psi\rangle = \sum_k c_k e^{-E_k \tau} |k\rangle = e^{-E_0 \tau} \left( c_0 |0\rangle + \sum_{k > 0} c_k e^{-(E_k - E_0)\tau} |k\rangle \right)$$
the ground state is amplified (for a negative ground state energy its coefficient gets bigger and bigger) while all the others are getting exponentially smaller in comparison.
The blowup of the ground state coefficient is compensated by normalizing the evolved state:
$$|0\rangle \approx \frac{e^{-H\tau} |\psi\rangle}{\left\| e^{-H\tau} |\psi\rangle \right\|}$$
The approximation becomes exact in the limit of $\tau$ going to infinity.
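The same idea can be tried on a two-level toy system before applying it to a chain. The sketch below assumes H = sigma_x, chosen only because exp(-H dtau) = cosh(dtau) * 1 - sinh(dtau) * sigma_x is known exactly; its ground state is (1, -1)/sqrt(2) with energy -1. The function names are hypothetical.

```cpp
#include <array>
#include <cmath>

using Vec2 = std::array<double, 2>;

// One imaginary time step for H = sigma_x, followed by normalization.
Vec2 ImaginaryTimeStep(const Vec2& v, double dtau)
{
    const double c = std::cosh(dtau);
    const double s = std::sinh(dtau);
    Vec2 r{ {c * v[0] - s * v[1], c * v[1] - s * v[0]} };

    // normalizing each step compensates the growth of the ground state coefficient
    const double norm = std::sqrt(r[0] * r[0] + r[1] * r[1]);
    r[0] /= norm;
    r[1] /= norm;
    return r;
}

// Expectation value of sigma_x in a normalized real state.
double Energy(const Vec2& v)
{
    return 2. * v[0] * v[1];
}

Vec2 RelaxToGroundState(Vec2 v, double dtau, int steps)
{
    for (int i = 0; i < steps; ++i)
        v = ImaginaryTimeStep(v, dtau);
    return v;
}
```

Starting from (1, 0), which has equal weight on the two eigenstates, the excited component is suppressed by a factor exp(-2 * dtau) in each step (2 is the gap here), so the energy converges quickly to -1.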
Here is another imaginary time evolution presentation for the Schrödinger equation in a simpler context: Numerical solution of PDE:s, Part 5: Schrödinger equation in imaginary time^{16}
For the details I’ll let you look in the Vidal iTEBD paper, I’ll present only the main idea:
The infinite chain has translation symmetry, so instead of taking into account all sites one can get away with considering only the sites i and i+1; the chain looks the same everywhere, anyway. First the even evolution term is applied to those, then the i site is moved to the i+2 position and the odd evolution term is applied on sites i+1 and i+2 (equivalently, the sites are swapped and the odd term is applied on sites i and i+1). Then the process is repeated. This will become much clearer in the code presentation below, and also if you take a look at the iTEBD paper^{5}.
The program is a quite standard mfc application, very similar with other applications I presented on this blog, so I won’t bother to describe it in detail. You have the standard doc/view and the application object. I got from the other projects the number edit control and the chart class, as usual there is an options object in the application that allows saving/loading into/from registry, there is the options property sheet and pages and a computation thread which lets the application be responsive while doing the calculations.
Everything important for the topic is in the TEBD namespace.
The operators are implemented in the TEBD::Operators namespace. I simply copied them from the DMRG project (so you’ll see some remains from there, and also from the NRG project, despite not being used in the current project) and changed and extended them to fit this project. The operator classes are just wrappers around Eigen^{17} matrices with some functionality added. The matrices are initialized to the proper value in constructors.
The currently used Hamiltonian is:
class Heisenberg : public Hamiltonian<double>
{
public:
    Heisenberg(double Jx, double Jy, double Jz, double Bx = 0, double Bz = 0);
};
with its constructor implementation being:
inline Heisenberg::Heisenberg(double Jx, double Jy, double Jz, double Bx, double Bz)
    : Hamiltonian<double>(2)
{
    Operators::SxOneHalf<double> sx;
    Operators::SyOneHalf<double> sy;
    Operators::SzOneHalf<double> sz;

    matrix = - (Jx * Operators::Operator<double>::KroneckerProduct(sx.matrix, sx.matrix)
              - Jy * Operators::Operator<double>::KroneckerProduct(sy.matrix, sy.matrix)
              + Jz * Operators::Operator<double>::KroneckerProduct(sz.matrix, sz.matrix)
              + Bx/2 * (Operators::Operator<double>::IdentityKronecker(2, sx.matrix) + Operators::Operator<double>::KroneckerProductWithIdentity(sx.matrix, 2))
              + Bz/2 * (Operators::Operator<double>::IdentityKronecker(2, sz.matrix) + Operators::Operator<double>::KroneckerProductWithIdentity(sz.matrix, 2)));
}
I guess the code should be explicit enough. It’s just the Hamiltonian for the two sites from iTEBD. This could also be used for the finite length chain, the algorithm is very similar, just instead of two sites which are ‘swapped’ each step, the finite algorithm walks over all sites (first over all those having ‘odd’ interactions, then over those having ‘even’ interactions).
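For readers who want to play with the two-site Hamiltonian outside the C++ project, here is a minimal Python sketch of the same kind of construction. It is only an illustration under assumptions: it uses the standard complex spin-1/2 matrices, so the three exchange terms all enter with the same sign, while the C++ code above works with real-valued operator classes and has a different sign on the Jy term (presumably tied to how Sy is stored there; check the project source). The names kron and heisenberg_two_sites are made up for this sketch.

```python
# Spin-1/2 operators (hbar = 1) in the standard complex representation.
# This is an assumption of the sketch; the C++ project stores real matrices.
SX = [[0, 0.5], [0.5, 0]]
SY = [[0, -0.5j], [0.5j, 0]]
SZ = [[0.5, 0], [0, -0.5]]
ID = [[1, 0], [0, 1]]


def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]


def heisenberg_two_sites(jx, jy, jz, bx=0.0, bz=0.0):
    """H = -(Jx SxSx + Jy SySy + Jz SzSz + field terms) on two sites."""
    terms = [
        (jx, kron(SX, SX)),
        (jy, kron(SY, SY)),
        (jz, kron(SZ, SZ)),
        # half of the field on each site of the bond, as in the C++ code
        (bx / 2, kron(ID, SX)), (bx / 2, kron(SX, ID)),
        (bz / 2, kron(ID, SZ)), (bz / 2, kron(SZ, ID)),
    ]
    h = [[0j] * 4 for _ in range(4)]
    for coeff, mat in terms:
        for i in range(4):
            for j in range(4):
                h[i][j] -= coeff * mat[i][j]
    return h
```

With Jx = Jy = Jz = 1 and no field this is minus the dot product of the two spins, so the triplet states sit at -1/4 and the singlet at 3/4, which is an easy sanity check for the matrix elements.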
The rest of the operators code was already described in other posts of this blog and is straightforward. Something which is new and is worth mentioning is the exponentiation of an operator, implemented in the DiagonalizableOperator class:
inline Eigen::MatrixXcd ComplexExponentiate(double tau)
{
    Diagonalize();

    const Operator<T>::OperatorMatrix& eigenV = eigenvectors();
    const Operator<T>::OperatorVector& eigenv = eigenvalues();

    Eigen::VectorXcd result = Eigen::VectorXcd(eigenv.size());

    for (int i = 0; i < eigenv.size(); ++i)
        result(i) = std::exp(std::complex<double>(0, -1) * tau * eigenv(i));

    return eigenV * result.asDiagonal() * eigenV.transpose();
}
There is another similar method which is used for the imaginary time variant.
The code is used to calculate $e^{-i H \tau}$, H being the Hamiltonian operator matrix.
The idea is to diagonalize the matrix, calculate the exponentials of the eigenvalues, then bring the matrix back to the original basis using the eigenvectors. More details are on the Wikipedia page.
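The same diagonalize, exponentiate, rotate back recipe can be tried on the smallest possible case. The sketch below is not the project code: it handles only a real symmetric 2x2 matrix, for which the eigenvalues and the rotation to the eigenbasis are known in closed form, and the function name complex_exponentiate_2x2 is made up for the illustration.

```python
import cmath
import math


def complex_exponentiate_2x2(a, b, c, tau):
    """Return exp(-i*tau*H) for the real symmetric matrix H = [[a, b], [b, c]]."""
    # Diagonalize: H = V diag(e1, e2) V^T, where V is a rotation by theta.
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    cs, sn = math.cos(theta), math.sin(theta)
    mean = 0.5 * (a + c)
    half_gap = math.hypot(0.5 * (a - c), b)
    e1, e2 = mean + half_gap, mean - half_gap
    # Exponentiate the eigenvalues only.
    p1 = cmath.exp(-1j * tau * e1)
    p2 = cmath.exp(-1j * tau * e2)
    # Rotate back to the original basis: U = V diag(p1, p2) V^T.
    u00 = cs * cs * p1 + sn * sn * p2
    u01 = cs * sn * (p1 - p2)
    u11 = sn * sn * p1 + cs * cs * p2
    return [[u00, u01], [u01, u11]]
```

For H equal to the Pauli x matrix (a = c = 0, b = 1) the result should be cos(tau) on the diagonal and -i sin(tau) off the diagonal, which makes a convenient check. Replacing the factor -1j with -1 gives the imaginary time variant.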
The matrix product state is stored in an iMPS object, declared as:
template<typename T, unsigned int D = 2>
class iMPS
{
public:
    iMPS(unsigned int chi = 10);
    virtual ~iMPS();

    virtual void InitRandomState();
    void InitNeel();

    Eigen::Tensor<T, 3> Gamma1;
    Eigen::Tensor<T, 3> Gamma2;

    typename Operators::Operator<double>::OperatorVector lambda1;
    typename Operators::Operator<double>::OperatorVector lambda2;
};
It’s not very complicated: it just stores the gamma tensors and lambda values for two sites and has methods to initialize the state to a random one or to a Neel state (translation symmetrized by averaging).
As a side note, having it implemented for the finite chain should not be hard at all: instead of only two sites, have values for all sites, in vectors. Maybe the lambda values could be multiplied into the site matrices, but I didn’t give that much thought.
By the way, the code uses the ‘unsupported’ tensor library included in Eigen. Despite being unsupported, I guess it’s good enough since it’s used in TensorFlow. I had some troubles with reshaping and I ended up implementing that myself (along with shuffling).
Most of the computation code is in the iTEBD implementation. Evolution, both in real and imaginary time, is implemented this way:
template<typename T, unsigned int D>
double iTEBD<T, D>::CalculateImaginaryTimeEvolution(Operators::Hamiltonian<double>& H, unsigned int steps, double delta)
{
    m_iMPS.InitRandomState();

    Operators::Operator<T>::OperatorMatrix Umatrix = GetImaginaryTimeEvolutionOperatorMatrix(H, delta);
    Eigen::Tensor<T, 4> U = GetEvolutionTensor(Umatrix);

    isRealTimeEvolution = false;
    Calculate(U, steps);

    return GetEnergy(delta, thetaMatrix);
}

template<typename T, unsigned int D>
void iTEBD<T, D>::CalculateRealTimeEvolution(Operators::Hamiltonian<double>& H, unsigned int steps, double delta)
{
    Eigen::MatrixXcd Umatrix = GetRealTimeEvolutionOperatorMatrix(H, delta);
    Eigen::Tensor<T, 4> U = GetEvolutionTensor(Umatrix);

    isRealTimeEvolution = true;
    Calculate(U, steps);
}
There is not much difference between evolution in imaginary time and real time: the difference is in the evolution operator and a flag that indicates what kind of evolution it is, and that’s about it.
inline static typename Operators::Operator<double>::OperatorMatrix GetImaginaryTimeEvolutionOperatorMatrix(Operators::Hamiltonian<double>& H, double deltat)
{
    return H.Exponentiate(deltat);
}

inline static Eigen::MatrixXcd GetRealTimeEvolutionOperatorMatrix(Operators::Hamiltonian<double>& H, double deltat)
{
    return H.ComplexExponentiate(deltat);
}
As expected, the imaginary and real time evolution operators are nothing more than what was already described above.
GetEvolutionTensor just converts the operator matrix into a tensor ready to be applied, I’ll let you look into the code for its implementation. It’s one place where reshaping and reshuffling are implemented by me instead of using the tensor library.
Now let’s see the main part:
template<typename T, unsigned int D>
void iTEBD<T, D>::Calculate(const Eigen::Tensor<T, 4> &U, unsigned int steps)
{
    for (unsigned int step = 0; step < steps; ++step)
    {
        bool odd = (1 == step % 2);

        Eigen::Tensor<T, 2> lambdaA(m_chi, m_chi);
        Eigen::Tensor<T, 2> lambdaB(m_chi, m_chi);

        lambdaA.setZero();
        lambdaB.setZero();

        Eigen::Tensor<T, 3> &gammaA = odd ? m_iMPS.Gamma2 : m_iMPS.Gamma1;
        Eigen::Tensor<T, 3> &gammaB = odd ? m_iMPS.Gamma1 : m_iMPS.Gamma2;

        for (unsigned int i = 0; i < m_chi; ++i)
        {
            lambdaA(i, i) = odd ? m_iMPS.lambda2(i) : m_iMPS.lambda1(i);
            lambdaB(i, i) = odd ? m_iMPS.lambda1(i) : m_iMPS.lambda2(i);
        }

        // construct theta
        // this does the tensor network contraction as in fig 3, (i)->(ii) from iTEBD Vidal paper
        Eigen::Tensor<T, 4> thetabar = ConstructTheta(lambdaA, lambdaB, gammaA, gammaB, U);

        // ***********************************************************************************************************

        // get it into a matrix for SVD - use JacobiSVD
        // the theta tensor is now decomposed using SVD (as in (ii)->(iii) in fig 3 in Vidal iTEBD paper) and then
        // the tensor network is rebuilt as in (iii)->(iv)->(v) from fig 3 Vidal 2008
        thetaMatrix = ReshapeTheta(thetabar);

        Eigen::JacobiSVD<Operators::Operator<T>::OperatorMatrix> SVD(thetaMatrix, Eigen::DecompositionOptions::ComputeFullU | Eigen::DecompositionOptions::ComputeFullV);

        Operators::Operator<T>::OperatorMatrix Umatrix = SVD.matrixU().block(0, 0, D * m_chi, m_chi);
        Operators::Operator<T>::OperatorMatrix Vmatrix = SVD.matrixV().block(0, 0, D * m_chi, m_chi).adjoint();

        Operators::Operator<double>::OperatorVector Svalues = SVD.singularValues();

        for (unsigned int i = 0; i < m_chi; ++i)
        {
            double val = Svalues(i);

            if (odd) m_iMPS.lambda2(i) = val;
            else m_iMPS.lambda1(i) = val;

            if (abs(lambdaB(i,i)) > 1E-10)
                lambdaB(i, i) = 1. / lambdaB(i, i);
            else
                lambdaB(i, i) = 0;
        }

        if (odd) m_iMPS.lambda2.normalize();
        else m_iMPS.lambda1.normalize();

        SetNewGammas(m_chi, lambdaB, Umatrix, Vmatrix, gammaA, gammaB);

        // now compute 'measurements'
        // this program uses a 'trick' by reusing the already contracted tensor network
        if (odd && isRealTimeEvolution && m_TwoSitesOperators.size() > 0)
            ComputeOperators(thetabar);
    }
}
It looks more complicated than it should, but it’s not a very big deal:
There is a for loop which performs a number of evolution steps (one could do better by changing the step size and the matrix sizes during the evolution, to get more accuracy over a longer time evolution, but I’ll refer you to the documentation for such subtleties). In the for loop, five things are happening:
* first, the sites are swapped if it’s an ‘odd’ step. The first portion of the code, down to the // construct theta comment, does that, along with getting the lambda values into tensors to make them easier to deal with in further computations.
* second, the lambda and gamma tensors are contracted together to obtain a ‘four legs’ tensor and then the evolution tensor is applied (contracted with the result). All of this happens in the ConstructTheta call
* third, a singular value decomposition is performed after the four legs tensor is reshaped into a matrix
* fourth, the result of the SVD is split back into the site matrices and lambda values. As a detail, the site tensor has three legs, the second one being the physical one (that is, that index selects the matrix). This is implemented after the call to singularValues, down to (and including) the call to SetNewGammas
* last, if there are ‘measurement’ operators to compute, compute them
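The ‘swap’ in the first item is only a relabeling of which stored tensor plays which role. A tiny Python sketch of that bookkeeping, mirroring the ternary expressions in Calculate (the function name roles_for_step and the dictionary layout are made up for this illustration):

```python
def roles_for_step(step):
    """Which stored tensor plays which role in a given evolution step."""
    odd = (step % 2 == 1)
    return {
        "gammaA": "Gamma2" if odd else "Gamma1",
        "gammaB": "Gamma1" if odd else "Gamma2",
        # bond between the two sites of the pair, overwritten by the SVD
        "lambdaA": "lambda2" if odd else "lambda1",
        # bonds to the left and right of the pair
        "lambdaB": "lambda1" if odd else "lambda2",
    }


for step in range(4):
    print(step, roles_for_step(step))
```

Even steps update lambda1 and odd steps update lambda2, which is how two stored sites are enough to represent both kinds of bonds of the infinite chain.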
Here is the ConstructTheta implementation:
// this does the tensor network contraction as in fig 3, (i)->(ii) from iTEBD Vidal paper
template<typename T, unsigned int D>
Eigen::Tensor<T, 4> iTEBD<T, D>::ConstructTheta(const Eigen::Tensor<T, 2>& lambdaA, const Eigen::Tensor<T, 2>& lambdaB, const Eigen::Tensor<T, 3>& gammaA, const Eigen::Tensor<T, 3>& gammaB, const Eigen::Tensor<T, 4>& U)
{
    Eigen::Tensor<T, 4> theta = ContractTwoSites(lambdaA, lambdaB, gammaA, gammaB);

    // apply time evolution operator
    typedef Eigen::Tensor<T, 4>::DimensionPair DimPair;

    // from theta the physical indexes are contracted out
    // the last two become the physical indexes
    const Eigen::array<DimPair, 2> product_dim{ DimPair(1, 0), DimPair(2, 1) };

    // this applies the time evolution operator U
    return theta.contract(U, product_dim);
}
The last line applies the evolution operator, the rest is delegated to ContractTwoSites:
template<typename T, unsigned int D>
Eigen::Tensor<T, 4> iTEBD<T, D>::ContractTwoSites(const Eigen::Tensor<T, 2>& lambdaA, const Eigen::Tensor<T, 2>& lambdaB, const Eigen::Tensor<T, 3>& gammaA, const Eigen::Tensor<T, 3>& gammaB)
{
    // construct theta

    // contract lambda on the left with the first gamma
    // the resulting tensor has three legs, 1 is the physical one
    const Eigen::array<Eigen::IndexPair<int>, 1> product_dims1{ Eigen::IndexPair<int>(1, 0) };
    Eigen::Tensor<T, 3> thetaint = lambdaB.contract(gammaA, product_dims1);

    // contract the result with the lambda in the middle
    // the resulting tensor has three legs, 1 is the physical one
    const Eigen::array<Eigen::IndexPair<int>, 1> product_dims2{ Eigen::IndexPair<int>(2, 0) };
    thetaint = thetaint.contract(lambdaA, product_dims2).eval();

    // contract the result with the next gamma
    // the resulting tensor has four legs, 1 and 2 are the physical ones
    const Eigen::array<Eigen::IndexPair<int>, 1> product_dims3{ Eigen::IndexPair<int>(2, 0) };
    Eigen::Tensor<T, 4> theta = thetaint.contract(gammaB, product_dims3);

    // contract the result with the lambda on the right
    // the resulting tensor has four legs, 1 and 2 are the physical ones
    const Eigen::array<Eigen::IndexPair<int>, 1> product_dims4{ Eigen::IndexPair<int>(3, 0) };
    return theta.contract(lambdaB, product_dims4);
}
There isn’t much more to say here, the comments should be clear enough.
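If the chain of contract calls is hard to follow, the same contraction can be written with explicit loops. The Python sketch below is only an illustration, not a translation of the project code: it assumes the lambdas are passed as plain lists of diagonal values and that each gamma is indexed as gamma[left][physical][right], matching the leg order described above. The function name contract_two_sites is made up.

```python
def contract_two_sites(lambda_a, lambda_b, gamma_a, gamma_b):
    """theta[a][i][j][b] = lB[a] * sum_c gA[a][i][c] * lA[c] * gB[c][j][b] * lB[b]."""
    chi = len(lambda_a)          # bond dimension
    d = len(gamma_a[0])          # physical dimension
    theta = [[[[0.0] * chi for _ in range(d)] for _ in range(d)]
             for _ in range(chi)]
    for a in range(chi):                 # left bond
        for i in range(d):               # physical index of the first site
            for j in range(d):           # physical index of the second site
                for b in range(chi):     # right bond
                    inner = 0.0
                    for c in range(chi): # bond between the two sites
                        inner += gamma_a[a][i][c] * lambda_a[c] * gamma_b[c][j][b]
                    theta[a][i][j][b] = lambda_b[a] * inner * lambda_b[b]
    return theta
```

Because the lambda matrices are diagonal, they only rescale rows and columns, so the whole thing is a product of two site matrices with weights on the three bonds. The result has the two physical legs in positions 1 and 2, as the comments in the C++ code say.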
I’ll let you look over the project source^{1} for the code that is not presented here, it shouldn’t be hard to figure out by now.
I tested the code with some results I found for both imaginary time evolution and real time evolution, I’ll present here only one result I reproduced with the program:
It’s the magnetization real time evolution from Classical simulation of infinite-size quantum lattice systems in one spatial dimension^{5}, fig 6, for an infinite Ising chain with transverse magnetic field with initial state obtained by imaginary time evolution.
That’s about it for now. I guess the program should be easily changed to work with spin-1 Hamiltonians and with some more effort the finite chain could be implemented. I might implement both in the future, but for this post this is enough.
If you find any issues or have suggestions, please let me know.
It looks like I will be quite busy for a while so I won’t have much time for the blog. I might only post easy things like this one for a while… anyway, here it is, with only brief explanations and the JavaScript code. I didn’t feel the need and didn’t have the time and patience to write a C++ program for this topic. It’s just too simple and besides, it’s nice to see the code in action on the page.
So, this post is about the relaxation method combined with a multigrid method applied on the Laplace equation.
The methods are easy and I suppose Wikipedia pages are a good start, but here there are several other links for help, I just googled them, you can find many more on the subject. First, a paper^{1}. A web page is here^{2} and another one here^{3}.
You can find the Laplace equation (or more general, the Poisson equation) in various topics, for example originating from Gauss law if you use the electric potential instead of the electric field, or in the heat equation in a steady state when the time derivative drops out (also see: diffusion equation).
What is nice about this equation, apart from the superposition principle, is that the solutions are harmonic functions, which means that the value at a point is the average of the values around it (for a more rigorous explanation, please see the Wikipedia page). This allows us to use a relaxation method to solve it, although there are faster methods (for example, one could use the Fourier transform). Despite this, the method is easy to understand and can be a starting point for other methods of solving the equation.
The method is very easy: first the equation is discretized using a finite difference method, then one iteratively replaces each point in the discretized space with the average of its neighbors. This can be coupled with a multigrid method by starting out with a coarse grid first.
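As a minimal sketch of the averaging idea on a single fixed grid (no multigrid), here is a small Python version. It is an illustration only, with made-up names relax_once and solve; it stores the grid as a flat row-major list and updates the values in place, so each point already uses the new values of the neighbors visited before it.

```python
def relax_once(field, size):
    """One in-place relaxation sweep over the interior of a size x size grid.

    field is a flat list in row-major order; boundary values stay fixed.
    Returns the Euclidean norm of the change made by the sweep.
    """
    change = 0.0
    for row in range(1, size - 1):
        for col in range(1, size - 1):
            k = size * row + col
            # average of the four nearest neighbors
            new_val = (field[k - size] + field[k + size]
                       + field[k - 1] + field[k + 1]) / 4.0
            change += (field[k] - new_val) ** 2
            field[k] = new_val
    return change ** 0.5


def solve(field, size, tolerance=1e-10, max_sweeps=100000):
    """Relax until the change per sweep drops under tolerance; return sweep count."""
    for sweep in range(1, max_sweeps + 1):
        if relax_once(field, size) < tolerance:
            return sweep
    return max_sweeps
```

A quick way to convince yourself that it works is to put a linear function on the boundary: a linear function satisfies the discretized equation exactly, so the interior should relax to the same linear function.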
That should be enough theory, here is the result in action, on a very simple model on a square with the boundary condition of two sides with the field with value 1 and the other two with value -1:
The code just iterates relaxation until the difference between the old solution and the new one is under a certain threshold, then the resolution is increased and the same is done again until it reaches a certain resolution, then it starts again.
It could be instructive to change the code to not use the multigrid method, but instead to start from the beginning with the smallest mesh size.
Here is the model class:
function SquareModel(Size) {
    this.Size = Size;
    this.field = [];

    this.IndexForSize = function(row, col, size) {
        return size*row+col;
    };

    this.Index = function(row, col) {
        return this.IndexForSize(row, col, this.Size);
    };

    this.Value = function(matrix, row, col) {
        return matrix[this.Index(row, col)];
    };

    this.Boundary = function(row, col) {
        return row == 0 || col == 0 || row == this.Size - 1 || col == this.Size - 1;
    };

    this.Field = function(row, col) {
        return this.field[this.Index(row, col)];
    };

    this.SetBoundary = function() {
        for (i = 1; i < this.Size - 1; ++i) {
            // two borders to -1, two 1
            this.field[this.Index(0, i)] = -1;
            this.field[this.Index(this.Size - 1, i)] = -1;
            this.field[this.Index(i, this.Size - 1)] = 1;
            this.field[this.Index(i, 0)] = 1;
        }
    };

    this.Init = function() {
        for (i = 0; i < this.Size; ++i)
            for (j = 0; j < this.Size; ++j)
                this.field[this.Index(i, j)] = 0;

        this.SetBoundary();
    }

    this.Init();
}
There is not much to say about it apart from what I already described. I guess you could change it with some other model…
Here is the Relaxation class:
function Relaxation(relaxationModel) {
    this.StartSize = relaxationModel.Size;
    this.iteration = 1;
    this.newField = [];
    this.model = relaxationModel;

    this.SwapFields = function() {
        tmp = this.model.field;
        this.model.field = this.newField;
        this.newField = tmp;
    };

    this.MakeGridSmaller = function() {
        // make a smaller grid by interpolating the values from the source
        for (i = 0 ; i < this.model.Size; ++i)
            for (j = 0 ; j < this.model.Size; ++j) {
                this.newField[this.model.IndexForSize(2*i, 2*j, this.model.Size*2)] = this.model.Field(i, j);
                this.newField[this.model.IndexForSize(2*i+1, 2*j, this.model.Size*2)] = (this.model.Field(i, j) + (i<this.model.Size - 1 ? this.model.Field(i+1, j) : this.model.Field(i, j)))/2;
                this.newField[this.model.IndexForSize(2*i, 2*j+1, this.model.Size*2)] = (this.model.Field(i, j) + (j<this.model.Size - 1 ? this.model.Field(i, j+1) : this.model.Field(i, j)))/2;
                this.newField[this.model.IndexForSize(2*i+1, 2*j+1, this.model.Size*2)] = (this.model.Field(i, j)
                    + (i<this.model.Size - 1 ? this.model.Field(i+1, j) : this.model.Field(i, j))
                    + (j<this.model.Size - 1 ? this.model.Field(i, j+1) : this.model.Field(i, j))
                    + (i<this.model.Size - 1 && j<this.model.Size - 1 ? this.model.Field(i+1, j+1) : this.model.Field(i, j)))/4;
                // the above is not quite correct for the rightmost column and last row, but it doesn't matter, SetBoundary should set them to the correct values, anyway
            }

        this.SwapFields();
        this.model.Size *= 2;
        this.model.SetBoundary();
    };

    this.Reset = function() {
        this.model.Size = this.StartSize;
        this.model.Init();
        this.iteration = 1;
    };

    this.Relax = function() {
        change = 0;

        for (i = 0 ; i < this.model.Size; ++i)
            for (j = 0 ; j < this.model.Size; ++j)
                if (!this.model.Boundary(i, j)) {
                    oldVal = this.model.Field(i, j);
                    newVal = (this.model.Field(i - 1, j) + this.model.Field(i, j - 1) + this.model.Field(i, j + 1) + this.model.Field(i + 1, j))/4.0;
                    this.model.field[this.model.Index(i, j)] = newVal;
                    dif = oldVal - newVal;
                    change += dif * dif;
                }

        ++this.iteration;

        return Math.sqrt(change);
    };
}
The code uses the Gauss-Seidel method instead of the slower Jacobi method. As a note, the interpolation I implemented in the MakeGridSmaller method is very crude: it shifts the values further up and to the left than they should be. One can do much better, and it should be done better in ‘real life’ code. I didn’t have the patience to implement something better in the short time I allocated for this post.
In order to be fully functional, the code needs also functions to create the objects, successively apply the methods and display the results, and here it is:
function DisplayModel(canvas, model) {
    function Color(val) {
        r = 0;
        g = 0;
        b = 0;

        if (val > 0) {
            r = Math.ceil(255. * val);
            g = Math.floor(255. * (1. - val));
        } else {
            val *= -1;
            g = Math.floor(255. * (1. - val));
            b = Math.ceil(255. * val);
        }

        return "rgb(" + r.toString() + "," + g.toString() + "," + b.toString() + ")";
    };

    ctx = canvas.getContext("2d");
    displaySize = canvas.width / model.Size;
    rectSize = Math.ceil(displaySize);

    for (i = 0 ; i < model.Size; ++i)
        for (j = 0 ; j < model.Size; ++j) {
            ctx.fillStyle = Color(model.Field(i, j));
            ctx.fillRect(j * displaySize, i * displaySize, rectSize, rectSize);
        }
};

ParticularModel1 = new SquareModel(8);
Relaxation1 = new Relaxation(ParticularModel1);
canvas = document.getElementById("relaxationCanvas");

function Tick() {
    DisplayModel(canvas, Relaxation1.model);
    error = Relaxation1.Relax();

    if (error / (Relaxation1.model.Size * Relaxation1.model.Size) < 0.0000001) {
        Relaxation1.MakeGridSmaller();
        if (Relaxation1.model.Size == 256) Relaxation1.Reset();
    }
}

setInterval(Tick, 10);
Everything is packed into an anonymous closure and that’s about it:
(function() {
    // all the code from above is here
})();
Maybe I was too brief on this subject but it’s weekend… as usual, if you find something wrong, please point it out.
There are many things one might try in order to improve such an algorithm, for example the interpolation when switching to a finer mesh could be improved, and using over-relaxation will speed up the convergence.
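To give an idea of what over-relaxation means, here is a hedged Python sketch, not taken from the page code: each point is moved past the neighbor average by a factor omega between 1 and 2. The function names and the value omega = 1.5 are choices made for this illustration; the boundary follows the model above (top and bottom rows at -1, left and right columns at 1).

```python
def make_field(size):
    """Flat row-major grid: top/bottom rows at -1, left/right columns at 1."""
    field = [0.0] * (size * size)
    for i in range(1, size - 1):
        field[i] = -1.0                        # top row
        field[size * (size - 1) + i] = -1.0    # bottom row
        field[size * i] = 1.0                  # left column
        field[size * i + size - 1] = 1.0       # right column
    return field


def sor_sweep(field, size, omega):
    """One over-relaxed in-place sweep; omega = 1 is plain Gauss-Seidel."""
    change = 0.0
    for row in range(1, size - 1):
        for col in range(1, size - 1):
            k = size * row + col
            average = (field[k - size] + field[k + size]
                       + field[k - 1] + field[k + 1]) / 4.0
            delta = omega * (average - field[k])   # overshoot the average
            field[k] += delta
            change += delta * delta
    return change ** 0.5


def count_sweeps(size, omega, tolerance=1e-8, max_sweeps=100000):
    """Number of sweeps until the change per sweep drops under tolerance."""
    field = make_field(size)
    for sweep in range(1, max_sweeps + 1):
        if sor_sweep(field, size, omega) < tolerance:
            return sweep
    return max_sweeps


print(count_sweeps(17, 1.0), count_sweeps(17, 1.5))
```

On a 17 x 17 grid the over-relaxed version should need noticeably fewer sweeps than omega = 1. The best omega depends on the grid size, so 1.5 is just a reasonable guess here, not a tuned value.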
As promised in the Numerical Renormalization Group post, I implemented a Density Matrix Renormalization Group program. As I start writing this post, the program is already on GitHub^{1}. It’s quite basic, currently it is implemented only for Heisenberg model chains, for both spin 1/2 and 1. It runs only for an even number of sites, and symmetries are not used to speed up the computation. Also it does not work for fermionic operators, although I might change it in the future to have it working for a Hubbard chain.
As for other posts, here is the program in action:
Currently it displays the ground state energy/site and the local bond strength. I checked that it is able to reproduce a result from Density-Matrix algorithms for quantum renormalization groups^{2}, fig 6a. One should be able to add ‘measurements’ for other operators fairly easily, either simple ones or long range correlations, this is only an example.
As usual, since I cannot cover the theory in much detail, I’ll give a lot of links that should help in understanding the program.
First, I’ll point to some related things that exist on this site. Since it’s a variational method, you might want to visit the How to solve a quantum many body problem post. You might also want to visit the Renormalization Groups post for some generalities about renormalization. Very important is the Numerical Renormalization Group post. The idea for DMRG originated from there, and DMRG ideas also went back into NRG to make the Density Matrix Numerical Renormalization Group (which I briefly mentioned in that post, but you’ll find links in there with more details).
Here^{3} there are some lectures by Adrian Feiguin, with both theory and C++ code. Here is a dissertation: Competition of magnetic and superconducting ordering in one-dimensional generalized Hubbard models^{4} by Christian Dziurzik. There is an annex there that shows a way of handling fermionic operators signs. Here is the PhD thesis of Javier Rodriguez Laguna: Real Space Renormalization Group Techniques and Applications^{5}. A course by Andre Luiz Malvezzi: An introduction to numerical methods in low-dimensional quantum systems^{6}. And the last one in this paragraph: Density Matrix Renormalization Group for Dummies^{7} by Gabriele De Chiara, Matteo Rizzi, Davide Rossini, and Simone Montangero.
A review paper by Ulrich Schollwoeck: The density-matrix renormalization group^{8}. A newer one by the same author is The density-matrix renormalization group in the age of matrix product states^{9}, oriented, as the title says, towards Matrix Product States, which anybody who wants to implement more than a toy program must look into.
There are plenty of serious DMRG codes available, used for research. I’ll offer a link to only one of them: CheMPS2: a spin-adapted implementation of DMRG for ab initio quantum chemistry^{10}. You’ll find there not only the source code, but also links to publications, a workshop video and a user manual.
Since this blog is oriented to relatively small projects that are easier to understand than the sophisticated ones used for research, I’ll point to several simpler ones:
I found Tiny DMRG^{11} while looking for a Lanczos algorithm implementation. I had my own implementation, but because of the loss of orthogonality I looked for another one that prevents it. I was simply too lazy to implement it myself, so I took the one from this project and changed it quite a bit to be easier to understand (also to use Eigen). Another pedagogical one is DMRG101^{12}, implemented in Python. This one is implemented in C++ and requires armadillo: DMRGCat^{13}. This one implements Hubbard chains: Hubbard DMRG^{14}. I put the link here because for fermionic Hubbard chains one must take special care of fermionic operators (there is an annoying sign issue). I don’t know if I’m going to implement it in DMRG; if not, one can have a taste of it from the NRG project or look over this link to see how it’s done. Look also over the paper links above, it’s described in quite some detail. One more tutorial is Simple DMRG^{15}. One using Eigen and Matrix Product States is Eigen DMRG^{16}.
Another link that is worth a look is ITensor^{17}. You’ll find there – among others – info on Matrix Products Operators, Singular Value Decomposition and another way of dealing with fermionic operators, that is, Jordan-Wigner transformation.
Here is a lecture by Steven White:
Ok, those should suffice for now, I’m sure you can find more yourself if needed…
Since the Density Matrix is an important concept for DMRG, I’ll try to write more about it here. I already mentioned it on the blog several times but I did not detail it much.
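If the system is in a pure state $|\psi\rangle$, then for an observable O we have a Hermitian operator $\hat{O}$ for which the expectation value is:
$$\langle O \rangle = \langle\psi|\hat{O}|\psi\rangle = \sum_i \langle\psi|\hat{O}|i\rangle\langle i|\psi\rangle = \sum_i \langle i|\psi\rangle\langle\psi|\hat{O}|i\rangle$$
where the unit operator $\sum_i |i\rangle\langle i|$ was inserted. $|\psi\rangle\langle\psi|$ is an operator and we denote it by $\hat{\rho}$. It’s the Density Matrix operator for the pure state $|\psi\rangle$. We have:
$$\langle O \rangle = \mathrm{Tr}(\hat{\rho}\hat{O})$$
You can immediately prove that the Density Matrix operator is a projection operator. Tr is the trace.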
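These statements are easy to check numerically on a two-level system. The Python sketch below is an illustration with made-up helper names; it builds the density matrix of a pure state as the outer product of the state with itself, so that one can compare the trace formula with the direct expectation value and verify the projection property.

```python
def density_matrix(psi):
    """rho = |psi><psi| for a normalized state given as a list of amplitudes."""
    return [[a * b.conjugate() for b in psi] for a in psi]


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    """Sum of the diagonal elements."""
    return sum(a[i][i] for i in range(len(a)))


def expectation(psi, op):
    """Direct evaluation of <psi|O|psi>."""
    n = len(psi)
    return sum(psi[i].conjugate() * op[i][j] * psi[j]
               for i in range(n) for j in range(n))


psi = [0.6, 0.8j]                 # normalized: 0.36 + 0.64 = 1
sigma_z = [[1, 0], [0, -1]]
rho = density_matrix(psi)
print(trace(matmul(rho, sigma_z)), expectation(psi, sigma_z))
```

Both printed numbers should agree, and multiplying rho by itself should give rho back, which is the projection property mentioned above.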
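Often instead of knowing a pure quantum state for a system, we have only a set of quantum states associated with probabilities for them. In that case, for an observable O we obtain the expected value as usual in the case with the classical probabilities:
$$\langle O \rangle = \sum_i p_i \langle O \rangle_i = \sum_i p_i \langle\psi_i|\hat{O}|\psi_i\rangle$$
$p_i$ is the probability of having the system in the pure state $|\psi_i\rangle$. $\langle O \rangle_i$ is calculated as usual for the case of the pure quantum state $|\psi_i\rangle$. The density matrix turns into:
$$\hat{\rho} = \sum_i p_i |\psi_i\rangle\langle\psi_i|$$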
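To see how classical probabilities can arise from limited knowledge, let’s consider a system, called the Universe, composed of two systems, one called System, one Environment. Let’s say the System Hilbert space dimension is N and the Environment Hilbert space dimension is M. The Hilbert space for the Universe has dimension NxM and it’s the tensor product of the two Hilbert spaces. Let’s assume that the Universe is in a pure quantum state, $|\psi\rangle$. The density matrix for it is $\hat{\rho} = |\psi\rangle\langle\psi|$. Writing
$$|\psi\rangle = \sum_{i,j} c_{ij} |i\rangle|j\rangle$$
with $|i\rangle$ being the basis vectors of the System, $|j\rangle$ the basis vectors of the Environment and $c_{ij} = \langle i|\langle j|\psi\rangle$.
Considering the expectancy value of an observable that acts only on the System and does nothing to the Environment, that is:
$$\hat{O} = \hat{O}_S \otimes \hat{1}_E$$
after a bit of algebra one can find that:
$$\langle O \rangle = \mathrm{Tr}(\hat{\rho}_S \hat{O}_S)$$
With $\hat{\rho}_S$ being the reduced density matrix obtained by tracing out the Environment:
$$\hat{\rho}_S = \mathrm{Tr}_E\,\hat{\rho}, \qquad (\hat{\rho}_S)_{ii'} = \sum_j c_{ij} c^*_{i'j}$$
The reduced density matrix describes a mixed state, if the System is entangled with the Environment.
The System is not entangled with the Environment if one can find coefficients $a_i$ and $b_j$ such that for $|\psi_S\rangle = \sum_i a_i |i\rangle$ and $|\psi_E\rangle = \sum_j b_j |j\rangle$, $c_{ij} = a_i b_j$, that is, the pure state of the Universe can be written as a simple product between states of the System and Environment. If the Universe state cannot be expressed like that, the System and Environment are entangled.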
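Tracing out the Environment is a one-liner once the state of the Universe is stored as a coefficient matrix with one row per System basis vector and one column per Environment basis vector. The Python sketch below is an illustration with made-up names; it also computes the trace of the squared matrix, which is 1 for a pure state and smaller for a mixed one.

```python
def reduced_density_matrix(c):
    """Trace out the Environment: rho_S[i][k] = sum_j c[i][j] * conj(c[k][j])."""
    n, m = len(c), len(c[0])
    return [[sum(c[i][j] * c[k][j].conjugate() for j in range(m))
             for k in range(n)] for i in range(n)]


def purity(rho):
    """Tr(rho^2): 1 for a pure state, less than 1 for a mixed one."""
    n = len(rho)
    return sum(rho[i][k] * rho[k][i] for i in range(n) for k in range(n)).real


s = 2 ** -0.5
entangled = [[s, 0.0], [0.0, s]]     # (|0>|0> + |1>|1>) / sqrt(2)
product = [[s, s], [0.0, 0.0]]       # |0> (|0> + |1>) / sqrt(2)

print(purity(reduced_density_matrix(entangled)))   # mixed: about 0.5
print(purity(reduced_density_matrix(product)))     # pure: about 1
```

The first state is maximally entangled, so the System alone looks like an equal mixture of its two basis states. The second is a simple product, so the System stays in a pure state.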
How could you test if a density matrix describes a pure quantum state or a mixed one? For the above Universe case, how could one check if the reduced density matrix for the System describes an entangled state or a pure one? One way would be to notice that in the pure quantum state case $\hat{\rho}^2 = \hat{\rho}$. Another equivalent way is to find the rank (Schmidt rank, see the details below) of the matrix. If it’s 1, it’s a pure state. Another way to deal with it is the Von Neumann entropy, which is:
$$S = -\mathrm{Tr}(\hat{\rho} \ln \hat{\rho})$$
It’s easy to check that for a pure state the entropy is zero.
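In a basis where the density matrix is diagonal, the entropy is a sum over its eigenvalues, which is how one would normally compute it. A short Python sketch (the function name and the cutoff used to skip zero eigenvalues are choices made for this illustration):

```python
import math


def von_neumann_entropy(eigenvalues, cutoff=1e-15):
    """S = -sum_k p_k ln p_k over the eigenvalues p_k of a density matrix.

    Eigenvalues at or below cutoff are skipped, since p ln p goes to 0 with p.
    """
    return -sum(p * math.log(p) for p in eigenvalues if p > cutoff)


print(von_neumann_entropy([1.0, 0.0]))    # pure state
print(von_neumann_entropy([0.5, 0.5]))    # maximally mixed two-level system
```

A pure state has a single eigenvalue equal to 1, so its entropy vanishes, while the maximally mixed two-level system gives ln 2.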
For an operator O represented by an n x n matrix, one can use eigendecomposition to switch to a basis where the operator is diagonal. For our description we have a ‘problem’: the System and the Environment do not necessarily have equal dimensions. There is a generalization of eigendecomposition which can be used, the Singular Value Decomposition. The generalization for eigenvalues is singular values. Applying the decomposition on the matrix of the coefficients of the pure state of the Universe, one obtains the Schmidt decomposition for the wave vector:
$$|\psi\rangle = \sum_k \lambda_k |k\rangle_S |k\rangle_E$$
k runs up to $\min(N, M)$ and the $\lambda_k$ are the singular values. It means we can change the basis for both the System and the Environment to have the wave vector coefficients matrix diagonal. In this basis, called the Schmidt basis, tracing out the Environment gives the reduced density matrix for the System:
$$\hat{\rho}_S = \sum_k \lambda_k^2 |k\rangle_S \langle k|_S$$
It’s similar when tracing out the System, the reduced density matrix of the Environment has the same eigenvalues as the one for the System, except that obviously the basis vectors are for the Environment instead of the System ones. Please check out the Lecture Notes of Adrian Feiguin^{3} for details.
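For the smallest case, two basis states for the System and two for the Environment, the singular values can be written in closed form, which allows a quick experiment without any linear algebra library. The Python sketch below is an illustration restricted to real 2x2 coefficient matrices; the function name is made up.

```python
import math


def schmidt_coefficients_2x2(c):
    """Singular values of a real 2x2 coefficient matrix c[i][j], largest first."""
    # The eigenvalues of the symmetric matrix c c^T are the squared singular
    # values, that is, the eigenvalues of the reduced density matrix.
    a = c[0][0] ** 2 + c[0][1] ** 2
    d = c[1][0] ** 2 + c[1][1] ** 2
    b = c[0][0] * c[1][0] + c[0][1] * c[1][1]
    mean = 0.5 * (a + d)
    half_gap = math.hypot(0.5 * (a - d), b)
    # max() guards against a tiny negative value caused by rounding
    return [math.sqrt(mean + half_gap), math.sqrt(max(mean - half_gap, 0.0))]


s = 2 ** -0.5
print(schmidt_coefficients_2x2([[s, 0.0], [0.0, s]]))   # two equal coefficients
print(schmidt_coefficients_2x2([[s, s], [0.0, 0.0]]))   # a single nonzero one
```

The number of nonzero coefficients is the Schmidt rank: 2 for the entangled state in the first call, 1 for the product state in the second.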
I will refer again to the Numerical Renormalization Group post. I mentioned there that it works because of the exponential drop in the interaction strength. If one tries to apply the same method in real space to a chain, one finds out that it doesn’t work. You can’t simply add a site, diagonalize the Hamiltonian, pick up the states that have the lowest eigenvalues and drop the other ones, then repeat. Or better said, you can, but the results will not be very good. It turns out that the interaction with the environment/boundary conditions matters so much that it cannot be ignored while extending the chain. Wilson tried the blocks renormalization group on a single particle in a box and showed that it fails. For details please check chapter 2 of Real Space Renormalization Group Techniques and Applications^{5}.
Steven White published the solution in 1992: Density matrix formulation for quantum renormalization groups Steven R White, Phys. Rev. Lett. 69, 2863^{18}. The solution is based on two important things: do not ignore the environment and a better way of selecting the states that should be kept, than simply selecting those with the lowest energies.
The first one is obtained by considering not only the System as in NRG, grown one site each step, but also the Environment. The Environment can be obtained in the same way as the System, or if there is a mirror symmetry, one can simply reflect the System into the Environment. For the finite size algorithm, an Environment calculated at a previous step is used. This way not only the System is taken into account, but also the Environment and the interactions between the System and Environment, joined together into a superblock.
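The selection of the best wave vectors when truncating the basis is done by considering the vectors that have a high probability. Considering the Schmidt decomposition, with Schmidt coefficients $\lambda_k$ and System basis vectors $|k\rangle_S$, and the expectancy value:
$$\langle O \rangle = \mathrm{Tr}(\hat{\rho}_S \hat{O}_S) = \sum_k \lambda_k^2 \, {}_S\langle k|\hat{O}_S|k\rangle_S$$
The operator is arbitrary; if we want an algorithm that keeps the most important states independent of it, we should keep the ones that correspond to the largest eigenvalues $\lambda_k^2$ of the reduced density matrix, this way we should expect small errors.
Alternatively, one can see that if we want to obtain a good approximation for the whole pure wave vector $|\psi\rangle$, $|\tilde{\psi}\rangle \approx |\psi\rangle$, we should try to minimize
$$\left\| |\psi\rangle - |\tilde{\psi}\rangle \right\|^2$$
For the coefficient matrices this is the Frobenius distance and the approximation is the low-rank approximation. For a given number of kept states r, with the $\lambda_k$ sorted in decreasing order, the distance becomes:
$$\left\| |\psi\rangle - |\tilde{\psi}\rangle \right\|^2 = \sum_{k > r} \lambda_k^2$$
Now it should be quite clear that we must retain the wave vectors corresponding to the largest eigenvalues of the reduced density matrix.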
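In code, the selection rule is just a sort. The Python sketch below is an illustration with a made-up function name: it keeps a given number of the largest eigenvalues of the reduced density matrix and reports the sum of the discarded ones, the quantity usually called the truncation error.

```python
def truncate(eigenvalues, keep_states):
    """Keep the keep_states largest eigenvalues; return (kept, truncation_error)."""
    ordered = sorted(eigenvalues, reverse=True)
    kept = ordered[:keep_states]
    truncation_error = sum(ordered[keep_states:])   # discarded weight
    return kept, truncation_error


kept, error = truncate([0.1, 0.6, 0.05, 0.25], 2)
print(kept, error)
```

Since the eigenvalues of a density matrix sum to 1, the truncation error is directly the fraction of the weight that was thrown away, so it is a convenient number to monitor while the algorithm runs.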
It turns out that this method also retains a maximum entanglement between System and Environment. Just calculate the von Neumann entropy and see how it changes by truncation.
That should be enough theory, from now on I’ll present things very briefly together with the code. For more details please check the links.
Here is how the infinite size algorithm looks in the code:
template<class SiteHamiltonianType, class BlockType>
double GenericDMRGAlgorithm<SiteHamiltonianType, BlockType>::CalculateInfinite(int chainLength)
{
    if (chainLength % 2) ++chainLength;
    if (chainLength <= 0) return std::numeric_limits<double>::infinity();

    ClearInit();

    finiteAlgorithm = false;

    double result = 0;
    truncationError = 0;

    for (int i = 0; 2 * systemBlock->length < chainLength; ++i)
        result = Step(i);

    return result;
}
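To see what the loop condition does, here is a small Python sketch that only tracks the block length. It assumes that the initial block has a single site and that each Step call adds exactly one site to the system block; both are assumptions of this illustration, and the function name is made up.

```python
def infinite_algorithm_lengths(chain_length, start_length=1):
    """System block length after each step of the growth loop."""
    if chain_length % 2:
        chain_length += 1          # odd lengths are rounded up to even
    if chain_length <= 0:
        return []                  # nothing to do for an empty chain
    lengths = []
    length = start_length
    while 2 * length < chain_length:
        length += 1                # stand-in for Step(): the block grows by one site
        lengths.append(length)
    return lengths


print(infinite_algorithm_lengths(10))
```

With the environment being a mirror copy of the system, the superblock has twice the block length, so the loop stops as soon as the superblock covers the requested chain.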
Of course, the relatively complex work is in the Step call, but I’ll detail that later.
The algorithm can keep track of measurement operators, too. They have to be transformed, too. For details, please check out the documentation. The toy program implements it only for the left side blocks (for the last sweep), it can be implemented for both left and right but in this case I took advantage of the mirror symmetry.
Here is the finite size algorithm implementation:
template<class SiteHamiltonianType, class BlockType>
double GenericDMRGAlgorithm<SiteHamiltonianType, BlockType>::CalculateFinite(int chainLength, int numSweeps)
{
    if (chainLength % 2) ++chainLength;
    if (chainLength <= 0) return std::numeric_limits<double>::infinity();

    ClearInit();

    finiteAlgorithm = false;
    truncationError = 0;

    systemBlocksRepository = new std::map<int, std::unique_ptr<BlockType>>();
    environmentBlocksRepository = new std::map<int, std::unique_ptr<BlockType>>();

    double result = 0;

    // infinite size algorithm
    AddToRepository(systemBlock->length, systemBlock, systemBlocksRepository);
    AddToRepository(environmentBlock->length, environmentBlock, environmentBlocksRepository);

    for (int i = 0; 2 * systemBlock->length < chainLength; ++i)
    {
        result = Step(i);

        AddToRepository(systemBlock->length, systemBlock, systemBlocksRepository);
        AddToRepository(environmentBlock->length, environmentBlock, environmentBlocksRepository);
    }

    // finite size algorithm
    bool left = false;
    finiteAlgorithm = true;

    for (int i = 0; i < numSweeps; ++i)
    {
        TRACE("SWEEP NUMBER %d\n", i);

        for (int step = 0;; ++step)
        {
            int key = chainLength - systemBlock->length - 1;

            delete environmentBlock;
            environmentBlock = (*environmentBlocksRepository)[key].release();
            environmentBlocksRepository->erase(key);

            if (1 == environmentBlock->length)
            {
                // we reached the end of the chain, start sweeping in the other direction
                std::swap(environmentBlock, systemBlock);
                std::swap(environmentBlocksRepository, systemBlocksRepository);
                left = !left;

                // put the state we're starting with to the systems repository (it just switched from environment)
                // we removed it from there and we need it back
                AddToRepository(systemBlock->length, systemBlock, systemBlocksRepository);

                if (!left && numSweeps - 1 == i) addInteractionOperator = true;
            }

            result = Step(step);

            AddToRepository(systemBlock->length, systemBlock, systemBlocksRepository);

            if (!left && systemBlock->length == chainLength / 2) break;
        }
    }

    CalculateResults();

    return result;
}
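It’s a bit more complex, but with the help of the comments it should be relatively easy to understand. AddToRepository saves the blocks, while the if (1 == environmentBlock->length) check detects the end of the chain, where the sweep direction is reversed and the system and environment blocks swap roles.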
Here is the code for Step:
template<class SiteHamiltonianType, class BlockType> double GenericDMRGAlgorithm<SiteHamiltonianType, BlockType>::Step(int step)
{
    // extend blocks and operators
    Extend();
    // a simple copy is enough for this toy program
    // one might need to construct the right block too, depending on the model
    // if mirror symmetry does not exist
    if (!finiteAlgorithm) CopySystemBlockToEnvironment();
    unsigned int SysBasisSize = (unsigned int)systemBlock->hamiltonian.matrix.cols();
    unsigned int EnvBasisSize = (unsigned int)environmentBlock->hamiltonian.matrix.cols();
    // join the system block and the environment block into the superblock
    Operators::Hamiltonian superblockHamiltonian = CalculateSuperblock(SysBasisSize, EnvBasisSize);
    // get the ground state of the superblock hamiltonian
    double GroundStateEnergy = LanczosGroundState(superblockHamiltonian, GroundState);
    TRACE(L"Energy/site: %f for system length: %d (Sys: %d, Env: %d) at step: %d\n", GroundStateEnergy / (systemBlock->length + environmentBlock->length), (systemBlock->length + environmentBlock->length), systemBlock->length, environmentBlock->length, step);
    // construct the reduced density matrix
    // the density matrix for the ground state is |GroundState><GroundState|
    // in terms of system and environment basis states the ground state is |GroundState> = \Sum_{i,j} c_{i,j} |i>|j>
    // where |i> is environment vector and |j> is system vector (the same convention as in the DensityMatrix constructor)
    // c_{i,j} = <i|<j|GroundState>
    // the density matrix constructor takes care of getting the reduced density matrix from c_{i,j}
    Operators::DensityMatrix densityMatrix(GroundState, SysBasisSize, EnvBasisSize);
    densityMatrix.Diagonalize();
    const Eigen::MatrixXd& eigenV = densityMatrix.eigenvectors();
    eigenvals = densityMatrix.eigenvalues();
    // now pick the ones that have the biggest values
    // they are ordered with the lowest eigenvalue first
    unsigned int keepStates = std::min<unsigned int>(maxStates, SysBasisSize);
    // construct the transform matrix from the chosen vectors (the ones with the higher probability)
    Eigen::MatrixXd Ut(eigenV.rows(), keepStates);
    int numStates = (int)eigenV.cols();
    truncationError = 1.;
    // also calculate the truncation error
    for (unsigned int i = 0; i < keepStates; ++i)
    {
        int index = numStates - i - 1;
        Ut.col(i) = eigenV.col(index);
        truncationError -= eigenvals(index);
    }
    TRACE("Truncation Error: %f\n", truncationError);
    const Eigen::MatrixXd U = Ut.adjoint();
    TransformOperators(U, Ut);
    return GroundStateEnergy;
}
With the help of the comments it should be easy to understand. Of course some of the work is delegated to other methods, either from the algorithm class or blocks. I’ll detail the most relevant ones later.
The program is quite simple and the mfc classes are not really interesting. It’s a standard mfc program, so I won’t describe them in detail anymore. It’s just a doc/view program with options, a property sheet and two property pages on it. It has a Chart class, which I took from another project already described on this blog. There is a ComputationThread class that also resembles thread classes from other projects, from which DMRGThread is derived. The thread is started in CdmrgDoc::StartComputing() like this:
if (0 == theApp.options.model) thread = new DMRGThread<DMRG::Heisenberg::DMRGHeisenbergSpinOneHalf>(theApp.options.sites, theApp.options.sweeps, theApp.options.states); else thread = new DMRGThread<DMRG::Heisenberg::DMRGHeisenbergSpinOne>(theApp.options.sites, theApp.options.sweeps, theApp.options.states); thread->Start();
It would be quite easy to add another algorithm to it.
Please visit an older entry if you want to see a similar program described in a little more detail. Here I’ll just describe what’s different and related to DMRG.
Operators are implemented in the DMRG::Operators namespace. The root class is Operator, but one is not supposed to use it directly. Two classes are derived from it: SiteOperator, from which all ‘site’ operator classes are derived, and DiagonalizableOperator, from which the Hamiltonian and DensityMatrix operator classes are derived, obviously because we need diagonalization for them.
There is quite a bit of resemblance with the operator classes implemented in the Numerical Renormalization Group project. Ideally they should share the classes, but that’s not happening yet. There are differences; maybe in the future I’ll change them so that they are common. Since I mentioned NRG, I should mention that I kept a convention that is used there: a newly added site (for the left block) is put in the most significant bits position in the Hamiltonian, that is, to the left. This way, when the Hamiltonian is extended, the old Hamiltonian for the existing sites appears as blocks on the diagonal of the new matrix. I find it easier to visualize the newly added site (operators for it) in the matrix. It’s also easier to deal with the fermionic operators that way. Since currently the DMRG project implements only Heisenberg chains, it deals only with bosonic operators, so there is no sign issue. If you want to implement a Hubbard chain then you will have to deal with it. I already provided links that describe the issue in detail. You can also check out the Numerical Renormalization Group project and see how it’s handled there. Currently there is not much support for it in the DMRG project: the Operator class has a changeSign flag that should indicate if the operator is fermionic, but the implementation does not use it yet.
Please be aware that some tutorial projects and papers might extend the left Hamiltonian the other way around than in the convention I used, that is, I use $I_{site} \otimes H_{block}$ for the left block while they might use $H_{block} \otimes I_{site}$.
Since I used tensor products in the formulae above, I should mention that the Kronecker products are implemented as static members in the Operator class. One implements the product between two matrices and the other two the product with the identity matrix.
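To make the ordering convention concrete, here is a minimal sketch of a Kronecker product on plain nested vectors. Kronecker and Identity are hypothetical helpers written for this illustration only; they are not the Eigen based static members of the Operator class, and the sketch assumes non-empty rectangular matrices.

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Kronecker product A (x) B: every element a(i, j) is replaced by the block a(i, j) * B.
// Assumes both matrices are non-empty and rectangular.
Matrix Kronecker(const Matrix& A, const Matrix& B)
{
    const std::size_t rowsA = A.size(), colsA = A[0].size();
    const std::size_t rowsB = B.size(), colsB = B[0].size();

    Matrix result(rowsA * rowsB, std::vector<double>(colsA * colsB, 0.));

    for (std::size_t i = 0; i < rowsA; ++i)
        for (std::size_t j = 0; j < colsA; ++j)
            for (std::size_t k = 0; k < rowsB; ++k)
                for (std::size_t l = 0; l < colsB; ++l)
                    result[i * rowsB + k][j * colsB + l] = A[i][j] * B[k][l];

    return result;
}

// Identity matrix of the given size
Matrix Identity(std::size_t size)
{
    Matrix result(size, std::vector<double>(size, 0.));
    for (std::size_t i = 0; i < size; ++i) result[i][i] = 1.;

    return result;
}
```

With the first factor being the most significant one, Kronecker(Identity(2), H) places the old matrix H twice on the diagonal of the result, while Kronecker(Sz, Identity(n)) gives the operator of a newly added site, with +1/2 on the first n diagonal elements and -1/2 on the last n.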
There is not much else to say about the operators. I think the most important one to check is the DensityMatrix:
DensityMatrix::DensityMatrix(const Eigen::VectorXd& GroundState, unsigned int SysBasisSize, unsigned int EnvBasisSize, bool left) : DiagonalizableOperator(SysBasisSize, false)
{
    // the density matrix for the ground state is |GroundState><GroundState|
    // in terms of system and environment states the ground state is |GroundState> = \Sum_{i,j} c_{i,j} |i>|j>
    // where |i> is environment vector and |j> is system vector
    // c_{i,j} = <i|<j|GroundState>
    Eigen::MatrixXd c(EnvBasisSize, SysBasisSize);
    // first construct the coefficients matrix
    for (unsigned int i = 0; i < EnvBasisSize; ++i)
        for (unsigned int j = 0; j < SysBasisSize; ++j)
            c(i, j) = GroundState(i * SysBasisSize + j);
    //trace out environment
    if (left) matrix = c.adjoint() * c;
    else matrix = c * c.adjoint();
    // this traces out the environment because
    // \rho^{sys} = Tr_{env}|GroundState><GroundState| = \Sum_i <i|GroundState><GroundState|i>
    // sandwiching it between <j| and |k> =>
    // \rho^{sys}_{j,k} = \Sum_i <i|<j|GroundState><GroundState|k>|i>
    // where |i> are environment vectors, |j> and |k> are both for system
    // so \rho^{sys}_{j,k} = \Sum_i c_{i,j} * c^*_{i,k}
    // that is, \rho = c^\dagger * c
}
Hopefully the comments are helpful. ‘Site’ operators are quite easy, for example $S_z$ for spin 1/2:
SzOneHalf::SzOneHalf(unsigned int size) : SiteOperator(size, false) { int subsize = size / 2; matrix.block(0, 0, subsize, subsize) = 1. / 2. * Eigen::MatrixXd::Identity(subsize, subsize); matrix.block(subsize, subsize, subsize, subsize) = -1. / 2. * Eigen::MatrixXd::Identity(subsize, subsize); }
It should be quite obvious that it is for: $S_z = \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$
Please check the project sources^{1} for the implementation of the rest of the operators. There is not much more to it, really.
I wrote a generic class for blocks, GenericBlock:
template<class SiteHamiltonianType> class GenericBlock
{
public:
    GenericBlock(bool Left = true);
    virtual ~GenericBlock();
    int length;
    bool left;
    SiteHamiltonianType hamiltonian; //we start with a single site hamiltonian
    SiteHamiltonianType SiteHamiltonian;
    virtual void Extend();
    virtual Operators::Hamiltonian GetInteractionHamiltonian() const = 0;
    int GetSingleSiteBasisSize() const { return SiteHamiltonian.cols(); }
};
GetInteractionHamiltonian is supposed to be implemented in a derived class. Extend should also be implemented in a derived class, but the base class has an implementation for it:
template<class SiteHamiltonianType> void GenericBlock<SiteHamiltonianType>::Extend() { Operators::Hamiltonian interactionHamiltonian = GetInteractionHamiltonian(); hamiltonian.Extend(SiteHamiltonian, interactionHamiltonian, left); ++length; }
That’s about it. For a specific model you just derive from it:
template<class SiteHamiltonianType, class SzType, class SplusType> class HeisenbergBlock : public GenericBlock<SiteHamiltonianType> { public: HeisenbergBlock(bool Left = true); virtual ~HeisenbergBlock(); SzType SzForBoundarySite; SplusType SplusForBoundarySite; static SzType SzForNewSite; static SplusType SplusForNewSite; virtual Operators::Hamiltonian GetInteractionHamiltonian() const; virtual void Extend(); };
It contains operators needed for extending the block and implements the two methods and that’s about it.
Here is how it calculates the interaction Hamiltonian needed for adding a new site:
template<class SiteHamiltonianType, class SzType, class SplusType> Operators::Hamiltonian HeisenbergBlock<SiteHamiltonianType, SzType, SplusType>::GetInteractionHamiltonian() const { Operators::Hamiltonian interactionHamiltonian(hamiltonian.GetSingleSiteSize()); if (left) { interactionHamiltonian.matrix = Operators::Operator::KroneckerProduct(SzForNewSite.matrix, SzForBoundarySite.matrix) + 1. / 2. * (Operators::Operator::KroneckerProduct(SplusForNewSite.matrix, SplusForBoundarySite.matrix.adjoint()) + Operators::Operator::KroneckerProduct(SplusForNewSite.matrix.adjoint(), SplusForBoundarySite.matrix)); } else { interactionHamiltonian.matrix = Operators::Operator::KroneckerProduct(SzForBoundarySite.matrix, SzForNewSite.matrix) + 1. / 2. * (Operators::Operator::KroneckerProduct(SplusForBoundarySite.matrix, SplusForNewSite.matrix.adjoint()) + Operators::Operator::KroneckerProduct(SplusForBoundarySite.matrix.adjoint(), SplusForNewSite.matrix)); } return interactionHamiltonian; }
and here is how the Extension
is extended (sic):
template<class SiteHamiltonianType, class SzType, class SplusType> void HeisenbergBlock<SiteHamiltonianType, SzType, SplusType>::Extend() { int BasisSize = (int)hamiltonian.matrix.cols(); GenericBlock::Extend(); if (left) { SplusForBoundarySite.matrix = Operators::Operator::KroneckerProductWithIdentity(SplusForNewSite.matrix, BasisSize); SzForBoundarySite.matrix = Operators::Operator::KroneckerProductWithIdentity(SzForNewSite.matrix, BasisSize); } else { SplusForBoundarySite.matrix = Operators::Operator::IdentityKronecker(BasisSize, SplusForNewSite.matrix); SzForBoundarySite.matrix = Operators::Operator::IdentityKronecker(BasisSize, SzForNewSite.matrix); } }
This concludes the blocks discussion. I should mention two things, though.
First, it’s usually safe to ignore checks like if (left) in this project (there might be a couple of places where it’s not) and to consider only the code in the if (left) branch, ignoring the else part. The reason is that the right block is simply copied from the left one. I added the ‘left’ check just in case the code is extended to construct the right block in a similar manner to the left one.
Second, you might be a little confused by the block copy. It should be reflected, right? You may visualize what’s happening by considering the two left and right blocks with the two sites in between (as in the featured image but symmetrical). Bend the chain in the middle until the right block is turned under the left one. Together with a relabeling of the sites in the right (now under) block, it should be more clear what is happening.
I already presented the most important parts of the algorithm implementation above, but there is more to it. I implemented a GenericDMRGAlgorithm class where most of the code is, the model specific code being the responsibility of derived classes. This class is the most important one, but I cannot present it here entirely. Please check the source code for details about it. What I presented here should be enough to help you understand it. Just one more thing about it here:
template<class SiteHamiltonianType, class BlockType> Operators::Hamiltonian GenericDMRGAlgorithm<SiteHamiltonianType, BlockType>::CalculateSuperblock(unsigned int SysBasisSize, unsigned int EnvBasisSize)
{
    Operators::Hamiltonian superblockHamiltonian;
    // Hsuperblock = Hsystem + Hinteraction + Henvironment
    Operators::Hamiltonian SystemEnvironmentInteraction = GetInteractionHamiltonian();
    superblockHamiltonian.matrix = Operators::Operator::IdentityKronecker(EnvBasisSize, systemBlock->hamiltonian.matrix) + SystemEnvironmentInteraction.matrix + Operators::Operator::KroneckerProductWithIdentity(environmentBlock->hamiltonian.matrix, SysBasisSize);
    return superblockHamiltonian;
}
That’s how the Superblock Hamiltonian is obtained.
From this generic class I derived the Heisenberg class. There is not much in it, it just implements a couple of methods. One is for the interaction Hamiltonian between the two blocks (needed when constructing the Superblock):
template<class SiteHamiltonianType, class SzType, class SplusType> Operators::Hamiltonian HeisenbergDMRGAlgorithm<SiteHamiltonianType, SzType, SplusType>::GetInteractionHamiltonian() const { Operators::Hamiltonian interactionHamiltonian; interactionHamiltonian.matrix = Operators::Operator::KroneckerProduct(environmentBlock->SzForBoundarySite.matrix, systemBlock->SzForBoundarySite.matrix) + 1. / 2. * (Operators::Operator::KroneckerProduct(environmentBlock->SplusForBoundarySite.matrix, systemBlock->SplusForBoundarySite.matrix.adjoint()) + Operators::Operator::KroneckerProduct(environmentBlock->SplusForBoundarySite.matrix.adjoint(), systemBlock->SplusForBoundarySite.matrix)); return interactionHamiltonian; }
The other one is for transforming the operators in the new basis:
template<class SiteHamiltonianType, class SzType, class SplusType> void HeisenbergDMRGAlgorithm<SiteHamiltonianType, SzType, SplusType>::TransformOperators(const Eigen::MatrixXd& U, const Eigen::MatrixXd& Ut, bool left)
{
    GenericDMRGAlgorithm::TransformOperators(U, Ut, left); // takes care of the system block Hamiltonian
    if (left)
    {
        systemBlock->SplusForBoundarySite.matrix = U * systemBlock->SplusForBoundarySite.matrix * Ut;
        systemBlock->SzForBoundarySite.matrix = U * systemBlock->SzForBoundarySite.matrix * Ut;
    }
    else
    {
        environmentBlock->SplusForBoundarySite.matrix = U * environmentBlock->SplusForBoundarySite.matrix * Ut;
        environmentBlock->SzForBoundarySite.matrix = U * environmentBlock->SzForBoundarySite.matrix * Ut;
    }
}
Not a big deal, the call into the base class does something similar for the Hamiltonian and measurement operators, that is, $O \to U O U^\dagger$.
And this is the final class for Heisenberg spin 1/2:
class DMRGHeisenbergSpinOneHalf : public HeisenbergDMRGAlgorithm<Operators::Hamiltonian, Operators::SzOneHalf, Operators::SplusOneHalf> { public: DMRGHeisenbergSpinOneHalf(unsigned int maxstates = 18); };
There isn’t much to it, the ‘trick’ is in the template arguments and that’s about it. Here is the constructor:
DMRGHeisenbergSpinOneHalf::DMRGHeisenbergSpinOneHalf(unsigned int maxstates) : HeisenbergDMRGAlgorithm<Operators::Hamiltonian, Operators::SzOneHalf, Operators::SplusOneHalf>(maxstates) { }
That’s the end of the algorithm presentation. There isn’t much more in the code, except in GenericDMRGAlgorithm. The most important method that I did not present here is LanczosGroundState. With the help of the comments it should be easy to understand. You might want to also check Extend, TransformOperators and CalculateResults. They are only a few lines long and easy to figure out.
When you look over the GitHub code^{1} you might find that it’s not identical to what I present here. I might have improved or extended the code in the meantime (the NRG one, too), but this post will more or less stay as it is.
As for other projects before, this project uses mfc, the chart draws using GDI+ and the matrix library used is Eigen^{19}.
I used Density-Matrix algorithms for quantum renormalization groups, by Steven R White^{2} to check if the program works correctly. It seems that the program reproduces correctly fig 6a (spin 1/2, 60 sites) and fig 7a (spin 1, 60 sites). With some minimal changes it should be able to reproduce all charts from fig 6 and 7.
Here is the one for fig 6, only half of it because of the mirror symmetry:
I’ll indicate here some possible improvements, they won’t be exhaustive. The subject is very large, please check the links I provided for more info. If you intend to write a serious program, you should look into matrix product states.
There are some easy things that can be done even for this toy program and some not so easy.
There is much more to investigate, please at least check out The density-matrix renormalization group review by Ulrich Schollwoeck^{8} for more that could be done and more details about the above hints.
That’s the end for DMRG for now. I might have some additions in the future for it, though.
As usual, if you notice any mistakes/bugs or you have suggestions, please let me know.
I won’t update the code description above, I’ll try to keep it simple, but the code can suffer changes over time.
Currently the code has the following additions:
Step
implementation, so the added complexity should not make the code harder to understand.
After some posts about the theory it is time to present the Hartree-Fock program^{1}. You might find the previous posts useful, along with the links in there: How to solve a quantum many-body problem, The Hartree-Fock method.
Here is the program in action (the final version might have minor differences):
Although it doesn’t show much, with minor changes you could do quite a bit more and with some effort a lot more. I had to stop development somewhere, though. I’m not happy with the code, especially with the IntegralsRepository; I feel that there is a lot to improve there, but here it is. At least it appears to work as intended and the code is pretty clear in the most important parts I wanted to present. This project took me more time than I anticipated, especially because of the electron-electron integrals. Once the integrals computation was working, I think I had the restricted method working in less than a couple of hours (but it’s not the first time I have implemented a Hartree-Fock program). Most of the difficulty was in having those integrals computed right. This includes finding clear enough papers on the subject, being able to test the results and other details you don’t take into account in advance.
Obviously, since the program deals with atoms and molecules, one should expect to have objects in the program that deal with them, and indeed that is the case. Here is, for example, the class declaration for Atom:
class Atom { public: Vector3D<double> position; unsigned int Z; unsigned int electronsNumber; unsigned int ID; Atom(unsigned int nrZ = 0, int nrElectrons = -1); virtual ~Atom(); };
It contains what one would expect from it: a position for the atom, the atomic number, and the number of electrons. The ID is there (and elsewhere) because it’s easier to identify a particular atom (or whatever has an ID) using an ID instead of its position, for example. The number of electrons need not be equal to Z, since the atom might be an ion, but it’s typically ignored and the number of electrons for the whole molecule is used instead. There is an AtomWithShells class derived from Atom, which basically has shells assigned to it and some methods that deal with them.
Also a Molecule class exists, which, as expected, has inside a bunch of atoms:
class Molecule { public: std::vector<AtomWithShells> atoms; unsigned int alphaElectrons; unsigned int betaElectrons; Molecule(); ~Molecule(); unsigned int CountNumberOfContractedGaussians() const; unsigned int CountNumberOfGaussians() const; void SetIDs(); double NuclearRepulsionEnergy() const; unsigned int ElectronsNumber(); unsigned int GetMaxAngularMomentum(); void SetCenterForShells(); void Normalize(); void Init(); };
All those classes are in the Systems namespace, to be grouped together and to be easier to locate.
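As an illustration of what a method such as NuclearRepulsionEnergy has to compute, here is a minimal sketch. It is not the project code: SimpleAtom is a simplified stand-in for the Atom class, and atomic units are assumed (positions in Bohr, energy in Hartree).

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Simplified stand-in for the Atom class: only what the sum needs
struct SimpleAtom
{
    double x, y, z;   // position, assumed to be in Bohr
    unsigned int Z;   // atomic number
};

// Sum over all distinct pairs of nuclei of Z_i * Z_j / r_ij, in atomic units
double NuclearRepulsionEnergy(const std::vector<SimpleAtom>& atoms)
{
    double energy = 0.;

    for (std::size_t i = 0; i < atoms.size(); ++i)
        for (std::size_t j = i + 1; j < atoms.size(); ++j)
        {
            const double dx = atoms[i].x - atoms[j].x;
            const double dy = atoms[i].y - atoms[j].y;
            const double dz = atoms[i].z - atoms[j].z;

            energy += atoms[i].Z * atoms[j].Z / std::sqrt(dx * dx + dy * dy + dz * dz);
        }

    return energy;
}
```

For two protons at a distance of 1.4 Bohr this gives 1 / 1.4, about 0.714 Hartree. The inner loop starts at i + 1 so that each pair is counted only once.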
The program uses atomic orbitals as basis. More specifically, they are used to form a basis set which is not complete but it tries to span a subspace that includes most of the real solution.
The base class declaration looks like this:
class Orbital { public: Vector3D<double> center; QuantumNumbers::QuantumNumbers angularMomentum; unsigned int ID; unsigned int centerID; unsigned int shellID; Orbital(); virtual ~Orbital(); virtual double operator()(const Vector3D<double>& r) const = 0; virtual Vector3D<double> getCenter() const; char AtomicOrbital() const { return angularMomentum.AtomicOrbital(); } };
It does not yet force you to use Gaussian-type orbitals (they could be Slater-type) or to have them in Cartesian coordinates; they could be in spherical coordinates, for example (although the center vector holds Cartesian coordinates internally). The program uses Gaussian orbitals:
class GaussianOrbital : public Orbital { public: double coefficient; double alpha; double normalizationFactor; GaussianOrbital(); virtual ~GaussianOrbital(); virtual double getCoefficient() const; virtual double getAlpha() const; virtual double operator()(const Vector3D<double>& r) const; Vector3D<double> ProductCenter(const GaussianOrbital& other) const; protected: double getNormalizationFactor() const; public: void Normalize(); };
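By looking into GaussianOrbital::operator() you could find out that their value is: $\phi(\mathbf{r}) = K (x - R_x)^l (y - R_y)^m (z - R_z)^n e^{-\alpha |\mathbf{r} - \mathbf{R}|^2}$, where $\mathbf{R}$ is the center of the orbital and $l$, $m$, $n$ are the angular momentum ‘quantum numbers’.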
K is given by the normalization factor (the integral over the whole space should be 1) and a constant that is there because they are used in a linear combination to form a ‘contracted’ orbital that’s a better approximation to a Slater-type orbital. Here is the declaration for the contracted orbital:
// the contained gaussian orbitals all have the same center and quantum numbers, whence the derivation from Orbital
class ContractedGaussianOrbital : public Orbital
{
public:
    std::vector<GaussianOrbital> gaussianOrbitals;
    virtual double operator()(const Vector3D<double>& r) const;
    ContractedGaussianOrbital();
    virtual ~ContractedGaussianOrbital();
    void Normalize();
};
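The value of such a contracted orbital at a point is the sum of the values of its primitives. Here is a minimal sketch of that evaluation, not the project code: Primitive is a simplified stand-in for GaussianOrbital, the function names are made up for this illustration, and the only normalization factor given is the well known one for s-type primitives.

```cpp
#include <cmath>
#include <vector>

// Simplified stand-in for GaussianOrbital: the center and the l, m, n powers are passed separately
struct Primitive
{
    double coefficient;          // contraction coefficient
    double alpha;                // exponent
    double normalizationFactor;  // makes the integral of the squared primitive equal to 1
};

// Normalization factor of an s-type (l = m = n = 0) primitive: (2 * alpha / pi)^(3/4)
double NormalizationFactorS(double alpha)
{
    const double pi = std::acos(-1.);
    return std::pow(2. * alpha / pi, 0.75);
}

// Value at (x, y, z) of a contracted orbital centered at (X, Y, Z):
// the sum over primitives of K * (x-X)^l * (y-Y)^m * (z-Z)^n * exp(-alpha * r^2),
// with K = coefficient * normalizationFactor
double ContractedValue(const std::vector<Primitive>& primitives, int l, int m, int n,
                       double X, double Y, double Z, double x, double y, double z)
{
    const double dx = x - X, dy = y - Y, dz = z - Z;
    const double r2 = dx * dx + dy * dy + dz * dz;
    const double angular = std::pow(dx, l) * std::pow(dy, m) * std::pow(dz, n);

    double value = 0.;
    for (const Primitive& p : primitives)
        value += p.coefficient * p.normalizationFactor * angular * std::exp(-p.alpha * r2);

    return value;
}
```

For a single s primitive with exponent 1 and coefficient 1, the value at the center is (2/π)^(3/4) and it decays by a factor of e at a distance of 1. The polynomial factor is computed only once because all the primitives of a contracted orbital share the same center and the same l, m, n.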
Several such orbitals are packed into a ‘contracted shell’:
// all contracted gaussian orbitals that are contained share the same center and the same set of exponents
// for example it might contain an s contracted gaussian orbital and the three ones for px, py, pz
class ContractedGaussianShell : public Shell
{
public:
    std::vector<ContractedGaussianOrbital> basisFunctions;
    ContractedGaussianShell();
    ~ContractedGaussianShell();
    void AddOrbital(char type);
    void AddGaussians(double exponent);
    virtual Vector3D<double> getCenter() const;
    std::string GetShellString() const;
    unsigned int CountOrbitals(char orbital) const;
    unsigned int CountContractedOrbitals(char orbital) const;
    unsigned int CountNumberOfContractedGaussians() const;
    unsigned int CountNumberOfGaussians() const;
    virtual double operator()(const Vector3D<double>& r) const;
protected:
    static unsigned int AdjustOrbitalsCount(char orbital, unsigned int res);
    void AddOrbitalsInCanonicalOrder(unsigned int L);
public:
    void SetCenters(const Vector3D<double>& center);
    void Normalize();
};
The reason for packing them like that is to ease some integrals computations. An atom just has several such shells.
All orbitals are in the Orbitals namespace. There is also an important class in there, Orbitals::QuantumNumbers::QuantumNumbers. It packs the l, m, n integer values and it’s also a generator. That greatly simplifies iterating over integrals in the code and makes it clearer, but more about that later.
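To give an idea of what generating the l, m, n values means, here is a minimal sketch that enumerates all the triplets with a given total angular momentum L = l + m + n. It is not the QuantumNumbers class from the project: CartesianPowers and GenerateShell are hypothetical names, and the ordering used (l descending, then m descending) is an assumption that might differ from the canonical order used in the code.

```cpp
#include <vector>

// Simplified stand-in for the l, m, n values packed by the QuantumNumbers class
struct CartesianPowers
{
    unsigned int l, m, n;
};

// All (l, m, n) with l + m + n = L, ordered with l descending, then m descending
std::vector<CartesianPowers> GenerateShell(unsigned int L)
{
    std::vector<CartesianPowers> result;

    for (unsigned int i = 0; i <= L; ++i)
    {
        const unsigned int l = L - i;
        for (unsigned int n = 0; n <= i; ++n)
        {
            const unsigned int m = i - n;
            result.push_back({l, m, n});
        }
    }

    return result;
}
```

For L = 1 this yields (1, 0, 0), (0, 1, 0), (0, 0, 1), that is, px, py, pz. In general there are (L + 1)(L + 2) / 2 triplets: 1 for s, 3 for p, 6 for d, 10 for f.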
In the QuantumNumbers namespace there are also a lot of unused classes, like PxQuantumNumbers, PyQuantumNumbers and so on. They are there because I started to implement the code differently and then changed my mind, but they could still be useful in the future.
The basis is in the Chemistry namespace and it’s quite simple:
class Basis { public: std::vector<Systems::AtomWithShells> atoms; Basis(); ~Basis(); void Load(std::string fileName); void Save(const std::string& name); void Normalize(); };
It just packs together a lot of atoms with their shells. They are loaded from a text file. The Load implementation is far from optimal and there is not much error checking, so it should be easy to crash it with a malformed file. I just wanted to have something that works. I thought I would polish the code afterwards, but I gave up on the idea for now. It fulfills its purpose.
The source of the basis sets files is here: EMSL Basis Sets Exchange^{2}. Currently the program uses STO-3G and STO-6G basis sets, it could use other STO-nG with no change (and perhaps others, too). It should be able to use other Gaussian basis sets with no or minimal changes, too, but those should be enough for a program used as an example. If you want to generate another basis set to be used, be aware that I used NWChem format and I also replaced D+ and D- with E+ and E- in the file, to have the format that is easy to parse in C++ for double values. It should be very easy to change the program to read D+/- as well but I was lazy, ‘replace all’ appeared easier at that moment.
As mentioned last time, matrix representation is used, so the program deals with matrices a lot. The operators are matrices and that is reflected in the implementation, too. The Matrices namespace contains the overlap, kinetic and nuclear matrices. Each contains an Eigen matrix and has a Calculate method. They get the integrals from IntegralsRepository (if not already calculated, they are calculated when retrieved). The implementation is easy, but you might have a little trouble figuring out what the rows and columns represent. If you have two atoms with two shells each, that is, with 1s, 2s, 2px, 2py, 2pz, then such a matrix will have 10 rows and 10 columns, the first 5 being for the first atom, the next ones for the second one. For example, the (0, 0) element of the S matrix has the 1s orbital of the first atom for both bra and ket.
Now for the most difficult part of the program… I will advise you to visit the links related to integrals from the previous post and I’ll add here more, specifically some that helped me with the implementation. First, the simple ones:
The last link is not so useful in the implementation part (the theory is!). You may recognize the recurrence relations in there, but the symbolic computation would give some troubles, also it uses numerical integration for Boys functions.
So here is the main paper the implementation is based on, including the most complex integrals, the electron-electron ones: HSERILib: Gaussian integral evaluation^{6}. Fundamentals of Molecular Integrals Evaluation^{7}, already given last time, should also help. Another one that could help: Molecular Integrals over Gaussian Basis Functions^{8}.
The implementation is in the GaussianIntegrals namespace. You’ll find there classes for each kind of integral (overlap, kinetic, nuclear, electron-electron), along with some other classes such as the already mentioned IntegralsRepository, which would really need an improvement.
If you look into the HSERILib paper for example and in the electron-electron integral code, you’ll notice that the recurrence formulae are as in the paper, with slight changes to fit computation better:
// ********************************************************************************************************************************
// The Vertical Recurrence Relation
matrixCalc(curIndex, m) = RpaScalar * matrixCalc(prevIndex, m) + RwpScalar * matrixCalc(prevIndex, m + 1);
if (addPrevPrev)
{
    unsigned int prevPrevIndex = prevPrevQN.GetTotalCanonicalIndex();
    matrixCalc(curIndex, m) += N / (2. * alpha12) * (matrixCalc(prevPrevIndex, m) - alpha / alpha12 * matrixCalc(prevPrevIndex, m + 1));
}
// ********************************************************************************************************************************
Don’t forget that ‘quantum numbers’ are generators, the code might look simpler than it really is.
I won’t say anything more about the integrals, the code^{1} is on GitHub and the papers should be enough to understand it.
The Hartree-Fock method is implemented in the HartreeFock namespace. Although I implemented the unrestricted method as well, I won’t present it here. The code^{1} is available; if you are interested, you should check it out.
The restricted method is implemented in two classes: a base class, HartreeFockAlgorithm (also a base class for the unrestricted implementation), and the RestrictedHartreeFock class. This is how the self consistent field iteration is implemented:
double HartreeFockAlgorithm::Calculate()
{
    double curEnergy = 0;
    double prevEnergy = std::numeric_limits<double>::infinity();
    if (!inited) return prevEnergy;
    // some big number before bail out
    for (int iter = 0; iter < maxIterations; ++iter)
    {
        Step(iter);
        curEnergy = GetTotalEnergy();
        if (abs(prevEnergy - curEnergy) <= 1E-13)
        {
            converged = true;
            break;
        }
        if (terminate) break;
        prevEnergy = curEnergy;
    }
    return curEnergy;
}
There is one other important method in there, HartreeFockAlgorithm::Init, but I’ll let you figure it out.
Here is the Step implementation for the restricted method:
void RestrictedHartreeFock::Step(int iter)
{
    // *****************************************************************************************************************
    // the Fock matrix
    Eigen::MatrixXd F;
    InitFockMatrix(iter, F);
    // ***************************************************************************************************************************
    // solve the Roothaan-Hall equation
    //diagonalize it - it can be done faster by diagonalizing the overlap matrix outside the loop but for tests this should be enough
    //I leave it here just in case - if all that S, U, s, V seems confusing this should help :)
    //Eigen::GeneralizedSelfAdjointEigenSolver<Eigen::MatrixXd> es(F, overlapMatrix.matrix);
    //const Eigen::MatrixXd& C = es.eigenvectors();
    // this hopefully is faster than the one commented above
    Eigen::MatrixXd Fprime = Vt * F * V;
    Eigen::SelfAdjointEigenSolver<Eigen::MatrixXd> es(Fprime);
    const Eigen::MatrixXd& Cprime = es.eigenvectors();
    Eigen::MatrixXd C = V * Cprime;
    // normalize it
    NormalizeC(C, nrOccupiedLevels);
    //***************************************************************************************************************
    // calculate the density matrix
    Eigen::MatrixXd newP = Eigen::MatrixXd::Zero(h.rows(), h.cols());
    for (int i = 0; i < h.rows(); ++i)
        for (int j = 0; j < h.cols(); ++j)
            for (unsigned int vec = 0; vec < nrOccupiedLevels; ++vec) // only eigenstates that are occupied
                newP(i, j) += 2. * C(i, vec) * C(j, vec); // 2 is for the number of electrons in the eigenstate, it's the restricted Hartree-Fock
    //**************************************************************************************************************
    const Eigen::VectorXd& eigenvals = es.eigenvalues();
    CalculateEnergy(eigenvals, newP/*, F*/);
    TRACE("Step: %d Energy: %f\n", iter, totalEnergy);
    // ***************************************************************************************************
    // go to the next density matrix
    P = alpha * newP + (1. - alpha) * P; // use mixing if alpha is set less than 1
}
With the help of the comments and the theory from the last post, it should not be hard to understand. The NormalizeC call is not strictly necessary: the code should converge without it, but it might converge faster with normalization. Some implementations have it, others do not. Physically, it makes sense to have the density matrix normalized, since probabilities should add up to 1. The iteration should converge to such a density matrix anyway, but it makes sense to enforce a physical density matrix along the way, too.
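To make the normalization idea concrete, here is a minimal sketch of normalizing one orbital with respect to the overlap matrix, so that c^T S c = 1. It does not use Eigen, and the names `overlapNormSquared` and `normalizeOrbital` are made up for this illustration; the actual NormalizeC in the project may well do it differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Returns c^T S c, the squared norm of an orbital expanded in a
// non-orthogonal basis with overlap matrix S.
double overlapNormSquared(const std::vector<double>& c, const Matrix& S)
{
    assert(S.size() == c.size());

    double sum = 0.;
    for (std::size_t i = 0; i < c.size(); ++i)
        for (std::size_t j = 0; j < c.size(); ++j)
            sum += c[i] * S[i][j] * c[j];

    return sum;
}

// Rescales the expansion coefficients in place so that c^T S c = 1.
void normalizeOrbital(std::vector<double>& c, const Matrix& S)
{
    const double norm = std::sqrt(overlapNormSquared(c, S));
    assert(norm > 0.);

    for (double& ci : c) ci /= norm;
}
```

With a non-orthogonal basis the plain sum of squared coefficients is not the norm, which is why the overlap matrix shows up in the sum.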
The InitFockMatrix implementation is:
void RestrictedHartreeFock::InitFockMatrix(int iter, Eigen::MatrixXd& F) const
{
    // this could be made faster knowing that the matrix should be symmetric
    // but it would be less expressive so I’ll let it as it is
    // maybe I’ll improve it later
    // anyway, the slower part is dealing with electron-electron integrals

    if (0 == iter)
    {
        if (initGuess > 0)
        {
            F.resize(h.rows(), h.cols());

            for (int i = 0; i < h.rows(); ++i)
                for (int j = 0; j < h.rows(); ++j)
                    F(i, j) = initGuess * overlapMatrix.matrix(i, j) * (h(i, i) + h(j, j)) / 2.;
        }
        else F = h;
    }
    else
    {
        Eigen::MatrixXd G = Eigen::MatrixXd::Zero(h.rows(), h.cols());

        for (int i = 0; i < numberOfOrbitals; ++i)
            for (int j = 0; j < numberOfOrbitals; ++j)
                for (int k = 0; k < numberOfOrbitals; ++k)
                    for (int l = 0; l < numberOfOrbitals; ++l)
                    {
                        double coulomb = integralsRepository.getElectronElectron(i, j, k, l);
                        double exchange = integralsRepository.getElectronElectron(i, l, k, j);

                        G(i, j) += P(k, l) * (coulomb - 1. / 2. * exchange);
                    }

        F = h + G;
    }
}
For the 0 step, it simply sets the Fock matrix equal to either the ‘core’ matrix or a ‘Hückel guess’. I won’t detail it more, but here is a start: Extended Hückel method.
The second part is more important, you can see how the Fock matrix is calculated out of the ‘core’ matrix (h = kinetic + nuclear) and density matrix P together with electron-electron interaction integrals (both coulomb and exchange terms). For the theory please visit the previous post.
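For experimenting with that formula outside the project, here is a small sketch without Eigen. The `TwoElectronIntegral` callback and the `buildFockMatrix` name are hypothetical stand-ins for the integrals repository and are not part of the project; the sum itself follows the same pattern as above, F(i,j) = h(i,j) + sum over k,l of P(k,l) [(ij|kl) - 1/2 (il|kj)].

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical accessor: returns the electron-electron integral (ij|kl).
using TwoElectronIntegral =
    std::function<double(std::size_t, std::size_t, std::size_t, std::size_t)>;

// Builds F = h + G for the restricted (closed shell) case, where
// G(i,j) = sum_kl P(k,l) * [ (ij|kl) - 1/2 (il|kj) ].
Matrix buildFockMatrix(const Matrix& h, const Matrix& P, const TwoElectronIntegral& eri)
{
    const std::size_t n = h.size();
    assert(P.size() == n);

    Matrix F = h; // start from the 'core' matrix

    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            for (std::size_t k = 0; k < n; ++k)
                for (std::size_t l = 0; l < n; ++l)
                {
                    const double coulomb = eri(i, j, k, l);
                    const double exchange = eri(i, l, k, j);

                    F[i][j] += P[k][l] * (coulomb - 0.5 * exchange);
                }

    return F;
}
```

With a single basis function, h = -2, P = 2 and an integral of 1 (made-up numbers), this gives F = -2 + 2 * (1 - 0.5) = -1, which is an easy case to check by hand.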
That’s about it, the rest should be easy to figure out. The energy is calculated in RestrictedHartreeFock::CalculateEnergy.
I usually list each class but now I have no patience for it, so I’ll describe it more briefly.
The most important classes that do not deal with the user interface are in namespaces. The exceptions are HartreeFockThread, derived from ComputationThread. Since the chart is composed of points which represent different computations, one can use several threads for computation. Anyway, even a single thread would make sense, to avoid locking the UI for a long time. Vector3D is the same class seen in other projects on this blog, Chart is taken from another project on this blog that uses it, and the other classes that are not in a namespace are quite similar to the ones in other projects; for details you could visit another post. The namespaces and the classes in them were already briefly described above, except the Tensors namespace. I needed tensors for the electron-electron calculation and although Eigen has an unsupported tensor implementation, I preferred to implement the classes myself. I might need them in a future project, too.
As in other projects on this blog, besides MFC and other typical VC++ runtime libraries, the program uses GDI+ for drawing.
The program deals with matrices using the Eigen^{9} library.
This is something that I quickly added to the repository readme file, I’ll also put it here, it might be useful:
Using the classes should be easy. Here is how to grab some atoms from the ‘basis’:
Systems::AtomWithShells H1, H2, O, N, C, He, Li, Ne, Ar;

for (auto &atom : basis.atoms)
{
    if (atom.Z == 1) H1 = H2 = atom;
    else if (atom.Z == 2) He = atom;
    else if (atom.Z == 3) Li = atom;
    else if (atom.Z == 8) O = atom;
    else if (atom.Z == 6) C = atom;
    else if (atom.Z == 7) N = atom;
    else if (atom.Z == 10) Ne = atom;
    else if (atom.Z == 18) Ar = atom;
}
Here is how to set the H2O molecule with the coordinates from the ‘Mathematica Journal’ (referenced in the code):
H1.position.X = H2.position.X = O.position.X = 0;

H1.position.Y = 1.43233673;
H1.position.Z = -0.96104039;
H2.position.Y = -1.43233673;
H2.position.Z = -0.96104039;

O.position.Y = 0;
O.position.Z = 0.24026010;

Systems::Molecule H2O;

H2O.atoms.push_back(H1);
H2O.atoms.push_back(H2);
H2O.atoms.push_back(O);

H2O.Init();
And here is how you calculate:
HartreeFock::RestrictedHartreeFock HartreeFockAlgorithm;

HartreeFockAlgorithm.alpha = 0.5;
HartreeFockAlgorithm.initGuess = 0;

HartreeFockAlgorithm.Init(&H2O);

double result = HartreeFockAlgorithm.Calculate();
You can do computation for a single atom, too, for now by putting it into a dummy molecule with a single atom in it. For example for He:
Systems::Molecule Heatom;

Heatom.atoms.push_back(He);

Heatom.Init();
I tested the program with various molecules, the initial test was on the H2O molecule, but I also tested it with many more, comparing with Hartree-Fock limits and results from other programs.
I checked the intermediate results against the Mathematica Journal results, then I used other programs for tests (especially for electron-electron integrals). While doing that I found bugs in another program, too: bug 1 and bug 2, so I actually did more than implement this project, I also helped identify and fix bugs in another project.
I’ll put here the chart in the featured image only, just as an example:
This post ends the posts about Hartree-Fock. I might add a Post-Hartree-Fock method in the future, but certainly not this year. As usual, if you notice any mistakes/bugs please let me know.
Last time we ended up with a simplified Hamiltonian:
$$\hat{H} = -\sum_{i=1}^{N}\frac{1}{2}\nabla_i^2 + \sum_{i=1}^{N}\sum_{j>i}^{N}\frac{1}{|\mathbf{r}_i-\mathbf{r}_j|} - \sum_{i=1}^{N}\sum_{A=1}^{M}\frac{Z_A}{|\mathbf{r}_i-\mathbf{R}_A|}$$
and having the variational principle for help. It turns out that this simplified Hamiltonian is not simple enough. In this post I’ll present one method that allows one to do calculations for quite complex quantum systems, the Hartree-Fock method. Because the subject is quite large, obviously I cannot cover it in detail in a blog post. Entire books were written about it, and the method takes a large part of books on, for example, computational chemistry, but even those might not cover some details, such as computing the integrals, but more about that later.
This post will be followed by one that will describe a Hartree-Fock program^{1} I implemented. I intend here to focus on theory and there to give more details about implementation.
Since I cannot give all the details here, I’ll point to links/articles with more info. Here is one for the start: A mathematical and computational review of Hartree-Fock SCF methods in Quantum Chemistry^{2} by Pablo Echenique, J. L. Alonso.
The big problem with the Hamiltonian above comes from the second term. The other two can be written together as a sum of uni-particle Hamiltonian terms. Thanks to the Born–Oppenheimer approximation discussed last time, the potential of the nuclei is not time dependent, which would allow solving it relatively simply. The problem comes from the electron-electron interaction term. An approximation would be to consider each electron as interacting with an average potential given by all other electrons, as in mean field theory.
Let’s make some simplifying assumptions that are plain false and not that useful, but we’ll ‘fix’ them later. First, let’s assume that the electrons are non interacting, in the sense that instead of interacting with each other, they will generate together a mean field with which they interact. Then, let’s presume they have no spin. With those assumptions, we can describe individual electrons by wave functions, that is, each electron is described by a wave function and we can write the wave function for all electrons as a product of individual wave functions:
$$\Psi(\mathbf{r}_1, \ldots, \mathbf{r}_N) = \psi_1(\mathbf{r}_1)\,\psi_2(\mathbf{r}_2)\cdots\psi_N(\mathbf{r}_N)$$
One gets for the j electron wave function the equation:
$$\left[-\frac{1}{2}\nabla^2 - \sum_{A=1}^{M}\frac{Z_A}{|\mathbf{r}-\mathbf{R}_A|} + \sum_{i=1}^{N}\int\frac{|\psi_i(\mathbf{r}')|^2}{|\mathbf{r}-\mathbf{r}'|}\,d\mathbf{r}'\right]\psi_j(\mathbf{r}) = \varepsilon_j\,\psi_j(\mathbf{r})$$
The first two terms are the kinetic energy term and the nuclear attraction potential energy term, respectively. The third one is the Hartree term. If one interprets $|\psi_i(\mathbf{r})|^2$ as the charge density of electron i, justified by the fact that it’s the probability density of having the electron at position $\mathbf{r}$, the third term can be seen as being the Coulomb energy of the electron j in the field generated by the charge density of all electrons. You may imagine either the electrons being spread out with the density given by the wave function, or that they move very fast and somehow an electron feels an averaged field. Both interpretations are false but could be useful because we’re using a classical intuition.
This equation has issues, for example it completely neglects electron correlations. The assumption that the electrons are non interacting and that they can be described separately by wave functions is false. Even if we are trying to use a product of uni-electron wave functions we should take into account that electrons are not distinguishable. Electrons are fermions and they obey the Pauli exclusion principle.
Another issue is that the electron interacts with its own field. This could be solved by avoiding summing j in the last term, but then each electron will interact with a different field.
The Hartree operator is also called the Coulomb operator and the sum of the kinetic energy operator and the nuclear attraction potential energy operator, the core operator.
Some of the issues are solved by an additional exchange term that I’ll detail later.
As mentioned above, the wave function should be anti-symmetric when dealing with electrons with the same spin. We should restrict the solutions to such wave functions. This is done using Slater determinants. For two electrons, the wave function becomes:
$$\Psi(\mathbf{x}_1, \mathbf{x}_2) = \frac{1}{\sqrt{2}}\left[\psi_1(\mathbf{x}_1)\,\psi_2(\mathbf{x}_2) - \psi_1(\mathbf{x}_2)\,\psi_2(\mathbf{x}_1)\right]$$
It is easy to verify that the probability of having both electrons in the same point becomes zero.
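If you prefer to check that numerically, here is a tiny sketch. It assumes two made-up one-dimensional orbitals and ignores spin, so it only illustrates the antisymmetry of a two-particle determinant; the names `Orbital` and `slater2` are invented for the example.

```cpp
#include <cmath>
#include <functional>

using Orbital = std::function<double(double)>;

// Two-particle Slater determinant built from orbitals a and b:
// Psi(x1, x2) = [a(x1) b(x2) - a(x2) b(x1)] / sqrt(2)
double slater2(const Orbital& a, const Orbital& b, double x1, double x2)
{
    return (a(x1) * b(x2) - a(x2) * b(x1)) / std::sqrt(2.);
}
```

For example, with a(x) = exp(-x²) and b(x) = x exp(-x²), `slater2(a, b, x, x)` vanishes for any x, and swapping the two arguments only flips the sign of the result.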
One can derive the Hartree-Fock equations using variational calculus, minimizing the energy functional for a Slater determinant. I don’t want to give all details here, for details you could look into the review article I already mentioned or in Derivation of Hartree–Fock Theory^{3}.
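From now on I’ll use $\hat{h}$ for the ‘core’ operator and $\hat{J}$ for the Coulomb operator. Fock added a new term by subtracting an exchange term from the Coulomb one. The exchange term can be obtained from the Coulomb term (the Coulomb operator applied on the j wave function) by interchanging two labels. The Hartree term is (written a bit differently):
$$\hat{J}_i\,\psi_j(\mathbf{r}) = \left[\int\frac{\psi_i^*(\mathbf{r}')\,\psi_i(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d\mathbf{r}'\right]\psi_j(\mathbf{r})$$
The exchange term is:
$$\hat{K}_i\,\psi_j(\mathbf{r}) = \left[\int\frac{\psi_i^*(\mathbf{r}')\,\psi_j(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d\mathbf{r}'\right]\psi_i(\mathbf{r})$$
The minus sign is because of the anti-symmetry, the resulting operator being a non-local one. The Fock operator is:
$$\hat{F} = \hat{h} + \sum_{i=1}^{N}\left(\hat{J}_i - \hat{K}_i\right)$$
The Hartree-Fock equations are:
$$\hat{F}\,\psi_j = \varepsilon_j\,\psi_j$$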
I should mention here, just in case you noticed the featured image, that there is an implicit summation for spin which I did not detail. So now you know where the 2 comes from in the featured image: there are two electrons of opposed spin in the same orbital. The formula is for the restricted Hartree-Fock method which I’ll detail later.
The exchange term deals not only with the exchange, but cancels the self-interaction of the electron, too.
Now we have some equations that look familiar. They look like the eigenvalue problem, it looks like we have linear equations. It’s just the Schrödinger equation with the electron-electron interaction given by the Coulomb and exchange operators, right? The looks can be deceiving, the Fock operator depends on the eigenvectors, the equations are non linear! We have a set of coupled non-linear equations.
The method for solving it is to guess some initial wave vectors for the solution, plug them into the Fock operator, solve the Hartree-Fock equations to get new wave vectors, put those in place of the old ones in the Fock operator, solve the equations again and repeat the procedure until the eigenvalues and/or eigenvectors do not change appreciably (some other more complex convergence criteria might be used, too), that is, until the solutions converge. That’s why the method is also called the self consistent field method. It looks like first the electrons start in some orbitals that create a certain potential. The potential acts on the electrons that ‘adjust’ by moving into adjusted orbitals that generate another potential and so on, until the electrons end up in stable orbitals that do not change anymore under the influence of the potential.
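The structure of such a self consistent loop can be sketched on a toy problem. The code below is only an analogy: a single number stands in for the set of orbitals (or the density matrix) and an arbitrary function stands in for ‘build the Fock operator and solve the equations’. The names `ScfResult` and `selfConsistentLoop` are invented for the example, and the mixing parameter `alpha` is an optional damping that is not required by the method itself.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

struct ScfResult
{
    double value;    // the self consistent solution found
    int iterations;  // how many iterations were used
    bool converged;  // false if maxIterations was reached first
};

// Iterates x -> alpha * update(x) + (1 - alpha) * x until x stops changing.
ScfResult selfConsistentLoop(const std::function<double(double)>& update,
                             double guess, double alpha,
                             double tolerance, int maxIterations)
{
    assert(alpha > 0. && alpha <= 1.);

    double x = guess;
    for (int iter = 1; iter <= maxIterations; ++iter)
    {
        const double next = update(x);                        // 'solve' using the old solution
        const double mixed = alpha * next + (1. - alpha) * x; // mix old and new
        const bool done = std::fabs(mixed - x) < tolerance;   // convergence check

        x = mixed;
        if (done) return { x, iter, true };
    }

    return { x, maxIterations, false };
}
```

As a usage example, calling it with `update` set to the cosine function, a guess of 1 and `alpha = 0.5` converges to the solution of x = cos(x), about 0.739.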
You may try to solve the equations numerically, although they are not exactly easy to solve (they are integro-differential ones). There is a method that allows us to solve them algebraically. For that, we use the superposition principle to write the wave functions as a sum of projections onto basis states. In matrix representation, the wave functions are column vectors, with the projection values as elements (that is, $c_\mu = \langle\phi_\mu|\psi\rangle$) and operators are matrices, with elements $O_{\mu\nu} = \langle\phi_\mu|\hat{O}|\phi_\nu\rangle$. It turns out that we have to deal mainly with matrix diagonalization (except solving the integrals, about which I’ll talk later). A bit of a problem would be that the Hilbert space is really infinitely dimensional, but the variational principle allows us to restrict to a subspace. If the projection of the real solution on that subspace approximates it well (that is, the component that lies in the orthogonal complement subspace is small) we can get quite close to the real solution, the variational principle ensuring that the obtained energy is not lower than the true one.
So the wave function can be decomposed using a basis:
$$\psi_i = \sum_{\mu} C_{\mu i}\,\phi_\mu$$
If the basis wave functions were orthogonal, the equations would look the same, that is, like the regular eigenvalue problem, just that the wave functions would be replaced by column vectors and the operators by matrices. In general, they do not need to be orthogonal, in which case the overlap matrix S, with matrix elements $S_{\mu\nu} = \langle\phi_\mu|\phi_\nu\rangle$, is also involved and the equations look like the generalized eigenvalue problem.
They used first Slater-type orbitals for the basis functions, they are still used for example for a single atom, but the difficulty of solving the integrals with different centers made the Gaussian orbitals a better choice. More precisely, they use linear combinations of Gaussians, like STO-nG, to approximate better a Slater-type orbital.
Currently the Hartree-Fock program^{1} I implemented uses STO-3G and STO-6G, although it should be easy to extend it to use other basis sets as well. You may find out more about basis sets on Wikipedia. Here is a good start for Linear Combination of Atomic Orbitals. The source of the basis sets I used is here: EMSL Basis Sets Exchange^{4}.
In the closed shell case, all orbitals up to the ones in the valence shell are doubly occupied, that is, they are filled with two electrons of opposite spins. For such a case one can sum out the spins from Hartree-Fock equations (in matrix form), obtaining the Roothaan equations. They are already in the featured image, but here they are, again:
$$\mathbf{F}\,\mathbf{C} = \mathbf{S}\,\mathbf{C}\,\boldsymbol{\varepsilon}$$
Keep in mind that the sums from the J and K operators now go over the occupied orbitals, not over the number of electrons.
I don’t want to detail the operators in the equations, you may find details in the links I provided, but I thought that I should at least mention the Density Matrix. For example for the restricted Hartree-Fock method, the density matrix operator is:
$$\hat{\rho} = 2\sum_{a=1}^{N/2}|\psi_a\rangle\langle\psi_a|$$
The matrix element as seen in the implementation is:
$$P_{\mu\nu} = 2\sum_{a=1}^{N/2} C_{\mu a}\,C_{\nu a}^*$$
2 appears because of the two electrons occupying an orbital.
I also thought that I should mention the unrestricted method. I implemented it in the program, but I don’t want to detail it much, either here or in the post describing the code. If interested, you can check out the links and the code^{1}.
The Pople-Nesbet equations are:
$$\mathbf{F}^{\alpha}\mathbf{C}^{\alpha} = \mathbf{S}\,\mathbf{C}^{\alpha}\boldsymbol{\varepsilon}^{\alpha}, \qquad \mathbf{F}^{\beta}\mathbf{C}^{\beta} = \mathbf{S}\,\mathbf{C}^{\beta}\boldsymbol{\varepsilon}^{\beta}$$
They are obtained by eliminating the restriction of having two electrons occupy the same orbital, that is, each electron has its own orbital. This way one can deal with open shell atoms/molecules. Those equations are coupled, there is an additional term in the Fock operators that contains the density matrix for the other spin, that is, the ‘up’ equations also contain the density matrix for the ‘down’ electrons and vice versa. It should be obvious why, there is a Coulomb interaction between ‘up’ and ‘down’ electrons, too.
This subject is huge, I won’t say much about details here. Here are some very brief ideas:
Gaussians are used because the product of two Gaussians is also a Gaussian and the Gaussian functions and their Gaussian integrals occur a lot in physics, so they are quite studied. Since the Hartree-Fock method was developed, a lot of work was put into finding better methods of calculating those integrals. By the way, in the order of complexity they are: the overlap integrals, the kinetic integrals, the nuclear attraction integrals and the electron-electron integrals, the latter being the hardest to solve. You might also need to calculate the dipole moment (or even multipole-moment) integrals, for example if you want to compute the spectrum. Electron-electron integrals calculations need a lot of computing time and memory. For a complex molecule one might need to save them on disk or simply discard them and recalculate them again at the next iteration.
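The Gaussian product property is easy to try out in one dimension. The sketch below uses unnormalized Gaussians exp(-a (x - A)²) and invented names (`Gaussian1D`, `GaussianProduct`, `multiply`, `overlap`); it is an illustration of the theorem only, not the way the program computes its integrals. The product of two such Gaussians is K exp(-p (x - P)²), with p = a + b, P = (aA + bB)/p and K = exp(-ab/p (A - B)²), so the overlap integral is simply K sqrt(π/p).

```cpp
#include <cassert>
#include <cmath>

struct Gaussian1D { double exponent; double center; };               // exp(-a (x - A)^2)
struct GaussianProduct { double prefactor; double exponent; double center; };

double value(const Gaussian1D& g, double x)
{
    const double d = x - g.center;
    return std::exp(-g.exponent * d * d);
}

// Gaussian product theorem: the product of two Gaussians is a Gaussian
// centered between the two original centers, times a constant prefactor.
GaussianProduct multiply(const Gaussian1D& a, const Gaussian1D& b)
{
    assert(a.exponent > 0. && b.exponent > 0.);

    const double p = a.exponent + b.exponent;
    const double d = a.center - b.center;

    return { std::exp(-a.exponent * b.exponent / p * d * d),
             p,
             (a.exponent * a.center + b.exponent * b.center) / p };
}

// Overlap integral over the whole real axis: K * sqrt(pi / p).
double overlap(const Gaussian1D& a, const Gaussian1D& b)
{
    const double pi = std::acos(-1.);
    const GaussianProduct prod = multiply(a, b);

    return prod.prefactor * std::sqrt(pi / prod.exponent);
}
```

In three dimensions the same thing happens along each axis, which is why integrals with Gaussians on different centers reduce to integrals with a single center.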
There are several methods, I just picked one for the program that seemed easier to implement but also fast enough. A naive approach ends up with calculating the same integral many times, with a lot of loops into loops into loops … you should get the picture by now … with binomial coefficients and so on, which is quite slow. They obtained better results by hitting the integrals with derivatives and other mathematical tricks obtaining recurrence relations which one can use to calculate them. I won’t enter into details, but I’ll give some links to some pdfs that might help. First, a general one that contains an entire chapter about molecular integral evaluation, although I don’t think it would be enough for writing a program able to calculate them, but it should give you an idea: Advanced Computational Chemistry^{5}. You may find more things in there, about the variational principle, basis sets, Hartree-Fock, and so on. Here is another one, just about molecular integrals: Fundamentals of Molecular Integrals Evaluation^{6}. That should suffice for now, I’ll give more links when I describe the Hartree-Fock program^{1}.
For the electron-electron repulsion integrals, using the permutation symmetries of the integrals is mandatory: there are a lot of them and they are quite expensive to calculate. Using those symmetries allows you to calculate approximately 8 times fewer integrals.
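One common way of exploiting those symmetries (for real basis functions) is a compound index that maps all eight equivalent index orderings of (ij|kl) to the same storage slot. The sketch below shows the idea; the function names are invented for the example and the integrals repository in the program is not necessarily organized like this.

```cpp
#include <cstddef>
#include <utility>

// Index of the unordered pair {i, j} in a packed lower triangle.
std::size_t pairIndex(std::size_t i, std::size_t j)
{
    if (i < j) std::swap(i, j);
    return i * (i + 1) / 2 + j;
}

// The same value is returned for (ij|kl), (ji|kl), (ij|lk), (ji|lk),
// (kl|ij), (lk|ij), (kl|ji) and (lk|ji).
std::size_t integralIndex(std::size_t i, std::size_t j, std::size_t k, std::size_t l)
{
    return pairIndex(pairIndex(i, j), pairIndex(k, l));
}

// Number of distinct integrals for n basis functions.
std::size_t uniqueIntegralCount(std::size_t n)
{
    const std::size_t pairs = n * (n + 1) / 2;
    return pairs * (pairs + 1) / 2;
}
```

For 7 basis functions this gives 406 unique integrals instead of 7⁴ = 2401; the ratio gets closer to 8 as the basis grows.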
By the way, you’ll meet in those papers not only the Dirac notation but also a notation used by chemists, which even makes sense if you think in terms of density. It’s not a big deal:
For a contraction they use .
It should be obvious by now that the method allows calculating the ground state and the ground state energy. Since one can also calculate it not only for the molecule, but for the individual atoms that are in that molecule, one can get the binding energy. One can also calculate the dissociation energy. Since the method does not force you to have a neutral atom or molecule, you may put in there either more electrons to get negative ions, or less to have positive ions. This way you can calculate electron affinity and ionization energy.
You may want to calculate excitation energy. You could change the program to complete the orbitals accordingly (the variational principle works for excited states, too). I did not implement that, yet, although it’s not a very big deal.
As a shortcut, you could use Koopmans’ theorem.
A lot more can be done, but quite a bit of additional work would be involved, from computing the molecular structure to computing the electronic spectrum or the vibrational-rotational one. If you want to go there, you might want to implement a Post Hartree-Fock method to get better results.
I presented briefly some theory about the Hartree-Fock method, next time I’ll describe the Hartree-Fock program^{1}. This is a blog entry, not a lecture or a book, so I couldn’t cover it more. There might be quite a few mistakes or unclear things in the text; if you notice them, please let me know and I’ll correct them.
Quite a bit of time passed since my last post on this blog. I had a visit to the sunny Spain, I switched to a double-surface hang glider and I had to take a little care of my firm. Despite those things, I did a little bit of work for the blog, but it took me a little longer than I anticipated. This is a first post of ‘theory’ before presenting the code I implemented. If you want to take a look before having the description available, here it is: a Hartree-Fock program^{1}. I posted the project on GitHub a couple of weeks ago, but I just didn’t have the patience to start writing about it.
This post will be more general, not addressing the project much, but I hope I’ll refer to it in more than one post, so here it is. It’s more an opportunity to collect together some links for those that need more info than what will be presented in the next post, than a detailed description.
The problem appears to be simple, you have a bunch of particles, you want to see what happens to them. They might be some protons and neutrons in a nucleus, or a nucleus and the electrons in an atom, or more than one nucleus and more electrons forming a molecule. Or even larger systems, forming nanostructures or even crystals. The problem turns out to be very complex because they are interacting. A problem with more than two particles interacting – even with relativity dropped – cannot be solved exactly, you’ll have to resort to simplifications in order to obtain approximate solutions. The first simplification is to drop out relativity. With the Schrödinger equation, one can take relativity into account, but I’ll simply ignore it for now. The Hamiltonian for such a system is:
$$\hat{H} = \hat{K} + \hat{V}$$
where K is the kinetic energy term and V is the potential energy term. The potential energy can be a sum of internal potential energy and external potential energy, because of an external field. The external field might be time dependent, which would complicate things a bit. From now on, unless explicitly stated, I’ll consider the external potential to be zero. This is the second simplification. Now, let’s switch to the equation that deals with such a system.
With no more words, here it is:
$$i\hbar\frac{\partial}{\partial t}|\Psi\rangle = \hat{H}|\Psi\rangle$$
where:
I deliberately did not detail yet the dependency of the wave function $\Psi$ or the operators, because their form depends on the particular representation. You often meet them in the position representation, but they can be also given in the momentum representation (or other representation!). In position representation, $\Psi$ is a function of time and of the positions of all the particles it describes. More about representations, maybe later.
Don’t forget about the simplifications we use, if you want to consider an external magnetic field, too, for example, check out the Schrödinger–Pauli equation.
You may ‘justify’ the equation by taking a plane wave and hitting it with time and position partial derivatives, trying to form the operators and arrange them in such a way as to get the wave equation. You’ll also need to consider the de Broglie hypothesis, together with the Planck-Einstein relation, $E = \hbar\omega$. This should not be considered a derivation! For another nice justification, you may check out the Feynman lectures^{2}.
The equation is a linear equation. This means that if you have two solutions for the equation, any linear combination of them is also a solution. So it’s not only a plane wave that is a solution for the equation, but you may also compose wave packets out of them that are also solutions.
If the Hamiltonian does not depend explicitly on time, one can use separation of variables to find stationary solutions of the equation. One obtains the time independent Schrödinger equation:
$$\hat{H}|\psi\rangle = E|\psi\rangle$$
Don’t forget that the wave function still evolves in time, the time evolution operator being $\hat{U}(t) = e^{-i\hat{H}t/\hbar}$, which for an energy eigenstate becomes the phase factor $e^{-iEt/\hbar}$. Also, a superposition of such states is a solution to the time dependent equation, too.
Since we got rid of external potentials (and that includes those that vary in time) wanting to simplify the problem as much as possible, we’re from now on considering the time independent equation, the time dependency being trivial. We’re going to use another simplification, to get rid of all sorts of constants that complicate the formulae, that is, we’re using the atomic units.
The Hamiltonian for our bunch of particles being nuclei and electrons in a molecule is:
$$\hat{H} = -\sum_{i=1}^{N}\frac{1}{2}\nabla_i^2 - \sum_{A=1}^{M}\frac{1}{2M_A}\nabla_A^2 + \sum_{i=1}^{N}\sum_{j>i}^{N}\frac{1}{|\mathbf{r}_i-\mathbf{r}_j|} - \sum_{i=1}^{N}\sum_{A=1}^{M}\frac{Z_A}{|\mathbf{r}_i-\mathbf{R}_A|} + \sum_{A=1}^{M}\sum_{B>A}^{M}\frac{Z_A Z_B}{|\mathbf{R}_A-\mathbf{R}_B|}$$
Here $M_A$ is the mass of nucleus A in units of the electron mass.
N is the number of electrons, M the number of nuclei, Z is the atomic number. The first two terms are the kinetic energy terms for electrons and nuclei, respectively, the next ones being in order: the electron-electron repulsion potential energy, the electron-nucleus attraction energy term and the nucleus-nucleus repulsion energy term. The interaction is nothing more than the plain old Coulomb interaction.
It looks complicated and it’s even way more difficult to solve than the looks might suggest. For a single nucleus and one electron only it can be solved analytically but even that is not exactly easy. Having a bunch of electrons and nuclei creates a problem that is analytically intractable. Even if you try to solve it as it is using some naive numerical approach, it’s not easily solvable even for four particles. Can we simplify it even more?
Yes we can, but this will only approximate the solution to the complete equation. The approximation turns out to be very good in many cases, so here it is. It is based on the observation that the electrons are very light compared with nuclei. Even compared with a single proton, the electron is more than 1800 times lighter. As a consequence, the nuclei ‘move’ much slower than the electrons, the electronic cloud adjusting to the nuclei configuration almost instantly. This allows separation of the motion of nuclei from the Schrödinger equation, you may treat them separately even using a classical approach. With the Born–Oppenheimer approximation the Hamiltonian we’ll have to solve first becomes:
$$\hat{H} = -\sum_{i=1}^{N}\frac{1}{2}\nabla_i^2 + \sum_{i=1}^{N}\sum_{j>i}^{N}\frac{1}{|\mathbf{r}_i-\mathbf{r}_j|} - \sum_{i=1}^{N}\sum_{A=1}^{M}\frac{Z_A}{|\mathbf{r}_i-\mathbf{R}_A|}$$
that is, we eliminated the nuclei kinetic energy term and the nucleus-nucleus repulsion energy term, which remain to be handled separately. We have now only the electrons kinetic energy term, the electron-electron interaction term and the electron-nucleus interaction term.
Obviously I cannot enter into details here, I cannot write an entire quantum physics book in a blog post. There are plenty of books, there are plenty of resources on internet, too. I already pointed the Feynman lectures vol III^{2}, I will point out another resource, chosen arbitrarily: Quantum Mechanics, by Richard Fitzpatrick^{3}.
There are (or will be, hopefully) posts on this blog that might require you to be familiar with the Dirac notation, with various representations, the different ‘pictures’ (Schrödinger picture, Heisenberg picture, interaction picture), second quantization, Hilbert space, Fock space and so on… Unfortunately you’ll either have to be familiar with those – in which case there is a low probability that you need to read this post – or be willing to look yourself into them, I cannot explain them all here. I’m trying to keep things simple and give links to more details but this is not always possible.
When I started writing this post I wanted to give some details in this paragraph about the relationship between matrix mechanics and the wave formulation and also about some things linked above and maybe more, but I realized that it would be way more to write than I’m willing and have the patience to do, so I’ll leave it as it is. Maybe some other time… but I must warn you that you’ll see a lot of matrices in the quantum mechanics related programs (and not only there, the matrix representation is very useful).
This is another way one could try to attack the problem. Unfortunately it’s not very useful for complex systems, because of the exponential increase of the vector space. That does not mean it’s not useful, a quick google search might reveal a lot of hits. This is just the first hit I got: Exact Diagonalization, being a part of a Quantum Simulation course^{4}. I thought that at least the method should be mentioned here.
Ok, so we reached the important part, the variational principle. It’s used not only in the Hartree-Fock method about which I already have the program on GitHub^{1}, but in many other cases, for example in Variational Quantum Monte Carlo or Density Functional Theory. I intend to make some programs that use the two mentioned methods for this blog, by the way.
This principle is so important for this post that the image on top contains an important formula. It’s the energy functional, with the wave vector assumed normalized. I think that the Wikipedia page is quite a good start, but anyway, I’ll detail a part here, too. Let’s start with the right hand side of the energy functional:
$$\langle\psi|\hat{H}|\psi\rangle = \sum_{m,n}\langle\psi|m\rangle\langle m|\hat{H}|n\rangle\langle n|\psi\rangle$$
There is nothing fancy going on, just the unit operator being inserted twice. It’s the superposition principle in action, if you have a particular basis for your vector space, you may decompose/write your vector using its projections on the vectors that form the basis:
$$|\psi\rangle = \sum_n c_n|n\rangle$$
with the component along a particular basis vector being not surprisingly, the projection onto that vector (that is, the scalar product between them):
$$c_n = \langle n|\psi\rangle$$
So, the Hamiltonian acting on an eigenvector gives the eigenvalue (by the way, we use the energy eigenvectors as basis):
$$\hat{H}|n\rangle = E_n|n\rangle$$
But the eigenvectors are orthogonal, so:
$$\langle\psi|\hat{H}|\psi\rangle = \sum_{m,n} c_m^*\,c_n\,E_n\,\delta_{mn} = \sum_n E_n|c_n|^2$$
Obviously, using the smallest energy from the spectrum, that is, the ground state energy, we find:
$$\langle\psi|\hat{H}|\psi\rangle \geq \sum_n E_0|c_n|^2$$
The ground state energy is just a constant, it can be taken out of the sum and using the fact that the wave vector is normalized, one gets:
$$\langle\psi|\hat{H}|\psi\rangle \geq E_0\sum_n|c_n|^2 = E_0$$
Ok, but what does it mean? It means that no matter what wave function we try, we’ll get an energy that is bigger than the ground state energy (or equal, if we manage to pick up the ground state – or a ground state, if there are more of them).
If we try one at random, we get one expectation value. By some method we pick another wave function and calculate the expectation value for it. If it’s smaller, we drop the first guess and keep the new one. We can repeat the trials until hopefully we are close enough to the ground state. How we pick the state and how we modify it depends on the particular method and I’ll detail it for the specific program.
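A classic textbook illustration is the hydrogen atom with a Gaussian trial function exp(-α r²). In atomic units the energy expectation value is E(α) = 3α/2 - 2 sqrt(2α/π), and the exact ground state energy is -0.5. The sketch below minimizes E(α) with a golden-section search, chosen here only because it is short; the function names are invented for the example and it is not the method used by the Hartree-Fock program.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Energy expectation value (atomic units) for hydrogen with the
// trial wave function exp(-alpha r^2).
double gaussianTrialEnergy(double alpha)
{
    const double pi = std::acos(-1.);
    return 1.5 * alpha - 2. * std::sqrt(2. * alpha / pi);
}

// Golden-section search for the minimum of a unimodal function on [lo, hi].
double goldenSectionMinimum(const std::function<double(double)>& f,
                            double lo, double hi, double tolerance)
{
    assert(lo < hi && tolerance > 0.);

    const double invPhi = (std::sqrt(5.) - 1.) / 2.;
    double x1 = hi - invPhi * (hi - lo);
    double x2 = lo + invPhi * (hi - lo);
    double f1 = f(x1), f2 = f(x2);

    while (hi - lo > tolerance)
    {
        if (f1 < f2) // the minimum is in [lo, x2]
        {
            hi = x2; x2 = x1; f2 = f1;
            x1 = hi - invPhi * (hi - lo);
            f1 = f(x1);
        }
        else // the minimum is in [x1, hi]
        {
            lo = x1; x1 = x2; f1 = f2;
            x2 = lo + invPhi * (hi - lo);
            f2 = f(x2);
        }
    }

    return 0.5 * (lo + hi);
}
```

The minimum is at α = 8/(9π) with E = -4/(3π), that is, about -0.424: above the true -0.5, as the variational principle says it must be, and not closer because a single Gaussian is a poor approximation of the exact exponential.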
If you want to find more about this, here is a link: Introduction to Electronic Structure Calculations: The variational principle^{5}.
I should also mention here that you could derive the time independent Schrödinger equation by using variational calculus and imposing that a stationary state has the energy variation vanishing in the first order if the wave vector suffers a small variation from the stationary state.
I want to mention perturbation theory here, because I might use it in a future project for this blog. Not only that, but related with the following subject, Hartree-Fock, it can be used for a Post Hartree-Fock improvement, see Møller–Plesset perturbation theory.
Since we assumed that the Hamiltonian we have to deal with is not explicitly time dependent, I’ll let aside for now the time-dependent perturbation theory. If ever needed for this blog, I’ll detail it there.
The time independent perturbation theory is detailed well on the linked Wikipedia page, I want to emphasize here just the main ideas. If you have a Hamiltonian that you cannot solve, but you can write it as a sum of a Hamiltonian that can be solved (either exactly or by some approximation method) and a small term, then you can use the perturbation theory to get an approximate solution for the whole Hamiltonian. You need to expand both the eigenstates and eigenvalues of the full Hamiltonian in Taylor series at the unperturbed eigenstate/eigenvalue and substitute them into the time independent Schrödinger equation. Expanding and identifying the terms that have the same power allows you to get the correction terms. Obviously one has to stop at a certain order, when the approximate solution is close enough to the real one.
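The simplest system where the approximation can be compared with the exact result is a two-level one, with unperturbed energies e1 < e2 and a small coupling λ between the levels. The sketch below is only an illustration with invented names (`TwoLevel`, `exactGroundEnergy`, `secondOrderGroundEnergy`). Since the perturbation has no diagonal elements, the first order correction is zero and the second order one is -λ²/(e2 - e1).

```cpp
#include <cassert>
#include <cmath>

// H = [[e1, coupling], [coupling, e2]] with e1 < e2;
// the perturbation is the off-diagonal coupling.
struct TwoLevel { double e1; double e2; double coupling; };

// Exact lowest eigenvalue of the 2x2 Hamiltonian.
double exactGroundEnergy(const TwoLevel& s)
{
    const double mean = 0.5 * (s.e1 + s.e2);
    const double halfGap = 0.5 * (s.e2 - s.e1);

    return mean - std::sqrt(halfGap * halfGap + s.coupling * s.coupling);
}

// Ground state energy up to second order in the coupling:
// zeroth order e1, first order 0, second order -coupling^2 / (e2 - e1).
double secondOrderGroundEnergy(const TwoLevel& s)
{
    assert(s.e2 > s.e1);

    return s.e1 - s.coupling * s.coupling / (s.e2 - s.e1);
}
```

For e1 = 0, e2 = 1 and λ = 0.05 the two results differ by about 6e-6, which is the size of the neglected fourth order term λ⁴/(e2 - e1)³.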
We ended up with a Hamiltonian which is still quite complex and very hard to solve even for simple systems. If we could simplify somehow the interaction terms we could try to solve it, but as it is it’s still very difficult. There are various ways of doing that, but that’s a subject for other posts.
So that’s it for now. The next post will be about Hartree-Fock theory. Then another one will follow that will present the program^{1}.
I intended to have some molecular dynamics code for this blog, something a little bit more complex than the one described in the Newtonian Gravity post, perhaps something with neighbor lists or maybe again something about gravity but with a Barnes–Hut simulation. Maybe I’ll do that later.
I found that there are plenty of places on the net where hard smooth spheres are simulated in the usual way of doing N-body simulations, that is, the time-driven one. The time is divided into small time steps, the balls advance each step according to the equations of motion and at each time step they are checked for collisions. I’ve even seen naive tests of all N(N-1)/2 pairs at each step! But in such a case one does not need to find the balls’ trajectories numerically, they can be calculated analytically. Between collisions they travel in straight lines! There is no need to advance in small increments. Even if there is an external field, for example a gravitational field, the path is easy to calculate analytically. A collision in the ideal case, the perfectly elastic collision, is instantaneous. Even if the interaction had a very short but finite range and the ‘particles’ were spread far apart, it would be a bad idea to split time into equal small intervals. Most of the time the ‘particles’ would be non-interacting. A many-body problem is hard because of the interactions, with no interactions it is very simple! If one uses the time-driven algorithm naively and the bodies reach a high enough speed, they can even pass right through each other.
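The pass-through problem is easy to reproduce. Here is a minimal one-dimensional sketch, not taken from the project: Ball1D, TimeSteppedDetects and AnalyticCollisionTime are hypothetical names, and the balls are assumed to start apart from each other.

```cpp
#include <cmath>
#include <limits>

// Hypothetical 1D ball: center x, velocity v, radius r.
struct Ball1D { double x, v, r; };

// Time-driven check: advance both balls by fixed steps dt and test for overlap
// only at the sampled instants. Returns true if an overlap was ever seen.
inline bool TimeSteppedDetects(Ball1D a, Ball1D b, double dt, int steps)
{
    for (int i = 0; i < steps; ++i) {
        a.x += a.v * dt;
        b.x += b.v * dt;
        if (std::abs(a.x - b.x) <= a.r + b.r) return true;
    }
    return false;
}

// Event-driven check: solve |dx + dv * t| = r_a + r_b for the first contact time.
// Assumes the balls do not overlap at t = 0.
inline double AnalyticCollisionTime(const Ball1D& a, const Ball1D& b)
{
    const double dx = a.x - b.x;
    const double dv = a.v - b.v;
    if (dx * dv >= 0) return std::numeric_limits<double>::infinity(); // not approaching
    return (std::abs(dx) - (a.r + b.r)) / std::abs(dv);
}
```

With two balls of radius 0.5 starting at x = -10 and x = 10 and moving towards each other with speed 100, a step of 0.03 moves the centers by 6 units relative to each other per step, so the sampled distances jump from 2 on one side to 4 on the other and the overlap is never seen. The analytic version gives the contact time directly, whatever the speed.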
So I decided to correct that common mistake by showing – with an easy example in code, which can be improved – how such cases should be treated. The main idea is to calculate the particle paths as if they were non-interacting, to find out when and where they collide (among themselves or with other obstacles) and to use those events to calculate new paths that generate new events and so on. Some later events can be cancelled because of events that occur earlier, for example when a particle that heads towards a collision with another one has its path changed by an earlier collision. I’ll try to describe the algorithm in more detail later, until then here is the program in action:
The video is jerky because of the screen recording app, on my screen the movements were quite smooth.
I currently don’t have much time for writing this blog entry, perhaps I should edit it later, but I’ll try to give enough links to complete the information. By the way, the code is here, on GitHub^{1}.
Since I might not give enough information here about the algorithm, I’ll provide some links to articles that contain much more than I could write here. It all started with Studies in Molecular Dynamics by B. J. Alder and T. E. Wainwright^{2}. It’s more general than the hard spheres case, they consider not only an infinite potential but also a potential that acts only on a small finite range. Another article that’s worth a look and contains info about improvements of the algorithm used in the project for this blog entry is Achieving O(N) in simulating the billiards problem in discrete-event simulation by Gary Harless^{3}. I wrote the program with the intention of having one that is easy to understand, if you want it faster you’ll have to implement sectoring. It’s not that big a deal but it would complicate the program a little bit so I decided to have it without sectoring for now. Another article that describes the event-driven algorithm is Stable algorithm for event detection in event-driven particle dynamics by Marcus N. Bannerman, Severin Strobl, Arno Formella and Thorsten Poschel^{4}. Another one is How to Simulate Billiards and Similar Systems by Boris D. Lubachevsky^{5}. Maybe you want to generalize it to more than the three-dimensional space? No problem, here is a good start: Packing hyperspheres in high-dimensional Euclidean spaces by Monica Skoge, Aleksandar Donev, Frank H. Stillinger, and Salvatore Torquato^{6}. I think there is some source code somewhere on the net that was used for this article, I’ll let you search for it if you are interested. How about parallelizing it? Well, it’s not very simple, here is something to get you started: Parallel Discrete Event Simulation of Molecular Dynamics Through Event-Based Decomposition by Martin C. Herbordt, Md. Ashfaquzzaman Khan and Tony Dean^{7}. I think those should be enough for a start, there is plenty of other info on the internet.
I would divide the physics into three parts: the part about computing the paths of the ‘particles’ while they do not interact, the part about computing the place and moment of collision and the part about computing the result of the collision. Here they are, in order:
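In the project on GitHub^{1} the particles travel according to Newton’s first law, that is, in straight lines. Knowing the initial position $\vec{r}_0$ and velocity $\vec{v}_0$ at the time $t_0$, the trajectory is given by:
$$\vec{r}(t) = \vec{r}_0 + \vec{v}_0 (t - t_0)$$
If you want to add an external field, let’s say a uniform gravitational one with acceleration $\vec{g}$, it is more complicated, but still not a big deal to compute:
$$\vec{r}(t) = \vec{r}_0 + \vec{v}_0 (t - t_0) + \frac{1}{2} \vec{g} (t - t_0)^2$$
In the first case the velocity doesn’t change until the collision takes place; in the second case it does, but it’s also easy to compute:
$$\vec{v}(t) = \vec{v}_0 + \vec{g} (t - t_0)$$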
By the way, if you want, you could change the program to take into account such a field, too.
Here is the code that calculates the position. It uses the Vector3D class already used in the Newtonian gravity program.
    inline Vector3D<double> GetPosition(double simTime) const
    {
        return position + (simTime - particleTime) * velocity;
    }
It’s basically the above formula translated into code (it’s in the Particle class). simTime is the simulation time, particleTime is the time for the ‘initial’ position of the particle. Each particle has its own time. It’s typically the time of the particle’s last collision, but it can also be the time when a particle that had a collision scheduled with it actually collided with another particle (more about this later).
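It should be obvious from the above that you only have to find one of them (the place or the moment of collision) to easily calculate the other one. The condition to have a collision is that the centers of the spheres approach each other and that at the moment of impact the distance between the centers is the sum of the two radii. If we want to calculate the moment of collision for the particles i and j, ignoring for now all the other ones (a particle i can collide with another one before hitting j, but about that, later), we impose the condition of having the distance between them equal to the sum of the radii, denoted by s here to avoid confusion:
$$\left| \vec{r}_i(t) - \vec{r}_j(t) \right| = s$$
With $\Delta\vec{r} = \vec{r}_i - \vec{r}_j$ and $\Delta\vec{v} = \vec{v}_i - \vec{v}_j$ taken at a common reference time and t measured from that time, squaring both sides gives:
$$\Delta\vec{v}^2 \, t^2 + 2 \left( \Delta\vec{r} \cdot \Delta\vec{v} \right) t + \Delta\vec{r}^2 - s^2 = 0$$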
This is a quadratic equation which might have two complex conjugate solutions, in which case the ‘particles’ do not collide, or two real solutions. Of course, we are interested only in solutions that are not in the past of any of the involved particles, and the centers should also be approaching, not departing. If both solutions are in the future, we pick the first one, because the second one is the time when the particles would separate again if they could pass through each other. You’ll find more details about this in the articles, here is the code that deals with the time of collision, also found in the Particle class:
    inline double CollisionTime(Particle& other) const
    {
        // start from whatever time is bigger, this is the reference time
        double refTime = std::max<double>(particleTime, other.particleTime);
        double difTime = other.particleTime - particleTime;

        Vector3D<double> difPos = position - other.position;

        // adjust because the particles might not have the same time
        // there are two cases here:
        // 1. difTime > 0 means that the 'other' particle time is bigger
        //    refTime is the 'other' particle time
        //    in such a case 'position' (the position of 'this' particle) must be advanced in time
        //    like this: position = position + velocity * difTime
        // 2. difTime < 0 means that 'this' has the bigger reference time
        //    the position of the 'other' particle must be advanced in time
        //    like this: other.position = other.position + other.velocity * abs(difTime)
        //    in difPos other.position is with '-', difTime is negative so the 'abs' is dropped
        difPos += (difTime > 0 ? velocity : other.velocity) * difTime;

        Vector3D<double> difVel = velocity - other.velocity;

        double b = difPos * difVel;
        double minDist = radius + other.radius; // collision distance
        double C = difPos * difPos - minDist * minDist;
        double difVel2 = difVel * difVel;
        double delta = b * b - difVel2 * C;

        // delta < 0 means two complex solutions, that is, no real solution = the spheres do not collide
        // delta = 0 is the degenerate case, the spheres meet on the trajectory and depart at the same time = tangential touch,
        // no need to handle it, they are smooth spheres
        // delta > 0 is the only interesting case
        // the two different solutions mean that there are two times when the spheres are at the radius1 + radius2 distance
        // on their trajectory, which is what we want to consider
        // the first time is the time they 'meet' and that's the needed time,
        // the time of the collision event after which new velocities must be calculated
        // the b < 0 condition is the condition that the centers approach, b > 0 means that they are departing
        if (delta > 0 && b < 0)
        {
            double sdelta = sqrt(delta);
            double t1 = (-b - sdelta) / difVel2;
            // t2 is not needed, it's the bigger time when they would be again at the radius1 + radius2 distance
            // if they would pass through each other
            //double t2 = (-b + sdelta) / difVel2;

            // if this time would be negative, that is, it would be as if it took place in the past
            // perhaps there was a real collision after it that changed the trajectory?
            if (t1 >= 0) return refTime + t1;
        }

        return std::numeric_limits<double>::infinity();
    }
Hopefully the comments are clear enough.
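To try the formula in isolation, here is a stripped-down, self-contained sketch of the same computation. Vec3, Ball and FirstContactTime are stand-ins invented for this example, not the project’s Vector3D and Particle classes, and both balls are assumed to be given at the same time t = 0, so the per-particle time adjustment is left out.

```cpp
#include <cmath>
#include <limits>

// Minimal stand-in for a 3D vector; only what the sketch needs.
struct Vec3 { double x, y, z; };
inline Vec3 operator-(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline double Dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Hypothetical simplified particle, position and velocity given at t = 0.
struct Ball { Vec3 pos, vel; double radius; };

// First time t >= 0 at which the centers are radius1 + radius2 apart while
// approaching, or infinity if the balls never collide.
inline double FirstContactTime(const Ball& p, const Ball& q)
{
    const Vec3 dr = p.pos - q.pos;
    const Vec3 dv = p.vel - q.vel;
    const double s = p.radius + q.radius;

    const double b = Dot(dr, dv);          // negative when the centers approach
    const double a = Dot(dv, dv);
    const double c = Dot(dr, dr) - s * s;
    const double delta = b * b - a * c;    // reduced discriminant of the quadratic

    if (delta > 0 && b < 0) {
        const double t = (-b - std::sqrt(delta)) / a; // the smaller root: first contact
        if (t >= 0) return t;
    }
    return std::numeric_limits<double>::infinity();
}
```

For two balls of radius 1 starting 10 units apart and moving head-on with speed 1 each, the gap between the surfaces is 8 and the closing speed is 2, so the sketch returns 4. Shifting one ball sideways by 5 units makes the discriminant negative and the result is infinity.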
In case you want to add an external field, the trajectories are parabolic. For a uniform gravitational field every particle has the same acceleration, so it cancels out of the relative motion: the collision time between two particles is still given by the quadratic equation above and only the wall collision times change. If the accelerations differ (for example for charged particles with different charge to mass ratios in a uniform electric field), you’ll end up with a quartic equation, which will either have all four roots complex, two roots complex and two real, or all four roots real. In the first case there is no collision, in the second case they would collide only once (there is one real solution for when they meet, then the later one for when they would depart if they could pass through one another). This situation is quite similar to the one treated above. With four real roots there are two collisions (and two departures as the ‘particles’ pass through one another). Just pick the first one that is in the future of both particles. I ignored here the degenerate cases, I’ll leave the detailed study of the roots up to you, if you want to have this coded. Here is a start.
The project considers smooth hard spheres. Hard means an infinite potential inside the sphere, and zero outside. The interaction is instantaneous. Smooth means there is no friction between balls, that is, the tangential velocity does not change during the collision. Either of those could be relaxed and modelled in the program. I consider the second one the most interesting: in that case the state of the ‘particle’ would not be given by only the position and momentum, you would also need the angular momentum. You’ll involve the moment of inertia, which for a homogeneous ball of mass m and radius r is easy to calculate: $I = \frac{2}{5} m r^2$. Having the balls rotate and change their rotation during the collision would complicate the program quite a bit, but it could be done.
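Anyway, the project considers the spheres smooth. Both momentum and kinetic energy are conserved (the primes denote the velocities after the collision):
$$m_1 \vec{v}_1 + m_2 \vec{v}_2 = m_1 \vec{v}_1' + m_2 \vec{v}_2'$$
$$\frac{1}{2} m_1 v_1^2 + \frac{1}{2} m_2 v_2^2 = \frac{1}{2} m_1 v_1'^2 + \frac{1}{2} m_2 v_2'^2$$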
You might think that by writing this on vector components you get only four equations for six unknowns, but do not forget that the tangential components of the velocities (two for each sphere) do not change. I won’t solve this here, it’s quite easy but I have limited time and I don’t want to type many formulae. If you want to find the solution yourself, you can use the center of mass coordinate system to ease things up. You’ll find out that in that particular coordinate system the velocity components along the line that joins the centers just reverse sign after the collision, while the tangential ones stay the same. To turn back to the original coordinate system, just add the velocity of the center of mass to them and that’s it.
Here is the code that calculates the velocities after collision, from the class Simulation:
    static inline void AdjustVelocitiesForCollision(Particle& particle1, Particle& particle2)
    {
        double totalMass = particle1.mass + particle2.mass;

        Vector3D<double> velDif = particle1.velocity - particle2.velocity;
        Vector3D<double> dir = (particle1.position - particle2.position).Normalize();
        Vector3D<double> mult = 2 * (velDif * dir) * dir / totalMass;

        particle1.velocity -= particle2.mass * mult;
        particle2.velocity += particle1.mass * mult;
    }
Hopefully I didn’t make a mistake trying to simplify the code.
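One way to gain some confidence in such an update is to check the conservation laws numerically. The sketch below rewrites the same rule with stand-in types (Vec3, Ball, Collide, TotalMomentum and KineticEnergy are names made up for this example) and assumes that both positions are already those at the moment of contact.

```cpp
#include <cmath>

// Minimal stand-in vector type for the sketch.
struct Vec3 { double x, y, z; };
inline Vec3 operator+(const Vec3& a, const Vec3& b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator-(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 operator*(double s, const Vec3& a) { return {s * a.x, s * a.y, s * a.z}; }
inline double Dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Hypothetical simplified particle.
struct Ball { Vec3 pos, vel; double mass; };

// Elastic collision of two smooth spheres: only the velocity components along
// the line joining the centers change.
inline void Collide(Ball& p, Ball& q)
{
    const Vec3 d = p.pos - q.pos;
    const Vec3 n = (1.0 / std::sqrt(Dot(d, d))) * d;   // unit vector between centers
    const double k = 2.0 * Dot(p.vel - q.vel, n) / (p.mass + q.mass);
    p.vel = p.vel - (q.mass * k) * n;
    q.vel = q.vel + (p.mass * k) * n;
}

inline Vec3 TotalMomentum(const Ball& p, const Ball& q)
{
    return p.mass * p.vel + q.mass * q.vel;
}

inline double KineticEnergy(const Ball& p, const Ball& q)
{
    return 0.5 * (p.mass * Dot(p.vel, p.vel) + q.mass * Dot(q.vel, q.vel));
}
```

Comparing TotalMomentum and KineticEnergy before and after Collide for a few arbitrary inputs should give the same values up to rounding. A head-on collision of two equal masses is an easy special case to verify by hand: the balls simply exchange velocities.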
I didn’t mention walls when talking about trajectories and collisions above, but the ‘particles’ are contained in a cube. The collisions with the walls are elastic collisions, too. They are quite easy to handle so I’ll let you check out the code for those. Maybe you could add some code for collision with the camera. Just something like a sphere around it with infinite mass. That would provide some nice effects when you move the camera inside the cube. It should be very easy to implement.
Here is the algorithm, presented very shortly:
Simulation::GenerateRandom.
Simulation::BuildEventQueue.
Basically all the algorithm code is in the Simulation class, along with some help from the Particle class. Here is the Advance method:
    void Simulation::Advance()
    {
        RemoveNoEventsFromQueueFront();
        if (eventsQueue.empty()) return; // should not happen, except when there are no particles

        Event nextEvent = GetAndRemoveFirstEventFromQueue();
        int numParticles = (int)particles.size();

        AdjustForNextEvent(nextEvent);
        EraseParticleEventsFromQueue(nextEvent); // also adjusts 'partners'

        AddWallCollisionToQueue(nextEvent.particle1);

        if (Event::EventType::particleCollision == nextEvent.type)
        {
            AddWallCollisionToQueue(nextEvent.particle2);
            AddCollision(nextEvent.particle1, nextEvent.particle2);

            for (int i = 0; i < numParticles; ++i)
            {
                if (i != nextEvent.particle1 && i != nextEvent.particle2)
                {
                    AddCollision(nextEvent.particle1, i);
                    AddCollision(nextEvent.particle2, i);
                }
            }
        }
        else
        {
            for (int i = 0; i < numParticles; ++i)
            {
                if (i != nextEvent.particle1)
                    AddCollision(nextEvent.particle1, i);
            }
        }
    }
For the rest please check out the code on GitHub^{1}.
One very important improvement would be to consider periodic boundary conditions. If you don’t want to simulate some very small cavity (or channel, whatever), you would want conditions as close as possible to those in a bulk liquid or gas, that is, free from surface effects. This can be achieved with periodic boundary conditions. I didn’t implement them because they complicate the code a little bit and they work well together with sectoring, which I also avoided. Please look into the linked articles for details, sectoring increases computation speed even more.
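The two basic ingredients of periodic boundary conditions are wrapping a coordinate back into the box and taking the separation to the nearest periodic image. A minimal sketch, with hypothetical helper names and a box assumed to span [0, size] along each axis:

```cpp
#include <cmath>

// Wrap a coordinate back into the box (up to rounding at the upper edge).
inline double WrapCoordinate(double x, double size)
{
    x = std::fmod(x, size);
    return x < 0 ? x + size : x;
}

// Separation along one axis to the nearest periodic image,
// a value between -size/2 and size/2.
inline double MinimumImage(double dx, double size)
{
    return dx - size * std::round(dx / size);
}
```

In an event-driven code this is only the start: the collision search also has to consider the periodic images of the partner particle, which is one reason it combines naturally with sectoring.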
One could extend the program to do other interesting things than diffusion. For example, one could add a ‘wall’ that splits the cube that contains the particles in half and lets particles pass through it (or reflects them) depending on the particle size and mass, perhaps with a probability attached. This would be a model for osmosis. Put a heavy big particle in the middle and simulate Brownian motion. One can think of many other things, especially with an external field added.
There are many statistics that could be computed, such as the distribution of speeds, the collision rate, the mean free path, the temperature reached at equilibrium (especially if you start out with different temperatures in the beginning as mentioned above) and so on. I won’t give many details, there are some computational physics books I already mentioned on this blog that offer plenty of details on what information to get out of molecular dynamics and how. Maybe I’ll detail this in another post about molecular dynamics.
If I would decide to add some charting to the program, I would probably add one that shows the speed distribution as the system evolves towards equilibrium.
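The data for such a chart could be gathered with a simple histogram. This is a sketch only; Velocity and SpeedHistogram are hypothetical names and the equal-width binning is just one possible choice.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical velocity record for the sketch.
struct Velocity { double x, y, z; };

// Counts the speeds into numBins equal-width bins covering [0, maxSpeed).
// Speeds at or above maxSpeed are counted in the last bin.
inline std::vector<std::size_t> SpeedHistogram(const std::vector<Velocity>& velocities,
                                               double maxSpeed, std::size_t numBins)
{
    std::vector<std::size_t> bins(numBins, 0);
    if (numBins == 0 || maxSpeed <= 0) return bins;

    for (const auto& v : velocities) {
        const double speed = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
        std::size_t bin = static_cast<std::size_t>(speed / maxSpeed * static_cast<double>(numBins));
        if (bin >= numBins) bin = numBins - 1;
        ++bins[bin];
    }
    return bins;
}
```

For a gas of hard spheres at equilibrium the histogram should approach the Maxwell-Boltzmann distribution of speeds, so plotting it after every batch of collisions would show the relaxation.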
Especially for code that is intended to present some concepts rather than to be production code, I agree with Donald Knuth:
The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times; premature optimization is the root of all evil (or at least most of it) in programming.
I did not focus on optimization. I tried first to make the code work correctly (although I’m not 100% sure it works all right, I did not test it enough). I consider that the code should first express intent, then one should work on optimization.
At first I simply called Simulation::Advance from the main thread and it worked quite all right for plenty of particles. The code that does computations should reside in another thread to avoid locking the main thread, so after that I implemented the MolecularDynamicsThread. It uses Advance to generate results that are put in a queue accessed from the main thread. It’s quite standard producer/consumer code, but it can still lock the main thread if the worker thread cannot keep up with the requests, so don’t simulate many particles with a lot of speed. I didn’t think much about it, I just wanted to move the computation out of the main thread. What the main thread gets from the worker thread are ‘snapshots’ of the computation, after each collision. The main thread calculates the actual particle positions using the position, particle time (each particle has its own time) and velocity. Since particles travel in straight lines between collisions, there is no need to move those computations into a separate thread, they are fast enough not to pose problems.
For the same reasons, I used a std::multiset for the events priority queue, although it’s probably implemented with a tree. I did not care about performance at that point. Later, after the first commit on GitHub^{1}, I replaced the multiset with a heap. Since std::priority_queue gives access only to its top element, and only as const, I implemented one myself using the std algorithms for a heap. With this change the performance increased a bit. One might do better by looking into the heap implementations from boost, but I didn’t consider it necessary for the purpose of the project. The commit has a ‘WARNING!’ in the comment, you might want to get the code with the multiset if you notice bugs in the one with the heap. I turned removing the events from the queue into setting them to the ‘no event’ event type instead. Maybe not very elegant, but it works. Those events are removed when they reach the top of the heap.
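The idea of marking events instead of erasing them can be sketched with the standard heap algorithms on a plain vector. The code below is an illustration under assumptions, not the project’s implementation: SketchEvent and SketchEventQueue are made-up names and the real Event class has its own layout.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical event record for the sketch.
struct SketchEvent {
    enum class Type { noEvent, wallCollision, particleCollision };
    Type type;
    double time;
    int particle1;
    int particle2; // -1 when only one particle is involved
};

// Min-heap on the event time, built on a vector with the std heap algorithms,
// so the elements stay reachable for in-place invalidation.
class SketchEventQueue {
public:
    void Push(const SketchEvent& e)
    {
        events.push_back(e);
        std::push_heap(events.begin(), events.end(), Later);
    }

    // Instead of erasing, mark every event of the given particle as 'no event'.
    // Only the type changes, not the time, so the heap order stays valid.
    void Invalidate(int particle)
    {
        for (auto& e : events)
            if (e.particle1 == particle || e.particle2 == particle)
                e.type = SketchEvent::Type::noEvent;
    }

    // Pops stale events until a real one is found; returns false if none is left.
    bool PopNext(SketchEvent& out)
    {
        while (!events.empty()) {
            std::pop_heap(events.begin(), events.end(), Later);
            out = events.back();
            events.pop_back();
            if (out.type != SketchEvent::Type::noEvent) return true;
        }
        return false;
    }

private:
    // With this comparison the earliest event ends up at the front of the heap.
    static bool Later(const SketchEvent& a, const SketchEvent& b) { return a.time > b.time; }

    std::vector<SketchEvent> events;
};
```

The price of this lazy deletion is that stale events stay in memory until they surface, and Invalidate is a linear scan in this sketch. Keeping per-particle lists of events would avoid the scan at the cost of more bookkeeping.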
I’ve got the OpenGL code from the Newtonian Gravity project and I customized it to fit this project. First I dropped some classes that are not used, like those for the shadow cube map, skybox and textures. I did some copy/paste into the view class for the OpenGL code but there are quite a few changes in there. There is also a MolecularDynamicsGLProgram class derived from OpenGL::Program which has quite different shaders compared with the Solar System project. I also changed the OpenGL::Sphere class to allow a useTexture parameter at creation. For this project no texture is used, so it is set to false. There is no need to pass info about texture coordinates if it’s not going to be used. I also added a DrawInstanced method, and that is one of the most important changes in the OpenGL code compared with the Solar System implementation. In the view you’ll notice that there are three OpenGL::VertexBufferObject objects, one for color, one for scale and one for position. The scale is set at Setup (for OpenGL) time. The color is also set there, but it can change later while running. The only thing that changes each frame is the position. The code draws all the spheres with a single call to DrawInstanced after setting the position data all at once in the vertex buffer. I already gave some tutorial links in the Newtonian Mechanics post, I’ll only point to instancing ones now: here^{8} and here^{9}. It’s not the purpose of this blog to describe OpenGL code in detail so I won’t describe the code more. I’ll mention that I tested it with more than one directional light and it works. Currently it’s a single light only but you can add more in CMolecularDynamicsView::SetupShaders.
As for the Solar System project, the program uses mfc, already included with Visual Studio, the obvious OpenGL library and glm^{10} and glew^{11}.
I won’t describe here the classes from the OpenGL namespace, I’ll just mention that they are quite thin wrappers around the OpenGL API, they typically contain an ID. Please visit the Newtonian Gravity post for more details.
Here is a short description of the classes, first the ones generated by the App Wizard:
CMolecularDynamicsApp – the application class. There is not much change in there, just added the contained options; InitInstance is changed to load them and the registry key is changed to Free App.
CMolecularDynamicsDoc – the document class. It contains the data. Since it’s an SDI application, it is reused (that is, it is the same object after you choose ‘New’). There are quite a few additions/changes there. It contains the worker thread, the results queue, a mutex that protects the access to the results queue, the ‘simulation time’ and so on. Some of them are atomic for multithreading access, although maybe not all of them actually need to be. I made several changes while moving the computation into the worker thread and I probably left some atomic variables that do not need to be atomic.
CMolecularDynamicsView – the view class. It takes care of displaying. It contains quite a bit of OpenGL code and it deals with key presses and the mouse for camera movements. Among other OpenGL objects it contains the camera, too.
CMainFrame – the main frame class. I added there some menu/toolbar handling. It also forwards some mouse and keyboard presses to the view in case it gets them.
CAboutDlg – what the name says.
Now, some classes I ‘stole’ from my other open source projects:
CEmbeddedSlider and CMFCToolBarSlider – they implement the slider from the toolbar, used to adjust the simulation speed.
CNumberEdit – the edit control for editing double/float values.
ComputationThread – the base class for the worker thread.
Vector3D – what the name says. First used in the Solar System project. I already anticipated there that I will be using this in other projects.
Classes dealing with options and the UI for them:
Options – the class that contains the options and allows saving them into the registry and loading them from the registry.
COptionsPropertySheet – the property sheet. It’s as generated by the class wizard except that it loads and sets the icon. I already used it in other projects here.
SimulationPropertyPage, DisplayPropertyPage, CameraPropertyPage – property pages that are on the property sheet. They are quite standard mfc stuff, I won’t say much about them.
The classes that contain the implementation (apart from the document and the view, you should also check those to understand how it works):
ComputationResult – it’s what the worker thread passes to the main thread. It contains just a list of particles and the next collision time.
MolecularDynamicsGLProgram – the OpenGL program. It’s derived from OpenGL::Program and it deals with shaders, lights and so on.
Particle – implements a particle. Contains the particle mass, radius, position and velocity and a ‘particle time’. Has some methods that allow finding out the particle position at a particular time (if it suffers no collision until then), the wall collision time and the collision time with another particle.
Event – this implements the events that are in the events queue. The event has a type (currently it can be ‘wall collision’, ‘particles collision’ or ‘no event’), a time of occurrence and it also contains the particles involved. If only one particle is involved, it is set in particle1 and the other should be -1.
Simulation – as the name says, it is the simulation class. It contains a vector of particles that holds all simulated particles and the events queue, currently implemented with a heap (using a vector container). It has several helper methods, the main one that advances the simulation from one collision event to the next one is Advance.
MolecularDynamicsThread – the worker thread. It’s a typical producer/consumer implementation but I feel it’s far from being optimal. My main goal was to have the simulation functional, I didn’t focus on performance, so here it is, something that works. It advances the simulation and it puts the results in the document’s results queue (if not filled up).
That’s about it. Unfortunately I didn’t have time for more details but I guess this is more than nothing.
If you notice any mistakes/bugs or if you have some improvements, please point them out in a comment.
After quite a bit of time, here is the C++ program I promised. I’ve made a youtube video just to present it, for more meaningful results it must run with way more sweeps than in the video:
It turned out that the program I used for making the video had some troubles with the frame rate and changes between frames and it couldn’t keep up so I had to cut out an interesting part, the evolution close to the critical temperature. Anyway, you can get the program yourself and run it, from here: Project on GitHub^{1}.
For theory I’ll send you again to the free computational physics book^{2} I found on the net and also to a link into a CompPhysics GitHub repository, containing some lectures^{3} of Morten Hjorth-Jensen. Do browse the repository, you might find some interesting things there.
I already covered a bit of theory in the Monte Carlo methods and The Ising Model posts so I won’t repeat it here. The implementation of the Metropolis Monte Carlo is very similar to the one from the previous post, just that now it’s in C++ instead of JavaScript and there is an optimization: it uses pre-calculated values for the exponentials instead of calculating them each time. There is also some more in there, accumulating Energy and Magnetization to be able to calculate statistics. Here is the code for a Monte Carlo sweep, for comparison:
    void SpinMatrix::MonteCarloSweep()
    {
        unsigned int size = m_rows * m_cols;

        std::uniform_int_distribution<unsigned int> rndRow{ 0, m_rows - 1 };
        std::uniform_int_distribution<unsigned int> rndCol{ 0, m_cols - 1 };
        std::uniform_real_distribution<double> dbl_dist{ 0 , 1 };

        for (unsigned int pos = 0; pos < size; ++pos)
        {
            unsigned int row = rndRow(rndEngine);
            unsigned int col = rndCol(rndEngine);

            double energyDif = EnergyDifForFlip(row, col);
            if (energyDif < 0)
            {
                // accept it
                // flip the spin
                m_spins[m_cols*row + col] *= -1;
                Energy += energyDif;
                Magnetization += 2 * GetSpin(row, col);
            }
            else
            {
                //double val = ExpMinusBetaE(energyDif);
                double val = expMap[(unsigned int)energyDif];
                if (dbl_dist(rndEngine) < val)
                {
                    // accept it
                    // flip the spin
                    m_spins[m_cols*row + col] *= -1;
                    Energy += energyDif;
                    Magnetization += 2 * GetSpin(row, col);
                }
                // else reject it, which means do nothing
            }
        }
    }
As you can see in the code above, energy and magnetization are needed. There are two options: either calculate energy and magnetization when needed by summing up for all spins, or calculate it only once and then only modify the values each spin flip. If you want to use for statistics the state obtained after each sweep, the second method is better because not all spins are flipped during a sweep, some flips are rejected by the algorithm, so the computational cost is lower when using the latter method.
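The incremental bookkeeping can be checked against a full recomputation on a small lattice. The Lattice struct below is a hypothetical, stripped-down stand-in for the spin matrix (square lattice of side L of at least 3, periodic boundaries, J = 1), written only to show that updating the energy by the flip difference and the magnetization by twice the new spin value gives the same numbers as summing over all spins.

```cpp
#include <vector>

// Hypothetical minimal lattice: L x L spins (+1 / -1) stored row by row.
struct Lattice {
    int L;
    std::vector<int> s;

    // Spin access with periodic wrap-around; valid for indices from -1 to L.
    int At(int r, int c) const { return s[((r + L) % L) * L + (c + L) % L]; }

    int NeighborSum(int r, int c) const
    {
        return At(r - 1, c) + At(r + 1, c) + At(r, c - 1) + At(r, c + 1);
    }

    // Energy change if the spin at (r, c) were flipped, for H = -sum over bonds of s_i s_j.
    int EnergyDifForFlip(int r, int c) const { return 2 * At(r, c) * NeighborSum(r, c); }

    void Flip(int r, int c) { s[r * L + c] *= -1; }

    // Full recomputation, each bond counted once (right and down neighbors).
    int TotalEnergy() const
    {
        int e = 0;
        for (int r = 0; r < L; ++r)
            for (int c = 0; c < L; ++c)
                e -= At(r, c) * (At(r, c + 1) + At(r + 1, c));
        return e;
    }

    int TotalMagnetization() const
    {
        int m = 0;
        for (int v : s) m += v;
        return m;
    }
};
```

Starting from all spins up on a 4×4 lattice, the total energy is -32; flipping one spin costs 8, and adding that difference to the running value gives -24, the same as TotalEnergy() returns after the flip.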
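Getting the average energy and magnetization is straightforward; to get the specific heat and susceptibility a little bit more work is needed. For the specific heat:
$$C = \frac{\partial \langle E \rangle}{\partial T}$$
The average energy is given by:
$$\langle E \rangle = \frac{1}{Z} \sum_i E_i e^{-\beta E_i}, \qquad \beta = \frac{1}{k_B T}$$
Z is the partition function and plays an important role in statistical physics:
$$Z = \sum_i e^{-\beta E_i}$$
Substituting in the formula for the specific heat and calculating derivatives one gets, after some calculations which I’m too lazy to type:
$$C = \frac{\langle E^2 \rangle - \langle E \rangle^2}{k_B T^2}$$
It works in a similar manner for every pair of conjugate variables, in this case for the susceptibility one gets:
$$\chi = \frac{\langle M^2 \rangle - \langle M \rangle^2}{k_B T}$$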
By the way, in the code the Boltzmann constant is set to 1, the same goes for J. The reason why J appears explicitly in the code is to allow for further change of the code, to try out anti-ferromagnetic interaction, too.
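With both constants set to 1 on a square lattice, a single flip can change the energy only by -8, -4, 0, 4 or 8, which is what makes a small table of pre-calculated exponentials possible. The sketch below shows one way to build such a table, indexed directly by the non-negative energy difference; BuildExpTable is a hypothetical name and the container used for expMap in the project may be different.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Boltzmann factors exp(-dE / T) for the non-negative energy differences
// dE = 0, 4, 8 of a single flip (J = 1, k_B = 1, square lattice).
// The table is indexed by dE itself; the unused entries stay 0.
inline std::array<double, 9> BuildExpTable(double temperature)
{
    std::array<double, 9> table{};
    for (int dE = 0; dE <= 8; dE += 4)
        table[static_cast<std::size_t>(dE)] = std::exp(-static_cast<double>(dE) / temperature);
    return table;
}
```

The table has to be rebuilt whenever the temperature changes, which happens once per temperature step rather than once per flip attempt.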
Here is the code that gets the specific heat into the chart, the one for the magnetic susceptibility is quite similar:
    void CIsingMonteCarloDoc::SetupSpecificHeatChartData()
    {
        std::vector<std::pair<double, double>> data;
        data.reserve(statsList.size());

        for (auto &&stat : statsList)
        {
            // (<E^2> - <E>^2) / T^2 per spin; std::abs makes sure the floating point overload is used
            double val = std::abs(stat.AvgE2 - stat.AvgE * stat.AvgE) / (stat.Temperature * stat.Temperature * opt.latticeSize * opt.latticeSize);
            data.push_back(std::make_pair(stat.Temperature, val));
        }

        m_specificHeatChart.clear();
        m_specificHeatChart.AddDataSet(&data, (float)opt.chartLineThickness, opt.specificHeatColor);
    }
For some info on the spin block renormalization I simply refer you to Real Space Renormalization Group Techniques and Applications by Javier Rodriguez-Laguna^{4}.
I just want to mention here some details about the implementation: although I could implement a majority rule for 3×3 spin blocks, I chose to use 2×2 for visualization reasons. This way there is an ambiguity, because some blocks might have no magnetization. In such a case it is customary to use a random value; I chose to fall back on decimation instead, by just picking the upper-left corner value to decide. For the decimation alternative I used the same upper-left value, although sometimes the deciding spin is picked at random. The code can be easily changed to use any variant, feel free to experiment.
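The 2×2 majority rule with the upper-left tie-break fits in a few lines. This is a sketch and not the project’s code: BlockSpinStep is a hypothetical name, the lattice is assumed to be square with an even side and stored row by row as +1/-1 values.

```cpp
#include <cstddef>
#include <vector>

// One 2x2 majority-rule renormalization step on an L x L lattice (L even).
// A block with zero magnetization takes the value of its upper-left spin.
inline std::vector<int> BlockSpinStep(const std::vector<int>& spins, int L)
{
    const int half = L / 2;
    std::vector<int> coarse(static_cast<std::size_t>(half) * static_cast<std::size_t>(half));

    for (int r = 0; r < half; ++r) {
        for (int c = 0; c < half; ++c) {
            const int upperLeft = spins[(2 * r) * L + 2 * c];
            const int sum = upperLeft
                          + spins[(2 * r) * L + 2 * c + 1]
                          + spins[(2 * r + 1) * L + 2 * c]
                          + spins[(2 * r + 1) * L + 2 * c + 1];
            coarse[r * half + c] = sum > 0 ? 1 : (sum < 0 ? -1 : upperLeft);
        }
    }
    return coarse;
}
```

Replacing the tie-break with a random choice, or ignoring the sum and always returning upperLeft (pure decimation), only changes the line that assigns the coarse spin.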
Since the spin block renormalization group originated with the work of Leo Kadanoff, who sadly passed away last year, I’ll point to a recorded statistical mechanics lecture presented by Kadanoff at Perimeter Institute^{5}.
To get some interesting results for the renormalization, one should make calculations for quite a big spin matrix size and also have one of the temperatures at the critical temperature (which is a little bit higher than the theoretical one for an infinite size system). A state under the critical temperature will evolve during renormalization transform towards the T=0 fixed point, while one above the critical temperature will evolve towards the infinite temperature fixed point. Those are relatively easy to obtain. A state that’s at the critical temperature will exhibit a scale invariance, that is, it will look more or less similar no matter how much the system is ‘zoomed out’ (not exactly true for the finite system, whence the need of using a big matrix for visualization).
I’ll try to record a video later with a big matrix size as an example and post it here.
I had to stop without complicating the program too much, or else it could take a long time to have a program for the blog post. In the links I provided there is information about error estimation and improvement in statistics calculation.
A simple improvement to reduce the effect of autocorrelation would be to divide the results into blocks, average them for each block and use the averages for calculations (this is also called the binning method). The Computational Physics book^{2} treats it along with the jackknife and bootstrap methods.
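A minimal sketch of the binning method could look like the code below. BinnedEstimate and BinningEstimate are hypothetical names, and the bins are assumed to be longer than the autocorrelation time, otherwise the error is still underestimated.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct BinnedEstimate { double mean; double error; };

// Splits the samples into numBins consecutive blocks of equal length (leftover
// samples at the end are dropped), averages each block and estimates the
// standard error of the mean from the spread of the block averages.
inline BinnedEstimate BinningEstimate(const std::vector<double>& samples, std::size_t numBins)
{
    BinnedEstimate result{0.0, 0.0};
    if (numBins < 2 || samples.size() < numBins) return result;

    const std::size_t binSize = samples.size() / numBins;
    std::vector<double> binMeans(numBins, 0.0);
    for (std::size_t b = 0; b < numBins; ++b) {
        for (std::size_t i = 0; i < binSize; ++i)
            binMeans[b] += samples[b * binSize + i];
        binMeans[b] /= static_cast<double>(binSize);
    }

    for (double m : binMeans) result.mean += m;
    result.mean /= static_cast<double>(numBins);

    double variance = 0.0;
    for (double m : binMeans) variance += (m - result.mean) * (m - result.mean);
    variance /= static_cast<double>(numBins - 1);          // unbiased variance of the bin means

    result.error = std::sqrt(variance / static_cast<double>(numBins));
    return result;
}
```

Running it with increasing bin lengths and watching the error estimate level off is a common way to check that the bins are long enough.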
To avoid the critical slowing down close to the critical temperature, the heat bath method or a cluster algorithm like the Wolff algorithm could be used, too.
Parallelization can also be improved. I chose the easiest way to parallelize it: just run independent threads and gather statistics from all of them. One can divide the spin matrix into stripes or even blocks and run the algorithm in parallel on the same spin matrix (not with different spin matrices as in this project). The algorithm can be implemented with OpenCL or CUDA to run on the video card, for example.
Here are some charts I got by running with seven threads and with many sweeps (thousands) for equilibration and over 10000 for statistics, for a 128×128 spin matrix. The temperature step was 0.05.
Before presenting the classes, a little bit of a rant. I displayed the spins first with GDI+ calls – FillRectangle. I have a pretty decent monitor, so it’s quite a bit of resolution. With the application maximized, the achievable frame rate was very low. One would expect more from a dual Xeon workstation and a pretty decent video card (although with passive cooling). I ‘downgraded’ to simple GDI to learn that there is not much of a difference, which I expected, but I had to try… Then I switched to Direct2D, although I hate using it with MFC because print preview does not work with it (I did use it in the electric field lines project, though, just for fun). It appeared to be faster, but not much faster. I did not time it until that point, so I cannot say what the difference was, but it was pretty small. I turned back to plain old GDI and used a bitmap instead. More precisely, a memory device independent bitmap (DIB). You can get more speed by using a device dependent one (I did even that in the past) but it’s not worth it. The big speed jump (as in from 2 frames/second to 50 frames/second or more) was due to avoiding calls into the slow library and implementing the equivalent call in my own class. Probably using device dependent bitmaps would make it a little bit faster and using Direct2D even faster, but just a little bit. Not worth it.
So, here are the classes in the order of importance (sort of):
SpinMatrix – More or less similar to what I presented in the previous post, but with the addition of pre-calculating the exponential values and with renormalization code added. It can be initialized at ‘zero’ temperature or infinite temperature. It’s just the spin matrix with periodic boundary conditions, containing the implementation of a Metropolis Monte Carlo sweep on the spins.
MonteCarloThread – Has a SpinMatrix member called spins. In Calculate it runs a TemperatureStep for each temperature in the interval, starting with the lowest one. Before this loop it runs a ‘warmup’ loop. The TemperatureStep just runs several sweeps for equilibration, then several others for gathering statistics. The numbers are configurable, of course. One thread will pass the spin matrix to the main thread for displaying – in PassData; all threads will gather statistics in PassStats.
Statistics – Contains the accumulated statistics and has some operators overloaded and a CollectStats method. The usage is pretty straightforward.
CIsingMonteCarloDoc – The document class. Manages the threads, the data gathering and the spin matrices for display – both the ones that are displayed while the threads are running and the ones displayed for the spin block renormalization. It also handles the charts and chart data. It has a copy of Options because the options that are stored in the application object can change while the threads are running.
MemoryBitmap – The class that helps with displaying the spin matrix. Plain old GDI for drawing, nothing fancy.
CIsingMonteCarloView – The view. Has a timer, deals with displaying and printing. Also holds some MemoryBitmap members for displaying.
Chart – The charting class. I just copied it from the nrg project. It uses GDI+ for drawing.
Options – The options class. It has methods to save the options into the registry and load them from the registry.
COptionsPropertySheet, CIsingModelPropertyPage, CSimulationPropertyPage, CRenormalizationPropertyPage, CDisplayPropertyPage, CChartsPropertyPage – The property sheet class and the property page classes, respectively. Quite standard MFC implementations, nothing hard to understand.
CNumberEdit – The edit control for double/float values. I just copied it from the nrg project.
ComputationThread – The base class for MonteCarloThread.
CMainFrame – The main frame. Deals with menu commands.
CIsingMonteCarloApp – The application class. Has an Options member, loads the options at startup and also deals with GDI+ initialization.
CAboutDlg – What the name says.
This concludes for now the posts about the Monte Carlo methods, hopefully I’ll have more posts about them in the future, but I’ll probably switch to something else next post.
If you notice any issues, bugs, mistakes and so on, please let me know. As a warning, I did not test the magnetic field at all, so the code might be buggy with a non-zero magnetic field.
This is an intermediate post between the one on the Monte Carlo methods and one presenting a Monte Carlo C++ program I intend to write. My goal is to briefly expose the theory here – most of it with links – and provide a very easy JavaScript example for the Metropolis algorithm applied on the 2D Ising model.
The Ising model is one of the simplest models that have nontrivial behavior and it’s very important because of universality. For this post and the next one, I’ll consider a special case, the 2D Ising model on a square lattice. I even drop the position dependency of the magnetic field/coupling and the directional dependency of the interaction strength, so the Hamiltonian will be:
$$H = -J \sum_{\langle i, j \rangle} s_i s_j - B \sum_i s_i$$
J>0 for ferromagnetic interaction and J<0 for antiferromagnetic interaction; the sum over $\langle i, j \rangle$ runs over all adjacent pairs, $s_i = \pm 1$ are the spins and B is the magnetic field.
There isn’t much to say about it besides what’s already in the Wikipedia pages; it’s quite a simple model. The book^{1} I pointed out in the last post also contains quite a bit of information on it, so I won’t bother to present more theory here.
Please take a look into my last post for the theory, here it is applied on a 2D square-lattice Ising model with periodic boundary conditions:
It’s JavaScript code I quickly wrote just to illustrate the Metropolis algorithm. Here is the code:
    var monteCarlo = (function() {
        var canvas = document.getElementById("monteCarloCanvas");
        var ctx = canvas.getContext("2d");

        var isingSpins = {
            Size: 64,
            Temperature: 2.26918531421/0.95, // somewhere near the critical temperature
            spins: [], // the spin matrix
            displaySize: canvas.width / 64, // the size of the square for a spin

            // just to make the code more clear, a function to access the spin in the lattice
            Spin: function(row, col) {
                return this.spins[this.Size*row + col];
            },

            // adjusts the 'index', taking care of periodic boundary conditions
            Index: function(index) {
                return index < 0 ? index + this.Size : index % this.Size;
            },

            // this will also take care of the periodic boundary conditions, use this one instead of 'Spin' for neighbors
            Neighbor: function(row, col) {
                return this.Spin(this.Index(row), this.Index(col));
            },

            NeighborContribution: function(row, col) {
                return this.Neighbor(row - 1, col) + this.Neighbor(row + 1, col) + this.Neighbor(row, col - 1) + this.Neighbor(row, col + 1);
            },

            ExpMinusBetaE: function(E) {
                return Math.exp(-1. / this.Temperature * E);
            },

            EnergyDifForFlip: function(row, col) {
                return 2 * this.Spin(row, col) * this.NeighborContribution(row, col);
            },

            // initialize the matrix with random spins, this corresponds to infinite temperature
            Init: function() {
                var nr = this.Size * this.Size;
                for (var i = 0; i < nr; ++i)
                    this.spins.push(Math.random() < 0.5 ? -1 : 1);
            },

            // the sweep over the lattice
            Sweep: function() {
                var nr = this.Size * this.Size; // number of spins in the lattice

                // for all spins on the lattice
                for (var i = 0; i < nr; ++i) {
                    // pick a spin at random
                    var row = Math.floor(Math.random() * this.Size);
                    var col = Math.floor(Math.random() * this.Size);

                    // what is the change in energy if the spin flips?
                    var energyDif = this.EnergyDifForFlip(row, col);
                    if (energyDif < 0) // accept the spin flip
                        this.spins[this.Size*row+col] *= -1;
                    else {
                        if (Math.random() < this.ExpMinusBetaE(energyDif)) // accept the spin flip
                            this.spins[this.Size*row+col] *= -1;
                        // else reject it, that is, do nothing
                    }
                }
            },

            Display: function(ctx) {
                for (var i = 0; i < this.Size; ++i)
                    for (var j = 0; j < this.Size; ++j) {
                        if (this.Spin(i, j) < 0) ctx.fillStyle = "#FF0000";
                        else ctx.fillStyle = "#0000FF";
                        ctx.fillRect(i * this.displaySize, j * this.displaySize, this.displaySize, this.displaySize);
                    }
            }
        };

        isingSpins.Init();

        return function () {
            // call a monte carlo sweep
            isingSpins.Sweep();
            // then display
            isingSpins.Display(ctx);
        }
    })();

    setInterval(monteCarlo, 100);
The code is pretty straightforward, with the help of the comments and the last post it shouldn’t be hard to understand. I tried to make it easy to understand instead of optimizing it, for example one could pre-calculate the exponentials for energy difference for a spin flip at a specific temperature, but probably the code would not be so clear so I did not bother.
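For the curious, here is a rough sketch of how that pre-calculation could look. It assumes J = 1 and no magnetic field, as in the code above, and the names `makeExpTable` and `acceptFlip` are made up for this illustration. With four neighbors the energy change for a flip can only be -8, -4, 0, 4 or 8, so only two exponentials are ever needed at a given temperature.

```javascript
// Table of Boltzmann factors for the 2D square-lattice Ising model (J = 1, no field).
// Only dE = 4 and dE = 8 need exp(-dE / T); for dE <= 0 the flip is always accepted.
function makeExpTable(temperature) {
  const table = {};
  for (const dE of [4, 8]) table[dE] = Math.exp(-dE / temperature);
  return table;
}

// Metropolis acceptance using the table; 'rnd' is a uniform random number in [0, 1)
function acceptFlip(dE, table, rnd) {
  return dE <= 0 || rnd < table[dE];
}

const expTable = makeExpTable(2.26918531421 / 0.95);
console.log(expTable, acceptFlip(4, expTable, Math.random()));
```

The table has to be rebuilt whenever the temperature changes.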
This is just a quick post before having a C++ program implemented. It’s Sunday, so I didn’t want to spend too much time on it. Hopefully it can be useful even as it is.
If you find any mistakes, please point them out.
This is an important branch of computational physics and hopefully I’ll have several programs to post on this topic. I’m quite sure I’ll post at least one. I’ll write here some theoretical things briefly describing the methods, to be referenced later. But before going into that, I’ll point to a book on Computational Physics^{1} which treats among others, Monte Carlo simulations. The book is free, it’s worth a look, although the code is in Fortran (later edit: now it’s available with C++ code, too!).
Since I mentioned a book, I’ll point to another one which also treats many computational physics topics: Computational Physics by J. M. Thijssen. Of course there are many specialized books that treat one topic only, entering into lots of details, but for some generalities this one is among the best.
Here is how it works out for the very simple example of calculating $\pi$:
It’s JavaScript code I wrote just to have some code for this post; it should not be taken seriously. One could approximate pi using only a quarter of the circle, but I thought that displaying the whole circle looks nicer. Anyway, keep in mind that the random generator from JavaScript might be amazingly bad (depending on your browser). Here is the code:
    (function() {
        var piText = document.getElementById("piText");
        var canvas = document.getElementById("piCanvas");
        var ctx = canvas.getContext("2d");
        var radius = canvas.height / 2.;
        ctx.translate(radius, radius);

        var totalPoints = 0;
        var insidePoints = 0;

        function randomPoint() {
            var x = 2.*Math.random()-1.;
            var y = 2.*Math.random()-1.;
            var p = x*x + y*y;
            if (p < 1) {
                ctx.fillStyle = "#FF0000";
                ++insidePoints;
            }
            else ctx.fillStyle = "#444444";
            ++totalPoints;
            ctx.fillRect(radius*x-1,radius*y-1,2,2);

            if (totalPoints % 100 === 0) {
                var piApprox = insidePoints / totalPoints * 4.;
                piText.innerHTML = piApprox.toPrecision(3);
                if (piText.innerHTML == "3.14") {
                    totalPoints = 0;
                    insidePoints = 0;
                    ctx.clearRect(-radius, -radius, canvas.width, canvas.height);
                }
            }
        }

        setInterval(randomPoint, 10);
    })();
The code is pretty straightforward: it generates ‘random’ points inside the square and counts the hits inside the circle. Pi is approximated using the ratio between the hits and all the generated points. After hitting 3.14 it starts again.
As far as we can tell, there is true randomness in Nature. Even if that would not be the case, many aspects of Nature would look as if there is true randomness, for reasons I won’t detail yet. Hopefully I’ll have some posts about it in the future, though. In any case, (pseudo)random numbers are very important in computational physics, being the most important thing in the Monte Carlo algorithms. This cannot be stressed enough. If you use a pseudo random number generator that has some periodicity built-in that’s smaller than the ‘time’ period of the simulation, or if you think you use a ‘uniform’ distribution that is not really uniform or there are correlations in the generated numbers, you may have quite a surprise from the simulation.
Entire books were written on how to generate pseudo random numbers. It’s not only about having good ones, but it’s also about having fast ones, because in a Monte Carlo algorithm you might need a lot of such numbers. Anyway, I don’t want to write much about this subject here, Wikipedia has lots of info on it, plenty of links to follow if you are interested. I don’t want to enter into pseudo random number generating details, but I thought that here it must be at least mentioned. Again, it is very important! The first link at the bottom of the post^{1} contains quite a bit about pseudo random number generation, please check it out.
Let’s look into numerical integration and consider a simple integral:
$$I = \int_a^b f(x)\,dx$$
The simplest attempt to numerically solve it would be to consider a small but finite interval h instead of dx and transform the integral into a sum:
$$I \approx h \sum_{i=0}^{N-1} f(x_i)$$
where $h = (b-a)/N$ and $x_i = a + i h$. This is the rectangle method. Maybe you start to see a connection (using the fundamental theorem of calculus) with how I started to describe the numerical methods for solving differential equations in previous posts…
Of course, you can try to do better with higher-order rules (the trapezoidal rule or Simpson’s rule, for example). One could go into even more complex ones, but this section is only for introduction and comparisons.
The important thing to notice is that the accuracy of these methods is $O(h^k)$, that is, it is some power of the size of the step (grid size). The power k depends on the method. That’s not so important; the important thing is that the number of points needed for a given accuracy grows as a power given by the dimension of the space when you switch to a space with more than one dimension. For example, in 3D space you need $N^3$ points to keep the same grid size h as in the one-dimensional case with N points. We’re going to see how this compares with Monte Carlo integration.
A method of numerically calculating the integral is to use a random uniform distribution of points over the interval instead of equally spaced points (by the way, there are adaptive numerical methods that do not keep the h interval constant). It is quite obvious – using the Law of Large Numbers – that for a large number of samples the sum will approximate the integral.
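Using the variance, the error is estimated as proportional to $\sigma/\sqrt{N}$, where $\sigma^2$ is the variance of the sampled function values and N is the number of samples, so it decreases only as $1/\sqrt{N}$. In 1D it’s worse even than the worst numerical method presented above.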
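To compare the two approaches in 1D, here is a small sketch. The function names are invented for this illustration, the test integrand (sin on [0, π], exact value 2) is an arbitrary choice, and the random generator defaults to `Math.random`, with all the caveats mentioned above.

```javascript
// Rectangle rule: h times the sum of f at n equally spaced points.
function rectangleRule(f, a, b, n) {
  const h = (b - a) / n;
  let sum = 0;
  for (let i = 0; i < n; ++i) sum += f(a + i * h);
  return h * sum;
}

// Plain Monte Carlo: average f over n uniform random points, times the interval length.
// Also returns the error estimate (b - a) * sigma / sqrt(n).
function monteCarloIntegrate(f, a, b, n, rnd = Math.random) {
  let sum = 0, sumSq = 0;
  for (let i = 0; i < n; ++i) {
    const y = f(a + (b - a) * rnd());
    sum += y;
    sumSq += y * y;
  }
  const mean = sum / n;
  const variance = Math.max(0, sumSq / n - mean * mean);
  return { value: (b - a) * mean, error: (b - a) * Math.sqrt(variance / n) };
}

console.log(rectangleRule(Math.sin, 0, Math.PI, 1000));
console.log(monteCarloIntegrate(Math.sin, 0, Math.PI, 100000));
```

Even with a hundred times more points, the Monte Carlo estimate is expected to be noticeably less accurate here, which is the point made above for one dimension.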
Its power shows in the high dimensional spaces one meets very often. For example, in quantum mechanics the Hilbert space increases exponentially as one adds particles to the problem. Even in classical statistical physics, the dimensionality of the phase space is huge. The Monte Carlo integration error does not depend exponentially on the dimension of the space, as it does for the other methods, and that is its strength!
There is a problem with the above Monte Carlo integration: the error depends on the variance, too. It might be the case that you want to integrate a function that has significant values only in a small region (or in several). Using a uniform distribution will sample a lot of small values and might even miss the places where the function value is significant. Typically the number of possible states is huge and the number of samples taken into account in the Monte Carlo sum is very small compared with it, so it is very easy to miss the important ones. There are several ways of improving the method; here I’ll briefly present only one, importance sampling.
What can be done about it? Rewrite the integral:
$$I = \int_a^b f(x)\,dx = \int_a^b \frac{f(x)}{g(x)}\, g(x)\,dx = \int_a^b F(x)\, g(x)\,dx, \qquad F(x) = \frac{f(x)}{g(x)}$$
If you use a function g that is approximately like f (ideally proportional to f) you get significant values everywhere; ideally F is constant everywhere. g(x) acts like a weight. Since g will be used as a probability distribution, it’s also required that:
$$\int_a^b g(x)\,dx = 1$$
To solve it with the computer, you transform it into a sum:
$$I \approx \frac{b-a}{N} \sum_{i=1}^{N} F(x_i)\, g(x_i)$$
Note that this method works for sums with many terms, too, not only for integrals. It is also useful for non Monte Carlo integration methods: just use a variable step size w instead of h, which is what the adaptive methods do, for example.
The sum above for the Monte Carlo method still uses a uniform distribution of random numbers, but if one uses g(x) to generate the points instead of the uniform distribution, then, taking into account the Law of Large Numbers and the expected value, the integral is approximated by the plain average $I \approx \frac{1}{N}\sum_i f(x_i)/g(x_i)$, usually with a smaller variance. Now the function is sampled more in the regions where it has large values, the regions where it has low values being sampled less.
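Here is a toy sketch of the idea, with everything in it chosen only for illustration: the integrand is x^4 on [0, 1] (exact value 0.2), the weight is g(x) = 3x^2, which is normalized on [0, 1] and can be sampled by inverting its cumulative distribution, x = u^(1/3). The function and variable names are hypothetical.

```javascript
// Averages f(x) / g(x) over n points drawn from the distribution g.
// 'sampleG' must return a point distributed according to g, using the uniform generator 'rnd'.
function importanceSample(f, g, sampleG, n, rnd = Math.random) {
  let sum = 0, sumSq = 0;
  for (let i = 0; i < n; ++i) {
    const x = sampleG(rnd);
    const ratio = f(x) / g(x);
    sum += ratio;
    sumSq += ratio * ratio;
  }
  const mean = sum / n;
  return { value: mean, error: Math.sqrt(Math.max(0, sumSq / n - mean * mean) / n) };
}

const quartic = x => Math.pow(x, 4);
const weight = x => 3 * x * x;                 // normalized on [0, 1]
const sampleWeight = rnd => Math.cbrt(1 - rnd()); // inverse of the cumulative x^3; avoids x = 0

// uniform sampling is the special case g(x) = 1
console.log(importanceSample(quartic, x => 1, rnd => rnd(), 100000));
console.log(importanceSample(quartic, weight, sampleWeight, 100000));
```

With the same number of points, the second estimate should come with a clearly smaller error, because f/g varies much less than f itself.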
A Markov process is a process where evolution from the current state into the next one is dependent only on the current state, that is, it doesn’t depend on how the system got there (its history). For example, the usual random walk is a Markov chain, but the self avoiding random walk is not, because the possible next step depends on the system history.
Since I mentioned random walks, here is one implemented in JavaScript:
It starts in the center and restarts after the ‘particle’ reaches the boundary. Here is the source code:
    (function() {
        var canvas = document.getElementById("randomWalkCanvas");
        var ctx = canvas.getContext("2d");
        ctx.strokeStyle = "#000088";
        var dist = canvas.height / 2.;
        ctx.translate(dist, dist);

        var posX = 0;
        var posY = 0;

        function randomWalk() {
            var dir = Math.floor(Math.random()*2);
            var sense = Math.floor(Math.random()*2);

            ctx.beginPath();
            ctx.moveTo(posX,posY);
            if (dir == 0) posX += (sense ? 1 : -1)*4;
            else posY += (sense ? 1 : -1)*4;
            ctx.lineTo(posX,posY);
            ctx.stroke();

            if (Math.abs(posX) > dist || Math.abs(posY) > dist) {
                posX = 0;
                posY = 0;
                ctx.clearRect(-dist, -dist, canvas.width, canvas.height);
            }
        }

        setInterval(randomWalk, 50);
    })();
Now let’s particularize a little. We would like a Markov chain that samples the states in such a way that it helps us solve the integral (or sum) by importance sampling, that is, the chain should transition from one state to another with the probability needed for solving the sum, sampling the states the same way as the g distribution. To particularize further, in statistical physics the distribution that is often needed is the Boltzmann distribution:
$$p_i = \frac{e^{-\beta E_i}}{Z}$$
where $\beta = 1/(k_B T)$ and Z is the partition function, $Z = \sum_i e^{-\beta E_i}$, where the sum takes into account degeneracy, too, by summing over states, not energy levels. The expectation value for some macroscopic measurable value A is $\langle A \rangle = \sum_i A_i p_i = \frac{1}{Z}\sum_i A_i e^{-\beta E_i}$ and that’s the kind of sum we want to solve with the simulations. If we manage to use the Boltzmann distribution for importance sampling, the Boltzmann weights cancel, the normalization becomes the number of samples N, and the calculation simplifies to:
$$\langle A \rangle \approx \frac{1}{N} \sum_{k=1}^{N} A_k$$
where $A_k$ is the value of A in the k-th sampled state.
Very shortly, to obtain a way of getting the required distribution, one writes the master equation for the probability of being in a particular state and then requires that this probability is constant in time. This way one finds out that the probability of transitioning into a particular state during a period of time must be equal to the probability of transitioning out of it during the same period of time, that is, a balance equation. A particular solution that satisfies the equality is detailed balance:
$$p_a\, w_{a \to b} = p_b\, w_{b \to a}$$
a, b are two particular states, p is the probability of being in a particular state and w gives the transition rate between them. Rearranging a little, one finds:
$$\frac{w_{a \to b}}{w_{b \to a}} = \frac{p_b}{p_a}$$
This means that one can obtain a desired probability distribution just by having the proper transition probability between states in a Markov chain! The transition probability is then split in two, a choice probability c and an acceptance ratio $\alpha$, $w_{a \to b} = c_{a \to b}\,\alpha_{a \to b}$. The choice probability (also called a trial or selection probability) depends on the system being simulated and on the implementation, but we want one that gives an acceptance ratio as high as possible: we don’t want to keep choosing ‘next states’ that end up being rejected.
Before finishing this section, I want to stress the ergodicity (see also this page) of the Markov chain. We want the process to sample the state space correctly, we don’t want it to end in a loop or to not be able to reach the relevant states.
There are many methods that use a Markov chain for Monte Carlo simulations. For now I want to mention only one, the Metropolis-Hastings method, more precisely the Metropolis algorithm, which is the special case where the choice probability is symmetric, that is, the choice probability c of picking b if the current state is a is equal to the one of picking a from the state b, $c_{a \to b} = c_{b \to a}$. With $\alpha$ denoting the acceptance ratio, this gives:
$$\frac{\alpha_{a \to b}}{\alpha_{b \to a}} = \frac{p_b}{p_a}$$
There are many ways to choose the acceptance to obtain the desired ratio of probabilities; the Metropolis one is:
$$\alpha_{a \to b} = \min\left(1, \frac{p_b}{p_a}\right)$$
That is, if the probability of the state b is higher than the probability of state a, the state is always accepted, if not, the state b is accepted according to the ratio of probabilities. Even if ignoring all the previous discussion it is quite obvious how this favors high probability states but also samples the low probability ones.
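This can be checked numerically on a tiny example. The sketch below is only an illustration under some assumptions: a system with a handful of states, a target distribution picked by hand, and a symmetric choice probability that selects any of the other states with equal probability. The names `metropolisMatrix` and `evolve` are made up for it.

```javascript
// Builds the transition matrix w[a][b] = choice * acceptance for the Metropolis rule.
// Assumption: from state a, each of the other n - 1 states is proposed with probability 1 / (n - 1).
function metropolisMatrix(p) {
  const n = p.length;
  const w = [];
  for (let a = 0; a < n; ++a) {
    w.push(new Array(n).fill(0));
    let stay = 1;
    for (let b = 0; b < n; ++b) {
      if (b === a) continue;
      w[a][b] = Math.min(1, p[b] / p[a]) / (n - 1);
      stay -= w[a][b];
    }
    w[a][a] = stay; // rejected proposals leave the system in the same state
  }
  return w;
}

// Applies the transition matrix 'steps' times to a probability distribution.
function evolve(dist, w, steps) {
  let cur = dist.slice();
  for (let s = 0; s < steps; ++s) {
    const next = new Array(cur.length).fill(0);
    for (let a = 0; a < cur.length; ++a)
      for (let b = 0; b < cur.length; ++b) next[b] += cur[a] * w[a][b];
    cur = next;
  }
  return cur;
}

const target = [0.5, 0.3, 0.2];
const transition = metropolisMatrix(target);
// starting from a state that is certainly 'wrong', the chain relaxes to the target distribution
console.log(evolve([1, 0, 0], transition, 200));
```

The matrix satisfies detailed balance for every pair of states, and the evolved distribution approaches the target one, no matter where the chain starts.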
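Let’s consider a case that is expected to be met in many statistical physics calculations, a Boltzmann distribution. For this case the partition function cancels out and the ratio of probabilities turns into:
$$\frac{p_b}{p_a} = e^{-\beta (E_b - E_a)}$$
Very shortly, here is the algorithm, I’ll detail it when presenting an actual implementation:
1. Pick a candidate ‘next state’ b, using a symmetric choice probability.
2. Calculate the energy difference between the candidate state and the current state a.
3. If the energy difference is not positive, accept the candidate; otherwise accept it with probability $e^{-\beta (E_b - E_a)}$.
4. If accepted, the candidate becomes the current state; if not, the current state is kept. Use the current state for gathering statistics.
5. Repeat.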
Some brief implementation details: In many cases you don’t have to calculate again and again the energy or other values. Just notice that going from one state to another means an easily calculated change (often that’s the case) and use the difference to update the value. Also calculating some values involve exponentials. In many cases (for example when choosing a ‘next state’ involves some spin flip) the number of exponential values needed is limited. In such case it’s better to calculate them in advance and put them in a table to be reused, calculation of the exponentials all over again might be expensive.
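As a rough sketch of these points (not the implementation I'll present later), here is a Metropolis simulation of a 1D Ising ring with J = 1 and no field. The system, the function name and the parameters are all choices made for this illustration. The energy is updated from the difference for each accepted flip, and the only exponential needed is computed once.

```javascript
// Metropolis sampling of a 1D Ising ring (J = 1, no field, periodic boundary).
// Returns the average energy per spin over the measurement sweeps.
function metropolisIsingRing(nrSpins, temperature, equilSweeps, measureSweeps, rnd = Math.random) {
  const spins = new Array(nrSpins).fill(1);            // start from the T = 0 state
  const expTable = { 4: Math.exp(-4 / temperature) };  // dE is -4, 0 or 4; only 4 needs an exponential
  let energy = -nrSpins;                               // all spins aligned: -J for each of the nrSpins bonds
  let energySum = 0;
  for (let sweep = 0; sweep < equilSweeps + measureSweeps; ++sweep) {
    for (let i = 0; i < nrSpins; ++i) {
      const k = Math.floor(rnd() * nrSpins);           // symmetric choice: any spin, equal probability
      const left = spins[(k + nrSpins - 1) % nrSpins];
      const right = spins[(k + 1) % nrSpins];
      const dE = 2 * spins[k] * (left + right);        // energy change if spin k flips
      if (dE <= 0 || rnd() < expTable[dE]) {
        spins[k] = -spins[k];
        energy += dE;                                  // update instead of recomputing the energy
      }
    }
    if (sweep >= equilSweeps) energySum += energy;
  }
  return energySum / (measureSweeps * nrSpins);
}

// for a long ring the exact energy per spin is -tanh(1 / T)
console.log(metropolisIsingRing(64, 2, 500, 2000), -Math.tanh(0.5));
```

The 1D ring was picked only because it has a simple exact result to compare with; the structure of the loop is the same for other systems.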
I presented very briefly some things on the Monte Carlo methods, I’ll detail them further when presenting actual implementations.
Hopefully I’ll have more than one for this blog.
As usual, if you notice any mistakes, please let me know.
Finally here^{1} it is. A (relatively) simple program implementing the Numerical Renormalization Group. I tried to implement it as simply as possible, to make it easy to understand. Here is the program in action, running an Anderson model at half-filling:
I already used charts generated by the program in The Kondo Effect and Renormalization Groups posts. The first one shows the spectral function where you can notice the broadening of the quantum dot energy levels because of interaction with the leads and also the Kondo resonance, the next one shows the renormalization group flow for the same model.
I already gave some links about the subject in The Kondo Effect, Quantum Dots and Renormalization Groups posts; here I’ll give some more, including some with information on extensions of NRG.
Here are two lectures of Theo Costi: Numerical Renormalization Group for Quantum Impurities^{2} and Numerical renormalization group and multi-orbital Kondo physics^{3}.
A paper by Oliveira: The Numerical Renormalization Group and the Problem of Impurities in Metals^{4}.
A paper on DM-NRG (Density Matrix – NRG) by Hofstetter: Generalized Numerical Renormalization Group for Dynamical Quantities^{5}.
A paper on TD-NRG by Frithjof B. Anders and Avraham Schiller: Spin Precession and Real Time Dynamics in the Kondo Model: A Time-Dependent Numerical Renormalization-Group Study^{6}.
A paper on using non-Abelian symmetries to improve DM-NRG by A. I. Toth, C. P. Moca, O. Legeza and G. Zarand: Density matrix numerical renormalization group for non-Abelian symmetries^{7}.
Michael Sindel’s PhD thesis: Numerical Renormalization Group studies of Quantum Impurity Models in the Strong Coupling Limit^{8}.
Again, here^{1} is the link to the code presented in this post. It is only a toy program, just to illustrate the concepts.
Programs used in research are here: Flexible DM-NRG^{9} and here: NRG Ljubljana^{10}. I don’t know much about the latter, but by looking at the code I can tell that both are capable of using symmetries to speed up the calculations and improve accuracy, they work for a non-flat density of states, NRG can be used together with dynamical mean-field theory, they allow multiple channels, the spectral function can be calculated much better with DM-NRG (at finite temperature, too), they implement z-averaging, they can calculate the Green function from the spectral function (with a Hilbert transform) and more.
Obviously it would be quite hard to present here the whole theory, I’ll try to sketch the main ideas only. For details, please check out the links.
From now on, both in the theory presented and in the code a flat density of states is assumed in the conduction band and also a constant coupling between the impurity and the electronic bath (that is, not dependent on energy).
Approximately those are the steps of the derivation, please see the links for details:
In the following, the states for each site are in order: $|0\rangle$, $|\uparrow\rangle$, $|\downarrow\rangle$, $|\uparrow\downarrow\rangle$.
Here is a brief description of the algorithm together with relevant code:
startIteration = -1 means the ‘impurity’ alone, startIteration = 0 means the ‘impurity’ together with the first site from the Wilson chain. Here is the code for the Anderson model quantum dot:

    void QDAnderson::Init()
    {
        double B = theApp.options.B;
        double U = theApp.options.U;
        double eps = theApp.options.eps;
        double delta = theApp.options.delta;

        delta *= 1 / 2. * log(Lambda) * (Lambda + 1.) / (Lambda - 1.);
        t = sqrt(2. * delta / M_PI);

        hamiltonian.matrix = Eigen::MatrixXd::Zero(curMatrixSize, curMatrixSize);

        static const unsigned int ImpUp = 1;
        static const unsigned int ImpDown = 2;

        hamiltonian.matrix(ImpUp, ImpUp) = eps - 1./2. * B;
        hamiltonian.matrix(ImpDown, ImpDown) = eps + 1./2. * B;
        hamiltonian.matrix(ImpUp + ImpDown, ImpUp + ImpDown) = (2 * eps + U);

        // need this operator for the spectral function
        DUpOperator *up = new DUpOperator(curMatrixSize);
        up->matrix.adjointInPlace();
        spectralOperators.push_back(up);
    }

B is the magnetic field, U the Coulomb interaction energy, eps is the single-electron energy level (for electron-hole symmetry, -U/2). delta gives the hybridization. The up operator is the spectral operator for the spectral function.
    void NRGAlgorithm::Calculate()
    {
        bool stopped = false;

        AdjustForEnergyScale();

        // start with a diagonalized Hamiltonian
        // for the simple Anderson model in this program it already is in diagonal form
        // but for Kondo it might not be, unless it's diagonalized in Init
        // the same for the double quantum dot system
        hamiltonian.Diagonalize();

        Eigen::MatrixXd Ut = hamiltonian.eigenvectors();
        Eigen::MatrixXd U = Ut.adjoint();

        // put the operators in the same basis as H
        fUpOperator.matrix = U * fUpOperator.matrix * Ut;
        fDownOperator.matrix = U * fDownOperator.matrix * Ut;

        for (auto &op : staticOperators) op->matrix = U * op->matrix * Ut;
        for (auto &op : spectralOperators) op->matrix = U * op->matrix * Ut;

        // the iteration over the Wilson chain
        for (int iter = startIteration + 1; iter <= NrSteps; ++iter)
        {
            Step(iter);

            TRACE("Iteration number: %d\n", iter);

            if (controller && controller->ShouldCancel())
            {
                stopped = true;
                break;
            }
        }

        if (passData) passData->Finished(stopped ? nullptr : this);
    }
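Several things are done in each ‘step’; they are implemented in void NRGAlgorithm::Step(int iter).
First, the Hilbert space is enlarged with the states of the newly added site, then the Hamiltonian is set by rescaling the old one and adding the hopping term for the new site:
$$H_{N+1} = \sqrt{\Lambda}\, H_N + t_N \sum_\sigma \left( f^\dagger_{N,\sigma} f_{N+1,\sigma} + f^\dagger_{N+1,\sigma} f_{N,\sigma} \right)$$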
t is easy for a flat conduction band and constant coupling:
    double NRGAlgorithm::GetCouplingForIteration(int iter)
    {
        return (1. - pow(Lambda, -iter - 1.)) / sqrt((1. - pow(Lambda, -2 * iter - 1))*(1. - pow(Lambda, -2 * iter - 3)));
    }
If that’s not the case (for example, when using dynamical mean-field theory the band is adjusted self-consistently and it’s not constant), you will have to do the Lanczos tridiagonalization in the code, which is not that simple. You’ll also end up having on-site energies, so setting up the extended Hamiltonian will not be as simple as here.
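For the flat band case, the coupling formula is easy to explore numerically. Below is a JavaScript port of the expression from GetCouplingForIteration, written only for playing with the numbers; the name `wilsonCoupling` and the value Λ = 2 used in the loop are assumptions of this sketch.

```javascript
// Rescaled coupling along the Wilson chain for a flat band and constant coupling.
// Same expression as in the C++ code, with 'lambda' being the discretization parameter.
function wilsonCoupling(iter, lambda) {
  return (1 - Math.pow(lambda, -iter - 1)) /
    Math.sqrt((1 - Math.pow(lambda, -2 * iter - 1)) * (1 - Math.pow(lambda, -2 * iter - 3)));
}

for (let n = 0; n < 10; ++n) console.log(n, wilsonCoupling(n, 2));
```

The values approach 1 after a few sites, because the exponential decrease of the hopping along the chain is already absorbed in the rescaling of the Hamiltonian with the square root of Λ at each step.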
Here is the code that sets up the extended Hamiltonian in one step:
    // diagonal blocks: sqrt(Lambda) * Hamiltonian
    hamiltonian.matrix *= SqrtLambda;
    hamiltonian.Extend(); //now the size is 4 * curMatrixSize

    // off diagonal blocks
    Eigen::MatrixXd fUpOperatorTmatrix = fUpOperator.matrix.adjoint();
    Eigen::MatrixXd fDownOperatorTmatrix = fDownOperator.matrix.adjoint();

    //first 'row'
    hamiltonian.matrix.block(0, curMatrixSize, curMatrixSize, curMatrixSize) = t * fUpOperatorTmatrix;
    hamiltonian.matrix.block(0, 2 * curMatrixSize, curMatrixSize, curMatrixSize) = t * fDownOperatorTmatrix;

    // second 'row'
    hamiltonian.matrix.block(curMatrixSize, 0, curMatrixSize, curMatrixSize) = t * fUpOperator.matrix;
    hamiltonian.matrix.block(curMatrixSize, 3 * curMatrixSize, curMatrixSize, curMatrixSize) = t * fDownOperatorTmatrix;

    // third 'row'
    hamiltonian.matrix.block(2 * curMatrixSize, 0, curMatrixSize, curMatrixSize) = t * fDownOperator.matrix;
    hamiltonian.matrix.block(2 * curMatrixSize, 3 * curMatrixSize, curMatrixSize, curMatrixSize) = -t * fUpOperatorTmatrix;

    // last 'row'
    hamiltonian.matrix.block(3 * curMatrixSize, curMatrixSize, curMatrixSize, curMatrixSize) = t * fDownOperator.matrix;
    hamiltonian.matrix.block(3 * curMatrixSize, 2 * curMatrixSize, curMatrixSize, curMatrixSize) = -t * fUpOperator.matrix;

    int enlargedMatrixSize = 4 * curMatrixSize;
    int nextMatrixSize = min(enlargedMatrixSize, maxSize);
Don’t forget about the anti-commutation relations. A minus sign appears for states with one electron from the added site because of that.
Then the Hamiltonian is diagonalized and truncated:
    // diagonalize the hamiltonian
    // the eigenvalues and eigenvectors are already sorted
    // the eigenvectors are normalized
    // the diagonalization from eigen takes care of those
    // the SelfAdjointEigenSolver does that, for another solver sorting might need be done afterwards
    hamiltonian.Diagonalize();

    Eigen::VectorXd evals = hamiltonian.eigenvalues();
    Eigen::MatrixXd evecs = hamiltonian.eigenvectors();

    // transform the hamiltonian and the operators to the new truncated basis
    // switch the hamiltonian to the diagonalized one
    hamiltonian.matrix = hamiltonian.matrix.block(0, 0, nextMatrixSize, nextMatrixSize).eval();
Then the unitary transformation is applied to all operators, including the f operators needed for the addition of the next site. We need them all in the same basis as the Hamiltonian.
    // truncate the eigenbasis
    Eigen::MatrixXd Ut = evecs.block(0, 0, enlargedMatrixSize, nextMatrixSize);
    Eigen::MatrixXd U = Ut.adjoint();

    // the operators for the added site must be also in the new basis
    FUpOperator currentfUpOperator(enlargedMatrixSize);
    FDownOperator currentfDownOperator(enlargedMatrixSize);

    fUpOperator.matrix = U * currentfUpOperator.matrix * Ut;
    fDownOperator.matrix = U * currentfDownOperator.matrix * Ut;

    // now change the basis for the static and spectral operators, too
    for (auto &op : staticOperators)
    {
        op->Extend();
        op->matrix = U * op->matrix * Ut;
    }

    for (auto &op : spectralOperators)
    {
        op->Extend();
        op->matrix = U * op->matrix * Ut;
        op->PassSpectral(iter, Rescale * pow(Lambda, -(iter - 1.)/2.), evals);
    }

    // pass eigenvalues for the renormalization group flow chart
    if (passData) passData->PassEigenvalues(iter, evals, Rescale);
Then prepare for the next iteration and that’s about it:
    // change values for the next iteration
    curMatrixSize = nextMatrixSize;
    t = GetCouplingForIteration(iter);
I deliberately did not calculate all the easy things that can be computed, just to avoid unnecessary complications and keep the program simple. Again, for more information check the links. Even the spectral function could be calculated faster using a Fourier transform (FFT implementation) and convolution, but I guess the code is clearer as it is.
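The easiest thing to calculate is the partition function. Just stop the algorithm at a step corresponding to a particular temperature and calculate it; the spectrum is available:
$$Z = \sum_i e^{-\beta E_i}$$
where $E_i$ are the eigenvalues at that step and $\beta = 1/(k_B T)$.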
From here you can already extract quite a bit of information.
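By noticing that not only the spectrum is available, but also the eigenvectors $|i\rangle$, one can compute the density matrix (Z being the partition function and $E_i$ the eigenvalues):
$$\rho = \frac{1}{Z} \sum_i e^{-\beta E_i}\, |i\rangle\langle i|$$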
At this point I must emphasize again that this is a toy program and accessing this^{5} is recommended.
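Since the program can bring along the ‘static’ operators, extending them, changing the basis and truncating them, one can do even more by using an operator for an observable, O:
$$\langle O \rangle = \mathrm{Tr}(\rho\, O) = \frac{1}{Z} \sum_i e^{-\beta E_i}\, \langle i|O|i\rangle$$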
All the above are quite easy to calculate (not exactly if you want to use the complete set of eigenvectors and go with the reduced density matrix, for that see the link provided).
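A thermal average of an observable can be sketched as follows, assuming the operator was already transformed into the energy eigenbasis so that only its diagonal elements are needed. The function name `thermalAverage` and its parameters are hypothetical, they do not exist in the program.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of the project.
// energies[n] is E_n and diagonalElements[n] is <n|O|n> in the same basis.
// Returns sum_n exp(-beta E_n) <n|O|n> / Z.
double thermalAverage(const std::vector<double>& energies,
                      const std::vector<double>& diagonalElements,
                      double beta)
{
    assert(!energies.empty() && energies.size() == diagonalElements.size());

    // shifting by the lowest energy does not change the ratio, it only avoids overflow
    const double eMin = *std::min_element(energies.begin(), energies.end());

    double z = 0.;
    double sum = 0.;
    for (std::size_t n = 0; n < energies.size(); ++n)
    {
        const double weight = std::exp(-beta * (energies[n] - eMin));
        z += weight;
        sum += weight * diagonalElements[n];
    }

    return sum / z;
}
```

This uses only the states kept at the current iteration, so it shares the limitations mentioned above regarding the complete set of eigenvectors.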
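I did not calculate any of them, I chose to calculate the spectral function instead, in the T=0 limit. Very briefly, the spectral function for the d operator in the T=0 limit is, in the usual Lehmann representation with $|0\rangle$ the ground state:
$$A(\omega) = \sum_n |\langle n|d^\dagger|0\rangle|^2 \, \delta\left(\omega - (E_n - E_0)\right) + \sum_n |\langle n|d|0\rangle|^2 \, \delta\left(\omega + (E_n - E_0)\right)$$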
There is enough information about how to derive it and implement it in the links, the code is in the SpectralOperator class. The truncated spectrum is accumulated each even step, in SpectralOperator::PassSpectral, by using a weight for the overlapping interval, to avoid double counting. Before getting the spectral function into the chart, the discrete values are broadened in SpectralOperator::GetSpectrum() using a log Gaussian:
double SpectralOperator::LogGauss(double omega, double omegaN) const
{
    ASSERT((omega < 0 && omegaN < 0) || (omega > 0 && omegaN > 0));

    double lndif = log(abs(omega)) - log(abs(omegaN));

    return exp(-b2 / 4.) / (b * abs(omega) * sqrt(M_PI)) * exp(-lndif * lndif / b2);
}
A spectral function calculated this way does not respect the spectral sum rule; one can do much better with DM-NRG^{5}.
Having the spectral function allows one to calculate all sorts of things, like conductivity or spin susceptibility. If the spectral function is not enough, the whole Green function can be computed using a Hilbert transform.
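The broadening step can be sketched in isolation. The following is an illustration, not the project code: the `Peak` structure and the function names are hypothetical, and `b2` from the function quoted above is assumed to be `b * b`. This variant puts |omegaN| in the prefactor, which makes each broadened peak integrate to one over omega; the function quoted above uses |omega| there instead.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical type: a discrete spectral peak at 'position' with 'weight'.
struct Peak
{
    double position;
    double weight;
};

// Log-Gaussian kernel replacing delta(omega - omegaN).
// Assumption: the prefactor uses |omegaN|, so the kernel integrates to one.
double logGaussKernel(double omega, double omegaN, double b)
{
    assert(omega * omegaN > 0. && b > 0.);

    const double pi = std::acos(-1.);
    const double lndif = std::log(std::abs(omega)) - std::log(std::abs(omegaN));

    return std::exp(-b * b / 4.) / (b * std::abs(omegaN) * std::sqrt(pi))
         * std::exp(-lndif * lndif / (b * b));
}

// Sums the broadened peaks at frequency omega.
// Peaks with a sign different from omega do not contribute.
double broadenedSpectrum(const std::vector<Peak>& peaks, double omega, double b)
{
    double sum = 0.;
    for (const Peak& p : peaks)
        if (p.position * omega > 0.)
            sum += p.weight * logGaussKernel(omega, p.position, b);

    return sum;
}
```

A logarithmic kernel is a natural fit here because the discrete NRG energies are spaced logarithmically, so a fixed width in ln(omega) covers the gaps between peaks at all scales.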
I already gave some hints about more that could be implemented, above, but here are some, again:
Here is a short description of the program, it will be also available on GitHub^{1}.
Everything related to NRG is either in the NRG namespace or has a class name starting with NRG. There are three kinds of classes in the NRG implementation: those that pass data around and adjust and control the running algorithm, the operators, derived from the Operator abstract class, and the NRG algorithms, derived from the abstract class NRGAlgorithm.
The rest of the program is very simple, just an interface to the NRG. It allows starting/stopping the computation, some configuration settings and it displays the charts. That’s about it.
Besides MFC and other typical VC++ runtime libraries, the program uses GDI+ for drawing.
The program deals with matrices using the Eigen^{11} library.
The NRG Namespace:
ControllerInterface and ResultsRetrieverInterface are interfaces: classes derived from them can, respectively, cancel the calculation and retrieve its results.
The operators are derived from the Operator class. Operator::Extend() extends the operator by adding new states for the new Wilson site. The added states are in the most significant bits position. The changeSign member allows extending the operator matrix for a fermionic operator type (if true) or a bosonic operator type (if false). The minus sign there is due to anticommutation. Classes derived from it are: Hamiltonian, the hopping operators FUpOperator and FDownOperator, and the spectral operator, SpectralOperator. The latter is a regular operator with some added methods that allow calculating the spectral function for the operator. DUpOperator is the spectral operator that is used for generating the spectral function for the Anderson and two quantum dots models.
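To make the extension step more concrete, here is a sketch of how an operator matrix could be extended by a four-state site (empty, up, down, doubly occupied) stored in the most significant bits. It is not the actual Operator::Extend() implementation: the `Matrix` alias, the function name and in particular the sign convention (a minus sign for an odd number of electrons on the new site) are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical sketch, not the project code.
// The old n x n matrix is copied into the four diagonal blocks of a 4n x 4n matrix,
// block index = state of the new site (0 empty, 1 up, 2 down, 3 doubly occupied).
// With changeSign set, blocks with an odd number of electrons on the new site
// get a minus sign (assumed convention for fermionic operators).
Matrix extendOperator(const Matrix& op, bool changeSign)
{
    const std::size_t n = op.size();
    for (const auto& row : op)
        assert(row.size() == n); // the operator matrix must be square

    const double signs[4] = { 1., -1., -1., 1. };

    Matrix extended(4 * n, std::vector<double>(4 * n, 0.));

    for (std::size_t site = 0; site < 4; ++site)
    {
        const double sign = changeSign ? signs[site] : 1.;
        for (std::size_t i = 0; i < n; ++i)
            for (std::size_t j = 0; j < n; ++j)
                extended[site * n + i][site * n + j] = sign * op[i][j];
    }

    return extended;
}
```

With this layout a 4 x 4 single-dot matrix becomes 16 x 16, and the block belonging to new-site state k starts at row and column 4k, which matches the way the ImpUp2 and ImpDown2 offsets are built with bit shifts in the two quantum dots code.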
The NRGAlgorithm class implements the NRG. Three examples are derived from this class: QDAnderson, a quantum dot with the Anderson model, QDKondo, a quantum dot with the Kondo model, and TwoQDRKKY, two quantum dots coupled by a spin-spin interaction, only one being coupled to the leads. The latter should be considered only qualitatively; for better precision one should use symmetries in the calculation. Anyway, it’s enough to show the split of the Kondo resonance due to the two-stage Kondo effect.
NRGComputationThread is the class that implements the computation thread for NRG; the calculations run in a different thread to avoid locking the UI.
NRGController is derived from NRG::ControllerInterface and allows cancelling computations (the thread checks it each computation step).
NRGResultsData is derived from NRG::ResultsRetrieverInterface; it passes the results to the main thread and allows it to check whether the computation is finished.
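The cancellation pattern described above, a flag set from the UI thread and checked by the worker at each step, can be sketched with the standard library alone. The class `CancelFlag` and the function `runIterations` are hypothetical stand-ins, not the NRGController or NRGComputationThread code.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a controller: one thread calls Cancel(),
// the computation thread polls ShouldCancel().
class CancelFlag
{
public:
    void Cancel() { cancelled.store(true); }
    bool ShouldCancel() const { return cancelled.load(); }

private:
    std::atomic<bool> cancelled{false};
};

// Runs at most maxSteps iterations, checking the flag before each one.
// Returns the number of iterations actually performed.
int runIterations(int maxSteps, const CancelFlag& controller)
{
    assert(maxSteps >= 0);

    int done = 0;
    for (int iter = 0; iter < maxSteps; ++iter)
    {
        if (controller.ShouldCancel()) break;

        // ... one step of the algorithm would go here ...
        ++done;
    }

    return done;
}
```

Using an atomic flag keeps the check cheap and avoids a data race between the thread that requests the cancellation and the one that polls it.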
The options are implemented by Options and they are saved to and loaded from the registry. The options UI is implemented by COptionsPropertySheet, CNRGPropertyPage, CParametersPropertyPage and CChartsPropertyPage.
The charts are implemented by the Chart class. It’s pretty messy and far from perfect, I might improve it in the future. It uses GDI+ for drawing.
CAboutBox needs no explanation.
CMainFrame is the main frame window, it implements/routes commands.
CnrgApp is the application class. There aren’t many changes in there except initializing and shutting down GDI+, setting the registry key and loading the options from the registry.
CnrgDoc is the ‘document’ class. It contains the computation thread, the thread controller and the computation data objects. It also contains the chart objects. The most important member is CnrgDoc::StartComputation(), the others are pretty straightforward.
CnrgView is the ‘view’ class. It has some changes compared with the class generated by the App Wizard, related to drawing/printing. There is a timer implemented there which allows checking whether the computation has finished and updating the charts. There is also some handling of the cursor, making it a ‘wait’ cursor during calculations.
CNumberEdit implements an edit box for double and float values. By setting allowNegative one can control whether negative numbers can be entered.
ComputationThread is the base class for the NRG thread. There is not much in there, just starting the thread.
Besides the simple Anderson and Kondo models I already mentioned, for which I already supplied some results from the code, here are some more results for slightly more complex situations.
As a slightly more complex example I implemented the class TwoQDRKKY (RKKY comes from the Ruderman-Kittel-Kasuya-Yosida interaction). It has a Hamiltonian set up for two quantum dots coupled by an antiferromagnetic spin-spin interaction, only one being connected to the leads. Here is the code that sets the Hamiltonian, the comments in the code should be enough for understanding it:
void TwoQDRKKY::Init()
{
    double B = theApp.options.B;
    double J = theApp.options.J;
    double U = theApp.options.U;
    double eps = theApp.options.eps;
    double delta = theApp.options.delta;

    delta *= 1 / 2. * log(Lambda) * (Lambda + 1.) / (Lambda - 1.);
    t = sqrt(2. * delta / M_PI);

    hamiltonian.matrix = Eigen::MatrixXd::Zero(curMatrixSize, curMatrixSize);

    Hamiltonian H;
    H.matrix = Eigen::MatrixXd::Zero(4, 4);

    // first quantum dot
    static const unsigned int ImpUp = 1;
    static const unsigned int ImpDown = 2;
    static const unsigned int ImpUpDown = ImpUp + ImpDown;

    H.matrix(ImpUp, ImpUp) = eps - 1. / 2. * B;
    H.matrix(ImpDown, ImpDown) = eps + 1. / 2. * B;
    H.matrix(ImpUpDown, ImpUpDown) = (2 * eps + U);

    H.Extend();

    // add the second quantum dot
    static const unsigned int ImpUp2 = (1 << 2);
    static const unsigned int ImpDown2 = (2 << 2);
    static const unsigned int ImpUpDown2 = ImpUp2 + ImpDown2;

    Eigen::Matrix4d I = Eigen::Matrix4d::Identity();

    H.matrix.block(ImpUp2, ImpUp2, 4, 4) += (eps - 1. / 2. * B) * I;
    H.matrix.block(ImpDown2, ImpDown2, 4, 4) += (eps + 1. / 2. * B) * I;
    H.matrix.block(ImpUpDown2, ImpUpDown2, 4, 4) += (2 * eps + U) * I;

    // there are two quantum dots here, but they are not coupled yet
    // an easy check is to run the program without the coupling that follows and get the same results
    // as for a single quantum dot, since the other one is decoupled

    // the coupling terms:

    // on diagonal values
    H.matrix(ImpUp2 + ImpUp, ImpUp2 + ImpUp) += 1. / 4. * J;
    H.matrix(ImpDown2 + ImpDown, ImpDown2 + ImpDown) += 1. / 4. * J;
    H.matrix(ImpUp2 + ImpDown, ImpUp2 + ImpDown) -= 1. / 4. * J;
    H.matrix(ImpDown2 + ImpUp, ImpDown2 + ImpUp) -= 1. / 4. * J;

    // off diagonal values, S^+ * s^- and S^- * s^+ with the 1/2 factor
    H.matrix(ImpUp2 + ImpDown, ImpDown2 + ImpUp) = 1. / 2. * J;
    H.matrix(ImpDown2 + ImpUp, ImpUp2 + ImpDown) = 1. / 2. * J;

    hamiltonian.matrix = H.matrix;

    // need this operator for the spectral function
    DUpOperator *up = new DUpOperator(curMatrixSize);
    up->matrix.adjointInPlace();
    spectralOperators.push_back(up);
}
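The coupling terms set in the code above can be checked by hand. In the subspace with one electron of each spin orientation (one dot up, the other down, and the reverse), the coupling alone is the 2 x 2 block with -J/4 on the diagonal and J/2 off the diagonal. The sketch below, with hypothetical names, diagonalizes that block: it gives -3J/4 for the singlet and J/4 for the triplet, the latter equal to the diagonal value set for the two parallel spin states.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper for illustration, not part of the project.
struct TwoSpinLevels
{
    double singlet;
    double triplet;
};

// Eigenvalues of the coupling block [[-J/4, J/2], [J/2, -J/4]].
// A symmetric matrix [[a, b], [b, a]] has eigenvalues a - b, for the
// antisymmetric combination (the singlet), and a + b, for the symmetric one.
TwoSpinLevels twoSpinLevels(double J)
{
    assert(std::isfinite(J));

    const double diagonal = -J / 4.;
    const double offDiagonal = J / 2.;

    return { diagonal - offDiagonal, diagonal + offDiagonal };
}
```

For a positive (antiferromagnetic) J the singlet is lower by J, which sets the energy scale for the locking of the two dots into a singlet.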
Here is a spectral function I got:
And here are two links to papers dealing with such a setup: Strongly correlated regimes in a double quantum-dot device^{12} and Two-stage Kondo effect in side-coupled quantum dots: Renormalized perturbative scaling theory and Numerical Renormalization Group analysis^{13}.
Notice how the Kondo resonance is split (for a larger coupling between the quantum dots it is completely destroyed). As the temperature is lowered, at the Kondo temperature the electron from the quantum dot is screened by the electrons in the leads, the electron being ‘locked’ in a singlet state. This prevents inelastic scattering. At higher temperatures this cannot happen because the thermal energy is high enough to overcome the coupling strength. This is the usual Kondo effect already presented.
When the first quantum dot is screened by the electrons in the leads, together they present an effective fermionic bath to the second quantum dot. As it is also coupled by an antiferromagnetic interaction, as the temperature is lowered towards an energy equal to the coupling strength, the two quantum dots will lock into a singlet. Electrons from the leads need to break the coupling between the two quantum dots in order to pass through and since they do not have the thermal energy to do it anymore, the conductance drops. This is the second stage Kondo effect. Please check the links for the details.
As a warning, this setup is quite complex and it would need symmetries to avoid numerical errors. Without symmetries one would need to use a big matrix and numerical errors start to kick in due to diagonalization. As a consequence it’s quite hard to get a symmetrical spectral function. Here are the parameters I used for the chart: number of kept states 250 (this is very low!), lambda 2.5, 30 iterations, U=2.82, epsilon=-U/2, delta=U/16, J=0.02, no magnetic field. For the spectral function, b=0.6, the step was 0.001.
Now let’s go back to the Anderson model and apply a magnetic field. A weak one: for illustration purposes I want the Zeeman energy to be quite a bit lower than the Kondo temperature.
Here is the spectral function for the spin up operator:
It’s easy to guess what the ‘down’ spectral function looks like. Here is the spectral function for both spin up and down operators:
From this one it is easy to figure out the conductance. As in the previous case the Kondo peak splits: when the temperature is low enough, electrons from the leads are not able to break the coupling with the magnetic field and the conductance drops.
For this spectral function I had to temporarily change the QDAnderson::Init() implementation, here is the relevant code:
// need this operator for the spectral function
DUpOperator *up = new DUpOperator(curMatrixSize);
up->matrix.adjointInPlace();

FDownOperator down(curMatrixSize);
down.matrix.adjointInPlace();

up->matrix += down.matrix;

spectralOperators.push_back(up);
The usage of FDownOperator might be confusing, but DUpOperator has the same implementation as FUpOperator except that it has the code for calculating the spectrum. The same would go for a DDownOperator; I did not bother to implement it since I could use FDownOperator.
For more information on this please see this Nature article (picked at random): Temperature and magnetic field dependence of a Kondo system in the weak coupling regime^{14}.
With this post I end for now the set of posts on this topic, although I intend to implement at least another renormalization group program, a DMRG one. But until then I’ll probably have some other posts on other topics, possibly on the Monte Carlo methods.
As usual, if you notice any mistakes/issues, please let me know, I’ll try to fix them.
This is another introductory post leading towards a numerical renormalization group program. I thought I should expose some generalities before presenting the numerical renormalization group method and the program that implements it. It’s mostly a collection of links rather than a detailed description.
I already supplied a link at the end of the post about the Kondo Effect but I promised more of them. Here are some of them (more will follow in the next post), starting with one of the most important, The renormalization group: Critical phenomena and the Kondo problem^{1}. It’s not only about the numerical renormalization group, it’s more general than that. Wilson got a Nobel prize for that work, by the way.
A hint of renormalization^{2} is a general description of renormalization groups. It’s not only about them in condensed matter or statistical physics but also about the renormalization in particle physics, where the idea of renormalization originated. Renormalization groups might be easier to visualize in real space than in the dual space so here is a thesis about Real Space Renormalization Group Techniques and Applications^{3}, with quite a bit of information about the Density matrix renormalization group. Since I mentioned it, here is another one originating from the same time period, 1990s: Functional renormalization group.
Because at the top of the post I added an image of the numerical renormalization group flow to be used as an example, I also point to a review paper about the numerical renormalization group: The numerical renormalization group method for quantum impurity systems^{4}. In there you’ll find a similar chart but with different parameters.
Here is another review: Diagonalization- and Numerical Renormalization-Group-Based Methods for Interacting Quantum Systems^{5}.
Let’s suppose we have some function that describes a system. For example it might be a Hamiltonian or the action or the partition function. Let’s suppose it’s a Hamiltonian H having a set of parameters K (most of them couplings but it could be also mass, for example). A renormalization group transform is a mapping along with a change of scale. The change of scale need not be in real space, it can be a change of scale in momentum space or a change in the energy scale.
It’s a mapping that preserves the Hamiltonian form unchanged but now the coupling constants are not constant anymore. The rest is details. Quite complex details, in many cases, though. To see some details you could visit the links I provided.
The mapping can be written in a simpler form that emphasizes the change of the parameters: $K' = R(K)$.
The set of parameters of the Hamiltonian is a point in the parameter space. The above transformation applied in a sequence generates a trajectory in the parameter space. That’s the ‘renormalization group flow’. By the way, a renormalization group might not be a group, it might not admit an inverse.
A point in the parameter space where there is an invariance is a fixed point, $K^*$. For a point close to a fixed point, the mapping can be linearized: $\delta K' = L\,\delta K$, where $\delta K = K - K^*$ is the deviation from the fixed point. Assuming that the linear operator $L$ has a complete set of eigenvectors, one can use it to write $\delta K = \sum_i c_i v_i$, where $v_i$ are the eigenvectors. A transformation applied n times gives:
$$\delta K^{(n)} = \sum_i c_i \lambda_i^n v_i$$
where $\lambda_i$ are the eigenvalues of the linear operator.
For $|\lambda_i| < 1$ the corresponding parameters get smaller and smaller, they attract the trajectory towards the fixed point. Those are the irrelevant parameters.
For $|\lambda_i| > 1$ the corresponding parameters get bigger, the trajectory is driven away from the fixed point. Those parameters are the relevant parameters.
The ones that correspond to eigenvalues equal to one are the marginal ones. Because of the nonlinearity they might end up driving the trajectory either away from the fixed point or towards it.
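The classification of the directions around a fixed point can be turned into a few lines of code. This is only a sketch of the linearized picture, with hypothetical names: it takes the eigenvalues of the linearized map as given and follows the expansion coefficients of the deviation along the eigenvectors, each being multiplied by its eigenvalue at every step.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

enum class Direction { Irrelevant, Marginal, Relevant };

// Classifies a direction by the magnitude of its eigenvalue.
// The tolerance used to decide what counts as 'equal to one' is an assumption.
Direction classify(double eigenvalue, double tolerance = 1e-12)
{
    const double magnitude = std::abs(eigenvalue);

    if (std::abs(magnitude - 1.) <= tolerance) return Direction::Marginal;

    return magnitude < 1. ? Direction::Irrelevant : Direction::Relevant;
}

// Coefficients of the deviation from the fixed point after n applications
// of the linearized transformation: c_i * lambda_i^n.
std::vector<double> flowCoefficients(const std::vector<double>& c,
                                     const std::vector<double>& lambda,
                                     int n)
{
    assert(c.size() == lambda.size() && n >= 0);

    std::vector<double> result(c.size());
    for (std::size_t i = 0; i < c.size(); ++i)
        result[i] = c[i] * std::pow(lambda[i], n);

    return result;
}
```

Starting from equal coefficients along an irrelevant direction (eigenvalue 0.5) and a relevant one (eigenvalue 2), three steps already give a factor of 64 between them. The linear picture says nothing about the marginal directions, which are decided by the nonlinear terms.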
The image at the top of the post is a numerical renormalization group flow for the Anderson model at half filling. In the chart there is the energy spectrum, scaled along the renormalization group flow.
On the horizontal axis is the step of the renormalization group, on the vertical axis, the energy. Again, the energy is scaled each step (only even steps are charted).
The fixed points are quite visible in the chart. There is more to it, if one uses the program^{6} to chart a Kondo model with proper parameters one can get a quite similar chart, except the beginning part. The universality manifests itself, the model differences do not matter at low energies (corresponding to low temperatures).
So, the regimes are, from left to right, that is, from high energies to low energies, corresponding to high respectively low temperatures:
* Free Orbital regime.
* Local Moment fixed point.
* Strong Coupling fixed point.
By the way, the flow to a single stable fixed point as in this case is characteristic for systems with a single ground state. There are systems with more than one ground state. In such cases the flow can go into either fixed point, depending on where it starts. One can get quantum phase transitions for such a system.
This is the last ‘theory’ post before presenting the program^{6} for the numerical renormalization group.
I did not write much here, the purpose was just to provide more links to interesting papers. I don’t think I’ll give full explanations in the next post, either, it would be way too much for the purpose of this blog.
This post is a continuation of the one on the Kondo effect^{1} and is leading towards one presenting the numerical renormalization group and a simple program that implements it. I won’t insist much on details about the quantum dots, but I thought I should mention them because the numerical method is widely used to study them.
As a simple definition, a quantum dot is a small device that confines electrons/holes in a space small enough to exhibit quantum behavior. Typically semiconductors are used because compared with metals, the electrons inside have a quite low Fermi velocity, that is, the De Broglie wavelength for the Fermi surface electrons is bigger than in metals, which makes the quantum dots exhibit quantum behavior at sizes bigger than for the metal case.
They can be used for a lot of applications, ranging from measuring devices, tunable light emitters, spintronic devices and so on up to hopefully quantum computing.
There are many ways of obtaining quantum dots, I’ll mention here only three of them. The main idea is to isolate out a small region of space using potential barriers. The ‘region of space’ does not need to be a 3D region.
One can use a 2D electron gas to make a lateral quantum dot, by using gates to confine the electron gas into a quantum dot.
They can use deposition of doped semiconducting layers, lithography and etching to isolate out a 3D quantum dot having the source at the bottom and the drain on top, being surrounded by gates that allow tuning it.
The actual construction methods do not matter much for our purpose. What matters is that the dots can be interconnected and various parameters can be tuned.
To illustrate the quantum dot properties I drew a picture:
Right in the middle there is the potential well that confines the electrons. I represented the energy levels not as straight lines but as having a width because, unlike an electron isolated in an ideal box as in the particle in a box case, the potential well is finite and, moreover, there are potential barriers with a finite width between the quantum dot and the leads (marked L and R here, from ‘left’ and ‘right’) allowing electrons to tunnel in and out. This interaction with the leads broadens the single-electron levels so they become continuous bands. As an alternate picture, because of the tunneling the electron lifetime on a level is limited, which means an indeterminacy in energy. The stronger the coupling with the leads is, the smaller the lifetime and the bigger the broadening of the energy level.
The particle in a box model is already quite good to illustrate how the spacing of the energy levels, marked in the figure, can be tuned by adjusting the size and shape of the potential well. In many setups there are several gates that allow doing that. For the particle in a box case, the spacing scales as $1/L^2$, where $L$ is the size of the box.
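To get a feeling for the numbers, here is a small sketch for the one-dimensional infinite well, where the levels are $E_n = n^2 \pi^2 \hbar^2 / (2 m L^2)$. The one-dimensional box, the function names and the effective mass ratio parameter are assumptions made for illustration; a real dot is neither one-dimensional nor an infinite well.

```cpp
#include <cassert>
#include <cmath>

// Energy of level n (n = 1, 2, ...) of a 1D infinite well of width lengthMeters, in eV.
// effectiveMassRatio is the carrier mass in units of the free electron mass.
double boxLevelEnergyEV(int n, double lengthMeters, double effectiveMassRatio = 1.)
{
    assert(n >= 1 && lengthMeters > 0. && effectiveMassRatio > 0.);

    const double pi = std::acos(-1.);
    const double hbar = 1.054571817e-34;          // J s
    const double electronMass = 9.1093837015e-31; // kg
    const double electronVolt = 1.602176634e-19;  // J

    const double mass = effectiveMassRatio * electronMass;

    return n * n * pi * pi * hbar * hbar
         / (2. * mass * lengthMeters * lengthMeters) / electronVolt;
}

// Spacing between level n + 1 and level n, in eV.
double boxLevelSpacingEV(int n, double lengthMeters, double effectiveMassRatio = 1.)
{
    return boxLevelEnergyEV(n + 1, lengthMeters, effectiveMassRatio)
         - boxLevelEnergyEV(n, lengthMeters, effectiveMassRatio);
}
```

For a free electron in a 1 nm box the lowest level is about 0.376 eV, and doubling the size of the box reduces every spacing four times. A small effective mass, as in many semiconductors, increases the spacing for the same size.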
But that’s not the only thing that is related with the size. The Coulomb interaction, U, also depends on the size. It takes more energy to fit a second electron into a smaller box than into a big one. This big energy makes the quantum dot exhibit the Coulomb blockade.
In this picture, the potential bias between the left and right leads is in the zero bias limit. By applying a potential difference between them, one can open a window large enough to fit several energy levels in it. The conductance is proportional to the number of available channels, in the ideal case one conductance quantum per channel. The gate potential can be used to move the energy levels inside the dot up and down (basically by pushing the bottom of the potential well up and down). As they enter or leave the window formed by the potential difference between the leads, the conductance changes. As the bias is lowered towards zero, only one level can fit into the window and by varying the gate potential one obtains peaks in the differential conductance that allow measuring the distance between the levels.
Last but not least, the thermal energy scale must be mentioned. It must be lower than the other important energy scales in the problem. If the thermal energy is large enough, an electron from a lead would gain enough energy from the thermal excitations to ‘jump’ into a level that is high (even higher than the charging energy) and then go into the other lead, thus giving an electric current even with no energy levels in the potential difference window. Also an electron from the quantum dot can gain enough energy to jump out. The quantum effects are washed out.
From this description it might not appear that they have such interesting properties. Indeed they exhibit quantum mechanical effects similar to those of an atom, which is why they are also called artificial atoms. They have orbitals inside and the electrons that fit inside respect Hund’s rules and so on. Well, they indeed can be tuned, which maybe is harder to do with an atom, but there is more. In the description above I did not say much about the leads. They can be of various sorts, from semiconductors to ferromagnetic metals to superconductors (left and right of different types, too) and that adds to the interesting effects that can be obtained.
But even without those complexities, the interaction with the leads has something which is hidden in the above description: the Kondo effect^{1}. If the levels inside the quantum dot that are lower than the electrochemical potentials of the leads, marked in the figure, are all filled except one which is occupied by a single electron only, the quantum dot can scatter electrons at the Fermi level in a similar manner as the magnetic impurity in a metal described in the post about the Kondo effect. Unlike the metallic host case, where the impurity can scatter in any direction, the quantum dot can scatter either back into the lead the electron was coming from, or forward into the other lead. This forward scattering leads to an enhanced conductance. And there is even more: one can get the Kondo effect with an even number of electrons in the dot, by applying a magnetic field to have the electrons in a triplet state. More on quantum dots and the Kondo effect, here^{2}.
The above description was for a single quantum dot, but they can be connected – with connections that can be adjusted, too – obtaining artificial molecules. Now there are even more possibilities. Just to point to one of them: Loss–DiVincenzo quantum computer^{3}. Even the Kondo effect gets more complexities, they study for example^{4} two-stage Kondo effects. I bet you guessed by now, there is also a three-stage Kondo effect and in general, a multi-stage one^{5}.
I’ll post more links to info on both the Kondo effect and quantum dots when I post about the numerical renormalization group; until then I’ll point again to Michael Sindel’s dissertation^{6}.
I briefly presented quantum dots because they can be studied with the numerical renormalization group. I’ll get to it soon and maybe even present a configuration of two connected quantum dots.
The next post will probably be about renormalization groups, though.
I’ve decided to post several pages about theory before presenting a program, to avoid having long posts that nobody has patience to read. I’ll refer to those posts for details when presenting a program. This post starts a series of posts that will lead to a Numerical Renormalization Group program. A relatively simple one, allowing one to understand it without spending insane amounts of time on it (get Flexible DM-NRG and try to understand it to see what I mean). The serious ones can take years to develop and are not so easy to understand, you end up not seeing the forest because of the trees. There are so many details that you can be overwhelmed.
I’ve written one that is quite simple to understand by comparison, but I still have to add some UI for options and check some constants. I also wrote a charting class that is still not functional enough to be released, I still have some work to do on it. I want to reuse it in more projects. Hopefully by the time I’ll end up writing those posts about the theory I’ll be able to post the program on GitHub.
So, let’s start with the beginnings.
It all started in 1934 when they measured the gold resistivity at a low temperature. Gold is a non magnetic metal, but it had some magnetic impurities in it. They expected a decrease in resistivity as the temperature was lowered, the resistivity having a term because of the lattice vibrations (that is, phonons), one because of the electron-electron interactions and one constant, being proportional with the impurity density. Instead of finding that, they found that the resistivity reached a minimum at some low temperature (about 8K or something like that) then when lowering the temperature further, the resistivity started to increase. Quite a difference from the expected decrease down to 0K towards a resistivity given by the impurity density. It took 30 years to understand the cause.
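The effect, which is now known as the Kondo effect, was explained by Jun Kondo using the Kondo model. Very briefly, the Kondo model describes the impurity as a spin 1/2 coupled by a spin-spin interaction to the conduction band electrons, considered as non-interacting, the Hamiltonian being:
$$H = H_K + H_c$$
where $H_K$ is the term for the impurity-conduction electrons interaction and $H_c$ is the term for the conduction electrons, that is:
$$H_K = J \, \vec{S} \cdot \vec{s}$$
and
$$H_c = \sum_{k\sigma} \varepsilon_k c^\dagger_{k\sigma} c_{k\sigma}$$
more detailed for the conduction electrons:
$$\vec{s} = \frac{1}{2} \sum_{k k' \sigma \sigma'} c^\dagger_{k\sigma} \vec{\tau}_{\sigma\sigma'} c_{k'\sigma'}$$
$\vec{\tau}$ is formed by Pauli matrices, $c^\dagger$ and $c$ are creation and annihilation operators. By using the spin raising and lowering operators it can be put in a form that will be handy later:
$$H_K = J \left[ S_z s_z + \frac{1}{2} \left( S^+ s^- + S^- s^+ \right) \right]$$
where $s^+ = c^\dagger_\uparrow c_\downarrow$ and $s^- = c^\dagger_\downarrow c_\uparrow$ and $s_z = \frac{1}{2}(n_\uparrow - n_\downarrow)$, with $n_\sigma$ the number operator, that is, $n_\sigma = c^\dagger_\sigma c_\sigma$.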
Kondo considered the coupling J as being small and used the perturbation theory to calculate resistivity. The article^{1} linked at the bottom of this page is the published article on the issue, you can also find some info here. I will link to some thesis pdfs in the post on numerical renormalization group, a lot of info on the subject will also be found there.
Very shortly, for the first order there is no temperature dependency, but for the second order he got a logarithmic dependency. The second order correction has a term of the third order in J and logarithmic in T. The resistivity for this correction looks like this:
where K is some value that depends on the square of J and the impurity concentration and so on.
It indicates that at low temperature there is an increase in scattering close to the Fermi level, known as the Kondo resonance. The spin-flip scattering events are responsible for the temperature dependency; they involve virtual intermediate states, with a spin flip, that occur between two scattering events. Anticipating a little, the Kondo resonance is what you see as the middle narrow peak in the chart displayed at the beginning of the post. I generated it with the nrg program I’m going to release.
A problem still remains and that’s the logarithmic divergence. The resistivity does not really go to infinity as the temperature goes to zero.
As the temperature is lowered, the J term can no longer be considered a small perturbation, so perturbation theory breaks down. The temperature where that happens is the Kondo temperature. A non-perturbative approach has to be used to solve this, but I’ll give more details on this in the post about the numerical renormalization group.
A model^{2} that is more general than the Kondo model, and is also useful for studying impurities in metals (or quantum dots!), is the Anderson model. The Hamiltonian has three parts, one for the impurity/quantum dot, $H_d$, one for the conduction electrons, $H_c$ (similar to the one from the Kondo model, so I won’t detail it), and one for the interaction between the electrons in the impurity/quantum dot and the ones from the conduction band, that is, the ones from the metallic host/quantum dot leads, $H_t$:
$$H = H_d + H_c + H_t$$
The quantum dot part of the Hamiltonian has an ‘on site’ energy term and the Coulomb repulsion terms:
$$H_d = \sum_{i\sigma} \varepsilon_i n_{i\sigma} + \frac{1}{2} \sum_{i \neq j} \sum_{\sigma\sigma'} U_{ij} n_{i\sigma} n_{j\sigma'} + \sum_i U_{ii} n_{i\uparrow} n_{i\downarrow}$$
i, j are level indices, the first Coulomb interaction term is the inter-level term, that is, the Coulomb interaction between electrons on different levels, and the last one is the intra-level term, that is, simply the repulsion between the spin up electron and the spin down electron from the same level. The term that couples the impurity is:
$$H_t = \sum_{k i \sigma} \left( V_{ki} \, c^\dagger_{k\sigma} d_{i\sigma} + h.c. \right)$$
h.c. stands for Hermitian conjugate.
This Hamiltonian does not look as having spin-flip interaction, so how come it could be more general than the Kondo Hamiltonian? The answer is given by the Schrieffer–Wolff transformation^{3}.
With a strong Coulomb interaction in the impurity/quantum dot, the energy levels ‘tuned’ in such a way that only one electron, on average, is in the impurity/quantum dot, and a weak tunneling, one can use perturbation theory to find that the Anderson Hamiltonian includes the Kondo Hamiltonian. The Kondo Hamiltonian is obtained as an effective Hamiltonian, by perturbatively eliminating excitations to doubly occupied and empty states.
Spin flips happen through virtual excitations. For example, if a spin down electron is initially in the quantum dot, it can tunnel outside (or if you like this picture, a hole can tunnel in), then a spin-up electron can tunnel in. Or with a spin-up electron inside, a spin-down electron can tunnel in then the spin-up electron tunnels out. Again, the end result is a spin flip. More, here^{4}.
I described very briefly the Kondo effect and models for it, but I also gave some links that should help. To anticipate things a little, here^{5} is a very useful link. I’ll give more in further posts, but this one you might want to check. You’ll find details not only on the Kondo and Anderson models, Schrieffer Wolff transformation, but also on Quantum Dots and the Numerical Renormalization Group and that’s what I intend to present in the future. If you manage to understand the Michael Sindel dissertation you will have no troubles understanding the program.
The first post on this blog dealt with Newtonian mechanics, it seems natural to follow it with another big name in physics and another important field. So here it is, something linked to Maxwell and with his equations. Something still simple for now, a program to visualize the electric field in 2D. Here is a video of the program in action:
Here^{1} is the project. The program is simpler because there is no OpenGL. The numerical methods might be a little more difficult, though. Also there is some more multithreading involved. The program presented is one of the few cases where the Euler method would be ok, it does not need much precision since it’s only for visualization purposes. I implemented Runge-Kutta instead – it includes Euler as a particular case, though – to present more numerical methods and to have something for later, hopefully I’ll reuse the code. Otherwise the implemented numerical methods (except Euler and perhaps midpoint) are overkill. This project was also an opportunity for some tests on the methods I implemented.
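To illustrate how Euler appears as a particular case of Runge-Kutta, here is a sketch of a generic explicit Runge-Kutta step driven by a Butcher tableau, for a scalar equation y' = f(x, y). It is not the code from the project: the structure and function names are hypothetical, and the project deals with vectors rather than scalars.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

using Derivative = std::function<double(double, double)>;

// Coefficients of an explicit Runge-Kutta method.
struct ButcherTableau
{
    std::vector<std::vector<double>> a; // row i holds the weights of the previous stages
    std::vector<double> b;              // weights of the stages in the final sum
    std::vector<double> c;              // positions of the stages inside the step
};

// Euler is the one-stage method.
ButcherTableau eulerTableau()
{
    ButcherTableau t;
    t.a = std::vector<std::vector<double>>(1); // one stage, no previous stages to combine
    t.b = { 1. };
    t.c = { 0. };
    return t;
}

// The classical fourth order method.
ButcherTableau rk4Tableau()
{
    ButcherTableau t;
    t.a = { {}, { 0.5 }, { 0., 0.5 }, { 0., 0., 1. } };
    t.b = { 1. / 6., 1. / 3., 1. / 3., 1. / 6. };
    t.c = { 0., 0.5, 0.5, 1. };
    return t;
}

// One step of size h, starting from (x, y).
double rungeKuttaStep(const ButcherTableau& t, const Derivative& f, double x, double y, double h)
{
    const std::size_t stages = t.b.size();
    assert(t.a.size() == stages && t.c.size() == stages);

    std::vector<double> k(stages);
    for (std::size_t i = 0; i < stages; ++i)
    {
        double yi = y;
        for (std::size_t j = 0; j < i && j < t.a[i].size(); ++j)
            yi += h * t.a[i][j] * k[j];

        k[i] = f(x + t.c[i] * h, yi);
    }

    double next = y;
    for (std::size_t i = 0; i < stages; ++i)
        next += h * t.b[i] * k[i];

    return next;
}

// Applies 'steps' fixed size steps and returns the final value of y.
double integrate(const ButcherTableau& t, const Derivative& f, double x0, double y0, double h, int steps)
{
    double x = x0;
    double y = y0;
    for (int s = 0; s < steps; ++s)
    {
        y = rungeKuttaStep(t, f, x, y, h);
        x += h;
    }
    return y;
}
```

Keeping the method in a table means that switching between Euler, midpoint or a higher order method only changes the data, not the stepping code.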
For such a simple application, one could start with Coulomb’s law and a simple definition for the potential. That would be enough. A quick look at Maxwell’s equations shouldn’t hurt, either. They are also on the header picture of this blog, on the right. This is electrostatics, so we can drop anything that has a time derivative in there, the electric current and, obviously, the magnetic field. We end up with Gauss’s law:
$$\nabla \cdot \vec{E} = \frac{\rho}{\epsilon_0}$$
By applying Gauss’s theorem it can be written in integral form:
$$\oint_S \vec{E} \cdot d\vec{S} = \frac{1}{\epsilon_0} \int_V \rho \, dV = \frac{Q}{\epsilon_0}$$
where Q is the total charge in the volume V and S is the surface around the volume V. $\rho$ is the charge density and $\vec{E}$ is obviously the electric field vector. It says that the electric charges are the sources and sinks of the electric field: the electric flux through a closed surface is proportional to the charge inside. It’s easy to see that what I said about the gravitational field in the previous post applies to the electric field, too, by the same symmetry considerations: for a thin shell with a spherically symmetric charge density, the field outside is as if all the charge were in the center of the sphere, while inside the field is zero. One could easily see why it should be zero in the center of the sphere by using symmetry. It requires a little bit more to see why the same holds in the other interior points: while the field drops as $1/r^2$, the area cut out of the shell by opposing cones grows as $r^2$, so the contributions from the two opposite patches cancel. By adding up the thin shells, one can see that for a sphere of spherically symmetric distributed charge, the field outside is as if the whole charge were in the center of the sphere.
The electric field for such a charge is easy to calculate:
$$E = \frac{1}{4 \pi \epsilon_0} \frac{Q}{r^2}$$
where r is the distance from the center of the sphere. One can easily see where the $4\pi$ originates from (hopefully you know that the surface of the sphere is $4 \pi r^2$). Ok, here it is, look at the integral form, left term: by symmetry considerations, $\vec{E}$ must be along the radius, but $d\vec{S}$ has the same property. One can now drop the vectors from the scalar product. Again due to symmetry considerations, E must be the same in any point of the surface of the sphere. That means one can take it out of the integral, being a constant. What’s left is an integral of the surface element over the surface of the sphere and that’s obviously the surface of the sphere.
The field from such a charge is easy to calculate in the code, so here it is, from the class Charge:

inline Vector2D<double> E(const Vector2D<double>& pos) const
{
    Vector2D<double> toPos = pos - position;
    double len2 = toPos * toPos;

    Vector2D<double> result = toPos / (len2 * sqrt(len2));
    result *= charge;

    return result;
}

As you can notice, I prefer to drop constants from the calculations, they are not relevant. position is the position of the charge.
As for the gravitational field, superposition works:
$$\vec{E} = \sum_i \vec{E}_i$$
that is, to calculate the electric field in a point, one simply adds all fields generated by each charge (for a charge distribution the sum becomes an integral). It’s quite easy to do in the code, so here it is, from the class TheElectricField:

inline Vector2D<double> E(const Vector2D<double>& pos) const
{
    Vector2D<double> result;

    for (auto &&charge : charges)
        result += charge.E(pos);

    return result;
}
Using $\vec{E} = -\nabla V$ one can easily find (by the way, the $\nabla$ operator in Cartesian coordinates is $\nabla = \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right)$, with only the first two components for our simpler 2D case) that $V = \frac{1}{4 \pi \epsilon_0} \frac{Q}{r}$. It is even easier to deal with it in the code. Here is the potential for a charge, from the class Charge:

inline double Potential(const Vector2D<double>& pos) const
{
    return charge / (pos - position).Length();
}

and the potential for all charges, from the class TheElectricField:

inline double Potential(const Vector2D<double>& pos) const
{
    double P = 0;

    for (auto &&charge : charges)
        P += charge.Potential(pos);

    return P;
}
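To see that the two pieces of code above agree with each other, here is a small self-contained sketch that checks numerically that the field is minus the gradient of the potential. It is not code from the project: Vec2, PointCharge, FieldAt, PotentialAt and MinusGradV are simplified, hypothetical stand-ins for Vector2D, Charge and TheElectricField, and the constants are dropped as in the program.

```cpp
#include <cmath>
#include <vector>

// Hypothetical, simplified stand-ins for the project's Vector2D and Charge classes.
struct Vec2 { double X, Y; };
struct PointCharge { Vec2 position; double charge; };

// Superposition of the fields of all charges, constants dropped: sum of q * r / |r|^3.
Vec2 FieldAt(const std::vector<PointCharge>& charges, const Vec2& pos)
{
    Vec2 e{0.0, 0.0};
    for (const auto& c : charges) {
        const double dx = pos.X - c.position.X;
        const double dy = pos.Y - c.position.Y;
        const double len2 = dx * dx + dy * dy;
        const double len3 = len2 * std::sqrt(len2);
        e.X += c.charge * dx / len3;
        e.Y += c.charge * dy / len3;
    }
    return e;
}

// Superposition of the potentials of all charges: sum of q / |r|.
double PotentialAt(const std::vector<PointCharge>& charges, const Vec2& pos)
{
    double v = 0.0;
    for (const auto& c : charges)
        v += c.charge / std::hypot(pos.X - c.position.X, pos.Y - c.position.Y);
    return v;
}

// Central difference estimate of -grad V, to be compared against FieldAt.
Vec2 MinusGradV(const std::vector<PointCharge>& charges, const Vec2& pos, double h = 1e-5)
{
    const double dVdx = (PotentialAt(charges, Vec2{pos.X + h, pos.Y}) - PotentialAt(charges, Vec2{pos.X - h, pos.Y})) / (2.0 * h);
    const double dVdy = (PotentialAt(charges, Vec2{pos.X, pos.Y + h}) - PotentialAt(charges, Vec2{pos.X, pos.Y - h})) / (2.0 * h);
    return Vec2{-dVdx, -dVdy};
}
```

Calling FieldAt and MinusGradV for the same point away from the charges (for example for a dipole) should give the same vector up to the finite difference error.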
Just to end this theoretical part, this particular situation is called electrostatics, but it’s still electromagnetism. If you want to see where the ‘magnetism’ part is, imagine that you are in a particular reference frame when you are looking at the charges. The Universe does not really care about which reference frame you picked. The situation might look different to you, but the physics is the same, it’s not dependent on how you look at it. Now look at the charges from a reference frame that is moving relative to them. Moving charges mean an electric current, and an electric current means a magnetic field. So there it is, the magnetism part is hidden because of your perspective. There is no such thing as a separate electric field and a separate magnetic field. They are part of the same field, the electromagnetic field.
Field lines are a useful tool for visualizing vector fields. In our 2D case we have two kinds of lines, the electric field lines and the equipotentials. The latter are actually surfaces in 3D. The electric field lines have the electric field vector as a tangent and their density is proportional to the magnitude of the electric field. This already poses a problem in a 2D representation, because their density for a point charge in the 2D representation drops as $1/r$ while the field drops as $1/r^2$. If you look at the picture as being a section through 3D space, it’s not a problem anymore.
Since I mentioned a problem, I should mention another one before I forget. Field lines start on sources and end on sinks. Their number, for equal size spheres that contain the spherically symmetric distributed charge, is proportional to the charge. But that is true for spheres, that is, in 3D. For 2D we have circles instead, but the code still starts twice as many lines from a charge with the value 2 as from one with the value 1 (it should do that by distributing them on the whole surface of the sphere instead). I wouldn’t like to have a non-integer number of lines, if you know what I mean… that’s also a reason why I allow only integer values for charges. Non-integer values would also raise some issues with the potential.
Despite this, the code still is ok if you keep that in mind. That’s why it says on titles ‘unequal charges’ without specifying the ratio between them. The only place where it might confuse is the options, but since you are warned now, it should be all right. One should be able to calculate the ‘true’ charge knowing the values for the area of the sphere versus the circumference of the circle.
So, the field line is given by the direction of the vector. That already suggests using the Euler method: just calculate the electric field in the start point, jump a step in its direction, then repeat again and again for the current point until either the line reaches a sink or it’s long enough that there is no hope of it returning to a charge. There are simplifications to this, for example if all charges have the same sign, you already know that the field lines won’t return, but about those, later. But is it ok to jump the same step size from different points, no matter if the line is almost straight or it bends quite a bit? It looks like even this would benefit from higher order methods and, why not, even from adaptive methods.
Before going into the actual numerical methods, let’s see how it would be solved with Euler (which, again, will be a particular case). We would have $\frac{df}{dx} = \frac{E_y}{E_x}$, that is, $dy = \frac{E_y}{E_x} dx$. The problem is that the x component of the electric field can be zero in some points. We’re trying to find a function f(x) that describes our field line, but the field line can turn around and for some values of x it might have more than one value! This is clearer for equipotentials, which are closed loops.
That means it’s not really a function (not in the usual sense, for more see multivalued function). You might try to parametrize the curve, stating that $y(t) = f(x(t))$ and then using the chain rule, $\frac{dy}{dt} = \frac{df}{dx} \frac{dx}{dt}$. The line element length is, by the old Pythagorean theorem, $dt = \sqrt{dx^2 + dy^2}$.
Then $\frac{dy}{dt} = \frac{E_y}{\sqrt{E_x^2 + E_y^2}} = \frac{E_y}{E}$. A correct result with a derivation that does not seem so ok. One can use the trick of division by zero to derive all sorts of not so nice things (like 1=2). Let’s not forget that we used the function f which is not really a function and there is a division there by a component of the field that might be zero. By the way, one can get the similar result for x by relabeling.
Can we do better and more clearly? With a little bit of geometrical thinking, yes we can. First, let’s drop the coordinate system and think in terms of vectors. They are mathematical objects that do not depend on coordinates. Imagine our field line. In a certain point on it the infinitesimal vector along the curve is $d\vec{r}$. Since it’s infinitesimal, its length is dt (that is, the length of the tiny vector is the same as the length of the tiny piece of curve). The normalized vector (that is, the versor) is then $\frac{d\vec{r}}{dt}$. But by definition of the field line, the versor is also given by the versor of the electric field in that point, so:
$$\frac{d\vec{r}}{dt} = \frac{\vec{E}}{|\vec{E}|}$$
That’s it. We have enough to use the numerical algorithms to find the field line.
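As an illustration of how little is needed, here is a minimal sketch that follows a field line with the Euler method, stepping along the normalized field. It is not the project’s code: Vec2, PointCharge, FieldDirection and TraceFieldLineEuler are hypothetical names, and the sketch assumes the field is not zero in any visited point and does not check for sinks.

```cpp
#include <cmath>
#include <vector>

// Hypothetical, simplified stand-ins for the project's Vector2D and Charge classes.
struct Vec2 { double X, Y; };
struct PointCharge { Vec2 position; double charge; };

// Unit vector along the total field at pos, that is, the right side of dr/dt = E/|E|.
// Assumes the field is not zero at pos.
Vec2 FieldDirection(const std::vector<PointCharge>& charges, const Vec2& pos)
{
    Vec2 e{0.0, 0.0};
    for (const auto& c : charges) {
        const double dx = pos.X - c.position.X;
        const double dy = pos.Y - c.position.Y;
        const double len2 = dx * dx + dy * dy;
        const double len3 = len2 * std::sqrt(len2);
        e.X += c.charge * dx / len3;
        e.Y += c.charge * dy / len3;
    }
    const double len = std::hypot(e.X, e.Y);
    return Vec2{e.X / len, e.Y / len};
}

// Euler method: each step advances a length h along the field direction at the current point.
std::vector<Vec2> TraceFieldLineEuler(const std::vector<PointCharge>& charges, Vec2 point, double h, unsigned int steps)
{
    std::vector<Vec2> line;
    line.push_back(point);
    for (unsigned int i = 0; i < steps; ++i) {
        const Vec2 dir = FieldDirection(charges, point);
        point.X += h * dir.X;
        point.Y += h * dir.Y;
        line.push_back(point);
    }
    return line;
}
```

For a single positive charge the traced line is a straight radial line, which makes for an easy sanity check; for a line starting on a negative charge the direction would have to be reversed.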
The algorithms can be used to solve problems where one has the initial conditions, $y(t_0) = y_0$, and knows the “time” derivative of a function, $\frac{dy}{dt} = f(t, y)$.
In our case, “time” t is not really time, but the position on the line given by the line length, measured from the start. One could think of it as time by imagining a small charge travelling along the line. The picture is not entirely physical, because the presence of another charge changes the configuration of the field and besides, it has mass. From the previous post one can notice that things with mass do not necessarily travel in the direction the field tells them to. One still could imagine various setups where the approximation is good enough (very small mass, in a fluid that has drag forcing it to move with a small speed and so on). In a sense, it is time (for the non-adaptive methods). It’s the time needed for the calculation to reach that point. But that’s less physical than the travel distance.
No matter how you look at it, you might notice that you don’t really need the “time” for this case: the function depends only on the position, so it is less general.
The function is now quite clear for the electric field lines, but how about equipotentials? You want to draw the line through points that have the same potential, that is, a constant potential. That means a zero directional derivative along the line. That means the line is orthogonal to the direction of the maximum directional derivative in that point, that is, the gradient, but that’s how the potential is linked to the electric field. So one can use a function that is very similar to that for the electric field line, the vector just needs to be rotated by $\pi/2$.
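Here is a hedged sketch of that idea: the field is rotated by 90 degrees and the resulting direction is followed with the midpoint method. The names (Vec2, PointCharge, EquipotentialDirection, MidpointStep) are hypothetical and the code is only an illustration, not the project’s implementation.

```cpp
#include <cmath>
#include <vector>

// Hypothetical, simplified stand-ins for the project's Vector2D and Charge classes.
struct Vec2 { double X, Y; };
struct PointCharge { Vec2 position; double charge; };

Vec2 FieldAt(const std::vector<PointCharge>& charges, const Vec2& pos)
{
    Vec2 e{0.0, 0.0};
    for (const auto& c : charges) {
        const double dx = pos.X - c.position.X;
        const double dy = pos.Y - c.position.Y;
        const double len2 = dx * dx + dy * dy;
        const double len3 = len2 * std::sqrt(len2);
        e.X += c.charge * dx / len3;
        e.Y += c.charge * dy / len3;
    }
    return e;
}

double PotentialAt(const std::vector<PointCharge>& charges, const Vec2& pos)
{
    double v = 0.0;
    for (const auto& c : charges)
        v += c.charge / std::hypot(pos.X - c.position.X, pos.Y - c.position.Y);
    return v;
}

// Direction along the equipotential: the field rotated by +90 degrees, then normalized.
Vec2 EquipotentialDirection(const std::vector<PointCharge>& charges, const Vec2& pos)
{
    const Vec2 e = FieldAt(charges, pos);
    const double len = std::hypot(e.X, e.Y);
    return Vec2{-e.Y / len, e.X / len};
}

// One midpoint step of length h along the equipotential.
Vec2 MidpointStep(const std::vector<PointCharge>& charges, const Vec2& p, double h)
{
    const Vec2 d1 = EquipotentialDirection(charges, p);
    const Vec2 mid{p.X + 0.5 * h * d1.X, p.Y + 0.5 * h * d1.Y};
    const Vec2 d2 = EquipotentialDirection(charges, mid);
    return Vec2{p.X + h * d2.X, p.Y + h * d2.Y};
}
```

For a single charge the equipotential is a circle, so after repeated calls to MidpointStep the value returned by PotentialAt should stay very close to the one in the start point.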
Here are the needed functions implemented in the program (in the ComputationThread class which is defined in the FieldLinesCalculator class):

class FunctorForCalc
{
public:
    const TheElectricField *theField;

    FunctorForCalc(const TheElectricField *field = NULL) : theField(field) {}
};

class FunctorForE : public FunctorForCalc
{
public:
    int charge_sign;

    FunctorForE(const TheElectricField *field = NULL) : FunctorForCalc(field), charge_sign(1) {}

    inline Vector2D<double> operator()(double /*t*/, const Vector2D<double>& pos)
    {
        Vector2D<double> v = theField->ENormalized(pos);

        return charge_sign > 0 ? v : -v;
    }
};

class FunctorForV : public FunctorForCalc
{
public:
    FunctorForV(const TheElectricField *field = NULL) : FunctorForCalc(field) {}

    // just returns a perpendicular vector on E
    inline Vector2D<double> operator()(double /*t*/, const Vector2D<double>& pos)
    {
        Vector2D<double> v = theField->E(pos);

        double temp = v.X;
        v.X = -v.Y;
        v.Y = temp;

        return v.Normalize();
    }
};
They do not need much explanation. The functor for E changes sign for negative charges because you don’t want to go along a line towards the charge center, you want the line to go from the charge, outside. Odd things would happen if the line reached the center, because the code considers point charges. That means a singularity in the center, that is, an infinite field because of the division by $r^2$. Programmers and some physicists do not like division by zero very much. The functor for V just does the rotation I already mentioned.
The numerical methods used are the Runge-Kutta methods, more specifically, the explicit ones, including adaptive methods. Before trying to understand them, please be sure to understand the Euler method and the midpoint method, they are particular cases of Runge-Kutta methods and they are easy to understand.
The general Runge-Kutta method is expressed as:
$$y_{n+1} = y_n + h \sum_{i=1}^{S} b_i k_i$$
where
$$k_i = f\left(t_n + c_i h, \; y_n + h \sum_{j=1}^{i-1} a_{ij} k_j\right)$$
S is the number of stages of the particular Runge-Kutta method and i is the current stage during calculations. The $c_i$ coefficients are called nodes and give the position in “time” relative to the current “time” (it might not be time, but often is) where the slope is evaluated. The $b_i$ coefficients are weights, they are used to calculate a weighted average of the slopes at different “time” values and positions. The next value is obtained from the current one by adding the step multiplied by the slope estimated as this weighted average. Nodes, weights and the $a_{ij}$ coefficients can be arranged in a Butcher tableau, the leftmost column being the nodes, the bottom row being the weights and the rest of the values being the coefficients.
The simplest Runge-Kutta method is the Euler method. It has a single stage, the node is at the current position and the weight is 1, that is, the slope estimated at $(t_n, y_n)$ (given by $f(t_n, y_n)$) is taken alone, multiplied by 1. The result is the known $y_{n+1} = y_n + h f(t_n, y_n)$.
From the ones with two stages, second order methods, I implemented the midpoint, Heun and Ralston methods. The midpoint method has nodes at the current “time” and halfway to the next “time” instant (that is, the node values are 0 and 1/2), but the weight is zero for the first slope estimation and 1 for the next (and last) one. The end result is that the midpoint method advances by using a slope calculated in a point obtained by advancing half a step using the slope in the current point. The slope is evaluated for a value of half the time step, too. As a formula, the slope used is $f\left(t_n + \frac{h}{2}, \; y_n + \frac{h}{2} f(t_n, y_n)\right)$.
The Heun method (also called improved Euler) evaluates the slope by averaging two slope values, one obtained at the current position and one obtained by advancing one step (both in “time” and “space”) using the first evaluated slope. That means that the two weights are 1/2 each. The nodes are at 0 and 1. The Ralston method is similar, but its weights are 1/4 and 3/4, that is, the slope evaluated after advancing is given more weight than the slope at the current point. The nodes are at 0 and 2/3, so it does not try to advance a full step ahead to evaluate the slope, but only 2/3 of it (the a coefficient is also 2/3).
After understanding the above methods, the RK4 method should be easy to understand, too. It’s a fourth order method, with four slope estimations. The ones estimated for the current point and for the one after a time step (nodes with the values 0 and 1) are given half the weight of the estimations between them (nodes with value 1/2). It should be easy to understand because there is only one non-zero a coefficient on each tableau line, that is, only the slope estimation from the previous stage is used in the current stage (to advance to a point for the next slope estimation). The more complicated methods use a weighted average of the previously estimated slopes.
For more information, please visit the links I provided, they are way more detailed (including a proof using Taylor expansion). Here is the code that implements the Runge-Kutta methods:
template<typename T, unsigned int Stages> class RungeKutta
{
protected:
    std::array<double, Stages> m_weights;
    std::array<double, Stages> m_nodes;
    std::vector<std::vector<double>> m_coefficients;

    std::array<T, Stages> K;

public:
    RungeKutta(const double weights[], const double nodes[], const double *coefficients[]);

    template<typename Func> inline T SolveStep(Func& Function, const T& curVal, double t, double h)
    {
        T thesum(0);

        for (unsigned int stage = 0; stage < Stages; ++stage)
        {
            T accum(0);

            for (unsigned int j = 0; j < stage; ++j)
                accum += m_coefficients[stage - 1][j] * K[j];

            K[stage] = Function(t + m_nodes[stage] * h, curVal + h * accum);

            thesum += m_weights[stage] * K[stage];
        }

        return curVal + h * thesum;
    }

    template<typename Func> inline T SolveStep(Func& Function, const T& curVal, double t, double& h, double& /*next_h*/, double /*tolerance*/, double /*max_step*/ = DBL_MAX, double /*min_step*/ = DBL_MIN)
    {
        //next_h = h;
        return SolveStep(Function, curVal, t, h);
    }

    bool IsAdaptive() const { return false; }
};
A particular Runge-Kutta method is implemented by deriving from this class, it’s very easy:
template<typename T> class RK4 : public RungeKutta<T, 4>
{
public:
    RK4(void);
};
The constructor just sets the tableau and that’s about it:
static const double RK4weights[] = { 1. / 6., 1. / 3., 1. / 3., 1. / 6. };
static const double RK4nodes[] = { 0, 1. / 2., 1. / 2., 1. };
static const double row1[] = { 1. / 2. };
static const double row2[] = { 0, 1. / 2. };
static const double row3[] = { 0, 0, 1 };
static const double *RK4coeff[] = { row1, row2, row3 };

// the base class is named with its template arguments, as standard C++ requires for a dependent base
template<typename T> RK4<T>::RK4(void)
    : RungeKutta<T, 4>(RK4weights, RK4nodes, RK4coeff)
{
}
I also implemented adaptive methods, but for their code you’ll have to look into the sources^{1}. AdaptiveRungeKutta is derived from the RungeKutta class and all adaptive methods are derived from it. Very briefly, there is another row of weights that allows calculating (looping over the stages only once) two estimations instead of a single result, one of a higher order (by one) than the other. This allows estimating the error by pretending that the higher order result is exact and claiming that the difference between them is the error. This difference allows adjusting the step to reach a higher precision (the desired precision can be specified). The adjustment is done by taking the order into account. The adaptive methods have a variable step size. Please look in the code for the implementation, the method should be quite clear. It resembles the RungeKutta implementation: there is just one more for loop for the step size adjustment, two values are computed in each stage (see the thesumLow and thesumHigh values) and most of the code that is different (supplementary) deals with adjusting the step.
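To make the idea concrete without going into the project’s sources, here is a minimal sketch that uses the simplest embedded pair, Heun (second order) with Euler (first order), for a scalar equation. It is not the AdaptiveRungeKutta code: the names AdaptiveStep and HeunEulerStep, the safety factor 0.9 and the limits 0.2 and 2 on the step change are assumptions chosen only for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <functional>

struct AdaptiveStep {
    double y;      // accepted (higher order) estimate
    double hUsed;  // step size that was actually taken
    double hNext;  // suggested step size for the next step
};

// One adaptive step of dy/dt = f(t, y) with the Heun-Euler embedded pair.
AdaptiveStep HeunEulerStep(const std::function<double(double, double)>& f, double t, double y, double h,
                           double tolerance, double minStep, double maxStep)
{
    for (;;) {
        const double k1 = f(t, y);
        const double k2 = f(t + h, y + h * k1);

        const double low = y + h * k1;                // Euler: weights 1, 0
        const double high = y + 0.5 * h * (k1 + k2);  // Heun: weights 1/2, 1/2
        const double error = std::fabs(high - low);   // pretend 'high' is exact

        // the local error of the first order result scales as h^2, hence the square root
        double factor = error > 0 ? 0.9 * std::sqrt(tolerance / error) : 2.0;
        factor = std::min(2.0, std::max(0.2, factor));
        const double hNext = std::min(maxStep, std::max(minStep, h * factor));

        if (error <= tolerance || h <= minStep)
            return AdaptiveStep{high, h, hNext};

        h = hNext; // step rejected, retry with the smaller one
    }
}
```

The caller advances “time” by hUsed and passes hNext as the step for the following call, which is the same role next_step plays in the calls to SolveStep.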
To present an example of how such an algorithm is used, here is the method that calculates the equipotential line:

template<class T> inline void FieldLinesCalculator::CalcThread<T>::CalculateEquipotential()
{
    Vector2D<double> startPoint = m_Job.point;
    Vector2D<double> point = startPoint;
    double dist = 0;
    double t = 0;
    unsigned int num_steps = (m_Solver->IsAdaptive() ? 800000 : 1500000);
    double step = (m_Solver->IsAdaptive() ? 0.001 : 0.0001);
    double next_step = step;

    fieldLine.AddPoint(startPoint);
    fieldLine.weightCenter = Vector2D<double>(startPoint);

    for (unsigned int i = 0; i < num_steps; ++i)
    {
        point = m_Solver->SolveStep(functorV, point, t, step, next_step, 1E-3, 0.01);

        fieldLine.AddPoint(point);

        // 'step' plays the role of the portion of the curve 'weight'
        fieldLine.weightCenter += point * step;

        t += step;
        if (m_Solver->IsAdaptive()) step = next_step;

        // if the distance is smaller than 6 logical units but the line length is bigger than
        // double the distance from the start point
        // the code assumes that the field line closes
        dist = (startPoint - point).Length();
        if (dist * distanceUnitLength < 6. && t > 2.*dist)
        {
            fieldLine.points.push_back(startPoint); // close the loop
            fieldLine.weightCenter /= t; // divide by the whole 'weight' of the curve
            break;
        }
    }
}
The one that calculates the electric field line is quite similar, but a little bit longer because there are more checks in there. I added comments in both of them that should help in understanding the intent of the code. If you are curious what weightCenter is and does in the code above, it is just that, a weight center for the equipotential loop. I use it to remove some duplicates of the equipotential lines. They are calculated starting from the first electric field line from each charge and that creates duplicates which are removed at the end of the calculations. More details about that, later. AddPoint does not really add each point to the line, but only those that are far enough from the previous one. ‘Far enough’ depends on how close they are to the start of the line. Obviously one does not need a lot of points to represent a portion of the line that’s off the screen and does not curve so much there.
To be noted that the Runge-Kutta implementation might be quite far from the performance one could achieve by coding a particular method. A good compiler might unroll the loops and take advantage of the knowledge of the tableau at compile time to optimize calculations, for example avoiding unnecessary multiplications where the terms are zero and the additions of zero values, but one might achieve better performance by coding and optimizing a particular method only. As a warning, if you use it in your code, please check the code and the Butcher tableau values, I offer no warranty that they are correct.
The program uses Direct2D for drawing in the view, but GDI for printing and print preview drawing. The reason is the MFC print preview implementation. I didn’t want to look into the MFC code to see if I could change the print preview to be able to draw with Direct2D, so I preferred to expose a pair of Draw methods, one that draws with Direct2D and one that uses GDI (not GDI+, I’ll use that in future projects posted here, though). Although the drawing methods are quite similar, there are differences enforced by library limitations. For example, GDI’s ability to draw Bézier curves is quite limited: one cannot add as many points as one would like to a call to PolyBezier, while a Direct2D path is more capable. Please see the code^{1} for details (the FieldLine::Draw methods specifically).
As is typical in MFC programs, the view draws itself. After preparing the rendering target, the view asks the field object to draw itself, which in turn delegates drawing to the charges and field lines objects:

void TheElectricField::Draw(CHwndRenderTarget* renderTarget, CRect& rect)
{
    // draw electric field lines
    for (auto &&line : electricFieldLines)
        line.Draw(renderTarget, rect);

    // draw potential lines
    for (auto &&line : potentialFieldLines)
        line.Draw(renderTarget, rect, true);

    // draw charges
    for (auto &&charge : charges)
        charge.Draw(renderTarget, rect);
}
For the other drawing methods please check the code^{1}, they should be quite clear.
An easy option for drawing the field lines would be simply to draw line segments between the points. That obviously does not look so good, especially if you space the points at a bigger distance. A nicer look would be given by a spline curve, but unfortunately I couldn’t find an implementation in Direct2D (although GDI+ has one), so I had to use Bézier curves instead, more specifically, cubic ones. Since the points are not so far apart and field lines are well behaved, it doesn’t make much difference visually, although it wouldn’t be exactly correct. That bothered me a little, so I took a pen and paper and, with not much more explanation, here is the result:

void FieldLine::AdjustForBezier(const Vector2D<double>& pt0, const Vector2D<double>& pt1, const Vector2D<double>& pt2, const Vector2D<double>& pt3, double& Xo1, double& Yo1, double& Xo2, double& Yo2)
{
    double t1 = (pt1 - pt0).Length();
    double t2 = t1 + (pt2 - pt1).Length();
    double t3 = t1 + t2 + (pt3 - pt2).Length();

    t1 /= t3;
    t2 /= t3;

    double divi = 3. * t1 * t2 * (1. - t1) * (1. - t2) * (t2 - t1);
    ASSERT(abs(divi) > DBL_MIN);

    double a = t2 * (1. - t2) * (pt1.X - (1. - t1) * (1. - t1) * (1. - t1) * pt0.X - t1 * t1 * t1 * pt3.X);
    double b = t1 * (1. - t1) * (pt2.X - (1. - t2) * (1. - t2) * (1. - t2) * pt0.X - t2 * t2 * t2 * pt3.X);

    Xo1 = (t2 * a - t1 * b) / divi;
    Xo2 = ((1. - t1) * b - (1. - t2) * a) / divi;

    a = t2 * (1. - t2) * (pt1.Y - (1. - t1) * (1. - t1) * (1. - t1) * pt0.Y - t1 * t1 * t1 * pt3.Y);
    b = t1 * (1. - t1) * (pt2.Y - (1. - t2) * (1. - t2) * (1. - t2) * pt0.Y - t2 * t2 * t2 * pt3.Y);

    Yo1 = (t2 * a - t1 * b) / divi;
    Yo2 = ((1. - t1) * b - (1. - t2) * a) / divi;
}
The code adjusts the intermediate points (control points) in such a way that the new control points make the curve pass through the original points. Obviously, having the original four points you want the curve to pass through is not enough to get the Bézier curve: there is an infinity of such curves passing through all four points. One more condition is required, and that’s the (relative) length of each curve segment. The Wikipedia page mentions a special case, where one gets nicer formulae, when t=1/3 and t=2/3 for the position of the intermediary points; here it is more general. I chose to approximate the length with the distance between points. In this particular case one could do even better by using the field line lengths as calculated during the field line computation… but this should be good enough, I didn’t want to complicate the code that much.
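For those who want to check the pen and paper result, here is a self-contained sketch that solves the same two conditions, B(t1) = p1 and B(t2) = p2, for the control points and lets one evaluate the resulting curve. It is an illustration, not the project’s method: FitCubicBezier, BezierPoint and BezierFit are hypothetical names, t1 and t2 are taken from the cumulative chord lengths, and the four points are assumed to be distinct.

```cpp
#include <cmath>

struct Vec2 { double X, Y; };

struct BezierFit {
    Vec2 c1, c2;    // control points that make the curve pass through p1 and p2
    double t1, t2;  // parameter values where the curve reaches p1 and p2
};

BezierFit FitCubicBezier(const Vec2& p0, const Vec2& p1, const Vec2& p2, const Vec2& p3)
{
    // chord lengths approximate the relative lengths of the curve segments
    const double d1 = std::hypot(p1.X - p0.X, p1.Y - p0.Y);
    const double d2 = std::hypot(p2.X - p1.X, p2.Y - p1.Y);
    const double d3 = std::hypot(p3.X - p2.X, p3.Y - p2.Y);
    const double total = d1 + d2 + d3;
    const double t1 = d1 / total, t2 = (d1 + d2) / total;
    const double u1 = 1. - t1, u2 = 1. - t2;

    // solves the 2x2 linear system for one coordinate
    auto solve = [=](double q0, double q1, double q2, double q3, double& c1, double& c2) {
        const double a1 = (q1 - u1 * u1 * u1 * q0 - t1 * t1 * t1 * q3) / (3. * u1 * t1);
        const double a2 = (q2 - u2 * u2 * u2 * q0 - t2 * t2 * t2 * q3) / (3. * u2 * t2);
        c1 = (t2 * a1 - t1 * a2) / (t2 - t1);
        c2 = (u1 * a2 - u2 * a1) / (t2 - t1);
    };

    BezierFit fit{{0.0, 0.0}, {0.0, 0.0}, t1, t2};
    solve(p0.X, p1.X, p2.X, p3.X, fit.c1.X, fit.c2.X);
    solve(p0.Y, p1.Y, p2.Y, p3.Y, fit.c1.Y, fit.c2.Y);
    return fit;
}

// Evaluates the cubic Bezier curve with end points p0, p3 and control points c1, c2.
Vec2 BezierPoint(const Vec2& p0, const Vec2& c1, const Vec2& c2, const Vec2& p3, double t)
{
    const double u = 1. - t;
    const double b0 = u * u * u, b1 = 3. * u * u * t, b2 = 3. * u * t * t, b3 = t * t * t;
    return Vec2{b0 * p0.X + b1 * c1.X + b2 * c2.X + b3 * p3.X,
                b0 * p0.Y + b1 * c1.Y + b2 * c2.Y + b3 * p3.Y};
}
```

Evaluating BezierPoint at fit.t1 and fit.t2 with the returned control points should give back p1 and p2 up to rounding errors.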
Unlike the case for the previous post, where only a single computing thread was used, this program deals with multithreading much more.
There is code that simply splits work into two threads like this, in CElectricFieldDoc::GetData():

std::thread thread1 = std::thread([calc = calculator] {
    for (auto& line : calc->field.electricFieldLines)
        line.AdjustForBezier();
});

std::thread thread2 = std::thread([calc = calculator] {
    for (auto& line : calc->field.potentialFieldLines)
        line.AdjustForBezier();
});

thread1.join();
thread2.join();

This code would benefit from splitting the work between more threads and not joining them, as is done for the field lines, but not by much, so I did not bother. It’s clearer this way; more complex code wouldn’t help that much.
The code for calculating the field lines is able to split the work among many threads, their number being configurable. In order to do that, I implemented a queue of ‘jobs’ (like this: std::deque<FieldLineJob> m_jobs;) and I started a pool of threads that take jobs from there and execute them. Here is the code that makes the initial ‘jobs’ and starts the threads:

void FieldLinesCalculator::StartCalculating(const TheElectricField *theField)
{
    Clear();

    if (NULL == theField) return;

    bool has_different_signs;
    int total_charge = theField->GetTotalCharge(has_different_signs);

    Vector2D<double> point;
    double angle_start = 0;

    for (auto &&charge : theField->charges)
    {
        if (charge.charge == 0) continue;

        double angle_step = 2.*M_PI / (fabs(charge.charge)*theApp.options.numLinesOnUnitCharge);

        angle_start = - angle_step / 2. - M_PI;
        if (sign(total_charge) != sign(charge.charge)) angle_start += M_PI + angle_step;

        for (double angle = angle_start; angle < 2.*M_PI + angle_start - angle_step / 4.; angle += angle_step)
        {
            if ((angle != angle_start || !theApp.options.calculateEquipotentials) && sign(total_charge) != sign(charge.charge)) break;

            double r = theApp.options.chargeRadius / theApp.options.distanceUnitLength;

            point.X = charge.position.X + r*cos(angle);
            point.Y = charge.position.Y + r*sin(angle);

            m_jobs.push_back( { charge, total_charge, has_different_signs, angle, angle_start, point, false, 0 } );
        }
    }

    potentialInterval = theApp.options.potentialInterval;
    calcMethod = theApp.options.calculationMethod;

    startedThreads = theApp.options.numThreads;
    for (unsigned int i = 0; i < startedThreads; ++i)
        StartComputingThread(theField);
}
It should be easy to understand: the code just walks around the charges making ‘jobs’ by setting the initial position of the field line and some info needed during calculation, then it starts the threads.
A thread has its loop implemented like this:
template<class T> void FieldLinesCalculator::CalcThread<T>::Calculate()
{
    for (;;)
    {
        // grab a job from the job list
        {
            std::lock_guard<std::mutex> lock(m_pCalculator->m_jobsSection);

            if (m_pCalculator->Terminate || m_pCalculator->m_jobs.empty()) break; // no more jobs or asked to finish

            m_Job = m_pCalculator->m_jobs.front();
            m_pCalculator->m_jobs.pop_front();
        }

        if (m_Job.isEquipotential)
        {
            CalculateEquipotential();

            if (m_pCalculator->Terminate) break;

            std::lock_guard<std::mutex> lock(m_pCalculator->m_potentialLinesSection);

            m_pCalculator->potentialFieldLines.push_back(PotentialLine());
            m_pCalculator->potentialFieldLines.back().potential = m_Job.old_potential;
            m_pCalculator->potentialFieldLines.back().weightCenter = fieldLine.weightCenter;
            m_pCalculator->potentialFieldLines.back().points.swap(fieldLine.points);
        }
        else
        {
            functorE.charge_sign = sign(m_Job.charge.charge);

            CalculateElectricFieldLine();

            if (m_pCalculator->Terminate) break;

            std::lock_guard<std::mutex> lock(m_pCalculator->m_electricLinesSection);

            m_pCalculator->electricFieldLines.push_back(FieldLine());
            m_pCalculator->electricFieldLines.back().points.swap(fieldLine.points);
        }
    }

    FieldLinesCalculator* calc = m_pCalculator;

    delete this;

    ++calc->finishedThreads;
}
Again it should be quite straightforward: the thread continues to take out jobs until there are no jobs left. Depending on the type of job, it calculates either an equipotential line or an electric field line.
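Stripped of everything related to field lines, the pattern is a mutex protected queue and a few worker threads that pop from it until it is empty. The sketch below shows only that pattern: JobQueue and RunWorkers are hypothetical names, the ‘job’ is just an integer to be squared and summed, and the threads are joined instead of being checked from a timer as in the program.

```cpp
#include <atomic>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

// A queue of 'jobs' protected by a mutex; here a job is just a number.
class JobQueue
{
public:
    void Push(int job)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_jobs.push_back(job);
    }

    // Returns false when there are no more jobs.
    bool TryPop(int& job)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_jobs.empty()) return false;
        job = m_jobs.front();
        m_jobs.pop_front();
        return true;
    }

private:
    std::mutex m_mutex;
    std::deque<int> m_jobs;
};

// Starts numThreads workers that take jobs until the queue is empty, then waits for them.
long long RunWorkers(JobQueue& queue, unsigned int numThreads)
{
    std::atomic<long long> sum{0};
    std::vector<std::thread> workers;

    for (unsigned int i = 0; i < numThreads; ++i)
        workers.emplace_back([&queue, &sum] {
            int job;
            while (queue.TryPop(job))
                sum += static_cast<long long>(job) * job; // the 'computation'
        });

    for (auto& worker : workers)
        worker.join();

    return sum.load();
}
```

The lock is held only while taking a job out, not during the computation, which is the same choice made in the Calculate loop above.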
The first electric field line computation is also responsible for posting equipotential jobs:
template<class T> inline void FieldLinesCalculator::CalcThread<T>::PostCalculateEquipotential()
{
    if (m_Job.angle != m_Job.angle_start || !calculateEquipotentials) return;

    Vector2D<double> startPoint = m_Job.point;
    double potential = functorV.theField->Potential(startPoint);

    if (sign(m_Job.charge.charge)*sign(potential) > 0 && abs(m_Job.old_potential - potential) >= potentialInterval)
    {
        if (m_Job.old_potential == 0)
            m_Job.old_potential = floor(potential / potentialInterval) * potentialInterval;
        else
            m_Job.old_potential += sign(potential - m_Job.old_potential) * potentialInterval;

        std::lock_guard<std::mutex> lock(m_pCalculator->m_jobsSection);

        FieldLineJob job(m_Job);
        job.isEquipotential = true;
        m_pCalculator->m_jobs.push_back(job);

        // some threads finished although there are still jobs posted
        // restart one
        if (m_pCalculator->finishedThreads > 0)
        {
            --m_pCalculator->finishedThreads;
            m_pCalculator->StartComputingThread(&m_pCalculator->field);
        }
    }
}
It might be the case that the computation threads exhausted the jobs queue before another job is posted and some of them ended, so the code restarts a thread if that’s the case.
There is a timer set in the view that checks from time to time for thread termination. If they all ended, the data is retrieved and adjusted before drawing. One adjustment is for Bézier lines, the other one deals with removing the duplicates from the equipotential lines.
The reason why duplicates arise is that equipotential lines are calculated starting from an electric field line that originates from each charge (one for each charge). One gets lines that are at the same potential but go around different charges, so they don’t coincide, but in other cases they go around more than one charge and duplicates occur. To remove them I wrote some code that is neither clean, nor efficient (it even runs in the main thread), but at the time I wrote it I was quite bored by the project and wanted to finish it. You’ll find that code in void CElectricFieldDoc::GetDataFromThreads(). It uses a ‘weight center’ for the equipotential line to decide if different lines are for the same equipotential line or not. The code first sorts the lines in potential order to separate them by the potential value, then it distinguishes them further using that ‘weight center’. Two lines with about the same weight center, having about the same potential, too, are considered the same. I used the distance from an arbitrary point to the weight center to sort them, so the code could be fooled, but in practice it is good enough for the purpose of the project.
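The idea can be sketched in a few lines. This is not the code from GetDataFromThreads, only a hypothetical variant of the same approach: Loop and RemoveDuplicates are invented names, the two tolerances are assumptions, and the weight centers are compared directly instead of through their distance to an arbitrary point.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// What is needed from an equipotential line for the duplicate test.
struct Loop {
    double potential;
    double centerX, centerY; // the 'weight center' of the loop
};

// Keeps one loop out of each group having about the same potential and about the same weight center.
std::vector<Loop> RemoveDuplicates(std::vector<Loop> loops, double potentialTol, double centerTol)
{
    // sort by potential so that candidates for duplicates end up next to each other
    std::sort(loops.begin(), loops.end(),
              [](const Loop& a, const Loop& b) { return a.potential < b.potential; });

    std::vector<Loop> unique;
    for (const auto& loop : loops) {
        bool duplicate = false;

        // walk back only through the already kept loops that have about the same potential
        for (auto it = unique.rbegin(); it != unique.rend() && loop.potential - it->potential <= potentialTol; ++it) {
            if (std::hypot(loop.centerX - it->centerX, loop.centerY - it->centerY) <= centerTol) {
                duplicate = true;
                break;
            }
        }

        if (!duplicate) unique.push_back(loop);
    }

    return unique;
}
```

Two loops at the same potential that go around different charges have different weight centers, so they are both kept, while the same loop found twice, starting from two different charges, is kept only once.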
Here is a short description of the classes in the project, also available in the README file:
First, the classes generated by the MFC Application Wizard:
CAboutDlg – What the name says, I don’t think it needs more explanation.
CChildFrame – It’s an MDI application, this is the child frame. No change done to the generated class.
CMainFrame – Just a little bit of change to the generated one. The main one is the addition of the OnViewOptions() handler which displays the options property page.
CElectricFieldDoc – The document. Contains the ‘field lines calculator’ object. It’s the base class for derived documents which implement particular charge configurations. Has a couple of GetData methods that deal with data retrieval and adjustment from the calculator and a CheckStatus that forwards the call to the calculator object, to check if the computing threads finished. The OnCloseDocument method, called when a document is closed, checks to see if the threads were finished. If yes, the calculator object is deleted; if not, it is moved into the application object (after asking the threads for termination), where it stays until the threads are finished.
CElectricFieldView – The view class. It is changed to use Direct2D for displaying, except print preview and printing where it still uses GDI. The drawing is delegated to the charges and field lines objects through the field object that is contained in the calculator (contained in the document). Has a timer set that allows checking for field lines data availability in the document.
CElectricFieldApp – The application class. Has minor changes and lots of document template additions, one for each charge configuration from the program. Has a vector of ‘calculators’ that contains calculator objects from the documents that were closed before the threads finished. ExitInstance calls OnIdle until all threads are finished. OnIdle removes and deletes the calculator objects which have the computing threads finished.
The classes that deal with settings:
Options – This is the settings class. Load loads them from the registry, Save writes them into the registry.
OptionsPropertySheet, ComputationPropertyPage, DrawPropertyPage – UI for the options. Not a big deal, should be easy to understand.
PrecisionTimer is a class that I used to measure some execution times. It is not currently used in the program, but it might be useful in some other projects, too.
Vector2D<T> – It’s very similar to Vector3D.
ComputationThread – The base class for a field line computation thread. There is not much to it, Start() starts it.
FieldLinesCalculator – Contains the field object, the jobs queue and the data from the calculating threads, and deals with the computing threads.
FieldLinesCalculator::CalcThread – The computing thread. Calculates electric field lines and equipotentials. Also posts equipotential jobs if it’s the first electric field line of a charge that it’s calculating.
TheElectricField – The electric field class. Contains the charges, the electric field lines and the equipotentials. Has methods to calculate the electric field and the potential and some other helper ones. Can draw itself by asking each charge and line to draw itself.
Charge – The charge class, has ‘charge’ and ‘position’. Can draw itself and can also calculate the electric field and the potential due to the charge it represents.
FieldLine – A field line. Instances of it are electric field lines. Contains the points and has the drawing methods and the Bézier adjustment code.
PotentialLine – Derived from the above, just to contain the potential and the weight center.
The RungeKutta namespace contains the classes for the Runge-Kutta methods.
RungeKutta<T, Stages> – It is the base class, implements Runge-Kutta. The methods are derived from it.
AdaptiveRungeKutta<T, Stages, Order> – Adaptive Runge-Kutta. The adaptive methods are derived from it.
For the other ones, please see the code.
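To make the role of these classes more concrete, here is a minimal sketch of how a field line can be followed with a classical fourth order Runge-Kutta step. It is not the project's code: Vec2, PointCharge, FieldAt, PotentialAt, Direction and RK4Step are hypothetical names, the units are chosen so that the Coulomb constant is 1, and a fixed step is used instead of the adaptive methods.

```cpp
#include <cmath>
#include <vector>

struct Vec2 { double x, y; };
inline Vec2 operator+(Vec2 a, Vec2 b) { return {a.x + b.x, a.y + b.y}; }
inline Vec2 operator-(Vec2 a, Vec2 b) { return {a.x - b.x, a.y - b.y}; }
inline Vec2 operator*(Vec2 a, double s) { return {a.x * s, a.y * s}; }

struct PointCharge { double q; Vec2 pos; };

// Electric field of the point charges (Coulomb constant set to 1).
inline Vec2 FieldAt(const std::vector<PointCharge>& charges, Vec2 p)
{
    Vec2 E{0., 0.};
    for (const auto& c : charges) {
        const Vec2 r = p - c.pos;
        const double len = std::hypot(r.x, r.y);
        E = E + r * (c.q / (len * len * len));
    }
    return E;
}

// Electric potential of the point charges.
inline double PotentialAt(const std::vector<PointCharge>& charges, Vec2 p)
{
    double V = 0.;
    for (const auto& c : charges) {
        const Vec2 r = p - c.pos;
        V += c.q / std::hypot(r.x, r.y);
    }
    return V;
}

// Unit vector along the field: a field line is a solution of dr/ds = E/|E|.
inline Vec2 Direction(const std::vector<PointCharge>& charges, Vec2 p)
{
    const Vec2 E = FieldAt(charges, p);
    return E * (1. / std::hypot(E.x, E.y));
}

// One classical RK4 step of length h along the field line passing through p.
inline Vec2 RK4Step(const std::vector<PointCharge>& charges, Vec2 p, double h)
{
    const Vec2 k1 = Direction(charges, p);
    const Vec2 k2 = Direction(charges, p + k1 * (h / 2));
    const Vec2 k3 = Direction(charges, p + k2 * (h / 2));
    const Vec2 k4 = Direction(charges, p + k3 * h);
    return p + (k1 + k2 * 2. + k3 * 2. + k4) * (h / 6);
}
```

Calling RK4Step repeatedly, starting close to a charge, gives the points of a field line. Following the perpendicular direction (-E.y, E.x) instead would trace an equipotential.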
The ChargeConfiguration namespace contains classes derived from the document. They implement the particular charge configurations.
I didn’t want to implement loading the document from an xml file; for how to do that, please see the previous post. I took advantage of the MFC document templates and hardwired the charge configurations in the program instead. Here is how one could extend them:
The first step would be to derive a class from CElectricFieldDoc. I put them all in the ChargeConfiguration namespace.
class CDipoleDoc : public CElectricFieldDoc
{
    DECLARE_DYNCREATE(CDipoleDoc)
public:
    virtual BOOL OnNewDocument();
};
Only OnNewDocument must be overridden. In there one adds the charges, then the calculation is started and that’s about it.
IMPLEMENT_DYNCREATE(CDipoleDoc, CElectricFieldDoc)

BOOL CDipoleDoc::OnNewDocument()
{
    if (!CElectricFieldDoc::OnNewDocument())
        return FALSE;

    // TODO: add reinitialization code here
    // (SDI documents will reuse this document)

    Charge charge;

    charge.position = Vector2D<double>(-2, 0);
    charge.charge = 1;
    calculator->field.charges.push_back(charge);

    charge.charge = -1;
    charge.position = Vector2D<double>(2, 0);
    calculator->field.charges.push_back(charge);

    calculator->StartCalculating(&calculator->field);

    return TRUE;
}
The second step is to add the document template in CElectricFieldApp::InitInstance().
CMultiDocTemplate* pDocTemplate;
pDocTemplate = new CMultiDocTemplate(IDR_DipoleTYPE,
    RUNTIME_CLASS(CDipoleDoc),
    RUNTIME_CLASS(CChildFrame), // custom MDI child frame
    RUNTIME_CLASS(CElectricFieldView));
if (!pDocTemplate)
    return FALSE;
AddDocTemplate(pDocTemplate);
The third step would be to add the corresponding resources: the string resource for names (as in IDR_DipoleTYPE \nDipole\nDipole\n\n\n\n), the menu resource (just copy the dipole one and change the id to whatever IDR_ you use) and the icon. Be sure that the IDR_ id matches the one used for the document template.
This program is far from perfect; I even mentioned in the text how it could be improved in some places. Nevertheless, it should be useful for visualizing the electric field for some charge configurations. I’m quite sure one could speed it up by tuning the step size and the precision, too. The speed could also be improved by limiting the number of steps. I put some limits in there but I did not tweak the values too much, I have limited patience for such things. For a zero net charge the program tries, with a lot of steps, to end all lines on some other charge, but even so it might not succeed in some cases. If all charges have the same sign, it does not calculate many steps, to speed up the execution, because no line will end on a charge of the same sign. For some high resolution printers, the field lines might terminate before reaching the margins if the charges are all of the same sign. If that’s the case, increase the number of steps.
As usual, if you have any suggestions for improvements or you find some bugs to be fixed, or some errors or things that are not clear enough in the text, please leave a comment.
The post Electric Field Lines first appeared on Computational Physics.
Newton is considered the one who started modern physics, so something related to his work seems an appropriate start for this blog.
Besides, I had an old program simulating a solar system that I wrote quite a while ago. I wrote the program with the old OpenGL fixed pipeline and leapfrog integration, on Linux, with glfw and now I decided to rewrite it using shaders. I considered it a good opportunity to refresh my memory about OpenGL so maybe I got a little carried away and added too much for the purpose of this blog, but about that, later.
Despite the simplicity of the subject this is a good start for various topics, like numerical methods for solving complicated differential equations or the field of molecular dynamics and before going into more complicated matters for those, this could be a good introduction, so here it is.
The subject is a very simple molecular dynamics program that simulates a solar system using Newtonian mechanics and the Newtonian law of universal gravitation, that is, an N-Body simulator (more specifically, a direct N-body simulation).
Before you run away, here is a video of the end product, maybe you decide to read on:
One does not need to know a lot of physics for dealing with such a subject. Oddly enough, even large scale simulations on supercomputers do not go much beyond this (the direct methods cost O(N²); with some loss of accuracy one can actually go down to O(N log N) or better, see the links provided for details): they still use Newtonian gravity and a simple integration method like the one used here. The reason is that, for the purpose, Newtonian mechanics is good enough. The simulated objects do not travel at large enough speeds and they are not subjected to gravitational fields strong enough for the difference to matter. The numerical errors are usually larger than the relativistic corrections.
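You need to know only Newtonian mechanics and Newtonian gravity and that’s about it. The emphasis is on $\vec{F} = \frac{d\vec{p}}{dt}$ or rather the better known $\vec{F} = m\vec{a}$, which one can use because the non-relativistic approximation is very good and also the masses of the simulated objects do not vary. If you want to simulate a rocket, you should be aware that it throws away some mass, though, so you’ll have to use the more general $\vec{F} = \frac{d\vec{p}}{dt}$. Obviously the law of universal gravitation is very important, too:
$$\vec{F}_{12} = -G \frac{m_1 m_2}{|\vec{r}_{12}|^2} \hat{r}_{12}$$
Here $\vec{F}_{12}$ is the force on body 1 due to body 2, $\vec{r}_{12} = \vec{r}_1 - \vec{r}_2$ and $\hat{r}_{12}$ is the unit vector along $\vec{r}_{12}$.
So, the force acting on the body i is:
$$\vec{F}_i = -G m_i \sum_{j \neq i} \frac{m_j}{|\vec{r}_{ij}|^2} \hat{r}_{ij}$$
This is the force from our plain old $\vec{F}_i = m_i \vec{a}_i$ (or if you like, $\vec{F}_i = \frac{d\vec{p}_i}{dt}$). The acceleration is due not to a single body, but to all the other bodies. So, not only the Earth is pulling on you but, poetically speaking, the whole Universe. You may notice that it’s not pulling only on you, but also on the Earth, and that’s a reason (I’ll mention others later) why you can ignore it.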
Since we are not really interested in the forces, but in the change of position, we actually need the accelerations. It is quite straightforward to implement, so without more stories, here is the code that calculates the acceleration:
inline void ComputationThread::CalculateAcceleration(BodyList::iterator& it, BodyList& Bodies)
{
    static const double EPS2 = EPS*EPS;
    Vector3D<double> r21;
    double length;

    // save the old value, Velocity Verlet needs it
    it->m_PrevAcceleration = it->m_Acceleration;
    it->m_Acceleration = Vector3D<double>(0., 0., 0.);

    // add the contributions of all the other bodies
    for (auto cit = Bodies.begin(); cit != Bodies.end(); ++cit)
    {
        if (cit == it) continue;

        r21 = cit->m_Position - it->m_Position;
        length = r21.Length();

        it->m_Acceleration += r21 * cit->m_Mass / ((length*length + EPS2) * length);
    }

    it->m_Acceleration *= G;
}
The code should be easy to understand: r21 is $\vec{r}_j - \vec{r}_i$, the vector from the body i to the body j, which gets rid of the minus sign due to the reversal of direction. EPS is there just for avoiding the singularity in collisions; it’s not really needed for the current program, but I added it for completeness. Basically it’s a very small value that avoids division by zero. m_PrevAcceleration saves the previous acceleration value, which is needed by the numerical algorithm used, that is, Velocity Verlet. r21/length is the versor (the unit vector). One could use r21.Normalize() for it, but here there is an optimization: the length of the vector is computed only once.
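Written as a formula, the loop in CalculateAcceleration appears to evaluate the expression below. The notation is chosen here and read off the code, it is not taken from the project: $\vec{r}_{ji} = \vec{r}_j - \vec{r}_i$ stands for r21 and $\epsilon$ for EPS.

```latex
\vec{a}_i = G \sum_{j \neq i} m_j \,
    \frac{\vec{r}_{ji}}{\left( |\vec{r}_{ji}|^2 + \epsilon^2 \right) |\vec{r}_{ji}|}
```

For $\epsilon = 0$ this reduces to the usual $G \sum_{j \neq i} m_j \vec{r}_{ji} / |\vec{r}_{ji}|^3$. It differs slightly from the more common Plummer softening, which has $(|\vec{r}_{ji}|^2 + \epsilon^2)^{3/2}$ in the denominator. With the form above the denominator still vanishes for two bodies at exactly the same position, while the Plummer form stays finite there.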
Molecular dynamics is basically this: you have a bunch of ‘particles’ in a physical system and you know the initial state, which is given by all the particle positions and momenta or, in the simple cases when the masses of the particles stay unchanged, their velocities. You want to know the evolution of the system over time. For that you calculate all particle interactions – and here it can get quite messy, you might need to deal with quantum mechanical interactions – and then, using them and the equations of motion, you change the particle positions and velocities to obtain the state of the system after a small interval of time. Do that a lot of times and you can extract a lot of information about the system. Hopefully I’ll get back to molecular dynamics on this blog so I won’t insist much on this subject here. Finally, here is the code that implements it:
void ComputationThread::Compute()
{
    unsigned int local_nrsteps;
    const double timestep = m_timestep;
    const double timestep2 = timestep*timestep;

    BodyList m_Bodies;
    GetBodies(m_Bodies);

    for (auto it = m_Bodies.begin(); it != m_Bodies.end(); ++it)
        CalculateAcceleration(it, m_Bodies);

    for (;;)
    {
        local_nrsteps = nrsteps;

        // do computations
        for (unsigned int i = 0; i < local_nrsteps; ++i)
            VelocityVerletStep(m_Bodies, timestep, timestep2);

        CalculateRotations(m_Bodies, local_nrsteps*timestep);

        // give result to the main thread
        SetBodies(m_Bodies);

        // is signaled to kill? also waits for a signal to do more work
        {
            std::unique_lock<std::mutex> lock(mtx);
            cv.wait(lock, [this] { return an_event > 0; });
            if (an_event > 1) break;
            an_event = 0;
        }
    }
}
After the first initialization phase, the for(;;) loop simply does the calculations using Velocity Verlet, advancing several time steps (as configured by the main thread) before returning a result to the main thread and waiting either for a request for more data or for exit.
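The handshake between the two threads can be sketched in isolation as below. This is a simplified illustration and not the project's ComputationThread: the Worker class, its members and the batch counter are hypothetical, the ‘computation’ is just a counter, and the event values mirror the ones suggested by the code above (1 for more work, 2 for termination).

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical worker: computes a batch, publishes it, then sleeps until asked again.
class Worker {
public:
    Worker() : m_thread(&Worker::Run, this) {}
    ~Worker() { Signal(Kill); m_thread.join(); }
    Worker(const Worker&) = delete;
    Worker& operator=(const Worker&) = delete;

    // Called by the main thread to ask for another batch of steps.
    void RequestMore() { Signal(MoreWork); }

    // Blocks until at least 'count' batches are published, returns the published step count.
    unsigned long WaitForBatches(unsigned int count) {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_done.wait(lock, [this, count] { return m_batches >= count; });
        return m_steps;
    }

private:
    enum Event { None = 0, MoreWork = 1, Kill = 2 };
    static const unsigned int StepsPerBatch = 10;

    void Signal(Event e) {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_event = e;
        }
        m_wake.notify_one();
    }

    void Run() {
        unsigned long localSteps = 0;
        for (;;) {
            // stands for the batch of integration steps, done without holding the lock
            for (unsigned int i = 0; i < StepsPerBatch; ++i) ++localSteps;

            std::unique_lock<std::mutex> lock(m_mutex);
            m_steps = localSteps; // give the result to the main thread
            ++m_batches;
            m_done.notify_all();

            // wait for a request for more work or for termination
            m_wake.wait(lock, [this] { return m_event != None; });
            if (m_event == Kill) break;
            m_event = None;
        }
    }

    std::mutex m_mutex;
    std::condition_variable m_wake;
    std::condition_variable m_done;
    Event m_event = None;
    unsigned long m_steps = 0;
    unsigned int m_batches = 0;
    std::thread m_thread; // declared last, so the other members exist when it starts
};
```

Because the event is a flag and not a counter, two requests posted before the worker wakes up count as one. For a simulation driven by a timer that is probably what one wants: the worker never falls behind with a queue of stale requests.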
In this case, the gravitational force range is infinite and there is no screening possible so one has to consider the interaction of a ‘particle’ with all the others. In other cases it might be simpler, when there is screening and/or the range of the interaction is limited one could consider only the neighbors of a ‘particle’ for interactions, using neighbors lists that are updated once in a while after performing many time steps in the simulation. That does not mean one cannot simplify/increase the speed of gravity simulations. In some cases one could ignore interactions with bodies that are far away and have no considerable mass, depending on what the purpose is. In other cases, a bunch of ‘particles’ can be considered a single body and a center of the mass approximation could be used or perhaps some first terms from a multipole expansion. Depending on the place they are one could use different time steps. For details and more methods, please visit the wikipedia links provided above. Anyway, for this simulator I chose to calculate all interactions between simulated bodies, although there are some simplifying assumptions.
One of them is ignoring the influence of the objects not simulated, obviously. I did not add all the moons and asteroids and comets… and the other stars in the Universe except the Sun. To see why it does not matter that much one could try to calculate the gravitational influence on him from his favorite star, the one that his astrologer claims to influence his life. It is instructive to compare it with the tiny gravitational influence from some nearby building or tree. Also one could notice that the Universe does not really care much about the direction in space so if the star that influences your life pulls you towards it, there might be another star in the other direction that cancels the pull (see below the comment about spherical symmetry for more). But let’s suppose there isn’t. Don’t forget that the tiny pull on you is also on the Earth, so you and the Earth are both falling towards that star with the same acceleration (and the star is falling towards you, but you should not care about that, it’s not going to hit you soon, to care that much, and besides it’s a matter of perspective, anyway). How did the astrologer say the star is influencing you and only you and some other chosen ones? You and the Earth and everybody else are free falling towards that star and the effect that could affect you is way tinier than even the tiny force you calculated suggests.
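The comparison suggested above takes only a few lines. The numbers are rough, assumed figures chosen for illustration (a star about twice as massive as the Sun at about the distance of Sirius, and a hypothetical ten thousand tonne building one hundred meters away), and the names are made up for this sketch.

```cpp
// Newtonian gravitational acceleration, in m/s^2, towards a mass (kg) at a distance (m).
inline double GravitationalAcceleration(double mass, double distance)
{
    const double G = 6.674e-11; // m^3 kg^-1 s^-2
    return G * mass / (distance * distance);
}

// Rough, assumed figures, for illustration only.
const double solarMass = 1.989e30;           // kg
const double lightYear = 9.461e15;           // m
const double starMass = 2. * solarMass;      // a star about twice as massive as the Sun
const double starDistance = 8.6 * lightYear; // about the distance to Sirius
const double buildingMass = 1e7;             // kg, a hypothetical large building
const double buildingDistance = 100.;        // m
```

With these figures GravitationalAcceleration(starMass, starDistance) is of the order of 10^-14 m/s², while GravitationalAcceleration(buildingMass, buildingDistance) is of the order of 10^-8 m/s², so the building wins by a factor of about a million.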
I’m sorry about the rant on astrology but it’s incredible how many still believe that (and other quite similar) bullshit.
Another assumption made is that the bodies are spheres with the mass distributed in a spherically symmetrical manner. Using Gauss theorem one can check that for a spherical shell with spherical symmetry of the mass distribution the gravitational field outside is as if all the mass is in the center of the sphere. Also the field inside is zero, a fact that was missed by a lot of science fiction stories about a hollow Earth or Moon or whatever hollow planet. From that it’s easy to see how a sphere with spherically symmetric mass distribution has the same gravitational effect as if its whole mass is in its center (again, outside of the sphere). To see why this is a quite good assumption one can calculate the field from a ‘dumbbell’ composed of two point masses placed at a small distance compared with the distances where they act. Or, more generally, one could consider the multipole expansion.
The numerical method used is Velocity Verlet but before going to that, I want to describe a method that is easier both to understand and to implement: the Euler method. Consider the derivative definition:
$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$$
You can already see that in our case velocity and acceleration play the role of the generic f and the variable is time. More often than not, you have no chance of solving the problem analytically (which is also the case for the N-Body problem, except when N=2 or, with restrictions, 3). You may have a chance to solve it numerically, with the computer. For that you have to forget the $\lim_{h \to 0}$ and use a finite interval instead. Call it $\Delta x$ and the derivative becomes (this is from the more general finite difference method, more specifically, the forward difference):
$$f'(x) \approx \frac{f(x + \Delta x) - f(x)}{\Delta x}$$
Arranging it a little you get the Euler method: $f(x + \Delta x) \approx f(x) + f'(x) \Delta x$. If f is the velocity and x is the time, $\vec{a}$ is f' so you have $\vec{v}(t + \Delta t) \approx \vec{v}(t) + \vec{a}(t) \Delta t$. Observing that the position is related to the velocity in the same way, $\vec{r}(t + \Delta t) \approx \vec{r}(t) + \vec{v}(t) \Delta t$, it is easy to see how one could use the Euler method to successively calculate positions and velocities one time step after another. The method is good for didactic purposes and for very special cases where the error does not matter so much (for a better Euler method, see: Semi-implicit Euler) because the error is locally proportional to $\Delta x^2$ and globally to $\Delta x$. One can figure it out by Taylor expanding and staring at it a little. Incidentally that’s another way of deriving the Euler method.
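As an illustration of the Euler method and of its first order global error, here is a minimal sketch. EulerIntegrate is a made up name and the function is not part of the project.

```cpp
#include <cmath>
#include <functional>

// Integrates f'(x) = derivative(x, f) from x0 to x1 in 'steps' Euler steps,
// starting from f(x0) = f0. Returns the approximation of f(x1).
inline double EulerIntegrate(const std::function<double(double, double)>& derivative,
                             double x0, double f0, double x1, unsigned int steps)
{
    const double dx = (x1 - x0) / steps;
    double x = x0;
    double f = f0;
    for (unsigned int i = 0; i < steps; ++i) {
        f += derivative(x, f) * dx; // advance along the tangent
        x += dx;
    }
    return f;
}
```

For f' = f with f(0) = 1 the exact value at x = 1 is e. Integrating with 100 and then with 200 steps roughly halves the error, as expected from a method whose global error is proportional to the step.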
Geometrically it is easy to see that while the derivative is given by the slope of the tangent to the function in the calculation point, the Euler method uses the chord instead (the line segment between the $x$ and $x + \Delta x$ points). More specifically, it advances in the direction of the tangent as if it were the chord. This already gives an opportunity for improving it, by seeing that for a well behaved function the slope of the chord is a better approximation not for the slope of the tangent at one end, but rather for the slope of the tangent at a point in the middle. It is not far from there to the midpoint method or Leapfrog integration. If you look at the last link, you can see that for advancing to the next position, the method uses the derivative (that is, the velocity) at a moment in the middle of the interval, not the derivative at its start. A similar thing happens with the calculation of the velocity: it uses the acceleration (which is obviously the derivative of the velocity) at a moment that’s in the middle between the previous velocity and the next one. The method bears the Leapfrog name because, as you can see, position and velocity jump over one another as they are calculated step by step.
I had an old program where I implemented the Leapfrog method, but for this post I decided to use Verlet integration. By now I think I’ve given enough information (less in the text, more in the links) for understanding Velocity Verlet. I mentioned the finite difference method and with the addition of the story about the chord versus the tangent one can see why the central difference is better. Applied to the acceleration, it gives the Verlet integration without velocities. Now with a little help from the Leapfrog method one should get Velocity Verlet, too. The details are in the links. By the way, hopefully you notice the similarities with $\vec{r} = \vec{r}_0 + \vec{v}_0 t + \frac{1}{2} \vec{a} t^2$ and $\vec{v} = \vec{v}_0 + \vec{a} t$. For the latter, the acceleration used is actually the average of the accelerations at the two moments of time. This is enough for now, so here is the code for the Velocity Verlet step:
inline void ComputationThread::VelocityVerletStep(BodyList& Bodies, double timestep, double timestep2)
{
    for (auto &body : Bodies)
        body.m_Position += body.m_Velocity * timestep + 0.5 * body.m_Acceleration * timestep2;

    for (auto it = Bodies.begin(); it != Bodies.end(); ++it)
    {
        CalculateAcceleration(it, Bodies);

        it->m_Velocity += (it->m_Acceleration + it->m_PrevAcceleration) * timestep * 0.5;
    }
}
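It should be very easy to understand: timestep2 is the square of the timestep (just an optimization to avoid recalculating it repeatedly in several places), the others should be self-describing. There are three parts: first the positions are advanced using the current velocities and accelerations, then the accelerations are recalculated for the new positions and finally the velocities are advanced using the average of the previous and the new accelerations.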
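To see the three stages of a Velocity Verlet step at work outside the program, here is a self-contained sketch for a single body orbiting a mass fixed in the origin. It is a simplification and not the project's code: the units are chosen so that G*M = 1, everything is in two dimensions, and State, Accelerate, Energy and Orbit are hypothetical names.

```cpp
#include <cmath>

struct State { double x, y, vx, vy, ax, ay; };

// Acceleration due to a mass fixed in the origin, in units where G*M = 1.
inline void Accelerate(State& s)
{
    const double r = std::hypot(s.x, s.y);
    s.ax = -s.x / (r * r * r);
    s.ay = -s.y / (r * r * r);
}

inline void VelocityVerletStep(State& s, double dt)
{
    const double oldAx = s.ax, oldAy = s.ay;
    // 1. positions, from the current velocity and acceleration
    s.x += s.vx * dt + 0.5 * oldAx * dt * dt;
    s.y += s.vy * dt + 0.5 * oldAy * dt * dt;
    // 2. accelerations at the new positions
    Accelerate(s);
    // 3. velocities, using the average of the old and the new accelerations
    s.vx += 0.5 * (oldAx + s.ax) * dt;
    s.vy += 0.5 * (oldAy + s.ay) * dt;
}

// Total energy per unit mass: kinetic plus potential.
inline double Energy(const State& s)
{
    return 0.5 * (s.vx * s.vx + s.vy * s.vy) - 1. / std::hypot(s.x, s.y);
}

// Follows a circular orbit of radius 1 for 'steps' steps and returns the final state.
inline State Orbit(unsigned int steps, double dt)
{
    State s{1., 0., 0., 1., 0., 0.};
    Accelerate(s); // the acceleration must be known before the first step
    for (unsigned int i = 0; i < steps; ++i) VelocityVerletStep(s, dt);
    return s;
}
```

The initial energy is -0.5 in these units. After many orbits, Energy(Orbit(10000, 0.01)) should still be very close to that value, which is the main reason to prefer such a method over Euler for orbits.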
If you did not get the code^{1} by now, you probably should get it. I put it on GitHub, here. From the pieces of code presented above, you probably figured out that the molecular dynamics code runs in a separate thread, implemented in a class named ComputationThread. There is not much to it besides what is already presented, except some things related to thread starting, synchronization and data access. Here is the method that calculates the rotations; for the others I’ll let you look into the code:
inline void ComputationThread::CalculateRotations(BodyList& Bodies, double timestep)
{
    for (auto &body : Bodies)
    {
        double angular_speed = TWO_M_PI / body.rotationPeriod;
        body.rotation += angular_speed * timestep;

        if (body.rotation >= TWO_M_PI) body.rotation -= TWO_M_PI;
        else if (body.rotation < 0) body.rotation += TWO_M_PI;
    }
}
There is not much to this method, hopefully you did not expect one that gives precession, nutation, that is, a full rigid body treatment. The first two lines in the for loop are pretty straightforward, they implement the simple rotation around the axis, the other two are there just to keep the value in the range.
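The single subtraction or addition keeps the angle in range only as long as a body turns by less than a full rotation between two calls. If the interval passed in could be longer than a rotation period (an assumption, it depends on how the simulation speed is configured), a wrap based on std::fmod handles any value. WrapAngle is a hypothetical helper, not part of the project.

```cpp
#include <cmath>

// Brings an angle, given in radians, into the [0, 2*pi) range, no matter how far out it is.
inline double WrapAngle(double angle)
{
    const double twoPi = 2. * std::acos(-1.);
    angle = std::fmod(angle, twoPi);    // now in (-2*pi, 2*pi)
    if (angle < 0) angle += twoPi;
    if (angle >= twoPi) angle -= twoPi; // guards against rounding up to exactly 2*pi
    return angle;
}
```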
Now, here is a short description of the classes (also in the README file):
First, the classes generated by the MFC Application Wizard:
CAboutDlg – What the name says, I don’t think it needs more explanation.
CMainFrame – Implements the main frame. Besides what was generated (adjusted to fit the application needs), I added code to deal with menu entries, some message routing to the view and that’s about it. CMainFrame::OnFileOpen deals with opening a configuration xml file for the solar system. CMainFrame::OnViewFullscreen, besides switching to full screen, also hides the MFC toolbar that allows closing the full screen.
CSolarSystemApp – Implements the application object. Not much change there, besides setting a path for the registry settings, setting the main window title and loading the OLE libs (needed for the MS XML parser).
CSolarSystemDoc – Contains and manages the computation thread (m_Thread) and the solar system data (m_SolarSystem). Most of the code deals with loading the xml file, using the MS XML parser. There is also some code that deals with the computation thread.
CSolarSystemView – The view. Deals with the OpenGL setup, displaying the scene (look into the Setup methods) and also with the keyboard and mouse message handling. CSolarSystemView::OnDraw is the drawing method. CSolarSystemView::RenderScene does the actual drawing, CSolarSystemView::RenderShadowScene is a quite simplified version of the previous, dealing only with shadows, CSolarSystemView::RenderSkybox draws the skybox. CSolarSystemView::MoonHack is a ‘moon hack’ that allows rescaling the distance between a planet and its moons (configurable in the xml file). It works for a reasonable number of bodies but there can be much better alternatives. It’s good enough for the purpose of the project. CSolarSystemView::KeyPressHandler is the key press handler. For camera movement, it just sets a value, the timer handler takes care of the actual camera movement. Both CSolarSystemView::OnMouseWheel and CSolarSystemView::OnLButtonDown move the camera, but the actual movement happens when the timer handler calls camera.Tick(). CSolarSystemView::OnTimer is the timer handler. In there the computation thread is signaled to do more calculations and the movements happen.
The classes that deal with settings:
Options – This is the settings class. Load loads them from the registry, Save writes them into registry.
OptionsPropertySheet, DisplayPropertyPage, CameraPropertyPage – UI for the options. I think the names are descriptive enough and the implementation is pretty straightforward.
Some UI classes:
CEmbeddedSlider, CMFCToolBarSlider – Implement the slider from the toolbar that allows setting the simulation speed. I used a Microsoft sample as an example to implement them. For some reason, the classes from the sample didn’t work for me ‘out of the box’ so I rewrote the classes using the ones from the sample as a guideline.
Solar system data:
SolarSystemBodies – just containing vectors of bodies and body properties.
Body – Contains data used in calculations, except m_Radius, which is not really needed but I included it in case I’ll use the code later and collisions will be involved.
BodyProperties – Contains information needed for displaying, like the texture.
Vector3D – Used as Vector3D<double> in the code. One reason I did not use the glm vector was that I had some old code for the camera that I wrote a while ago and it used this one. Another reason is that I’ll probably reuse it in other projects where glm will not be used.
There is quite a bit of OpenGL code in there and I cannot explain it in a detailed manner here. Instead I’ll point some links to some quite good OpenGL tutorials on the net. I’ve looked into some of them while writing this program, the resemblance of code might not be a coincidence, although there was no copy/paste (except the cube coordinates, I think, one would be quite masochistic to type them by hand and besides, there cannot be much originality there).
I think this is the best of them: Learn OpenGL^{2}. This one Learning Modern 3D Graphics Programming^{3} is oriented more towards theory. Here is another nice one: open-gl tutorial^{4}. One more: Modern OpenGL Series^{5}.
Now here is just a little bit of description:
The most important class in there I think is SolarSystemGLProgram – the GL program for displaying the solar system. Since it’s quite customized – that is, you probably couldn’t use it as it is in another program – I preferred to not include it in the OpenGL namespace. You’ll find in there the vertex and fragment shaders. Lighting is Blinn-Phong and although it looks like point lighting at a quick look on the screen, it’s actually directional lighting, just that the direction is changed for each object in the scene to be from the direction of the Sun (so it’s some sort of a hybrid between directional and point lighting). It should work with more than one light source but I did not test it. Shadows are omnidirectional shadows using a depth cubemap (only for a single Sun, sorry). Unfortunately shading behaves as for point lighting and it’s not very realistic, just look at the shadows of planets thrown on other planets. It was the best I could do in a short time. Since I mentioned the unrealistic shadows, lighting is also quite unphysical: the attenuation of light is linear with distance, instead of quadratic, but it looks nicer on screen.
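For readers who have not met Blinn-Phong before, here is the core of it written as plain C++ instead of shader code. It is only a sketch and not the shader from SolarSystemGLProgram: Vec3, Dot, Normalize and BlinnPhong are made up names, the material and light colors are left out, and the 1 / (1 + k * distance) attenuation is an assumed form of ‘linear with distance’.

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { double x, y, z; };
inline Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline double Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 Normalize(Vec3 v)
{
    const double len = std::sqrt(Dot(v, v));
    return {v.x / len, v.y / len, v.z / len};
}

// Diffuse plus specular intensity in a surface point.
// toLight and toViewer point from the surface towards the light and the camera.
inline double BlinnPhong(Vec3 normal, Vec3 toLight, Vec3 toViewer,
                         double shininess, double distance, double linearAttenuation)
{
    const Vec3 n = Normalize(normal);
    const Vec3 l = Normalize(toLight);
    const Vec3 v = Normalize(toViewer);

    const double diffuse = std::max(Dot(n, l), 0.);

    // the 'half vector' between the light and view directions is what makes it Blinn-Phong
    const Vec3 h = Normalize(l + v);
    const double specular = diffuse > 0. ? std::pow(std::max(Dot(n, h), 0.), shininess) : 0.;

    // assumed attenuation: falls off linearly, not with the square of the distance
    return (diffuse + specular) / (1. + linearAttenuation * distance);
}
```

In the hybrid scheme described above, toLight would be recomputed for each object as the direction towards the Sun, while the rest stays the same as for a directional light.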
Most of the OpenGL code is in the OpenGL namespace, but there is also quite a bit in the view code. There are several classes in the OpenGL namespace which I prefer to not describe one by one (yet?) but I’ll sketch the idea. They are some very simple wrappers for the OpenGL API. OpenGLObject is an abstract class that is used as a base class for a lot of them, it just wraps an ID. The names should be quite descriptive, one should see immediately what SkyBoxCubeMapProgram or ShadowCubeMapProgram do. Camera is the camera class, I had it already written but with the old OpenGL fixed pipeline (using gluLookAt) and that’s why it uses vector arithmetic – the Vector3D<double> mentioned above – instead of matrices or quaternions for rotations. There are some classes that are not used in this project, ComputeShader and MatrixPush: the first I might use in the future in other projects, the last one can be used in case you like the old glPushMatrix/glPopMatrix way. Cube is also not used. Sphere is obviously for drawing spheres and it’s used for all the suns/planets/moons in the program.
I might improve and extend the OpenGL namespace in the future, that’s why it might look unnecessarily complicated.
Besides mfc, already included with Visual Studio, the program obviously uses the OpenGL library and two more which you’ll have to download and install in order to compile the program: glm^{6} and glew^{7}. MS XML parser is also used.
Start and stop the simulation pressing space or by clicking the ‘run’ toolbar button or use the menu entry. Load a different xml file using File | Open. Change settings with View | Options. Enter Full Screen with View | Full Screen, exit with the escape key. Turn the camera towards a point by clicking on it. Move the camera forward or backward by using up and down keys, or you can use the scroll mouse wheel. Keep shift pressed and the camera will move up or down instead of forward or backward. With control key pressed, it will rotate up or down (pitch up or down). Left and right arrows will move the camera towards left or right, unless the control key is pressed, when the camera will yaw left or right. With shift, it will roll left or right. You can increase/decrease the speed of the simulation using the slider on the toolbar or +/- keys.
The structure is self-explanatory but I must say that the values are scaled in the committed file: the Sun size is increased 50 times, all planets are scaled up 1000 times, some moons (like Phobos and Deimos) are scaled up even more. Because of the scaling many moons would be inside of their planet so I had to scale the distance between the planet and the moons, too (see the ‘moon hack’). The Solar System is a very big place and one would have a hard time seeing the planets and moons without the scaling, so I did this just to look nicer on screen. Those changes should affect the visuals only. Due to laziness I put all planets and moons in the ecliptic plane; a realistic simulation requires more calculations than I’m willing to do. There might be a lot of mistakes in the values, too: I didn’t pay much attention to inclination and rotation period, for example, but others could be wrong as well. I used only the average orbital speed and the semi-major axis for setting the velocity and the distance, and all bodies start aligned.
Although they are free to use, I did not want to include the textures I used. You can download them yourself and put them in the /Textures folder. If you use another folder or different names than I used, you’ll have to edit the xml file. You may want to convert them to 24bpp and resize them to have dimensions as a power of 2. The code can deal only with 24 bpp textures. Here are the Sun/planets/moons textures that I used: Planet Texture Maps^{8}. I think I downloaded the sky box textures from here (the ‘Ame Nebula’ one).
Well, here is the end. If you have any suggestions for improvements or you find some bugs to be fixed, or some errors or things that are not clear enough in the text (I’m not a native English speaker, so that is expected), please leave a comment. Obviously such a project could benefit from many improvements, I resisted the urge to add a ring to Saturn using instancing, for example. Much more code could render the project hard to understand, so I stopped at this stage. Probably I would add bump/normal mapping to it if I could find both normal map textures and regular ones for lots of planets/moons. Unfortunately it’s hard to find both of them for enough bodies to be worth it, so here it is with no normal mapping.
Hopefully somebody will find this post and/or the code useful. If not, maybe one of the next ones.
The post Newtonian Gravity first appeared on Computational Physics.