Computational methods in condensed matter physics
Notes from the University La Sapienza course held by Prof. Giovanni Bachelet and Saverio Moroni
Tomarchio Luca
July 24, 2019
Commenti
Queste note sono state scritte e sono di proprietà di Luca Tomarchio. Esse nascono dalla costante attenzione riportata durante il corso di Computational Statistical Mechanics tenuto dai prof. Giovanni Bachelet e Saverio Moroni all'Università di Roma La Sapienza, per il percorso di laurea magistrale in fisica. Alcuni capitoli sono scritti in inglese poiché riassumono degli studi individuali.
Contents
- 1 Error analysis and probability concepts
- 1.1 Characteristic function
- 1.2 Central limit theorem
- 1.3 Limits of sequences of random variables
- 1.4 Maximum likelihood criterion
- 2 Discrete stochastic processes
- 2.1 Markov process
- 3 Introduzione al Quantum Monte-Carlo
- 3.1 Campionamento per importanza
- 3.2 Catene di Markov
- 3.3 Algoritmo di Metropolis
- 4 T = 0: Path integral Monte Carlo
- 4.1 Approssimazione primitiva
- 4.2 Stima dell’energia
- 4.3 Applicazione a particelle indistinguibili
- 4.3.1 Superfluidità
- 4.3.2 Condensazione di Bose
- 5 Ground state: Variational Quantum Monte-Carlo
- 5.1 Ottimizzazione della funzione d'onda: Correlated Sampling
- 5.1.1 Divergenze della varianza ai nodi
- 6 Projection Monte Carlo
- 6.1 Sistemi fermionici: Problema del segno
- 6.1.1 Fixed Node Approximation (FNA)
- 7 Diffusion Monte Carlo (DMC)
- 7.1 Importance sampling
- 8 Ab-initio approaches to the treatment of periodic systems
- 8.1 Lattice dynamics
- 8.2 Density functional theory
- 8.3 Richiami teoria degli elettroni di Bloch
- 9 Teoria dello pseudo-potenziale
- 9.1 Plane waves
- 10 Density-functional perturbation theory
- 11 Somme di Ewald
Computational and in particular Monte Carlo methods are heavily dependent on the fast, efficient production of streams of random numbers. Since physical processes, such as white noise, are too slow for modern computational speeds, random number sequences are produced using software. However, these numbers are only 'pseudo-random' due to the deterministic nature of the algorithm. Indeed, it is known that the random number generator has to be matched correctly to the computer and simulation to obtain better performances, hence, multiple methods are required:
- Congruential method, where a fixed number c is chosen along with a given seed X and subsequent numbers are generated by simple multiplication. X = (cX + a )MODNn 0 max. The best performance is obtained when X is odd. Experience has also shown that a good congruential generator is 231 - CONG, which has c = 16807, a = 0 and N = 231 - 1. Multiple congruential generators can be used to improve the efficiency, for example, we can use a second independent generator to compute a list of numbers, from which a second generator draws randomly.
- Shift Register algorithms, where a table of random numbers if first created (congruential) and a new random number is produced by combining two different existing numbers from the table. Xn = Xn-p XOR Xn-q where p and q must be properly chosen. The best choice for the pair (p, q) is to satisfy Xp + Xq + 1 = primitive, for instance, p = 250 and q = 103.
- Legged Fibonacci generators are a generalized class of the shift registers, where the XOR is substituted by other operators.
Properties of random number generators can be examined through uniformity tests, which break up the interval between 0 and 1 in multiple bins to check the uniform occupation of each bin, and overlapping M-tuple tests, which check the statistical properties of the number of times M-tuples of digits appear in the sequence of random numbers. A third way are the parking lot tests, where an m-dimensional plot is searched for regular structures. The m-coordinates of each point are determined by m-successive calls to the random number generator. The main point for these tests is to find residual errors which are smaller than the statistical ones obtained by the Monte Carlo algorithm.
There are some situations in which random numbers with different distributions are required. The most general way to perform this is to look at the integrated distribution function F(y) = ∫0y f(x)dx of the desired distribution f(x), generate a uniform distribution of random numbers y and then take the inverse relation x = F-1(y). For Gaussian problems, we can use the Box-Muller method:
x = (−2 ln x1)1/2 cos(2πx2), y = (−2 ln x1)1/2 sin(2πx2)
1 Error Analysis and Probability Concepts
The question of what to measure in a probabilistic system is nontrivial. In practice, one measures either a set of individual values of a random variable or alternatively, the measuring procedure may implicitly construct an average of some kind. It will not normally be possible to measure anything other than a few selected averages and thus, higher moments will be unavailable.
In contrast, when we measure individual events, we can construct averages of the observables by the obvious method:
$$ \bar{x} = \frac{1}{N} \sum_{n=1}^{N} x_n $$
Higher moments tell us only about the properties of unlikely large values of x, i.e., in practice, the most important quantities are related to the first and second moments. These latter are built theoretically as a covariance matrix:
$$ \text{var}(x) = \langle x_i^2 \rangle - \langle x_i \rangle \langle x_i \rangle \equiv \langle [x_i - \langle x_i \rangle]^2 \rangle $$
while experimentally, when only a finite set of points is known, the standard deviation can be computed as:
$$ \sigma = \sqrt{\frac{1}{N} \sum_{n=1}^{N} (x_n - \bar{x})^2} $$
These concepts are easily proved correct thanks to the law of large numbers. Supposing all the x have the same probability distribution, but without assuming independence, providing the covariance matrix to vanish sufficiently rapid as |n - m| → ∞, then:
$$ \lim_{N \to \infty} \bar{x} = \langle x \rangle $$
However, since importance sampling Monte Carlo methods correspond to a Markovian master equation by construction, dynamical behavior can affect the results for static, hence, a 'dynamic' correlation between subsequently generated observations requires the error to be modified in the form:
$$ \text{error}(\bar{x}) = \frac{\sigma}{\sqrt{N}} \sqrt{1 + 2\tau_x/\delta t} $$
where δt is the time interval between subsequent generated states and τ is the correlation time.
1.1 Characteristic Function
Since independence of variables (not just in pairs) is sometimes useful to have, we would like to find conditions for this to happen. If s is a vector and x the vector of random variables under examination, we define the characteristic function as:
$$ \phi(s) = \langle \exp(i \mathbf{s} \cdot \mathbf{x}) \rangle $$
which is a uniformly continuous function of its arguments for all finite real s. Furthermore, if the moments exist, then:
$$ \left. \frac{\partial^m}{\partial s_i^m} \phi(s) \right|_{s=0} = i^m \langle x_i^m \rangle $$
An important consequence is: given a sequence of probability densities, these converge to a limiting one if and only if the corresponding characteristic functions converge to the corresponding characteristic one for the limiting probability density.
Due to the one-to-one relation between p and φ, if the random variables are independent, then:
$$ \phi(s) = \phi_1(s_1) \phi_2(s_2) \dots \phi_N(s_N) $$
Henceforth, given the independent set of variables:
$$ \langle \exp(i \mathbf{s} \cdot \mathbf{x}) \rangle = \phi_n(\mathbf{s}) = \prod_n \phi_n(s_n) $$
It is common to introduce the parameters known as cumulants:
$$ c_m = \left. \frac{\partial^m}{\partial s_i^m} \left( -i \log \phi(s) \right) \right|_{s=0} $$
Their importance is related to the following equalities:
$$ c_1 = \mu, c_2 = \sigma^2, c_3 = \langle (x - \mu)^3 \rangle, c_4 = \langle (x - \mu)^4 \rangle - 3\sigma^4 $$
In particular, the third and fourth contributions can be related to the skewness (c_3/σ^3) and kurtosis (c_4/σ^4), meaning respectively the deviation from perfect symmetry of the distribution concerning the average, and the deviation from the Gaussian distribution.
1.2 Central Limit Theorem
Many variables are empirically well approximated by Gaussian distributions due to the central limit theorem, which states that a random variable composed of the sum of many parts, arbitrarily distributed without an independence condition, is Gaussian. The fulfillment of this theorem resides in the Lindeberg condition:
$$ \lim_{N \to \infty} \frac{1}{\sigma_N^2} \int_{|x|>t\sigma_N} x^2 p_n(x) \, dx = 0 \quad \forall t > 0 $$
The independence condition can be avoided thanks to the same reasons of the law of large numbers. The probability density and characteristic function for a Gaussian distribution of N variables are:
$$ p(x) = \frac{1}{\sqrt{(2\pi)^N \det \Sigma}} \exp\left(-\frac{1}{2} (x-\bar{x})^T \Sigma^{-1} (x-\bar{x}) \right) $$
$$ \phi(s) = \exp\left(i \bar{x}^T s - \frac{1}{2} s^T \Sigma s \right) $$
1.3 Limits of Sequences of Random Variables
Most of computational work consists of determining approximations to random variables functions, in which the concept of a limit of a sequence (of functions Xn) of random variables ω naturally arises. However, there is no unique way of defining such a limit. We say that a function converges almost certainly if it does it for every ω:
$$ \lim_{n \to \infty} X_n(\omega) = X(\omega) \quad \text{a.c.} $$
Another possibility is to look for the mean square deviation of Xn(ω) from X(ω), and we say that the convergence is in the mean square if:
$$ \lim_{n \to \infty} \langle (X_n(\omega) - X(\omega))^2 \rangle = 0 \quad \text{m.s.} $$
These two kinds of limits are the strongest possible, and their validity implies the stochastic limit:
$$ \lim_{n \to \infty} \mathbb{P}(|X_n - X| > \epsilon) = 0 \quad \forall \epsilon > 0 $$
1.4 Maximum Likelihood Criterion
Suppose to have a random variable X with an unknown distribution. We want to compute this distribution through the observation of the events xi, supposed independent and identically distributed (iid) with i = 1, .., N. Observing the histogram of the occurrences, we can infer the form of the distribution, for example Gaussian, such that we know which parameters to exploit to characterize it. The real unknown set of parameters θ will give a distribution ρ(xi|θ) different from the one computed with the known experimental features θ', ρ(xi|θ').
We now consider the divergence of Kullback-Leibler:
$$ D_{KL}(\theta_0, \theta) = \sum_i ρ(x_i|θ_0) \log\frac{ρ(x_i|θ_0)}{ρ(x_i|θ)} $$
which tells us how many bits are required to decode the real events x with an approximate distribution. Our goal is to minimize this quantity. Due to the operative form of the average, we can rewrite the relation as:
$$ D_{KL}(\theta_0, \theta) = \frac{1}{N} \sum_i \log\frac{ρ(x_i|θ_0)}{ρ(x_i|θ)} $$
Given the logarithm properties, a minimization of DKL will mean a maximization of the likelihood function:
$$ \mathcal{L}(\theta) = \prod_i ρ(x_i|θ) $$
2 Discrete Stochastic Processes
For stochastic processes we mean systems which evolve probabilistically in times or more precisely, systems in which a certain time-dependent random variable x(t) exists. Given a discrete set of times and their associated random variables, the system is completely described by the joint probability densities:
$$ p(x_1, t_1; x_2, t_2; \ldots) $$
In terms of these functions, one can also define conditional probability densities:
$$ p(x_1, t_1; \ldots | y_1, \tau_1; \ldots) = \frac{p(x_1, t_1; \ldots; y_1, \tau_1; \ldots)}{p(y_1, \tau_1; \ldots)} $$
valid independently of the ordering of the times, although it is usual to consider only times which increase from right to left: t1 ≥ t2 ≥ ... ≥ τ1 ≥ τ2 ≥ .... The concept of an evolution leads us to consider the conditional probabilities as predictions of the future values, given the knowledge of the past.
The concept of general stochastic process is very loose, since to define it we need to know at least all possible joint probabilities of the variables. If this is true, then the process is known as separable. When the variables are completely independent of its values in the past (or future) we have:
$$ p(x_1, t_1; \ldots) = \prod_i p(x_i, t_i) $$
Furthermore, if they are independent from time, we have the class of Bernoulli trials.
2.1 Markov Process
The next most simple idea is that of the Markov process, based on the Markov assumption, i.e., the requirement for the conditional probability to be:
$$ p(x_1, t_1; \ldots | y_1, \tau_1; \ldots) = p(x_1, t_1; \ldots | y_1, \tau_1) $$
where only the information of the last visited time is useful for the future prediction, therefore:
$$ p(x_1, t_1; x_2, t_2; \ldots; x_n, t_n) = p(x_1, t_1, t_2) p(x_2, t_2, t_3) \ldots p(x_n, t_n|x_{n-1}, t_{n-1}) $$
provided the increasing times from right to left. From the general validity of the relation:
$$ p(x_1, t_1, t_3) = \int dx_2 p(x_1, t_1, t_2; x_2, t_2, t_3) p(x_2, t_2, t_3) $$
applying the Markov assumption we obtain the Chapman-Kolmogorov equation:
$$ p(x_1, t_1, t_3) = \int dx_2 p(x_1, t_1, t_2) p(x_2, t_2, t_3) $$
which has many solutions, best understood by deriving its differential form in the continuum theory of stochastic process (found in relevant literature). For the discrete case, instead, the integral is substituted by a sum, and the conditional probabilities are now matrices.
3 Introduzione al Quantum Monte-Carlo
Lo scopo di introdurre i metodi Monte Carlo è quello di poter calcolare valori attesi di osservabili non ottenibili analiticamente. Il problema generale ha una forma in termini di matrice densità:
$$ hAi = Tr(\rho A) \quad \rho = e^{-\beta H} $$
dove H è un'hamiltoniana interagente, ovvero non separabile a causa delle interazioni tra i diversi corpi del sistema. Il valore atteso può essere scritto in termini parametrici come:
$$ hAi_{O(RR')} = \frac{\int dR dR' \rho(RR')A(R' R)}{\int dR \rho(R)} $$
dove la seconda integrazione al numeratore è causata dalla possibilità dell'operatore A di essere un'operatore non locale, mentre la prima nasce dall'applicazione della traccia nella forma parametrizzata. Nel caso di località (commuta a posizioni diverse) possiamo rappresentarlo come:
$$ hR|A|R' = A(R)\delta(R-R') $$
Il problema di questi integrali è che non possono essere risolti numericamente mediante metodi di quadratura; supponendo di dividere il dominio d'integrazione in n punti, l'errore commesso prenderebbe la forma 1/nα, con α parametro caratteristico del metodo, se volessi ridurre l'errore di un fattore 10 dovrei definire un nuovo reticolo di punti n' = n(10)D/α, con D dimensione del dominio. Da ciò si nota che per dimensioni elevate il metodo impiegherebbe troppo tempo per essere risolto entro una migliore precisione.
La difficoltà nel migliorare l'errore compiuto può essere risolta usando dei metodi stocastici, ove entra in gioco un parametro dinamico che varia casualmente. L'esempio più semplice che possiamo immaginare è quello di voler calcolare il rapporto tra l'area di un cerchio e quella di un quadrato circoscritto, si ha:
$$ \frac{\pi}{4} = \frac{\int_S dx dy}{\int_C dx dy} $$
Ciò può essere effettuato supponendo di generare dei punti all'interno del quadrato; questi potranno cadere o dentro o fuori dal cerchio, ma mi aspetto che, nel limite di numero di punti generati elevato:
$$ \frac{N_{\text{in}}}{N} $$
converga al risultato cercato. Possiamo riscrivere gli integrali come:
$$ \frac{\pi}{4} = \frac{\int_S dx dy \pi(x, y) f(x, y)}{\int_C dx dy \pi(x, y)} = 1 $$
dove:
$$ \pi(x, y) = \begin{cases} 1 & \text{se dentro il cerchio} \\ 0 & \text{altrimenti} \end{cases} $$
Il contributo π/π definisce la distribuzione normalizzata di generazione dei punti nel quadrato, che in questo caso è uniforme. Introducendo la variabile aleatoria:
$$ \xi = f(x_i, y_i) $$
con (xi, yi) i-esimo punto generato:
$$ \langle \xi \rangle = \frac{1}{N} \sum_i \xi_i = \frac{1}{N} \int P(s) ds $$
dove P(s) definisce la distribuzione di probabilità dei punti generati.
Scarica il documento per vederlo tutto.
Scarica il documento per vederlo tutto.
Scarica il documento per vederlo tutto.
Scarica il documento per vederlo tutto.
Scarica il documento per vederlo tutto.
Scarica il documento per vederlo tutto.
Scarica il documento per vederlo tutto.
Scarica il documento per vederlo tutto.
-
Advanced quantum mechanics
-
Quantum Machine Learning: Challenges and Opportunities
-
Tesi Gabriele Pisacane: Quantum clocks
-
Monte Carlo Methods - Prof Dulla - Appunti di Fabio Galizia