

CHAPTER 4


Ensemble theory


RESUMING where we left off in Chapter 2, if we had the phase-space probability density $\rho(p,q)$, we could calculate averages of phase-space functions, $\langle A\rangle=\int_\Gamma A(p,q)\,\rho(p,q)\,d\Gamma$, Eq. (2.53). For equilibrium systems, $\rho(p,q)=F(H(p,q))$, Eq. (2.57), where $F$ is an unknown function. In this chapter, we show that $F$ depends on the types of interactions the systems comprising the ensemble have with the environment (see Table 1.1). We distinguish three types of ensemble:




  1. Isolated systems: systems having a well-defined energy and a fixed number of particles. An ensemble of isolated systems is known as the microcanonical ensemble;¹



  2. Closed systems: systems that exchange energy with the environment but which have a fixed number of particles. An ensemble of closed systems is known as the canonical ensemble;



  3. Open systems: systems that allow the exchange of matter and energy with the environment. An ensemble of open systems is known as the grand canonical ensemble.



4.1.1 The microcanonical ensemble: Systems of fixed U,V,N


Phase trajectories of $N$-particle, isolated systems are restricted to a $(6N-1)$-dimensional hypersurface $S_E$ embedded in $\Gamma$-space (the surface for which $H(p,q)=E$). Referring to Fig. 2.5, for $M$ a set of points on $S_E$, the probability $P(M)$ that the phase point lies in $M$ is, using Eq. (2.64),


$$P(M)=\frac{1}{\Omega(E)}\int_{M\subset S_E}\frac{dS_E}{|\nabla_\Gamma H|},$$


(4.1)


where $\Omega(E)=\int_{S_E}dS_E/|\nabla_\Gamma H|$ is the volume² of $S_E$, Eq. (2.64). The expectation value of a phase-space function $\phi(p,q)$ is, using Eq. (2.65),


$$\langle\phi\rangle=\frac{1}{\Omega(E)}\int_{S_E}\phi(p,q)\,\frac{dS_E}{|\nabla_\Gamma H|}=\frac{1}{\Omega(E)}\int_\Gamma\delta\big(E-H(p,q)\big)\,\phi(p,q)\,d\Gamma\equiv\int_\Gamma\rho(p,q)\,\phi(p,q)\,d\Gamma,$$


(4.2)


where we’ve used the property of the Dirac delta function, that for a function $g:\mathbb{R}^N\to\mathbb{R}$, $\int_{\mathbb{R}^N}\delta(g(\boldsymbol{r}))\,f(\boldsymbol{r})\,d^N r=\int_{g^{-1}(0)}d\sigma\,f(\boldsymbol{r})/|\nabla g|$, where $d\sigma$ is the volume element of the $(N-1)$-dimensional surface specified by $g^{-1}(0)$. The phase-space probability density for the microcanonical ensemble, the microcanonical distribution, is, from Eq. (4.2),


$$\rho(p,q)=\frac{1}{\Omega(E)}\,\delta\big(E-H(p,q)\big).$$


(4.3)


 


Equation (4.3) indicates that, for an ensemble of isolated systems each having energy $E$, a point $(p,q)$ in $\Gamma$-space has probability zero of being on $S_E$ unless $H(p,q)=E$, and, if $H(p,q)=E$ is satisfied, $\rho(p,q)$ is independent of position on $S_E$.


We’re guided by the requirement that ensemble averages stand in correspondence with elements of the thermodynamic theory (Section 2.5). In that way, the macroscopic behavior of systems is an outcome of the microscopic dynamics of its components. The consistency of the statistical theory with thermodynamics is established in Section 4.1.2.8. Thermodynamic relations in each of the ensembles are most naturally governed by the thermodynamic potential having the same variables that define the ensemble. The Helmholtz energy $F=F(T,V,N)$ is the appropriate potential for the closed systems of the canonical ensemble (Section 4.1.2.9), and the grand potential $\Phi=\Phi(T,V,\mu)$ is appropriate for the open systems of the grand canonical ensemble (Section 4.1.3). We’ll see that the normalization factor on the probability distribution in the canonical ensemble, the canonical partition function $Z_{\rm can}$, Eq. (4.53), is related to $F$ by $Z_{\rm can}=\exp(-F/kT)$ (see Eq. (4.57)); likewise in the grand canonical ensemble, the grand partition function $Z_G$, Eq. (4.77), is related to $\Phi$ by $Z_G=\exp(-\Phi/kT)$ (see Eq. (4.76)). With that said, which quantity governs thermodynamics in the microcanonical ensemble? Answer: Consider the variables that specify the microcanonical ensemble. Entropy $S=S(U,V,N)$ is that quantity, with $\Omega=\exp(S/k)$ in Eq. (4.3).


4.1.2 The canonical ensemble: Systems of fixed T,V,N


We now find the Boltzmann-Gibbs distribution, the equilibrium probability density function for an ensemble of closed systems in thermal contact with their surroundings. The derivation is surprisingly involved—readers uninterested in the details could skip to Eq. (4.31).


4.1.2.1 The assumption of weak interactions


We take a composite system (system $A$ interacting with its surroundings $B$; see Fig. 1.9) and consider it an isolated system of total energy $E$. Let $\Gamma$ denote its phase space with canonical coordinates $\{p_i,q_i\}$, $i=1,\dots,3N$, with $A$ having coordinates $\{p_i,q_i\}$ for $i=1,\dots,3n$ and $B$ having coordinates $\{p_i,q_i\}$ for $i=3n+1,\dots,3N$, where $N\gg n$. We can write the Hamiltonian of the composite system in the form $H(A,B)=H_A(A)+H_B(B)+V(A,B)$, where $H_A$ ($H_B$) is a function of the canonical coordinates of system $A$ ($B$), and $V(A,B)$ describes the interactions between $A$ and $B$ involving both sets of coordinates. The energies $E_A\equiv H_A$, $E_B\equiv H_B$ far exceed the energy of interaction $V(A,B)$ because $E_A$ and $E_B$ are proportional to the volumes of $A$ and $B$, whereas $V(A,B)$ is proportional to the surface area of contact between them (for short-range forces). For macroscopic systems, $V(A,B)$ is negligible in comparison with $E_A$, $E_B$. Thus, we take


$$E=E_A+E_B,$$


(4.4)


the assumption of weak interaction between $A$ and $B$ (even though “no interaction” might seem more apt). We can’t take $V(A,B)\equiv 0$ because $A$ and $B$ would then be isolated systems. Equilibrium is established and maintained through a continual process of energy transfers between system and environment; taking $V(A,B)=0$ would exclude that possibility. For systems featuring short-range interatomic forces, we can approximate $E\approx E_A+E_B$ when the surface area of contact between $A$ and $B$ does not increase too rapidly in relation to bulk volume (more rapidly than $V^{2/3}$ is too fast).³ No matter how small in relative terms the energy of interaction $V(A,B)$ might be, it’s required to establish equilibrium between $A$ and $B$. We’re not concerned (in equilibrium statistical mechanics) with how a system comes to be in equilibrium (in particular how much time is required). We assume that, in equilibrium, $E_A,E_B\gg|V(A,B)|$, with Eq. (4.4) as a consequence.


 


4.1.2.2 Density of states function for composite systems


The phase volume of the composite system associated with energy E is found from the integral


$$V(E)=\int_{\{E_A<E\}}d\Gamma_A\int_{\{E_B<E-E_A\}}d\Gamma_B\equiv\int_{\{E_A<E\}}V_B(E-E_A)\,d\Gamma_A,$$


(4.5)


where $\{E_A<E\}$ indicates all points of $\Gamma_A$ for which $E_A<E$, and $\{E_B<E-E_A\}$ indicates the points of $\Gamma_B$ for which $E_B<E-E_A$. As an example of the integral in Eq. (4.5), suppose one wanted to find the area of the $xy$-plane for which $x+y\le E$, where $x,y\ge0$. One would have the integral $\int_{0<x<E}dx\int_{0<y<E-x}dy=\int_0^E dx\,(E-x)$. From Eq. (2.64), $d\Gamma_A=\Omega_A(E_A)\,dE_A$, and thus Eq. (4.5) can be written $V(E)=\int_0^E\Omega_A(E_A)\,V_B(E-E_A)\,dE_A$. Because $V_B(E-E_A)=0$ for $E_A>E$, the limit of integration can be extended to infinity:


$$V(E)=\int_0^\infty\Omega_A(y)\,V_B(E-y)\,dy.$$


(4.6)


By differentiating Eq. (4.6) with respect to E, we have from Eq. (2.64) a composition rule for density-of-state functions when the total energy is conserved and shared between subsystems:


$$\Omega(x)=\int_0^\infty\Omega_A(y)\,\Omega_B(x-y)\,dy\equiv\Omega_A*\Omega_B.$$


(4.7)


The density of states function of a composite system $\Omega$ is a convolution of the density-of-states functions of the subsystems, $\Omega_A$ and $\Omega_B$.



Example. The density of states function for a free particle in $n$ spatial dimensions, Eq. (2.19), which we denote $g^{(n)}(E)$, satisfies Eq. (4.7),


$$g^{(n+m)}(E)=\int_0^E g^{(n)}(y)\,g^{(m)}(E-y)\,dy,$$


(4.8)


where the limit of integration is finite because $g(E-y)=0$ for $y>E$. See Exercise 4.4.
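The convolution rule is easy to check numerically. Below is a minimal sketch, assuming the free-particle form $g^{(n)}(E)\propto E^{n/2-1}$ with the constant prefactors of Eq. (2.19) suppressed (so only the $E$-dependence is tested); the convolution integral then evaluates analytically to the Euler beta function $B(n/2,m/2)\,E^{(n+m)/2-1}$.

```python
# Numerical check of Eq. (4.8) for free-particle densities of states.
# Assumption: g^(n)(E) = E^(n/2 - 1), prefactors suppressed.
from math import gamma
from scipy.integrate import quad

def g(n, E):
    """E-dependence of the free-particle density of states in n dimensions."""
    return E**(n / 2.0 - 1.0)

n, m, E = 3, 4, 2.5
conv, _ = quad(lambda y: g(n, y) * g(m, E - y), 0.0, E)     # left side of Eq. (4.8)
exact = gamma(n / 2) * gamma(m / 2) / gamma((n + m) / 2) * g(n + m, E)
print(conv, exact)   # the two numbers agree
```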


Equation (4.7) generalizes to a system of $n$ subsystems, where $x_i$ denotes the energy of the $i$th subsystem:


$$\Omega(x)=\int_0^\infty\prod_{i=1}^{n-1}\Omega_i(x_i)\,dx_i\;\Omega_n\Big(x-\sum_{i=1}^{n-1}x_i\Big),$$


(4.9)


where the multiple integration is over the $(n-1)$-dimensional space for which $x_i>0$. The limits of integration can be extended to infinity because $\Omega_n(y<0)=0$. Equation (4.9) is a key result in the development of the theory; it follows by induction from $n$ to $n+1$ by decomposing $\Omega_n$ into two components and using Eq. (4.7).


4.1.2.3 Probability distribution for subsystems


For an isolated system of energy $E$ consisting of system $A$ together with its environment $B$, let $M_A$ be a set of points in $\Gamma_A$ having canonical coordinates $\{p_i,q_i\}$, $i=1,\dots,3n$. Let $M$ be a set of points in $\Gamma$-space containing $M_A$, such that the first $3n$ pairs of its canonical coordinates $\{p_k,q_k\}$, $k=1,\dots,3N$, belong to $M_A$. (In math speak, $M_A$ is embedded in $M$.) Thus, phase points having coordinates $\{p_i,q_i\}$, $i=1,\dots,3n$ belong to $M_A$ if and only if the phase point of $\Gamma$-space belongs to $M$. The probability that a phase point of $A$ lies in $M_A$ coincides with the probability that the phase point of $\Gamma$ lies in $M$. Using Eq. (4.1), which pertains to an isolated system,


$$P(M_A)=P(M)=\frac{1}{\Omega(E)}\int_{M\subset S_E}\frac{dS_E}{|\nabla_\Gamma H|}=\frac{1}{\Omega(E)}\int_{S_E}1_M\,\frac{dS_E}{|\nabla_\Gamma H|}=\frac{1}{\Omega(E)}\frac{d}{dE}\int_{\{H<E\}}1_M\,d\Gamma,$$


(4.10)


where we’ve introduced the indicator function $1_M$,


$$1_M(x)\equiv\begin{cases}1&x\in M\\0&x\notin M\end{cases}$$



and we’ve used Eq. (2.65). The final integral in Eq. (4.10) directs us to find the phase volume of M for energies up to and including E. Thus (compare with Eq. (4.5)),


$$V(E)\equiv\int_{\{H<E\}}1_M\,d\Gamma=\int_{\Gamma_A}1_M\,d\Gamma_A\int_{\{E_B<E-E_A\}}d\Gamma_B=\int_{\Gamma_A}1_M\,V_B(E-E_A)\,d\Gamma_A=\int_{M_A}V_B(E-E_A)\,d\Gamma_A.$$


(4.11)


Combining Eqs. (4.10) and (4.11),


$$P(M_A)=\frac{1}{\Omega(E)}\frac{d}{dE}\int_{M_A}V_B(E-E_A)\,d\Gamma_A=\frac{1}{\Omega(E)}\int_{M_A}\Omega_B(E-E_A)\,d\Gamma_A.$$


(4.12)


We therefore have, from Eq. (4.12), the phase-space probability density for system A:


$$\rho_A(p,q)=\frac{\Omega_B(E-E_A)}{\Omega(E)},$$


(4.13)


where $E_A=H_A(p,q)$. Equation (4.12) implies an energy distribution by setting $d\Gamma_A=\Omega_A(E_A)\,dE_A$. Thus,


$$\rho(E_A)=\frac{1}{\Omega(E)}\,\Omega_A(E_A)\,\Omega_B(E-E_A).$$


(4.14)


Equations (4.13) and (4.14) are the canonical distribution functions, subsystem probability distributions determined by the density of states functions for the subsystem and the environment.


It might seem that we’re done, but we’re only part way there. These formulas are not in a form we can use for calculations because we lack expressions for the density-of-states functions.⁴ The interpretation of these formulas is quite physical, however. The probability that system $A$ has energy $E_A$ is specified by a ratio of two numbers: the number of states system $A$ has available at energy $E_A$ multiplied by the number of states system $B$ has at energy $E-E_A$, to the number of states the composite system has at energy⁵ $E$. Our strategy in what follows is to develop an expression for $\Omega(x)$ appropriate to systems having large numbers of components; see Section 4.1.2.5.
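To see how physical Eq. (4.14) is, a small numerical sketch helps (this is not from the text: it assumes ideal-gas-like density-of-states functions $\Omega(E)\propto E^{3N/2-1}$ for both subsystems). The subsystem energy distribution is sharply peaked, and its relative width shrinks roughly as $n_A^{-1/2}$:

```python
# Sketch of Eq. (4.14) with Omega(E) ~ E^(3N/2 - 1) for both subsystems
# (an ideal-gas-like assumption): the distribution of E_A narrows as the
# particle numbers grow.
import numpy as np

E = 1.0                                    # total energy (arbitrary units)
for nA, nB in [(10, 100), (100, 1000), (1000, 10000)]:
    a, b = 1.5 * nA - 1, 1.5 * nB - 1      # exponents in Omega_A, Omega_B
    EA = np.linspace(1e-6, E - 1e-6, 200_000)
    logrho = a * np.log(EA) + b * np.log(E - EA)   # log of Eq. (4.14), unnormalized
    rho = np.exp(logrho - logrho.max())
    dE = EA[1] - EA[0]
    rho /= rho.sum() * dE                  # normalize numerically
    mean = (EA * rho).sum() * dE
    std = np.sqrt(((EA - mean) ** 2 * rho).sum() * dE)
    print(nA, mean, std / mean)            # relative width falls ~ 1/sqrt(nA)
```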


4.1.2.4 Partition function: Laplace transform of the density of states function


The canonical distribution requires that we know the density-of-states functions of the system and the surroundings. Density-of-states functions satisfy a convolution relation, Eq. (4.9). By the convolution theorem,⁶ the integral transform $\mathcal{T}$ of a convolution integral is equal to the product of the transforms of the functions appearing in the convolution, $\mathcal{T}(f*g)=\mathcal{T}(f)\,\mathcal{T}(g)$. The Laplace transform of $\Omega(x)$ is known as the partition function:⁷


 


$$Z(\alpha)\equiv\int_0^\infty e^{-\alpha x}\,\Omega(x)\,dx,$$


(4.15)


where $\alpha>0$. The Laplace transform is the natural choice of transform (as opposed to Fourier) because $\Omega(x<0)=0$. We’ll soon give $\alpha$ a physical interpretation, but for now we treat it as a mathematical parameter.⁸


Partition functions obey a simple composition law. By taking the Laplace transform of Eq. (4.9), we find for a system composed of $n$ subsystems ($Z_i(\alpha)$ is the Laplace transform of $\Omega_i(x)$),


$$Z(\alpha)=\prod_{i=1}^n Z_i(\alpha),$$


(4.16)


where the parameter $\alpha$ applies to all subsystems.⁹ Equation (4.16) implies a useful feature of partition functions. We can consider each molecule of a system as a subsystem! For a gas of $N$ identical molecules, if $z(\alpha)$ is the partition function of a single molecule (Laplace transform of its density of states function), $Z(\alpha)=[z(\alpha)]^N$.
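As a concrete instance (a sketch under the same free-particle assumption as in the example above, prefactors suppressed): the one-molecule density of states $g(E)\propto\sqrt{E}$ has Laplace transform $z(\alpha)=\Gamma(3/2)/\alpha^{3/2}$, and the $N$-molecule partition function then follows from Eq. (4.16).

```python
# Laplace transform of the one-molecule density of states g(E) = sqrt(E)
# (prefactors suppressed), and the composition law Z = z^N of Eq. (4.16).
import numpy as np
from math import gamma, isclose
from scipy.integrate import quad

alpha = 2.0
z_num, _ = quad(lambda E: np.exp(-alpha * E) * np.sqrt(E), 0.0, np.inf)
z_exact = gamma(1.5) / alpha**1.5        # standard Laplace transform of E^(1/2)
assert isclose(z_num, z_exact, rel_tol=1e-6)

N = 100
lnZ = N * np.log(z_num)                  # ln Z for N identical molecules
print(z_num, lnZ)
```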


The partition function is always positive. A positive function Z(α) is log-convex if lnZ(α) is convex (see Section C.1 for the definition of convex function). From Eq. (4.15), Z(α) is log-convex:


$$\frac{d^2}{d\alpha^2}\ln Z(\alpha)=\frac{Z''(\alpha)}{Z(\alpha)}-\left(\frac{Z'(\alpha)}{Z(\alpha)}\right)^2=\frac{1}{Z(\alpha)}\int_0^\infty\left(x+\frac{Z'(\alpha)}{Z(\alpha)}\right)^2 e^{-\alpha x}\,\Omega(x)\,dx>0.$$


(4.17)


In the next subsection, we use the fact that, based on the convexity of $\ln Z$, there is but one solution of the equation ($a>0$):¹⁰


$$-\frac{d}{d\alpha}\ln Z(\alpha)=-\frac{Z'(\alpha)}{Z(\alpha)}=a.$$


(4.18)
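For instance (a sketch, again with the free-particle form, for which $\ln Z(\alpha)=-(3N/2)\ln\alpha+\text{const}$): $-d\ln Z/d\alpha=3N/(2\alpha)$ is strictly decreasing, so Eq. (4.18) has exactly one root, $\alpha=3N/(2a)$, easily found by bracketing.

```python
# Uniqueness in practice: because ln Z is convex, -(d/d alpha) ln Z is
# monotone, so Eq. (4.18) can be solved by simple bracketing.
# Assumption: free-particle ln Z = -(3N/2) ln(alpha) + const.
from scipy.optimize import brentq

N, a = 100.0, 75.0
f = lambda alpha: 3 * N / (2 * alpha) - a   # -(d/d alpha) ln Z - a
alpha_star = brentq(f, 1e-9, 1e9)           # one sign change -> one root
print(alpha_star, 3 * N / (2 * a))          # both give 2.0
```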


4.1.2.5 Density of states function from the central limit theorem


One can devise a probability density having the partition function as its normalizing constant. Define, for α>0,


$$U(\alpha,x)\equiv\begin{cases}\dfrac{1}{Z(\alpha)}\,e^{-\alpha x}\,\Omega(x)&x>0\\[4pt]0&x\le 0.\end{cases}$$


(4.19)


The functions $\{U(\alpha,x)\}$ are a family of probability densities (each meets the requirements in Eq. (3.26)), one for each value of $\alpha$. We’ll single out a member of the family as physically relevant through the requirement that it generate the energy expectation value.¹¹


 


Using U(α,x) as a probability density,


$$\langle E\rangle=\int_0^\infty x\,U(\alpha,x)\,dx=\frac{1}{Z(\alpha)}\int_0^\infty x\,e^{-\alpha x}\,\Omega(x)\,dx=-\frac{Z'(\alpha)}{Z(\alpha)}=-\frac{d}{d\alpha}\ln Z(\alpha),$$


(4.20)


where the middle equality follows from Eq. (4.15). If we knew $Z(\alpha)$, we would have the first moment of $U(\alpha,x)$, and there is but one solution of Eq. (4.20) for $\langle E\rangle>0$. Among the functions $U(\alpha,x)$, there is precisely one that generates a given expectation value. Moreover, the variance of $U(\alpha,x)$ is (see Eq. (3.34))


$$\langle x^2\rangle-\langle x\rangle^2=\int\big(x-\langle x\rangle\big)^2\,U(\alpha,x)\,dx=\frac{Z''(\alpha)}{Z(\alpha)}-\left(\frac{Z'(\alpha)}{Z(\alpha)}\right)^2=\frac{d^2}{d\alpha^2}\ln Z(\alpha),$$


(4.21)


where we’ve used Eq. (4.17). The first two moments of U(α,x) are known if Z(α) is known.


For a system of n subsystems, we find by combining Eq. (4.19) with Eq. (4.9),


$$U(\alpha,x)=\int\prod_{i=1}^{n-1}U_i(\alpha,x_i)\,dx_i\;U_n\Big(\alpha,\,x-\sum_{i=1}^{n-1}x_i\Big),$$


(4.22)


where we’ve used Eq. (4.16). Thus, $U(\alpha,x)$ is composed of the functions $U_i(\alpha,x_i)$ for the subsystems, $i=1,\dots,n-1$, including that of the environment, $U_n$, the “keeper” of the total energy (here denoted by $x$). The convolution form of Eq. (4.22) is a consequence of the convolution property of $\Omega(x)$, Eq. (4.9), because of the special form of $U(\alpha,x)$: its dependence on the exponential function, and its normalization by the Laplace transform of $\Omega(x)$.


The upshot of Eq. (4.22) is that $U(\alpha,x)$ represents the probability of a sum of $n-1$ random variables, the probability that the energy of the combined system is the sum of the energies of its subsystems (when the latter are considered independent random variables). For $n$ mutually independent random variables $\{x_i\}_{i=1}^n$, governed by probability densities $u_i(x_i)$ with moments $a_i\equiv\int x_i\,u_i(x_i)\,dx_i$ and $b_i\equiv\int(x_i-a_i)^2\,u_i(x_i)\,dx_i$, the central limit theorem provides, for large $n$, the form of the probability density for the sum $x\equiv\sum_{i=1}^n x_i$, which we may write as $U_n(x)$:


$$U_n(x)\xrightarrow{n\gg1}\frac{1}{\sqrt{2\pi B_n}}\exp\left(-\frac{(x-A_n)^2}{2B_n}\right),$$


(4.23)


where $A_n\equiv\sum_{k=1}^n a_k$ and $B_n\equiv\sum_{k=1}^n b_k$. The form of Eq. (4.23) relies only on the number of components a system has and not on the physical laws governing them. That plays to our hand in establishing statistical mechanics, which we want to apply to any macroscopic physical system.


Equation (4.23) is an asymptotic expression, valid for $n\to\infty$, but which in practice provides satisfactory results for relatively small values of $n$ (see for example Exercise 3.28). What if we use the asymptotic form, Eq. (4.23), in Eqs. (4.20) and (4.21)? Agreement with the results of Eqs. (4.20) and (4.21) (which are based on $U(\alpha,x)$) can be had if we stipulate the correspondences¹²


$$A_n\longleftrightarrow-\frac{d}{d\alpha}\ln Z(\alpha)\equiv\mu\qquad B_n\longleftrightarrow\frac{d^2}{d\alpha^2}\ln Z(\alpha)\equiv\sigma^2.$$


(4.24)


Using $\mu$ and $\sigma^2$ from Eq. (4.24) in Eq. (4.23), we have a normalized probability density having the same first two moments as $U(\alpha,x)$,


$$U(x)\equiv\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right).$$


(4.25)


 


The function $U(x)$ is seemingly independent of $\Omega(x)$, whereas $U(\alpha,x)$ explicitly depends on it (Eq. (4.19)). The two are connected, however: $\mu$ and $\sigma^2$ are derivatives of $\ln Z(\alpha)$, where $Z(\alpha)$ is the Laplace transform of $\Omega(x)$. The identity of the moments (first and second) obtained using the different functional forms of Eqs. (4.19) and (4.25) implies that the location of the maximum of $e^{-\alpha x}\Omega(x)$, and its shape near the maximum, approximate those of $U(x)$: the location of its mean and its shape near the mean. See the example on page 75.
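A numerical comparison makes the point. Assuming $\Omega(x)=x^{\nu-1}$ (so that $U(\alpha,x)$ is a Gamma density, with $\ln Z=\ln\Gamma(\nu)-\nu\ln\alpha$ and hence $\mu=\nu/\alpha$, $\sigma^2=\nu/\alpha^2$ from Eq. (4.24)), the exact density and the Gaussian of Eq. (4.25) become indistinguishable for large $\nu$:

```python
# Compare the exact U(alpha, x) (a Gamma density under the assumption
# Omega(x) = x^(nu - 1)) with the Gaussian approximation U(x) of Eq. (4.25).
import numpy as np
from scipy.stats import gamma, norm

nu, alpha = 1500.0, 2.0                       # nu plays the role of system size
mu, sigma = nu / alpha, np.sqrt(nu) / alpha   # moments from ln Z, Eq. (4.24)
x = np.linspace(mu - 3 * sigma, mu + 3 * sigma, 7)
print(gamma.pdf(x, a=nu, scale=1 / alpha))    # exact U(alpha, x)
print(norm.pdf(x, loc=mu, scale=sigma))       # Gaussian U(x); nearly identical
```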


The function U(x) provides an approximation of U(α,x) that reproduces its first two moments. A loose end here is the value of α, which is uniquely determined through Eq. (4.20). Denote the value of α such that Eq. (4.20) is satisfied as β:


$$\mu=-\frac{d}{d\alpha}\ln Z(\alpha)\bigg|_{\alpha=\beta}=\langle E\rangle.$$


(4.26)


Likewise,


$$\sigma^2=\frac{d^2}{d\alpha^2}\ln Z(\alpha)\bigg|_{\alpha=\beta}.$$


(4.27)


With Eq. (4.25), we can invert Eq. (4.19) to obtain an approximate expression for $e^{-\beta x}\Omega(x)$ in the vicinity of the mean energy, $\mu$:


$$e^{-\beta x}\,\Omega(x)=\frac{Z(\beta)}{\sqrt{2\pi\sigma^2}}\,e^{-(x-\mu)^2/2\sigma^2}.$$


(4.28)


In particular, for $x=\mu=\langle E\rangle\equiv E$,


$$\Omega(E)=\frac{Z(\beta)}{\sqrt{2\pi\sigma^2}}\,e^{\beta E}.$$


(4.29)


4.1.2.6 Putting it together: The Boltzmann probability distribution


Referring to Fig. 1.9, let $A$ have $n$ particles and $B$ have $N-n$ particles, with $N\gg n$. From Eq. (4.24), $\mu=\mu_A+\mu_B$ and $\sigma^2=\sigma_A^2+\sigma_B^2$, where $\mu_A$, $\sigma_A^2$ are of order $n$ (they each refer to a sum of $n$ quantities), while $\mu_B$, $\sigma_B^2$ are of order $N-n\approx N$. Combining Eqs. (4.13), (4.28), and (4.29),


$$\rho_A(p,q)=\frac{1}{\Omega(E)}\times\Omega_B(E-E_A)=\frac{\sqrt{2\pi\sigma^2}}{Z(\beta)}\,e^{-\beta E}\times\frac{Z_B(\beta)}{\sqrt{2\pi\sigma_B^2}}\,e^{\beta(E-E_A)}\,e^{-(E-E_A-\mu_B)^2/2\sigma_B^2}=\sqrt{\frac{\sigma^2}{\sigma_B^2}}\,\frac{Z_B(\beta)}{Z(\beta)}\,e^{-\beta E_A}\,e^{-(E_A-\mu_A)^2/2\sigma_B^2}\xrightarrow{N\gg n}\frac{e^{-\beta E_A}}{Z_A(\beta)}\,e^{-(E_A-\mu_A)^2/2\sigma_B^2},$$


(4.30)


where we’ve used $E=\mu_A+\mu_B$, $\sigma^2/\sigma_B^2=1+\sigma_A^2/\sigma_B^2\approx1$ for $N\gg n$, and $Z(\beta)=Z_A(\beta)Z_B(\beta)$ from Eq. (4.16). Note that the total system energy $E$ drops out of Eq. (4.30).


The exponential in the final term of Eq. (4.30) can be approximated as unity: for $|E_A-\mu_A|\lesssim\sigma_A$, $\exp\big(-(E_A-\mu_A)^2/2\sigma_B^2\big)\approx1$ because $\sigma_B\gg\sigma_A$. If we let $N\to\infty$, the final exponential in Eq. (4.30) has the value $e^0=1$. When $A$ is much smaller than $B$ we have the canonical phase-space distribution (where we erase the label $A$)


$$\rho(p,q)=\frac{1}{\Omega(E)}\,\Omega_B(E-E_A)\xrightarrow{N\to\infty}\frac{1}{Z(\beta)}\,e^{-\beta H(p,q)},$$


(4.31)


where $E_A=H_A(p,q)$. The right side of Eq. (4.31) is the Boltzmann probability distribution; it depends on the size of system $B$ (the surroundings, before the limit $N\to\infty$ is taken), but not on the size of system $A$. It’s valid then even for a one-particle system! It remains to relate $\beta$, defined so that Eq. (4.26) is satisfied, to a thermodynamic parameter, a task we take up in Section 4.1.2.8. As an energy distribution, we have, combining Eqs. (4.31) and (4.14),


$$\rho(E)=\frac{1}{Z(\beta)}\,e^{-\beta E}\,\Omega(E),$$


(4.32)


which has the form of Eq. (4.19).


4.1.2.7 Getting to Boltzmann: A discussion


We’ve not taken the most direct path to arrive at the Boltzmann distribution. Our derivation has almost been a deductive process, but it’s not the case that you can start at “line one” and arrive at Eq. (4.31) purely through deduction. A genuinely new equation in physics can’t be derived from something more fundamental and is justified only a posteriori by the success of the theory based on it.¹³ One might say that the appearance of Eq. (4.19) in our derivation was fortuitous, but the form $e^{-\beta E}\Omega(E)$ presents itself naturally as a probability density having the partition function $Z$ as its normalizing constant, given that $Z$ presents itself naturally as the Laplace transform of the convolution relation for the density of states, Eq. (4.9). We’ve taken this path to support the claim (made on page 59) that the problem of statistical mechanics reduces to one in the theory of probability.¹⁴ In this subsection, we review some other ways to arrive at the Boltzmann distribution.


• The Boltzmann transport equation. Perhaps the easiest way is to consider stationary solutions of the Boltzmann transport equation, which models the nonequilibrium phase-space probability density ρ(p,q,t). This approach is outside the intended scope of this book, and the Boltzmann equation is not without its own issues that we can’t explore here. Suffice to say that Eq. (4.31) occurs as the steady-state solution of the Boltzmann equation, appropriate to the state of thermal equilibrium.


• As a postulate. In his formulation of statistical mechanics, Gibbs simply started with the form of Eq. (4.31), known as the Gibbs distribution. He assumed¹⁵ $P=e^\eta$ (because it’s “…the most simple case conceivable”), where $\eta=(\psi-\epsilon)/\Theta$ is a combination of three functions: $\psi$, related to the normalization, $e^{\psi/\Theta}\equiv Z^{-1}$, $\epsilon$ the energy, and $\Theta$ which he termed the modulus of the distribution.[19, p33] The linear dependence of $\eta$ on $\epsilon$ determines an ensemble which Gibbs called canonically distributed.¹⁶ The form $P=e^{(\psi-\epsilon)/\Theta}$ was “line one” for Gibbs.


• The most probable distribution. In equilibrium, entropy has the maximum value it can have subject to macroscopic constraints. The state of equilibrium is the most likely state, by far. Divide a system into $M$ subsystems, and let each subsystem have $n_i$ particles such that $\sum_{i=1}^M n_i=N$, a fixed number. Assume the energy is the sum of the subsystem energies, $E=\sum_{i=1}^M E_i$. The number of ways the system can have a configuration specified by a given set of numbers $\{n_i\}$ is, from Eq. (3.12),


$$W[n_1,n_2,\dots,n_k,\dots]=\frac{N!}{n_1!\,n_2!\cdots n_k!\cdots}$$



One finds the numbers $n_i$ that maximize $W$, from which emerges the canonical distribution, Eq. (4.31). See Appendix F, and the numerical sketch at the end of this subsection.


• The microcanonical ensemble. The canonical probability density can be derived from the microcanonical ensemble. We treat a composite system (system A together with its surroundings B) as an isolated system, for which we know the probability density, Eq. (4.3), and then sum over the environmental phase-space coordinates to obtain the probability density for a system that interacts with its surroundings. The microcanonical probability density for the combined system is a joint probability distribution which satisfies Eq. (3.22):


 


$$\rho_{\rm can}(A)=\int_{\Gamma_B}\rho_{\rm micro}(A,B)\,d\Gamma_B=\frac{1}{\Omega(E)\,h^{3N}}\int_{\Gamma_B}\delta\big(E-H_A-H_B\big)\,dp_B\,dq_B,$$


(4.33)


where we’ve used Eqs. (4.4), (4.3), and (2.51) ($d\Gamma_B=dp_B\,dq_B/h^{3N}$), with $N$ the number of particles in system $B$. The Hamiltonian $H_B$ is a sum of two terms, $H_B(p_B,q_B)=K(p_B)+V(q_B)$, the kinetic and potential energy functions for system $B$. First integrate over the momentum space associated with $\Gamma_B$:


$$\int_{\Gamma_B}\delta\big(E-H_A-K(p_B)-V(q_B)\big)\,dp_B\,dq_B\equiv\int_{q_B}\Omega\big(E-H_A-V(q_B)\big)\,dq_B,$$


(4.34)


where $\Omega$ here is the volume of the hypersurface in momentum space associated with kinetic energy $K=y$:


$$\Omega(y)=\int_{p_B}\delta\big(y-K(p_B)\big)\,dp_B=\frac{d}{dy}\left(\int_{\{K(p_B)\le y\}}dp_B\right),$$



where we’ve used Eq. (2.64). It’s straightforward to show, for $K=(1/2m)\sum_{i=1}^{3N}p_i^2$, that


$$\Omega(y)=\frac{(2m\pi)^{3N/2}}{\Gamma(3N/2)}\,y^{(3N/2)-1}\equiv C_N\,y^{(3N/2)-1}.$$


(4.35)


Combining Eqs. (4.33), (4.34), and (4.35), and for $M\equiv(3N/2)-1$,


$$\rho_{\rm can}(A)=\frac{C_N}{\Omega(E)\,h^{3N}}\int_{q_B}\big(E-H_A-V(q_B)\big)^M\,dq_B=\frac{C_N}{\Omega(E)\,h^{3N}}\,E^M\left(1-\frac{H_A}{E}\right)^M\int_{q_B}\left(1-\frac{V(q_B)}{E-H_A}\right)^M dq_B.$$


(4.36)


To make progress, we introduce two simplifying assumptions.¹⁷




  1. The energy per particle is a constant (in the notation introduced by Gibbs):


$$\frac{E}{N}=\frac{3}{2}\Theta=\text{constant}.$$


    (4.37)


    Equation (4.37) embodies equipartition of energy, that $\frac{3}{2}\Theta$ is the mean kinetic energy per particle. We’ll show that $\Theta=kT$, implying that the specific heat ($N^{-1}(\partial E/\partial T)_V$) is independent of temperature, a supposition having experimental support. It’s found (the law of Dulong and Petit) that the specific heats of chemical elements in the solid state have approximately the same value, independent of temperature, if the temperature is not too low. With Eq. (4.37), energy equipartition is built into statistical mechanics from the outset.¹⁸ The third law of thermodynamics requires that heat capacities vanish as $T\to0$,[3, Chapter 8] an observed property of matter at low temperature, and one that quantum mechanics accounts for in terms of the breakdown of energy equipartition. We anticipate that classical statistical mechanics will require revision.¹⁹


     



  2. We assume $H_A\ll E$, the assumption that system $A$ is much smaller than the environment.


Substituting $E=M\Theta$ in Eq. (4.36) (for $N\gg1$), and ignoring $H_A$ in relation to $E$, we have


$$\rho_{\rm can}(A)\approx\frac{C_N}{\Omega(E)\,h^{3N}}\,E^M\left(1-\frac{H_A}{M\Theta}\right)^M\int_{q_B}\left(1-\frac{V(q_B)}{M\Theta}\right)^M dq_B.$$



Using the Euler form of the exponential function, $e^x=\lim_{n\to\infty}\big(1+x/n\big)^n$, we have for large $M$,


$$\rho_{\rm can}(A)\approx\frac{C_N}{\Omega(E)\,h^{3N}}\,E^M\,e^{-H_A/\Theta}\int_{q_B}e^{-V(q_B)/\Theta}\,dq_B.$$



Now let $M\to\infty$. The terms multiplying $e^{-H_A/\Theta}$ approach a limiting value $D(\Theta)$ depending only on the parameter $\Theta$:


$$\rho_{\rm can}(A)\xrightarrow{M\to\infty}D(\Theta)\,e^{-H_A/\Theta},$$


(4.38)


where $D$ is the reciprocal of the normalization constant,


$$D^{-1}\equiv Z=\int_{\Gamma_A}e^{-H_A(p,q)/\Theta}\,d\Gamma_A.$$


(4.39)


Thus we arrive at the Boltzmann distribution, Eq. (4.31), where $\Theta\equiv\beta^{-1}$. In what follows, we’ll use the canonical distribution as in Eq. (4.31), parameterized by $\beta$.
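Returning to the “most probable distribution” route sketched above, here is a minimal numerical version (the energy levels and the mean energy per particle below are made-up illustrative values): maximizing the Stirling form of $\ln W$ subject to fixed particle number and fixed energy reproduces Boltzmann weights.

```python
# Numerical sketch of the most-probable-distribution argument: maximize
# ln W per particle (Stirling form, -sum p_i ln p_i) subject to fixed
# normalization and mean energy. Illustrative made-up energy levels.
import numpy as np
from scipy.optimize import minimize

levels = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
Ebar = 1.2                                       # fixed mean energy per particle

negentropy = lambda p: np.sum(p * np.log(p))     # -ln W / N, by Stirling
cons = ({'type': 'eq', 'fun': lambda p: p.sum() - 1.0},
        {'type': 'eq', 'fun': lambda p: p @ levels - Ebar})
res = minimize(negentropy, np.full(5, 0.2), method='SLSQP',
               bounds=[(1e-12, 1.0)] * 5, constraints=cons)
p = res.x

beta = np.log(p[0] / p[1])                       # p_i ~ exp(-beta E_i); levels 1 apart
boltzmann = np.exp(-beta * levels)
print(p)
print(boltzmann / boltzmann.sum())               # matches the maximizer
```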


4.1.2.8 Consistency with thermodynamics


A requirement on statistical mechanics is that it reproduce the laws of thermodynamics, a demand ensured by equating macroscopically measurable quantities with appropriate ensemble averages (Section 2.5). As we now show, the framework we’ve established is consistent with thermodynamics if we identify $\beta=(kT)^{-1}$ and if we modify the partition function with a multiplicative factor.


Internal energy is the energy of adiabatic work and heat is the difference between work and adiabatic work (Section 1.2). Adiabatically isolated systems interact with their surroundings through mechanical means only. For such systems, the internal energy U is the conserved energy of mechanical work done on the system, which is the same as the value of the Hamiltonian H. Thus, we equate U with the ensemble average of H:


$$U=\langle H\rangle=\int\rho(p,q)\,H(p,q)\,d\Gamma=-\left(\frac{\partial}{\partial\beta}\ln Z\right)_V,$$


(4.40)


where the final equality follows from Eq. (4.39) with $\Theta=\beta^{-1}$, and which will be recognized as Eq. (4.20) with $\alpha=\beta$. Equation (4.40) is perhaps the most useful formula in statistical mechanics.
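As a worked instance of Eq. (4.40) (a sketch using the ideal-gas partition function computed below in Eq. (4.47), written in terms of $\beta$): the logarithmic derivative gives $U=3N/(2\beta)=\frac{3}{2}NkT$, equipartition again.

```python
# Worked instance of Eq. (4.40) for the ideal gas:
# Z = V^N (2 pi m / (beta h^2))^(3N/2), cf. Eq. (4.47) with beta = 1/(kT).
import sympy as sp

beta, V, N, m, h = sp.symbols('beta V N m h', positive=True)
lnZ = N * sp.log(V) + sp.Rational(3, 2) * N * sp.log(2 * sp.pi * m / (beta * h**2))
U = -sp.diff(lnZ, beta)       # Eq. (4.40)
print(sp.simplify(U))         # 3*N/(2*beta), i.e., (3/2) N k T
```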


 


Work entails variations in a system’s extensive external parameters $\{X_i\}$, $\delta W=\sum_i Y_i\,\delta X_i$; Eq. (1.4). Adiabatic work $\delta W$ associated with variations $\delta X_i$ is reflected in changes $\delta H$ in the value of the Hamiltonian, provided the variations occur at an infinitesimal rate.²⁰ Thus,²¹


$$\delta W=\langle\delta H\rangle=\sum_i Y_i\,\delta X_i=\sum_i\left(\int\frac{\partial H}{\partial X_i}\,\rho\,d\Gamma\right)\delta X_i=-\frac{1}{\beta}\sum_i\frac{\partial}{\partial X_i}\ln Z\,\delta X_i.$$


(4.41)


The Hamiltonian must therefore be a function of the external parameters²² (as well as the canonical coordinates $(p,q)$), with


$$Y_i=\int\frac{\partial H}{\partial X_i}\,\rho\,d\Gamma=\left\langle\frac{\partial H}{\partial X_i}\right\rangle=-\frac{1}{\beta}\frac{\partial}{\partial X_i}\ln Z.$$


(4.42)


Thus, if we have the partition function $Z(\beta,\{X_i\})$, we can calculate the internal energy $U$ and the intensive quantities $Y_i$ from logarithmic derivatives of $Z$, Eqs. (4.40) and (4.42).


By the first law, the heat δQ transferred to a system is the difference between changes in energy δU, brought about by any means, and the adiabatic work done on the system δW. Thus, we define


$$\delta Q\equiv\delta U-\delta W=\delta\langle H\rangle-\langle\delta H\rangle.$$


(4.43)


In Chapter 1, we discussed the classification of variables that occur in thermodynamic descriptions. As another term, the quantities $\{X_i\}$ are referred to as deformation coordinates, because work involves “deformations” or extensions of the extensive variables associated with a system. For every equilibrium system, there is an additional, non-deformation or thermal coordinate. It’s possible to impart energy to isolated systems by means not involving changes in deformation coordinates, e.g., stirring a fluid of fixed volume adiabatically isolated from the environment. In any thermodynamic description there must be a thermal coordinate and at least one deformation coordinate.[3, p12] We assume therefore that the parameter $\beta$ is the thermal coordinate.²³ In what follows, we’ll use a variation symbol $\delta$, so that, for a function $f(\beta,\{X_i\})$,


$$\delta f\equiv\frac{\partial f}{\partial\beta}\,\delta\beta+\sum_i\delta X_i\,\frac{\partial f}{\partial X_i}.$$


(4.44)


Using Eqs. (4.40), (4.41), (4.43), and (4.44),


$$\delta Q=\delta U-\delta W=\frac{\partial}{\partial\beta}\left(-\frac{\partial\ln Z}{\partial\beta}\right)\delta\beta+\sum_i\frac{\partial}{\partial X_i}\left(-\frac{\partial\ln Z}{\partial\beta}\right)\delta X_i+\frac{1}{\beta}\sum_i\left(\frac{\partial}{\partial X_i}\ln Z\right)\delta X_i=-\frac{\partial^2\ln Z}{\partial\beta^2}\,\delta\beta-\sum_i\frac{\partial}{\partial X_i}\left(\frac{\partial\ln Z}{\partial\beta}-\beta^{-1}\ln Z\right)\delta X_i=\frac{1}{\beta}\,\delta\left(-\beta^2\frac{\partial}{\partial\beta}\big(\beta^{-1}\ln Z\big)\right),$$



where we’ve used the identities developed in Exercise 4.16. Thus,


$$\beta\,\delta Q=\delta\left(-\beta^2\frac{\partial}{\partial\beta}\big(\beta^{-1}\ln Z\big)\right).$$


(4.45)


 


That is, $\beta\,\delta Q$ is equal to a total differential, implying that $\beta$ is an integrating factor for $\delta Q$. Reaching for contact with thermodynamics,²⁴ we infer from Eq. (1.8) that $\beta$ is proportional to the inverse absolute temperature, $T^{-1}$. It’s quite remarkable that small variations in a heat-like quantity (defined in Eq. (4.43), which mimics the first law of thermodynamics) have, within the framework of statistical mechanics, an integrating factor $\beta$ as required by the second law of thermodynamics.²⁵ It’s not obvious this development could have been foreseen; it seems we’re on to something.


Let’s provisionally take $\beta=(kT)^{-1}$. In that case, we can equate the quantity in parentheses in Eq. (4.45) with entropy:


$$S=k\frac{\partial}{\partial T}\big(T\ln Z\big)+\text{constant}.$$


(4.46)


We find ourselves in an analogous situation to what we faced in Section 1.11: What’s the constant k (setting the scale of entropy) and what’s the constant in Eq. (4.46) (setting the zero of entropy)?


Let’s first address the multiplicative constant, $k$. And, just as in Section 1.11, we reach for the ideal gas. Use Eq. (4.39) (with $\Theta=\beta^{-1}=kT$) to calculate $Z$ for an ideal gas contained in volume $V$:


$$Z(N,V,T)=\frac{1}{h^{3N}}\int_V dq\int dp\,e^{-\beta H}=V^N\left(\frac{2\pi mkT}{h^2}\right)^{3N/2}=\big(V/\lambda_T^3\big)^N,$$


(4.47)


where $H=(1/2m)\sum_{i=1}^N\boldsymbol{p}_i^2$, $\lambda_T$ is the thermal wavelength, Eq. (1.65), and we’ve used $3N$ copies of Eq. (B.7). Note that we did not make use of the volume of a hypersphere, as we’ve done in previous calculations. The energy is not fixed in the canonical ensemble; we must sum over all possible momenta. Using Eq. (4.46) to calculate the entropy, with $Z$ given by Eq. (4.47),


$$S=Nk\left(\frac{3}{2}+\ln\big(V/\lambda_T^3\big)\right)+\text{constant}.\qquad\text{(wrong!)}$$


(4.48)


Equation (4.48) does not agree with the experimentally verified Sackur-Tetrode formula, Eq. (1.64). Let’s sidestep that issue for now, because where Eq. (4.48) gets it wrong is in the dependence on particle number. The pressure can be calculated from S using Eq. (1.24) (at constant N),


$$\frac{P}{T}=\left(\frac{\partial S}{\partial V}\right)_{T,N}=\frac{Nk}{V},$$


(4.49)


where we’ve used Eq. (4.48) for $S$. Equation (4.49) is of course the equation of state of the ideal gas, indicating that the constant $k$ in Eq. (4.46) is indeed the Boltzmann constant.²⁶


The additive “constant” in Eq. (4.48) is a function of particle number. The Clausius definition of entropy, which Eq. (4.45) is a model of, involves reversible heat transfers between closed systems and their environment. For open systems, there is an additional contribution to entropy from the diffusive flow of particles not accounted for in the Clausius definition,[3, Section 14.3] which is therefore an incomplete specification of entropy. As noted by E.T. Jaynes, “As a matter of elementary logic, no theory can determine the dependence of entropy on the size N of a system unless it makes some statement about a process where N changes.”[40] Our “statement” is that entropy is extensive, which Eq. (4.48) does not exhibit (show this).²⁷ Let’s add an unknown, $N$-dependent function to Eq. (4.48):


 


$$S=Nk\left(\frac{3}{2}+\ln\big(V/\lambda_T^3\big)\right)+k\,\phi(N).$$


(4.50)


For $S$ in Eq. (4.50) to obey the scaling law $S(\lambda V,\lambda N)=\lambda S(V,N)$ (see Eq. (1.52)), $\phi$ must satisfy the scaling relation


$$\phi(\lambda N)=\lambda\,\phi(N)-\lambda N\ln\lambda.$$


(4.51)


Note the identity for $\lambda=1$. Differentiate Eq. (4.51) with respect to $\lambda$ and set $\lambda=1$; one finds the differential equation $N\phi'(N)=\phi(N)-N$, the solution of which is $\phi(N)=\phi(1)\,N-N\ln N$. Combined with Eq. (4.50),


$$S=Nk\left(\frac{3}{2}+\phi(1)+\ln\frac{V}{N\lambda_T^3}\right).$$


(4.52)


This form of $S$ exhibits extensivity, but $\phi(1)$ cannot be established by this method of analysis. If we take $\phi(1)=1$, Eq. (4.52) becomes identical to the Sackur-Tetrode formula. In that case, $\phi(N)=N-N\ln N$, which we note is the Stirling approximation for $\phi(N)=-\ln N!$.
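A quick numerical confirmation of the scaling analysis (a sketch, in units chosen so that $k=\lambda_T=1$, with arbitrary $V$ and $N$): $\phi(N)=N-N\ln N$ satisfies Eq. (4.51), Eq. (4.52) is extensive, and Eq. (4.48) is not.

```python
# Check of the scaling analysis, in units with k = lambda_T = 1.
import numpy as np

phi = lambda N: N - N * np.log(N)            # phi(1) = 1, the Stirling form
lam, N, V = 3.0, 5.0, 10.0
print(np.isclose(phi(lam * N), lam * phi(N) - lam * N * np.log(lam)))  # Eq. (4.51): True

S48 = lambda V, N: N * (1.5 + np.log(V))     # Eq. (4.48), constant dropped
S52 = lambda V, N: N * (2.5 + np.log(V / N)) # Eq. (4.52) with phi(1) = 1
print(S48(2 * V, 2 * N), 2 * S48(V, N))      # unequal: Eq. (4.48) not extensive
print(S52(2 * V, 2 * N), 2 * S52(V, N))      # equal: Eq. (4.52) is extensive
```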


We introduced a factor of $N!$ in our derivation of the Sackur-Tetrode formula (Section 2.3) to prevent overcounting configurations that are equivalent under permutations of $N$ identical particles. The reason we obtained an incorrect expression for $S$ in Eq. (4.48) is not because Eq. (4.46) is suspect, but because Eq. (4.47) is incorrect. The additive factor of $-\ln N!$ required to achieve agreement between Eq. (4.48) and the Sackur-Tetrode formula would occur automatically if the partition function were to include it. We define the canonical partition function


$$Z_{\rm can}(N,V,T)\equiv\frac{1}{N!}\int d\Gamma\,e^{-\beta H},$$


(4.53)


where $\beta=(kT)^{-1}$. Thus, $Z_{\rm can}=Z/N!$, where $Z$ is specified in Eq. (4.39) or in Eq. (4.15). In many cases, “plain old” $Z$ suffices: where the quantity of interest is obtained from a logarithmic derivative of $Z$ (such as Eqs. (4.40) and (4.42)), the constant factor of $N!$ drops out. The expression for entropy, however, Eq. (4.46), does not involve a logarithmic derivative of the partition function and we must use $Z_{\rm can}$. We can write, therefore,


$$\rho(p,q)=\frac{1}{Z}\,e^{-\beta H(p,q)}=\frac{1}{N!\,Z_{\rm can}}\,e^{-\beta H(p,q)}.$$


(4.54)


The canonical distribution in the form of Eq. (4.54) is “more correct” than the expression in Eq. (4.31), which we obtained in a direct calculation, or the expressions in Eqs. (4.38) and (4.39), which we derived from the microcanonical distribution. What did we overlook in those calculations? Simply put, we took every point in $\Gamma$-space to represent a distinct microstate; we did not take into account that the state of $N$ identical particles is $N!$-fold degenerate under permutations. The expression for entropy, Eq. (4.46), is valid if we use $Z_{\rm can}$:


$$S=k\frac{\partial}{\partial T}\big(T\ln Z_{\rm can}\big)+S_0.\qquad(S_0=0)$$



The constant $S_0$ is the province of the third law of thermodynamics, which states that entropy changes $\Delta S$ in physical processes vanish²⁸ as $T\to0$. This experimentally observed property of matter can be interpreted to mean that entropy $S(T)$ approaches a constant $S(T)\to S_0$ as $T\to0$. The value of $S_0$ is, by convention, zero for many substances. We’ll take $S_0=0$. There are materials, however, that achieve a finite-entropy configuration²⁹ as $T\to0$. As a consistency check, it must be ascertained that (if $S_0=0$),


 


$$\lim_{T\to0}\frac{\partial}{\partial T}\big(T\ln Z_{\rm can}\big)=0.$$


(4.55)


We anticipate that Eq. (4.55) won’t hold for partition functions evaluated with classical statistical mechanics, which are based on energy equipartition.


4.1.2.9 Connection with the Helmholtz energy


The microcanonical ensemble is composed of systems having precise values of E,V,N; for that reason it’s known as the NVE ensemble. The canonical ensemble is composed of systems having precise values of N and V, but a variable energy having the average value U. Temperature is a proxy for variable energy, and the canonical ensemble is known as the NVT ensemble. The Helmholtz energy F=F(N,V,T) is a function of N,V,T (see Eq. (1.14)), and, as we now show, there is a simple connection between Zcan(N,V,T) and the Helmholtz energy F(N,V,T), Eq. (4.57).


From Eq. (4.45),


$$S=-k\beta^2\frac{\partial}{\partial\beta}\big(\beta^{-1}\ln Z_{\rm can}\big)=k\left[-\beta\frac{\partial\ln Z}{\partial\beta}+\ln Z_{\rm can}\right]=\frac{U}{T}+k\ln Z_{\rm can},$$


(4.56)


where we’ve used Eq. (4.40). Equation (4.56) implies


$$F=U-TS=-kT\ln Z_{\rm can}\quad\Longleftrightarrow\quad Z_{\rm can}(N,V,T)=e^{-\beta F(N,V,T)}.$$


(4.57)


Equation (4.57) provides the link between thermodynamics and statistical mechanics in the canonical ensemble. With it, we can write the canonical distribution (from Eq. (4.54)) $\rho=e^{\beta(F-H)}/N!$, the form assumed by Gibbs, except for the factor of $N!$.


With the connection between Zcan and F in Eq. (4.57), the standard formulas of thermodynamics can be used:


$$\mu=\left(\frac{\partial F}{\partial N}\right)_{T,V}=-kT\frac{\partial}{\partial N}\ln Z_{\rm can}\bigg|_{T,V}\qquad P=-\left(\frac{\partial F}{\partial V}\right)_{T,N}=kT\frac{\partial}{\partial V}\ln Z_{\rm can}\bigg|_{T,N}\qquad S=-\left(\frac{\partial F}{\partial T}\right)_{V,N}=\frac{\partial}{\partial T}\big(kT\ln Z_{\rm can}\big)\bigg|_{V,N}.$$


(4.58)


Given information about $N$, $V$, $T$ and the Hamiltonian, we can calculate $\mu$, $P$, and $S$.
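For the ideal gas, all three formulas can be carried out symbolically. A sketch, using $\ln Z_{\rm can}\approx N\ln(V/\lambda_T^3)-N\ln N+N$ (Stirling’s approximation for $\ln N!$):

```python
# Eq. (4.58) applied to the ideal gas, with Stirling's approximation for ln N!.
import sympy as sp

N, V, T, m, h, k = sp.symbols('N V T m h k', positive=True)
lamT3 = (h**2 / (2 * sp.pi * m * k * T))**sp.Rational(3, 2)   # lambda_T^3
lnZcan = N * sp.log(V / lamT3) - N * sp.log(N) + N

mu = -k * T * sp.diff(lnZcan, N)
P = k * T * sp.diff(lnZcan, V)
S = sp.diff(k * T * lnZcan, T)
print(sp.simplify(P))    # N*k*T/V, the ideal gas law of Eq. (4.49)
print(sp.simplify(mu))   # equivalent to k*T*log(N*lamT3/V)
print(sp.simplify(S))    # reduces to the Sackur-Tetrode form, Eq. (4.52)
```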


Consider the logarithm of ρ (which is dimensionless)


$$\ln\rho=\ln\big(e^{-\beta H}/Z\big)=-\beta H-\ln Z=-\beta H-\ln Z_{\rm can}-\ln N!.$$


(4.59)


If we multiply Eq. (4.59) by ρ, integrate over Γ-space, and make use of Eq. (4.57), we find


$$S=-k\int\rho\,\ln\big(N!\,\rho\big)\,d\Gamma.$$


(4.60)


Entropy is therefore not the mean value of a mechanical quantity; it’s a property of the ensemble.


 


4.1.2.10 Fluctuations in the canonical ensemble


We can calculate the internal energy from Eq. (4.40), $U=-\partial\ln Z/\partial\beta$, but we can also address fluctuations. From


$$\frac{\partial^2}{\partial\beta^2}\ln Z=\frac{1}{Z}\frac{\partial^2 Z}{\partial\beta^2}-\left(\frac{1}{Z}\frac{\partial Z}{\partial\beta}\right)^2=\langle H^2\rangle-\langle H\rangle^2,$$


(4.61)


we infer


$$\langle H^2\rangle-\langle H\rangle^2=\frac{\partial^2\ln Z}{\partial\beta^2}=-\left(\frac{\partial U}{\partial\beta}\right)_V=kT^2\left(\frac{\partial U}{\partial T}\right)_V=kT^2C_V,$$


(4.62)


the same as from the Einstein fluctuation theory, Eq. (2.31). The heat capacity is a measure of energy fluctuations. The relative root-mean-square fluctuation is


$$\frac{\sqrt{\big\langle(H-\langle H\rangle)^2\big\rangle}}{\langle H\rangle}=\frac{\sqrt{kT^2C_V}}{U}.$$


(4.63)


For macroscopic systems, $U\sim O(N)$ and $C_V\sim O(N)$, and thus the relative fluctuation scales as $N^{-1/2}$. A system in the canonical ensemble has, for all practical purposes, an energy equal to the mean for large $N$, which we expect from the law of large numbers.


It’s not an accident that Eq. (4.63) agrees with Eq. (2.31). The form of Eq. (2.29) is a property of the canonical ensemble. Using the approximate density of states function Eq. (4.28), we have for the Boltzmann probability density, Eq. (4.32):


$$P(E)=e^{-\beta E}\,\Omega(E)\sim e^{-\beta F}\exp\left(-\frac{(E-U)^2}{2\sigma^2}\right)=e^{-\beta(U-TS)}\exp\left(-\frac{(E-U)^2}{2kT^2C_V}\right),$$



where we’ve used Eq. (4.57), $Z=e^{-\beta F}$, and where $\sigma^2=\partial^2\ln Z/\partial\beta^2$ is the variance of the distribution given in Eq. (4.27), a quantity that we just derived in Eq. (4.62), $\sigma^2=kT^2C_V$. Thus, the distribution of energy among the systems of the ensemble is Gaussian, centered about the mean value $U$ with variance $\sigma^2=kT^2C_V$, the same as Eq. (2.29).
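A Monte Carlo sketch of these results: for the ideal gas, $e^{-\beta E}\Omega(E)$ is a Gamma density with shape $3N/2$ and scale $kT$, so sampled energies should reproduce $U=\frac{3}{2}NkT$, $\mathrm{var}(E)=kT^2C_V$, and the $N^{-1/2}$ scaling of the relative fluctuation.

```python
# Monte Carlo check of the fluctuation results for the ideal gas, whose
# canonical energy distribution e^(-beta E) Omega(E) is Gamma(3N/2, kT).
import numpy as np

rng = np.random.default_rng(0)
kT, N = 1.0, 10_000
E = rng.gamma(shape=1.5 * N, scale=kT, size=200_000)

print(E.mean(), 1.5 * N * kT)                    # U = (3/2) N k T
print(E.var(), 1.5 * N * kT**2)                  # k T^2 C_V, Eq. (4.62)
print(E.std() / E.mean(), np.sqrt(2 / (3 * N)))  # relative fluctuation ~ N^(-1/2)
```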


4.1.2.11 The equipartition theorem


Energy equipartition was assumed in the derivation of the Boltzmann distribution: that the kinetic energy per particle is a constant, proportional to the temperature, Eq. (4.37). The equipartition theorem holds that energy equipartition applies not just to kinetic energy but to any quadratic degree of freedom. A quadratic degree of freedom is one which adds a quadratic term to the Hamiltonian associated with that degree of freedom: $p^2/(2m)$ or $\frac{1}{2}m\omega^2x^2$.


To develop the equipartition theorem, we first show that


$$\left\langle x_i\frac{\partial H}{\partial x_j}\right\rangle=\delta_{ij}\,kT,$$


(4.64)


where $x_i$ denotes any of the canonical coordinates $(p_k,q_k)$ that enter into the Hamiltonian. Consider the integration over $\Gamma$-space,


$$\int x_i\frac{\partial H}{\partial x_j}\,e^{-\beta H}\,d\Gamma=-\frac{1}{\beta}\int x_i\frac{\partial}{\partial x_j}e^{-\beta H}\,d\Gamma=-\frac{1}{\beta}\int\left(x_i\,e^{-\beta H}\Big|_{x_j^{(1)}}^{x_j^{(2)}}-\int\frac{\partial x_i}{\partial x_j}\,e^{-\beta H}\,dx_j\right)d\Gamma^{[j]}=\frac{1}{\beta}\int\frac{\partial x_i}{\partial x_j}\,e^{-\beta H}\,d\Gamma=\frac{1}{\beta}\,\delta_{ij}\int e^{-\beta H}\,d\Gamma=\frac{1}{\beta}\,\delta_{ij}\,Z(\beta),$$



where in the second equality we’ve integrated by parts, where $x_j^{(1)}$ and $x_j^{(2)}$ are the limits of integration of $x_j$, and where $d\Gamma^{[j]}$ indicates the volume element $d\Gamma$ with $dx_j$ removed. The integrated part vanishes: for $x_j$ a position coordinate, the range of integration extends over the volume of the system, where the potential energy becomes infinite at the boundaries; for $x_j$ a momentum component, it takes the values $\pm\infty$ at the “endpoints” of momentum space. In either case the integrated part vanishes. With this result established, Eq. (4.64) follows.³⁰
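Note that Eq. (4.64) does not require $H$ to be quadratic. As a numerical check (a one-dimensional sketch with a deliberately anharmonic, made-up Hamiltonian $H=ax^2+bx^4$):

```python
# Check of Eq. (4.64) in one dimension for an anharmonic H = a x^2 + b x^4:
# <x dH/dx> = kT holds even though <H> is not (1/2) kT here.
import numpy as np
from scipy.integrate import quad

kT, a, b = 1.5, 1.0, 0.8
w = lambda x: np.exp(-(a * x**2 + b * x**4) / kT)       # Boltzmann weight
Z, _ = quad(w, -np.inf, np.inf)
avg = lambda f: quad(lambda x: f(x) * w(x), -np.inf, np.inf)[0] / Z

print(avg(lambda x: x * (2 * a * x + 4 * b * x**3)), kT)   # both 1.5
```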


Let’s consider Hamiltonians featuring only quadratic degrees of freedom, which we can write


$$H=\sum_{i=1}^{N_f}b_i\,x_i^2,$$



where $x_i$ is any canonical coordinate and $b_i$ is a constant. For example, the Hamiltonian of a harmonic oscillator with isotropic spring constant is in the form $H=\sum_{i=1}^3\left(p_i^2/(2m)+\frac{1}{2}m\omega^2q_i^2\right)$. A Hamiltonian containing only quadratic degrees of freedom is a second-order homogeneous function, which by Euler’s theorem (see Section 1.10), can be written


$$2H=\sum_{i=1}^{N_f}x_i\frac{\partial H}{\partial x_i}.$$


(4.65)


Applying Eq. (4.65) to Eq. (4.64),


$$2\langle H\rangle=\sum_{i=1}^{N_f}\left\langle x_i\frac{\partial H}{\partial x_i}\right\rangle=N_f\,kT,$$



which implies


$$\langle H\rangle=\tfrac{1}{2}N_f\,kT.$$


(4.66)


Equation (4.66) is the equipartition theorem: Each quadratic degree of freedom in the Hamiltonian makes a contribution of $\frac{1}{2}kT$ towards the internal energy of the system, and hence a contribution of $\frac{1}{2}k$ to the specific heat.



Example. Consider a torsion pendulum, a disk suspended from a torsion wire attached to its center. A torsion wire is free to twist about its axis. As it twists, it causes the disk to rotate in a horizontal plane. For $\theta$ the angle of twist, there is a restoring torque $\tau=-\kappa\theta$, and thus $I\ddot\theta=\tau$, where $I$ is the moment of inertia of the disk. It’s a standard exercise in classical mechanics to show that the Hamiltonian of this system is


$$H=\frac{1}{2I}\,p_\theta^2+\frac{1}{2}\kappa\theta^2,$$



where $p_\theta=I\dot\theta$. The Hamiltonian has two quadratic degrees of freedom. By the equipartition theorem, assuming the system is in equilibrium with a heat bath at temperature $T$,


$$\left\langle\frac{1}{2I}\,p_\theta^2\right\rangle=\frac{1}{2}kT\implies\langle\dot\theta^2\rangle=kT/I,\qquad\left\langle\frac{1}{2}\kappa\theta^2\right\rangle=\frac{1}{2}kT\implies\langle\theta^2\rangle=kT/\kappa.$$



 
