Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.


Stream: deprecated: thermodynamics

Topic: Hong Qian


view this post on Zulip John Baez (Nov 10 2021 at 18:27):

I've been talking a bit with Hong Qian at the University of Washington. He works on nonequilibrium thermodynamics and biophysics... not particularly using categories, but he's full of cool ideas I'd like to understand and make more mathematical.

view this post on Zulip John Baez (Nov 10 2021 at 18:29):

We're thinking of maybe having a regular series of conversations. Would you be interested in attending those, Owen? I think I'd have them instead of our usual meetings rather than on top of them, for a while, since I'm getting involved in too many meetings. Maybe we could alternate between meetings with just you and me, and meetings with him (and some other folks).

view this post on Zulip John Baez (Nov 10 2021 at 18:30):

Here's an example of a paper he's written, which I'd like to understand:

Abstract. We generalize the convex duality symmetry in Gibbs' statistical ensemble formulation, between Massieu's free entropy and the Gibbs entropy as a function of mean internal energy U. The duality tells us that Gibbs thermodynamic entropy is to the law of large numbers (LLN) for arithmetic sample means what Shannon's information entropy is to the LLN for empirical counting frequencies. Following the same logic, we identify U as the conjugate variable to counting frequency, a Hamilton-Jacobi equation for Shannon entropy as an equation of state, and suggest an eigenvalue problem for modeling statistical frequencies of correlated data.

view this post on Zulip John Baez (Nov 10 2021 at 18:31):

I don't know what all this means, but it uses a lot of the words I'm thinking about these days. :upside_down:

view this post on Zulip Owen Lynch (Nov 10 2021 at 21:57):

Yes I definitely would be interested! I've heard about Hong Qian through several different sources, and I've always been interested in learning more about the stuff he is doing. I think specifically he's mentioned in Haddad.

view this post on Zulip Owen Lynch (Nov 10 2021 at 21:58):

I think alternating sounds like a good idea.

view this post on Zulip Owen Lynch (Nov 10 2021 at 21:59):

Actually, I just did a YouTube video (not so well done...) explaining Shannon entropy to my friend who is a chemist, using this derivation of Shannon entropy from empirical counting frequencies.
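For context, the counting-frequency derivation in question is presumably the standard multinomial/Stirling argument; a quick sketch (my notation, not taken from the video):

```latex
% Among N i.i.d. draws from a distribution p over m outcomes, the
% probability of seeing empirical counts n_k (frequencies q_k = n_k / N) is
P(n_1, \dots, n_m) = \frac{N!}{n_1! \cdots n_m!} \, p_1^{n_1} \cdots p_m^{n_m}
% and Stirling's approximation \log n! \approx n \log n - n gives
\frac{1}{N} \log P \approx -\sum_k q_k \log \frac{q_k}{p_k} = -D(q \,\|\, p)
% so P \approx e^{-N D(q \| p)}; for uniform p this is
% D(q \| p) = \log m - H(q), with H the Shannon entropy.
```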

view this post on Zulip Owen Lynch (Nov 10 2021 at 22:01):

(not from the paper though)

view this post on Zulip Owen Lynch (Nov 24 2021 at 16:27):

OK, I'm starting to get the hang of what's going on in this paper, and it's making a lot of sense so far.

The basic setup is that you have some summary statistic \hat{t} of a collection of N i.i.d. variables, and you are interested in the asymptotic distribution of this summary statistic as N \to \infty. The large deviations principle says that this distribution should look like f(t) = e^{-N\phi(t)}, and the claim is that we should be thinking about \phi as a "generalized entropy".

In the case where the summary statistic is the frequency distribution of discrete variables, the entropy is the relative entropy with respect to the distribution of the i.i.d. variables. In the case where the summary statistic is the mean value, the entropy is the Gibbs entropy.
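For concreteness, a minimal numerical check of the mean-value case (my example, not from the paper): for the sample mean of N fair coin flips, the rate function is the relative entropy D(t \| 1/2), and -\frac{1}{N} \log P can be compared against it directly.

```python
# Sketch: check that -(1/N) log P(sample mean >= t) approaches the
# rate function D(t || 1/2) for fair coin flips as N grows.
import numpy as np
from scipy.stats import binom

t = 0.6
# rate function: relative entropy D(t || 1/2)
phi_exact = t * np.log(2 * t) + (1 - t) * np.log(2 * (1 - t))

for N in [100, 1000, 10000]:
    tail = binom.sf(np.ceil(N * t) - 1, N, 0.5)  # P(sum >= N t)
    print(N, -np.log(tail) / N)
print("rate function:", phi_exact)
# the printed estimates decrease toward phi_exact ~= 0.0201 as N grows
```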

view this post on Zulip Owen Lynch (Nov 24 2021 at 16:31):

As N \to \infty, you are only ever going to see values of the summary statistic that are at minima of \phi(t). Why minima, when we typically maximize entropy? Well, in the first case it's because we are minimizing relative entropy. In the second case, it's because there's a sign flip.

view this post on Zulip Owen Lynch (Nov 24 2021 at 16:32):

In the paper, they state the large deviations principle as both f(t) = e^{N \phi(t)} and f(t) = e^{-N \phi(t)}, so... :shrug:

view this post on Zulip Owen Lynch (Nov 24 2021 at 16:34):

I like this a lot, because it gives a justification for entropy maximization that is rooted in probability, and explicitly tied to the fact that we are looking at systems with many independent degrees of freedom.

view this post on Zulip Owen Lynch (Nov 24 2021 at 16:39):

It also gives us a way of making an entropy function for any observable. Namely, use the large deviations principle for mean values of that observable over many independent samplings.

The empirical frequency is a special case of this with the observable that sends state k to the k-th basis vector in \mathbb{R}^n, where the variable can take on any of n states.
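A two-line illustration of this special case (hypothetical numbers, n = 3 states):

```python
# The observable sends state k to the k-th standard basis vector, so the
# sample mean of the observable equals the empirical frequency.
import numpy as np

rng = np.random.default_rng(1)
n = 3
samples = rng.choice(n, size=10, p=[0.5, 0.3, 0.2])
one_hot = np.eye(n)[samples]               # observable applied to each sample
print(one_hot.mean(axis=0))                # sample mean of the observable...
print(np.bincount(samples, minlength=n) / len(samples))  # ...is the empirical frequency
```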

view this post on Zulip Owen Lynch (Nov 24 2021 at 16:40):

This also explains Shannon entropy as being a kind of "initial" entropy, because the mean values of other observables can be calculated from the empirical frequency distribution.

view this post on Zulip Owen Lynch (Nov 24 2021 at 16:41):

But this starts to get interesting when we consider situations where the samples are not i.i.d., but we can still express a large deviations principle. For instance, the mean magnetization in the Ising model.

view this post on Zulip Owen Lynch (Nov 24 2021 at 17:01):

Ah, I now see that they talk about statistics on a Markov chain! Very exciting!

view this post on Zulip John Baez (Nov 24 2021 at 17:53):

Owen Lynch said:

It also gives us a way of making an entropy function for any observable. Namely, use the large deviations principle for mean values of that observable over many independent samplings.

Cool stuff! Is this entropy function convex? What's the domain of this entropy function, anyway?

view this post on Zulip Owen Lynch (Nov 24 2021 at 21:32):

The domain of the entropy function is the codomain of the statistic.

view this post on Zulip John Baez (Jan 10 2022 at 02:20):

So, @Owen Lynch and I are trying to understand this paper:

but so far only up to around equation (6).

view this post on Zulip John Baez (Jan 10 2022 at 02:22):

The first two big things I didn't understand were allusions to Cramer’s theorem and Sanov's theorem.

view this post on Zulip John Baez (Jan 10 2022 at 02:24):

Luckily Wikipedia explains them both; they're both really great theorems.

view this post on Zulip John Baez (Jan 10 2022 at 02:25):

Cramer's theorem starts by assuming you have a function F on a probability measure space, and directs your attention to the function

\Lambda(t) = \log E(\exp(tF))

view this post on Zulip John Baez (Jan 10 2022 at 02:26):

where E means 'expected value'.

view this post on Zulip John Baez (Jan 10 2022 at 02:27):

They call this the cumulant generating function because if you expand this as a Taylor series in t the coefficients are called the cumulants of F.
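As a sanity check (my example, not from the conversation), the first two cumulants can be recovered numerically by differentiating \Lambda(t) = \log E(\exp(tF)) at t = 0; they should match the mean and the variance of F:

```python
# Numerically differentiate the cumulant generating function at t = 0
# and compare against the sample mean and variance.
import numpy as np

rng = np.random.default_rng(2)
F = rng.exponential(scale=2.0, size=2_000_000)  # samples of a random variable F

def Lambda(t):
    return np.log(np.mean(np.exp(t * F)))

h = 1e-3
k1 = (Lambda(h) - Lambda(-h)) / (2 * h)               # first cumulant
k2 = (Lambda(h) - 2 * Lambda(0) + Lambda(-h)) / h**2  # second cumulant
print(k1, F.mean())   # both ~2.0
print(k2, F.var())    # both ~4.0
```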

view this post on Zulip John Baez (Jan 10 2022 at 02:29):

Unfortunately, if you have no feeling for cumulants, this rather opaque definition is still the simplest definition of 'cumulant'.

view this post on Zulip John Baez (Jan 10 2022 at 02:29):

But you can re-express the cumulants in terms of the moments of F, namely

E(F^n)

view this post on Zulip John Baez (Jan 10 2022 at 02:30):

Also, cumulants have a bunch of nice properties.

view this post on Zulip Owen Lynch (Jan 10 2022 at 02:30):

John Baez said:

Cramer's theorem starts by assuming you have a function F on a probability measure space, and directs your attention to the function

\Lambda(t) = \log E(\exp(tF))

we normally call F a variable :)

view this post on Zulip John Baez (Jan 10 2022 at 02:31):

As for me, I'd prefer to think like a physicist and call H = -F the Hamiltonian. Then

E(\exp(-tH)) = Z(t)

is famous: it's the partition function. We usually write t = \beta, which stands for coolness, i.e. inverse temperature.
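A tiny worked example for concreteness (mine, not from the thread): a two-level system with energies 0 and \epsilon, with the uniform measure on the two states:

```latex
Z(\beta) = E\left(e^{-\beta H}\right) = \tfrac{1}{2}\left(1 + e^{-\beta \epsilon}\right)
% As \beta \to \infty (very cool), Z \to 1/2: only the ground state
% contributes. As \beta \to 0 (very hot), Z \to 1: both states
% contribute equally.
```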

view this post on Zulip Owen Lynch (Jan 10 2022 at 02:32):

Whoa, I never thought about Cramer's theorem like that!

view this post on Zulip John Baez (Jan 10 2022 at 02:32):

Then

\Lambda(t) = \log E(\exp(-t H))

is also famous; it's almost the free energy of our system.

view this post on Zulip John Baez (Jan 10 2022 at 02:34):

Actually the Helmholtz free energy is

F(t) = - \frac{1}{t} \log E(\exp(-t H))

view this post on Zulip John Baez (Jan 10 2022 at 02:35):

where again t = \beta (coolness), and \beta = 1/T (the reciprocal of temperature in units where Boltzmann's constant is 1).

view this post on Zulip John Baez (Jan 10 2022 at 02:36):

Owen Lynch said:

John Baez said:

Cramer's theorem starts by assuming you have a function F on a probability measure space, and directs your attention to the function

\Lambda(t) = \log E(\exp(tF))

we normally call F a variable :)

Yes, it's a 'random variable', which is a function on a probability measure space. Since I'm going for a physics interpretation I might call it an 'observable'.

view this post on Zulip John Baez (Jan 10 2022 at 02:37):

But how about 'function', since that's all it really is. :upside_down:

view this post on Zulip John Baez (Jan 10 2022 at 02:38):

Okay, now let me try to state Cramer's theorem, which I'd never known about before. But I'll state it using some physics language just to keep Owen entertained.

view this post on Zulip John Baez (Jan 10 2022 at 02:39):

So I'll call

\Lambda(\beta) = \log Z(\beta)

the log of the partition function where

Z(\beta) = E(e^{-\beta H})

is the partition function and remember that -\beta^{-1} \Lambda(\beta) is called the free energy (Helmholtz free energy).

view this post on Zulip John Baez (Jan 10 2022 at 02:46):

Now, Cramer's theorem starts by considering something wacky-sounding, the Legendre transform of the log of the partition function:

\Lambda^\ast (E) = \sup_{\beta} \left( \beta E - \Lambda(\beta) \right)

view this post on Zulip John Baez (Jan 10 2022 at 03:13):

With luck this will turn out not to be so wacky; I'm hoping it's something somewhat familiar in thermodynamics! Maybe something like the entropy as a function of energy? I need to calculate a bit.

view this post on Zulip John Baez (Jan 10 2022 at 03:15):

But anyway, Cramer's theorem says

\Lambda^*(E) = -\lim_{n \to \infty} \frac{1}{n} \log \left(P\left(\sum_{i=1}^n X_i \leq nE \right)\right)

view this post on Zulip John Baez (Jan 10 2022 at 03:16):

where X_i are independent identically distributed random variables, all distributed just like the Hamiltonian H.
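For what it's worth, a quick numerical check (my example, using the \geq version discussed just below): for standard normal X_i we have \Lambda(t) = t^2/2, so \Lambda^*(E) = E^2/2, and the limit can be evaluated exactly.

```python
# For standard normal X_i, Cramer's theorem (in the ">=" form, for E above
# the mean) predicts -(1/n) log P(sum X_i >= nE) -> E^2 / 2.
import numpy as np
from scipy.stats import norm

E = 0.5
for n in [10, 100, 1000, 10000]:
    # the sum of n standard normals is Normal(0, n), so
    # P(sum >= nE) = P(Z >= sqrt(n) * E) for a standard normal Z
    log_tail = norm.logsf(np.sqrt(n) * E)
    print(n, -log_tail / n)
print("Lambda*(E) =", E**2 / 2)  # the printed values approach 0.125
```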

view this post on Zulip John Baez (Jan 10 2022 at 03:26):

I'm afraid I may be getting some signs wrong due to my Hamiltonian H being -F.

view this post on Zulip John Baez (Jan 10 2022 at 03:30):

For example, I'd feel happier with

P\left(\sum_{i=1}^n X_i \geq nE \right)

because this would be looking at the probability of measuring the energy n times and getting more than nE. Since it's usually improbable for a system to have very large energy in statistical mechanics (the probability drops off exponentially), this would smell like a "large deviations" result, which I think is what we're shooting for.

view this post on Zulip John Baez (Jan 10 2022 at 03:34):

But if everything works out, it seems maybe Cramer's theorem is relating the entropy of the state of thermodynamic equilibrium at energy E to the probability that repeated measurements of the energy give a result more than E on average.

view this post on Zulip John Baez (Jan 10 2022 at 03:34):

That's where I am in understanding Cramer's theorem.

view this post on Zulip Nathaniel Virgo (Jan 10 2022 at 03:40):

I found these notes on the "Cramér transform" useful for understanding some of this stuff and its relationship to convex analysis. (It's an appendix to the book Linear and Integer Programming vs Linear Integration and Counting by Lasserre.) It presents Cramér's theorem as a transform that turns convolutions into infimal convolutions, which seems like a useful perspective.

view this post on Zulip Nathaniel Virgo (Jan 10 2022 at 03:48):

(But re-examining it, I was mixing up "Cramér's theorem" with this "Cramér transform" and those notes don't really talk about large deviations, so it's maybe less immediately relevant than I thought, sorry. But they may be useful for seeing how it connects to the convex function stuff in your paper.)

view this post on Zulip Owen Lynch (Jan 10 2022 at 04:03):

Quick check: the free energy is U - TS, so if -\beta^{-1} \Lambda(\beta) = U - TS, then we have \Lambda(\beta) = S - \frac{U}{T}.

Then \Lambda^\ast(U) = \frac{U}{T} - \Lambda(\beta) = \frac{U}{T} - \left(S - \frac{U}{T}\right) = 2 \frac{U}{T} - S

view this post on Zulip Owen Lynch (Jan 10 2022 at 04:04):

If we instead identify \beta^{-1} \Lambda(\beta) with the free energy, then we get \Lambda^\ast(U) = S, so I think maybe that's the right sign convention?
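Spelling out that second convention (assuming \beta = 1/T and F = U - TS, with the transform evaluated at the optimizing \beta):

```latex
\Lambda(\beta) = \beta F(\beta) = \beta (U - TS) = \frac{U}{T} - S
% so the transform gives
\Lambda^\ast(U) = \beta U - \Lambda(\beta)
               = \frac{U}{T} - \left( \frac{U}{T} - S \right) = S
```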

view this post on Zulip John Baez (Jan 10 2022 at 04:21):

Thanks! We'll eventually figure it out. The trick with these things is to get a nice solid idea, then the minus signs and factors of 2 are bound to fall in line if you keep working at it.

view this post on Zulip John Baez (Jan 10 2022 at 04:27):

Wikipedia says free energy is -T \ln Z, with a minus sign, and I did this calculation myself on page 12 of Relative entropy in biological systems.

view this post on Zulip John Baez (Jan 10 2022 at 04:29):

You can just calculate and show

- T \ln Z = \langle H \rangle - T S

which is free energy. Here \langle H \rangle is the expected value of H in the Gibbs state at temperature T - that's what you're calling U.
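The calculation in question, spelled out (for a Gibbs state p_i = e^{-\beta H_i}/Z, with k_B = 1):

```latex
S = -\sum_i p_i \ln p_i
  = \sum_i p_i \left( \beta H_i + \ln Z \right)
  = \beta \langle H \rangle + \ln Z
% Multiplying by T = 1/\beta and rearranging:
-T \ln Z = \langle H \rangle - TS
```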

view this post on Zulip Owen Lynch (Jan 10 2022 at 04:40):

Agh I've just stared at this for like 10 minutes trying to spot my sign error, and I can't figure out what has gone wrong! Probably it will be more obvious in the morning...

view this post on Zulip John Baez (Jan 10 2022 at 05:20):

Yeah, luckily a lot of sign errors go away after a night's sleep.

view this post on Zulip Owen Lynch (Jan 10 2022 at 19:33):

Here's something odd! In Hong's paper, he uses a different sign for the Legendre transform!

image.png

view this post on Zulip Owen Lynch (Jan 10 2022 at 19:33):

He uses a + where we have a -, and where there is a - in the statement of Cramer's theorem!

view this post on Zulip Owen Lynch (Jan 10 2022 at 19:33):

This is all highly suspect....

view this post on Zulip Owen Lynch (Jan 10 2022 at 19:34):

Using a + here does get entropy out, but it's not a Legendre-Fenchel transform that I'm familiar with!!

view this post on Zulip Owen Lynch (Jan 10 2022 at 19:39):

He even uses this definition for the Legendre-Fenchel transform later on!!

view this post on Zulip Owen Lynch (Jan 10 2022 at 19:39):

image.png

view this post on Zulip Owen Lynch (Jan 10 2022 at 19:39):

This is when he's talking about large deviations

view this post on Zulip Owen Lynch (Jan 10 2022 at 19:42):

Note that he also uses an inf instead of the sup that is used in Cramer's theorem

view this post on Zulip Owen Lynch (Jan 10 2022 at 19:47):

What we end up getting here is that the infimum happens at -\Lambda'(\beta) = x. Which actually may be the right thing to do here; let me quickly recalculate what the derivative of the log of the partition function is.

view this post on Zulip Owen Lynch (Jan 10 2022 at 19:50):

Aha! It's because in Cramer's theorem, they define the log partition function as \log E[\exp(tX)], whereas the log partition function in statistical mechanics is \log E[\exp(-\beta U)]!

view this post on Zulip Owen Lynch (Jan 10 2022 at 19:50):

So it makes sense to take the Legendre transform with respect to "-U", as it were
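To spell out the reconciliation just described (my summary of the two conventions):

```latex
% Cramer's theorem uses the cumulant generating function
\Lambda_{\text{Cramer}}(t) = \log E\left[ e^{tX} \right]
% while statistical mechanics uses \log E[e^{-\beta U}]. Taking X = -U
% and t = \beta makes the two agree:
\Lambda_{\text{Cramer}}(\beta) = \log E\left[ e^{-\beta U} \right] = \log Z(\beta)
% and rewriting \sup_t (tx - \Lambda(t)) in terms of u = -x turns the sup
% into an inf and flips the sign of the conjugate term, which seems to be
% exactly the + and the inf appearing in Hong Qian's paper.
```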

view this post on Zulip Owen Lynch (Jan 10 2022 at 19:51):

OK, everything is fixed and makes sense now

view this post on Zulip John Baez (Jan 10 2022 at 21:38):

Can you write up a statement of the key result, using conventions that match the usual conventions in thermodynamics / statistical mechanics? I was trying to do that myself. Something like:

If you start with a function H on a measure space, you can define the partition function

Z(\beta) = \int \exp(-\beta H)

and then take its logarithm, and then....

[fill in details here]

... take the Legendre transform, and then....

[fill in details here]

.... and finally you get a function \Lambda^* such that

view this post on Zulip John Baez (Jan 10 2022 at 21:40):

\Lambda^*(E) = -\lim_{n \to \infty} \frac{1}{n} \log \left(P\left(\sum_{i=1}^n X_i \geq nE \right)\right)

(or is it \leq?)

view this post on Zulip John Baez (Jan 10 2022 at 21:41):

Could you please fill in the details, getting all the signs right and all the sups and infs right and the \le versus \ge right?

view this post on Zulip Owen Lynch (Jan 10 2022 at 21:41):

Sure!

view this post on Zulip John Baez (Jan 10 2022 at 21:51):

Thanks!

view this post on Zulip John Baez (Jan 10 2022 at 21:52):

Maybe you could write it here.

view this post on Zulip Owen Lynch (Jan 10 2022 at 22:02):

Oh, I'm writing up a document that I'm going to put as a pdf here

view this post on Zulip Owen Lynch (Jan 10 2022 at 22:03):

I just finished handwriting it, so now I just have to type it up

view this post on Zulip John Baez (Jan 10 2022 at 22:13):

Okay, that's probably more useful in the long run.

view this post on Zulip Owen Lynch (Jan 10 2022 at 23:20):

Alright, I wrote up everything purely pertaining to Cramer's theorem, and then started moving on to writing up why the log of the partition function is related to free energy, but I can't bear to see the words "partition function" one more time today, so finishing that bit will have to wait. The first part of this document is complete though, and answers your question. cramers_theorem.pdf

view this post on Zulip John Baez (Jan 10 2022 at 23:34):

Great! I'm checking it out. I'm glad you stated a "thermodynamic version" in its own box.

view this post on Zulip John Baez (Jan 10 2022 at 23:46):

I know why and how the log of the partition function relates to free energy, but it'll be good to carry it forward to the point of giving a nice thermodynamic interpretation of the function \Lambda^\ast.

view this post on Zulip John Baez (Jan 15 2022 at 23:47):

Hi! Could you finish up this document so we can talk about it on Monday?

view this post on Zulip Owen Lynch (Jan 17 2022 at 00:10):

I'll try my best; I realized that what I was trying to do was actually to prove something that is false (I think)!

view this post on Zulip Owen Lynch (Jan 17 2022 at 00:12):

I.e., I think the thermodynamic entropy defined by \log Z + \beta U does not end up equaling the Shannon entropy of the canonical ensemble

view this post on Zulip Owen Lynch (Jan 17 2022 at 00:13):

So that has shaken my understanding of things a bit, and I now need to understand why this thermodynamic entropy is important

view this post on Zulip Owen Lynch (Jan 17 2022 at 00:19):

Ah, I may have found part of the problem

view this post on Zulip Owen Lynch (Jan 17 2022 at 00:21):

I think I was using the "state entropy", which is the entropy of the random variable that determines which state one is in. However, different states might have the same energy, so the energy variable has a lower entropy than the state variable. I'm going to try redoing my calculations using the entropy of the energy variable instead.

view this post on Zulip Owen Lynch (Jan 17 2022 at 17:56):

No, that wasn't the problem, I had just made a dumb mistake somewhere else

view this post on Zulip Owen Lynch (Jan 17 2022 at 17:56):

It all checks out now

view this post on Zulip Owen Lynch (Jan 17 2022 at 17:58):

cramers_theorem.pdf

view this post on Zulip John Baez (Jan 17 2022 at 18:38):

Great, I'll read it now and we can talk about it in 22 minutes.

view this post on Zulip Owen Lynch (Jan 17 2022 at 18:40):

Wait, read this version! cramers_theorem.pdf

view this post on Zulip Owen Lynch (Jan 17 2022 at 18:40):

Slightly more stuff :D

view this post on Zulip John Baez (Jan 17 2022 at 18:40):

Okay. In the version I just read, you don't finish the job and say what the thermodynamic meaning of \Lambda^\ast is.

view this post on Zulip John Baez (Jan 17 2022 at 18:40):

I'll try the new one!

view this post on Zulip Owen Lynch (Jan 17 2022 at 18:41):

Note: I changed the notation to be in line with Hong Qian's paper, because I was getting confused going between the two

view this post on Zulip John Baez (Jan 17 2022 at 18:42):

Okay. Now you say \Lambda^*, or really \phi, is minus the entropy (under some conditions).

view this post on Zulip Owen Lynch (Jan 17 2022 at 18:42):

Yes!