2.4. Entropy, Ignorance, and Information

Content

Previous: Chapter 3. The Arrow of Time in Statistical Mechanics

In the 19th century, probabilities in physics were associated, among other things, with human ignorance of the details of macrosystem behavior at the microlevel; that is, the statistical interpretation of the second law suggested a connection between entropy and ignorance. The advent of Shannon’s information theory gave new impetus to this line of thought: information is related to ignorance, and therefore information is related to entropy.

In 1956 Claude Shannon, the founder of information theory, wrote a short article ‘The Bandwagon‘, in which he urged caution in applying information theory to areas for which it was not intended:

‘Information theory has, in the last few years, become something of a scientific bandwagon. Starting as a technical tool for the communication engineer, it has received an extraordinary amount of publicity in the popular as well as the scientific press. In part, this has been due to connections with such fashionable fields as computing machines, cybernetics, and automation; and in part, to the novelty of its subject matter. As a consequence, it has perhaps been ballooned to an importance beyond its actual accomplishments. Our fellow scientists in many different fields, attracted by the fanfare and by the new avenues opened to scientific analysis, are using these ideas in their own problems. Applications are being made to biology, psychology, linguistics, fundamental physics, economics, the theory of organization, and many others. In short, information theory is currently partaking of a somewhat heady draught of general popularity.’

However, it was already impossible to stop the information bandwagon in statistical mechanics; this chapter presents the spirit of the 1950s, when, for various reasons, entropy in statistical mechanics became associated with information. As a result, in papers on statistical mechanics the Gibbs statistical entropy is often referred to as the Shannon entropy. This section is based on Javier Anta’s dissertation ‘Historical and Conceptual Foundations of Information Physics‘.

A separate section below discusses Edwin Jaynes’s principle of maximum information entropy and its application in statistical mechanics. Jaynes used Shannon’s entropy in statistical mechanics based on the concept of subjective probability. Another line of development for information entropy, Maxwell’s demon based on Szilard’s thought experiment, is presented in the next chapter. The final section of this chapter examines the views of Rudolf Carnap, who in the 1950s argued for the objectivity of entropy in statistical mechanics. However, he failed to convince the physicists, and as a result Carnap did not publish his book ‘Two Essays on Entropy‘; it appeared only posthumously in 1977.

  • Maxwell and Gibbs on the role of perception
  • Shannon information theory
  • Arrival of information in statistical mechanics
  • Edwin Jaynes and principle of maximum information entropy
  • Carnap on the objectivity of entropy

Maxwell and Gibbs on the role of perception

The transition to a conceptual model of moving atoms raises the question of the role of the senses and of the possibility of manipulation at the atomic level. In this regard, I quote Maxwell from the paper ‘Tait’s Thermodynamics‘. Maxwell’s goal was to argue that the second law cannot be strictly derived from the kinetic theory, but his writing implicitly suggested that the distinction between heat and work is subjective (see also the next chapter on Maxwell’s demon):

‘The second law relates to that kind of communication of energy which we call the transfer of heat as distinguished from another kind of communication of energy which we call work. According to the molecular theory the only difference between these two kinds of communication of energy is that the motions and displacements which are concerned in the communication of heat are those of molecules, and are so numerous, so small individually, and so irregular in their distribution, that they quite escape all our methods of observation; whereas when the motions and displacements are those of visible bodies consisting of great numbers of molecules moving all together, the communication of energy is called work.’

‘Hence, we have only to suppose our senses sharpened to such a degree that we could trace the motions of molecules as easily as we now trace those of large bodies, and the distinction between work and heat would vanish, for the communication of heat would be seen to be a communication of energy of the same kind as that which we call work.’

Similarly, Gibbs writes in the preface to ‘Statistical Mechanics‘:

‘The laws of thermodynamics, as empirically determined, express the approximate and probable behavior of systems of a great number of particles, or, more precisely, they express the laws of mechanics for such systems as they appear to beings who have not the fineness of perception to enable them to appreciate quantities of the order of magnitude of those which relate to single particles, and who cannot repeat their experiments often enough to obtain any but the most probable results.’

These statements of Maxwell and Gibbs serve as a good introduction to the problem of probability in statistical mechanics. On the one hand, there are beings with limited perception; on the other hand, there is a world in which certain events occur with a certain frequency (the frequency interpretation of probability). Bringing information into this picture destroys the compromise and, in one way or another, leads to the subjectivity of probability and to a connection between entropy and information.

Shannon information theory

Information theory arose from the problem of efficiently transmitting a text consisting of characters over a noisy communication channel. The text contains information, but the amount of information is related not to the content but rather to the length of the text. The characters from a given alphabet are encoded in binary, and this leads to the concept of a bit, a unit of measurement for information.

Shannon also introduced the concept of information entropy, which is related to the probability distribution of characters appearing in a text (the sum is taken over all characters in the alphabet):

H = −Σᵢ pᵢ log₂ pᵢ
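As a small illustration (my own sketch, not part of Shannon’s paper), the entropy of a text can be estimated in a few lines of Python from the empirical character frequencies; the function name and the sample strings are, of course, arbitrary:

    import math
    from collections import Counter

    # Estimate H in bits per character from the empirical character
    # frequencies of a text, using H = -sum p*log2(p).
    def shannon_entropy(text: str) -> float:
        counts = Counter(text)
        n = len(text)
        return -sum((c / n) * math.log2(c / n) for c in counts.values())

    print(shannon_entropy("abracadabra"))  # about 2.04 bits per character
    print(shannon_entropy("aaaaaaaaaaa"))  # 0.0 bits: a repeated character carries no information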

Information entropy has played a major role in solving the problems of telecommunications: the choice of the most efficient encoding of symbols, the creation of noise-resistant codes, and text compression for more efficient storage and transmission. In this respect, the ambiguity of the relationship between information entropy and the amount of information should be mentioned. On the one hand, entropy is associated with ignorance (the greater the entropy, the less information); on the other hand, in compressed files the maximum amount of information is achieved in the limit of maximum entropy.

The expression for information entropy is similar to the Gibbs statistical entropy, and many physicists have concluded that this similarity indicates a relationship between thermodynamic entropy and information entropy. Yet let me give a counterexample. In physics, the Poisson equation for electrostatic fields is mathematically similar to the stationary Fourier heat equation. Engineers often exploit this fact when working with software in which the solution of the heat equation has been programmed but no interface for electrostatic problems has been provided. However, no one draws far-reaching conclusions about an internal similarity between heat conduction and electrostatics.

Arrival of information in statistical mechanics

Three people played a major role in spreading the ideas about the relationship between entropy and information (see Anta’s thesis): Norbert Wiener, John von Neumann, and Warren Weaver.

In 1948, in the influential book ‘Cybernetics‘, Wiener connected the concepts of information and entropy. A cybernetic agent used the information received to predict the behavior of a system, which in turn was associated with order or disorder; predicting the behavior of an ordered system was easier. Wiener drew on Schrödinger’s ‘negative entropy’ from the book ‘What is Life?‘, as well as on the arguments of the chemist Gilbert Lewis, who had stated back in 1930: ‘Gain in entropy always means loss of information, and nothing more.’

Thus, Wiener introduced a connection between the information possessed by a cybernetic agent and the entropy of the system that the agent controls:

‘The notion of the amount of information attaches itself very naturally to a classical notion in statistical mechanics: that of entropy. Just as the amount of information in a system is a measure of its degree of organization, so the entropy of a system is a measure of its degree of disorganization; and the one is simply the negative of the other (…) We have said that amount of information, being the negative logarithm of a quantity [μ(ΓM)] which we may consider as a probability, is essentially a negative entropy.’

In 1932, John von Neumann discussed in his book ‘Mathematical Foundations of Quantum Mechanics‘ the solution that Leo Szilard had proposed for Maxwell’s demon. Von Neumann believed that such a solution makes it possible to combine the physical state of the system with the epistemic state of the agent. It may be that along this path von Neumann hoped to find an interpretation of measurement in quantum mechanics. After the works of Shannon and Wiener, von Neumann actively promoted the combination of formal logic and statistical mechanics by means of the connection between information and entropy. Von Neumann’s authority in academic circles contributed to the spread of this idea.

The anecdote about von Neumann’s recommendation to Shannon to use the term entropy in information theory (‘nobody knows what entropy is’) is first mentioned in the work of Tribus in the 1960s. Anta writes that nobody knows if the story is true, but in any case it hints at von Neumann’s attitude towards entropy and information (see also Carnap’s memoirs below).

Warren Weaver worked together with Shannon, but Weaver, unlike Shannon, wanted to popularize information theory in a way accessible to the educated public. Weaver extended the meaning of information in Shannon’s theory to the semantic and pragmatic levels, and also connected information entropy with thermodynamic entropy. Weaver’s administrative resources ensured the success of his venture.

Leon Brillouin was the first physicist to complete the unification of information theory and statistical mechanics. His ideas are described in the next chapter, as they are related to Szilard’s thought experiment.

Edwin Jaynes and principle of maximum information entropy

In 1957 Edwin Jaynes published the paper ‘Information Theory and Statistical Mechanics‘ (in two parts), which gave a strong impetus to interpreting thermodynamic entropy in the spirit of information and as a measure of ignorance. Let me quote two expressive passages from the second part of the paper:

‘With such an interpretation the expression “irreversible process” represents a semantic confusion; it is not the physical process that is irreversible, but rather our ability to follow it. The second law of thermodynamics then becomes merely the statement that although our information as to the state of a system may be lost in a variety of ways, the only way in which it can be gained is by carrying out further measurements.’

‘It is important to realize that the tendency of entropy to increase is not a consequence of the laws of physics as such … An entropy increase may occur unavoidably, due to our incomplete knowledge of the forces acting on a system, or it may be an entirely voluntary act on our part.’

However, Jaynes later clarified the meaning of ‘subjective’ in the original paper. Below is a description of Jaynes’s ideas based on his 1978 lecture ‘Where do we Stand on Maximum Entropy?‘ (published the following year). In this paper, Jaynes explained that important changes were made to the terminology shortly after the 1957 paper: the distinction between information entropy and experimentally measured entropy was introduced, and the meaning of ‘subjective’ was changed to ‘every reasonable person would make this decision’.

Jaynes’s ideas are primarily related to statistical inference. Jaynes used information entropy to search for an unknown probability distribution of events. The meaning of the term event in Jaynes’s method is well suited to rolling dice, where each possible outcome (event) is associated with its own probability. The probability distribution of events is considered unknown and must be found as part of the statistical analysis. Jaynes based his work on the principle of insufficient reason, which states that ignorance (subjective probability) implies equal a priori probabilities for all possible outcomes.

Jaynes extends the original problem by including additional information about the unknown distribution; he considers the case where the mean and/or other moments of the distribution are known. The additional known information imposes constraints on the final distribution, and here Jaynes introduces the principle of maximum information entropy. Jaynes argues that his method maximizes ignorance, ensuring that the resulting probability distribution reflects only the known information and nothing else.
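A minimal numerical sketch of this idea (my illustration rather than Jaynes’s own code; it assumes NumPy and SciPy are available and uses the dice setup often used to present the method): find the probabilities of the six faces of a die when only the mean roll, say 4.5, is known, by maximizing the information entropy under that constraint.

    import numpy as np
    from scipy.optimize import minimize

    # Hypothetical dice example: only the mean roll (4.5) is known;
    # maximize H = -sum p*ln(p) subject to normalization and the known mean.
    faces = np.arange(1, 7)
    target_mean = 4.5

    def neg_entropy(p):
        return np.sum(p * np.log(p + 1e-12))  # small epsilon avoids log(0)

    constraints = [
        {"type": "eq", "fun": lambda p: np.sum(p) - 1.0},                 # normalization
        {"type": "eq", "fun": lambda p: np.dot(p, faces) - target_mean},  # known mean
    ]
    p0 = np.full(6, 1.0 / 6.0)  # start from the uniform, maximum-ignorance guess
    result = minimize(neg_entropy, p0, bounds=[(0.0, 1.0)] * 6, constraints=constraints)
    print(np.round(result.x, 4))

The resulting probabilities increase monotonically from face 1 to face 6, reflecting only the constraint on the mean and nothing else; with no constraint beyond normalization, the same procedure returns the uniform distribution of the principle of insufficient reason.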

In statistical mechanics, Jaynes rejects the frequency interpretation of Gibbs ensembles. He argues that a person introduces a probability distribution for the single system in question, and Jaynes suggests that this probability distribution reflects not the world but rather the limitations of the observer. The application of the maximum information entropy principle to thermodynamic ensembles in statistical mechanics reproduces all known equilibrium probability distributions. Additionally, information entropy is analogous to the Gibbs statistical entropy. On this basis Jaynes made the far-reaching claim that the principle of maximum information entropy is the foundation of statistical mechanics.
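To illustrate how the equilibrium distributions come out (the standard textbook form of the calculation, not a quotation from Jaynes): maximizing H subject to normalization and to a fixed mean energy ⟨E⟩ = Σᵢ pᵢ Eᵢ gives, via Lagrange multipliers,

pᵢ = exp(−βEᵢ) / Z,   Z = Σᵢ exp(−βEᵢ),

that is, the canonical distribution; the multiplier β is fixed by the energy constraint and is identified with 1/kT.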

Jaynes proposed to cut the Gordian knot of the arrow of time in statistical mechanics by abandoning the frequency interpretation and accepting subjective probabilities instead. The maximum information entropy principle maximizes ignorance under the known constraints on the ensemble and leads to correct results. Therefore, the maximum information entropy principle should be accepted as the foundation on which to construct statistical mechanics; on this basis Jaynes even proposed the so-called “almost unbelievably short” proof of the second law.

Myron Tribus was excited by Jaynes’s approach and popularized it in his well-known 1961 textbook ‘Thermostatics and Thermodynamics‘. Tribus believed that the Jaynes method makes statistical mechanics accessible to engineers. In the introduction, he first points out the difficulty of teaching statistical mechanics in the conventional way:

‘Before one can do much with statistical mechanics, a certain amount of discussion about such abstract ideas as phase space, ergodic system and cells in phase space has heretofore been considered to be required for a rigorous development. Before one could get to the practical calculations that interest an engineer, it has been necessary to master a formidable number of specialized mathematical tools.’

Tribus also argues that the Jaynes method is even simpler than a full formal treatment of classical thermodynamics:

‘The mathematical tools and the abstract ideas required by Jaynes’ methods are less demanding than are those required for, say, Caratheodory’s treatment of macroscopic thermodynamics, an approach seriously advocated by some authors. The mathematical methods required for Jaynes’ approach are of more general applicability in engineering than are those employed in macroscopic thermodynamics.’

In conclusion, a couple of words about the conflict between Jaynes and Abner Shimony (who worked with Carnap). Shimony, like Carnap, defended the objectivity of entropy, and he took an active part in the posthumous publication of Carnap’s work in 1977. Below is a quote from Jaynes about Shimony; it shows the passions in the discussion about entropy:

‘[Shimony] seems to have made it his lifelong career to misconstrue everything I wrote many years ago, and then compose long pedantic commentaries, full of technical errors and misstatements of documentable facts, showing no awareness of anything done in this field since then – and which, to cap it all off, attack not my statements, but only his own misunderstandings of them. The conflict is not between Shimony and me, but between Shimony and the English language.’

Carnap on the objectivity of entropy

Rudolf Carnap wrote a book on entropy between 1952 and 1954 during his stay at the Institute for Advanced Study in Princeton, but the book was published only posthumously, in 1977. Carnap’s interest in entropy in statistical mechanics was related to his work on inductive logic. He believed that an interpretation of probability within the framework of logic could lead to an approach of partial verification. Today, the similarity between Carnap’s ideas and the later development of Bayesian statistical inference is often stressed.

Carnap saw a similarity between the mathematical formalism of the statistical interpretation of thermodynamic entropy and that of the solution of inductive logic problems within the framework of partial verification. Carnap compares the task of distributing gas molecules among the cells of μ-space in the Boltzmann method with the task of classifying objects into categories that arises in inductive logic. The number of possible arrangements of objects into categories coincides with the number of distributions of gas molecules among the cells, resulting in two entropies that formally appear to be the same.
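As an illustration of this shared combinatorics (a standard counting argument, not a quotation from Carnap): the number of ways to distribute N molecules among k cells with fixed occupation numbers n₁, …, n_k, just like the number of ways to classify N objects into k categories with the same occupancies, is the multinomial coefficient

W = N! / (n₁! n₂! ⋯ n_k!),

and the logarithm of W (in Boltzmann’s approach, k ln W) plays the role of the entropy of the arrangement.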

At the same time, Carnap emphasized the fundamental difference between these tasks; he believed that the similarity of the mathematical formalism does not imply an identical meaning of the two tasks and therefore does not imply the same meaning for the two entropies. In his autobiography, written in 1963, Carnap described the atmosphere of that time as follows:

‘I had some talks separately with John von Neumann, Wolfgang Pauli, and some specialists in statistical mechanics on some questions of theoretical physics with which I was concerned. I certainly learned very much from these conversations; but for my problems in the logical and methodological analysis of physics, I gained less help than I had hoped for. … My main object was not the physical concept, but the use of the abstract concept for the purposes of inductive logic. Nevertheless, I also examined the nature of the physical concept of entropy in its classical statistical form, as developed by Boltzmann and Gibbs, and I arrived at certain objections against the customary definitions, not from a factual-experimental, but from a logical point of view. It seemed to me that the customary way in which the statistical concept of entropy is defined or interpreted makes it, perhaps against the intention of the physicists, a purely logical instead of physical concept; if so, it can no longer be, as it was intended to be, a counterpart to the classical macro-concept of entropy introduced by Clausius, which is obviously a physical and not a logical concept. The same objection holds in my opinion against the recent view that entropy may be regarded as identical with the negative amount of information. I had expected that in the conversations with the physicists on these problems, we would reach, if not an agreement, then at least a clear mutual understanding. In this, however, we did not succeed, in spite of our serious efforts, chiefly, it seemed, because of great differences in point of view and in language.’

Carnap assumes the objectivity of entropy in classical thermodynamics:

‘The concept of entropy in thermodynamics (S_th) had the same general character as the other concepts in the same field, e.g., temperature, heat, energy, pressure, etc. It served, just like these other concepts, for the quantitative characterization of some objective property of a state of a physical system, say, the gas g in the container in the laboratory at the time t.’

This in turn implies the objectivity of entropy in statistical mechanics, and Carnap introduced the ‘Principle of physical magnitudes‘, which states that physical descriptions of a quantity at the micro and macro levels should lead to the same results within the experimental error.

The main difference between the two entropies, in statistical mechanics and in inductive inference, is related to the entropy of a microstate. In the classification problem, for a specific microstate all uncertainties disappear and the information entropy becomes zero. Carnap referred to this solution as the second method for determining the microstate entropy (Method II). Carnap believed that this solution was suitable for logical problems, as there was no contradiction between the non-zero entropy of the macrostate and the zero entropy of the microstate.

However, this approach is inconsistent with the principle of physical magnitudes, and therefore Carnap believed that such a solution could not be used in statistical mechanics. The microstate belongs to a physical system, and therefore its properties are as objective as the properties of the macrostate. According to Carnap, experimental measurements correspond to the trajectory of a single system over time, and averaging over phase space is merely a technical method for finding time averages.

A microstate specifies a certain trajectory in time, so the physical properties of the system under study must also be connected with a single microstate. Hence, Carnap concluded that the microstate entropy must be equal to the macrostate entropy; otherwise, it becomes unclear how a macrostate can have a non-zero entropy as the system moves along its trajectory. Carnap refers to this solution as the first method for calculating the microstate entropy (Method I).

In the end, Carnap’s main conclusion was that the first method of calculating the microstate entropy should be used in statistical mechanics as a branch of theoretical physics, and the second method should only be considered when solving epistemological problems. This led to serious disagreements with physicists. John von Neumann and Wolfgang Pauli believed that the second method was the correct one for entropy in statistical mechanics (the entropy of a microstate is equal to zero) and that Carnap’s book was therefore undesirable. For example, after reviewing Carnap’s draft, Pauli wrote:

‘Dear Mr. Carnap! I have studied your manuscript a bit; however I must unfortunately report that I am quite opposed to the position you take. Rather, I would throughout take as physically most transparent what you call “Method II”. In this connection I am not at all influenced by recent information theory (…) Since I am indeed concerned that the confusion in the area of foundations of statistical mechanics not grow further (and I fear very much that a publication of your work in this present form would have this effect).’

Similar arguments were made by von Neumann. As a result, Carnap abandoned the idea of publishing the book; perhaps he did not want his work on inductive logic to be attacked by physicists because of these disagreements.

Next: Chapter 5. Maxwell’s Demon and Information

References

C. Shannon, The Bandwagon, Trans. IRE, IT-2, No. 1 (1956), 3.

James Clerk Maxwell, Tait’s “Thermodynamics”, II. Nature, 1878, 278-280.

Javier Anta Pulido, Historical and Conceptual Foundations of Information Physics. (2021). Chapter III, The Informationalization of Thermal Physics, Chapter IV, The Golden Age of Information Physics.

E. T. Jaynes, Information theory and statistical mechanics, Phys. Rev. Part I: 106, 620–630 (1957), Part II: 108, 171–190 (1957)

E. T. Jaynes, Where do we Stand on Maximum Entropy? in The Maximum Entropy Formalism, R. D. Levine and M. Tribus (eds.), 1979, p. 15 – 118.

Jos Uffink, Subjective Probability and Statistical Physics, in Probabilities in Physics, 2011, 25–50.

Myron Tribus, Thermostatics and Thermodynamics: an Introduction to Energy, Information and States of Matter, 1961.

Rudolf Carnap, Two essays on entropy, Univ of California Press, 1977.

Hannes Leitgeb and André Carus, Rudolf Carnap, The Stanford Encyclopedia of Philosophy, 8. Inductive Logic and the Re-Emergence of the Theoretical Language.

Discussion

https://evgeniirudnyi.livejournal.com/404887.html

15.01.2025 Maxwell on Heat and Work

Maxwell’s quotes.

https://evgeniirudnyi.livejournal.com/391618.html

