[36] G. E. Hinton and R. S. Zemel. Autoencoders, minimum description length and Helmholtz free energy. In J. D. Cowan, G. Tesauro, and J. Alspector, editors, Advances in Neural Information Processing Systems 6. Morgan Kaufmann, 1994.

[37] E. T. Jaynes. Probability Theory in Science and Engineering, volume 4 of Colloquium Lectures in Pure and Applied Science. Socony Mobil Oil Company, 1959.

[38] E. T. Jaynes. Information theory and statistical mechanics. Proceedings of Brandeis Summer Institute, 1963.

[39] E. T. Jaynes. Prior probabilities. IEEE Transactions on Systems Science and Cybernetics, SSC-4:227-241, 1968.

[40] G. Joyce and D. Montgomery. Negative temperature states for the two-dimensional guiding-centre plasma. Journal of Plasma Physics, 10:107-121, 1973.

[41] W. T. Grandy Jr and L. H. Schick, editors. Maximum Entropy and Bayesian Methods, volume 43 of Fundamental Theories of Physics. Kluwer Academic, 1991.

[42] E. H. Kerner. Gibbs Ensemble: Biological Ensemble, volume XII of International Science Review. Gordon and Breach, 1972.

[43] A. N. Kolmogorov. Three approaches to the quantitative definition of information. Problemy Peredachi Informatsii, 1:3-11, 1965.

[44] D. Layzer. Growth of order in the Universe. Appears in [78].

[45] D. Layzer. The arrow of time. Scientific American, 233:56-69, 1975.

[46] D. Layzer. Cosmogenesis. Oxford University Press, 1991.

[47] E. Levin, N. Tishby, and S. A. Solla. A statistical approach to learning and generalisation in layered neural networks. Proceedings of the IEEE, 78:1568-1574, 1990.

[48] J. T. Lewis and C. E. Pfister. Thermodynamic probability theory: some aspects of large deviations. Technical Report DIAS-STP-93-33, Dublin Institute for Advanced Studies, 1993.

[49] J. T. Lewis, C. E. Pfister, and W. G. Sullivan. Entropy, concentration of probability and conditional limit theorems. Technical report, Dublin Institute for Advanced Studies, 1995.

[50] M. Li and P. Vitanyi. An Introduction to Kolmogorov Complexity and its Applications. Texts and Monographs in Computer Science. Springer-Verlag, 1989.

[51] D. MacKay. A free energy minimisation framework for inference problems in modulo 2 arithmetic. Preprint available from http://131.111.48.24/mackay/fe.ps.Z.

[52] B. H. Marcus and R. M. Roth. Improved Gilbert-Varshamov bound for constrained systems. IEEE Transactions on Information Theory, 38:1213-1221, 1992.

[53] A. Martin-Löf. Statistical Mechanics and the Foundations of Thermodynamics. Number 101 in Lecture Notes in Physics. Springer-Verlag, 1979.