_posts/2025-10-11-max-ent-rl.md (8 additions, 1 deletion)

@@ -1,6 +1,6 @@
 ---
 layout: distill
-title: Why the Exponential? From Max‑Entropy RL to the Boltzmann Distribution
+title: Why the Exponential? From Max‑Entropy RL to the Boltzmann Distribution
 description: This blog post explores why the exponential function appears ubiquitously across modern RL, energy-based modeling, and statistical mechanics. We examine the connection between max-entropy reinforcement learning and the Boltzmann distribution, uncovering the fundamental principles that make the exponential form inevitable and explaining what "temperature" actually does in these frameworks.
 tags: reinforcement-learning information-theory boltzmann-distribution
 giscus_comments: true
@@ -15,3 +15,10 @@ authors:
       name: UBC

 ---
+
+<script>
+window.location.replace("https://qihang-zhang.com/Learning-Sys-Blog/2025/10/06/max-ent-rl-and-boltzmann-distribution.html");
+</script>
+
+If you are not redirected automatically, you can read the full post here:
+[Why the Exponential? From Max‑Entropy RL to the Boltzmann Distribution](https://qihang-zhang.com/Learning-Sys-Blog/2025/10/06/max-ent-rl-and-boltzmann-distribution.html).
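For context on the mechanism this hunk adds: `window.location.replace` swaps the current page for the target URL without leaving an entry in the browser history, but it only runs when JavaScript is enabled, which is why the plain fallback link matters. A minimal sketch of a stronger no-JS fallback, assuming the theme passes raw HTML in the post body through to the rendered page (the meta tag is an illustration, not part of this PR):

```html
<!-- Meta refresh covers visitors with JavaScript disabled; "0" means redirect immediately. -->
<meta http-equiv="refresh"
      content="0; url=https://qihang-zhang.com/Learning-Sys-Blog/2025/10/06/max-ent-rl-and-boltzmann-distribution.html">
<script>
  // Same target as the PR's script; replace() keeps the redirect out of back-button history.
  window.location.replace("https://qihang-zhang.com/Learning-Sys-Blog/2025/10/06/max-ent-rl-and-boltzmann-distribution.html");
</script>
```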
_posts/2025-11-09-weighted-poe.md (new file, 25 additions)

@@ -0,0 +1,25 @@
+---
+layout: distill
+title: Test-Time Steering for Lossless Text Compression via Weighted Product of Experts
+description: >
+  When I was a child, I always wondered: if I keep compressing the same file, will it eventually shrink to nothing? Of course, the answer is no: once a file is optimally compressed by a lossless compressor, compressing it again with the same method yields a file of exactly the same size. Today I know this follows from the fundamental limits of lossless compression in information theory. But what if we use multiple compressors instead of one? If we combine them, can each remove a different part of the data's redundancy, and how should such a combination be designed? In this blog post we discuss these questions and propose a method called Weighted Product of Experts.
+tags: large-language-models lossless-compression mixture-of-experts information-theory
+giscus_comments: true
+date: 2025-11-09
+featured: true
+redirect: https://qihang-zhang.com/Learning-Sys-Blog/2025/10/15/weighted-product-of-experts.html
+
+authors:
+  - name: Qihang Zhang
+    url: "https://qihang-zhang.com/"
+    affiliations:
+      name: UBC
+
+---
+
+<script>
+window.location.replace("https://qihang-zhang.com/Learning-Sys-Blog/2025/10/15/weighted-product-of-experts.html");
+</script>
+
+If you are not redirected automatically, you can read the full post here:
+[Test-Time Steering for Lossless Text Compression via Weighted Product of Experts](https://qihang-zhang.com/Learning-Sys-Blog/2025/10/15/weighted-product-of-experts.html).
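The description gestures at how multiple compressors are combined. In its standard form (a sketch following the general product-of-experts literature, not necessarily the post's exact formulation), a weighted product of experts mixes the experts' next-token distributions multiplicatively and renormalizes:

$$
p(x_t \mid x_{<t}) \;=\; \frac{\prod_{k=1}^{K} p_k(x_t \mid x_{<t})^{w_k}}{\sum_{x'} \prod_{k=1}^{K} p_k(x' \mid x_{<t})^{w_k}}, \qquad w_k \ge 0 .
$$

Paired with an arithmetic coder, any such next-token model acts as a lossless compressor: a symbol with combined probability $p$ costs about $-\log_2 p$ bits, so an expert that sharpens the distribution on the true next token directly shortens the code. Choosing the weights $w_k$ per input at inference time is presumably the "test-time steering" the title refers to.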