2 changes: 2 additions & 0 deletions questions/189_pixelnormalization/description.md
@@ -0,0 +1,2 @@
## Problem
Write a Python function to perform Pixel Normalization on a 4D input tensor with shape (B, C, H, W). The function should normalize each pixel across all channels by dividing its values by the square root of the mean squared activation across channels.
5 changes: 5 additions & 0 deletions questions/189_pixelnormalization/example.json
@@ -0,0 +1,5 @@
{
"input": "X.shape = (2, 2, 2, 2)",
"output": "Normalized tensor of shape (2, 2, 2, 2) where each spatial location is normalized across channels",
"reasoning": "For each spatial location, compute the mean of squared activations across all channels, take the square root to obtain the RMS value, and divide each channel’s activation by this value."
}
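The reasoning in the example above can be traced by hand for a single pixel (a minimal sketch; the array shape and channel values are illustrative, not taken from the test data):

```python
import numpy as np

# One pixel with channel values [3, 4]: shape (1, 2, 1, 1) = (B, C, H, W).
x = np.array([[[[3.0]], [[4.0]]]])

mean_sq = np.mean(x**2, axis=1, keepdims=True)  # (9 + 16) / 2 = 12.5
rms = np.sqrt(mean_sq)                          # sqrt(12.5) ≈ 3.5355
print(np.round((x / rms)[0, :, 0, 0], 4))       # [0.8485 1.1314]
```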
25 changes: 25 additions & 0 deletions questions/189_pixelnormalization/learn.md
@@ -0,0 +1,25 @@
## Understanding Pixel Normalization
Pixel Normalization (PN) is a normalization technique that normalizes feature vectors at each spatial location across channels. Pixel Normalization is particularly useful in generative models such as Progressive GANs, where it helps control feature magnitudes and promotes consistent feature scaling during training.
### Mathematical Definition
For an input tensor with the shape **(B, C, H, W)**, where:
* B: batch size
* C: number of channels
* H: height
* W: width
The normalization for each pixel at spatial position *(h, w)* is computed as follows:
$$
x'_{b, c, h, w} = \frac{x_{b,c,h,w}}{\sqrt{\frac{1}{C}\sum_{i=1}^C x^2_{b,i,h,w}+\epsilon}}
$$
where:
* $x_{b,c,h,w}$ is the pixel value of channel *c* at position *(h, w)* for sample *b*.
* $\epsilon$ is a small constant added for numerical stability (e.g., $10^{-8}$).

This operation ensures that for every spatial position *(h, w)*, the channel vector $[x'_{b, 1, h, w}, x'_{b, 2, h, w}, \ldots, x'_{b, C, h, w}]$ has unit root mean square (up to the $\epsilon$ term), i.e.:
$$
\frac{1}{C}\sum_{i=1}^C (x'_{b, i, h, w})^2 = 1
$$
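This property can be checked numerically (a sketch; the `pixel_normalization` helper below is a local implementation of the formula above):

```python
import numpy as np

def pixel_normalization(X, eps=1e-8):
    # Mean squared activation across the channel axis, kept for broadcasting.
    return X / np.sqrt(np.mean(X**2, axis=1, keepdims=True) + eps)

rng = np.random.default_rng(42)
X = rng.standard_normal((2, 4, 3, 3))
Y = pixel_normalization(X)

# For every (b, h, w), the mean squared activation across channels is ~1.
mean_sq = np.mean(Y**2, axis=1)
print(np.allclose(mean_sq, 1.0, atol=1e-6))  # True
```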
### Why Pixel Normalization
* **Batch size independence**: Pixel Normalization does not rely on batch-level statistics such as mean or variance, making it suitable for training with very small batch sizes, even batch size = 1.
* **Training stability**: Removing batch dependencies leads to smoother convergence and more deterministic training behavior, especially in GANs.
* **Stable feature scaling**: By normalizing each pixel across channels, it prevents uncontrolled growth of activations and keeps feature magnitudes consistent.
* **No parameters**: There are no learnable parameters ($\gamma$, $\beta$), reducing computational overhead while maintaining effectiveness in deep generative networks.
15 changes: 15 additions & 0 deletions questions/189_pixelnormalization/meta.json
@@ -0,0 +1,15 @@
{
"id": "189",
"title": "Implement Pixel Normalization",
"difficulty": "easy",
"category": "Deep Learning",
"video": "",
"likes": "0",
"dislikes": "0",
"contributor": [
{
"profile_link": "https://github.com/LeTienQuyet",
"name": "LeTienQuyet"
}
]
}
4 changes: 4 additions & 0 deletions questions/189_pixelnormalization/solution.py
@@ -0,0 +1,4 @@
import numpy as np

def pixel_normalization(X: np.ndarray, eps: float = 1e-8) -> np.ndarray:
return X / np.sqrt(np.mean(X**2, axis=1, keepdims=True) + eps)
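The batch-independence claim from learn.md can be verified directly against this implementation (a minimal sketch; the function body is repeated here so the snippet is self-contained):

```python
import numpy as np

def pixel_normalization(X: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    return X / np.sqrt(np.mean(X**2, axis=1, keepdims=True) + eps)

rng = np.random.default_rng(0)
batch = rng.standard_normal((4, 3, 2, 2))

# No batch-level statistics are involved, so normalizing one sample alone
# matches normalizing it as part of a larger batch.
full = pixel_normalization(batch)
single = pixel_normalization(batch[:1])
print(np.allclose(full[:1], single))  # True
```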
9 changes: 9 additions & 0 deletions questions/189_pixelnormalization/starter_code.py
@@ -0,0 +1,9 @@
import numpy as np

def pixel_normalization(X: np.ndarray, eps: float = 1e-8) -> np.ndarray:
"""
    Perform pixel normalization on the input array X of shape (B, C, H, W).
    Each pixel value is divided by the square root of the mean of the squared
    activations across the channel dimension, plus a small epsilon (eps) for
    numerical stability.
    """
# Your code here
pass
10 changes: 10 additions & 0 deletions questions/189_pixelnormalization/tests.json
@@ -0,0 +1,10 @@
[
{
"test": "np.random.seed(42)\nB, C, H, W = 2, 2, 2, 2\nX = np.random.randn(B, C, H, W)\noutput = pixel_normalization(X)\nprint(np.round(output, 4))",
"expected_output": "[[[[1.2792, -0.7191], [0.5366, 1.2629]], [[-0.603, -1.2177], [1.3084, 0.6364]]], [[[-1.2571, 0.3858], [-0.3669, -0.9021]], [[0.6479, -1.3606], [-1.3658, -1.0891]]]]"
},
{
"test": "np.random.seed(42)\nB, C, H, W = 2, 2, 2, 1\nX = np.random.randn(B, C, H, W)\noutput = pixel_normalization(X)\nprint(np.round(output, 4))",
"expected_output": "[[[[0.8606], [-0.1279]], [[1.1222], [1.4084]]], [[[-0.2074],[-0.4127]], [[1.3989], [1.3527]]]]"
}
]