Commit 6cb6404

Update notes
1 parent 74ffa98 commit 6cb6404

2 files changed

+22
-0
lines changed

notes/main.typ

Lines changed: 22 additions & 0 deletions
@@ -102,6 +102,28 @@ Form the famous movie #link("https://en.wikipedia.org/wiki/Rebel_Without_a_Cause
The special issue, Volume 8, Issue 2, 2022,
of #emph("Observational Studies"), titled #link("https://en.wikipedia.org/wiki/Rebel_with_a_Cause_(book)")[`Rebel With a Cause`]

== Fun example

=== On overparameterized models

From a comment by Phil in #link("https://statmodeling.stat.columbia.edu/2025/11/14/how-is-it-that-this-problem-with-its-21-data-points-is-so-much-easier-to-handle-with-1-predictor-than-with-16-predictors/")[`Impossible statistical problems`] on Andrew Gelman's blog, November 14, 2025.

#quote("I’m imagining a political science student coming in for statistical advice:
Student: I’m trying to predict the Democratic percentage of the two-party vote in U.S. Presidential elections, six months before Election Day. I want to use just the past ten elections because I think the political landscape was too different before that.
Statistician: Sounds interesting. What predictive variables do you have?
Student: I’ve got the Democratic share in the last election, and the change in unemployment rate over the past year and the past three years, and the inflation rate over the past year and the past three years, and the change in median income over the past year and past three years.
Statistician: That’s a lot of predictors for not many elections, we are going to have some issues, but maybe we can use lasso or a regularization scheme or something. Let’s get started.
Student: I also own an almanac.
Statistician: Oh. Sorry, I can’t help you, your problem is impossible.")

With 10 data points and 7 predictors, there is still something one can do. With an almanac, which supplies 1000+ predictors, the problem becomes impossible: the model is overparameterized and has no predictive power for future elections.

Thus, with a tiny sample, adding too many useless predictors may indeed pollute the data and make the problem impossible.

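A minimal sketch of this effect in Python with NumPy. Everything in it is an assumption made for illustration and not taken from the blog post: a linear model, standard normal predictors and noise, 7 relevant predictors, and the hypothetical helper `out_of_sample_rmse`. It fits minimum-norm least squares on 10 training points, once with the 7 relevant predictors only and once with 1000 extra pure-noise columns standing in for the almanac.

```python
import numpy as np


def out_of_sample_rmse(n_train, n_noise, n_test=1000, n_signal=7, seed=0):
    """Fit least squares on simulated data; return (train RMSE, test RMSE)."""
    rng = np.random.default_rng(seed)
    p = n_signal + n_noise
    beta = np.zeros(p)
    beta[:n_signal] = 1.0  # assumption: only the first n_signal columns matter

    def draw(n):
        X = rng.standard_normal((n, p))
        y = X @ beta + rng.standard_normal(n)
        return X, y

    X_tr, y_tr = draw(n_train)
    X_te, y_te = draw(n_test)

    # When p > n the system is underdetermined; lstsq then returns the
    # minimum-norm solution, which fits the training data exactly.
    coef, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)

    def rmse(X, y):
        return float(np.sqrt(np.mean((X @ coef - y) ** 2)))

    return rmse(X_tr, y_tr), rmse(X_te, y_te)


for n_noise in (0, 1000):
    train, test = out_of_sample_rmse(n_train=10, n_noise=n_noise)
    print(f"noise predictors={n_noise:4d}  train RMSE={train:.3g}  test RMSE={test:.3g}")
```

With the 1000 noise columns the training error is essentially zero, because 10 equations in 1007 unknowns can always be solved exactly. That perfect fit carries no information about the test error, which is the quantity the student actually cares about.
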
#question("Dense and lower high-dimensional model")[
In a dense high-dimensional model where the number of samples $n$ is not large,
$ EE[Y | X] = rho(X^top beta), quad beta_j = O(1/p), quad j = 1, dots, p, quad p/n arrow.r kappa in (0, infinity) $
If the model is misspecified, then, as in the example above, too many useless predictors may lead to a useless prediction. Is there an example of such a dense high-dimensional model?]
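
To make the setup in the question concrete, here is a hedged sketch that draws data from one instance of the model. The notes do not specify the link, the covariate law, or the constants, so the logistic link for $rho$, the standard normal covariates, the exact choice $beta_j = 1/p$, and the helper name `simulate_dense_glm` are all assumptions.

```python
import numpy as np


def simulate_dense_glm(n, p, seed=0):
    """Draw (X, y, beta) with E[y | X] = sigmoid(X @ beta) and beta_j = 1/p."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))  # assumed covariate distribution
    beta = np.full(p, 1.0 / p)       # dense: every coefficient is nonzero, of order 1/p
    # Assumed logistic link; with these choices Var(X @ beta) = 1/p.
    mean = 1.0 / (1.0 + np.exp(-(X @ beta)))
    y = rng.binomial(1, mean)
    return X, y, beta


# p/n = 2, a fixed ratio in (0, infinity)
X, y, beta = simulate_dense_glm(n=100, p=200)
print(X.shape, y.mean(), beta.sum())
```
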

= On the indistinguishability or identification of statistical models

static/notes/notes.pdf

13.2 KB
Binary file not shown.
