Bayesian World View of Cognition

Statistical Machine Learning

Resources

https://rkabacoff.github.io/datavis/

Power analysis software: https://webpower.psychstat.org/wiki/models/index

General Principles

Avoid HARKing: Hypothesizing After the Results are Known
Jamie’s recommendations
- In the limitations section, talk about what would make you more confident in your statistical analysis.
- For power analysis, focus on the main hypothesis. Don’t try to get enough power for every hypothesis you want to test.
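A rough sketch of a sample-size calculation for a single main hypothesis, using the normal approximation for a two-sided, two-group comparison of means. The effect size d = 0.5 and the function name `n_per_group` are illustrative assumptions, not values from these notes; dedicated power software gives slightly larger numbers for a t-test.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided, two-sample comparison of means.

    Normal approximation: n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2.
    This slightly underestimates the exact t-test requirement.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


# Hypothetical main hypothesis with standardized effect size d = 0.5
n_main = n_per_group(0.5)
print(n_main)
```

Smaller assumed effects need much larger samples, which is one reason to power the study for the main hypothesis only.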

Tips

Remember to center your predictors
- (A lesson from a mistake in a “Large-Scale, Multi-Lab” study. https://onlinelibrary.wiley.com/doi/10.1111/desc.70029)
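A minimal sketch of mean-centering with NumPy. It illustrates one common reason to center (a raw predictor is almost perfectly correlated with its own square or product terms, while the centered version is not); it is not a reconstruction of the issue in the linked study. The predictor values are simulated and hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(20, 60, size=200)  # hypothetical predictor, e.g. age in months

x_c = x - x.mean()                 # mean-centered predictor

# Correlation between a predictor and its square, raw vs. centered
r_raw = np.corrcoef(x, x ** 2)[0, 1]
r_centered = np.corrcoef(x_c, x_c ** 2)[0, 1]

print(round(r_raw, 3), round(r_centered, 3))
```

After centering, the intercept of a regression on `x_c` is the predicted outcome at the mean of `x` rather than at `x = 0`.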
When there are multiple analysis choices (e.g., excluding outliers or not), do not create multiple data sheets. Instead, create new variables that indicate which data treatment applies, then take subsets of the data and run the analyses in the same file to compare the results.
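One way this could look in pandas, as a sketch: a single data frame with an indicator column, and both versions of the analysis run side by side. The column names, the made-up reaction times, and the 2000 ms cutoff are all illustrative assumptions.

```python
import pandas as pd

df = pd.DataFrame({
    "participant": [1, 2, 3, 4, 5, 6],
    "rt": [510, 495, 530, 2400, 505, 520],  # made-up reaction times (ms)
})

# Flag outliers in a new column instead of saving a second data sheet.
# The 2000 ms cutoff is an arbitrary illustrative rule.
df["is_outlier"] = df["rt"] > 2000

# Run the same analysis on the full data and on the subset, in one file.
mean_all = df["rt"].mean()
mean_no_outliers = df.loc[~df["is_outlier"], "rt"].mean()

print(mean_all, mean_no_outliers)
```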

Data Preparation

Use meaningful codes for missing values [1]
Keep a ‘data dictionary’ that lists the name of each variable, a description, and other information. [1]
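A small sketch of both points, assuming pandas. The missing-value codes (`NA_not_recorded`, `NA_fussy`), the variable names, and the dictionary layout are hypothetical examples, not conventions prescribed by [1].

```python
import io
import pandas as pd

# Hypothetical data dictionary: one entry per variable
data_dictionary = {
    "id": "Participant identifier",
    "age_months": "Age at test, in months",
    "score": "Proportion correct, from 0 to 1",
}

# Hypothetical raw file that uses descriptive codes for missing values
raw = io.StringIO(
    "id,age_months,score\n"
    "1,24,0.8\n"
    "2,NA_not_recorded,0.6\n"
    "3,30,NA_fussy\n"
)
missing_codes = ["NA_not_recorded", "NA_fussy"]

# na_values tells pandas which strings to treat as missing
df = pd.read_csv(raw, na_values=missing_codes)
n_missing = df.isna().sum().to_dict()

print(n_missing)
```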

Math

Variance of a random vector $X$ along a direction $u$: $V[u^TX] = u^T V[X] u$
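A quick numerical check of this identity with NumPy, using the sample covariance matrix in place of $V[X]$. The simulated data and the direction `u` are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
# Simulated data with correlated columns (n = 500 observations, p = 3 variables)
mixing = np.array([[2.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 0.7]])
X = rng.normal(size=(500, 3)) @ mixing

u = np.array([1.0, 2.0, -1.0])
u = u / np.linalg.norm(u)               # unit-length direction

S = np.cov(X, rowvar=False)             # p x p sample covariance (divides by n - 1)
var_projected = np.var(X @ u, ddof=1)   # variance of the projected data u^T x
var_quadratic = u @ S @ u               # quadratic form u^T S u

print(var_projected, var_quadratic)
```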
To derive correlation matrices and SS-CP matrices:
- Data $X \in \R^{n \times p}$, where $n$ is the number of observations and $p$ is the number of variables ($x_1 \dots x_p$) → each observation provides a pairing of those $p$ values → we can obtain pairwise correlations among these $p$ variables.
- Variable means (column means) = $\frac{1}{n} \begin{bmatrix} 1 \dots 1\end{bmatrix} X \in \R^{1 \times p}$
- Mean-centered data $X_c = X - \begin{bmatrix} 1 \dots 1\end{bmatrix}^T \frac{1}{n} \begin{bmatrix} 1 \dots 1\end{bmatrix} X$
- Sums-of-Squares and Cross-Products matrix (SS-CP) = $X_c^T X_c$
- Variance-covariance matrix = $\frac{1}{n-1} X_c^T X_c$
- Correlation matrix = $Diag(1/s_{x_1} \dots 1/s_{x_p}) \frac{1}{n-1} X_c^T X_c Diag(1/s_{x_1} \dots 1/s_{x_p})$
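The steps above can be sketched in NumPy and compared against the built-in `np.cov` and `np.corrcoef`. The data are simulated; the variable names are just labels for this sketch.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 50, 3
X = rng.normal(size=(n, p))              # simulated data, n observations x p variables

ones = np.ones((n, 1))                   # column vector of ones
means = (ones.T @ X) / n                 # 1 x p row of column means
X_c = X - ones @ means                   # mean-centered data
sscp = X_c.T @ X_c                       # sums-of-squares and cross-products
cov = sscp / (n - 1)                     # variance-covariance matrix
D = np.diag(1 / np.sqrt(np.diag(cov)))   # Diag(1/s_{x_1} ... 1/s_{x_p})
corr = D @ cov @ D                       # correlation matrix

print(np.round(corr, 3))
```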

Basics

STAT 400: Statistics and Probability I

Parameter describes a population; Statistic describes a sample
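- Population standard deviation $\sigma=\sqrt{\frac{1}{N} \sum_i (x_i-\mu)^2 }$, where $\mu$ is the population mean and $N$ is the population size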
- Sample standard deviation $s=\sqrt{\frac{1}{n-1} \sum_i (x_i-\bar x)^2 }$
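A short sketch contrasting the two divisors, using only the Python standard library. The data values are made up; `statistics.pstdev` divides by $n$ and `statistics.stdev` divides by $n-1$.

```python
import math
import statistics

x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up data
n = len(x)
mean = sum(x) / n
ss = sum((xi - mean) ** 2 for xi in x)  # sum of squared deviations

sigma = math.sqrt(ss / n)        # treats x as the whole population
s = math.sqrt(ss / (n - 1))      # sample standard deviation

print(sigma, s)
```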