Blog – Guest on the snow mountain

Contrast coding

Assigning numerical values to categorical (factor) variables is called contrast coding. This need arises when we want to estimate the difference in the dependent variable among different conditions of an independent variable. In other words, we use contrast coding to instruct the Bayesian model to compare the conditional means between different conditions or bundles of…

by chloeloloblue 19th Jul 2025

Bayesian Hierarchical Model

Data is often grouped by similarities that arise from repeated measurements for each subject or experimental item. This structure can be represented as a hierarchy, and a model that does so is known as a hierarchical model, multi-level model, mixed model, or partial pooling model. Exchangeability: The Bayesian hierarchical model is based on the assumption of exchangeability. This…

by chloeloloblue 3rd Jul 2025

Three linear regression models – the Bayesian way

Regression tells us how the dependent (output) variable responds to changes in independent variables, which can be categorical, ordinal or continuous. In the following sections, I show how three linear models can be defined in a Bayesian framework when the dependent variable follows different likelihood functions. Although the formulae are linear, the relationship between dependent and…

by chloeloloblue 26th Nov 2024, 30th Jan 2025

A simple Bayesian regression model with Stan: brms

For most Bayesian models, there is no analytical solution for deriving posterior distributions or integrating them. Rather, we approximate those posteriors via random sampling from all possible parameter sets and their corresponding likelihood functions. Specifically, we draw posterior samples from a multidimensional space, where each dimension corresponds to a parameter. The shape of this…

by chloeloloblue 15th Oct 2024

Extreme Gradient Boosting (XGBoost)

XGBoost is based on an ensemble of gradient boosted trees. It is a supervised learning method in which the algorithm optimises model parameters on a training data set (features and labels) by minimizing an objective function that consists of a loss function (L) and a regularization term (Ω). Common choices for L…

by chloeloloblue 21st Apr 2024

Logistic Regression

Logistic regression is a generalized linear model (GLM) that defines a link function to connect the output from linear regression to the quantity of interest. Specifically, logistic regression uses the log odds to translate the unbounded real number from the linear regression to a probability between zero and one. The parameters inside the linear equation…

by chloeloloblue 20th Apr 2024, 9th Dec 2024

Bayesian for data analysis and hypothesis testing

In Bayesian data analysis, we often hypothesize that the observed data are sampled from an underlying data model (M) with its own probability density function (PDF) and associated parameters (θ, such as p in the Binomial distribution). The PDF is used as the likelihood function, p(y|θ, M). Our assumption or knowledge or belief about parameters used in…

by chloeloloblue 30th Mar 2024, 9th May 2024

Beta-Binomial distribution

The Beta-binomial distribution is a discrete probability distribution. It is essentially a Binomial distribution in which the probability of success at each of the n “future” Bernoulli trials is not fixed but drawn from a Beta distribution. The Beta-binomial distribution is used to capture overdispersion in Binomial-type data. There are 3 parameters in Beta-binomial: n, the future…

by chloeloloblue 24th Mar 2024

Log-normal distribution

The log-normal distribution is a continuous probability distribution of a positive real random variable whose logarithm is normally distributed. It is often used to model right-skewed (long right tail), non-negative data. A log-normal distribution is defined by the location μ and scale σ, which coincide with the mean and standard deviation of the normally distributed log-transformed variable. As they are…

by chloeloloblue 24th Mar 2024, 25th Oct 2024

Bivariate and Multivariate distributions

Univariate distributions describe the probability distribution of a single variable, such as the number of desirable events in a Binomial distribution, the variable with a given mean and standard deviation in a Normal distribution, and the probability for each Bernoulli event in a Beta distribution. A bivariate or multivariate distribution defines probability over two or more dimensions. Assuming a PMF is…

by chloeloloblue 17th Mar 2024, 30th Mar 2024
