statistics – Guest on the snow mountain

Contrast coding

Assigning numerical values to categorical (factor) variables is called contrast coding. This need arises when we want to estimate the difference in the dependent variable among different conditions of an independent variable. In other words, we use contrast coding to instruct the Bayesian model to compare the conditional means between different conditions or bundles ofContinue reading “Contrast coding”

Bayesian Hierarchical Model

Data is often grouped by similarities that arise from repeated measurements for each subject or experimental item. This structure can be represented as hierarchies. This type of model is known as a hierarchical model, multi-level model, mixed model, or partial pooling model. Exchangeability The Bayesian hierarchical model is based on the assumption of exchangeability. ThisContinue reading “Bayesian Hierarchical Model”

Three linear regression models – the Bayesian way

Regression tells us how the dependent (the output) variable responds to changes in independent variables that can be categorical, ordinal or continuous. In following sections, I show how three linear models can be defined by Bayesian statistics while the dependent variable follows different likelihood functions. Although the formulae are linear, the relationship between dependent andContinue reading “Three linear regression models – the Bayesian way”

A simple Bayesian regression model with Stan: brms

For most Bayesian model fittings, there is no analytical solution for deriving posterior distributions or integrating them. Rather, we approximate those posteriors via random sampling from all possible parameter sets and their corresponding likelihood functions. Specifically, we sample posterior distributions from a multidimensional space, where each dimension corresponds to a parameter. The shape of thisContinue reading “A simple Bayesian regression model with Stan: brms”

Bayesian for data analysis and hypothesis testing

In Bayesian data analysis, we often hypothesize the observed data is sampled from an underlying data model (M) with its own probability density function (PDF) and associated parameters (, such as p in Binomial distribution). The PDF is used as the likelihood function (p(y|, M). Our assumption or knowledge or belief about parameters used inContinue reading “Bayesian for data analysis and hypothesis testing”

Beta-Binomial distribution

Beta-binomial distribution is a discrete probability distribution. It is basically a Binomial distribution when probability of success at each of the “future” n Bernoulli trials is not fixed but drawn from a Beta distribution. Beta-binomial distribution is used to capture overdispersion in Binomial type distributed data. There are 3 parameters in Beta-binomial: n, the futureContinue reading “Beta-Binomial distribution”

Log-normal distribution

The log-normal distribution is a continuous probability distribution of a positive real random variable, whose logarithm is normally distributed. It is often used to model right-skewed (long right tail), none-negative data. A Log-normal distribution is defined by the locationand scalethat coincide with the mean and standard deviation of the log-transformed normal distribution. As they areContinue reading “Log-normal distribution”

Bivariate and Multivariate distributions

Univariate distributions display probability distribution for one variable, such as the number of desirable events in Binomial distribution, the variable having mean and standard deviation in Normal distribution, and the probability for each Bernoulli event in a Beta distribution. A bivariate or multivariate distribution defines probability for two or more dimensions. Assuming a PMF isContinue reading “Bivariate and Multivariate distributions”

Beta Distribution

Binomial distribution requires the knowledge of the probability for obtaining the desirable outcome in each trial (p). However, in real life, this parameter needs to be inferred from data. The foundation of statistics is inference, that is to figure out the probability from observations. Beta distribution shows probabilities of a continuous range of probabilities (variableContinue reading “Beta Distribution”

Bayesian for parameter estimation and A/B test

There are two drugs that are applied to equal amount of patients. We obtained two sets of data for each corresponding drug indicating its effectiveness (e.g. how long each patient survived without symptom). We want to know which drug is more effective. Furthermore, we want to assess how much more effective the winner drug isContinue reading “Bayesian for parameter estimation and A/B test”