(<-)

[BioinfoMla/Introduction]

[BioinfoMla]

[BioinfoMla/ProbabilisticExample]

(->)

BioinformaticsTheMachineLearningApproach, Chapter 2.

Introduction: Bayesian Modeling

The goal in MachineLearning is to extract useful information from a corpus of data D by building good ProbabilisticModels.

The MachineLearning approach is useful in fields where data are abundant but adequate theory is scarce, which is exactly the situation in bioinformatics.

induction and inference problems: building models from available data

When reasoning in the presence of uncertainty, one must rely on induction and inference rather than deduction.

A proper induction process requires the following three steps.

  1. Clearly state what the hypotheses or models are, along with all the background information and the data.

  2. Use the language of Probability theory to assign PriorProbability to the hypotheses.

  3. Use probability calculus for the inference process, in particular to evaluate PosteriorProbability (or degrees of belief) for the hypotheses in light of the available data, and to derive unique answers.
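The three steps above can be sketched numerically. The coin-tossing hypotheses, priors, and data below are purely illustrative, not from the text:

```python
# Step 1: state the hypotheses and the data (illustrative).
# H_fair:   the coin is fair        (P(heads) = 0.5)
# H_biased: the coin favors heads   (P(heads) = 0.8)
data = "HHTHHHHTHH"  # 8 heads, 2 tails

# Step 2: assign prior probabilities to the hypotheses.
priors = {"H_fair": 0.5, "H_biased": 0.5}
p_heads = {"H_fair": 0.5, "H_biased": 0.8}

# Step 3: use probability calculus to evaluate the posteriors.
def likelihood(h):
    p = p_heads[h]
    return p ** data.count("H") * (1 - p) ** data.count("T")

evidence = sum(likelihood(h) * priors[h] for h in priors)
posteriors = {h: likelihood(h) * priors[h] / evidence for h in priors}

print(posteriors)  # the biased hypothesis dominates after 8 heads in 10 tosses
```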

The Cox-Jaynes Axioms

For a proposition X, information I, a model M, and a degree of confidence π(X|I), the following three axioms are assumed.

  1. π(X|I) > π(Y|I) and π(Y|I) > π(Z|I) imply π(X|I) > π(Z|I)

  2. π(¬X|I) = F[π(X|I)] --> F(x) = 1-x   (¬X denotes the negation of X)

  3. π(X,Y|I) = G[π(X|I), π(Y|X,I)] ---> G(x,y) = xy

Here, P(X|I) corresponds to the PriorProbability, and P(X|Y,I) to the PosteriorProbability.
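A small numeric check (with illustrative values, not from the text) that the functional forms F(x) = 1-x and G(x,y) = xy forced by the axioms reproduce the familiar rules of probability:

```python
import math

F = lambda x: 1 - x       # negation:    P(not X | I) = 1 - P(X | I)
G = lambda x, y: x * y    # conjunction: P(X, Y | I) = P(X | I) * P(Y | X, I)

p_x = 0.7           # P(X | I), an illustrative value
p_y_given_x = 0.4   # P(Y | X, I)

assert math.isclose(F(F(p_x)), p_x)             # double negation recovers X
assert math.isclose(G(p_x, p_y_given_x), 0.28)  # product rule
assert math.isclose(G(p_x, 1.0), p_x)           # conjunction with a certainty
```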

The simple axioms of decision theory lead one to construct and estimate BayesianProbability associated with the uncertain environment, and to maximize the corresponding expected utility. In fact, an even more general theory is GameTheory, where the uncertain environment includes other agents or players.

Bayesian Inference and Induction

          P(D|M) P(M)
P(M|D) = -------------
             P(D)

Because these probabilities are typically very small, it is convenient to work with logarithms: logP(M|D) = logP(D|M) + logP(M) - logP(D)

Here, P(M) is the PriorProbability, P(D|M) is the DataLikelihood, and P(M|D) is the PosteriorProbability.
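A minimal sketch of why the log form matters in practice: direct likelihoods can underflow to zero in floating point, while their logarithms stay representable. The models and numbers below are illustrative:

```python
import math

# e.g. 2000 fair coin tosses: P(D|M) = 0.5**2000 underflows to 0.0,
assert 0.5 ** 2000 == 0.0
# but log P(D|M) is perfectly representable:
log_likelihood = 2000 * math.log(0.5)
log_prior = math.log(0.5)  # log P(M)

def log_posteriors(log_joints):
    """Normalize log P(D|M) + log P(M) across models with the
    log-sum-exp trick, which computes log P(D) without leaving log space."""
    m = max(log_joints)
    log_evidence = m + math.log(sum(math.exp(v - m) for v in log_joints))
    return [v - log_evidence for v in log_joints]

# Two hypothetical models compared on the same data:
posts = log_posteriors([log_likelihood + log_prior,
                        2000 * math.log(0.6) + math.log(0.5)])
assert math.isclose(sum(math.exp(v) for v in posts), 1.0)
```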

Priors

The use of priors is the distinctive strength of the Bayesian approach.

Useful practical priors include the GaussianPrior, the GammaPrior, and the DirichletPrior.
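As a sketch of a DirichletPrior in practice: for multinomial parameters such as nucleotide frequencies, the Dirichlet is the conjugate prior and its hyperparameters act as pseudocounts. The alphas and counts below are made up for illustration:

```python
import random

alphas = [1.0, 1.0, 1.0, 1.0]  # uniform Dirichlet prior over A, C, G, T
counts = [12, 3, 4, 1]         # observed nucleotide counts (illustrative)

# The posterior is Dirichlet(alpha_i + n_i); its mean gives a
# smoothed frequency estimate that never assigns zero probability:
post = [a + n for a, n in zip(alphas, counts)]
mean = [p / sum(post) for p in post]
print(mean)  # e.g. P(A) = (1 + 12) / (4 + 20)

# A Dirichlet sample can be drawn by normalizing independent Gamma draws:
g = [random.gammavariate(a, 1.0) for a in post]
sample = [x / sum(g) for x in g]
```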

Data Likelihood

Minimizing the error function is equivalent to MaximumLikelihood (ML) estimation, or more generally maximum a posteriori (MAP) estimation.
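The ML/MAP distinction can be illustrated on a coin with bias theta; the counts and the Beta prior below are assumptions for the sketch, not from the text:

```python
# Observed data: n heads out of N tosses (illustrative numbers).
n, N = 9, 10

# ML maximizes the likelihood P(D|theta) alone: theta_ML = n / N.
theta_ml = n / N

# MAP also weighs a prior P(theta); with a Beta(a, b) prior (the
# one-dimensional Dirichlet), the posterior mode is
# (n + a - 1) / (N + a + b - 2).
a, b = 2, 2  # an assumed prior pulling theta toward 0.5
theta_map = (n + a - 1) / (N + a + b - 2)

print(theta_ml, theta_map)  # 0.9 vs ~0.83: the prior tempers the estimate
```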

Parameter Estimation and Model Selection

Prediction, Marginalization of Nuisance Parameters, and Class Comparison

Ockham's Razor

OccamsRazor is automatically embodied in the Bayesian framework in at least two different ways.
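One of these ways can be shown numerically: the evidence P(D|M) of a flexible model is spread over many possible data sets, so for unremarkable data a simpler model can score higher. The models and data below are illustrative:

```python
from math import factorial

n_heads, N = 6, 10  # a fairly ordinary sequence of tosses (illustrative)

# Simple model M0: the coin is exactly fair.
evidence_simple = 0.5 ** N

# Flexible model M1: unknown bias theta with a uniform prior;
# integrating theta**n * (1-theta)**(N-n) over theta gives n!(N-n)!/(N+1)!.
evidence_flexible = (factorial(n_heads) * factorial(N - n_heads)
                     / factorial(N + 1))

print(evidence_simple, evidence_flexible)
# M0 has the higher evidence: the flexible model is penalized
# automatically, with no explicit complexity term.
```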

Minimum Description Length

Model Structures: Graphical Models and Other Tricks

Graphical Models and Independence

Hidden Variables

Hierarchical Modeling

Hybrid Modeling/Parameterization

Exponential Family of Distributions

Summary

BioinfoMla/MachineLearningFoundation (last edited 2011-08-03 11:01:10 by localhost)