BioinformaticsTheMachineLearningApproach Chap 2.
MachineLearning Foundations: The Probabilistic Framework
Introduction: Bayesian Modeling
The goal in MachineLearning is to extract useful information from a corpus of data D by building good ProbabilisticModels.
The MachineLearning approach is useful in fields where data are abundant but adequate theory is scarce --> well suited to bioinformatics.
induction and inference problems: building models from available data
When reasoning in the presence of
certainty, one uses deduction, as in information-rich sciences (mathematics, physics):
if X implies Y, and X is true, then Y must be true --> BoolesAlgebra
uncertainty, one uses induction and inference:
if X implies Y, and Y is true, then X becomes more plausible --> BayesianInference
A proper induction process requires the following three steps.
Clearly state what the hypotheses or models are, along with all the background information and the data.
Use the language of Probability theory to assign PriorProbability to the hypotheses.
Use probability calculus for the inference process, in particular to evaluate PosteriorProbability (or degrees of belief) for the hypotheses in light of the available data, and to derive unique answers.
The Cox Jaynes Axioms
For a proposition X, background information I, a model M, and a degree of confidence π(X|I), three axioms can be stated.
π(X|I) > π(Y|I) and π(Y|I) > π(Z|I) imply π(X|I) > π(Z|I)
π(X̄|I) = F[π(X|I)] --> F(x) = 1-x (where X̄ denotes the negation of X)
π(X,Y|I) = G[π(X|I), π(Y|X,I)] ---> G(x,y) = xy
Here, P(X|I) denotes the PriorProbability, and P(X|Y,I) denotes the PosteriorProbability.
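The negation and product relations above can be checked numerically on a toy joint distribution. A minimal sketch, with illustrative probabilities that are not from the text:

```python
# Toy joint distribution over two binary propositions X and Y
# (illustrative values; any proper joint distribution works).
joint = {("x0", "y0"): 0.1, ("x0", "y1"): 0.2,
         ("x1", "y0"): 0.3, ("x1", "y1"): 0.4}

p_x1 = sum(p for (x, _), p in joint.items() if x == "x1")      # P(X|I)
p_not_x1 = sum(p for (x, _), p in joint.items() if x == "x0")  # P(X̄|I)
p_y1_given_x1 = joint[("x1", "y1")] / p_x1                     # P(Y|X,I)

# Negation rule: π(X̄|I) = F[π(X|I)] with F(x) = 1 - x
assert abs(p_not_x1 - (1 - p_x1)) < 1e-12

# Product rule: π(X,Y|I) = G[π(X|I), π(Y|X,I)] with G(x, y) = x * y
assert abs(joint[("x1", "y1")] - p_x1 * p_y1_given_x1) < 1e-12
```

The Cox-Jaynes result is that any consistent degree-of-confidence calculus satisfying the three axioms can be rescaled into ordinary probabilities obeying exactly these two rules.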
The simple axioms of decision theory lead one to construct and estimate BayesianProbability associated with the uncertain environment, and to maximize the corresponding expected utility. In fact, an even more general theory is GameTheory, where the uncertain environment includes other agents or players.
Bayesian Inference and Induction
P(M|D) = P(D|M) P(M) / P(D)

Because these probabilities are typically very small, one works in log space:

log P(M|D) = log P(D|M) + log P(M) - log P(D)

Here, P(M) is the PriorProbability, P(D|M) is the DataLikelihood, and P(M|D) is the PosteriorProbability.
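A minimal sketch of this log-space computation, assuming a toy coin-flip data set and two hypothetical models (all numbers are illustrative):

```python
import math

# Illustrative data: counts of heads and tails.
heads, tails = 8, 2

# Two hypothetical models with equal priors P(M) = 0.5.
models = {
    "fair":   {"theta": 0.5, "log_prior": math.log(0.5)},
    "biased": {"theta": 0.8, "log_prior": math.log(0.5)},
}

# log P(D|M) + log P(M): Bernoulli likelihood of the counts
# (the binomial coefficient is the same for both models, so it cancels).
log_post_unnorm = {}
for name, m in models.items():
    log_lik = heads * math.log(m["theta"]) + tails * math.log(1 - m["theta"])
    log_post_unnorm[name] = log_lik + m["log_prior"]

# log P(D) normalizes the posterior; log-sum-exp avoids underflow.
mx = max(log_post_unnorm.values())
log_p_d = mx + math.log(sum(math.exp(v - mx) for v in log_post_unnorm.values()))

posterior = {name: math.exp(v - log_p_d) for name, v in log_post_unnorm.items()}
```

Working with log P throughout, and normalizing only at the end, is exactly the reason the log form of Bayes' theorem is used in practice.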
Priors
The use of priors is the very strength of the Bayesian approach.
- They act as additional data, reducing the amount of observed data required.
- Objective criteria such as MaximumEntropy or GroupTheoreticConsideration can be used to determine noninformative priors.
- They can be used implicitly.
- Results obtained under different priors can be compared.
Useful practical priors include the GaussianPrior, GammaPrior, and DirichletPrior.
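As a sketch of how a DirichletPrior is used in practice, consider smoothing nucleotide frequency estimates with pseudocounts; the counts below are invented for illustration, but the Dirichlet-multinomial conjugacy itself is standard:

```python
# Dirichlet pseudocounts (a uniform, weakly informative prior) and
# illustrative observed nucleotide counts for one alignment column.
alpha = {"A": 1.0, "C": 1.0, "G": 1.0, "T": 1.0}
counts = {"A": 12, "C": 3, "G": 4, "T": 1}

# The Dirichlet is conjugate to the multinomial: the posterior is again
# Dirichlet with parameters alpha + counts, and its mean gives smoothed
# frequency estimates with no zero probabilities.
total = sum(alpha[b] + counts[b] for b in alpha)
posterior_mean = {b: (alpha[b] + counts[b]) / total for b in alpha}
```

Even a base never observed in the column (here T appears once, but the same holds for a zero count) receives nonzero probability, which is what makes Dirichlet pseudocounts useful for sequence profiles.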
Data Likelihood
Minimizing the error function is equivalent to MaximumLikelihood (ML) estimation or, once priors on the parameters are included, to maximum a posteriori (MAP) estimation.
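This correspondence can be made concrete for Gaussian noise, where the negative log-likelihood equals half the squared error plus a constant; a sketch with illustrative residuals:

```python
import math

# Illustrative residuals (predictions minus targets) under unit-variance
# Gaussian noise.
residuals = [0.5, -1.0, 0.25]

sq_error = sum(r * r for r in residuals)

# Negative log-likelihood of each residual under N(0, 1):
# -log N(r) = 0.5 * r^2 + 0.5 * log(2*pi)
neg_log_lik = sum(0.5 * r * r + 0.5 * math.log(2 * math.pi) for r in residuals)

# The two differ only by a constant, so minimizing one minimizes the other.
const = 0.5 * math.log(2 * math.pi) * len(residuals)
assert abs(neg_log_lik - (0.5 * sq_error + const)) < 1e-12

# Adding a GaussianPrior on a weight w turns ML into MAP: the extra
# -log prior contributes w^2 / (2 * sigma2), i.e. an L2 penalty.
w, sigma2 = 0.8, 2.0
neg_log_prior = w * w / (2 * sigma2) + 0.5 * math.log(2 * math.pi * sigma2)
```

The same reasoning with a Laplace prior yields an L1 penalty instead, which is why regularized error minimization is MAP estimation in disguise.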
Parameter Estimation and Model Selection
Prediction, Marginalization of Nuisance Parameters, and Class Comparison
Ockham's Razor
OccamsRazor is automatically embodied in the Bayesian framework in at least two different ways.
- trivially, by introducing priors that favor simpler models
- even without priors, a heavily parameterized complex model spreads its evidence P(D|M) over many possible data sets, so it is favored only when the data are plentiful enough to warrant it
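The second mechanism can be illustrated with a toy evidence computation: a flexible coin model, whose bias has a uniform prior integrated out via the Beta integral, versus a fixed fair coin. The data below are invented for illustration:

```python
import math

# Illustrative data: a balanced sequence of coin flips.
heads, tails = 5, 5
n = heads + tails

# M1: fair coin, no free parameters.
evidence_fair = 0.5 ** n

# M2: bias theta ~ Uniform(0, 1), integrated out:
# P(D|M2) = integral of theta^h (1-theta)^t dtheta = h! t! / (h+t+1)!
# (the binomial coefficient is omitted from both models, so it cancels).
evidence_flex = (math.factorial(heads) * math.factorial(tails)
                 / math.factorial(n + 1))

# For balanced data the simpler model has the higher evidence: the
# flexible model "wasted" probability mass on unbalanced data sets.
```

No explicit complexity penalty was added anywhere; marginalizing over the extra parameter is what penalizes the flexible model.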
Minimum Description Length
Model Structures: Graphical Models and Other Tricks
Graphical Models and Independence
Hidden Variables
Hierarchical Modeling
Hybrid Modeling/Parameterization
Exponential Family of Distributions