|| (<-) ||[BioinfoMla/ProbabilisticExample] ||BioinfoMla || [BioinfoMla/NeuralNetworkTheory] || (->) || BioinformaticsTheMachineLearningApproach Chap 4. '''MachineLearning Algorithm''' ||<>||

== Introduction ==
Once a parameterized model M(w) for the data has been constructed, the next steps are:
 1. The estimation of the '''complete distribution P(w,D)''' and the '''posterior P(w|D)'''
 1. The estimation of the optimal set of '''parameters w''' by maximizing P(w|D)
 1. The estimation of '''marginals''' and '''expectations''' with respect to the posterior

== Dynamic Programming ==
 * DynamicProgramming is a general optimization technique: using [[Recursion]], a problem is subdivided into two similar subproblems of smaller size.
 * A classic example is finding the shortest path between two nodes in a graph.
 * Widely used in BiologicalSequenceAnalysis. Since an N x N matrix must be built and each of its entries visited, the ComputationalComplexity is O(N^2).

== Gradient Descent ==
 * GradientDescent: an iterative numerical procedure for finding a minimum.
 * learning method for NeuralNetwork (BackpropagationAlgorithm)
 * ForwardBackwardAlgorithm in HiddenMarkovModel

This method often settles in a LocalOptimum rather than finding the GlobalOptimum. Many modified forms of GradientDescent therefore exist; ConjugateGradientDescent is one of them.

=== Random-Direction Descent ===
 * A descent procedure that can be used without SteepestDescent when escaping LocalMinima matters and the gradient is not easy to compute.
 * Also used in ExpectationMaximization and various evolutionary algorithms.

== EM/GEM Algorithm ==
ExpectationMaximization (EM) and the generalized ExpectationMaximization algorithm.
 * BaumWelchAlgorithm in HiddenMarkovModel
 * Useful for models and situations with hidden values (hidden units in NeuralNetwork, hidden states in HiddenMarkovModel).
 * The algorithm alternates between an E step, in which the hidden values are computed from the given parameters, and an M step, in which the parameters are updated again from the hidden values.
 * If we define the energy of a hidden configuration H to be f(H), then the BoltzmannGibbsDistribution is given by the posterior.
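The E step / M step alternation described above can be sketched on a two-component, one-dimensional Gaussian mixture. Everything below (the function name, the crude half-split initialization, the toy mixture) is an illustrative assumption, not the book's code:

```python
import math
import random

def em_gaussian_mixture(data, n_iter=50):
    """Fit a two-component 1-D Gaussian mixture by EM.

    E step: compute hidden responsibilities from the current parameters.
    M step: re-estimate the parameters from those responsibilities.
    """
    # Crude initialization: split the sorted data in half.
    xs = sorted(data)
    half = len(xs) // 2
    mu = [sum(xs[:half]) / half, sum(xs[half:]) / (len(xs) - half)]
    var = [1.0, 1.0]
    pi = [0.5, 0.5]

    def gauss(x, m, v):
        return math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)

    for _ in range(n_iter):
        # E step: responsibility of each component for each point.
        resp = []
        for x in data:
            p = [pi[k] * gauss(x, mu[k], var[k]) for k in range(2)]
            s = sum(p)
            resp.append([pk / s for pk in p])
        # M step: update mixing weights, means, and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            pi[k] = nk / len(data)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var[k] = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk
            var[k] = max(var[k], 1e-6)  # guard against variance collapse

    return mu, var, pi

# Usage: two well-separated clusters; the fitted means should end up
# near the true values 0 and 5.
random.seed(0)
data = ([random.gauss(0.0, 1.0) for _ in range(200)]
        + [random.gauss(5.0, 1.0) for _ in range(200)])
mu, var, pi = em_gaussian_mixture(data)
```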
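The iteration from the Gradient Descent section above, w <- w - eta * grad f(w), can be sketched as follows. The objective and every name here are assumptions made for illustration; on this convex quadratic there is a single optimum, so the local-optima caveat does not bite:

```python
def gradient_descent(grad, w0, learning_rate=0.1, n_steps=200):
    """Repeatedly step against the gradient: w <- w - eta * grad(w)."""
    w = list(w0)
    for _ in range(n_steps):
        g = grad(w)
        w = [wi - learning_rate * gi for wi, gi in zip(w, g)]
    return w

# Toy objective f(w) = (w1 - 3)^2 + (w2 + 1)^2; its gradient is
# (2(w1 - 3), 2(w2 + 1)), so the minimum sits at (3, -1).
grad_f = lambda w: [2.0 * (w[0] - 3.0), 2.0 * (w[1] + 1.0)]

w = gradient_descent(grad_f, [0.0, 0.0])
```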
== Markov-Chain Monte-Carlo Methods ==
The goal in the general Bayesian framework is to compute expectations over a high-dimensional probability distribution P(x1,...,xn). The MarkovChainMonteCarlo (MCMC) approach:
 * First, approximate the expectation with the MonteCarloMethod:
{{{
E(f) ≈ 1/T [ sum(t=1..T) f(xt1,...,xtn) ]
}}}
 * In order to sample from P, construct a MarkovChain having P as its equilibrium distribution, then simulate the chain and sample from its equilibrium distribution.

A MarkovChain is entirely defined by the initial distribution P(S0) and the TransitionProbability Pt = P(S(t+1)|S(t)). In order to achieve our goal of sampling from P(x1,...,xn), we now turn to the two main MCMC algorithms: GibbsSampling and the MetropolisAlgorithm.

== Simulated Annealing ==
SimulatedAnnealing: a general-purpose optimization algorithm inspired by StatisticalMechanics, used in combination with MCMC ideas such as the MetropolisAlgorithm.

The [[Probability]] of being in state s at temperature T is given by the BoltzmannGibbsDistribution:
{{{
                       e^(-f(s)/kT)
P(s) = P(x1,...,xn) = --------------
                            Z
}}}
 * Start with a high temperature and progressively lower it according to some annealing schedule.
 * logarithmic annealing schedule, T^t^ = T / log t --> very slow, impractical
 * geometric annealing schedule, T^t^ = uT^(t-1)^, with 0 < u < 1

== Evolutionary and Genetic Algorithms ==
A broad class of optimization algorithms that simulate evolution.
 * Generation of random perturbations, or mutations, and the presence of a fitness function that is used to assess the quality of a given point and filter out mutations that are not useful.
 * GeneticAlgorithm is a subclass of EvolutionaryAlgorithm.

== Learning Algorithm: Miscellaneous Aspects ==
=== Control of Model Complexity ===
=== Online/Batch Learning ===
=== Training/Test/Validation ===
=== Early Stopping ===
=== Ensembles ===
=== Balancing and Weighting Schemes ===
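The Simulated Annealing section above can be sketched as follows: a Metropolis acceptance rule combined with a geometric annealing schedule. The one-dimensional energy function, the parameter values, and all names are assumptions for illustration, not the book's implementation:

```python
import math
import random

def simulated_annealing(f, x0, t0=10.0, u=0.995, n_steps=4000, step=1.0, seed=1):
    """Minimize f(x) with the Metropolis acceptance rule under a
    geometric annealing schedule T_t = u * T_(t-1)."""
    rng = random.Random(seed)
    x, t = x0, t0
    best = x
    for _ in range(n_steps):
        cand = x + rng.uniform(-step, step)  # random perturbation of the state
        delta = f(cand) - f(x)
        # Metropolis rule: always accept downhill moves; accept uphill
        # moves with probability e^(-delta/T).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x = cand
        if f(x) < f(best):
            best = x
        t *= u  # progressively lower the temperature
    return best

# Toy one-dimensional energy with many local minima and the global
# minimum at x = 0 (an assumed example, not from the book).
energy = lambda x: x * x + 10.0 * math.sin(x) ** 2

best = simulated_annealing(energy, 6.0)
```

At high temperature the chain accepts most uphill moves and explores freely; as T falls it behaves more and more like pure descent, which is why a schedule that cools too fast degenerates into plain gradient-free local search.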