
BioinformaticsTheMachineLearningApproach Chap 4.

Introduction

Once a parameterized model M(w) for the data has been constructed, the next steps are:

  1. The estimation of the complete distribution P(w,D) and the posterior P(w|D)

  2. The estimation of the optimal set of parameters w by maximizing P(w|D)

  3. The estimation of marginals and expectations with respect to the posterior
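
As a concrete illustration of steps 1 and 2, below is a minimal sketch of posterior and MAP estimation for a coin-flip model with a Beta prior; the data counts and the hyperparameters a, b are hypothetical choices made here for illustration, not an example from the book.

{{{
# Minimal sketch: posterior and MAP estimation for a coin's head
# probability w under a hypothetical Beta(a, b) prior.
# The posterior P(w|D) is Beta(a + heads, b + tails), and its mode
# is the MAP estimate of step 2 above.

def map_estimate(heads, tails, a=2.0, b=2.0):
    # Mode of Beta(a + heads, b + tails):
    #   w* = (a + heads - 1) / (a + b + heads + tails - 2)
    return (a + heads - 1) / (a + b + heads + tails - 2)

D = {"heads": 7, "tails": 3}                  # hypothetical data
print(map_estimate(D["heads"], D["tails"]))   # -> 0.666...
}}}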

Dynamic Programming

  • DynamicProgramming is a general optimization technique: using Recursion, a problem is subdivided into two similar subproblems of smaller size.

  • Finding the shortest path between two nodes in a graph.
  • Used extensively in BiologicalSequenceAnalysis. Since essentially an N x N matrix must be built and each of its entries visited, the ComputationalComplexity is O(N^2) (see the sketch below).
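
Below is a minimal sketch of this quadratic recursion on sequences, in the style of Needleman-Wunsch global alignment scoring; the scoring scheme and example sequences are arbitrary assumptions.

{{{
# Minimal sketch: dynamic programming for global sequence alignment
# scoring (Needleman-Wunsch style). The scoring scheme is arbitrary.

MATCH, MISMATCH, GAP = 1, -1, -2

def align_score(s, t):
    n, m = len(s), len(t)
    # (n+1) x (m+1) DP matrix -- hence O(N^2) time and space.
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * GAP
    for j in range(1, m + 1):
        F[0][j] = j * GAP
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = F[i-1][j-1] + (MATCH if s[i-1] == t[j-1] else MISMATCH)
            F[i][j] = max(diag, F[i-1][j] + GAP, F[i][j-1] + GAP)
    return F[n][m]

print(align_score("GATTACA", "GCATGCU"))   # hypothetical sequences
}}}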

Gradient Descent

Rather than finding the GlobalOptima, this method often gets stuck in LocalOptima. Various modified GradientDescent methods therefore exist; ConjugateGradientDescent is one of them.
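
Below is a minimal sketch of plain gradient descent on a one-dimensional function with two minima; the function, learning rate, and starting points are arbitrary assumptions, chosen to show how the result depends on initialization.

{{{
# Minimal sketch: gradient descent on f(w) = w^4 - 3w^2 + w,
# which has two local minima; the one reached depends on w0.

def grad(w):                      # f'(w) = 4w^3 - 6w + 1
    return 4 * w**3 - 6 * w + 1

def gradient_descent(w0, lr=0.01, steps=1000):
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)         # move against the gradient
    return w

print(gradient_descent(-2.0))    # reaches the lower (global) minimum
print(gradient_descent(+2.0))    # trapped in the other local minimum
}}}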

Random-Direction Descent

EM/GEM Algorithm

The ExpectationMaximization (EM) and generalized ExpectationMaximization (GEM) algorithms.
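
As a hedged illustration, below is a minimal EM sketch for a two-component one-dimensional Gaussian mixture with fixed unit variances; the synthetic data and initial guesses are assumptions made here, not an example from the book.

{{{
# Minimal sketch: EM for a mixture of two 1D Gaussians with known
# unit variance; only the means and the mixing weight are learned.
import math, random

random.seed(0)
# Hypothetical synthetic data from N(0,1) and N(4,1).
data = [random.gauss(0, 1) for _ in range(200)] + \
       [random.gauss(4, 1) for _ in range(200)]

def normal_pdf(x, mu):
    return math.exp(-0.5 * (x - mu)**2) / math.sqrt(2 * math.pi)

mu1, mu2, pi1 = -1.0, 1.0, 0.5          # rough initial guesses
for _ in range(50):
    # E-step: posterior responsibility of component 1 for each point.
    r = [pi1 * normal_pdf(x, mu1) /
         (pi1 * normal_pdf(x, mu1) + (1 - pi1) * normal_pdf(x, mu2))
         for x in data]
    # M-step: re-estimate parameters from the responsibilities.
    n1 = sum(r)
    mu1 = sum(ri * x for ri, x in zip(r, data)) / n1
    mu2 = sum((1 - ri) * x for ri, x in zip(r, data)) / (len(data) - n1)
    pi1 = n1 / len(data)

print(mu1, mu2, pi1)   # should approach roughly 0, 4, 0.5
}}}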

Markov-Chain Monte-Carlo Methods

The general goal in a Bayesian framework is to compute expectations over a high-dimensional probability distribution P(x1,...,xn). The MarkovChainMonteCarlo (MCMC) approach to this is:

  • First, approximate the expectation by the MonteCarloMethod: E(f) ≈ (1/T) sum_{t=1..T} f(x1^t,...,xn^t), where (x1^t,...,xn^t) is the t-th sample.

  • In order to sample from P, construct a MarkovChain having P as its equilibrium distribution, then simulate the chain and sample from its equilibrium distribution.

A MarkovChain is entirely defined by the initial distribution P(S0) and the TransitionProbability Pt = P(S(t+1)|S(t)).

In order to achieve our goal of sampling from P(x1,...,xn), we now turn to the two main MCMC algorithms: GibbsSampling and the MetropolisAlgorithm.
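
Below is a minimal sketch of the MetropolisAlgorithm with a symmetric random-walk proposal; the unnormalized target density, proposal width, and chain length are arbitrary assumptions. The collected samples are then averaged as in the MonteCarlo estimate of E(f) above.

{{{
# Minimal sketch: Metropolis sampling from an unnormalized target
# density p(x) ~ exp(-x^4) (arbitrary choice); the normalizing
# constant Z is never needed, only ratios p(y)/p(x).
import math, random

random.seed(0)

def p(x):                          # unnormalized target density
    return math.exp(-x**4)

x, samples = 0.0, []
for _ in range(50_000):
    y = x + random.gauss(0, 1.0)   # symmetric random-walk proposal
    if random.random() < min(1.0, p(y) / p(x)):
        x = y                      # accept the move
    samples.append(x)

burn = samples[5000:]              # discard burn-in
# Monte Carlo estimate of E(f) with f(x) = x^2, as in the sum above.
print(sum(s * s for s in burn) / len(burn))
}}}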

Simulated Annealing

SimulatedAnnealing is a general-purpose optimization algorithm inspired by StatisticalMechanics, used in combination with MCMC ideas such as the MetropolisAlgorithm.

The probability of being in state s at temperature T is given by the BoltzmannGibbsDistribution:

                       e^(-f(s)/kT)
P(s) = P(x1,...,xn) = --------------
                            Z
  • Start with a high temperature and progressively lower it according to some annealing schedule.
  • Logarithmic annealing schedule, Tt = T / log t --> very slow, impractical.
  • Geometric annealing schedule, Tt = u * T(t-1) with 0 < u < 1 (used in the sketch below).
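
Below is a minimal sketch combining the Metropolis acceptance rule with a geometric annealing schedule; the energy function, initial temperature, cooling rate u, and move size are all assumptions chosen for illustration.

{{{
# Minimal sketch: simulated annealing with a geometric schedule
# Tt = u * T(t-1), minimizing an arbitrary multimodal energy f(s).
import math, random

random.seed(0)

def f(s):                              # hypothetical energy function
    return s**4 - 3 * s**2 + s

s, T, u = 5.0, 10.0, 0.999             # initial state, temperature, cooling rate
best = s
for _ in range(10_000):
    s_new = s + random.gauss(0, 0.5)   # propose a random move
    dE = f(s_new) - f(s)
    # Metropolis criterion at the current temperature T.
    if dE < 0 or random.random() < math.exp(-dE / T):
        s = s_new
        if f(s) < f(best):
            best = s
    T *= u                             # geometric annealing schedule

print(best, f(best))                   # near the global minimum of f
}}}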

Evolutionary and Genetic Algorithms

A broad class of optimization algorithms that simulate evolution.

  • They rely on the generation of random perturbations, or mutations, together with a fitness function used to assess the quality of a given point and to filter out mutations that are not useful.
  • GeneticAlgorithms are a subclass of EvolutionaryAlgorithms (see the sketch below).
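
Below is a minimal sketch of one evolutionary loop in this spirit: mutate candidates, then let a fitness function filter the population. The population size, mutation scale, and fitness function are assumptions made for illustration.

{{{
# Minimal sketch: an evolutionary loop that mutates real-valued
# candidates and keeps the fittest survivors each generation.
import random

random.seed(0)

def fitness(x):                        # hypothetical fitness: maximize
    return -(x - 3.0)**2               # peak at x = 3

pop = [random.uniform(-10, 10) for _ in range(20)]
for _ in range(100):
    # Mutation: each parent spawns a randomly perturbed child.
    children = [x + random.gauss(0, 0.5) for x in pop]
    # Selection: fitness filters out mutations that are not useful.
    pop = sorted(pop + children, key=fitness, reverse=True)[:20]

print(max(pop, key=fitness))           # close to 3.0
}}}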

Learning Algorithms: Miscellaneous Aspects

Control of Model Complexity

Online/Batch Learning

Training/Test/Validation

Early Stopping

Ensembles

Balancing and Weighting Schemes
