|| (<-) ||[BioinfoMla/ProbabilisticExample] ||BioinfoMla || [BioinfoMla/NeuralNetworkTheory] || (->) || BioinformaticsTheMachineLearningApproach Chap 4. '''MachineLearning Algorithm''' ||<>||

== Introduction ==
Once a parameterized model M(w) for the data has been constructed, the next steps are:
 1. The estimation of the '''complete distribution P(w,D)''' and the '''posterior P(w|D)'''
 1. The estimation of the optimal set of '''parameters w''' by maximizing P(w|D)
 1. The estimation of '''marginals''' and '''expectations''' with respect to the posterior

== Dynamic Programming ==
 * DynamicProgramming is a general optimization technique: using [[Recursion]], a problem is subdivided into two similar subproblems of smaller size.
 * A classic example is finding the shortest path between two nodes in a graph.
 * Widely used in BiologicalSequenceAnalysis. Since an N x N matrix must be built and each of its entries visited, the ComputationalComplexity is O(N^2).

== Gradient Descent ==
 * GradientDescent: an iterative numerical procedure for finding a minimum.
 * learning method for NeuralNetwork (BackpropagationAlgorithm)
 * ForwardBackwardAlgorithm in HiddenMarkovModel

This method often settles in a LocalOptimum rather than finding the GlobalOptimum. Many modified forms of GradientDescent therefore exist; ConjugateGradientDescent is one of them.

=== Random-Direction Descent ===
 * A descent procedure that can be used without SteepestDescent when escaping LocalMinima matters and the gradient is not easy to compute.
 * Also used in ExpectationMaximization and various evolutionary algorithms.

== EM/GEM Algorithm ==
ExpectationMaximization (EM) and the generalized ExpectationMaximization algorithm.
 * BaumWelchAlgorithm in HiddenMarkovModel
 * Useful for models and situations with hidden values (hidden units in NeuralNetwork, hidden states in HiddenMarkovModel).
 * The algorithm alternates between an E step, in which the hidden values are computed from the given parameters, and an M step, in which the parameters are updated again from the hidden values.
 * If we define the energy of a hidden configuration H to be f(H), then the BoltzmannGibbsDistribution is given by the posterior.
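The E step / M step alternation described above can be sketched on a two-component, one-dimensional Gaussian mixture. Everything below (the function name, the crude half-split initialization, the toy mixture) is an illustrative assumption, not the book's code:

```python
import math
import random

def em_gaussian_mixture(data, n_iter=50):
    """Fit a two-component 1-D Gaussian mixture by EM.

    E step: compute hidden responsibilities from the current parameters.
    M step: re-estimate the parameters from those responsibilities.
    """
    # Crude initialization: split the sorted data in half.
    xs = sorted(data)
    half = len(xs) // 2
    mu = [sum(xs[:half]) / half, sum(xs[half:]) / (len(xs) - half)]
    var = [1.0, 1.0]
    pi = [0.5, 0.5]

    def gauss(x, m, v):
        return math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)

    for _ in range(n_iter):
        # E step: responsibility of each component for each point.
        resp = []
        for x in data:
            p = [pi[k] * gauss(x, mu[k], var[k]) for k in range(2)]
            s = sum(p)
            resp.append([pk / s for pk in p])
        # M step: update mixing weights, means, and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            pi[k] = nk / len(data)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var[k] = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk
            var[k] = max(var[k], 1e-6)  # guard against variance collapse

    return mu, var, pi

# Usage: two well-separated clusters; the fitted means should end up
# near the true values 0 and 5.
random.seed(0)
data = ([random.gauss(0.0, 1.0) for _ in range(200)]
        + [random.gauss(5.0, 1.0) for _ in range(200)])
mu, var, pi = em_gaussian_mixture(data)
```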
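The iteration from the Gradient Descent section above, w <- w - eta * grad f(w), can be sketched as follows. The objective and every name here are assumptions made for illustration; on this convex quadratic there is a single optimum, so the local-optima caveat does not bite:

```python
def gradient_descent(grad, w0, learning_rate=0.1, n_steps=200):
    """Repeatedly step against the gradient: w <- w - eta * grad(w)."""
    w = list(w0)
    for _ in range(n_steps):
        g = grad(w)
        w = [wi - learning_rate * gi for wi, gi in zip(w, g)]
    return w

# Toy objective f(w) = (w1 - 3)^2 + (w2 + 1)^2; its gradient is
# (2(w1 - 3), 2(w2 + 1)), so the minimum sits at (3, -1).
grad_f = lambda w: [2.0 * (w[0] - 3.0), 2.0 * (w[1] + 1.0)]

w = gradient_descent(grad_f, [0.0, 0.0])
```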
== Markov-Chain Monte-Carlo Methods ==
The goal in the general Bayesian framework is to compute expectations over a high-dimensional probability distribution P(x1,...,xn). The MarkovChainMonteCarlo (MCMC) approach:
 * First, approximate the expectation with the MonteCarloMethod:
{{{
E(f) ≈ 1/T [ sum(t=1..T) f(xt1,...,xtn) ]
}}}
 * In order to sample from P, construct a MarkovChain having P as its equilibrium distribution, then simulate the chain and sample from its equilibrium distribution.

A MarkovChain is entirely defined by the initial distribution P(S0) and the TransitionProbability Pt = P(S(t+1)|S(t)). In order to achieve our goal of sampling from P(x1,...,xn), we now turn to the two main MCMC algorithms: GibbsSampling and the MetropolisAlgorithm.

== Simulated Annealing ==
SimulatedAnnealing: a general-purpose optimization algorithm inspired by StatisticalMechanics, used in combination with MCMC ideas such as the MetropolisAlgorithm.

The [[Probability]] of being in state s at temperature T is given by the BoltzmannGibbsDistribution:
{{{
                       e^(-f(s)/kT)
P(s) = P(x1,...,xn) = --------------
                            Z
}}}
 * Start with a high temperature and progressively lower it according to some annealing schedule.
 * logarithmic annealing schedule, T^t^ = T / log t --> very slow, impractical
 * geometric annealing schedule, T^t^ = uT^(t-1)^, with 0 < u < 1

== Evolutionary and Genetic Algorithms ==
A broad class of optimization algorithms that simulate evolution.
 * Generation of random perturbations, or mutations, and the presence of a fitness function that is used to assess the quality of a given point and filter out mutations that are not useful.
 * GeneticAlgorithm is a subclass of EvolutionaryAlgorithm.

== Learning Algorithm: Miscellaneous Aspects ==
=== Control of Model Complexity ===
=== Online/Batch Learning ===
=== Training/Test/Validation ===
=== Early Stopping ===
=== Ensembles ===
=== Balancing and Weighting Schemes ===
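The Simulated Annealing section above can be sketched as follows: a Metropolis acceptance rule combined with a geometric annealing schedule. The one-dimensional energy function, the parameter values, and all names are assumptions for illustration, not the book's implementation:

```python
import math
import random

def simulated_annealing(f, x0, t0=10.0, u=0.995, n_steps=4000, step=1.0, seed=1):
    """Minimize f(x) with the Metropolis acceptance rule under a
    geometric annealing schedule T_t = u * T_(t-1)."""
    rng = random.Random(seed)
    x, t = x0, t0
    best = x
    for _ in range(n_steps):
        cand = x + rng.uniform(-step, step)  # random perturbation of the state
        delta = f(cand) - f(x)
        # Metropolis rule: always accept downhill moves; accept uphill
        # moves with probability e^(-delta/T).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x = cand
        if f(x) < f(best):
            best = x
        t *= u  # progressively lower the temperature
    return best

# Toy one-dimensional energy with many local minima and the global
# minimum at x = 0 (an assumed example, not from the book).
energy = lambda x: x * x + 10.0 * math.sin(x) ** 2

best = simulated_annealing(energy, 6.0)
```

At high temperature the chain accepts most uphill moves and explores freely; as T falls it behaves more and more like pure descent, which is why a schedule that cools too fast degenerates into plain gradient-free local search.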