Posts

Showing posts from April, 2013

GLM

In statistics, the generalized linear model (GLM) is a flexible generalization of ordinary linear regression that allows for response variables whose distribution is something other than normal. The GLM generalizes linear regression by allowing the linear model to be related to the response variable via a link function, and by allowing the magnitude of the variance of each measurement to be a function of its predicted value. In many cases where the response variable must be positive and can vary over a wide scale, constant input changes lead to geometrically varying, rather than constantly varying, output changes.
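The "geometrically varying" behaviour can be seen in a small sketch. It assumes a log link (the usual choice for Poisson-style count data); the coefficient values and the function names are made up for illustration and are not fitted to anything.

```python
import math

def linear_predictor(x, intercept, slope):
    # The ordinary linear part of the model: eta = b0 + b1 * x
    return intercept + slope * x

def mean_response(x, intercept, slope):
    # A log link means log(mean) = eta, so the mean is exp(eta)
    return math.exp(linear_predictor(x, intercept, slope))

def poisson_variance(mean):
    # For a Poisson response the variance equals the predicted mean
    return mean

# Hypothetical coefficients, chosen only to illustrate the shape
intercept, slope = 0.5, 0.3

means = [mean_response(x, intercept, slope) for x in range(5)]
ratios = [means[i + 1] / means[i] for i in range(len(means) - 1)]

for x, mu in enumerate(means):
    print(f"x={x}  mean={mu:.3f}  variance={poisson_variance(mu):.3f}")
print("step-to-step ratios:", [round(r, 6) for r in ratios])
```

Each unit step in x multiplies the predicted mean by the same factor, exp(slope), instead of adding a constant to it, and the variance grows along with the mean.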

Elastic Net

In statistics and, in particular, in the fitting of linear regression models, the elastic net is a regularized regression method that combines the L1 and L2 penalties of the lasso and ridge methods.
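As a rough sketch of what "combines the L1 and L2 penalties" means, the function below computes one common parametrization of the penalty term. The names `alpha` and `l1_ratio` and the factor of 0.5 on the L2 part are assumptions borrowed from a widely used convention; other texts weight the two terms differently.

```python
def elastic_net_penalty(weights, alpha=1.0, l1_ratio=0.5):
    """Penalty added to the regression loss (one common convention).

    l1_ratio = 1.0 gives a pure lasso (L1) penalty,
    l1_ratio = 0.0 gives a pure ridge (L2) penalty.
    """
    l1 = sum(abs(w) for w in weights)    # L1 norm of the weights
    l2_sq = sum(w * w for w in weights)  # squared L2 norm of the weights
    return alpha * (l1_ratio * l1 + 0.5 * (1.0 - l1_ratio) * l2_sq)

# Hypothetical weight vector, for illustration only
weights = [0.5, -2.0, 0.0]

lasso_only = elastic_net_penalty(weights, l1_ratio=1.0)
ridge_only = elastic_net_penalty(weights, l1_ratio=0.0)
mixed = elastic_net_penalty(weights, l1_ratio=0.5)

print(lasso_only, ridge_only, mixed)
```

With `l1_ratio` strictly between 0 and 1 the penalty is a weighted blend of the two, which is the sense in which the elastic net sits between lasso and ridge.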

Lasso

Lasso is a regularization technique that is useful for feature selection and for preventing over-fitting to the training data. It works by penalizing the sum of the absolute values (the L1 norm) of the weights found by the regression.
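Why an L1 penalty ends up selecting features can be sketched with the soft-thresholding operator. Under the simplifying assumption of an orthonormal design matrix, the lasso solution is the ordinary least-squares weight shrunk toward zero by the penalty and clipped at zero. The weights and the penalty value below are invented for illustration.

```python
def soft_threshold(value, penalty):
    # Shrink the weight toward zero by `penalty`;
    # anything already within `penalty` of zero becomes exactly 0.0
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0

# Hypothetical unpenalized (least-squares) weights
ols_weights = [2.0, -0.3, 0.1, -1.5]
penalty = 0.5

lasso_weights = [soft_threshold(w, penalty) for w in ols_weights]
selected = [i for i, w in enumerate(lasso_weights) if w != 0.0]

print(lasso_weights)  # small weights are set to exactly zero
print(selected)       # indices of the features that survive
```

The weights that are small relative to the penalty drop out entirely, which is the feature-selection effect; the surviving weights are smaller in magnitude, which is the guard against over-fitting.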

Choice of ML

Want something that is potentially human comprehensible? Use decision trees or rules. Have a situation where you have lots of memory, but have to learn incrementally and evaluate quickly? Use Nearest Neighbour. Have a clear binary decision in a continuous space? Use SVMs. Have thousands of independent attributes and lots of data? Use Naive Bayes. Have a situation where you know which attributes are correlated with which? Use Bayes nets.
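To make the "lots of memory, learn incrementally" case concrete, here is a minimal 1-nearest-neighbour sketch. The class name, method names and data points are all hypothetical; it uses a plain linear scan, so a real system that needs fast evaluation on large data would likely add a spatial index.

```python
import math

class NearestNeighbour:
    """Toy 1-nearest-neighbour classifier: learning is just remembering."""

    def __init__(self):
        self.examples = []  # every example seen so far is kept in memory

    def learn(self, features, label):
        # Incremental learning: append the new example, no retraining step
        self.examples.append((tuple(features), label))

    def predict(self, features):
        if not self.examples:
            raise ValueError("no examples stored yet")
        # Return the label of the stored example closest in Euclidean distance
        nearest = min(self.examples, key=lambda ex: math.dist(ex[0], features))
        return nearest[1]

model = NearestNeighbour()
model.learn((0.0, 0.0), "a")
model.learn((5.0, 5.0), "b")
before = model.predict((1.0, 1.5))

model.learn((1.0, 2.0), "b")  # one more example arrives later
after = model.predict((1.0, 1.5))

print(before, after)
```

The prediction for the same query changes as soon as a closer example is stored, without rebuilding anything, at the cost of keeping every example around.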

Machine learning

ML algorithms are an evolution over normal algorithms. They make your programs "smarter" by allowing them to learn automatically from the data you provide. You take a randomly selected sample of mangoes from the market (training data) and make a table of the physical characteristics of each mango, such as color, size, shape, which part of the country it was grown in, which vendor sold it, etc. (features), along with the sweetness, juiciness and ripeness of that mango (output variables). You feed this data to the machine learning algorithm (classification/regression), and it learns a model of the correlation between an average mango's physical characteristics and its quality. Next time you go to the market, you measure the characteristics of the mangoes on sale (test data) and feed them to the ML algorithm. It will use the model computed earlier to predict which mangoes are sweet, ripe and/or juicy. The algorithm may internally use rules similar to the rules y...
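The train-then-predict flow in the mango example can be sketched with about the simplest learner there is: a single learned threshold on one feature. The "yellowness" feature, its values and the labels are invented toy data, and a one-rule model is only an assumption made to keep the sketch short.

```python
def train_threshold(rows):
    """Pick the cut-off on one feature that makes the fewest training errors.

    rows: list of (feature_value, is_sweet) pairs.
    The learned rule is: predict sweet when feature_value >= threshold.
    """
    best_threshold, best_errors = None, len(rows) + 1
    for candidate in sorted({value for value, _ in rows}):
        errors = sum((value >= candidate) != is_sweet for value, is_sweet in rows)
        if errors < best_errors:
            best_threshold, best_errors = candidate, errors
    return best_threshold

def predict(threshold, value):
    return value >= threshold

# Training data: (yellowness score, was the mango sweet?) -- toy values
training_data = [
    (0.20, False), (0.35, False), (0.40, False),
    (0.60, True), (0.75, True), (0.90, True),
]
threshold = train_threshold(training_data)

# Test data: mangoes on sale that have not been tasted yet
test_data = [0.30, 0.80]
predictions = [predict(threshold, value) for value in test_data]

print(threshold, predictions)
```

The learned threshold plays the role of the model: it is computed once from the training table and then reused on mangoes the algorithm has never seen.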