Web logs

Showing posts from April, 2013

GLM

April 02, 2013

In statistics, the generalized linear model (GLM) is a flexible generalization of ordinary linear regression that allows for response variables whose distribution is other than normal. The GLM generalizes linear regression by allowing the linear model to be related to the response variable via a link function and by allowing the magnitude of the variance of each measurement to be a function of its predicted value. In many cases when the response variable must be positive and can vary over a wide scale, constant input changes lead to geometrically varying, rather than constantly varying, output changes.
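To make the "geometrically varying" point concrete, here is a minimal sketch that assumes a log link (one common choice for positive responses; the excerpt does not name a specific link). The coefficients are hypothetical and chosen only for illustration.

```python
import math

def predict_mean(x, intercept, slope):
    """Inverse of the log link: mean = exp(linear predictor)."""
    return math.exp(intercept + slope * x)

# Hypothetical coefficients, not fitted to any real data.
intercept, slope = 0.5, 0.3

# Predicted means at evenly spaced inputs x = 0, 1, 2, 3, 4.
means = [predict_mean(x, intercept, slope) for x in range(5)]

# Each unit step in x multiplies the mean by the same factor, exp(slope),
# so constant input changes give geometric output changes.
ratios = [means[i + 1] / means[i] for i in range(len(means) - 1)]
```

With an identity link (ordinary linear regression) the differences between consecutive means would be constant instead of the ratios.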

Elastic Net

April 02, 2013

In statistics and, in particular, in the fitting of linear regression models, the elastic net is a regularized regression method that combines the L1 and L2 penalties of the lasso and ridge methods.
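As a rough sketch of what "combines the L1 and L2 penalties" can look like, the function below computes the penalty term only. The parameter names `alpha` and `l1_ratio`, and the factor of one half on the L2 part, follow one common parametrization and are assumptions here, not something the excerpt specifies.

```python
def elastic_net_penalty(weights, alpha=1.0, l1_ratio=0.5):
    """Weighted mix of the L1 (lasso) and squared L2 (ridge) penalties.

    alpha scales the whole penalty; l1_ratio in [0, 1] sets the mix.
    Both names are illustrative assumptions.
    """
    l1 = sum(abs(w) for w in weights)   # lasso part
    l2 = sum(w * w for w in weights)    # ridge part
    return alpha * (l1_ratio * l1 + 0.5 * (1.0 - l1_ratio) * l2)

weights = [1.0, -2.0]
pure_lasso = elastic_net_penalty(weights, l1_ratio=1.0)  # only the L1 term
pure_ridge = elastic_net_penalty(weights, l1_ratio=0.0)  # only the L2 term
mixed = elastic_net_penalty(weights, l1_ratio=0.5)       # half of each
```

Setting `l1_ratio` to 1 or 0 recovers the lasso or ridge penalty respectively, which is the sense in which the elastic net sits between the two.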

Lasso

April 02, 2013

Lasso is a regularization technique that's useful for feature selection and for preventing over-fitting to the training data. It works by penalizing the sum of the absolute values (the L1 norm) of the weights found by the regression.
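One way to see why the L1 penalty helps with feature selection: in the simplified case of an orthonormal design, the lasso solution is the least-squares weight passed through a soft-thresholding step, which sets small weights to exactly zero. The sketch below shows only that step; the weights and threshold are hypothetical values picked for illustration.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; anything smaller becomes 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

# Hypothetical unpenalized weights for four features.
least_squares_weights = [3.0, -0.5, 0.2, -3.0]

# Hypothetical penalty strength.
threshold = 1.0

lasso_weights = [soft_threshold(w, threshold) for w in least_squares_weights]

# Features whose weight survived the shrinkage are the "selected" ones.
selected = [i for i, w in enumerate(lasso_weights) if w != 0.0]
```

The two weak features drop out entirely, while the strong ones are kept but shrunk, which is the behaviour the excerpt describes.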

Choice of ML

April 01, 2013

Want something that is potentially human comprehensible? Use decision trees or rules. Have a situation where you have lots of memory, but have to learn incrementally and evaluate quickly? Use Nearest Neighbour. Have a clear binary decision in a continuous space? Use SVMs. Have thousands of independent attributes and lots of data? Use Naive Bayes. Have a situation where you know which attributes are correlated with which? Use Bayes nets.

Machine learning

April 01, 2013

ML algorithms are an evolution over normal algorithms. They make your programs "smarter" by allowing them to automatically learn from the data you provide. You take a randomly selected specimen of mangoes from the market (training data), make a table of all the physical characteristics of each mango, like color, size, shape, which part of the country it was grown in, which vendor sold it, etc. (features), along with the sweetness, juiciness, and ripeness of that mango (output variables). You feed this data to the machine learning algorithm (classification/regression), and it learns a model of the correlation between an average mango's physical characteristics and its quality. Next time you go to the market, you measure the characteristics of the mangoes on sale (test data), and feed them to the ML algorithm. It will use the model computed earlier to predict which mangoes are sweet, ripe and/or juicy. The algorithm may internally use rules similar to the rules y...
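The train-then-predict workflow in the mango example can be sketched in a few lines. This uses a 1-nearest-neighbour rule purely as a stand-in learner; the two features (yellowness and softness, each scaled 0 to 1) and all data values are made up for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_neighbour_predict(training_rows, query):
    """Return the label of the training row closest to the query features."""
    features, label = min(
        training_rows, key=lambda row: squared_distance(row[0], query)
    )
    return label

# Hypothetical training data: (yellowness, softness) -> observed quality.
training_data = [
    ((0.9, 0.8), "sweet"),
    ((0.8, 0.7), "sweet"),
    ((0.2, 0.1), "not sweet"),
    ((0.3, 0.2), "not sweet"),
]

# Hypothetical test data: a mango measured at the market, quality unknown.
test_mango = (0.85, 0.75)
prediction = nearest_neighbour_predict(training_data, test_mango)
```

Here the "model" is just the stored table of examples; other algorithms would compress the same table into rules or weights instead.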