Posts
Showing posts from 2012
cart
A study of patients admitted for a heart attack: 19 variables were collected during the first 24 hours for 215 patients (those who survived the 24 hours). Question: can the high-risk patients (those who will not survive 30 days) be identified? Impurity of a node: a measure of the impurity of a node is needed to help decide how to split a node, or which node to split. The measure should be at a maximum when a node is equally divided amongst all classes, and it should be zero if the node is all one class. Predictor variables can be continuous or categorical. A classification tree is created if the response variable is categorical. A regression tree is created if the response variable is continuous. A large sample size is needed to split efficiently when there are many predictors. Interaction between predictors can be identified. Relative importance of predictors cannot be well identified. Missing observations form a separate category. Resubstitution costs: It is error of the tree estimat...
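The two requirements on an impurity measure stated above (maximum for an evenly mixed node, zero for a pure node) can be checked against the Gini index. The excerpt does not say which measure the post goes on to use, so Gini is an assumption here, and the function name and the "high"/"low" labels are hypothetical.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

# Hypothetical nodes with two classes of patients.
pure = gini_impurity(["high", "high", "high", "high"])    # all one class
skewed = gini_impurity(["high", "low", "low", "low"])     # mostly one class
mixed = gini_impurity(["high", "low", "high", "low"])     # evenly divided

print(pure, skewed, mixed)  # 0.0 0.375 0.5
```

For two classes the value peaks at 0.5 for an even split and falls to 0 as the node becomes pure, which is the behaviour the notes ask for.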
Logistic
The chi-square of the intercept should be larger than that of any other variable. If the chi-square of any independent variable is high, then that particular variable will have an inordinately large impact on the dependent variable. VIF: take an independent variable and regress it on the other independent variables, find the R², and use it in the formula 1/(1 − R²). Sum of squares. Concordance: the good ones should have a higher score than the bad ones. The larger the number of such pairs, the higher the concordance. It should be in the neighbourhood of 60-70%.
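The VIF recipe above can be sketched for the simplest case of two predictors, where the R² from regressing one on the other equals their squared Pearson correlation. This is a minimal illustration with made-up data and hypothetical function names; with more predictors the R² would come from a multiple regression instead.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def vif_two_predictors(x1, x2):
    """VIF = 1 / (1 - R^2), with R^2 from regressing x1 on x2 alone."""
    r_squared = pearson_r(x1, x2) ** 2
    return 1.0 / (1.0 - r_squared)

# Made-up predictor columns (assumptions for illustration only).
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]      # correlated with x1 (r = 0.8)
x3 = [1, -2, 2, -2, 1]    # uncorrelated with x1 (r = 0)

vif_correlated = vif_two_predictors(x1, x2)
vif_uncorrelated = vif_two_predictors(x1, x3)
print(vif_correlated, vif_uncorrelated)
```

A VIF of 1 means the predictor shares no linear information with the other one; the value grows without bound as R² approaches 1.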
Granger causality test
The Granger causality test is a statistical hypothesis test for determining whether one time series is useful in forecasting another. According to Granger causality, if a signal X1 "Granger-causes" (or "G-causes") a signal X2, then past values of X1 should contain information that helps predict X2 above and beyond the information contained in past values of X2 alone. Suppose that we have three terms, Xt, Yt, and Wt, and that we first attempt to forecast Xt+1 using past terms of Xt and Wt. We then try to forecast Xt+1 using past terms of Xt, Yt, and Wt. If the second forecast is found to be more successful, according to standard cost functions, then the past of Y appears to contain information helping in forecasting Xt+1 that is not in past Xt or Wt. In particular, Wt could be a vector of possible explanatory variables. Thus, Yt would "Granger cause" Xt+1 if (a) Yt occurs before Xt+1; and (b) it contains information useful in forecasting ...
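The two-forecast comparison described above can be sketched with one lag and ordinary least squares. This is a simplified illustration, not a full test: the data are synthetic, Wt is left out, the lag length is fixed at 1, and the helper `ols_sse` is hypothetical. Squared error is used as the cost function, and the two fits are compared with the usual F ratio for nested regressions.

```python
import random

def ols_sse(rows, targets):
    """Fit least squares via the normal equations; return the residual sum of squares."""
    p = len(rows[0])
    # Augmented normal-equation matrix [X'X | X'y].
    a = [
        [sum(row[i] * row[j] for row in rows) for j in range(p)]
        + [sum(row[i] * t for row, t in zip(rows, targets))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(a[k][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for k in range(p):
            if k != col:
                factor = a[k][col] / a[col][col]
                a[k] = [v - factor * w for v, w in zip(a[k], a[col])]
    beta = [a[i][p] / a[i][i] for i in range(p)]
    return sum(
        (t - sum(b * v for b, v in zip(beta, row))) ** 2
        for row, t in zip(rows, targets)
    )

# Synthetic data (an assumption for illustration): past Y drives next-period X.
rng = random.Random(0)
n = 200
y = [rng.gauss(0, 1) for _ in range(n)]
x = [0.0]
for t in range(n - 1):
    x.append(0.5 * x[t] + 0.8 * y[t] + 0.1 * rng.gauss(0, 1))

targets = x[1:]                                            # X at time t+1
restricted = [[1.0, x[t]] for t in range(n - 1)]           # past X only
unrestricted = [[1.0, x[t], y[t]] for t in range(n - 1)]   # past X and past Y

sse_r = ols_sse(restricted, targets)
sse_u = ols_sse(unrestricted, targets)

# F ratio for one added regressor; 3 parameters in the unrestricted model.
m = len(targets)
f_stat = ((sse_r - sse_u) / 1) / (sse_u / (m - 3))
print(sse_r, sse_u, f_stat)
```

A large F ratio says that adding past Y reduced the forecast error by more than chance would explain, which is the sense in which Y "Granger-causes" X. Turning the ratio into a p-value needs the F distribution, which the standard library does not provide.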
Arima
Lags of the differenced series appearing in the forecasting equation are called "auto-regressive" terms, lags of the forecast errors are called "moving average" terms, and a time series which needs to be differenced to be made stationary is said to be an "integrated" version of a stationary series. A very common pattern in time series data is one where the amplitude of the seasonal changes increases with the overall trend (i.e., the variance is correlated with the mean over the segments of the series). This pattern, which is called multiplicative seasonality, indicates that the relative amplitude of seasonal changes is constant over time, and is thus related to the trend.
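The "integrated" idea can be illustrated with first differencing: a series with a steady trend becomes constant once differenced, and summing the differences back up recovers the original. The series and function names below are made up for illustration.

```python
from itertools import accumulate

def difference(series):
    """First difference: each value minus the one before it."""
    return [b - a for a, b in zip(series, series[1:])]

def integrate(first_value, diffs):
    """Undo differencing by cumulatively summing from the first value."""
    return list(accumulate(diffs, initial=first_value))

trend = [3, 5, 7, 9, 11]        # non-stationary: the level keeps rising
diffs = difference(trend)       # stationary: constant steps
restored = integrate(trend[0], diffs)

print(diffs)     # [2, 2, 2, 2]
print(restored)  # [3, 5, 7, 9, 11]
```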
F test
The F test is most often used when comparing statistical models that have been fit to a data set, in order to identify the model that best fits the population from which the data were sampled. Exact F-tests mainly arise when the models have been fit to the data using least squares. The name was coined by George W. Snedecor, in honour of Sir Ronald A. Fisher. Fisher initially developed the statistic as the variance ratio. Most F-tests arise by considering a decomposition of the variability in a collection of data in terms of sums of squares. The test statistic in an F-test is the ratio of two scaled sums of squares reflecting different sources of variability. These sums of squares are constructed so that the statistic tends to be greater when the null hypothesis is not true. Examples of F-tests include: The hypothesis that the means of several normally distributed populations, all having the same standard deviation, are equal. This is perhaps the best-known F-test, and plays an im...
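For the equal-means example, the ratio of scaled sums of squares can be computed directly: the between-group sum of squares divided by its degrees of freedom, over the within-group sum of squares divided by its degrees of freedom. The groups below are made-up numbers and the function name is hypothetical; converting the statistic into a p-value would need the F distribution, which this sketch leaves out.

```python
def f_statistic(groups):
    """One-way F statistic: (SS_between / (k - 1)) / (SS_within / (n - k))."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Three hypothetical samples with clearly different means (2, 5, and 8).
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
f_value = f_statistic(groups)
print(f_value)  # 27.0
```

Here the between-group sum of squares is 54 on 2 degrees of freedom and the within-group sum of squares is 6 on 6 degrees of freedom, so the ratio is 27. The further apart the group means are relative to the spread inside each group, the larger the statistic.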
CART
CART is nonparametric. CART does not require variables to be selected in advance. The CART algorithm will itself identify the most significant variables and eliminate non-significant ones. CART results are invariant to monotone transformations of its independent variables. Changing one or several variables to its logarithm or square root will not change the structure of the tree. Only the splitting values (but not variables) in the questions will be different. CART can easily handle outliers. Outliers can negatively affect the results of some statistical models, like Principal Component Analysis (PCA) and linear regression. But the splitting algorithm of CART will easily handle noisy data: CART will isolate the outliers in a separate node. Boston Housing is a classical dataset which can be easily used for regression trees. On the one hand, we have 13 independent variables; on the other hand, there is the response variable, the value of the house (variable number 14). Boston hous...
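The invariance claim can be checked on a single split. The sketch below searches for the threshold on one predictor that minimises the summed squared error of the two child nodes, once on the raw values and once on their logarithms. It is a simplified stand-in for a full CART implementation, with made-up data and hypothetical function names.

```python
import math

def sse(values):
    """Sum of squared deviations from the mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(xs, ys):
    """Return (threshold, left_indices) minimising the children's total SSE."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    best = None
    for k in range(1, len(order)):
        lo, hi = xs[order[k - 1]], xs[order[k]]
        if lo == hi:
            continue  # no threshold can separate equal values
        left, right = order[:k], order[k:]
        cost = sse([ys[i] for i in left]) + sse([ys[i] for i in right])
        if best is None or cost < best[0]:
            best = (cost, (lo + hi) / 2, frozenset(left))
    return best[1], best[2]

# Made-up predictor and response (assumptions for illustration).
xs = [1, 2, 4, 50, 60, 80]
ys = [10, 11, 12, 30, 31, 33]

thr_raw, left_raw = best_split(xs, ys)
thr_log, left_log = best_split([math.log(x) for x in xs], ys)

print(thr_raw, sorted(left_raw))
print(thr_log, sorted(left_log))
```

Both runs send the same three observations to the left child; only the threshold value differs, because a monotone transformation preserves the ordering of the predictor and the split search depends only on that ordering.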
SEM
Two main components of models are distinguished in SEM: the structural model showing potential causal dependencies between endogenous and exogenous variables, and the measurement model showing the relations between latent variables and their indicators. Exploratory and Confirmatory factor analysis models, for example, contain only the measurement part, while path diagrams can be viewed as an SEM that only has the structural part. In specifying pathways in a model, the modeler can posit two types of relationships: (1) free pathways, in which hypothesized causal (in fact counterfactual) relationships between variables are tested, and therefore are left 'free' to vary, and (2) relationships between variables that already have an estimated relationship, usually based on previous studies, which are 'fixed' in the model. A structural model with linear relations is only an approximation. The world is unlikely to be linear. Indeed, the true relations between variables ar...
Importance of variance
If you multiply every number in a list by some constant K, you multiply the mean of the numbers by K. Similarly, you multiply the standard deviation by the absolute value of K. For example, suppose you have the list of numbers 1, 2, 3. These numbers have a mean of 2 and a standard deviation of 1. Now, suppose you were to take these 3 numbers and multiply them by 4. Then the mean would become 8, the standard deviation would become 4, and the variance thus 16. The point is, if you have a set of numbers X related to another set of numbers Y by the equation Y = 4X, then the variance of Y must be 16 times that of X, so you can test the hypothesis that Y and X are related by the equation Y = 4X indirectly by comparing the variances of the Y and X variables. This idea generalizes, in various ways, to several variables inter-related by a group of linear equations. The rules become more complex, the calculations more difficult, but the basic message remains the same -- you can test whethe...
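The 1, 2, 3 example can be reproduced with the standard library. The sketch uses the sample standard deviation (dividing by n − 1), since that is the convention under which 1, 2, 3 has a standard deviation of exactly 1.

```python
import statistics

xs = [1.0, 2.0, 3.0]
k = 4
ys = [k * x for x in xs]   # Y = 4X

mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)          # sample SD
var_x, var_y = statistics.variance(xs), statistics.variance(ys)  # sample variance

# The mean and SD scale by k, the variance by k squared.
print(mean_y / mean_x, sd_y / sd_x, var_y / var_x)
```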
Latent variable
Latent variables (as opposed to observable variables) are variables that are not directly observed but are rather inferred (through a mathematical model) from other variables that are observed (directly measured). Using them reduces the dimensionality of data. A large number of observable variables can be aggregated in a model to represent an underlying concept, making it easier to understand the data. Examples of latent variables from the field of economics include quality of life, business confidence, morale, happiness and conservatism: these are all variables which cannot be measured directly. Latent variables, as created by factor analytic methods, generally represent 'shared' variance, or the degree to which variables 'move' together. Variables that have no correlation cannot result in a latent construct based on the common factor model. Sometimes latent variables correspond to aspects of physical reality, which could in principle be measured, but may not be for pra...
Hoards of Cash
Companies have been retaining unprecedented amounts of cash on their balance sheets, calling it "strategic" cash to distinguish it from the "operating" cash that is needed to run the business. Arguments for strategic cash: Save taxes. Much of the strategic cash is typically held outside the United States. Keeping it there avoids the incremental U.S. tax that would be levied if it were brought home. Barring a tax holiday, this cash is effectively "trapped" offshore. Facilitate acquisitions. Strategic cash provides more flexibility concerning the timing (respond with alacrity to opportunities) and pricing (vagaries of the financial markets) of potential acquisitions. Facilitate investments. Finance long-term reinvestment programs in the business, which is especially valuable to companies in capital-intensive industries (e.g., energy or telecom) or research-intensive industries (e.g., high technology or pharmaceutical) that are investing in projects with unce...
It is not easy to sell sports gear and accessories to a nation that does not have a sports culture and does not look beyond cricket. What is even more interesting is while sports viewership is huge, participation is low. All the brands focus on the same sport. This is so different from other markets like Japan, Europe or the US where there are many sports and different brands identify with different sports. Indian consumers are very demanding. Till the time they buy local brands, their expectations are low. But the moment they buy a pricier international brand, their expectations rise. For them, just the brand power does not work. The durability, functionality and comfort of the product are as important. This is very different from what is seen in other global markets where the brand gets a lot more credit by virtue of just its brand power. While Indian customers have become brand conscious, the brand differentiation is yet to come. Till as recently as two years ag...
Sony makes a confusing catalog of gadgets that overlap or even cannibalize one another. It has also continued to let its product lines mushroom: 10 different consumer-level camcorders and almost 30 different TVs, for instance, crowd and confuse consumers. But offering customers a wide array of choices was fundamental to Sony's success in the past. Apple, on the other hand, makes one amazing phone in just two colors and says, "This is the best." Yet it's tremendously customizable. With so much of the experience coming from the software, not the hardware, consumers aren't using a product designed for them; it was designed by them. This is an especially powerful offering because it replaces the single moment of instant gratification (buying the perfect camera, TV, or phone) with dozens of such moments. Every time they install an app or download a song, users are getting a customized experience with an emotional impact on par with the one-time purchase of a product. A...
Boeing
Boeing’s assembly plants are the final stage in a long and hugely complex global supply chain. It has about 1,200 “tier-one” suppliers, which provide parts directly to the planemaker from 5,400 factories in 40 countries. These in turn are fed by thousands more “tier-two” suppliers, which themselves receive parts from countless others.
Retail evolution
Much as the self-service format (Retail 2.0) largely killed off the full-service model (Retail 1.0), e-tailing (Retail 3.0) is killing off self-service (Retail 2.0). In category after category the internet format (Retail 3.0) has accounted for essentially all of the growth over the last decade, if not more. Scarcely a single new book store has opened in the US since about 1998. In computers and electronics, the internet has captured more than 40% of sales, and incumbent stores are so vulnerable to smartphone-based comparison shopping that they are rapidly becoming nothing more than showrooms for internet-only retailers.
Of the world’s top four owners of airliners, two are lessors: GECAS, with 1,732 planes, and ILFC, with 1,031, soar miles above Delta (800) and American Airlines (775). Airlines lack cash to finance their big plans for fleet renewal, and they cannot borrow cheaply to buy new planes. Deals in which airlines sell part of their existing fleet to a lessor and rent it back are becoming more common. Aviation is becoming more like the hotel business: one type of firm specialises in owning the assets, while another operates them. But there is an important difference: a hotel owner cannot easily seize his premises back from a hotelier who skips the rent. Renting makes sense for small, young airlines that lack capital, for larger airlines trying out a new line of business for which they need different planes, or when manufacturers’ order books are full and the only way to get a plane is to rent it. But big airlines are better off buying planes and keeping them for their full lifespan of 30 year...